| apify/crawlee | 11,229 | | 0 | 42 | about 2 years ago | 747 | December 10, 2023 | 96 | apache-2.0 | TypeScript |
| Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation. |
| yujiosaka/headless-chrome-crawler | 5,051 | | 10 | 12 | over 4 years ago | 21 | June 11, 2018 | 28 | mit | JavaScript |
| Distributed crawler powered by Headless Chrome |
| niespodd/browser-fingerprinting | 3,353 | | 0 | 0 | about 3 years ago | 0 | | 7 | | JavaScript |
| Analysis of bot protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web? |
| emadehsan/thal | 2,268 | | 0 | 0 | over 5 years ago | 0 | | 0 | mit | JavaScript |
| Getting started with Puppeteer and Chrome Headless for Web Scraping |
| transitive-bullshit/awesome-puppeteer | 2,245 | | 0 | 0 | over 2 years ago | 0 | | 19 | | |
| A curated list of awesome Puppeteer resources. |
| apify/fingerprint-suite | 587 | | 0 | 13 | about 2 years ago | 87 | December 01, 2023 | 26 | apache-2.0 | TypeScript |
| Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify. |
| ulixee/secret-agent | 575 | | 0 | 32 | about 3 years ago | 33 | May 02, 2021 | 43 | mit | TypeScript |
| The web scraper that's nearly impossible to block - now called @ulixee/hero |
| linvo-io/linvo-scraper | 553 | | 0 | 2 | over 2 years ago | 6 | October 09, 2022 | 14 | mit | TypeScript |
| LinkedIn automation bot with every possible scraping feature. Valid for 2022, used by Linvo.io |
| fanyong920/jvppeteer | 549 | | 0 | 0 | almost 3 years ago | 15 | October 30, 2021 | 67 | apache-2.0 | Java |
| Headless Chrome for Java (Java crawler) |
| NikolaiT/se-scraper | 477 | | 2 | 1 | about 4 years ago | 68 | November 08, 2019 | 37 | apache-2.0 | HTML |
| JavaScript scraping module based on Puppeteer for many different search engines... |