| crawlab-team/crawlab | 10,521 | | 0 | 0 | over 2 years ago | 1 | March 03, 2019 | 58 | bsd-3-clause | Go |
| Distributed web crawler admin platform for spiders management regardless of languages and frameworks. |
| webrecorder/browsertrix-crawler | 470 | | 0 | 0 | about 2 years ago | 0 | | 91 | agpl-3.0 | JavaScript |
| Run a high-fidelity browser-based crawler in a single Docker container |
| openaustralia/morph | 454 | | 0 | 0 | over 3 years ago | 0 | | 351 | agpl-3.0 | Ruby |
| Take the hassle out of web scraping |
| rivermont/spidy | 287 | | 0 | 0 | almost 4 years ago | 12 | January 25, 2018 | 11 | gpl-3.0 | Python |
| The simple, easy to use command line web crawler. |
| openzim/zimit | 209 | | 0 | 0 | about 2 years ago | 0 | | 31 | gpl-3.0 | Python |
| Make a ZIM file from any Web site and surf offline! |
| siegfried415/portia-dashboard | 190 | | 0 | 0 | about 8 years ago | 0 | | 6 | other | Python |
| portia-dashboard is a visual web crawler based on scrapinghub/portia |
| DedSecInside/gotor | 146 | | 0 | 0 | over 2 years ago | 3 | November 10, 2022 | 3 | gpl-3.0 | Go |
| This program provides efficient web scraping services for Tor and non-Tor sites. The program has both a CLI and REST API. |
| bitmakerla/estela | 142 | | 0 | 0 | about 2 years ago | 0 | | 10 | mit | TypeScript |
| estela, an elastic web scraping cluster 🕸 |
| hrbrmstr/splashr | 88 | | 0 | 0 | about 6 years ago | 0 | | 13 | other | R |
| 💦 Tools to Work with the 'Splash' JavaScript Rendering Service in R |
| amerkurev/scrapper | 83 | | 0 | 0 | about 2 years ago | 0 | | 0 | apache-2.0 | JavaScript |
| Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing. |