| projectdiscovery/katana | 7,995 | | 0 | 1 | about 2 years ago | 8 | September 14, 2023 | 67 | mit | Go |
| A next-generation crawling and spidering framework. |
| MontFerret/ferret | 5,540 | | 0 | 5 | over 2 years ago | 56 | March 28, 2023 | 52 | apache-2.0 | Go |
| Declarative web scraping |
| Gerapy/Gerapy | 3,144 | | 8 | 0 | over 2 years ago | 49 | July 19, 2023 | 60 | mit | Python |
| Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js |
| transitive-bullshit/awesome-puppeteer | 2,245 | | 0 | 0 | over 2 years ago | 0 | | 19 | | |
| A curated list of awesome puppeteer resources. |
| HFrost0/bilix | 1,433 | | 0 | 1 | about 2 years ago | 77 | July 17, 2023 | 20 | apache-2.0 | Python |
| ⚡️Lightning-fast async download tool for bilibili and more |
| PuerkitoBio/fetchbot | 758 | | 0 | 3 | almost 5 years ago | 7 | May 20, 2021 | 2 | bsd-3-clause | Go |
| A simple and flexible web crawler that follows the robots.txt policies and crawl delays. |
| ma6254/FictionDown | 601 | | 0 | 0 | over 2 years ago | 5 | February 17, 2020 | 3 | gpl-3.0 | Go |
| Novel download, novel scraping, Qidian, Biquge, export to Markdown, export to txt, convert to epub, ad filtering, automatic proofreading |
| Florents-Tselai/WarcDB | 380 | | 0 | 0 | over 2 years ago | 4 | October 22, 2023 | 7 | apache-2.0 | Python |
| WarcDB: Web crawl data as SQLite databases. |
| lgraubner/sitemap-generator-cli | 259 | | 7 | 2 | over 3 years ago | 30 | January 21, 2020 | 29 | mit | JavaScript |
| Creates an XML-Sitemap by crawling a given site. |
| eight04/ComicCrawler | 251 | | 0 | 0 | about 2 years ago | 175 | December 10, 2023 | 25 | | Python |
| An image crawler written in Python. |