| crawlab-team/crawlab | 10,521 | | 0 | 0 | over 2 years ago | 1 | March 03, 2019 | 58 | bsd-3-clause | Go | Distributed web crawler admin platform for managing spiders, regardless of language or framework. |
| rmax/scrapy-redis | 5,392 | | 176 | 21 | over 2 years ago | 18 | July 26, 2022 | 29 | mit | Python | Redis-based components for Scrapy. |
| SpiderClub/haipproxy | 5,329 | | 1 | 0 | over 3 years ago | 7 | June 18, 2018 | 44 | mit | Python | :sparkling_heart: Highly available distributed IP proxy pool, powered by Scrapy and Redis |
| gnemoug/distribute_crawler | 3,176 | | 0 | 0 | almost 9 years ago | 0 | | 26 | | Python | A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite: a MongoDB cluster provides the underlying storage, Redis implements the distribution layer, and Graphite displays crawler status. |
| istresearch/scrapy-cluster | 1,137 | | 18 | 2 | over 2 years ago | 15 | December 23, 2020 | 17 | mit | Python | This Scrapy project uses Redis and Kafka to create a distributed, on-demand scraping cluster. |
| mtianyan/FunpySpiderSearchEngine | 862 | | 0 | 0 | about 4 years ago | 0 | | 3 | mit | Python | Personalized search powered by Word2vec + Scrapy 2.3.0 (data crawling) + Elasticsearch 7.9.1 (data storage, exposed via a RESTful API) + Django 3.1.1 (search). |
| lb2281075105/Python-Spider | 680 | | 0 | 0 | over 3 years ago | 0 | | 0 | apache-2.0 | Python | Assorted crawler demos: Douban Top 250 movies; Douyu JSON data; photo galleries; Taobao; Youyuan; CrawlSpider scraping of matchmaker profiles on Hongniang, plus distributed Hongniang crawling with Redis storage; small crawler demos; Selenium; Duodian; a Django-built API; simulated logins to Zhihu, GitHub, and Tuchong; full-site scraping of the Duodian mall; WeChat official-account article history; articles shared in WeChat groups or by WeChat friends; itchat monitoring of articles shared by specified WeChat official accounts. |
| TurboWay/spiderman | 498 | | 0 | 0 | about 3 years ago | 0 | | 3 | mit | Python | A general-purpose distributed crawler framework based on scrapy-redis. |
| zhangslob/awesome_crawl | 206 | | 0 | 0 | about 6 years ago | 0 | | 0 | | Python | Crawlers for Tencent News, Zhihu topics, Weibo followers, Tumblr, Douyu danmaku (live comments), and Meizitu images, plus distributed-design examples. |
| guapier/zi5book | 183 | | 0 | 0 | about 7 years ago | 0 | | 0 | | Python | Scrapes every Kindle e-book on book.zi5.me, organized by author and book title, with each book in both mobi and epub formats; the full-site crawl runs distributed. |
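Several of the projects above (rmax/scrapy-redis itself, TurboWay/spiderman, and the Hongniang demo in lb2281075105/Python-Spider) share the same distribution mechanism: scrapy-redis swaps Scrapy's in-process scheduler and duplicate filter for Redis-backed ones, so any number of worker processes can share a single request queue. A minimal configuration sketch, assuming `scrapy-redis` is installed and a Redis server is reachable at the default local URL:

```python
# settings.py — minimal scrapy-redis setup (sketch; assumes
# `pip install scrapy-redis` and a Redis server at localhost:6379)

# Queue requests in Redis instead of in-process, so every worker
# attached to the same Redis instance shares one crawl frontier.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"

# Deduplicate request fingerprints in Redis rather than in local memory,
# so no two workers fetch the same URL.
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Keep the queue and dupefilter between runs, so a crawl can be
# paused and resumed without losing state.
SCHEDULER_PERSIST = True

REDIS_URL = "redis://localhost:6379"
```

With this in place, a spider typically subclasses `scrapy_redis.spiders.RedisSpider` and sets a `redis_key`, pulling its start URLs from a Redis list (fed with `redis-cli lpush <key> <url>`) instead of a hard-coded `start_urls` attribute.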