| damklis/DataEngineeringProject |
644 |
|
0 |
0 |
over 3 years ago |
0 |
|
4 |
mit |
Python |
| Example end-to-end data engineering project. |
| dirtyfilthy/freshonions-torscraper |
313 |
|
0 |
0 |
almost 6 years ago |
0 |
|
22 |
agpl-3.0 |
Python |
| Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion |
| infinitbyte/gopa |
281 |
|
0 |
0 |
almost 5 years ago |
6 |
May 19, 2021 |
11 |
other |
Go |
| [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn |
| hemin1003/java-spider |
276 |
|
0 |
0 |
almost 5 years ago |
0 |
|
6 |
|
Java |
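| A hands-on Java crawler framework built as a secondary development on top of the webmagic framework. It can crawl news content from Tencent, Sohu, Toutiao (integrated as a separate feature), and other sources; combined with elasticsearch, it implements automated crawling and is already in production use. |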
| simon987/od-database |
113 |
|
0 |
0 |
about 6 years ago |
0 |
|
5 |
mit |
Python |
| Distributed crawler, database and web frontend for public directories indexing |
| AlphaReign/scraper |
109 |
|
0 |
0 |
about 3 years ago |
0 |
|
8 |
mit |
JavaScript |
| AlphaReign's DHT scraper, includes peer updater and categorizer |
| tsurupin/job_search |
85 |
|
0 |
0 |
almost 9 years ago |
0 |
|
1 |
mit |
Elixir |
| An app to search startup jobs scraped from websites written in Elixir, Phoenix, React and styled-components. |
| AlphaReign/AlphaReign |
50 |
|
0 |
0 |
about 9 years ago |
0 |
|
1 |
|
|
| Docs, About, Etc |
| jpryda/facebook-multi-scraper |
48 |
|
0 |
0 |
over 8 years ago |
0 |
|
5 |
|
Python |
| Multi-threaded Facebook scraper for social analytics of public and owned pages |
| StanGirard/TrollHunter |
27 |
|
0 |
0 |
about 5 years ago |
15 |
March 27, 2020 |
4 |
gpl-3.0 |
Jupyter Notebook |
| Twitter Troll & Fake News Hunter - Crawls news websites and twitter to identify fake news |