| USCDataScience/sparkler | 401 | | 0 | 0 | about 3 years ago | 0 | | 55 | apache-2.0 | Java | Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark. |
| tmaciejewski/see | 27 | | 0 | 0 | almost 4 years ago | 0 | | 0 | gpl-3.0 | Erlang | Search Engine in Erlang |
| chenxi-shi/Information-Retrieval | 15 | | 0 | 0 | over 3 years ago | 0 | | 0 | | Python | Elasticsearch, MongoDB, Tornado Server, RESTful API, Python, Information Retrieval, Machine Learning, Web Crawler |
| peterbencze/serritor | 13 | | 2 | 0 | almost 6 years ago | 12 | June 11, 2020 | 0 | apache-2.0 | Java | Serritor is an open source web crawler framework built upon Selenium and written in Java. It can be used to crawl dynamic web pages that require JavaScript to render data. |
| KKodiac/Covid19_Stats | 9 | | 0 | 0 | almost 5 years ago | 0 | | 0 | gpl-3.0 | Python | Collects domestic and international data on COVID-19 confirmed, recovered, and death cases. |
| nglthu/infoRetrieval | 8 | | 0 | 0 | about 7 years ago | 0 | | 0 | mit | HTML | Inverted indexer, web crawler, sort, search, and Porter stemmer written in Python for information retrieval. |
| mirkomantovani/web-search-engine-UIC | 6 | | 0 | 0 | over 7 years ago | 0 | | 0 | | Python | CS 582 Information Retrieval at University of Illinois at Chicago. Multithreaded crawling of the UIC domain, inverted index, PageRank, SEO with Context Pseudo-Relevance Feedback |
| IlyasHabeeb/Machine_Learning_Focused_Crawler | 5 | | 0 | 0 | over 7 years ago | 0 | | 0 | | Python | A focused web crawler that uses machine learning to fetch more relevant results. |
| nasa-jpl-memex/sce | 5 | | 0 | 0 | over 8 years ago | 0 | | 35 | apache-2.0 | Shell | Sparkler Crawl Environment - a packaged, dockerized version of http://github.com/USCDataScience/sparkler.git |