| vector4wang/spring-boot-quick |
2,282 |
|
0 |
0 |
over 2 years ago |
0 |
|
13 |
|
Java |
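| :herb: Quick-start learning examples based on Spring Boot, integrating open-source frameworks the author has encountered, such as: rabbitmq (delayed queues), Kafka, jpa, redis, oauth2, swagger, jsp, docker, k3s, k3d, k8s, a mybatis encryption/decryption plugin, exception handling, log output, multi-module development, multi-environment packaging, caching, web crawlers, jwt, GraphQL, dubbo, zookeeper, Async, and more :pushpin: |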
| USCDataScience/sparkler |
401 |
|
0 |
0 |
about 3 years ago |
0 |
|
55 |
apache-2.0 |
Java |
| Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark. |
| commoncrawl/cc-pyspark |
280 |
|
0 |
0 |
about 3 years ago |
0 |
|
4 |
mit |
Python |
| Process Common Crawl data with Python and Spark |
| zhangslob/docs |
102 |
|
0 |
0 |
almost 7 years ago |
0 |
|
3 |
|
|
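| Source code for "Data Collection: From Getting Started to Giving Up". Contents: introduction to web crawlers, the job market, and interview questions for crawler engineers; introduction to the HTTP protocol; using Requests; introduction to the XPath parser; MongoDB and MySQL; multithreaded crawlers; introduction to Scrapy; introduction to Scrapy-redis; deploying with docker; managing docker clusters with nomad; querying docker logs with EFK |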
| commoncrawl/cc-index-table |
78 |
|
0 |
0 |
over 2 years ago |
0 |
|
8 |
apache-2.0 |
Java |
| Index Common Crawl archives in tabular format |
| CI-Research/KeywordAnalysis |
33 |
|
0 |
0 |
over 7 years ago |
0 |
|
0 |
|
|
| Word analysis, by domain, on the Common Crawl data set for the purpose of finding industry trends |
| YBIGTA/EngineeringTeam |
32 |
|
0 |
0 |
over 7 years ago |
0 |
|
2 |
|
|
| A place for organizing the YBIGTA engineering team's materials. |
| youhusky/Search_Ads_Web_Service |
27 |
|
0 |
0 |
over 8 years ago |
0 |
|
0 |
|
Java |
| Online search advertisement platform & Realtime Campaign Monitoring [Maybe Deprecated] |
| huntingzhu/Steam_Recommendation_System |
25 |
|
0 |
0 |
over 8 years ago |
0 |
|
0 |
|
Jupyter Notebook |
| Recommendation System, Collaborative Filtering, Spark, Hive, Flask, Web Crawler, AWS EC2, AWS RDS |
| r-spark/sparkwarc |
13 |
|
0 |
0 |
over 4 years ago |
4 |
January 11, 2022 |
0 |
apache-2.0 |
WebAssembly |
| Load WARC files into Apache Spark with sparklyr |