Miner Alternatives

Name: yoozi/miner
Rating: 4.42 (13 reviews)

Miner is a PHP library that extracts metadata and notable text content (such as the author and summary) from HTML pages. It acts like a simplified version of the HTML metadata parser in Apache Tika.

License: MIT
Most Recent Commit: over 11 years ago
Programming Language: PHP
Latest Release: July 24, 2014

Categories

Programming Languages > PHP

Web User Interface > HTML

Blockchain > Mining

Text Processing > Webpage

Text Processing > Readability

Data Processing > Tika

Alternatives To yoozi/miner

Project Name	Stars	Repos Using This	Packages Using This	Most Recent Commit	Total Releases	Latest Release	Open Issues	License	Language	Description
laurilehmijoki/s3_website	2,259	606	0	over 3 years ago	109	October 11, 2017	76	other	Scala	Manage an S3 website: sync, deliver via CloudFront, benefit from advanced S3 website features.
apache/tika	2,007	1,687	570	over 2 years ago	66	October 17, 2023	49	apache-2.0	Java	The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
chrismattmann/tika-python	1,316	83	54	almost 3 years ago	35	January 02, 2023	4	apache-2.0	Python	Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
dadoonet/fscrawler	1,279	0	1	over 2 years ago	5	January 10, 2022	145	apache-2.0	Java	Elasticsearch File System Crawler (FS Crawler)
pemistahl/lingua	622	0	3	over 2 years ago	17	August 02, 2022	15	apache-2.0	Kotlin	The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike
ICIJ/datashare	519	0	0	over 2 years ago	135	November 21, 2023	17	agpl-3.0	Java	A self-hosted search engine for documents.
USCDataScience/sparkler	401	0	0	over 3 years ago	0		55	apache-2.0	Java	Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
pcbje/gransk	237	0	0	over 9 years ago	0		3	apache-2.0	Python	Document processing for investigations
ICIJ/extract	229	0	1	over 2 years ago	58	November 13, 2023	10	mit	Java	A cross-platform command line tool for parallelised content extraction and analysis.
michaelklishin/pantomime	171	27	0	about 7 years ago	27	January 19, 2018	3		JavaScript	A tiny Clojure library that deals with MIME types (Internet media types)

Popular Mining Projects

xmrig/xmrig ⭐ 8,099

RandomX, KawPow, CryptoNight and GhostRider unified CPU/GPU miner and RandomX benchmark

yzhao062/anomaly-detection-resources ⭐ 7,616

Anomaly detection related books, papers, videos, and toolboxes

tycrek/degoogle ⭐ 7,136

A huge list of alternatives to Google products. Privacy tips, tricks, and links.

rushter/data-science-blogs ⭐ 6,048

A curated list of data science blogs

ethereum-mining/ethminer ⭐ 5,886

Ethereum miner with OpenCL, CUDA and stratum support
