
NightMachinary / readability-cli

License: Unlicense
A CLI for Mozilla Readability. Get clean, uncluttered, ready-to-read HTML from any webpage!

Projects that are alternatives of or similar to readability-cli

Reader
Extract clean(er), readable text from web pages via Mercury Web Parser.
Stars: ✭ 75 (+82.93%)
Mutual labels:  reader, readability, cleaner
ha-multiscrape
Home Assistant custom component for scraping (html, xml or json) multiple values (from a single HTTP request) with a separate sensor/attribute for each value. Support for (login) form-submit functionality.
Stars: ✭ 103 (+151.22%)
Mutual labels:  scraping, scrape
pupflare
A webpage proxy that sends requests through Chromium (puppeteer) - can be used to bypass Cloudflare anti-bot / anti-DDoS protection from any application (like curl)
Stars: ✭ 183 (+346.34%)
Mutual labels:  scrape, scraping-websites
scavenger
Scrape and take screenshots of dynamic and static webpages
Stars: ✭ 14 (-65.85%)
Mutual labels:  scraping, scraping-websites
reason-rust-scraper
🦀 Scraping & crawling websites using Rust, and ReasonML
Stars: ✭ 21 (-48.78%)
Mutual labels:  scraping, scraping-websites
scrapman
Retrieve real (with JavaScript executed) HTML code from a URL, ultra fast, with support for loading multiple pages in parallel
Stars: ✭ 21 (-48.78%)
Mutual labels:  scraping, scraping-websites
proxycrawl-python
ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (+24.39%)
Mutual labels:  scraping, scraping-websites
gochanges
**[ARCHIVED]** website changes tracker 🔍
Stars: ✭ 12 (-70.73%)
Mutual labels:  scraping, scraping-websites
document-dl
Command line program to download documents from web portals.
Stars: ✭ 14 (-65.85%)
Mutual labels:  scraping, scraping-websites
Elixir Scrape
Scrape any website, article or RSS/Atom Feed with ease!
Stars: ✭ 306 (+646.34%)
Mutual labels:  scraping, readability
Autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+9843.9%)
Mutual labels:  scraping, scrape
diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (+29.27%)
Mutual labels:  scraping, scrape
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+1634.15%)
Mutual labels:  scraping, readability
easy reader
⏮ ⏯ ⏭ A Rust library for easily navigating forward, backward or randomly through the lines of huge files.
Stars: ✭ 83 (+102.44%)
Mutual labels:  read, reader
scrapers
scrapers for building your own image databases
Stars: ✭ 46 (+12.2%)
Mutual labels:  scraping, scrape
Instagram-to-discord
Monitor instagram user account and automatically post new images to discord channel via a webhook. Working 2022!
Stars: ✭ 113 (+175.61%)
Mutual labels:  scraping, scraping-websites
web-clipper
Easily download the main content of a web page in html, markdown, and/or epub format from command line.
Stars: ✭ 15 (-63.41%)
Mutual labels:  webpage, scraping
PoReader
A local novel reader with dark mode support, book transfer over Wi-Fi, and clean, commented code
Stars: ✭ 41 (+0%)
Mutual labels:  read, reader
torchestrator
Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (-21.95%)
Mutual labels:  scraping, scraping-websites
Ferret
Declarative web scraping
Stars: ✭ 4,837 (+11697.56%)
Mutual labels:  scraping, scraping-websites

readability-cli

A CLI for Mozilla's Readability.

Install

To install globally with yarn:

yarn global add mozilla-readability-cli

To install globally with npm:

npm install -g mozilla-readability-cli

Usage

# Run readability --help for the current usage text; the copy below is only updated once in a while.

Usage: readability [options] <url>

Sanitizes stdin, parses the result with Mozilla Readability, somewhat sanitizes the output again, and finally prints it to stdout. Note that you need to also specify the URL in addition to feeding us the HTML in stdin. Using an empty URL seems to work though.

Options:
  -V, --version  output the version number
  -h, --help     display help for command

Examples

curl https://example.com | readability https://example.com

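# zsh helper: fetch a page and print its readable HTML to stdout.
readmoz () {
    local url="$1"
    # Pass the URL both to curl (to fetch) and to readability (as its <url> argument).
    local html="$(curl "$url" | readability "$url")"
    # print is a zsh builtin: -r keeps backslashes literal, -n omits the trailing newline.
    print -nr -- "$html"
}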
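
Since readability reads the HTML from stdin, it should also work on a page that is already saved to disk. The following is a minimal sketch, not part of the project: readability_file is a hypothetical helper name, and it relies on the usage text's remark that an empty URL "seems to work" when no source URL is known.

```shell
# Hypothetical helper (not shipped with readability-cli).
# $1: path to a local HTML file
# $2: optional source URL; defaults to an empty string
readability_file() {
    if [ ! -r "$1" ]; then
        printf 'readability_file: cannot read %s\n' "$1" >&2
        return 1
    fi
    # Feed the file on stdin and pass the URL as the positional argument.
    readability "${2:-}" < "$1"
}
```

For example, readability_file page.html https://example.com > clean.html would write the cleaned HTML to clean.html, assuming page.html exists in the current directory.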