Reader: Extract clean(er), readable text from web pages via Mercury Web Parser.
Stars: ✭ 75 (+82.93%)
Mutual labels: reader, readability, cleaner
ha-multiscrape: Home Assistant custom component for scraping multiple values (HTML, XML, or JSON) from a single HTTP request, with a separate sensor/attribute for each value. Supports (login) form-submit functionality.
Stars: ✭ 103 (+151.22%)
Mutual labels: scraping, scrape
pupflare: A webpage proxy that makes requests through Chromium (Puppeteer) - can be used to bypass Cloudflare anti-bot / anti-DDoS protection from any application (such as curl)
Stars: ✭ 183 (+346.34%)
Mutual labels: scrape, scraping-websites
scavenger: Scrape and take screenshots of dynamic and static webpages
Stars: ✭ 14 (-65.85%)
Mutual labels: scraping, scraping-websites
reason-rust-scraper: 🦀 Scraping & crawling websites using Rust and ReasonML
Stars: ✭ 21 (-48.78%)
Mutual labels: scraping, scraping-websites
scrapman: Retrieve the real HTML (with JavaScript executed) from a URL, ultra fast, with support for loading multiple pages in parallel
Stars: ✭ 21 (-48.78%)
Mutual labels: scraping, scraping-websites
proxycrawl-python: ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (+24.39%)
Mutual labels: scraping, scraping-websites
gochanges: **[ARCHIVED]** Website changes tracker 🔍
Stars: ✭ 12 (-70.73%)
Mutual labels: scraping, scraping-websites
document-dl: Command-line program to download documents from web portals.
Stars: ✭ 14 (-65.85%)
Mutual labels: scraping, scraping-websites
Elixir Scrape: Scrape any website, article or RSS/Atom Feed with ease!
Stars: ✭ 306 (+646.34%)
Mutual labels: scraping, readability
Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+9843.9%)
Mutual labels: scraping, scrape
diffbot-php-client: [Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (+29.27%)
Mutual labels: scraping, scrape
trafilatura: Python & command-line tool to gather text on the Web (web crawling/scraping, extraction of text, metadata, comments)
Stars: ✭ 711 (+1634.15%)
Mutual labels: scraping, readability
easy reader: ⏮ ⏯ ⏭ A Rust library for easily navigating forward, backward or randomly through the lines of huge files.
Stars: ✭ 83 (+102.44%)
Mutual labels: read, reader
scrapers: Scrapers for building your own image databases
Stars: ✭ 46 (+12.2%)
Mutual labels: scraping, scrape
Instagram-to-discord: Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!
Stars: ✭ 113 (+175.61%)
Mutual labels: scraping, scraping-websites
web-clipper: Easily download the main content of a web page in HTML, Markdown, and/or EPUB format from the command line.
Stars: ✭ 15 (-63.41%)
Mutual labels: webpage, scraping
PoReader: Local novel reader with dark mode support, book upload over Wi-Fi, and clean, commented code
Stars: ✭ 41 (+0%)
Mutual labels: read, reader
torchestrator: Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (-21.95%)
Mutual labels: scraping, scraping-websites
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+11697.56%)
Mutual labels: scraping, scraping-websites