shorter.recipes: A website dedicated to making recipes from any website easy to read.
Stars: ✭ 27 (+107.69%)
proxycrawl-python: ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (+292.31%)
Architeuthis: MITM HTTP(S) proxy with integrated load-balancing, rate-limiting and error handling. Built for automated web scraping.
Stars: ✭ 35 (+169.23%)
anime-scraper: [partially working] Scrape and add anime episode stream URLs to uGet (Linux) or IDM (Windows) ~ Python3
Stars: ✭ 21 (+61.54%)
node-red-contrib-nbrowser: Provides a virtual web browser (a.k.a. "headless browser") appearing as a node.
Stars: ✭ 31 (+138.46%)
ogpParser: Open Graph Protocol Parser for Node.js
Stars: ✭ 43 (+230.77%)
RARBG-scraper: With Selenium headless browsing and CAPTCHA solving
Stars: ✭ 38 (+192.31%)
ha-multiscrape: Home Assistant custom component for scraping multiple values (HTML, XML or JSON) from a single HTTP request, with a separate sensor/attribute for each value. Supports (login) form-submit functionality.
Stars: ✭ 103 (+692.31%)
puppeteer-botcheck: 🕵♂ Bot detection tests for Puppeteer. Hide and seek!
Stars: ✭ 42 (+223.08%)
scrapers: Scrapers for building your own image databases
Stars: ✭ 46 (+253.85%)
scavenger: Scrape and take screenshots of dynamic and static webpages
Stars: ✭ 14 (+7.69%)
crawling-framework: Easily crawl news portals or blog sites using Storm Crawler.
Stars: ✭ 22 (+69.23%)
wget-lua: Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (+300%)
covid19br-pub: Project for monitoring official publications related to COVID-19 in Brazil.
Stars: ✭ 12 (-7.69%)
InstaBot: Simple and friendly bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (+146.15%)
asyncio-hn: Python (asyncio) wrapper for the Hacker News API
Stars: ✭ 27 (+107.69%)
namecoin-core: Namecoin full node + wallet based on the current Bitcoin Core codebase.
Stars: ✭ 425 (+3169.23%)
diffbot-php-client: [Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (+307.69%)
Instagram-to-discord: Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!
Stars: ✭ 113 (+769.23%)
ScrapeBot: A Selenium-driven tool for automated website interaction and scraping.
Stars: ✭ 16 (+23.08%)
socials: 👨👩👦 Social account detection and extraction in Python, e.g. for crawling/scraping.
Stars: ✭ 37 (+184.62%)
zcrawl: An open source web crawling platform
Stars: ✭ 21 (+61.54%)
etf4u: 📊 Python tool to scrape real-time information about ETFs from the web and mix them together by proportionally distributing their asset allocations
Stars: ✭ 29 (+123.08%)
browser-pool: A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+446.15%)
yttrex: YouTube & TikTok analysis + youchoose recommendation customizer. Backend, extensions, and tooling
Stars: ✭ 31 (+138.46%)
rubium: Rubium is a lightweight alternative to Selenium/Capybara/Watir if you need to perform some operations (like web scraping) using Headless Chromium and Ruby
Stars: ✭ 65 (+400%)
selectorlib: A library to read a YML file with XPath or CSS selectors and extract data from HTML pages using them
Stars: ✭ 53 (+307.69%)
Scrapping: Mastering the art of scraping 🎓
Stars: ✭ 24 (+84.62%)
reason-rust-scraper: 🦀 Scraping & crawling websites using Rust and ReasonML
Stars: ✭ 21 (+61.54%)
document-dl: Command line program to download documents from web portals.
Stars: ✭ 14 (+7.69%)
docker-selenium-lambda: The simplest demo of Chrome automation with Python and Selenium in AWS Lambda
Stars: ✭ 172 (+1223.08%)
copycat: A PHP Scraping Class
Stars: ✭ 70 (+438.46%)
illuminsight: 💡👀 Read EPUB books with built-in insights from wikis, definitions, translations, and Google.
Stars: ✭ 55 (+323.08%)
oversmash: Overwatch API library for player details and career stats
Stars: ✭ 42 (+223.08%)
scrap: Scraping Facebook with JavaScript.
Stars: ✭ 25 (+92.31%)
ioweb: Web Scraping Framework
Stars: ✭ 31 (+138.46%)
scrapy-distributed: A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
Stars: ✭ 38 (+192.31%)
scrapy-fieldstats: A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (+30.77%)
ProjectLockdown: Project Lockdown (an initiative from The IO Foundation) is a civic tech, interactive platform providing an overview of the state of Human and Digital Rights around the globe. It evaluates policies obtained from official sources that may impact their observance. It provides, among other tools, a layered map interface that allows for a visual repr…
Stars: ✭ 34 (+161.54%)
nrql-simple: nrql-simple provides a convenient way to interact with the New Relic Insights query API.
Stars: ✭ 13 (+0%)
sg-food-ml: This script is used to scrape images from the Internet to classify 5 common noodle ("mee") dishes in Singapore: Wanton Mee, Bak Chor Mee, Lor Mee, Prawn Mee and Mee Siam.
Stars: ✭ 18 (+38.46%)
4cat: The 4CAT Capture and Analysis Toolkit provides modular data capture & analysis for a variety of social media platforms.
Stars: ✭ 144 (+1007.69%)
htmltab: Command-line utility to convert HTML tables into CSV files
Stars: ✭ 13 (+0%)
chesf: CHeSF is the Chrome Headless Scraping Framework, very alpha code for scraping JavaScript-intensive web pages
Stars: ✭ 18 (+38.46%)
stateOfVeganism: 🌱 Get insights into the current state of Veganism around the world based on global news
Stars: ✭ 26 (+100%)
browser-automation-api: Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content or do many other things, such as capture a screenshot, generate a PDF, extract content or execute custom Puppeteer or Playwright functions.
Stars: ✭ 24 (+84.62%)
trafilatura: Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+5369.23%)
ksoup: Kotlin Wrapper for Jsoup
Stars: ✭ 59 (+353.85%)
tnb-analysis: Gain insights into the thenewboston digital cryptocurrency network through analysis
Stars: ✭ 24 (+84.62%)
gunaydin: Your good mornings ☀️
Stars: ✭ 16 (+23.08%)
go-scrapy: Web crawling and scraping framework for Golang
Stars: ✭ 17 (+30.77%)
torchestrator: Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (+146.15%)
scrapman: Retrieve real (with JavaScript executed) HTML code from a URL, ultra fast, with support for loading multiple pages in parallel
Stars: ✭ 21 (+61.54%)