scavenger: Scrape and take screenshots of dynamic and static webpages
Stars: ✭ 14 (-26.32%)
Instagram-to-discord: Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working as of 2022!
Stars: ✭ 113 (+494.74%)
internet-affordability: 🌍 Dataset that shows Internet affordability by country (a shocking reality!)
Stars: ✭ 13 (-31.58%)
sg-food-ml: This script scrapes images from the Internet for classifying 5 common noodle "mee" dishes in Singapore: Wanton Mee, Bak Chor Mee, Lor Mee, Prawn Mee and Mee Siam.
Stars: ✭ 18 (-5.26%)
scrapman: Retrieve the real HTML code (with JavaScript executed) from a URL. Ultra fast, with support for loading multiple pages in parallel
Stars: ✭ 21 (+10.53%)
InstaBot: Simple and friendly bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (+68.42%)
kuwala: Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…
Stars: ✭ 474 (+2394.74%)
browser-automation-api: Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content or do many other things, such as capture a screenshot, generate a PDF, extract content or execute custom Puppeteer or Playwright functions.
Stars: ✭ 24 (+26.32%)
chesf: CHeSF is the Chrome Headless Scraping Framework, very alpha code for scraping JavaScript-intensive web pages
Stars: ✭ 18 (-5.26%)
swish: C++ HTTP requests for humans
Stars: ✭ 52 (+173.68%)
http interceptor: A lightweight, simple plugin that allows you to intercept request and response objects and modify them if desired.
Stars: ✭ 74 (+289.47%)
torchestrator: Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (+68.42%)
naos: 📉 Uptime and error monitoring CLI
Stars: ✭ 30 (+57.89%)
anime-scraper: [partially working] Scrape and add anime episode stream URLs to uGet (Linux) or IDM (Windows) ~ Python3
Stars: ✭ 21 (+10.53%)
ferenda: Transform unstructured document collections to structured Linked Data
Stars: ✭ 22 (+15.79%)
relay: Relay lets you write HTTP requests as easy-to-read, structured YAML and dispatch them easily using a CLI. Similar to tools like Postman
Stars: ✭ 22 (+15.79%)
shup: A POSIX shell script to parse HTML
Stars: ✭ 28 (+47.37%)
ha-multiscrape: Home Assistant custom component for scraping multiple values (HTML, XML or JSON) from a single HTTP request, with a separate sensor/attribute for each value. Supports (login) form-submit functionality.
Stars: ✭ 103 (+442.11%)
gunaydin: Your good mornings ☀️
Stars: ✭ 16 (-15.79%)
ksoup: Kotlin Wrapper for Jsoup
Stars: ✭ 59 (+210.53%)
AngleParse: HTML parsing and processing tool for PowerShell.
Stars: ✭ 35 (+84.21%)
puppeteer-botcheck: 🕵️‍♂️ Bot detection tests for Puppeteer. Hide and seek!
Stars: ✭ 42 (+121.05%)
requester: The package provides a very thin wrapper (no external dependencies) for http.Client, allowing the use of layers (middleware).
Stars: ✭ 14 (-26.32%)
wget-lua: Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (+173.68%)
selectorlib: A library that reads a YAML file of XPath or CSS selectors and uses them to extract data from HTML pages
Stars: ✭ 53 (+178.95%)
subscene scraper: Library to download subtitles from subscene.com
Stars: ✭ 14 (-26.32%)
ogpParser: Open Graph Protocol Parser for Node.js
Stars: ✭ 43 (+126.32%)
Scraper-Projects: 🕸 List of mini projects that involve web scraping 🕸
Stars: ✭ 25 (+31.58%)
node-fetch-har: Generate HAR entries for requests made with node-fetch
Stars: ✭ 23 (+21.05%)
browser-pool: A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+273.68%)
dmi-instascraper: A GUI for Instaloader to scrape users and hashtags on Instagram
Stars: ✭ 21 (+10.53%)
Scrapping: Mastering the art of scraping 🎓
Stars: ✭ 24 (+26.32%)
Captcha-Tools: All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha and Anticaptcha APIs!
Stars: ✭ 23 (+21.05%)
copycat: A PHP Scraping Class
Stars: ✭ 70 (+268.42%)
web-clipper: Easily download the main content of a web page in HTML, Markdown, and/or EPUB format from the command line.
Stars: ✭ 15 (-21.05%)
scrap: Scraping Facebook with JavaScript.
Stars: ✭ 25 (+31.58%)
proxi: Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (+68.42%)
proxycrawl-python: ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (+168.42%)
scrapy facebooker: Collection of Scrapy spiders that can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (+15.79%)
htmltab: Command-line utility to convert HTML tables into CSV files
Stars: ✭ 13 (-31.58%)
scrapy-distributed: A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
Stars: ✭ 38 (+100%)
FireMock: Mock and stub HTTP requests. Test your apps with fake data and file responses.
Stars: ✭ 25 (+31.58%)
requestify: Parse a raw HTTP request and generate request code in different languages
Stars: ✭ 25 (+31.58%)
zcrawl: An open source web crawling platform
Stars: ✭ 21 (+10.53%)
document-dl: Command-line program to download documents from web portals.
Stars: ✭ 14 (-26.32%)
restpect: Succinct and readable integration tests over RESTful APIs
Stars: ✭ 83 (+336.84%)
chirps: Twitter bot powering @arichduvet
Stars: ✭ 35 (+84.21%)
go-scrapy: Web crawling and scraping framework for Golang
Stars: ✭ 17 (-10.53%)
yttrex: YouTube & TikTok analysis + youchoose recommendation customizer. Backend, extensions, and tooling
Stars: ✭ 31 (+63.16%)
centra: Core Node.js HTTP client
Stars: ✭ 52 (+173.68%)
SecurityHeaders GovUK: A scan of all .gov.uk sites for the most common security headers, or the lack of them
Stars: ✭ 14 (-26.32%)
pomp: Screen scraping and web crawling framework
Stars: ✭ 61 (+221.05%)
nativescript-http: The best way to do HTTP requests in NativeScript: a drop-in replacement for the core HTTP module with important improvements and additions like proper connection pooling, form data support and certificate pinning
Stars: ✭ 32 (+68.42%)
image-collector: Download images from Google Image Search
Stars: ✭ 38 (+100%)
top-github-scraper: Scrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (+110.53%)
rubium: Rubium is a lightweight alternative to Selenium/Capybara/Watir if you need to perform some operations (like web scraping) using Headless Chromium and Ruby
Stars: ✭ 65 (+242.11%)