278 open source projects that are alternatives to or similar to dust

scavenger
Scrape and take screenshots of dynamic and static webpages
Stars: ✭ 14 (-26.32%)
Mutual labels:  scraping
Instagram-to-discord
Monitor an Instagram user account and automatically post new images to a Discord channel via a webhook. Working 2022!
Stars: ✭ 113 (+494.74%)
Mutual labels:  scraping
internet-affordability
🌍 Dataset that shows the Internet affordability by country (a shocking reality!)
Stars: ✭ 13 (-31.58%)
Mutual labels:  scraping
sg-food-ml
This script is used to scrape images from the Internet to classify 5 common noodle ("mee") dishes in Singapore: Wanton Mee, Bak Chor Mee, Lor Mee, Prawn Mee and Mee Siam.
Stars: ✭ 18 (-5.26%)
Mutual labels:  scraping
scrapman
Retrieve real (with JavaScript executed) HTML code from a URL; ultra fast, with support for loading multiple pages in parallel
Stars: ✭ 21 (+10.53%)
Mutual labels:  scraping
feedsearch-crawler
Crawl sites for RSS, Atom, and JSON feeds.
Stars: ✭ 23 (+21.05%)
Mutual labels:  scraping
InstaBot
Simple and friendly Bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (+68.42%)
Mutual labels:  scraping
kuwala
Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…
Stars: ✭ 474 (+2394.74%)
Mutual labels:  scraping
browser-automation-api
Browser automation API for repetitive web-based tasks, with a friendly user interface. You can use it to scrape content or do many other things, such as capture a screenshot, generate a PDF, extract content, or execute custom Puppeteer or Playwright functions.
Stars: ✭ 24 (+26.32%)
Mutual labels:  scraping
chesf
CHeSF is the Chrome Headless Scraping Framework, very alpha code for scraping JavaScript-intensive web pages
Stars: ✭ 18 (-5.26%)
Mutual labels:  scraping
swish
C++ HTTP requests for humans
Stars: ✭ 52 (+173.68%)
Mutual labels:  http-requests
LInkedIn-Reverese-Lookup
🔎 Search for a LinkedIn profile by email address 📧
Stars: ✭ 20 (+5.26%)
Mutual labels:  scraping
http interceptor
A lightweight, simple plugin that allows you to intercept request and response objects and modify them if desired.
Stars: ✭ 74 (+289.47%)
Mutual labels:  http-requests
torchestrator
Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (+68.42%)
Mutual labels:  scraping
naos
📉 Uptime and error monitoring CLI
Stars: ✭ 30 (+57.89%)
Mutual labels:  scraping
anime-scraper
[partially working] Scrape and add anime episode stream URLs to uGet (Linux) or IDM (Windows) ~ Python3
Stars: ✭ 21 (+10.53%)
Mutual labels:  scraping
ferenda
Transform unstructured document collections to structured Linked Data
Stars: ✭ 22 (+15.79%)
Mutual labels:  scraping
relay
Relay lets you write HTTP requests as easy to read, structured YAML and dispatch them easily using a CLI. Similar to tools like Postman
Stars: ✭ 22 (+15.79%)
Mutual labels:  http-requests
shup
A POSIX shell script to parse HTML
Stars: ✭ 28 (+47.37%)
Mutual labels:  scraping
ha-multiscrape
Home Assistant custom component for scraping multiple values (HTML, XML, or JSON) from a single HTTP request, with a separate sensor/attribute for each value. Supports (login) form-submit functionality.
Stars: ✭ 103 (+442.11%)
Mutual labels:  scraping
gunaydin
Your good mornings ☀️
Stars: ✭ 16 (-15.79%)
Mutual labels:  scraping
ksoup
Kotlin Wrapper for Jsoup
Stars: ✭ 59 (+210.53%)
Mutual labels:  scraping
AngleParse
HTML parsing and processing tool for PowerShell.
Stars: ✭ 35 (+84.21%)
Mutual labels:  scraping
puppeteer-botcheck
🕵‍♂ Bot detection tests for Puppeteer. Hide and seek!
Stars: ✭ 42 (+121.05%)
Mutual labels:  scraping
requester
The package provides a very thin wrapper (no external dependencies) for http.Client allowing the use of layers (middleware).
Stars: ✭ 14 (-26.32%)
Mutual labels:  http-requests
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (+173.68%)
Mutual labels:  scraping
selectorlib
A library to read a YML file with XPath or CSS selectors and extract data from HTML pages using them
Stars: ✭ 53 (+178.95%)
Mutual labels:  scraping
subscene scraper
Library to download subtitles from subscene.com
Stars: ✭ 14 (-26.32%)
Mutual labels:  scraping
ogpParser
Open Graph Protocol Parser for Node.js
Stars: ✭ 43 (+126.32%)
Mutual labels:  scraping
Scraper-Projects
🕸 List of mini projects that involve web scraping 🕸
Stars: ✭ 25 (+31.58%)
Mutual labels:  scraping
angel.co-companies-list-scraping
No description or website provided.
Stars: ✭ 54 (+184.21%)
Mutual labels:  scraping
node-fetch-har
Generate HAR entries for requests made with node-fetch
Stars: ✭ 23 (+21.05%)
Mutual labels:  http-requests
browser-pool
A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (+273.68%)
Mutual labels:  scraping
dmi-instascraper
A GUI for Instaloader to scrape users and hashtags on Instagram
Stars: ✭ 21 (+10.53%)
Mutual labels:  scraping
Scrapping
Mastering the art of scraping 🎓
Stars: ✭ 24 (+26.32%)
Mutual labels:  scraping
Captcha-Tools
All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha and Anticaptcha APIs!
Stars: ✭ 23 (+21.05%)
Mutual labels:  scraping
copycat
A PHP Scraping Class
Stars: ✭ 70 (+268.42%)
Mutual labels:  scraping
web-clipper
Easily download the main content of a web page in HTML, Markdown, and/or EPUB format from the command line.
Stars: ✭ 15 (-21.05%)
Mutual labels:  scraping
scrap
Scraping Facebook with JavaScript.
Stars: ✭ 25 (+31.58%)
Mutual labels:  scraping
proxi
Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (+68.42%)
Mutual labels:  scraping
proxycrawl-python
ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (+168.42%)
Mutual labels:  scraping
scrapy facebooker
Collection of Scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (+15.79%)
Mutual labels:  scraping
htmltab
Command-line utility to convert HTML tables into CSV files
Stars: ✭ 13 (-31.58%)
Mutual labels:  scraping
scrapy-distributed
A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
Stars: ✭ 38 (+100%)
Mutual labels:  scraping
FireMock
Mock and stub HTTP requests. Test your apps with fake data and files responses.
Stars: ✭ 25 (+31.58%)
Mutual labels:  http-requests
requestify
Parse a raw HTTP request and generate request code in different languages
Stars: ✭ 25 (+31.58%)
Mutual labels:  http-requests
zcrawl
An open source web crawling platform
Stars: ✭ 21 (+10.53%)
Mutual labels:  scraping
document-dl
Command line program to download documents from web portals.
Stars: ✭ 14 (-26.32%)
Mutual labels:  scraping
restpect
Succinct and readable integration tests over RESTful APIs
Stars: ✭ 83 (+336.84%)
Mutual labels:  http-requests
chirps
Twitter bot powering @arichduvet
Stars: ✭ 35 (+84.21%)
Mutual labels:  scraping
html-table-extractor
Extract data from HTML tables
Stars: ✭ 74 (+289.47%)
Mutual labels:  scraping
go-scrapy
Web crawling and scraping framework for Golang
Stars: ✭ 17 (-10.53%)
Mutual labels:  scraping
yttrex
YouTube & TikTok analysis + youchoose recommendation customizer. Backend, extensions, and tooling
Stars: ✭ 31 (+63.16%)
Mutual labels:  scraping
centra
Core Node.js HTTP client
Stars: ✭ 52 (+173.68%)
Mutual labels:  http-requests
SecurityHeaders GovUK
A scan of all .gov.uk sites for the most common security headers, or the lack of them
Stars: ✭ 14 (-26.32%)
Mutual labels:  http-requests
pomp
Screen scraping and web crawling framework
Stars: ✭ 61 (+221.05%)
Mutual labels:  scraping
nativescript-http
The best way to do HTTP requests in NativeScript, a drop-in replacement for the core HTTP with important improvements and additions like proper connection pooling, form data support and certificate pinning
Stars: ✭ 32 (+68.42%)
Mutual labels:  http-requests
image-collector
Download images from Google Image Search
Stars: ✭ 38 (+100%)
Mutual labels:  scraping
top-github-scraper
Scrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (+110.53%)
Mutual labels:  scraping
rubium
Rubium is a lightweight alternative to Selenium/Capybara/Watir if you need to perform some operations (like web scraping) using Headless Chromium and Ruby
Stars: ✭ 65 (+242.11%)
Mutual labels:  scraping