Nimquery: Nim library for querying HTML using CSS selectors (like JavaScript's document.querySelector)
Api Store: Contains all the public APIs listed in Phantombuster's API store. Pull requests welcome!
Torrengo: Torrengo is a CLI (command line) program written in Go which concurrently searches torrents from various sources.
Mechaml: OCaml functional web scraping library
Mtnt: Code for the collection and analysis of the MTNT dataset
Artoo: artoo.js - the client-side scraping companion.
Configs: Public, free-to-use repository of digger configs for scraping and extracting data from various e-commerce websites and online stores
Pypatent: Search for and retrieve US Patent and Trademark Office patent data
Scrapy Cluster: This Scrapy project uses Redis and Kafka to create a distributed on-demand scraping cluster.
Instagram Scraper: Scrape the Instagram frontend. Inspired by twitter-scraper by @kennethreitz.
Webhere: HTML scraping for Objective-C.
Lulu: [Unmaintained] A simple and clean video/music/image downloader 👾
Imagescraper: ✂️ High performance, multi-threaded image scraper
Parsel: Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Tabula: Tabula is a tool for liberating data tables trapped inside PDF files
Gazpacho: 🥫 The simple, fast, and modern web scraping library
Oj: Tools for various online judges. Downloading sample cases, generating additional test cases, testing your code, and submitting it.
Facebook data analyzer: Analyze your copy of your Facebook data with Ruby. Download the zip file from Facebook and get info about friend rankings by message, vocabulary, contacts, friends-added statistics, and more
Nickjs: Web scraping library made by the Phantombuster team. Modern, simple & works on all websites. (Deprecated)
Geeksforgeeks.pdf: Topic-wise PDFs of Geeks for Geeks articles. (Last updated in October 2018)
Ferret: Declarative web scraping
Scrapple: A framework for creating semi-automatic web content extractors
Dataflowkit: Extract structured data from web sites. Web site scraping.
Crawly: Crawly, a high-level web crawling & scraping framework for Elixir.
Jekyll: Jekyll-based static site for The Programming Historian
Lookyloo: Lookyloo is a web interface that allows users to capture a website page and then display a tree of domains that call each other.
Undetected Chromedriver: Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Data Science: Collection of useful data science topics along with code and articles
Coronadatascraper: COVID-19 Coronavirus data scraped from government and curated data sources.
Comic Dl: Comic-dl is a command line tool to download manga and comics from various comic and manga sites. Supported sites: readcomiconline.to, mangafox.me, comic naver and many more.
Katana: A Python tool for Google hacking
Socialreaper: Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Tinking: 🧶 Extract data from any website without code, just clicks.
Spidermon: Scrapy extension for monitoring spider execution.
Linkedin: LinkedIn scraper using Selenium WebDriver, Chromium headless, Docker and Scrapy
Elixir Scrape: Scrape any website, article or RSS/Atom feed with ease!
Lambdasoup: Functional HTML scraping and rewriting with CSS in OCaml
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Apify Js: Apify SDK, the scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Mechanize: Mechanize is a Ruby library that makes automated web interaction easy.
facebook-discussion-tk: A collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.
jazz: The Scripting Engine that Combines Speed, Safety, and Simplicity
ARGUS: ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
bots-zoo: No description or website provided.