Spidr: A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+4273.33%)
Scrapple: A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+2993.33%)
Phpscraper: PHP Scraper - a highly opinionated web interface for PHP
Stars: ✭ 148 (+886.67%)
Scrape Linkedin Selenium: `scrape_linkedin` is a Python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured JSON.
Stars: ✭ 239 (+1493.33%)
Awesome Crawler: A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+31853.33%)
Scrapy Craigslist: Web Scraping Craigslist's Engineering Jobs in NY with Scrapy
Stars: ✭ 54 (+260%)
Linkedin-Client: Web scraper for grabbing data from LinkedIn profiles or company pages (personal project)
Stars: ✭ 42 (+180%)
sp-subway-scraper: 🚆 This web scraper builds a dataset for São Paulo subway operation status
Stars: ✭ 24 (+60%)
Scrapers: A list of scrapers from around the web.
Stars: ✭ 366 (+2340%)
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+32146.67%)
evine: Interactive CLI Web Crawler
Stars: ✭ 140 (+833.33%)
scrapy facebooker: Collection of Scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (+46.67%)
Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+27080%)
papercut: Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (+0%)
ScrapeM: A monadic web scraping library
Stars: ✭ 17 (+13.33%)
Scrapoxy: Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!
Stars: ✭ 1,322 (+8713.33%)
Rod: A Devtools driver for web automation and scraping
Stars: ✭ 1,392 (+9180%)
Warta Scrap: Indonesia Index News Crawler, including 10 online media outlets
Stars: ✭ 57 (+280%)
Sillynium: Automate the creation of Python Selenium Scripts by drawing coloured boxes on webpage elements
Stars: ✭ 100 (+566.67%)
Seleniumcrawler: An example using Selenium webdrivers for Python and the Scrapy framework to create a web scraper to crawl an ASP site
Stars: ✭ 117 (+680%)
Getsy: A simple browser/client-side web scraper.
Stars: ✭ 238 (+1486.67%)
TradeTheEvent: Implementation of "Trade the Event: Corporate Events Detection for News-Based Event-Driven Trading." In Findings of ACL 2021
Stars: ✭ 64 (+326.67%)
lopez: Crawling and scraping the Web for fun and profit
Stars: ✭ 20 (+33.33%)
imdb-scraper: 🎬 An attempt at the most complete IMDb API
Stars: ✭ 24 (+60%)
metafetch: NodeJS package that fetches a given URL's title, description, images, links etc.
Stars: ✭ 21 (+40%)
AzurLaneWikiScrapers: A console application that can scrape the Azur Lane wiki and export the data to JSON files
Stars: ✭ 12 (-20%)
torchestrator: Spin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (+113.33%)
Linkedin: LinkedIn Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (+1960%)
Mimo-Crawler: A web crawler that uses Firefox and JS injection to interact with webpages and crawl their content, written in Node.js.
Stars: ✭ 22 (+46.67%)
Scrapyrt: HTTP API for Scrapy spiders
Stars: ✭ 637 (+4146.67%)
Voyages Sncf Api: A Scrapy spider that scrapes times and prices from Voyages Sncf. It uses scrapyrt to provide an API interface.
Stars: ✭ 7 (-53.33%)
Fbcrawl: A Facebook crawler
Stars: ✭ 536 (+3473.33%)
Hockey Scraper: Python Package for scraping NHL Play-by-Play and Shift data
Stars: ✭ 93 (+520%)
Email Extractor: The main functionality is to extract all the emails from one or several URLs
Stars: ✭ 81 (+440%)
OpenScraper: An open source webapp for scraping: towards a public service for webscraping
Stars: ✭ 80 (+433.33%)
Ruiji.net: Crawler framework, distributed crawler extractor
Stars: ✭ 220 (+1366.67%)
gochanges: **[ARCHIVED]** website changes tracker 🔍
Stars: ✭ 12 (-20%)
scrapy-wayback-machine: A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Stars: ✭ 92 (+513.33%)
ant: A web crawler for Go
Stars: ✭ 264 (+1660%)
yellowpages-scraper: Yellowpages.com Web Scraper written in Python and LXML to extract business details available based on a particular category and location.
Stars: ✭ 56 (+273.33%)
Goribot: [Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.
Stars: ✭ 190 (+1166.67%)
scrapy-LBC: LeBonCoin spider with Scrapy and ElasticSearch
Stars: ✭ 14 (-6.67%)
doc crawler.py: Explore a website recursively and download all the wanted documents (PDF, ODT…)
Stars: ✭ 22 (+46.67%)
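TikTokDownloader PyWebIO: 🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous Douyin | TikTok data scraping tool that supports API calls and online batch parsing and downloading.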
Stars: ✭ 919 (+6026.67%)
ioweb: Web Scraping Framework
Stars: ✭ 31 (+106.67%)
scrapman: Retrieve real (with JavaScript executed) HTML code from a URL, ultra fast, with support for loading multiple pages in parallel
Stars: ✭ 21 (+40%)
saveddit: Bulk Downloader for Reddit
Stars: ✭ 130 (+766.67%)
rymscraper: Python API to extract data from rateyourmusic.com.
Stars: ✭ 63 (+320%)
LeetCode: Currently contains scraped data from around 1500 problems on the site. More to follow....
Stars: ✭ 45 (+200%)
document-dl: Command line program to download documents from web portals.
Stars: ✭ 14 (-6.67%)
Zillow: Zillow Scraper for Python using Selenium
Stars: ✭ 141 (+840%)