Hive: lots of spiders
Stars: ✭ 110 (-76.29%)
Scrapy Cluster: This Scrapy project uses Redis and Kafka to create a distributed on-demand scraping cluster.
Stars: ✭ 921 (+98.49%)
Netflix Clone: Netflix-like full-stack application with an SPA client and a backend implemented in a service-oriented architecture
Stars: ✭ 156 (-66.38%)
Juno crawler: Scrapy crawler to collect data on the back catalog of songs listed for sale.
Stars: ✭ 150 (-67.67%)
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+942.46%)
Fbcrawl: A Facebook crawler
Stars: ✭ 536 (+15.52%)
Scrapy Redis: Redis-based components for Scrapy.
Stars: ✭ 4,998 (+977.16%)
Scraper-Projects: 🕸 List of mini projects that involve web scraping 🕸
Stars: ✭ 25 (-94.61%)
Lulu: [Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+70.04%)
Linkedin-Client: Web scraper for grabbing data from LinkedIn profiles or company pages (personal project)
Stars: ✭ 42 (-90.95%)
proxi: Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (-93.1%)
scrapy facebooker: Collection of Scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-95.26%)
scrapy-zyte-smartproxy: Zyte Smart Proxy Manager (formerly Crawlera) middleware for Scrapy
Stars: ✭ 317 (-31.68%)
Scrapy: Scrapy, a fast high-level web crawling & scraping framework for Python.
Stars: ✭ 42,343 (+9025.65%)
Webmagic: A scalable web crawler framework for Java.
Stars: ✭ 10,186 (+2095.26%)
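Qqmusicspider: A Scrapy-based QQ Music spider that crawls song information, lyrics, top comments, and more. It also shares a music corpus of 490,000+ entries from the top 6,400 mainland Chinese, Hong Kong, and Taiwanese artists on QQ Music.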
Stars: ✭ 120 (-74.14%)
Marmot: 💐 Marmot | Web Crawler/HTTP protocol Download Package 🐭
Stars: ✭ 186 (-59.91%)
Linkedin Profile Scraper: 🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (-63.15%)
Antch: Antch, a fast, powerful and extensible web crawling & scraping framework for Go
Stars: ✭ 198 (-57.33%)
Goose Parser: Universal scraping tool, which allows you to extract data using multiple environments
Stars: ✭ 211 (-54.53%)
Colly: Elegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+3248.06%)
Filesensor: Dynamic sensitive-file detection tool based on a crawler
Stars: ✭ 227 (-51.08%)
Iclr2019 Openreviewdata: Script that crawls metadata from the ICLR OpenReview webpage. Tutorials on installing and using Selenium and ChromeDriver on Ubuntu.
Stars: ✭ 376 (-18.97%)
chopper: Chopper is a tool to extract elements from HTML by preserving ancestors and CSS rules
Stars: ✭ 22 (-95.26%)
ioweb: Web Scraping Framework
Stars: ✭ 31 (-93.32%)
scrapy-fieldstats: A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-96.34%)
Data-Wrangling-with-Python: Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices
Stars: ✭ 90 (-80.6%)
Euro2016 TerminalApp: ⚽ Instantly find 🏆 EURO 2016 live-streams & highlights, now a Web App!
Stars: ✭ 54 (-88.36%)
browser-pool: A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
Stars: ✭ 71 (-84.7%)
scrapy-distributed: A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
Stars: ✭ 38 (-91.81%)
InstaBot: Simple and friendly Bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (-93.1%)
restaurant-finder-featureReviews: Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying a market-basket model, the Apriori algorithm, and NLP to user reviews).
Stars: ✭ 21 (-95.47%)
memes-api: API for scraping common meme sites
Stars: ✭ 17 (-96.34%)
policy-data-analyzer: Building a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-95.26%)
PythonScrapyBasicSetup: Basic setup with random user agents and IP addresses for Python Scrapy Framework.
Stars: ✭ 57 (-87.72%)
TorScrapper: A Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)
Stars: ✭ 24 (-94.83%)
raspagem-de-dados-fatec: 📓 Short course on web data scraping with Python, taught at the Technology Week of FATEC Jundiaí
Stars: ✭ 22 (-95.26%)
MediumScraper: Scrapes Medium articles and provides audio versions (📑 to 🔊) using Django
Stars: ✭ 12 (-97.41%)
Php Curl Class: PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
Stars: ✭ 2,903 (+525.65%)
ARGUS: ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-85.34%)
Apify Js: Apify SDK, the scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+579.74%)
bots-zoo: No description or website provided.
Stars: ✭ 59 (-87.28%)
Sasila: A flexible, friendly crawler framework
Stars: ✭ 286 (-38.36%)
Crawlertutorial: A minimalist crawler tutorial (fetch, parse, search, multiprocessing, API), using PTT as the example
Stars: ✭ 282 (-39.22%)
Requests Html: Pythonic HTML Parsing for Humans™
Stars: ✭ 12,268 (+2543.97%)