torchestratorSpin up Tor containers and then proxy HTTP requests via these Tor instances
Stars: ✭ 32 (-89.91%)
ScrappleA framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+46.37%)
ARGUSARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-78.55%)
SeleniumcrawlerAn example using Selenium webdrivers for Python and the Scrapy framework to create a web scraper that crawls an ASP site
Stars: ✭ 117 (-63.09%)
policy-data-analyzerBuilding a model to recognize incentives for landscape restoration in environmental policies from Latin America, the US and India. Bringing NLP to the world of policy analysis through an extensible framework that includes scraping, preprocessing, active learning and text analysis pipelines.
Stars: ✭ 22 (-93.06%)
DotnetcrawlerDotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output based on dotnet core. This library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible for your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-68.45%)
memes-apiAPI for scraping common meme sites
Stars: ✭ 17 (-94.64%)
LinkedinLinkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (-2.52%)
scrapy-fieldstatsA Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-94.64%)
RARBG-scraperWith Selenium headless browsing and CAPTCHA solving
Stars: ✭ 38 (-88.01%)
double-agentA test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (-61.2%)
Email ExtractorThe main functionality is to extract all the emails from one or several URLs
Stars: ✭ 81 (-74.45%)
scrapy-distributedA series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.
Stars: ✭ 38 (-88.01%)
proxiProxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (-89.91%)
Scrapy ClusterThis Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
Stars: ✭ 921 (+190.54%)
InstaBotSimple and friendly Bot for Instagram, using Selenium and Scrapy with Python.
Stars: ✭ 32 (-89.91%)
scrapy facebookerCollection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-93.06%)
web-clipperEasily download the main content of a web page in html, markdown, and/or epub format from command line.
Stars: ✭ 15 (-95.27%)
internet-affordability🌍 Dataset that shows the Internet affordability by country (a shocking reality!)
Stars: ✭ 13 (-95.9%)
logparserA tool for parsing Scrapy log files periodically and incrementally, extending the HTTP JSON API of Scrapyd.
Stars: ✭ 70 (-77.92%)
shupA POSIX shell script to parse HTML
Stars: ✭ 28 (-91.17%)
AngleParseHTML parsing and processing tool for PowerShell.
Stars: ✭ 35 (-88.96%)
bgmtoolsBangumi utilities
Stars: ✭ 66 (-79.18%)
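python-spiderSmall Python crawler projects [continuously updated] [Biquge novel downloads, Tweet data scraping, weather lookup, NetEase Cloud Music reverse engineering, Tiantian Fund lookup, Weibo data scraping (cookie generation), Youdao Translate reverse engineering, Qichacha login-free crawler, Dianping SVG encryption cracking, Bilibili user crawler, Lagou login-free crawler, Ziroom rental font encryption, Zhihu Q&A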
Stars: ✭ 45 (-85.8%)
naos📉 Uptime and error monitoring CLI
Stars: ✭ 30 (-90.54%)
dmi-instascraperA GUI for Instaloader to scrape users and hashtags on Instagram
Stars: ✭ 21 (-93.38%)
scraping-ebayScraping Ebay's products using Scrapy Web Crawling Framework
Stars: ✭ 79 (-75.08%)
kuwalaKuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…
Stars: ✭ 474 (+49.53%)
pompScreen scraping and web crawling framework
Stars: ✭ 61 (-80.76%)
restaurant-finder-featureReviewsBuild a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).
Stars: ✭ 21 (-93.38%)
IMDB-ScraperScrapy project for scraping data from IMDB with Movie Dataset including 58,623 movies' data.
Stars: ✭ 37 (-88.33%)
gunaydinYour good mornings ☀️
Stars: ✭ 16 (-94.95%)
OpenScraperAn open source webapp for scraping: towards a public service for webscraping
Stars: ✭ 80 (-74.76%)
hk0weatherWeb scraper project to collect the useful Hong Kong weather data from HKO website
Stars: ✭ 49 (-84.54%)
top-github-scraperScrape top GitHub repositories and users based on keywords
Stars: ✭ 40 (-87.38%)
document-dlCommand line program to download documents from web portals.
Stars: ✭ 14 (-95.58%)
chesfCHeSF is the Chrome Headless Scraping Framework, very alpha code for scraping JavaScript-intensive web pages
Stars: ✭ 18 (-94.32%)
subscene scraperLibrary to download subtitles from subscene.com
Stars: ✭ 14 (-95.58%)
XMQ-BackUpXiaomiquan (小密圈) backup: groups, topics, images, and files.
Stars: ✭ 22 (-93.06%)
GPlayCrawlerNo description or website provided.
Stars: ✭ 47 (-85.17%)
factoryDocker microservice & Crawler by scrapy
Stars: ✭ 56 (-82.33%)
scrapy.dartScrapy, a fast high-level web crawling & scraping framework for Dart and Flutter
Stars: ✭ 50 (-84.23%)
BOC FER SpiderUse Scrapy to crawl foreign exchange rates from BOC (Bank of China)
Stars: ✭ 18 (-94.32%)
go-scrapyWeb crawling and scraping framework for Golang
Stars: ✭ 17 (-94.64%)
OLX Scraper📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for the requested product and dumps them to MongoDB (NoSQL).
Stars: ✭ 15 (-95.27%)
ImageGrabberA Scrapy demo : Download all images from a site
Stars: ✭ 33 (-89.59%)
rubiumRubium is a lightweight alternative to Selenium/Capybara/Watir if you need to perform some operations (like web scraping) using Headless Chromium and Ruby
Stars: ✭ 65 (-79.5%)
JustDownlinkDistributed movie search built on Scrapy + Elasticsearch + Django
Stars: ✭ 28 (-91.17%)
elves🎊 Design and implementation of a lightweight crawler framework.
Stars: ✭ 322 (+1.58%)