ScrappleA framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+759.26%)
OLX Scraper📻 An OLX Scraper using Scrapy + MongoDB. It Scrapes recent ads posted regarding requested product and dumps to NOSQL MONGODB.
Stars: ✭ 15 (-72.22%)
Netflix CloneNetflix like full-stack application with SPA client and backend implemented in service oriented architecture
Stars: ✭ 156 (+188.89%)
ArachnidPowerful web scraping framework for Crystal
Stars: ✭ 68 (+25.93%)
top-github-scraperScape top GitHub repositories and users based on keywords
Stars: ✭ 40 (-25.93%)
CascadiaGo cascadia package command line CSS selector
Stars: ✭ 67 (+24.07%)
Scrapyd Cluster On HerokuSet up free and scalable Scrapyd cluster for distributed web-crawling with just a few clicks. DEMO 👉
Stars: ✭ 106 (+96.3%)
Linkedin-ClientWeb scraper for grabing data from Linkedin profiles or company pages (personal project)
Stars: ✭ 42 (-22.22%)
scrapy-wayback-machineA Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Stars: ✭ 92 (+70.37%)
Web ScrapingDetailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, SHFE and news data crawlers on BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Stars: ✭ 153 (+183.33%)
SpidrA versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+1114.81%)
Juno crawlerScrapy crawler to collect data on the back catalog of songs listed for sale.
Stars: ✭ 150 (+177.78%)
Detect CmsPHP Library for detecting CMS
Stars: ✭ 78 (+44.44%)
PhpscraperPHP Scraper - an highly opinionated web-interface for PHP
Stars: ✭ 148 (+174.07%)
restaurant-finder-featureReviewsBuild a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).
Stars: ✭ 21 (-61.11%)
DaftlistingsA library that enables programmatic interaction with daft.ie. Daft.ie has nationwide coverage and contains about 80% of the total available properties in Ireland.
Stars: ✭ 86 (+59.26%)
Html MetadataMetaData html scraper and parser for Node.js (supports Promises and callback style)
Stars: ✭ 129 (+138.89%)
Scrape Linkedin Selenium`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.
Stars: ✭ 239 (+342.59%)
IMDB-ScraperScrapy project for scraping data from IMDB with Movie Dataset including 58,623 movies' data.
Stars: ✭ 37 (-31.48%)
scraping-ebayScraping Ebay's products using Scrapy Web Crawling Framework
Stars: ✭ 79 (+46.3%)
City ScrapersScrape, standardize and share public meetings from local government websites
Stars: ✭ 220 (+307.41%)
Php Curl ClassPHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
Stars: ✭ 2,903 (+5275.93%)
Project TauroA Router WiFi key recovery/cracking tool with a twist.
Stars: ✭ 52 (-3.7%)
SnoopSnoop — инструмент разведки на основе открытых данных (OSINT world)
Stars: ✭ 886 (+1540.74%)
WebhubbotPython + Scrapy + MongoDB . 5 million data per day !!!💥 The world's largest website.
Stars: ✭ 5,427 (+9950%)
WebmiddleNode.js framework for modular web scraping and data extraction
Stars: ✭ 13 (-75.93%)
ScrapyrtHTTP API for Scrapy spiders
Stars: ✭ 637 (+1079.63%)
IcrawlerA multi-thread crawler framework with many builtin image crawlers provided.
Stars: ✭ 629 (+1064.81%)
Voyages Sncf ApiA scrapy spider that scraps times and prices from Voyages Sncf. It uses scrapyrt to provide an API interface.
Stars: ✭ 7 (-87.04%)
CoolqlcoolNextjs server to query websites with GraphQL
Stars: ✭ 623 (+1053.7%)
Python Spider豆瓣电影top250、斗鱼爬取json数据以及爬取美女图片、淘宝、有缘、CrawlSpider爬取红娘网相亲人的部分基本信息以及红娘网分布式爬取和存储redis、爬虫小demo、Selenium、爬取多点、django开发接口、爬取有缘网信息、模拟知乎登录、模拟github登录、模拟图虫网登录、爬取多点商城整站数据、爬取微信公众号历史文章、爬取微信群或者微信好友分享的文章、itchat监听指定微信公众号分享的文章
Stars: ✭ 615 (+1038.89%)
Wescraper依赖Scrapy和搜狗搜索微信公众号文章
Stars: ✭ 46 (-14.81%)
App comments spider爬取百度贴吧、TapTap、appstore、微博官方博主上的游戏评论(基于redis_scrapy),过滤器采用了bloomfilter。
Stars: ✭ 38 (-29.63%)
Scrapy ClusterThis Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
Stars: ✭ 921 (+1605.56%)
Letterboxd recommendationsScraping publicly-accessible Letterboxd data and creating a movie recommendation model with it that can generate recommendations when provided with a Letterboxd username
Stars: ✭ 23 (-57.41%)
Wechatsogou基于搜狗微信搜索的微信公众号爬虫接口
Stars: ✭ 5,220 (+9566.67%)
Actor Google Search ScraperApify actor that crawls Google Search result pages (SERPs) and extracts a list of organic results, ads, related queries and more. It supports selection of custom country, language and location.
Stars: ✭ 38 (-29.63%)
Scrapy SeleniumScrapy middleware to handle javascript pages using selenium
Stars: ✭ 550 (+918.52%)
Pdf downloaderA Scrapy Spider for downloading PDF files from a webpage.
Stars: ✭ 18 (-66.67%)
FbcrawlA Facebook crawler
Stars: ✭ 536 (+892.59%)
Scrapy RedisRedis-based components for Scrapy.
Stars: ✭ 4,998 (+9155.56%)
Reptile🏀 Python3 网络爬虫实战(部分含详细教程)猫眼 腾讯视频 豆瓣 研招网 微博 笔趣阁小说 百度热点 B站 CSDN 网易云阅读 阿里文学 百度股票 今日头条 微信公众号 网易云音乐 拉勾 有道 unsplash 实习僧 汽车之家 英雄联盟盒子 大众点评 链家 LPL赛程 台风 梦幻西游、阴阳师藏宝阁 天气 牛客网 百度文库 睡前故事 知乎 Wish
Stars: ✭ 1,048 (+1840.74%)
ScrapymonSimple Web UI for Scrapy spider management via Scrapyd
Stars: ✭ 35 (-35.19%)
Scrapy Finance[OUTDATED] scrapy spiders to crawl the financial text data 📚 📜 pertinent to train word vectors 🚀
Stars: ✭ 17 (-68.52%)
Haipproxy💖 High available distributed ip proxy pool, powerd by Scrapy and Redis
Stars: ✭ 4,993 (+9146.3%)
SeekerSeeker - another job board aggregator.
Stars: ✭ 16 (-70.37%)
Awesome CrawlerA collection of awesome web crawler,spider in different languages
Stars: ✭ 4,793 (+8775.93%)