1011 open source projects that are alternatives to or similar to Scrapyrt

Fbcrawl
A Facebook crawler
Stars: ✭ 536 (-15.86%)
Mutual labels:  crawler, scraper, scrapy
Dotnetcrawler
DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but can be extended to meet your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-84.3%)
Mutual labels:  crawler, scrapy, crawling
Scrapoxy
Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!
Stars: ✭ 1,322 (+107.54%)
Mutual labels:  crawler, scraper, scrapy
Crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (-30.93%)
Mutual labels:  crawler, scraper, crawling
Lulu
[Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+23.86%)
Mutual labels:  crawler, scraper, crawling
Newspaper
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Stars: ✭ 11,545 (+1712.4%)
Mutual labels:  crawler, scraper, crawling
Headless Chrome Crawler
Distributed crawler powered by Headless Chrome
Stars: ✭ 5,129 (+705.18%)
Mutual labels:  crawler, scraper, crawling
Ferret
Declarative web scraping
Stars: ✭ 4,837 (+659.34%)
Mutual labels:  crawler, scraper, crawling
Easy Scraping Tutorial
Simple but useful Python web scraping tutorial code.
Stars: ✭ 583 (-8.48%)
Mutual labels:  crawler, scrapy, crawling
bots-zoo
No description or website provided.
Stars: ✭ 59 (-90.74%)
Mutual labels:  crawler, scraper, crawling
Colly
Elegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+2338.78%)
Mutual labels:  crawler, scraper, crawling
Linkedin Profile Scraper
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (-73.16%)
Mutual labels:  crawler, scraper, crawling
Goribot
[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.
Stars: ✭ 190 (-70.17%)
Mutual labels:  crawler, scraper, scrapy
Ruiji.net
crawler framework, distributed crawler extractor
Stars: ✭ 220 (-65.46%)
Mutual labels:  crawler, scraper, scrapy
double-agent
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (-80.69%)
Mutual labels:  crawling, scrapy
diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (-91.68%)
Mutual labels:  scraper, crawling
aioScrapy
An asynchronous coroutine crawler framework based on asyncio and aiohttp. Stars welcome.
Stars: ✭ 34 (-94.66%)
Mutual labels:  twisted, scrapy
Wechatsogou
A crawler interface for WeChat official accounts, based on Sogou WeChat search
Stars: ✭ 5,220 (+719.47%)
Mutual labels:  crawler, scrapy
proxycrawl-python
ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (-91.99%)
Mutual labels:  scraper, crawling
OpenScraper
An open source webapp for scraping: towards a public service for webscraping
Stars: ✭ 80 (-87.44%)
Mutual labels:  scraper, scrapy
scrapy-distributed
A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.
Stars: ✭ 38 (-94.03%)
Mutual labels:  crawling, scrapy
papercut
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-97.65%)
Mutual labels:  crawler, scraper
img-cli
An interactive command-line interface built in Node.js for downloading one or more images to disk from a URL
Stars: ✭ 15 (-97.65%)
Mutual labels:  crawler, crawling
Scrapy Selenium
Scrapy middleware to handle javascript pages using selenium
Stars: ✭ 550 (-13.66%)
Mutual labels:  scrapy, crawling
Nintendo Switch Eshop
Crawler for Nintendo Switch eShop
Stars: ✭ 463 (-27.32%)
Mutual labels:  crawler, scraper
crawlkit
A crawler based on Phantom. Allows discovery of dynamic content and supports custom scrapers.
Stars: ✭ 23 (-96.39%)
Mutual labels:  scraper, crawling
flink-crawler
Continuous scalable web crawler built on top of Flink and crawler-commons
Stars: ✭ 48 (-92.46%)
Mutual labels:  crawler, crawling
Dataflowkit
Extract structured data from web sites. Web sites scraping.
Stars: ✭ 456 (-28.41%)
Mutual labels:  scraper, crawling
Scrapedin
LinkedIn Scraper (currently working 2020)
Stars: ✭ 453 (-28.89%)
Mutual labels:  crawler, scraper
Haipproxy
💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
Stars: ✭ 4,993 (+683.83%)
Mutual labels:  crawler, scrapy
scrapy-LBC
LeBonCoin spider with Scrapy and ElasticSearch
Stars: ✭ 14 (-97.8%)
Mutual labels:  scraper, scrapy
scrapy-fieldstats
A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-97.33%)
Mutual labels:  crawling, scrapy
Polite
Be nice on the web
Stars: ✭ 253 (-60.28%)
Mutual labels:  crawler, scraper
OLX Scraper
📻 An OLX scraper using Scrapy + MongoDB. It scrapes recent ads posted for a requested product and dumps them to MongoDB (NoSQL).
Stars: ✭ 15 (-97.65%)
Mutual labels:  scraper, scrapy
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-91.84%)
Mutual labels:  scraper, crawling
Skrape.it
A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Stars: ✭ 231 (-63.74%)
Mutual labels:  crawler, scraper
Awesome Crawler
A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+652.43%)
Mutual labels:  crawler, scraper
scrapy facebooker
Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-96.55%)
Mutual labels:  scraper, scrapy
Mimo-Crawler
A web crawler that uses Firefox and js injection to interact with webpages and crawl their content, written in nodejs.
Stars: ✭ 22 (-96.55%)
Mutual labels:  scraper, crawling
Scrapple
A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (-27.16%)
Mutual labels:  crawler, scrapy
arachnod
High performance crawler for Nodejs
Stars: ✭ 17 (-97.33%)
Mutual labels:  crawler, scraper
Ecommercecrawlers
Gitee repository: AJay13/ECommerceCrawlers. GitHub repository: DropsDevopsOrg/ECommerceCrawlers. Project showcase platform: http://wechat.doonsec.com
Stars: ✭ 3,073 (+382.42%)
Mutual labels:  crawler, scrapy
ARGUS
ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-89.32%)
Mutual labels:  crawling, scrapy
weibo-scraper
Simple Weibo Scraper
Stars: ✭ 50 (-92.15%)
Mutual labels:  crawler, scraper
Scrapy Redis
Redis-based components for Scrapy.
Stars: ✭ 4,998 (+684.62%)
Mutual labels:  crawler, scrapy
Bookcorpus
Crawl BookCorpus
Stars: ✭ 443 (-30.46%)
Mutual labels:  crawler, scraper
lightnovel epub
🍭 EPUB generator for (light) novels. Supported sites: 轻之国度, 轻小说文库
Stars: ✭ 89 (-86.03%)
Mutual labels:  crawler, scraper
Spidy
The simple, easy to use command line web crawler.
Stars: ✭ 257 (-59.65%)
Mutual labels:  crawler, crawling
Weibo terminator workflow
Updated version of weibo_terminator. This is the workflow version, aimed at getting the job done!
Stars: ✭ 259 (-59.34%)
Mutual labels:  crawler, scraper
Gopa
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (-56.51%)
Mutual labels:  crawler, crawling
dijnet-bot
All your bills in yet another place :)
Stars: ✭ 17 (-97.33%)
Mutual labels:  crawler, scraper
MyCrawler
My crawler collection
Stars: ✭ 55 (-91.37%)
Mutual labels:  crawler, scraper
Rcrawler
An R web crawler and scraper
Stars: ✭ 274 (-56.99%)
Mutual labels:  crawler, scraper
Scrapy Crawlera
Crawlera middleware for Scrapy
Stars: ✭ 281 (-55.89%)
Mutual labels:  crawler, scrapy
Autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+540.03%)
Mutual labels:  crawler, scraper
Linkedin
Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Stars: ✭ 309 (-51.49%)
Mutual labels:  scraper, scrapy
Xcrawler
A fast, concise, and powerful PHP crawler framework
Stars: ✭ 344 (-46%)
Mutual labels:  crawler, scraper
Freshonions Torscraper
Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion
Stars: ✭ 348 (-45.37%)
Mutual labels:  crawler, scraper
Hquery.php
An extremely fast web scraper that parses megabytes of invalid HTML in a blink of an eye. PHP5.3+, no dependencies.
Stars: ✭ 295 (-53.69%)
Mutual labels:  crawler, scraper
Vault
swiss army knife for hackers
Stars: ✭ 346 (-45.68%)
Mutual labels:  crawler, scrapy