1011 open source projects that are alternatives to or similar to Scrapyrt

A Facebook crawler

Stars: ✭ 536 (-15.86%)

DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extendable to your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (-84.3%)

Mutual labels: crawler, scrapy, crawling

Scrapoxy

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

Stars: ✭ 1,322 (+107.54%)

Mutual labels: crawler, scraper, scrapy

Crawly

Crawly, a high-level web crawling & scraping framework for Elixir.

Stars: ✭ 440 (-30.93%)

Mutual labels: crawler, scraper, crawling

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

Stars: ✭ 789 (+23.86%)

Mutual labels: crawler, scraper, crawling

Newspaper

News, full-text, and article metadata extraction in Python 3.

Stars: ✭ 11,545 (+1712.4%)

Mutual labels: crawler, scraper, crawling

Headless Chrome Crawler

Distributed crawler powered by Headless Chrome

Stars: ✭ 5,129 (+705.18%)

Mutual labels: crawler, scraper, crawling

Ferret

Declarative web scraping

Stars: ✭ 4,837 (+659.34%)

Mutual labels: crawler, scraper, crawling

Easy Scraping Tutorial

Simple but useful Python web scraping tutorial code.

Stars: ✭ 583 (-8.48%)

Mutual labels: crawler, scrapy, crawling

bots-zoo

No description or website provided.

Stars: ✭ 59 (-90.74%)

Mutual labels: crawler, scraper, crawling

Colly

Elegant Scraper and Crawler Framework for Golang

Stars: ✭ 15,535 (+2338.78%)

Mutual labels: crawler, scraper, crawling

Linkedin Profile Scraper

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.

Stars: ✭ 171 (-73.16%)

Mutual labels: crawler, scraper, crawling

Goribot

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

Stars: ✭ 190 (-70.17%)

Mutual labels: crawler, scraper, scrapy

Ruiji.net

crawler framework, distributed crawler extractor

Stars: ✭ 220 (-65.46%)

Mutual labels: crawler, scraper, scrapy

double-agent

A test suite of common scraper detection techniques. See how detectable your scraper stack is.

Stars: ✭ 123 (-80.69%)

Mutual labels: crawling, scrapy

diffbot-php-client

[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

Stars: ✭ 53 (-91.68%)

Mutual labels: scraper, crawling

aioScrapy

An asynchronous coroutine-based crawler framework built on asyncio and aiohttp. Stars welcome.

Stars: ✭ 34 (-94.66%)

Mutual labels: twisted, scrapy

Wechatsogou

A crawler interface for WeChat official accounts, based on Sogou WeChat search

Stars: ✭ 5,220 (+719.47%)

Mutual labels: crawler, scrapy

proxycrawl-python

ProxyCrawl Python library for scraping and crawling

Stars: ✭ 51 (-91.99%)

Mutual labels: scraper, crawling

OpenScraper

An open source webapp for scraping: towards a public service for webscraping

Stars: ✭ 80 (-87.44%)

Mutual labels: scraper, scrapy

scrapy-distributed

A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.

Stars: ✭ 38 (-94.03%)

Mutual labels: crawling, scrapy

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-97.65%)

Mutual labels: crawler, scraper

img-cli

An interactive command-line interface built in Node.js for downloading one or more images to disk from a URL

Stars: ✭ 15 (-97.65%)

Mutual labels: crawler, crawling

Scrapy Selenium

Scrapy middleware to handle javascript pages using selenium

Stars: ✭ 550 (-13.66%)

Mutual labels: scrapy, crawling

Nintendo Switch Eshop

Crawler for Nintendo Switch eShop

Stars: ✭ 463 (-27.32%)

Mutual labels: crawler, scraper

crawlkit

A crawler based on Phantom. Allows discovery of dynamic content and supports custom scrapers.

Stars: ✭ 23 (-96.39%)

Mutual labels: scraper, crawling

flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons

Stars: ✭ 48 (-92.46%)

Mutual labels: crawler, crawling

Dataflowkit

Extract structured data from websites. Website scraping.

Stars: ✭ 456 (-28.41%)

Mutual labels: scraper, crawling

Scrapedin

LinkedIn Scraper (currently working 2020)

Stars: ✭ 453 (-28.89%)

Mutual labels: crawler, scraper

Haipproxy

💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis

Stars: ✭ 4,993 (+683.83%)

Mutual labels: crawler, scrapy

scrapy-LBC

LeBonCoin spider built with Scrapy and ElasticSearch

Stars: ✭ 14 (-97.8%)

Mutual labels: scraper, scrapy

scrapy-fieldstats

A Scrapy extension to log items coverage when the spider shuts down

Stars: ✭ 17 (-97.33%)

Mutual labels: crawling, scrapy

Polite

Be nice on the web

Stars: ✭ 253 (-60.28%)

Mutual labels: crawler, scraper

OLX Scraper

📻 An OLX scraper using Scrapy + MongoDB. It scrapes recently posted ads for a requested product and dumps them to MongoDB (NoSQL).

Stars: ✭ 15 (-97.65%)

Mutual labels: scraper, scrapy

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Stars: ✭ 52 (-91.84%)

Mutual labels: scraper, crawling

Skrape.it

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.

Stars: ✭ 231 (-63.74%)

Mutual labels: crawler, scraper

Awesome Crawler

A collection of awesome web crawlers and spiders in different languages

Stars: ✭ 4,793 (+652.43%)

Mutual labels: crawler, scraper

scrapy facebooker

Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.

Stars: ✭ 22 (-96.55%)

Mutual labels: scraper, scrapy

Mimo-Crawler

A web crawler that uses Firefox and JS injection to interact with webpages and crawl their content, written in Node.js.

Stars: ✭ 22 (-96.55%)

Mutual labels: scraper, crawling

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (-27.16%)

Mutual labels: crawler, scrapy

arachnod

High-performance crawler for Node.js

Stars: ✭ 17 (-97.33%)

Mutual labels: crawler, scraper

Ecommercecrawlers

Gitee repository: AJay13/ECommerceCrawlers. GitHub repository: DropsDevopsOrg/ECommerceCrawlers. Project showcase platform: http://wechat.doonsec.com

Stars: ✭ 3,073 (+382.42%)

Mutual labels: crawler, scrapy

ARGUS

ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9

Stars: ✭ 68 (-89.32%)

Mutual labels: crawling, scrapy

weibo-scraper

Simple Weibo Scraper

Stars: ✭ 50 (-92.15%)

Mutual labels: crawler, scraper

Scrapy Redis

Redis-based components for Scrapy.

Stars: ✭ 4,998 (+684.62%)

Mutual labels: crawler, scrapy

Bookcorpus

Crawl BookCorpus