1299 open source projects that are alternatives to or similar to Ruiji.net

A Facebook crawler

Stars: ✭ 536 (+143.64%)

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

Stars: ✭ 1,322 (+500.91%)

Mutual labels: crawler, scraper, scrapy

Goribot

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

Stars: ✭ 190 (-13.64%)

Mutual labels: crawler, scraper, scrapy

Scrapyrt

HTTP API for Scrapy spiders

Stars: ✭ 637 (+189.55%)

Mutual labels: crawler, scraper, scrapy

Headless Chrome Crawler

Distributed crawler powered by Headless Chrome

Stars: ✭ 5,129 (+2231.36%)

Mutual labels: crawler, scraper, headless-chrome

Mailinglistscraper

A Python web scraper for public email lists.

Stars: ✭ 19 (-91.36%)

Mutual labels: scraper, scrapy

Puppeteer Sharp Extra

Plugin framework for PuppeteerSharp

Stars: ✭ 39 (-82.27%)

Mutual labels: netcore, headless-chrome

Tianyancha

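A pip-installable Tianyancha scraper API that saves the business registration information of one or more specified companies to Excel/JSON format in one step. A batteries-included scraper API for Tianyancha, the best Chinese business data and investigation platform.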

Stars: ✭ 206 (-6.36%)

Mutual labels: crawler, scraper

Goose Parser

Universal scraping tool that allows you to extract data using multiple environments

Stars: ✭ 211 (-4.09%)

Mutual labels: crawler, scraper

Terpene Profile Parser For Cannabis Strains

Parser and database to index the terpene profile of different strains of Cannabis from online databases

Stars: ✭ 63 (-71.36%)

Mutual labels: crawler, scrapy

Wombat

Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.

Stars: ✭ 1,220 (+454.55%)

Mutual labels: crawler, scraper

Taiwan News Crawlers

Scrapy-based Crawlers for news of Taiwan

Stars: ✭ 83 (-62.27%)

Mutual labels: crawler, scrapy

Media Scraper

Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok

Stars: ✭ 206 (-6.36%)

Mutual labels: crawler, scraper

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

Stars: ✭ 789 (+258.64%)

Mutual labels: crawler, scraper

Scrapy Azuresearch Crawler Samples

Scrapy as a Web Crawler for Azure Search Samples

Stars: ✭ 20 (-90.91%)

Mutual labels: crawler, scrapy

Scrapit

Scraping scripts for various websites.

Stars: ✭ 25 (-88.64%)

Mutual labels: crawler, scraper

Colly

Elegant Scraper and Crawler Framework for Golang

Stars: ✭ 15,535 (+6961.36%)

Mutual labels: crawler, scraper

Warta Scrap

Indonesian news index crawler, covering 10 online media outlets

Stars: ✭ 57 (-74.09%)

Mutual labels: scraper, scrapy

Goscraper

Golang pkg to quickly return a preview of a webpage (title/description/images)

Stars: ✭ 72 (-67.27%)

Mutual labels: crawler, scraper

Avbook

AV movie management system; crawlers for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

Stars: ✭ 8,133 (+3596.82%)

Mutual labels: crawler, scraper

Not Your Average Web Crawler

A web crawler (for bug hunting) that gathers more than you can imagine.

Stars: ✭ 107 (-51.36%)

Mutual labels: crawler, scraper

Google Play Scraper

Node.js scraper to get data from Google Play

Stars: ✭ 1,606 (+630%)

Mutual labels: crawler, scraper

Seleniumcrawler

An example using Selenium WebDriver for Python and the Scrapy framework to create a web scraper that crawls an ASP site

Stars: ✭ 117 (-46.82%)

Mutual labels: scraper, scrapy

Crawlab Lite

Lite version of the Crawlab crawler management platform.

Stars: ✭ 122 (-44.55%)

Mutual labels: crawler, scrapy

Docs

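Source code for "Data Collection: From Getting Started to Giving Up". Contents: introduction to crawlers, the job market, and crawler engineer interview questions; introduction to the HTTP protocol; using Requests; introduction to the XPath parser; MongoDB and MySQL; multithreaded crawlers; introduction to Scrapy; introduction to Scrapy-redis; deploying with Docker; managing Docker clusters with Nomad; querying Docker logs with EFK.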

Stars: ✭ 118 (-46.36%)

Mutual labels: crawler, scrapy

Squidwarc

Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head

Stars: ✭ 125 (-43.18%)

Mutual labels: crawler, headless-chrome

Querylist

🕷️ The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.

Stars: ✭ 2,392 (+987.27%)

Mutual labels: crawler, scraper

Crawler

A high performance web crawler in Elixir.

Stars: ✭ 781 (+255%)

Mutual labels: crawler, scraper

Spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+198.18%)

Mutual labels: crawler, scraper

Py3 scripts

Life is short, *****.

Stars: ✭ 5 (-97.73%)

Mutual labels: crawler, scrapy

Pypergrabber

Fetches PubMed article IDs (PMIDs) from email inbox, then crawls PubMed, Google Scholar and Sci-Hub for respective PDF files.

Stars: ✭ 14 (-93.64%)

Mutual labels: crawler, scraper

Voyages Sncf Api

A Scrapy spider that scrapes times and prices from Voyages Sncf. It uses scrapyrt to provide an API.

Stars: ✭ 7 (-96.82%)

Mutual labels: scraper, scrapy

Crawlab

Distributed web crawler admin platform for managing spiders regardless of language or framework.

Stars: ✭ 8,392 (+3714.55%)

Mutual labels: crawler, scrapy

Icrawler

A multi-threaded crawler framework with many built-in image crawlers.

Stars: ✭ 629 (+185.91%)

Mutual labels: crawler, scrapy

Jvppeteer

Headless Chrome for Java (Java crawler)

Stars: ✭ 193 (-12.27%)

Mutual labels: crawler, scraper

Social Scraper

A collection of scripts for crawling data from Vietnamese-language social networks and websites

Stars: ✭ 47 (-78.64%)

Mutual labels: crawler, scraper

Crawlergo

A powerful dynamic crawler for web vulnerability scanners

Stars: ✭ 1,088 (+394.55%)

Mutual labels: crawler, headless-chrome

Django Dynamic Scraper

Creating Scrapy scrapers via the Django admin interface

Stars: ✭ 1,024 (+365.45%)

Mutual labels: scraper, scrapy

Jd Autobuy

Python crawler for automatic JD.com login and online rush purchasing of products

Stars: ✭ 1,174 (+433.64%)

Mutual labels: crawler, scraper

Scrapy Examples

Some Scrapy and web.py examples

Stars: ✭ 71 (-67.73%)

Mutual labels: crawler, scrapy

Email Extractor

The main functionality is to extract all the emails from one or several URLs.

Stars: ✭ 81 (-63.18%)

Mutual labels: scraper, scrapy

Easy Scraping Tutorial

Simple but useful Python web scraping tutorial code.

Stars: ✭ 583 (+165%)

Mutual labels: crawler, scrapy

Github Spider

Crawler for analyzing GitHub repositories and users

Stars: ✭ 190 (-13.64%)

Mutual labels: crawler, scrapy

Crawler

Crawler, HTTP proxy, simulated login!

Stars: ✭ 106 (-51.82%)

Mutual labels: crawler, scrapy

Patentcrawler

Scrapy patent crawler (no longer maintained)

Stars: ✭ 114 (-48.18%)

Mutual labels: crawler, scrapy

Dotnetcrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output, based on dotnet core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (-54.55%)

Mutual labels: crawler, scrapy

Instagram Scraper

Scrapes media, likes, followers, tags, and all metadata. Inspired by instagram-php-scraper,bot

Stars: ✭ 2,209 (+904.09%)

Mutual labels: crawler, scraper

Qqmusicspider

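A Scrapy-based QQ Music crawler (QQ Music Spider) that scrapes song information, lyrics, featured comments, and more; it also shares a music corpus of 490,000+ entries from the top 6,400 mainland Chinese, Hong Kong, and Taiwanese artists on QQ Music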

Stars: ✭ 120 (-45.45%)

Mutual labels: crawler, scrapy

Newspaper

News, full-text, and article metadata extraction in Python 3. Advanced docs: