1170 open source projects that are alternatives to or similar to Geziyor

Home Assistant custom component for scraping (html, xml or json) multiple values (from a single HTTP request) with a separate sensor/attribute for each value. Support for (login) form-submit functionality.

Stars: ✭ 103 (-91.73%)

Mutual labels: scraper, scraping

proxycrawl-python

ProxyCrawl Python library for scraping and crawling

Stars: ✭ 51 (-95.91%)

Mutual labels: scraper, scraping

robotstxt

robots.txt file parsing and checking for R

Stars: ✭ 65 (-94.78%)

Mutual labels: scraper, spider

document-dl

Command line program to download documents from web portals.

Stars: ✭ 14 (-98.88%)

Mutual labels: scraper, scraping

scrapy-distributed

A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.

Stars: ✭ 38 (-96.95%)

Mutual labels: spider, scraping

scrapman

Retrieve real HTML (with JavaScript executed) from a URL; ultra fast, with support for loading multiple pages in parallel.

Stars: ✭ 21 (-98.31%)

Mutual labels: scraper, scraping

Crawlab

Distributed web crawler admin platform for managing spiders, regardless of language or framework.

Stars: ✭ 8,392 (+573.52%)

Mutual labels: crawler, spider

flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons

Stars: ✭ 48 (-96.15%)

Mutual labels: crawler, spider

Scraper-Projects

🕸 List of mini projects that involve web scraping 🕸

Stars: ✭ 25 (-97.99%)

Mutual labels: scraper, scraping

Zeiver

A Scraper, Downloader, & Recorder for static open directories.

Stars: ✭ 14 (-98.88%)

Mutual labels: scraper, scraping

TorScrapper

A Scraper made 100% in Python using BeautifulSoup and Tor. It can be used to scrape both normal and onion links. Happy Scraping :)

Stars: ✭ 24 (-98.07%)

Mutual labels: scraper, scraping

TikTokDownloader PyWebIO

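🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous Douyin | TikTok data scraping tool that supports API calls and online batch parsing and downloading.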

Stars: ✭ 919 (-26.24%)

Mutual labels: scraper, spider

Bilili

🍻 bilibili video (including bangumi) and danmaku downloader

Stars: ✭ 379 (-69.58%)

Mutual labels: crawler, spider

Lizard

💐 Full Amazon Automatic Download

Stars: ✭ 41 (-96.71%)

Mutual labels: crawler, spider

weibo-scraper

Simple Weibo Scraper

Stars: ✭ 50 (-95.99%)

Mutual labels: crawler, scraper

facebook-discussion-tk

A collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.

Stars: ✭ 33 (-97.35%)

Mutual labels: scraper, scraping

lightnovel epub

🍭 EPUB generator for (light) novels. Supported sites: 轻之国度, 轻小说文库

Stars: ✭ 89 (-92.86%)

Mutual labels: crawler, scraper

ZhengFang System Spider

🐛 A small crawler that logs in to the ZhengFang educational administration system and scrapes data

Stars: ✭ 21 (-98.31%)

Mutual labels: crawler, spider

Rcrawler

An R web crawler and scraper

Stars: ✭ 274 (-78.01%)

Mutual labels: crawler, scraper

Bt Btt

Introduction to the magnet-link site U3C3, with domain updates

Stars: ✭ 261 (-79.05%)

Mutual labels: crawler, spider

Bookcorpus

Crawl BookCorpus

Stars: ✭ 443 (-64.45%)

Mutual labels: crawler, scraper

Scrapedin

LinkedIn Scraper (currently working 2020)

Stars: ✭ 453 (-63.64%)

Mutual labels: crawler, scraper

Learnpython

Basic Python practice code and assorted crawler code

Stars: ✭ 451 (-63.8%)

Mutual labels: crawler, spider

Maman

Rust Web Crawler saving pages on Redis

Stars: ✭ 39 (-96.87%)

Mutual labels: crawler, spider

Sasila

A flexible, friendly crawler framework

Stars: ✭ 286 (-77.05%)

Mutual labels: crawler, scraping

Toapi

Every web site provides APIs.

Stars: ✭ 3,209 (+157.54%)

Mutual labels: crawler, spider

Gospider

A crawler framework implemented in Go. Users only need to define page rules, and a web management interface is provided. Built on colly.

Stars: ✭ 285 (-77.13%)

Mutual labels: crawler, spider

91porn Api

🌭💦 Unrestricted online API for a 91porn crawler (permanently valid, passphrase updated daily), with an online web preview

Stars: ✭ 341 (-72.63%)

Mutual labels: crawler, spider

scraper

Image crawling and download tool that quickly fetches images, photos, and illustrations uploaded by designers and users on ZCOOL (https://www.zcool.com.cn/) and CNU (http://www.cnu.cc/).

Stars: ✭ 64 (-94.86%)

Mutual labels: scraper, spider

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (-62.76%)

Mutual labels: crawler, scraping

Signature algorithm

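Request signing or encryption algorithms for various apps, mini programs, and websites. Currently covers: Ziroom (自如), Xiaohongshu (小红书), Danke Apartment (蛋壳公寓), luckin coffee, and bangkokair.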

Stars: ✭ 380 (-69.5%)

Mutual labels: crawler, spider

Crawler examples

Some classic web crawler projects.

Stars: ✭ 74 (-94.06%)

Mutual labels: crawler, spider

Spider Flow

A next-generation crawler platform: define crawler workflows graphically and build crawlers without writing code.

Stars: ✭ 365 (-70.71%)

Mutual labels: crawler, spider

Haipproxy

💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis

Stars: ✭ 4,993 (+300.72%)

Mutual labels: crawler, spider

Xsrfprobe

The Prime Cross Site Request Forgery (CSRF) Audit and Exploitation Toolkit.

Stars: ✭ 532 (-57.3%)

Mutual labels: crawler, spider

Html2article

Main-content extraction from HTML web pages

Stars: ✭ 441 (-64.61%)

Mutual labels: crawler, spider

Webster

a reliable high-level web crawling & scraping framework for Node.js.

Stars: ✭ 364 (-70.79%)

Mutual labels: crawler, spider

Netdiscovery

NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.

Stars: ✭ 573 (-54.01%)

Mutual labels: crawler, spider

Dataflowkit

Extract structured data from websites. Website scraping.

Stars: ✭ 456 (-63.4%)

Mutual labels: scraper, scraping

Easy Scraping Tutorial

Simple but useful Python web scraping tutorial code.

Stars: ✭ 583 (-53.21%)

Mutual labels: crawler, scraping

Douyin

Douyin API for humans, used to crawl popular videos and music

Stars: ✭ 580 (-53.45%)

Mutual labels: crawler, spider

Goscraper

Golang pkg to quickly return a preview of a webpage (title/description/images)

Stars: ✭ 72 (-94.22%)

Mutual labels: crawler, scraper

Go jobs

A look at the job market for Golang

Stars: ✭ 526 (-57.78%)

Mutual labels: crawler, spider

Nintendo Switch Eshop

Crawler for Nintendo Switch eShop

Stars: ✭ 463 (-62.84%)

Mutual labels: crawler, scraper

Xxl Crawler

A distributed web crawler framework (XXL-CRAWLER).

Stars: ✭ 561 (-54.98%)

Mutual labels: crawler, spider

Fictiondown

Stars: ✭ 362 (-70.95%)

Mutual labels: crawler, spider

Icrawler

A multi-thread crawler framework with many builtin image crawlers provided.

Stars: ✭ 629 (-49.52%)

Mutual labels: crawler, spider

Scrapyrt

HTTP API for Scrapy spiders

Stars: ✭ 637 (-48.88%)

Mutual labels: crawler, scraper

Creeper

🐾 Creeper - The Next Generation Crawler Framework (Go)

Stars: ✭ 762 (-38.84%)

Mutual labels: crawler, spider

Imagescraper

✂️ High performance, multi-threaded image scraper

Stars: ✭ 630 (-49.44%)

Mutual labels: scraper, scraping

Grab Site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

Stars: ✭ 680 (-45.43%)

Mutual labels: crawler, spider

Jd Autobuy

Python crawler for JD.com (京东): automatic login and online flash purchasing of products

Stars: ✭ 1,174 (-5.78%)

Mutual labels: crawler, scraper

Gospider

Gospider - Fast web spider written in Go

Stars: ✭ 785 (-37%)

Mutual labels: crawler, spider

Nodespider

[DEPRECATED] Simple, flexible, delightful web crawler/spider package

Stars: ✭ 33 (-97.35%)

Mutual labels: crawler, spider

Baiduimagespider

A super lightweight Baidu image crawler

Stars: ✭ 591 (-52.57%)

Mutual labels: crawler, spider

Zhihu Crawler

zhihu-crawler is a high-performance, distributed crawler project written in Java, with support for a free HTTP proxy pool and horizontal scaling

Stars: ✭ 890 (-28.57%)

Mutual labels: crawler, spider

Torbot

Dark Web OSINT Tool

Stars: ✭ 821 (-34.11%)

Mutual labels: crawler, spider

Beanbun

Beanbun is a multi-process web crawler framework written in PHP, with good openness and high extensibility, built on Workerman.

Stars: ✭ 1,096 (-12.04%)

Mutual labels: crawler, spider

Pypergrabber

Fetches PubMed article IDs (PMIDs) from email inbox, then crawls PubMed, Google Scholar and Sci-Hub for respective PDF files.

Stars: ✭ 14 (-98.88%)

Mutual labels: crawler, scraper

Pypatent

Search for and retrieve US Patent and Trademark Office Patent Data

Stars: ✭ 31 (-97.51%)

Mutual labels: scraper, scraping

61-120 of 1170 similar projects
