A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (-47.35%)

Mutual labels: crawler, spider, scraper

Crawler

A high performance web crawler in Elixir.

Stars: ✭ 781 (-37.32%)

Mutual labels: crawler, spider, scraper

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Stars: ✭ 52 (-95.83%)

Mutual labels: scraper, spider, scraping

crawler-chrome-extensions

Chrome extensions commonly used by crawler developers

Stars: ✭ 53 (-95.75%)

Mutual labels: scraper, spider, scraping

Scrapit

Scraping scripts for various websites.

Stars: ✭ 25 (-97.99%)

Mutual labels: crawler, spider, scraper

arachnod

High-performance crawler for Node.js

Stars: ✭ 17 (-98.64%)

Mutual labels: crawler, scraper, spider

Headless Chrome Crawler

Distributed crawler powered by Headless Chrome

Stars: ✭ 5,129 (+311.64%)

Mutual labels: crawler, scraper, scraping

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

Stars: ✭ 789 (-36.68%)

Mutual labels: crawler, scraper, scraping

Fbcrawl

A Facebook crawler

Stars: ✭ 536 (-56.98%)

Mutual labels: crawler, spider, scraper

Avbook

AV movie management system; crawlers for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

Stars: ✭ 8,133 (+552.73%)

Mutual labels: crawler, spider, scraper

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-98.8%)

Mutual labels: crawler, scraper, scraping

Not Your Average Web Crawler

A web crawler (for bug hunting) that gathers more than you can imagine.

Stars: ✭ 107 (-91.41%)

Mutual labels: crawler, spider, scraper

Goose Parser

Universal scraping tool, which allows you to extract data using multiple environments

Stars: ✭ 211 (-83.07%)

Mutual labels: crawler, scraper, scraping

Goribot

[Crawler/Scraper for Golang] 🕷 A lightweight, distribution-friendly Golang crawler framework.

Stars: ✭ 190 (-84.75%)

Mutual labels: crawler, spider, scraper

bots-zoo

No description or website provided.

Stars: ✭ 59 (-95.26%)

Mutual labels: crawler, scraper, scraping

Gopa

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Stars: ✭ 277 (-77.77%)

Mutual labels: crawler, spider, scraping

Django Dynamic Scraper

Creating Scrapy scrapers via the Django admin interface

Stars: ✭ 1,024 (-17.82%)

Mutual labels: spider, scraper, scraping

Xcrawler

A fast, concise, and powerful PHP crawler framework

Stars: ✭ 344 (-72.39%)

Mutual labels: crawler, spider, scraper

scrapy facebooker

Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.

Stars: ✭ 22 (-98.23%)

Mutual labels: scraper, spider, scraping

Autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Stars: ✭ 4,077 (+227.21%)

Mutual labels: crawler, scraper, scraping

Freshonions Torscraper

Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion

Stars: ✭ 348 (-72.07%)

Mutual labels: crawler, spider, scraper

Html2article

Main-content extraction from HTML web pages

Stars: ✭ 441 (-64.61%)

Mutual labels: crawler, spider

Spider

python crawler spider

Stars: ✭ 70 (-94.38%)

Mutual labels: crawler, spider

Learnpython

Basic Python practice code and assorted crawler code

Stars: ✭ 451 (-63.8%)

Mutual labels: crawler, spider

Bookcorpus

Crawl BookCorpus

Stars: ✭ 443 (-64.45%)

Mutual labels: crawler, scraper

Scrapedin

LinkedIn Scraper (currently working 2020)

Stars: ✭ 453 (-63.64%)

Mutual labels: crawler, scraper

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (-62.76%)

Mutual labels: crawler, scraping

Nintendo Switch Eshop

Crawler for Nintendo Switch eShop

Stars: ✭ 463 (-62.84%)

Mutual labels: crawler, scraper

Arachnid

Powerful web scraping framework for Crystal

Stars: ✭ 68 (-94.54%)

Mutual labels: crawler, spider

Bilili

🍻 bilibili video (including bangumi) and danmaku downloader

Stars: ✭ 379 (-69.58%)

Mutual labels: crawler, spider

Dataflowkit

Extract structured data from web sites. Web sites scraping.

Stars: ✭ 456 (-63.4%)

Mutual labels: scraper, scraping

Wombat

Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.

Stars: ✭ 1,220 (-2.09%)

Mutual labels: crawler, scraper

Xsrfprobe

The Prime Cross Site Request Forgery (CSRF) Audit and Exploitation Toolkit.

Stars: ✭ 532 (-57.3%)

Mutual labels: crawler, spider

Netdiscovery

NetDiscovery is a general-purpose crawler framework/middleware built on frameworks such as Vert.x and RxJava 2.

Stars: ✭ 573 (-54.01%)

Mutual labels: crawler, spider

Go jobs

A look at the Golang job market

Stars: ✭ 526 (-57.78%)

Mutual labels: crawler, spider

Xxl Crawler

A distributed web crawler framework (XXL-CRAWLER).

Stars: ✭ 561 (-54.98%)

Mutual labels: crawler, spider

Douyin

API of DouYin for Humans, used to crawl popular videos and music

Stars: ✭ 580 (-53.45%)

Mutual labels: crawler, spider

Baiduimagespider

A super lightweight Baidu image crawler

Stars: ✭ 591 (-52.57%)

Mutual labels: crawler, spider

Imagescraper

✂️ High performance, multi-threaded image scraper

Stars: ✭ 630 (-49.44%)

Mutual labels: scraper, scraping

Signature algorithm

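Request signing or encryption algorithms for various apps, mini programs, and websites. Currently covers: Ziroom (自如), Xiaohongshu (小红书), Danke Apartment (蛋壳公寓), luckin coffee, and bangkokair (Bangkok Airways)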

Stars: ✭ 380 (-69.5%)

Mutual labels: crawler, spider

Haipproxy

💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis

Stars: ✭ 4,993 (+300.72%)

Mutual labels: crawler, spider

Easy Scraping Tutorial

Simple but useful Python web scraping tutorial code.

Stars: ✭ 583 (-53.21%)

Mutual labels: crawler, scraping

Icrawler

A multi-thread crawler framework with many builtin image crawlers provided.

Stars: ✭ 629 (-49.52%)

Mutual labels: crawler, spider

Scrapyrt

HTTP API for Scrapy spiders

Stars: ✭ 637 (-48.88%)

Mutual labels: crawler, scraper

Gospider

Gospider - Fast web spider written in Go

Stars: ✭ 785 (-37%)

Mutual labels: crawler, spider

Crawler examples

Some classic web crawler projects.

Stars: ✭ 74 (-94.06%)

Mutual labels: crawler, spider

Zhihu Crawler

zhihu-crawler is a high-performance, distributed Java crawler project that supports a free HTTP proxy pool and horizontal scaling

Stars: ✭ 890 (-28.57%)

Mutual labels: crawler, spider

Torbot

Dark Web OSINT Tool

Stars: ✭ 821 (-34.11%)

Mutual labels: crawler, spider

Mailinglistscraper

A python web scraper for public email lists.

Stars: ✭ 19 (-98.48%)

Mutual labels: spider, scraper

Pypatent

Search for and retrieve US Patent and Trademark Office Patent Data

Stars: ✭ 31 (-97.51%)

Mutual labels: scraper, scraping

Pypergrabber

Fetches PubMed article IDs (PMIDs) from email inbox, then crawls PubMed, Google Scholar and Sci-Hub for respective PDF files.

Stars: ✭ 14 (-98.88%)

Mutual labels: crawler, scraper

1-60 of 1170 similar projects
