A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+22.39%)

Mutual labels: crawler, spider, scraper

Xcrawler

A fast, concise, and powerful PHP crawler framework

Stars: ✭ 344 (-35.82%)

Mutual labels: crawler, spider, scraper

Freshonions Torscraper

Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion

Stars: ✭ 348 (-35.07%)

Mutual labels: crawler, spider, scraper

Scrapoxy

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

Stars: ✭ 1,322 (+146.64%)

Mutual labels: crawler, scraper, scrapy

Not Your Average Web Crawler

A web crawler (for bug hunting) that gathers more than you can imagine.

Stars: ✭ 107 (-80.04%)

Mutual labels: crawler, spider, scraper

Linkedin Profile Scraper

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.

Stars: ✭ 171 (-68.1%)

Mutual labels: crawler, spider, scraper

Scrapingoutsourcing

ScrapingOutsourcing focuses on sharing crawler code, aiming to add one new crawler each week

Stars: ✭ 164 (-69.4%)

Mutual labels: crawler, spider, scrapy

crawler-chrome-extensions

Chrome extensions commonly used by crawler developers

Stars: ✭ 53 (-90.11%)

Mutual labels: scraper, spider, crawl

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Stars: ✭ 52 (-90.3%)

Mutual labels: scraper, spider, crawl

OpenScraper

An open source webapp for scraping: towards a public service for webscraping

Stars: ✭ 80 (-85.07%)

Mutual labels: scraper, spider, scrapy

Mailinglistscraper

A Python web scraper for public email lists.

Stars: ✭ 19 (-96.46%)

Mutual labels: spider, scraper, scrapy

Crawler

A high performance web crawler in Elixir.

Stars: ✭ 781 (+45.71%)

Mutual labels: crawler, spider, scraper

Ruiji.net

crawler framework, distributed crawler extractor

Stars: ✭ 220 (-58.96%)

Mutual labels: crawler, scraper, scrapy

Avbook

AV movie management system; crawlers for avmoo, javbus, and javlibrary; online AV film library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

Stars: ✭ 8,133 (+1417.35%)

Mutual labels: crawler, spider, scraper

Geziyor

Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.

Stars: ✭ 1,246 (+132.46%)

Mutual labels: crawler, spider, scraper

Colly

Elegant Scraper and Crawler Framework for Golang

Stars: ✭ 15,535 (+2798.32%)

Mutual labels: crawler, spider, scraper

Scrapit

Scraping scripts for various websites.

Stars: ✭ 25 (-95.34%)

Mutual labels: crawler, spider, scraper

Marmot

💐Marmot | Web Crawler/HTTP protocol Download Package 🐭

Stars: ✭ 186 (-65.3%)

Mutual labels: crawler, spider, scrapy

arachnod

High-performance crawler for Node.js

Stars: ✭ 17 (-96.83%)

Mutual labels: crawler, scraper, spider

Gosint

OSINT Swiss Army Knife

Stars: ✭ 401 (-25.19%)

Mutual labels: crawler, spider, scraper

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (-13.43%)

Mutual labels: crawler, scrapy

PttImageSpider

PTT image downloader (fetches the images from an entire board and uses article titles as folder names) (uses Scrapy)

Stars: ✭ 16 (-97.01%)

Mutual labels: spider, scrapy

Nintendo Switch Eshop

Crawler for Nintendo Switch eShop

Stars: ✭ 463 (-13.62%)

Mutual labels: crawler, scraper

ZhengFang System Spider

🐛 A small crawler that logs in to the ZhengFang academic administration system and scrapes data

Stars: ✭ 21 (-96.08%)

Mutual labels: crawler, spider

facebook-discussion-tk

A collection of tools to (semi-)automatically collect and analyze data from online discussions on Facebook groups and pages.

Stars: ✭ 33 (-93.84%)

Mutual labels: scraper, facebook

ip proxy pool

Dynamically generates spiders with Scrapy to crawl and check free proxy IPs on the internet.

Stars: ✭ 39 (-92.72%)

Mutual labels: spider, scrapy

MyCrawler

My collection of crawlers

Stars: ✭ 55 (-89.74%)

Mutual labels: crawler, scraper

weibo-scraper

Simple Weibo Scraper

Stars: ✭ 50 (-90.67%)

Mutual labels: crawler, scraper

galer

A fast tool to fetch URLs from HTML attributes by crawl-in.

Stars: ✭ 138 (-74.25%)

Mutual labels: crawler, spider

lightnovel epub

🍭 epub generator for (light) novels. Supported sites: 轻之国度, 轻小说文库

Stars: ✭ 89 (-83.4%)

Mutual labels: crawler, scraper

bots-zoo

No description or website provided.

Stars: ✭ 59 (-88.99%)

Mutual labels: crawler, scraper

Douban Crawler

A crawler for https://douban.com

Stars: ✭ 13 (-97.57%)

Mutual labels: spider, scrapy

fb-scraper

Scrape a Facebook profile and turn it into a JSON file

Stars: ✭ 18 (-96.64%)

Mutual labels: scraper, facebook

Tieba spider

Baidu Tieba crawler (based on Scrapy and MySQL)

Stars: ✭ 257 (-52.05%)

Mutual labels: spider, scrapy

Scrapedin

LinkedIn Scraper (currently working 2020)

Stars: ✭ 453 (-15.49%)

Mutual labels: crawler, scraper

Bt Btt

Introduction to the magnet-link site U3C3, with domain updates

Stars: ✭ 261 (-51.31%)

Mutual labels: crawler, spider

Learnpython

Basic Python practice code and various crawler code

Stars: ✭ 451 (-15.86%)

Mutual labels: crawler, spider

Java Spider

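A hands-on Java crawler framework built on top of the webmagic framework. It can crawl news content from Tencent, Sohu, Toutiao (integrated as a separate feature), and other sources; combined with Elasticsearch, it implements automated crawling and is already in production use.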

Stars: ✭ 276 (-48.51%)

Mutual labels: spider, scraper

Happy Spiders

🔧 🔩 🔨 A curated collection of crawler-related tools, simulated-login techniques, proxy IPs, Scrapy template code, and more.

Stars: ✭ 261 (-51.31%)

Mutual labels: spider, scrapy

Rcrawler

An R web crawler and scraper

Stars: ✭ 274 (-48.88%)

Mutual labels: crawler, scraper

Bookcorpus

Crawl BookCorpus

Stars: ✭ 443 (-17.35%)

Mutual labels: crawler, scraper

Alltheplaces

A set of spiders and scrapers to extract location information from places that post their location on the internet.

Stars: ✭ 277 (-48.32%)

Mutual labels: spider, scrapy

Html2article

Main-text extraction from HTML web pages

Stars: ✭ 441 (-17.72%)

Mutual labels: crawler, spider

1-60 of 1540 similar projects
