A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+437.7%)

Mutual labels: crawler, spider, web-crawler

Marmot

💐Marmot | Web Crawler/HTTP protocol Download Package 🐭

Stars: ✭ 186 (+52.46%)

Mutual labels: crawler, spider, scrapy

Icrawler

A multi-thread crawler framework with many builtin image crawlers provided.

Stars: ✭ 629 (+415.57%)

Mutual labels: crawler, spider, scrapy

Terpene Profile Parser For Cannabis Strains

Parser and database to index the terpene profile of different strains of Cannabis from online databases

Stars: ✭ 63 (-48.36%)

Mutual labels: crawler, scrapy, web-crawler

Zhihu Crawler

zhihu-crawler is a high-performance, distributed crawler project based on Java that supports a free HTTP proxy pool and horizontal scaling.

Stars: ✭ 890 (+629.51%)

Mutual labels: crawler, spider

Mailinglistscraper

A python web scraper for public email lists.

Stars: ✭ 19 (-84.43%)

Mutual labels: spider, scrapy

Scrala

Unmaintained 🐳 ☕️ 🕷 Scala crawler(spider) framework, inspired by scrapy, created by @gaocegege

Stars: ✭ 113 (-7.38%)

Mutual labels: spider, scrapy

Decryptlogin

APIs for logging in to some websites using requests.

Stars: ✭ 1,861 (+1425.41%)

Mutual labels: crawler, spider

Scrapy Azuresearch Crawler Samples

Scrapy as a Web Crawler for Azure Search Samples

Stars: ✭ 20 (-83.61%)

Mutual labels: crawler, scrapy

Nodespider

[DEPRECATED] Simple, flexible, delightful web crawler/spider package

Stars: ✭ 33 (-72.95%)

Mutual labels: crawler, spider

Copybook

Crawls all the novels on a novel website, stores them in a database, and uses the crawled data to build your own novel site.

Stars: ✭ 117 (-4.1%)

Mutual labels: spider, scrapy

Seeker

Seeker - another job board aggregator.

Stars: ✭ 16 (-86.89%)

Mutual labels: spider, scrapy

Torbot

Dark Web OSINT Tool

Stars: ✭ 821 (+572.95%)

Mutual labels: crawler, spider

Scrapit

Scraping scripts for various websites.

Stars: ✭ 25 (-79.51%)

Mutual labels: crawler, spider

Py3 scripts

Life is short, *****.

Stars: ✭ 5 (-95.9%)

Mutual labels: crawler, scrapy

Pkulaw spider

Crawls the PKULaw (北大法宝) site http://www.pkulaw.cn/Case/

Stars: ✭ 113 (-7.38%)

Mutual labels: crawler, spider

Jspider

JSpider adds the JS decryption method for at least one website every week. Stars are welcome. WeChat for discussion: 13298307816

Stars: ✭ 914 (+649.18%)

Mutual labels: spider, scrapy

Qqmusicspider

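A Scrapy-based QQ Music spider that crawls song information, lyrics, top comments, and more. It also shares a music corpus of 490,000+ items from the top 6,400 mainland Chinese, Hong Kong, and Taiwanese singers on QQ Music.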

Stars: ✭ 120 (-1.64%)

Mutual labels: crawler, scrapy

Gospider

Gospider - Fast web spider written in Go

Stars: ✭ 785 (+543.44%)

Mutual labels: crawler, spider

Examples Of Web Crawlers

Some interesting examples of Python crawlers that are friendly to beginners, mainly crawling sites such as Taobao, Tmall, WeChat, Douban, and QQ.

Stars: ✭ 10,724 (+8690.16%)

Mutual labels: crawler, spider

Lizard

💐 Full Amazon Automatic Download

Stars: ✭ 41 (-66.39%)

Mutual labels: crawler, spider

Photon

Incredibly fast crawler designed for OSINT.

Stars: ✭ 8,332 (+6729.51%)

Mutual labels: crawler, spider

Avbook

AV movie management system; crawlers for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

Stars: ✭ 8,133 (+6566.39%)

Mutual labels: crawler, spider

Baiduspider

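BaiduSpider, a crawler for Baidu search results. It currently supports Baidu web search, Baidu image search, Baidu Zhidao search, Baidu video search, Baidu news search, Baidu Wenku search, Baidu Jingyan search, and Baidu Baike search.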

Stars: ✭ 105 (-13.93%)

Mutual labels: crawler, spider

Awesome Python Primer

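An index of quality Chinese-language resources for self-taught Python beginners, including books, docs, and videos, suitable for crawler, Web, data analysis, and machine learning tracks.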

Stars: ✭ 57 (-53.28%)

Mutual labels: crawler, spider

Reptile

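🏀 Python 3 web crawlers in practice (some with detailed tutorials): Maoyan, Tencent Video, Douban, Yanzhao (研招网), Weibo, Biquge novels, Baidu Hot Topics, Bilibili, CSDN, NetEase Cloud Reading, Ali Literature, Baidu Stocks, Toutiao, WeChat official accounts, NetEase Cloud Music, Lagou, Youdao, unsplash, Shixiseng, Autohome, League of Legends Box, Dianping, Lianjia, LPL schedule, typhoons, Fantasy Westward Journey and Onmyoji Cangbaoge, weather, Nowcoder, Baidu Wenku, bedtime stories, Zhihu, Wish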

Stars: ✭ 1,048 (+759.02%)

Mutual labels: spider, scrapy

Car Prices

A Golang crawler that crawls the used-car product catalog of Autohome (汽车之家).

Stars: ✭ 57 (-53.28%)

Mutual labels: crawler, spider

Funpyspidersearchengine

Word2vec per-user personalized search + Scrapy 2.3.0 (data crawling) + ElasticSearch 7.9.1 (data storage and external RESTful API) + Django 3.1.1 search

Stars: ✭ 782 (+540.98%)

Mutual labels: spider, scrapy

App comments spider

Crawls game comments from Baidu Tieba, TapTap, the App Store, and official Weibo accounts (based on redis_scrapy). The filter uses a Bloom filter.

Stars: ✭ 38 (-68.85%)

Mutual labels: spider, scrapy

Django Dynamic Scraper

Creating Scrapy scrapers via the Django admin interface

Stars: ✭ 1,024 (+739.34%)

Mutual labels: spider, scrapy

Beanbun

Beanbun is a multi-process web crawler framework written in PHP, with good openness and high extensibility, based on Workerman.

Stars: ✭ 1,096 (+798.36%)

Mutual labels: crawler, spider

Hive

Lots of spiders

Stars: ✭ 110 (-9.84%)

Mutual labels: spider, scrapy

Spider

python crawler spider

Stars: ✭ 70 (-42.62%)

Mutual labels: crawler, spider

Alipayspider Scrapy

AlipaySpider on Scrapy (uses Chrome driver); an Alipay crawler based on Scrapy

Stars: ✭ 70 (-42.62%)

Mutual labels: spider, scrapy

Image Downloader

Download images from Google, Bing, Baidu.

Stars: ✭ 1,173 (+861.48%)

Mutual labels: spider, scrapy

Crawler examples

Some classic web crawler projects.

Stars: ✭ 74 (-39.34%)

Mutual labels: crawler, spider

Arachnid

Powerful web scraping framework for Crystal

Stars: ✭ 68 (-44.26%)

Mutual labels: crawler, spider

Scrapy Examples

Some scrapy and web.py examples

Stars: ✭ 71 (-41.8%)

Mutual labels: crawler, scrapy

Capturer

Capture pictures from websites like sina, lofter, huaban and so on

Stars: ✭ 76 (-37.7%)

Mutual labels: spider, scrapy

Geziyor

Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.

Stars: ✭ 1,246 (+921.31%)

Mutual labels: crawler, spider

Taiwan News Crawlers

Scrapy-based Crawlers for news of Taiwan

Stars: ✭ 83 (-31.97%)

Mutual labels: crawler, scrapy

Scrapoxy

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

Stars: ✭ 1,322 (+983.61%)

Mutual labels: crawler, scrapy

Gopa Abandoned

GOPA, a spider written in Go. (NOTE: this project moved to https://github.com/infinitbyte/gopa)

Stars: ✭ 98 (-19.67%)

Mutual labels: crawler, spider

Abotx

Cross Platform C# Web crawler framework, headless browser, parallel crawler. Please star this project! +1.

Stars: ✭ 63 (-48.36%)

Mutual labels: spider, web-crawler

Puppeteer Walker

a puppeteer walker 🕷 🕸

Stars: ✭ 78 (-36.07%)

Mutual labels: crawler, spider

1-60 of 1017 similar projects
