k1995 / Baiduyunspider

A Baidu Cloud netdisk search engine, including the crawler and the website

Programming Languages

javascript

python

Labels

spider

Projects that are alternatives of or similar to Baiduyunspider

Infospider

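INFO-SPIDER is a crawler toolbox 🧰 that brings together many data sources, designed to help users retrieve their own data safely and quickly. The tool's code is open source and its process is transparent. Supported data sources include GitHub, QQ Mail, NetEase Mail, Alibaba Mail, Sina Mail, Hotmail, Outlook, JD.com, Taobao, Alipay, China Mobile, China Unicom, China Telecom, Zhihu, Bilibili, NetEase Cloud Music, QQ friends, QQ groups, WeChat Moments album generation, browser history, 12306, Cnblogs, CSDN blogs, OSChina blogs, and Jianshu.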

Stars: ✭ 5,984 (+562.68%)

Mutual labels: spider

Querido Diario

📰 Brazilian government gazettes, accessible to everyone.

Stars: ✭ 681 (-24.58%)

Mutual labels: spider

Torbot

Dark Web OSINT Tool

Stars: ✭ 821 (-9.08%)

Mutual labels: spider

Istock

👉 A Java stock crawler built on Spring Boot (A-shares only). If you ❤️ it, please ⭐️. The upgraded V2 is under development!

Stars: ✭ 622 (-31.12%)

Mutual labels: spider

Oneblog

👽 OneBlog, a clean, attractive, feature-rich, and responsive Java blog

Stars: ✭ 678 (-24.92%)

Mutual labels: spider

Creeper

🐾 Creeper - The Next Generation Crawler Framework (Go)

Stars: ✭ 762 (-15.61%)

Mutual labels: spider

Baiduimagespider

A super lightweight Baidu image crawler

Stars: ✭ 591 (-34.55%)

Mutual labels: spider

Zhihu Crawler

zhihu-crawler is a high-performance, horizontally scalable, distributed Java crawler project with support for a free HTTP proxy pool

Stars: ✭ 890 (-1.44%)

Mutual labels: spider

Grab Site

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

Stars: ✭ 680 (-24.7%)

Mutual labels: spider

Gospider

Gospider - Fast web spider written in Go

Stars: ✭ 785 (-13.07%)

Mutual labels: spider

Icrawler

A multi-thread crawler framework with many builtin image crawlers provided.

Stars: ✭ 629 (-30.34%)

Mutual labels: spider

Spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (-27.35%)

Mutual labels: spider

Crawler

A high performance web crawler in Elixir.

Stars: ✭ 781 (-13.51%)

Mutual labels: spider

Python Spider

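Douban Movie Top 250; scraping JSON data from Douyu and scraping pictures of beautiful women; Taobao; Youyuan; using CrawlSpider to scrape basic profile information from the Hongniang matchmaking site, plus distributed crawling of Hongniang with Redis storage; small crawler demos; Selenium; scraping Dmall; building an API with Django; scraping Youyuan profile data; simulated Zhihu login; simulated GitHub login; simulated Tuchong login; scraping the entire Dmall store site; scraping WeChat official account article history; scraping articles shared in WeChat groups or by WeChat friends; using itchat to monitor articles shared by specified WeChat official accounts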

Stars: ✭ 615 (-31.89%)

Mutual labels: spider

Anti Anti Spider

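More and more websites have anti-crawler features: some hide key data in images, others use user-hostile CAPTCHAs. This repository collects anti-anti-crawler code and aims to improve skills by tackling sites with different features (no malicious intent). (Submissions of hard-to-scrape websites are welcome.) (Project paused due to work commitments.)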

Stars: ✭ 6,907 (+664.89%)

Mutual labels: spider

Domain hunter

A Burp Suite extension that automatically finds all subdomains, similar domains, and related domains of an organization, based on observed traffic.

Stars: ✭ 594 (-34.22%)

Mutual labels: spider

Bilibili Api

API client module for Bilibili

Stars: ✭ 704 (-22.04%)

Mutual labels: spider

Javlibrary

Javlibrary spider

Stars: ✭ 17 (-98.12%)

Mutual labels: spider

Seeker

Seeker - another job board aggregator.

Stars: ✭ 16 (-98.23%)

Mutual labels: spider

Funpyspidersearchengine

Word2vec per-user personalized search + Scrapy 2.3.0 (data crawling) + ElasticSearch 7.9.1 (data storage and external RESTful API) + Django 3.1.1 search

Stars: ✭ 782 (-13.4%)

Mutual labels: spider

BaiduyunSpider

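A distributed Baidu Netdisk crawler built on currently popular frameworks. It is suitable for personal study and as a base for further development.

The crawler is based on Scrapy, which is flexible, simple, and easy to extend. It uses Scrapy-Redis as the distributed middleware, so multiple crawler instances can be deployed at the same time to increase crawl throughput. The web admin console is built with React in the Material Design style.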
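As a rough illustration of how Scrapy-Redis lets several crawler instances share work, a Scrapy settings.py typically contains entries like the ones below. The setting names are the ones documented by Scrapy-Redis; the values, including the Redis address, are assumptions for a local setup and may differ from what this project actually ships.

```python
# settings.py (sketch, not the project's actual file)

# Use the Scrapy-Redis scheduler so every crawler instance pulls requests
# from the same Redis-backed queue instead of a private in-memory one.
SCHEDULER = "scrapy_redis.scheduler.Scheduler"

# Share the duplicate-request filter through Redis so that two instances
# do not fetch the same URL.
DUPEFILTER_CLASS = "scrapy_redis.dupefilter.RFPDupeFilter"

# Keep the queue and the seen-set in Redis when a crawler stops,
# so a restarted instance resumes where the others left off.
SCHEDULER_PERSIST = True

# Assumption: Redis runs locally on its default port.
REDIS_URL = "redis://localhost:6379"
```

With settings of this kind, each additional instance started against the same Redis server joins the shared queue.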

Dependencies

MongoDB
Python3
Redis
Node.js > 8.0 (optional)

Installation

pip install -r requirements.txt

Usage

1. Run the crawler

scrapy crawl baidupan

2. Run the web service

cd api
python rest.py

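3. Start crawling

The open-source version currently requires you to submit the share links to crawl manually through the admin console. Alternatively, use the API:

POST http://localhost:5000/addUrl
Form parameter: url

curl example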

curl -X POST http://localhost:5000/addUrl \
  -F url=https://pan.baidu.com/s/17BtXyO-i02gsC7h4QsKexg
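The same call can be scripted when there are many share links to submit. The following is a minimal sketch using only the Python standard library. The helper names build_add_url_request and submit_share_links are hypothetical, and the body is sent URL-encoded rather than as the multipart form that curl -F produces; whether the service accepts both encodings is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Address taken from the README; adjust if the web service runs elsewhere.
API_BASE = "http://localhost:5000"


def build_add_url_request(share_url: str, api_base: str = API_BASE) -> Request:
    """Build a POST request to /addUrl carrying the `url` form parameter."""
    body = urlencode({"url": share_url}).encode("utf-8")
    return Request(f"{api_base}/addUrl", data=body, method="POST")


def submit_share_links(share_urls, api_base: str = API_BASE):
    """Submit each share link and return (link, HTTP status) pairs.

    Hypothetical helper: the response format of /addUrl is not documented
    in the README, so only the status code is reported.
    """
    results = []
    for share_url in share_urls:
        request = build_add_url_request(share_url, api_base)
        with urlopen(request, timeout=10) as response:
            results.append((share_url, response.status))
    return results
```

With the web service running, calling submit_share_links(["https://pan.baidu.com/s/17BtXyO-i02gsC7h4QsKexg"]) would submit the link from the curl example.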

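Screenshots

Crawler running

Admin console

Support

A premium version is available that adds a search engine and crawling of private shares; for now it is offered for graduation projects only. Contact email: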

Stars: ✭ 903