librauee / Reptile

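🏀 Hands-on Python3 web crawlers (some with detailed tutorials): Maoyan, Tencent Video, Douban, the graduate admissions site (研招网), Weibo, Biquge novels, Baidu hot topics, Bilibili, CSDN, NetEase Cloud Reading, Ali Literature, Baidu Stocks, Toutiao, WeChat official accounts, NetEase Cloud Music, Lagou, Youdao, unsplash, Shixiseng, Autohome, League of Legends Box, Dianping, Lianjia, LPL schedule, typhoons, the Fantasy Westward Journey and Onmyoji treasure pavilions (藏宝阁), weather, Nowcoder, Baidu Wenku, bedtime stories, Zhihu, Wish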

Labels

spider scrapy requests

Spider Learning

Language : Python3
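Content : A collection of crawler learning examples and my own hands-on crawler projects, covering two stages of practice: beginner and intermediate. Techniques include XPath, BeautifulSoup, regular expressions, Ajax asynchronous loading, proxy IPs, multithreading, packet-capture tools, font-based anti-scraping, JS reverse engineering, the Scrapy framework, anti-debugging, CAPTCHAs, and more.
Notice : Feel free to follow my WeChat official account and grow with me~
It contains plenty of Python learning resources, e-books, and videos; scan the QR code to follow.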

Beginner Stage

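For getting started, I recommend Professor Song Tian's Python language course and crawler course. The MOOC links are below:
- Python Programming (Python语言程序设计)
- Python Web Crawling and Information Extraction (Python网络爬虫与信息提取)
Because the target pages' code has changed, some of the crawlers in the course can no longer scrape content correctly; it is enough to understand and learn the crawling techniques.
Click here for the course's crawler code
Below are some important crawling techniques. Some of the code comes with an accompanying article, which you can look up in the table at the bottom~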

XPath

BeautifulSoup

Regular Expressions

Ajax Asynchronous Loading

Proxy IPs

Multithreading

Packet-Capture Tool: Fiddler
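
As one illustration of the beginner techniques listed above, here is a minimal sketch of regular-expression extraction using only the standard library. It is not taken from the repository: the HTML snippet and the names `SAMPLE_HTML`, `ITEM_PATTERN`, and `extract_items` are hypothetical, and a real page would be downloaded first and would likely need a more tolerant pattern or a proper HTML parser.

```python
import re

# Hypothetical HTML snippet standing in for a downloaded ranking page.
SAMPLE_HTML = """
<ol>
  <li><span class="title">Movie A</span><span class="rating">9.1</span></li>
  <li><span class="title">Movie B</span><span class="rating">8.7</span></li>
</ol>
"""

# Non-greedy match for the title, digits and dots for the rating.
ITEM_PATTERN = re.compile(
    r'<span class="title">(?P<title>.*?)</span>'
    r'<span class="rating">(?P<rating>[\d.]+)</span>'
)


def extract_items(html):
    """Return a list of (title, rating) pairs found in the HTML text."""
    return [
        (match.group("title"), float(match.group("rating")))
        for match in ITEM_PATTERN.finditer(html)
    ]


if __name__ == "__main__":
    for title, rating in extract_items(SAMPLE_HTML):
        print(title, rating)
```

Patterns like this tend to break as soon as the page markup changes, which is one reason XPath or BeautifulSoup is often preferred for anything beyond small, stable fragments.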

Intermediate Stage

Font-Based Anti-Scraping

JS Reverse Engineering

Scrapy Framework

Anti-Debugging

Anti-Debugging Issues

CAPTCHAs

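Number	Website	Article
1	Douban	Douban movie rankings
2	University rankings
3	Weibo
4	Graduate admissions site (研招网)	Scraping admission adjustment (调剂) information from 研招网
5	Proxy IPs
6	Taobao
7	Stocks
8	Maoyan	Scraping tens of thousands of comments on The Wandering Earth from Douban and Maoyan
9	Children's stories	Sending my girlfriend bedtime stories on a schedule
10	CSDN
11	Baidu hot topics
12	Biquge
13	Tencent Video	Scraping danmaku (bullet comments) from Tencent Video TV series
14	Short English essays
15	Bus information
16	NetEase Cloud Reading
17	Toutiao
18	NetEase Cloud Music	JS reverse engineering: NetEase Cloud Music
19	Lagou
20	Youdao Translate	A first look at JS reverse engineering: Youdao Translate
21	Ali Literature	JS reverse engineering: Ali Literature
22	unsplash	Scrapy in practice: unsplash
23	League of Legends mobile app (掌上英雄联盟)	One-click scraping of 掌盟 articles
24	WeChat official accounts	Batch-downloading articles
25	Lianjia
26	Shixiseng	Font-based anti-scraping: Shixiseng
27	Autohome	Font-based anti-scraping: Autohome
28	Dianping	Font-based anti-scraping: Dianping
29	Onmyoji
30	Fantasy Westward Journey
31	Typhoons
32	Nationwide historical weather
33	Nowcoder	Scraping a large volume of interview write-ups with Python
34	PentaQ esports	Scraping League of Legends professional match data with Python
35	~~Baidu Wenku~~	Removed for reasons beyond my control
36	Zhihu	A huge collection of Zhihu stickers
37	wish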

Stars: ✭ 1,048