stevedsun / landchina-spider

Licence: MIT license

This project is outdated! It does not work with the redesigned website.

landchina data spider

Built on the Scrapy framework; uses Selenium + ChromeDriver to render and parse dynamically loaded data.
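One common way to wire this combination together is a Scrapy downloader middleware that hands each request to a Selenium-driven browser and returns the rendered HTML. The sketch below is illustrative only and is not taken from the project's code: the names `render_with_driver`, `SeleniumRenderMiddleware`, and `make_chrome_driver` are hypothetical, and it assumes Python 3 with a recent Selenium release rather than the Python 2 setup this README describes.

```python
def render_with_driver(driver, url):
    """Load `url` in a Selenium-style driver and return the rendered HTML."""
    driver.get(url)            # the browser executes the page's JavaScript
    return driver.page_source  # DOM serialized after rendering


def make_chrome_driver():
    """Create a headless Chrome driver; needs chromedriver on the PATH."""
    from selenium import webdriver  # imported lazily; requires Selenium

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    return webdriver.Chrome(options=options)


class SeleniumRenderMiddleware(object):
    """Hypothetical Scrapy downloader middleware backed by a browser."""

    def __init__(self, driver):
        self.driver = driver

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook to build the middleware instance.
        return cls(make_chrome_driver())

    def process_request(self, request, spider):
        # Imported lazily so the helper above can be used without Scrapy.
        from scrapy.http import HtmlResponse

        html = render_with_driver(self.driver, request.url)
        # Returning a Response here makes Scrapy skip its own downloader.
        return HtmlResponse(url=request.url, body=html,
                            encoding="utf-8", request=request)
```

In a Scrapy project, a middleware like this would be enabled through the `DOWNLOADER_MIDDLEWARES` setting. The driver is passed into the constructor so that a stub can stand in for the browser during testing.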

Quick Start

On macOS/Linux, first download the ~~phantomJS~~ Chrome Driver build for your platform and place it in a directory on the system-wide PATH (for example /usr/local/bin).
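A quick way to confirm that the driver is actually visible on the PATH before running the spider is sketched below. It is not part of the project: `find_driver` is a hypothetical helper, the executable name `chromedriver` is assumed, and `shutil.which` requires Python 3.3 or later.

```python
import os
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of `name` if it is on the PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver not found; PATH is: " + os.environ.get("PATH", ""))
    else:
        print("Using driver at " + path)
```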

Install Python 2 and pip, then run:

pip install -r requirements.txt
python manage.py

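The scraped data is saved in the results file.

Configuration

To change the administrative region that is crawled, enter the corresponding region code (listed in location.json) and the start and end dates of the query in info.ini.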
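As a rough illustration of how such settings could be read, the sketch below parses a region code and a date range with the standard `configparser` module (Python 3). The section name `query`, the key names, the date format, and the sample values are all assumptions made for this example; the real layout of info.ini may differ.

```python
import configparser
from datetime import datetime

# Hypothetical layout: the real section and key names in info.ini may differ.
SAMPLE_INI = """
[query]
region_code = 11
start_date = 2016-01-01
end_date = 2016-12-31
"""


def load_query(text):
    """Parse the region code and the query date range from INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["query"]
    start = datetime.strptime(section["start_date"], "%Y-%m-%d").date()
    end = datetime.strptime(section["end_date"], "%Y-%m-%d").date()
    if start > end:
        raise ValueError("start_date must not be after end_date")
    # The region code stays a string so that leading zeros are preserved.
    return section["region_code"], start, end


region, start, end = load_query(SAMPLE_INI)
print(region, start, end)
```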

ChangeLog

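2017-01-18 Added User-Agent spoofing; changed the date format; forms are now stored by month
2017-01-16 Added checkpoint logging for resuming interrupted crawls; improved the middleware call flow
2017-01-13 Released the first working version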
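The 2017-01-18 entry mentions storing forms by month. One way to split a query range into per-month chunks is sketched below; `month_ranges` is a hypothetical helper written for this example (Python 3) and is not taken from the project.

```python
import calendar
from datetime import date


def month_ranges(start, end):
    """Yield (first_day, last_day) pairs covering start..end month by month."""
    current = start
    while current <= end:
        last_day = calendar.monthrange(current.year, current.month)[1]
        month_end = min(date(current.year, current.month, last_day), end)
        yield current, month_end
        # Move to the first day of the following month.
        if current.month == 12:
            current = date(current.year + 1, 1, 1)
        else:
            current = date(current.year, current.month + 1, 1)


for first, last in month_ranges(date(2016, 11, 15), date(2017, 1, 10)):
    # The "%Y-%m" label could serve as a per-month file or table name.
    print(first.strftime("%Y-%m"), first, last)
```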
