wwj718 / jobSpider

Licence: other
jobSpider is a Scrapy spider for crawling job postings.

Programming Languages

python

Projects that are alternatives to or similar to jobSpider

python-data-viz-workshop
A workshop on data visualization in Python with notebooks and exercises for following along.
Stars: ✭ 136 (+385.71%)
Mutual labels:  bokeh
OpenScraper
An open source webapp for scraping: towards a public service for webscraping
Stars: ✭ 80 (+185.71%)
Mutual labels:  spider
QQSpider
Crawls QQ user information (QQ number, nickname, birthday, address, and other basic details) and performs a brief analysis.
Stars: ✭ 21 (-25%)
Mutual labels:  spider
spider-school
An automatic question-answering program 🎉
Stars: ✭ 37 (+32.14%)
Mutual labels:  spider
photo-spider-scrapy
10 photo website spiders: Scrapy spider code for 10 overseas photo libraries
Stars: ✭ 17 (-39.29%)
Mutual labels:  spider
douban-movie
Get movie info from Douban (豆瓣) and display it in your terminal
Stars: ✭ 17 (-39.29%)
Mutual labels:  spider
elves
🎊 Design and implementation of a lightweight crawler framework.
Stars: ✭ 322 (+1050%)
Mutual labels:  spider
python-spider
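Small Python crawler projects [continuously updated] [Biquge novel downloads, Tweet data scraping, weather lookup, NetEase Cloud Music reverse engineering, Tiantian Fund lookup, Weibo data scraping (cookie generation), Youdao Translate reverse engineering, Qichacha crawler without login, Dianping SVG encryption cracking, Bilibili user crawler, Lagou crawler without login, Ziroom rental font encryption, Zhihu Q&A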
Stars: ✭ 45 (+60.71%)
Mutual labels:  spider
OpenYspider
Image and video crawler at the scale of tens of millions of items [open source edition] Image Spider
Stars: ✭ 122 (+335.71%)
Mutual labels:  spider
Spydan
A web spider for shodan.io without using the Developer API.
Stars: ✭ 30 (+7.14%)
Mutual labels:  spider
ChineseStarsRelationship
Crawls data on Chinese celebrities. You can even obtain the relationships between everyone on the internet, and then take it from there yourself. Based on this data you can do more interesting things, such as social network analysis, relationship network visualization, algorithm research, and more.
Stars: ✭ 26 (-7.14%)
Mutual labels:  spider
angular-bokeh
An example Angular project that integrates Bokeh with Angular and sends data from a Python backend
Stars: ✭ 24 (-14.29%)
Mutual labels:  bokeh
scrapy-distributed
A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.
Stars: ✭ 38 (+35.71%)
Mutual labels:  spider
node-html-crawler
A simple-to-use Node HTML crawler (spider) for a site's web pages
Stars: ✭ 30 (+7.14%)
Mutual labels:  spider
learning spider
This is really a set of study notes, including learning records, a crawler practice platform (website), and homemade tool scripts.
Stars: ✭ 54 (+92.86%)
Mutual labels:  spider
Scrapy IPProxyPool
A free IP proxy pool. A plugin for the Scrapy crawler framework.
Stars: ✭ 100 (+257.14%)
Mutual labels:  spider
aliexscrape
Get Aliexpress product details in JSON
Stars: ✭ 80 (+185.71%)
Mutual labels:  spider
youdao
A web crawler for Youdao Dictionary.
Stars: ✭ 22 (-21.43%)
Mutual labels:  spider
benchmark-http
No description or website provided.
Stars: ✭ 15 (-46.43%)
Mutual labels:  spider
landchina-spider
This project is obsolete! It does not work with the redesigned website.
Stars: ✭ 13 (-53.57%)
Mutual labels:  spider

#jobSpider

jobSpider is a Scrapy spider for crawling job postings.

Currently included:

Features

  1. Crawls job postings from Lagou (the 5,000 most recent listings)
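
The README does not say how the 5,000-item limit is enforced. One possible approach, shown here purely as an assumption, is Scrapy's built-in `CLOSESPIDER_ITEMCOUNT` setting, which tells the CloseSpider extension to stop the crawl once that many items have been scraped.

```python
# Settings fragment (a sketch, not jobSpider's actual configuration).
# CLOSESPIDER_ITEMCOUNT is a standard Scrapy setting: the spider is closed
# once this many items have passed through the item pipelines.
CLOSESPIDER_ITEMCOUNT = 5000
```

Requests already in flight when the limit is reached are still processed, so the final count can end up slightly above 5,000.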

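Installation and dependencies

  • git clone https://github.com/wwj718/jobSpider
  • cd jobSpider
  • pip install -r requirements.txt
  • mongodb (optional)
  • In setting.py, change the path where the CSV file is saved (the FEED_URI variable); the default is the current directory
  • Run: scrapy crawl LagouSpider (starts crawling data)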
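
As a rough illustration of the `FEED_URI` step above, the CSV export settings might look something like the fragment below. This is a sketch under assumptions: the file name `lagou_jobs.csv` and the helper names `OUTPUT_DIR` and `OUTPUT_FILE` are hypothetical, and the project's real settings file may differ. `FEED_FORMAT` and `FEED_URI` are standard Scrapy feed-export settings; newer Scrapy releases prefer the `FEEDS` dictionary instead.

```python
import os

# Hypothetical output location. The README only states that FEED_URI controls
# where the CSV is saved and that the default is the current directory.
OUTPUT_DIR = os.getcwd()
OUTPUT_FILE = "lagou_jobs.csv"  # assumed file name, not taken from the project

# Standard Scrapy feed-export settings: write scraped items as CSV to FEED_URI.
FEED_FORMAT = "csv"
FEED_URI = os.path.join(OUTPUT_DIR, OUTPUT_FILE)
```

To save the file elsewhere, point `OUTPUT_DIR` at another directory before running `scrapy crawl LagouSpider`.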

My development environment

OS X, Python 2.7

Tested and working on Windows 7

Optional features

To use the MongoDB database, uncomment ITEM_PIPELINES in setting.py
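
For readers unfamiliar with how `ITEM_PIPELINES` enables MongoDB storage, here is a minimal sketch of what such a pipeline could look like. It is not the project's actual code: the class name `MongoPipeline`, the module path `jobSpider.pipelines`, the database and collection names, and the `client_factory` parameter are all assumptions made for illustration. It relies only on the standard Scrapy pipeline hooks (`open_spider`, `close_spider`, `process_item`) and on PyMongo's `MongoClient` and `insert_one`.

```python
# Sketch of a Scrapy item pipeline that stores scraped items in MongoDB.
# Every name here is hypothetical, not taken from the jobSpider source.

# In the settings file, uncommenting ITEM_PIPELINES activates the pipeline.
# The number (0-1000) sets the order in which pipelines run.
ITEM_PIPELINES = {
    "jobSpider.pipelines.MongoPipeline": 300,  # assumed module path
}


class MongoPipeline(object):
    def __init__(self, mongo_uri="mongodb://localhost:27017",
                 db_name="jobs", collection_name="lagou",
                 client_factory=None):
        self.mongo_uri = mongo_uri
        self.db_name = db_name
        self.collection_name = collection_name
        # client_factory lets a caller substitute a stand-in client.
        self.client_factory = client_factory
        self.client = None
        self.collection = None

    def open_spider(self, spider):
        # Called by Scrapy when the spider starts: open the connection.
        factory = self.client_factory
        if factory is None:
            import pymongo  # imported lazily; only needed when MongoDB is used
            factory = pymongo.MongoClient
        self.client = factory(self.mongo_uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        # Called by Scrapy when the spider finishes: release the connection.
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        # Store a plain-dict copy of the item, then pass the item along.
        self.collection.insert_one(dict(item))
        return item
```

Importing `pymongo` inside `open_spider` keeps MongoDB optional, in line with the README: the dependency is only needed once the pipeline is enabled.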

Code style

yapf is used to keep the code style consistent

yapf -i filename.py
