BruceDone / Scrapy_demo

All kinds of Scrapy demos

Scrapy_demo

This project contains spiders for the websites I used to crawl most often. If this project helped you, please give it a star, thanks :)

Spider list

  • douban
  • douban_oss
  • googleplay
  • cnbeta
  • ka
  • cnblogs

Project Features

  • googleplay uses CrawlSpider and pymongo.
  • douban uses the images pipeline to download images (sending custom headers to avoid being banned); when the crawl finishes, it writes the item information to a txt file.
  • cnbeta uses SQLAlchemy to save items to a MySQL database (or any other database that SQLAlchemy supports).
  • ka uses Kafka. It is a demo of how to use Scrapy and Kafka together: the spider never closes, and whenever you push a message to Kafka, it starts crawling the URL you just sent.
  • cnblogs uses a signal handler.
  • douban_oss uses the Aliyun OSS SDK to upload the images downloaded by the images pipeline to OSS storage.
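
As an illustration of the "write the item information to a txt file when the crawl finishes" behaviour described for douban, here is a minimal sketch of an item pipeline. It is not the project's actual code: the class name TxtExportPipeline, the output file name items.txt, and the one-JSON-object-per-line format are all assumptions made for this example. It relies only on the standard pipeline hooks (open_spider, process_item, close_spider) that Scrapy calls on any pipeline class.

```python
import json
from pathlib import Path


class TxtExportPipeline:
    """Collect every scraped item and write them all to a text file on close."""

    # Hypothetical output location; the real project may use another name.
    output_path = Path("items.txt")

    def open_spider(self, spider):
        # Called once by Scrapy when the spider starts.
        self.items = []

    def process_item(self, item, spider):
        # Called for every scraped item. Returning the item lets any
        # later pipeline (for example an images pipeline) see it too.
        self.items.append(dict(item))
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes: one JSON object per line.
        with self.output_path.open("w", encoding="utf-8") as fh:
            for entry in self.items:
                fh.write(json.dumps(entry, ensure_ascii=False) + "\n")
```

To try it, the class would be registered in settings.py through the ITEM_PIPELINES setting, for example ITEM_PIPELINES = {"myproject.pipelines.TxtExportPipeline": 300}, where the module path is a placeholder.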

How to use

Each project has a run_spider.py script; just run it and enjoy :)

python run_spider.py
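
If you want a similar launcher for your own project, the following is a rough sketch of what such a script could look like. It is not a copy of the run_spider.py files in this repository: the build_argv helper and the command-line argument for the spider name are assumptions made for this example. It uses scrapy.cmdline.execute, which runs the same code path as the scrapy command-line tool.

```python
import sys


def build_argv(spider_name):
    """Return the argument list equivalent to `scrapy crawl <spider_name>`."""
    return ["scrapy", "crawl", spider_name]


def main(spider_name):
    # Imported here so that build_argv can be used without Scrapy installed.
    from scrapy.cmdline import execute

    # Must be run from the project directory (the one containing scrapy.cfg).
    execute(build_argv(spider_name))


if __name__ == "__main__":
    if len(sys.argv) > 1:
        main(sys.argv[1])
    else:
        print("usage: python run_spider.py <spider_name>")
```

With this variant you would run, for example, python run_spider.py cnbeta from inside the project directory.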