IaroslavR / scrapy-mysql-pipeline

Licence: other
scrapy mysql pipeline

Programming Languages

python

Projects that are alternatives of or similar to scrapy-mysql-pipeline

domains
World’s single largest Internet domains dataset
Stars: ✭ 461 (+880.85%)
Mutual labels:  scrapy
asyncpy
A lightweight asynchronous, coroutine-based web crawler framework built with asyncio and aiohttp
Stars: ✭ 86 (+82.98%)
Mutual labels:  scrapy
ArticleSpider
Crawling zhihu, jobbole, lagou by Scrapy, and using Elasticsearch+Django to build a Search Engine website --- README_zh.md (including: implementation roadmap, distributed-crawler and coping with anti-crawling strategies).
Stars: ✭ 34 (-27.66%)
Mutual labels:  scrapy
lgcrawl
Crawls job listings across the entire Lagou site with python+scrapy+splash
Stars: ✭ 22 (-53.19%)
Mutual labels:  scrapy
scrapy helper
Dynamically configurable crawler
Stars: ✭ 84 (+78.72%)
Mutual labels:  scrapy
vietnam-ecommerce-crawler
Crawling the data from lazada, websosanh, compare.vn, cdiscount and cungmua with flexible configs
Stars: ✭ 28 (-40.43%)
Mutual labels:  scrapy
Awesome crawl
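Crawlers for Tencent News, Zhihu topics, Weibo followers, Tumblr, Douyu bullet comments (danmaku) and Meizitu, plus distributed design and more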
Stars: ✭ 246 (+423.4%)
Mutual labels:  scrapy
fernando-pessoa
Classifier of Fernando Pessoa's poems according to his heteronyms
Stars: ✭ 31 (-34.04%)
Mutual labels:  scrapy
Web-Iota
Iota is a web scraper which can find all of the images and links/suburls on a webpage
Stars: ✭ 60 (+27.66%)
Mutual labels:  scrapy
double-agent
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (+161.7%)
Mutual labels:  scrapy
arche
Analyze scraped data
Stars: ✭ 49 (+4.26%)
Mutual labels:  scrapy
scrapy-rotated-proxy
A scrapy middleware to use rotated proxy ip list.
Stars: ✭ 22 (-53.19%)
Mutual labels:  scrapy
scrapy-LBC
LeBonCoin spider with Scrapy and ElasticSearch
Stars: ✭ 14 (-70.21%)
Mutual labels:  scrapy
pagser
Pagser is a simple, extensible, configurable library that parses and deserializes HTML pages into structs, based on goquery and struct tags, for Go crawlers
Stars: ✭ 82 (+74.47%)
Mutual labels:  scrapy
scrapy-wayback-machine
A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Stars: ✭ 92 (+95.74%)
Mutual labels:  scrapy
estate-crawler
Scraping the real estate agencies for up-to-date house listings as soon as they arrive!
Stars: ✭ 20 (-57.45%)
Mutual labels:  scrapy
crawler
A collection of Python crawler projects
Stars: ✭ 29 (-38.3%)
Mutual labels:  scrapy
scrapy-html-storage
Scrapy downloader middleware that stores response HTMLs to disk.
Stars: ✭ 17 (-63.83%)
Mutual labels:  scrapy
itemadapter
Common interface for data container classes
Stars: ✭ 47 (+0%)
Mutual labels:  scrapy
Scrape-Finance-Data
My code for scraping financial data in Vietnam
Stars: ✭ 13 (-72.34%)
Mutual labels:  scrapy

Python 3.6

Pull requests are always welcome

scrapy-mysql-pipeline

Asynchronous MySQL item pipeline for Scrapy

Installation

pip install scrapy-mysql-pipeline

Configuration

Add the pipeline to ITEM_PIPELINES in settings.py:

ITEM_PIPELINES = {
    'scrapy_mysql_pipeline.MySQLPipeline': 300,
}

Default values:

MYSQL_HOST = 'localhost'
MYSQL_PORT = 3306
MYSQL_USER = None
MYSQL_PASSWORD = ''
MYSQL_DB = None
MYSQL_TABLE = None
MYSQL_UPSERT = False
MYSQL_RETRIES = 3
MYSQL_CLOSE_ON_ERROR = True
MYSQL_CHARSET = 'utf8'

The MYSQL_USER, MYSQL_PASSWORD, MYSQL_DB and MYSQL_TABLE variables must be set in settings.py.
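
As a rough illustration, a settings.py that enables the pipeline and fills in the four required variables might look like the sketch below. Every value shown (user name, password, database and table names) is a made-up placeholder, and missing_mysql_settings is a hypothetical helper written for this example, not something the package provides.

```python
# settings.py (sketch). All credential values are placeholders for illustration.
ITEM_PIPELINES = {
    'scrapy_mysql_pipeline.MySQLPipeline': 300,
}

MYSQL_HOST = 'localhost'       # same as the documented default
MYSQL_PORT = 3306              # same as the documented default
MYSQL_USER = 'scrapy_user'     # placeholder: replace with your MySQL user
MYSQL_PASSWORD = 'change-me'   # placeholder: replace with your password
MYSQL_DB = 'scraped_data'      # placeholder: database that receives the items
MYSQL_TABLE = 'items'          # placeholder: table that receives the items

# Names the README says must be set in settings.py.
REQUIRED = ('MYSQL_USER', 'MYSQL_PASSWORD', 'MYSQL_DB', 'MYSQL_TABLE')


def missing_mysql_settings(settings):
    """Hypothetical helper: list required names that are absent or None.

    `settings` is any mapping with a .get() method, such as a plain dict.
    """
    return [name for name in REQUIRED if settings.get(name) is None]
```

Calling missing_mysql_settings on a dict of your settings before starting a crawl is one way to catch a forgotten variable early; an empty list means all four names are present.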