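Source code for "Data Collection: From Getting Started to Giving Up". Contents: introduction to web crawlers, the job market, and interview questions for crawler engineers; introduction to the HTTP protocol; using Requests; introduction to the XPath parser; MongoDB and MySQL; multithreaded crawlers; introduction to Scrapy; introduction to Scrapy-redis; deploying with Docker; managing a Docker cluster with Nomad; querying Docker logs with EFK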

✭ 118

python docker mysql http mongodb crawler scrapy requests xpath

Seleniumcrawler

An example that uses Selenium WebDriver for Python and the Scrapy framework to build a web scraper that crawls an ASP site

✭ 117

python selenium scraper scrapy scraping selenium-webdriver asp-net

Copybook

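Uses a crawler to scrape every novel from a novel website, stores them in a database, and builds its own novel website from the scraped data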

✭ 117

python css django spider scrapy

Cnkispider

A spider for CNKI patent content, for study and communication only; not for commercial use.

✭ 117

python scrapy

Maria Quiteria

Backend for collecting and publishing the data 📜

✭ 115

python hacktoberfest django django-rest-framework scrapy civic-tech

Patentcrawler

Scrapy patent crawler (no longer maintained)

✭ 114

python visualization data crawler scrapy echarts

Weibo hot search

Weibo crawler: scrapes the Weibo hot search list on a daily schedule, preserving the memories of internet users.

✭ 113

python scrapy weibo

Scrala

Unmaintained 🐳 ☕️ 🕷 Scala crawler (spider) framework, inspired by Scrapy, created by @gaocegege

✭ 113

scala docker spider scrapy actor-model

Programer log

Latest updates are here: [My Programmer Log]

✭ 112

python jupyter-notebook docker scrapy

Wswp

Code for the second edition of the book Web Scraping with Python by Packt Publications

✭ 112

python python3 selenium scrapy teaching webscraping

Hive

Lots of spiders

✭ 110

python python3 spider scrapy selenium-webdriver beautifulsoup

Crawler

Crawlers, HTTP proxies, and simulated login!

✭ 106

python crawler scrapy

Scrapyd Cluster On Heroku

Set up a free and scalable Scrapyd cluster for distributed web crawling with just a few clicks.

✭ 106

python heroku cluster scrapy web-scraping

Decoration Design Crawler

Crawler for the 土巴兔 (Tubatu) and 谷居 (Guju) home renovation websites

✭ 105

python scrapy

Dotnetcrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library built on .NET Core that outputs to Entity Framework Core. It is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

✭ 100

csharp crawler dotnetcore scrapy scraping entity-framework-core webscraping ddd-architecture crawling

Experiments

Some research experiments

✭ 95

jupyter-notebook deep-learning scrapy word2vec

Proxy server crawler

An awesome public proxy server crawler based on the Scrapy framework

✭ 94

python scrapy

Scrapoxy

Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!

✭ 1,322

javascript nodejs proxy cloud crawler angularjs scraper scrapy

