Crawls zhihu, jobbole, and lagou with Scrapy, and uses Elasticsearch + Django to build a search engine website. See README_zh.md (covers the implementation roadmap, distributed crawling, and coping with anti-crawling strategies).

Stars: ✭ 34 (-43.33%)

Mutual labels: scrapy

tuchong Spider

⭐ Crawler for the Tuchong website

Stars: ✭ 16 (-73.33%)

Mutual labels: spider

python-crawler

A repository for learning web crawling, suitable for people with no prior experience and friendly to beginners

Stars: ✭ 37 (-38.33%)

Mutual labels: scrapy

PTT Beauty Spider

Crawler and image downloader for the PTT Beauty board

Stars: ✭ 47 (-21.67%)

Mutual labels: spider

php-crawler

🕷️ A simple crawler (spider) written in PHP just for fun, with zero dependencies

Stars: ✭ 39 (-35%)

Mutual labels: spider

gathertool

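gathertool is a scripting-style development library for Golang that aims to make program development more efficient in the scenarios it targets; it includes a lightweight crawler library, an API testing and stress testing library, a DB operations library, and more.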

Stars: ✭ 36 (-40%)

Mutual labels: spider

ufc fight predictor

UFC bout winner prediction using neural nets.

Stars: ✭ 22 (-63.33%)

Mutual labels: scrapy

article-spider

Article collection tool

Stars: ✭ 130 (+116.67%)

Mutual labels: spider

double-agent

A test suite of common scraper detection techniques. See how detectable your scraper stack is.

Stars: ✭ 123 (+105%)

Mutual labels: scrapy

Sina Spider

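Sina crawler based on Python + Selenium. It saves the cookie after a simulated login to persist the login state, and can crawl popular Weibo posts related to a keyword you enter.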

Stars: ✭ 25 (-58.33%)

Mutual labels: spider

Autohome

Uses Scrapy to crawl Autohome and store the data in MongoDB; simple analysis and NLP coming soon

Stars: ✭ 23 (-61.67%)

Mutual labels: scrapy

bangumi yearly report

No description or website provided.

Stars: ✭ 24 (-60%)

Mutual labels: spider

Scrape-Finance-Data

My code for scraping financial data in Vietnam

Stars: ✭ 13 (-78.33%)

Mutual labels: scrapy

ZSpider

Electron-based crawler program

Stars: ✭ 37 (-38.33%)

Mutual labels: spider

MoMo

Uses the sharing feature of 墨墨背单词 (MoMo vocabulary app) to claim the daily word-limit reward of 20 words (multithreaded)

Stars: ✭ 45 (-25%)

Mutual labels: spider

scrapy-LBC

LeBonCoin spider built with Scrapy and ElasticSearch

Stars: ✭ 14 (-76.67%)

Mutual labels: scrapy

vietnam-ecommerce-crawler

Crawling the data from lazada, websosanh, compare.vn, cdiscount and cungmua with flexible configs

Stars: ✭ 28 (-53.33%)

Mutual labels: scrapy

robotstxt

robots.txt file parsing and checking for R

Stars: ✭ 65 (+8.33%)

Mutual labels: spider

DSpiderDemo-Android

Android demo of a client-side crawler

Stars: ✭ 43 (-28.33%)

Mutual labels: spider

TaobaoSpider

This taobao spider has been archived

Stars: ✭ 28 (-53.33%)

Mutual labels: spider

crawler

A collection of Python crawler projects

Stars: ✭ 29 (-51.67%)

Mutual labels: scrapy

goSpider

Some small projects and some articles

Stars: ✭ 56 (-6.67%)

Mutual labels: spider

asyncpy

A lightweight asynchronous, coroutine-based web crawler framework built with asyncio and aiohttp

Stars: ✭ 86 (+43.33%)

Mutual labels: scrapy

grapy

Grapy, a fast high-level web crawling framework for Python 3.3 or later, based on asyncio.

Stars: ✭ 18 (-70%)

Mutual labels: spider

😚 Q & A website based on Spring Boot.

Stars: ✭ 46 (-23.33%)

Mutual labels: spider

spider

A web spider framework

Stars: ✭ 25 (-58.33%)

Mutual labels: spider

invana-bot

A web crawler that scrapes using YAML and Python code.

Stars: ✭ 30 (-50%)

Mutual labels: scrapy

js block

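Research and study of various blocking techniques: anti-crawling, ad blocking, preventing ad injection, fighting scalpers, and so on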

Stars: ✭ 59 (-1.67%)

Mutual labels: spider

feaplat

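Crawler management system with cluster support and elastic scaling. Supports running various frameworks and scripts such as feapder, scrapy, selenium, and playwright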

Stars: ✭ 42 (-30%)

Mutual labels: spider

weixin article spiders

A spider program for Weixin (WeChat) articles, built with Express and cheerio

Stars: ✭ 33 (-45%)

Mutual labels: spider

crawler-chrome-extensions

Chrome extensions commonly used by crawler developers

Stars: ✭ 53 (-11.67%)

Mutual labels: spider

61-120 of 565 similar projects