494 open source projects that are alternatives to or similar to web-data-extractor

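Douban Movie Top 250; scraping JSON data and pictures of women from Douyu; Taobao; Youyuan; using CrawlSpider to scrape basic profile information of matchmaking users on Hongniang, plus distributed crawling of Hongniang with Redis storage; small crawler demos; Selenium; scraping Duodian; building APIs with Django; scraping Youyuan profile data; simulated login to Zhihu, GitHub, and Tuchong; scraping the entire Duodian store site; scraping the article history of WeChat official accounts; scraping articles shared in WeChat groups or by WeChat friends; using itchat to monitor articles shared by specified WeChat official accounts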

Stars: ✭ 615 (+1082.69%)

Mutual labels: spider, xpath

Spider Flow

A next-generation crawler platform: define crawler workflows graphically and build a crawler without writing code.

Stars: ✭ 365 (+601.92%)

Mutual labels: spider, xpath

Z-Spider

A collection of crawler development tips and examples

Stars: ✭ 33 (-36.54%)

Mutual labels: spider, xpath

OpenScraper

An open source webapp for scraping: towards a public service for webscraping

Stars: ✭ 80 (+53.85%)

Mutual labels: spider, xpath

fs2-data

streaming data parsing and transformation library

Stars: ✭ 103 (+98.08%)

Mutual labels: xpath, jsonpath

go-xmldom

XML DOM processing for Golang, supports xpath query

Stars: ✭ 38 (-26.92%)

Mutual labels: xpath

JsonPathKt

A lighter and more efficient implementation of JsonPath in Kotlin

Stars: ✭ 37 (-28.85%)

Mutual labels: jsonpath

OpenYspider

Image and video crawler at the scale of tens of millions of items [open source edition] Image Spider

Stars: ✭ 122 (+134.62%)

Mutual labels: spider

spider-school

Automatic question-answering program 🎉

Stars: ✭ 37 (-28.85%)

Mutual labels: spider

fb scraper

FBLYZE is a Facebook scraping and analysis system.

Stars: ✭ 61 (+17.31%)

Mutual labels: extract-data

jessie

JsonPath for Dart

Stars: ✭ 23 (-55.77%)

Mutual labels: jsonpath

elves

🎊 Design and implementation of a lightweight crawler framework.

Stars: ✭ 322 (+519.23%)

Mutual labels: spider

landchina-spider

This project is outdated! It does not work with the redesigned website.

Stars: ✭ 13 (-75%)

Mutual labels: spider

rb-spider

A Ruby implementation of a crawler built on the RabbitMQ middleware [Developing]

Stars: ✭ 13 (-75%)

Mutual labels: spider

aliexscrape

Get Aliexpress product details in JSON

Stars: ✭ 80 (+53.85%)

Mutual labels: spider

araneid

A site-network system (spider pool system) developed in Golang

Stars: ✭ 25 (-51.92%)

Mutual labels: spider

dotnet-security-unit-tests

A web application that contains several unit tests for the purpose of .NET security

Stars: ✭ 25 (-51.92%)

Mutual labels: xpath

python-fxxk-spider

A collection of free Python crawler projects

Stars: ✭ 184 (+253.85%)

Mutual labels: spider

node-html-crawler

Simple-to-use Node HTML crawler (spider) for website pages

Stars: ✭ 30 (-42.31%)

Mutual labels: spider

codechef-rank-comparator

Web application hosted on the Heroku cloud platform, based on web scraping in Python using the lxml library and XPath (XML Path Language).

Stars: ✭ 23 (-55.77%)

Mutual labels: xpath

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Stars: ✭ 52 (+0%)

Mutual labels: spider

youdao

Web crawler for the Youdao Dictionary website

Stars: ✭ 22 (-57.69%)

Mutual labels: spider

😚 Q & A website based on Spring Boot.

Stars: ✭ 46 (-11.54%)

Mutual labels: spider

nodejs-meizitu

Full-site scrape of Meizitu: 10 GB of photo sets

Stars: ✭ 80 (+53.85%)

Mutual labels: spider

douyin-api

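Douyin interface, Douyin API, Douyin data crawler, Douyin livestream data, Douyin livestream API, Douyin video API, Douyin crawler, Douyin watermark removal, Douyin video download, Douyin video parsing, Douyin livestream monitoring, Douyin data collection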

Stars: ✭ 41 (-21.15%)

Mutual labels: spider

benchmark-http

No description or website provided.

Stars: ✭ 15 (-71.15%)

Mutual labels: spider

Subbranch-China

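Bank and branch names. A crawler for the branch names of banks in every region of China, with data sourced from the WeChat merchant platform; includes SQL files that are ready to import.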

Stars: ✭ 31 (-40.38%)

Mutual labels: spider

DouBanReptile

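Multithreaded crawler for Douban rental groups. After crawling, it automatically sorts the results by time and generates a Markdown file.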

Stars: ✭ 31 (-40.38%)

Mutual labels: xpath

hupu Album Downloader

Album download tool for Hupu

Stars: ✭ 17 (-67.31%)

Mutual labels: spider

scrapy-distributed

A series of distributed components for Scrapy, including RabbitMQ-based, Kafka-based, and RedisBloom-based components.

Stars: ✭ 38 (-26.92%)

Mutual labels: spider

zhihu

Search your Zhihu favorites: browse the contents of all your collections at a glance and run full-text searches

Stars: ✭ 39 (-25%)

Mutual labels: spider

douban-movie

Get movie info from Douban (豆瓣) and display it in your terminal

Stars: ✭ 17 (-67.31%)

Mutual labels: spider

zucc xk ZhengFang

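Course-grabbing assistant for the ZUCC ZhengFang academic affairs system. It simulates login to the ZUCC ZhengFang system, scrapes course information, and automatically captures and sends packets to grab courses. For the implementation flow, see the implementation-principles link in the README.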

Stars: ✭ 40 (-23.08%)

Mutual labels: spider

documentDownloader

Download documents from book118 for free

Stars: ✭ 72 (+38.46%)

Mutual labels: spider

photo-spider-scrapy

10 photo website spiders: Scrapy crawler code for 10 international stock photo sites

Stars: ✭ 17 (-67.31%)

Mutual labels: spider

jsonuri

🌳 Low-level data manipulation library for Alibaba's Jianyu, iceluna, and vanex; backtracks to ancestor nodes in O(n) complexity

Stars: ✭ 131 (+151.92%)

Mutual labels: jsonpath

ChineseStarsRelationship

Chinese celebrity data crawling. You can even obtain the relationships between all the people on the internet, and what you do next is up to you. Based on this data, you can do more interesting things, such as social network analysis, relationship network visualization, algorithm research, and more.

Stars: ✭ 26 (-50%)

Mutual labels: spider

SpiderDemo

Crawler demo, implemented in Python

Stars: ✭ 56 (+7.69%)

Mutual labels: spider

JSONPath.sh

JSONPath implementation in Bash for filtering, merging and modifying JSON

Stars: ✭ 45 (-13.46%)

Mutual labels: jsonpath

MusicSpider

Music Spider. Go 👾 Music Spider is a music aggregation crawler written in Golang; currently supported sites include NetEase, QQ, Xiami, Kugou, and Baidu.