494 open source projects that are alternatives to or similar to web-data-extractor

Python Spider
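Douban Movie Top 250; scraping JSON data and photos of women from Douyu; Taobao; Youyuan; a CrawlSpider that scrapes basic profile information from the Hongniang matchmaking site, plus distributed crawling of Hongniang with Redis storage; small crawler demos; Selenium; scraping Duodian; API development with Django; scraping Youyuan profiles; simulated login to Zhihu, GitHub, and Tuchong; scraping the entire Duodian store site; scraping the article history of WeChat official accounts; scraping articles shared in WeChat groups or by WeChat friends; using itchat to monitor articles shared by specified WeChat official accounts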
Stars: ✭ 615 (+1082.69%)
Mutual labels:  spider, xpath
Spider Flow
A next-generation crawler platform: define crawler workflows graphically and build crawlers without writing code.
Stars: ✭ 365 (+601.92%)
Mutual labels:  spider, xpath
Z-Spider
Tips and case studies for crawler development
Stars: ✭ 33 (-36.54%)
Mutual labels:  spider, xpath
OpenScraper
An open source webapp for scraping: towards a public service for webscraping
Stars: ✭ 80 (+53.85%)
Mutual labels:  spider, xpath
fs2-data
streaming data parsing and transformation library
Stars: ✭ 103 (+98.08%)
Mutual labels:  xpath, jsonpath
go-xmldom
XML DOM processing for Golang, supports xpath query
Stars: ✭ 38 (-26.92%)
Mutual labels:  xpath
JsonPathKt
A lighter and more efficient implementation of JsonPath in Kotlin
Stars: ✭ 37 (-28.85%)
Mutual labels:  jsonpath
OpenYspider
Image and video crawler handling tens of millions of items [open source edition] Image Spider
Stars: ✭ 122 (+134.62%)
Mutual labels:  spider
spider-school
Automatic question-answering program 🎉
Stars: ✭ 37 (-28.85%)
Mutual labels:  spider
fb scraper
FBLYZE is a Facebook scraping system and analysis system.
Stars: ✭ 61 (+17.31%)
Mutual labels:  extract-data
jessie
JsonPath for Dart
Stars: ✭ 23 (-55.77%)
Mutual labels:  jsonpath
elves
🎊 Design and implementation of a lightweight crawler framework.
Stars: ✭ 322 (+519.23%)
Mutual labels:  spider
landchina-spider
This project is outdated! It does not work with the redesigned website.
Stars: ✭ 13 (-75%)
Mutual labels:  spider
rb-spider
A Ruby implementation of a crawler built on RabbitMQ middleware [Developing]
Stars: ✭ 13 (-75%)
Mutual labels:  spider
aliexscrape
Get Aliexpress product details in JSON
Stars: ✭ 80 (+53.85%)
Mutual labels:  spider
araneid
A site-network system (spider pool system) developed in Golang
Stars: ✭ 25 (-51.92%)
Mutual labels:  spider
dotnet-security-unit-tests
A web application that contains several unit tests for the purpose of .NET security
Stars: ✭ 25 (-51.92%)
Mutual labels:  xpath
python-fxxk-spider
A collection of free Python crawler projects
Stars: ✭ 184 (+253.85%)
Mutual labels:  spider
node-html-crawler
Simple-to-use Node HTML crawler (spider) for website pages
Stars: ✭ 30 (-42.31%)
Mutual labels:  spider
codechef-rank-comparator
Web application hosted on the Heroku cloud platform, based on web scraping in Python using the lxml library (XML Path Language).
Stars: ✭ 23 (-55.77%)
Mutual labels:  xpath
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (+0%)
Mutual labels:  spider
youdao
Web crawler for Youdao Dictionary
Stars: ✭ 22 (-57.69%)
Mutual labels:  spider
qa
😚 Q & A website based on Spring Boot.
Stars: ✭ 46 (-11.54%)
Mutual labels:  spider
nodejs-meizitu
Full-site scraper for Meizitu, collecting 10 GB of photo sets
Stars: ✭ 80 (+53.85%)
Mutual labels:  spider
douyin-api
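Douyin interface, Douyin API, Douyin data crawler, Douyin livestream data, Douyin livestream API, Douyin video API, Douyin crawler, Douyin watermark removal, Douyin video download, Douyin video parsing, Douyin livestream monitoring, Douyin data collection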
Stars: ✭ 41 (-21.15%)
Mutual labels:  spider
benchmark-http
No description or website provided.
Stars: ✭ 15 (-71.15%)
Mutual labels:  spider
Subbranch-China
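Bank and branch names. A crawler for the branch names of every bank in every region of China, sourced from the WeChat merchant platform, with ready-to-import SQL files included.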
Stars: ✭ 31 (-40.38%)
Mutual labels:  spider
DouBanReptile
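Multithreaded crawler for Douban rental groups. After crawling, it automatically sorts the results by time and generates a Markdown file.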
Stars: ✭ 31 (-40.38%)
Mutual labels:  xpath
hupu Album Downloader
Album download tool for Hupu
Stars: ✭ 17 (-67.31%)
Mutual labels:  spider
scrapy-distributed
A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.
Stars: ✭ 38 (-26.92%)
Mutual labels:  spider
zhihu
Search your Zhihu favorites: browse the contents of all your collections at a glance and run full-text searches
Stars: ✭ 39 (-25%)
Mutual labels:  spider
douban-movie
Get movie info from Douban (豆瓣) and display it in your terminal
Stars: ✭ 17 (-67.31%)
Mutual labels:  spider
zucc xk ZhengFang
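Course-grabbing assistant for the ZUCC ZhengFang academic administration system. It simulates login to the ZUCC ZhengFang system, scrapes course information, and automatically captures and sends packets to grab courses. For the implementation flow, see the implementation-principles link in the README.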
Stars: ✭ 40 (-23.08%)
Mutual labels:  spider
documentDownloader
Download documents from book118 for free
Stars: ✭ 72 (+38.46%)
Mutual labels:  spider
photo-spider-scrapy
10 photo website spiders: Scrapy crawler code for 10 international stock photo sites
Stars: ✭ 17 (-67.31%)
Mutual labels:  spider
jsonuri
🌳 Low-level data manipulation library behind Ali Jianyu, iceluna, and vanex; backtracks to ancestor nodes in O(n) complexity
Stars: ✭ 131 (+151.92%)
Mutual labels:  jsonpath
ChineseStarsRelationship
Chinese celebrity data crawling. You can even obtain the relationships among everyone on the internet, and then take it from there. Based on this data, you can do more interesting things, such as social network analysis, relationship network visualization, algorithm research, and more.
Stars: ✭ 26 (-50%)
Mutual labels:  spider
SpiderDemo
Crawler demo implemented in Python
Stars: ✭ 56 (+7.69%)
Mutual labels:  spider
JSONPath.sh
JSONPath implementation in Bash for filtering, merging and modifying JSON
Stars: ✭ 45 (-13.46%)
Mutual labels:  jsonpath
MusicSpider
Music Spider. Go 👾 Music Spider is a music aggregation crawler written in Golang; currently supported sites include NetEase, QQ, Xiami, Kugou, and Baidu.
Stars: ✭ 24 (-53.85%)
Mutual labels:  spider
Scrapy IPProxyPool
Free IP proxy pool. A plugin for the Scrapy crawler framework
Stars: ✭ 100 (+92.31%)
Mutual labels:  spider
scrapy facebooker
Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
Stars: ✭ 22 (-57.69%)
Mutual labels:  spider
spider
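🌟 Powered by Python 3 (simple spider learning projects): crawlers for Baidu Wenku, NetEase Cloud Music songs, Douban movies, GitHub, JD.com, QQ Zone, weather, a VIP video parsing helper, TED transcripts, a Wi-Fi cracking script, setting Bing images as the desktop wallpaper, and more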
Stars: ✭ 124 (+138.46%)
Mutual labels:  spider
jobSpider
jobSpider is a Scrapy crawler for scraping job postings
Stars: ✭ 28 (-46.15%)
Mutual labels:  spider
163Music
163music spider by scrapy.
Stars: ✭ 60 (+15.38%)
Mutual labels:  spider
fontoxpath
A minimalistic XPath 3.1 implementation in pure JavaScript
Stars: ✭ 97 (+86.54%)
Mutual labels:  xpath
ctxexp-parser
Executes JS class-style function calls in a dynamic-execution JS environment (WeChat mini program).
Stars: ✭ 17 (-67.31%)
Mutual labels:  jsonpath
python-spider
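Small Python crawler projects [continuously updated] [Biquge novel downloads, Tweet data scraping, weather lookup, NetEase Cloud Music reverse engineering, Tiantian Fund lookup, Weibo data scraping (cookie generation), Youdao Translate reverse engineering, Qichacha crawler without login, Dianping SVG encryption cracking, Bilibili user crawler, Lagou crawler without login, Ziroom rental font encryption, Zhihu Q&A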
Stars: ✭ 45 (-13.46%)
Mutual labels:  spider
spider
A web spider framework
Stars: ✭ 25 (-51.92%)
Mutual labels:  spider
L-Spider
A DHT spider that lets you sniff torrents and magnet links. You can download them directly.
Stars: ✭ 64 (+23.08%)
Mutual labels:  spider
go-movies
Golang spider/crawler for movies
Stars: ✭ 168 (+223.08%)
Mutual labels:  spider
learning spider
This is really a set of study notes, including learning records, a crawler practice platform (website), and homemade tool scripts
Stars: ✭ 54 (+3.85%)
Mutual labels:  spider
Scrapy-Spiders
A Scrapy-based code repository of data collection crawlers
Stars: ✭ 34 (-34.62%)
Mutual labels:  spider
Sina Spider
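Sina crawler based on Python and Selenium. It saves cookies after a simulated login to persist the login state, and can scrape trending Weibo posts related to an entered keyword.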
Stars: ✭ 25 (-51.92%)
Mutual labels:  spider
QQSpider
Scrapes QQ user information (basic details such as QQ number, nickname, birthday, and address) and performs a brief analysis.
Stars: ✭ 21 (-59.62%)
Mutual labels:  spider
Spider
The Spider project will be continuously updated with the crawling techniques I have learned and used!!!
Stars: ✭ 16 (-69.23%)
Mutual labels:  spider
auto-click-auto-fill
Auto Click Auto Fill on any web page
Stars: ✭ 111 (+113.46%)
Mutual labels:  xpath
DAM
Syllabus and exercises for Multiplatform Application Development (Desarrollo de Aplicaciones Multiplataforma, DAM)
Stars: ✭ 96 (+84.62%)
Mutual labels:  xpath
scrapy-admin
A django admin site for scrapy
Stars: ✭ 44 (-15.38%)
Mutual labels:  spider
FofaMap
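FofaMap is a cross-platform FOFA data collector developed in Python 3. It supports favicon queries, batch queries, and custom queries of FOFA data, and can automatically deduplicate the results and generate a corresponding Excel spreadsheet. The Spring Festival special edition can also call Nuclei to run vulnerability scans against targets, giving you a head start in bug hunting.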
Stars: ✭ 118 (+126.92%)
Mutual labels:  spider