Spider Flow: A next-generation crawler platform that defines crawl flows graphically, so a crawler can be built without writing code.
Stars: ✭ 365 (+16.99%)
Skrape.it: A Kotlin-based testing/scraping/parsing library for analyzing and extracting data from HTML (server- and client-side rendered). It places particular emphasis on ease of use and readability by providing an intuitive DSL. It aims to be a testing library, but can also be used to scrape websites conveniently.
Stars: ✭ 231 (-25.96%)
Gecco: Easy-to-use lightweight web crawler
Stars: ✭ 2,310 (+640.38%)
Appcrawler: An automatic app traversal tool based on Appium
Stars: ✭ 925 (+196.47%)
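Docs: Source code for 《数据采集从入门到放弃》 ("Data Collection: From Getting Started to Giving Up"). Contents: introduction to crawlers, the job market, and interview questions for crawler engineers; introduction to the HTTP protocol; using Requests; introduction to the XPath parser; MongoDB and MySQL; multithreaded crawlers; introduction to Scrapy; introduction to Scrapy-redis; deploying with Docker; managing a Docker cluster with Nomad; querying Docker logs with EFK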
Stars: ✭ 118 (-62.18%)
Graphquery: GraphQuery is a query language and execution engine tied to any backend service.
Stars: ✭ 112 (-64.1%)
Jsoup: jsoup, the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.
Stars: ✭ 9,184 (+2843.59%)
crawler: A simple and flexible web crawler framework for Java.
Stars: ✭ 20 (-93.59%)
rankr: 🇰🇷 Realtime integrated information analysis service
Stars: ✭ 21 (-93.27%)
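Weixin Spider: An efficient WeChat official account crawler that collects account article history, article comments, and read and "在看" ("Wow") counts, with data updates and a web page for visualization. Deployable on Windows servers. Built on Python 3 with flask/mysql/redis/mitmproxy/pywin32.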
Stars: ✭ 287 (-8.01%)
indieweb-search: Source code for the IndieWeb search engine.
Stars: ✭ 16 (-94.87%)
Arachni: Web Application Security Scanner Framework
Stars: ✭ 2,942 (+842.95%)
bots-zoo: No description or website provided.
Stars: ✭ 59 (-81.09%)
Bt Btt: Introduction to the magnet-link site U3C3 and updates on its domain names
Stars: ✭ 261 (-16.35%)
dijnet-bot: All your bills in yet another place :)
Stars: ✭ 17 (-94.55%)
slime: 🍰 A visual crawler platform
Stars: ✭ 27 (-91.35%)
Hquery.php: An extremely fast web scraper that parses megabytes of invalid HTML in the blink of an eye. PHP5.3+, no dependencies.
Stars: ✭ 295 (-5.45%)
Gospider: A crawler framework implemented in Go. Users only need to define page rules, and a web management interface is provided. Built on colly.
Stars: ✭ 285 (-8.65%)
Tumblr crawler: A multi-threaded crawler for Tumblr.
Stars: ✭ 258 (-17.31%)
CrawlBox: An easy way to brute-force web directories.
Stars: ✭ 118 (-62.18%)
tg crawler: Just a crawler based on tg-cli for Telegram. Now deprecated; please use telegram-export.
Stars: ✭ 71 (-77.24%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (-11.22%)
scraper: Scraper example built on Scala, Akka and Jsoup
Stars: ✭ 15 (-95.19%)
spparser: An async ETL tool written in Python.
Stars: ✭ 34 (-89.1%)
Rcrawler: An R web crawler and scraper
Stars: ✭ 274 (-12.18%)
html-query: A fluent and functional approach to querying HTML
Stars: ✭ 48 (-84.62%)
Go Dork: The fastest dork scanner written in Go.
Stars: ✭ 274 (-12.18%)
snapcrawl: Crawl a website and take screenshots
Stars: ✭ 37 (-88.14%)
TumblTwo: TumblTwo, an improved fork of TumblOne, a Tumblr downloader.
Stars: ✭ 57 (-81.73%)
Sasila: A flexible, friendly crawler framework
Stars: ✭ 286 (-8.33%)
WebCrawler: A lightweight, fast, multithreaded, multi-pipeline, flexibly configurable web crawler.
Stars: ✭ 39 (-87.5%)
Weibo terminator workflow: Updated version of weibo_terminator. This is the workflow version, aimed at getting the job done!
Stars: ✭ 259 (-16.99%)
videodl: Videodl, a lightweight video downloader written in pure Python.
Stars: ✭ 320 (+2.56%)
Supercrawler: A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
Stars: ✭ 306 (-1.92%)
2017 PyConTW Talk: tw.pycon.org/2017/events/talk/314386410792550475/
Stars: ✭ 18 (-94.23%)
Spidy: The simple, easy-to-use command-line web crawler.
Stars: ✭ 257 (-17.63%)
Crawlertutorial: A minimal crawler tutorial (fetch, parse, search, multiprocessing, API), using PTT as the example
Stars: ✭ 282 (-9.62%)
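WeiboCrawler: A cookie-free Weibo crawler that can continuously crawl the profile information, posts, and post comments and reposts of one or more Sina Weibo users.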
Stars: ✭ 45 (-85.58%)
lightnovel epub: 🍭 epub generator for (light) novels. Supported sites: 轻之国度, 轻小说文库
Stars: ✭ 89 (-71.47%)
BilibiliCrawler: 🌀 Crawl bilibili user info and video info for data analysis | BiliBili crawler
Stars: ✭ 25 (-91.99%)
galer: A fast tool to fetch URLs from HTML attributes by crawl-in.
Stars: ✭ 138 (-55.77%)
spiderable-middleware: 🤖 Prerendering for JavaScript-powered websites. A great solution for PWAs (Progressive Web Apps), SPAs (Single Page Applications), and other websites built on top of front-end JavaScript frameworks
Stars: ✭ 29 (-90.71%)
Exist: eXist Native XML Database and Application Platform
Stars: ✭ 294 (-5.77%)
octopus: Recursive and multi-threaded broken link checker
Stars: ✭ 19 (-93.91%)
domfind: A Python DNS crawler to find identical domain names under different TLDs.
Stars: ✭ 22 (-92.95%)
PY-Login: Simulates logging in to various websites and uses their APIs to do all sorts of unmentionable things
Stars: ✭ 26 (-91.67%)
medium-stat-box: Practical pinned gist which shows your latest Medium status 📌
Stars: ✭ 29 (-90.71%)
php-google: Google search results crawler that gets the Google search results you need (PHP)
Stars: ✭ 23 (-92.63%)