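Skycaiji: Skycaiji (Blue Sky Collector) is free data collection and publishing crawler software built with PHP and MySQL. It can be deployed on cloud servers, can collect almost any type of web page, integrates seamlessly with all kinds of CMS site builders, publishes data in real time without logging in, and runs fully automatically with no manual intervention. It is a fully cross-platform cloud crawler system among big-data web collection tools.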
Stars: ✭ 1,514 (+512.96%)
arachnod: High performance crawler for Node.js
Stars: ✭ 17 (-93.12%)
flink-crawler: Continuous scalable web crawler built on top of Flink and crawler-commons
Stars: ✭ 48 (-80.57%)
Hacker News Digest: 📰 A responsive interface for Hacker News with summaries and thumbnails.
Stars: ✭ 278 (+12.55%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+12.15%)
Spoon: 🥄 A package for building a dedicated proxy pool for each site.
Stars: ✭ 173 (-29.96%)
Toapi: Every web site provides APIs.
Stars: ✭ 3,209 (+1199.19%)
Linkedin Profile Scraper: 🕵️♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (-30.77%)
Crawler Detect: 🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent
Stars: ✭ 1,549 (+527.13%)
Webster: A reliable high-level web crawling & scraping framework for Node.js.
Stars: ✭ 364 (+47.37%)
Fictiondown: Novel download | novel scraping | Qidian | Biquge | export to Markdown | export to txt | convert to EPUB | ad filtering | automatic proofreading
Stars: ✭ 362 (+46.56%)
Gain: Web crawling framework based on asyncio.
Stars: ✭ 2,002 (+710.53%)
Freshonions Torscraper: Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion
Stars: ✭ 348 (+40.89%)
Bilili: 🍻 bilibili video (including bangumi) and danmaku downloader
Stars: ✭ 379 (+53.44%)
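Signature algorithm: Request signing or encryption algorithms for various apps, mini programs, and websites. Currently covers: Ziroom, Xiaohongshu, Danke Apartment, luckin coffee, and bangkokair (Bangkok Airways).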
Stars: ✭ 380 (+53.85%)
Xcrawler: A fast, concise, and powerful PHP crawler framework
Stars: ✭ 344 (+39.27%)
Go jobs: A look at the Golang job market
Stars: ✭ 526 (+112.96%)
Haipproxy: 💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
Stars: ✭ 4,993 (+1921.46%)
Fbcrawl: A Facebook crawler
Stars: ✭ 536 (+117%)
Awesome Crawler: A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+1840.49%)
Netdiscovery: NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.
Stars: ✭ 573 (+131.98%)
Newcrawler: Free Web Scraping Tool with Java
Stars: ✭ 589 (+138.46%)
Abot: Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.
Stars: ✭ 1,961 (+693.93%)
Creeper: 🐾 Creeper - The Next Generation Crawler Framework (Go)
Stars: ✭ 762 (+208.5%)
Gospider: Gospider - Fast web spider written in Go
Stars: ✭ 785 (+217.81%)
Grab Site: The archivist's web crawler, with WARC output, a dashboard for all crawls, and dynamic ignore patterns
Stars: ✭ 680 (+175.3%)
Scrapit: Scraping scripts for various websites.
Stars: ✭ 25 (-89.88%)
Zhihu Crawler: zhihu-crawler is a high-performance, horizontally scalable, distributed crawler project written in Java, with support for a free HTTP proxy pool
Stars: ✭ 890 (+260.32%)
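Python3 Spider: Python crawlers in practice: simulated login for major websites, including but not limited to slider CAPTCHA verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️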
Stars: ✭ 2,129 (+761.94%)
Spidr: A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+165.59%)
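Awesome Python Primer: An index of high-quality Chinese-language resources for self-taught Python beginners, including books, documentation, and videos, suited to the crawling, web, data analysis, and machine learning tracks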
Stars: ✭ 57 (-76.92%)
Photon: Incredibly fast crawler designed for OSINT.
Stars: ✭ 8,332 (+3273.28%)
Beanbun: Beanbun is a multi-process web crawler framework written in PHP, open and highly extensible, based on Workerman.
Stars: ✭ 1,096 (+343.72%)
Avbook: AV movie management system; avmoo, javbus, and javlibrary crawler; online AV video library; AV magnet link database; Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database
Stars: ✭ 8,133 (+3192.71%)
Laravel Crawler Detect: A Laravel wrapper for CrawlerDetect - the web crawler detection library
Stars: ✭ 227 (-8.1%)
Icrawler: A multi-threaded crawler framework with many built-in image crawlers provided.
Stars: ✭ 629 (+154.66%)
Douyinsdk: Douyin SDK for data collection; crawling is no longer just a dream
Stars: ✭ 99 (-59.92%)
Digger: Digger is a powerful and flexible web crawler implemented in pure Golang
Stars: ✭ 130 (-47.37%)
Ppspider: Web spider framework built on puppeteer. It supports flexible task queues and task scheduling via decorators, convenient data storage (nedb / mongodb), and data visualization and user interaction.
Stars: ✭ 237 (-4.05%)
Algoliasearch Netlify: Official Algolia Plugin for Netlify. Index your website to Algolia when deploying your project to Netlify with the Algolia Crawler
Stars: ✭ 208 (-15.79%)
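Crawler illegal cases in china: Collection of legal cases in China involving web crawlers. This project compiles news, materials, and laws and regulations related to lawsuits and violations involving crawler developers in mainland China. It aims to help crawler practitioners working in mainland China understand the relevant laws and avoid crossing data-compliance red lines. [AD] Chinese knowledge graph portal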
Stars: ✭ 2,448 (+891.09%)
Grab: Web Scraping Framework
Stars: ✭ 2,147 (+769.23%)
Proxybroker: Proxy [Finder | Checker | Server]. HTTP(S) & SOCKS 🎭
Stars: ✭ 2,767 (+1020.24%)
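Fiction house: Fiction House is a multi-platform (web, Android app, WeChat mini program), full-featured, screen-adaptive serialization system for novels and comics, with sections for premium novels, light novels, and comics. Features include novel/comic categories, search, rankings, completed works, ratings, online reading, bookshelves, reading history, novel downloads, novel bullet comments (danmaku), automatic collection/updating/error correction of novels and comics, automatic sharing of novel content to Weibo, automatic email promotion, and automatic link submission to the Baidu search engine.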
Stars: ✭ 2,710 (+997.17%)
Nosmoke: A cross-platform UI crawler that scans view trees, then generates and executes UI test cases.
Stars: ✭ 178 (-27.94%)
Instagram Crawler: Crawl Instagram photos, posts and videos for download.
Stars: ✭ 178 (-27.94%)
Media Scraper: Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok
Stars: ✭ 206 (-16.6%)
N2h4: A tool for collecting Naver news
Stars: ✭ 177 (-28.34%)