Gospider: A crawler framework implemented in Go; users only need to care about page rules. Provides a web management interface. Built on colly.
Stars: ✭ 285 (+1400%)
Spiderman: A general-purpose distributed crawler framework based on scrapy-redis
Stars: ✭ 392 (+1963.16%)
memes-api: API for scraping common meme sites
Stars: ✭ 17 (-10.53%)
awesome-interface: AngularJS SPA interface for awesome lists. Awesome lists parsed using Python.
Stars: ✭ 25 (+31.58%)
Crawlertutorial: A minimal crawler tutorial (fetch, parse, search, multiprocessing, API), using PTT as the example
Stars: ✭ 282 (+1384.21%)
nivinEdu: 拟物校园, an open-source mobile solution for university academic administration.
Stars: ✭ 24 (+26.32%)
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+25357.89%)
Hacker News Digest: 📰 A responsive interface for Hacker News with summaries and thumbnails.
Stars: ✭ 278 (+1363.16%)
Find Cheapest Flights: Use Google Flights API and scrape Expedia to find the cheapest/shortest flights!
Stars: ✭ 18 (-5.26%)
Netdiscovery: NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.
Stars: ✭ 573 (+2915.79%)
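Templatespider: A website-ripping tool. Pick a site you like, specify the URL, and it is automatically downloaded and turned into a template. Any website you see can be put to your own use!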
Stars: ✭ 390 (+1952.63%)
munich-scripts: Some useful scripts simplifying bureaucracy
Stars: ✭ 105 (+452.63%)
Captcha-Tools: All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha, and Anticaptcha APIs!
Stars: ✭ 23 (+21.05%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+1357.89%)
Scrapple: A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+2342.11%)
anikimiapi: A simple, lightweight, statically typed Python 3 API wrapper for GogoAnime.
Stars: ✭ 15 (-21.05%)
Files: Docs and files for ScrapydWeb, Scrapyd, Scrapy, and other projects
Stars: ✭ 390 (+1952.63%)
SocialInfo4J: Fetch data from Facebook, Instagram, and LinkedIn
Stars: ✭ 44 (+131.58%)
jobSpider: jobSpider is a Scrapy crawler for scraping job postings
Stars: ✭ 28 (+47.37%)
Utlyz-CLI: Lets you access your FB account from the command line and returns various things, such as the number of unread notifications, messages, or friend requests you have.
Stars: ✭ 30 (+57.89%)
Onlyfans: Scrape all the media from an OnlyFans account - Updated regularly
Stars: ✭ 731 (+3747.37%)
Spider163: Scrapes popular comments from NetEase Cloud Music
Stars: ✭ 569 (+2894.74%)
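Qzoneexporter: A Qzone (QQ空间) crawler that can export and display blog posts, albums, message boards, status updates, photos, videos, and other data.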
Stars: ✭ 386 (+1931.58%)
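Qzoneexport: A Qzone (QQ空间) export assistant that backs up status updates, blog posts, private diaries, albums, videos, message boards, QQ friends, favorites, shares, and recent visitors to files, for easy migration and preservation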
Stars: ✭ 456 (+2300%)
proxi: Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (+68.42%)
Dumpall: An information-leak exploitation tool for .git/.svn source code leaks and .DS_Store leaks
Stars: ✭ 250 (+1215.79%)
dolarPy: Checks the USD/PYG exchange rate from several sites, with a calculator, RESTful API, and a Twitter bot
Stars: ✭ 45 (+136.84%)
scraping-ebay: Scraping eBay's products using the Scrapy web crawling framework
Stars: ✭ 79 (+315.79%)
Dataflowkit: Extract structured data from web sites. Web sites scraping.
Stars: ✭ 456 (+2300%)
Android-Web-Scraper: Android Web Scraper is a simple library for Android web automation. You can perform web tasks in the background to fetch website data programmatically.
Stars: ✭ 38 (+100%)
lightnovel epub: 🍭 EPUB generator for (light) novels. Supported sites: 轻之国度, 轻小说文库
Stars: ✭ 89 (+368.42%)
bgmtools: Small tools for Bangumi
Stars: ✭ 66 (+247.37%)
Instagram4j: 📷 Instagram private API in Java
Stars: ✭ 629 (+3210.53%)
zimit: Make a ZIM file from any Web site and surf offline!
Stars: ✭ 67 (+252.63%)
TrollHunter: Twitter Troll & Fake News Hunter - Crawls news websites and Twitter to identify fake news
Stars: ✭ 38 (+100%)
Scrapedin: LinkedIn Scraper (currently working 2020)
Stars: ✭ 453 (+2284.21%)
galer: A fast tool to fetch URLs from HTML attributes by crawl-in.
Stars: ✭ 138 (+626.32%)
Zhihu Crawler: zhihu-crawler is a high-performance, horizontally scalable, distributed crawler project written in Java, with support for free HTTP proxy pools
Stars: ✭ 890 (+4584.21%)
Xxl Crawler: A distributed web crawler framework (XXL-CRAWLER).
Stars: ✭ 561 (+2852.63%)
ttc subway times: A scraper to grab and publish TTC subway arrival times.
Stars: ✭ 40 (+110.53%)
html2rss-web: 🕸 Generates and delivers RSS feeds via HTTP. Create your own feeds or get started quickly with the included configs.
Stars: ✭ 36 (+89.47%)
document-dl: Command line program to download documents from web portals.
Stars: ✭ 14 (-26.32%)
Bookcorpus: Crawl BookCorpus
Stars: ✭ 443 (+2231.58%)
chesf: CHeSF is the Chrome Headless Scraping Framework, very alpha code for scraping JavaScript-intensive web pages
Stars: ✭ 18 (-5.26%)
kaa.si-cli: Stream anime from kaa.si and sync with anilist
Stars: ✭ 12 (-36.84%)
RPICovidScraper: Scraper for Rensselaer Polytechnic Institute (RPI)'s Covid Dashboard
Stars: ✭ 12 (-36.84%)
Bilili: 🍻 bilibili video (including bangumi) and danmaku downloader
Stars: ✭ 379 (+1894.74%)
Flight Prices Scraper: Automated script to scrape flight prices from any website into CSV format
Stars: ✭ 17 (-10.53%)
Duckduckgo: An unofficial DuckDuckGo search API.
Stars: ✭ 6 (-68.42%)
Wechatsogou: A crawler interface for WeChat official accounts, based on Sogou WeChat search
Stars: ✭ 5,220 (+27373.68%)
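Signature algorithm: Request signing or encryption algorithms for various apps, mini programs, and websites. Currently covers: 自如 (Ziroom), 小红书 (Xiaohongshu), 蛋壳公寓 (Danke Apartment), luckin coffee, and bangkokair (Bangkok Airways)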
Stars: ✭ 380 (+1900%)
Zeiver: A scraper, downloader, & recorder for static open directories.
Stars: ✭ 14 (-26.32%)