Colly: Elegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+7745.96%)
Gopa: [WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Stars: ✭ 277 (+39.9%)
Scrapy: Scrapy, a fast high-level web crawling & scraping framework for Python.
Stars: ✭ 42,343 (+21285.35%)
Sasila: A flexible, friendly crawler framework
Stars: ✭ 286 (+44.44%)
bots-zoo: No description or website provided.
Stars: ✭ 59 (-70.2%)
Ferret: Declarative web scraping
Stars: ✭ 4,837 (+2342.93%)
flink-crawler: Continuous scalable web crawler built on top of Flink and crawler-commons
Stars: ✭ 48 (-75.76%)
Crawly: Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+122.22%)
Spidy: The simple, easy to use command line web crawler.
Stars: ✭ 257 (+29.8%)
Dotnetcrawler: DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output based on dotnet core. This library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible for your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-49.49%)
Lulu: [Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+298.48%)
Webmagic: A scalable web crawler framework for Java.
Stars: ✭ 10,186 (+5044.44%)
Linkedin Profile Scraper: 🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (-13.64%)
CrawlBox: An easy way to brute-force web directories.
Stars: ✭ 118 (-40.4%)
papercut: Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-92.42%)
ARGUS: ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9
Stars: ✭ 68 (-65.66%)
Apify Js: Apify SDK, the scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+1492.93%)
Supercrawler: A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
Stars: ✭ 306 (+54.55%)
Autoscraper: A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+1959.09%)
Scrapple: A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+134.34%)
Spider Flow: A new-generation crawler platform that defines crawl flows graphically, so crawlers can be built without writing code.
Stars: ✭ 365 (+84.34%)
Awesome Crawler: A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+2320.71%)
Spidr: A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+231.31%)
Webster: A reliable high-level web crawling & scraping framework for Node.js.
Stars: ✭ 364 (+83.84%)
Creeper: 🐾 Creeper - The Next Generation Crawler Framework (Go)
Stars: ✭ 762 (+284.85%)
Maman: Rust Web Crawler saving pages on Redis
Stars: ✭ 39 (-80.3%)
img-cli: An interactive command-line interface built in NodeJS for downloading single or multiple images to disk from a URL
Stars: ✭ 15 (-92.42%)
Nutch: Apache Nutch is an extensible and scalable web crawler
Stars: ✭ 2,277 (+1050%)
Spidermon: Scrapy extension for monitoring spider execution.
Stars: ✭ 309 (+56.06%)
pomp: Screen scraping and web crawling framework
Stars: ✭ 61 (-69.19%)
Dataflowkit: Extract structured data from web sites. Web site scraping.
Stars: ✭ 456 (+130.3%)
Geziyor: Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.
Stars: ✭ 1,246 (+529.29%)
Pastepwn: Python framework to scrape Pastebin pastes and analyze them
Stars: ✭ 87 (-56.06%)
D4n155: OWASP D4N155 - Intelligent and dynamic wordlist using OSINT
Stars: ✭ 105 (-46.97%)
Scrapyrt: HTTP API for Scrapy spiders
Stars: ✭ 637 (+221.72%)
Newcrawler: Free Web Scraping Tool with Java
Stars: ✭ 589 (+197.47%)
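Awesome Python Primer: An index of quality Chinese-language resources for self-taught Python beginners, including books, documentation, and videos, covering crawling, Web, data analysis, and machine learning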
Stars: ✭ 57 (-71.21%)
Infinitycrawler: A simple but powerful web crawler library for .NET
Stars: ✭ 97 (-51.01%)
Arachnid: Powerful web scraping framework for Crystal
Stars: ✭ 68 (-65.66%)
Abotx: Cross Platform C# Web crawler framework, headless browser, parallel crawler. Please star this project! +1.
Stars: ✭ 63 (-68.18%)
Web Bee: 🐝 Web vertical crawler framework for fun
Stars: ✭ 184 (-7.07%)
Grawler: Grawler is a tool written in PHP which comes with a web interface that automates the task of using google dorks, scrapes the results, and stores them in a file.
Stars: ✭ 98 (-50.51%)
Squidwarc: Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head
Stars: ✭ 125 (-36.87%)
Newspaper: News, full-text, and article metadata extraction in Python 3. Advanced docs:
Stars: ✭ 11,545 (+5730.81%)
Crawler: Go process used to crawl websites
Stars: ✭ 147 (-25.76%)
Crawlab Lite: Lite version of Crawlab, a crawler management platform.
Stars: ✭ 122 (-38.38%)
Pspider: A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560
Stars: ✭ 1,611 (+713.64%)
Instagram Bot: An Instagram bot developed using the Selenium Framework
Stars: ✭ 138 (-30.3%)
Awesome Puppeteer: A curated list of awesome puppeteer resources.
Stars: ✭ 1,728 (+772.73%)
Mimo-Crawler: A web crawler that uses Firefox and JS injection to interact with web pages and crawl their content, written in Node.js.
Stars: ✭ 22 (-88.89%)
proxi: Proxy pool. Finds and checks proxies, with a REST API for querying results. Can find over 25k proxies in under 5 minutes.
Stars: ✭ 32 (-83.84%)
Crawlab: Distributed web crawler admin platform for spider management, regardless of language or framework.
Stars: ✭ 8,392 (+4138.38%)
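Skycaiji: Skycaiji (蓝天采集器) is a free crawler for collecting and publishing data, developed with PHP and MySQL. It can be deployed on cloud servers, can collect almost any type of web page, integrates seamlessly with various CMS site-building programs, publishes data in real time without login, and runs fully automatically without manual intervention. It is a fully cross-platform, cloud-based crawler system for large-scale web data collection.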
Stars: ✭ 1,514 (+664.65%)
Abot: Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.
Stars: ✭ 1,961 (+890.4%)