1912 open source projects that are alternatives to or similar to Gopa

Spidr

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Stars: ✭ 656 (+136.82%)

Mutual labels: crawler, spider, web-scraping, web-crawler

Arachnid

Powerful web scraping framework for Crystal

Stars: ✭ 68 (-75.45%)

Mutual labels: crawler, spider, web-scraping, crawling

Linkedin Profile Scraper

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.

Stars: ✭ 171 (-38.27%)

Mutual labels: crawler, spider, scraping, crawling

Colly

Elegant Scraper and Crawler Framework for Golang

Stars: ✭ 15,535 (+5508.3%)

Mutual labels: crawler, spider, scraping, crawling

flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons

Stars: ✭ 48 (-82.67%)

Mutual labels: crawler, spider, web-crawler, crawling

Crawly

Crawly, a high-level web crawling & scraping framework for Elixir.

Stars: ✭ 440 (+58.84%)

Mutual labels: crawler, spider, scraping, crawling

Antch

Antch, a fast, powerful and extensible web crawling & scraping framework for Go

Stars: ✭ 198 (-28.52%)

Mutual labels: crawler, scraping, crawling, web-crawler

Sasila

A flexible, friendly crawler framework

Stars: ✭ 286 (+3.25%)

Mutual labels: crawler, scraping, crawling

Easy Scraping Tutorial

Simple but useful Python web scraping tutorial code.

Stars: ✭ 583 (+110.47%)

Mutual labels: crawler, scraping, crawling

Headless Chrome Crawler

Distributed crawler powered by Headless Chrome

Stars: ✭ 5,129 (+1751.62%)

Mutual labels: crawler, scraping, crawling

Pspider

A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560

Stars: ✭ 1,611 (+481.59%)

Mutual labels: crawler, spider, web-crawler

Awesome Python Primer

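An index of high-quality Chinese-language resources for self-taught Python beginners, including books / documentation / videos, covering crawling / web / data analysis / machine learning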

Stars: ✭ 57 (-79.42%)

Mutual labels: crawler, spider, scraping

Autoscraper

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Stars: ✭ 4,077 (+1371.84%)

Mutual labels: crawler, scraping, web-scraping

Apify Js

Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.

Stars: ✭ 3,154 (+1038.63%)

Mutual labels: scraping, web-scraping, crawling

Skycaiji

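Skycaiji (蓝天采集器) is a free crawler application for collecting and publishing data, developed with PHP + MySQL. It can be deployed on cloud servers, collects almost any type of web page, integrates seamlessly with all kinds of CMS site builders, publishes data in real time without login, and runs fully automatically with no manual intervention. Among web big-data collection tools, it is a fully cross-platform cloud crawler system.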

Stars: ✭ 1,514 (+446.57%)

Mutual labels: crawler, spider, crawling

Spider Flow

A new-generation crawler platform that defines crawl flows graphically, so crawlers can be built without writing code.

Stars: ✭ 365 (+31.77%)

Mutual labels: crawler, spider, web-crawler

Scrapple

A framework for creating semi-automatic web content extractors

Stars: ✭ 464 (+67.51%)

Mutual labels: crawler, scraping, web-scraping

papercut

Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.

Stars: ✭ 15 (-94.58%)

Mutual labels: crawler, scraping, web-scraping

Webster

A reliable high-level web crawling & scraping framework for Node.js.

Stars: ✭ 364 (+31.41%)

Mutual labels: crawler, spider, crawling

Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

Stars: ✭ 789 (+184.84%)

Mutual labels: crawler, scraping, crawling

Newcrawler

Free Web Scraping Tool with Java

Stars: ✭ 589 (+112.64%)

Mutual labels: crawler, spider, scraping

Maman

Rust Web Crawler saving pages on Redis

Stars: ✭ 39 (-85.92%)

Mutual labels: crawler, spider, web-crawler

Geziyor

Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.

Stars: ✭ 1,246 (+349.82%)

Mutual labels: crawler, spider, scraping

Crawlab

Distributed web crawler admin platform for managing spiders regardless of language or framework.

Stars: ✭ 8,392 (+2929.6%)

Mutual labels: crawler, spider, web-crawler

Gopa Abandoned

GOPA, a spider written in Go. (NOTE: this project moved to https://github.com/infinitbyte/gopa)

Stars: ✭ 98 (-64.62%)

Mutual labels: crawler, spider, lightweight

wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Stars: ✭ 52 (-81.23%)

Mutual labels: spider, scraping, crawling

Crawlab Lite

Lite version of Crawlab, the crawler management platform.

Stars: ✭ 122 (-55.96%)

Mutual labels: crawler, spider, web-crawler

Scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

Stars: ✭ 42,343 (+15186.28%)

Mutual labels: crawler, scraping, crawling

Abot

Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.

Stars: ✭ 1,961 (+607.94%)

Mutual labels: crawler, spider, web-crawler

Awesome Crawler

A collection of awesome web crawlers and spiders in different languages

Stars: ✭ 4,793 (+1630.32%)

Mutual labels: crawler, spider, web-crawler

Spidy

The simple, easy to use command line web crawler.

Stars: ✭ 257 (-7.22%)

Mutual labels: crawler, crawling, web-crawler

scrapy-distributed

A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.

Stars: ✭ 38 (-86.28%)

Mutual labels: spider, scraping, crawling

Ferret

Declarative web scraping

Stars: ✭ 4,837 (+1646.21%)

Mutual labels: crawler, scraping, crawling

Dotnetcrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library for Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible for your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

Stars: ✭ 100 (-63.9%)

Mutual labels: crawler, scraping, crawling

Zhihu Crawler People

A simple distributed crawler for Zhihu, plus data analysis

Stars: ✭ 182 (-34.3%)

Mutual labels: crawler, spider, web-crawler

bots-zoo

No description or website provided.

Stars: ✭ 59 (-78.7%)

Mutual labels: crawler, scraping, crawling

Arachnid

Crawl all unique internal links found on a given website, and extract SEO related information - supports javascript based sites

Stars: ✭ 224 (-19.13%)

Mutual labels: crawler, scraping

Chromium for spider

dynamic crawler for web vulnerability scanner

Stars: ✭ 220 (-20.58%)

Mutual labels: crawler, spider

Laravel Crawler Detect

A Laravel wrapper for CrawlerDetect - the web crawler detection library

Stars: ✭ 227 (-18.05%)

Mutual labels: crawler, spider

Strong Web Crawler

An advanced web crawler based on C#.NET + PhantomJS + Selenium. It can execute JavaScript code, trigger various events, and manipulate the page's DOM structure.

Stars: ✭ 238 (-14.08%)

Mutual labels: crawler, web-crawler

Jd mask robot

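A crawler that monitors face-mask stock on JD.com (not Selenium-based). Supports QR-code login, price lookup, add to cart, order placement, and flash-sale purchasing.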

Stars: ✭ 216 (-22.02%)

Mutual labels: crawler, spider

Ppspider

A web spider framework built on Puppeteer. It provides flexible task-queue management and task scheduling via decorators, convenient data storage (NeDB / MongoDB), and support for data visualization and user interaction.

Stars: ✭ 237 (-14.44%)

Mutual labels: crawler, spider

Fast Lianjia Crawler

A blazing-fast crawler that fetches data directly through the Lianjia API. The fastest in the universe~~ 🚀

Stars: ✭ 247 (-10.83%)

Mutual labels: crawler, spider

BaiduSpider

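This project has moved to https://github.com/BaiduSpider/BaiduSpider. A crawler for Baidu search results. It currently supports Baidu web search, image search, Zhidao (Q&A) search, video search, news search, Wenku (document library) search, Jingyan (how-to) search, and Baike (encyclopedia) search.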

Stars: ✭ 29 (-89.53%)

Mutual labels: spider, crawling

scrape-github-trending

Tutorial for web scraping / crawling with Node.js.