1912 open-source projects that are alternatives to or similar to Gopa

Spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+136.82%)
Mutual labels:  crawler, spider, web-scraping, web-crawler
Arachnid
Powerful web scraping framework for Crystal
Stars: ✭ 68 (-75.45%)
Mutual labels:  crawler, spider, web-scraping, crawling
Linkedin Profile Scraper
🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON. Works in 2020.
Stars: ✭ 171 (-38.27%)
Mutual labels:  crawler, spider, scraping, crawling
Colly
Elegant Scraper and Crawler Framework for Golang
Stars: ✭ 15,535 (+5508.3%)
Mutual labels:  crawler, spider, scraping, crawling
flink-crawler
Continuous scalable web crawler built on top of Flink and crawler-commons
Stars: ✭ 48 (-82.67%)
Mutual labels:  crawler, spider, web-crawler, crawling
Crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+58.84%)
Mutual labels:  crawler, spider, scraping, crawling
Antch
Antch, a fast, powerful and extensible web crawling & scraping framework for Go
Stars: ✭ 198 (-28.52%)
Mutual labels:  crawler, scraping, crawling, web-crawler
Sasila
A flexible, friendly crawler framework
Stars: ✭ 286 (+3.25%)
Mutual labels:  crawler, scraping, crawling
Easy Scraping Tutorial
Simple but useful Python web scraping tutorial code.
Stars: ✭ 583 (+110.47%)
Mutual labels:  crawler, scraping, crawling
Headless Chrome Crawler
Distributed crawler powered by Headless Chrome
Stars: ✭ 5,129 (+1751.62%)
Mutual labels:  crawler, scraping, crawling
Pspider
A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560
Stars: ✭ 1,611 (+481.59%)
Mutual labels:  crawler, spider, web-crawler
Awesome Python Primer
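An index of high-quality Chinese-language resources for self-taught Python beginners, covering books / docs / videos, suited to crawling / Web / data analysis / machine learning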
Stars: ✭ 57 (-79.42%)
Mutual labels:  crawler, spider, scraping
Autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Stars: ✭ 4,077 (+1371.84%)
Mutual labels:  crawler, scraping, web-scraping
Apify Js
Apify SDK — The scalable web scraping and crawling library for JavaScript/Node.js. Enables development of data extraction and web automation jobs (not only) with headless Chrome and Puppeteer.
Stars: ✭ 3,154 (+1038.63%)
Mutual labels:  scraping, web-scraping, crawling
Skycaiji
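Skycaiji is free crawler software for collecting and publishing data, developed with PHP + MySQL and deployable on cloud servers. It can collect almost any type of web page, integrates seamlessly with various CMS site builders, publishes data in real time without login, and runs fully automatically with no manual intervention. It is a fully cross-platform, cloud-based crawler system for web big-data collection.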
Stars: ✭ 1,514 (+446.57%)
Mutual labels:  crawler, spider, crawling
Spider Flow
A next-generation crawler platform that defines crawl flows graphically, so crawlers can be built without writing code.
Stars: ✭ 365 (+31.77%)
Mutual labels:  crawler, spider, web-crawler
Scrapple
A framework for creating semi-automatic web content extractors
Stars: ✭ 464 (+67.51%)
Mutual labels:  crawler, scraping, web-scraping
papercut
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Stars: ✭ 15 (-94.58%)
Mutual labels:  crawler, scraping, web-scraping
Webster
A reliable high-level web crawling & scraping framework for Node.js.
Stars: ✭ 364 (+31.41%)
Mutual labels:  crawler, spider, crawling
Lulu
[Unmaintained] A simple and clean video/music/image downloader 👾
Stars: ✭ 789 (+184.84%)
Mutual labels:  crawler, scraping, crawling
Newcrawler
Free Web Scraping Tool with Java
Stars: ✭ 589 (+112.64%)
Mutual labels:  crawler, spider, scraping
Maman
Rust Web Crawler saving pages on Redis
Stars: ✭ 39 (-85.92%)
Mutual labels:  crawler, spider, web-crawler
Geziyor
Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.
Stars: ✭ 1,246 (+349.82%)
Mutual labels:  crawler, spider, scraping
Crawlab
Distributed web crawler admin platform for spider management, regardless of language or framework.
Stars: ✭ 8,392 (+2929.6%)
Mutual labels:  crawler, spider, web-crawler
Gopa Abandoned
GOPA, a spider written in Go. (NOTE: this project moved to https://github.com/infinitbyte/gopa)
Stars: ✭ 98 (-64.62%)
Mutual labels:  crawler, spider, lightweight
wget-lua
Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.
Stars: ✭ 52 (-81.23%)
Mutual labels:  spider, scraping, crawling
Crawlab Lite
Lite version of Crawlab, the crawler management platform.
Stars: ✭ 122 (-55.96%)
Mutual labels:  crawler, spider, web-crawler
Scrapy
Scrapy, a fast high-level web crawling & scraping framework for Python.
Stars: ✭ 42,343 (+15186.28%)
Mutual labels:  crawler, scraping, crawling
Abot
Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.
Stars: ✭ 1,961 (+607.94%)
Mutual labels:  crawler, spider, web-crawler
Awesome Crawler
A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+1630.32%)
Mutual labels:  crawler, spider, web-crawler
Spidy
The simple, easy to use command line web crawler.
Stars: ✭ 257 (-7.22%)
Mutual labels:  crawler, crawling, web-crawler
scrapy-distributed
A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.
Stars: ✭ 38 (-86.28%)
Mutual labels:  spider, scraping, crawling
Ferret
Declarative web scraping
Stars: ✭ 4,837 (+1646.21%)
Mutual labels:  crawler, scraping, crawling
Dotnetcrawler
DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible for your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c
Stars: ✭ 100 (-63.9%)
Mutual labels:  crawler, scraping, crawling
Zhihu Crawler People
A simple distributed crawler for zhihu && data analysis
Stars: ✭ 182 (-34.3%)
Mutual labels:  crawler, spider, web-crawler
bots-zoo
No description or website provided.
Stars: ✭ 59 (-78.7%)
Mutual labels:  crawler, scraping, crawling
Arachnid
Crawl all unique internal links found on a given website and extract SEO-related information. Supports JavaScript-based sites.
Stars: ✭ 224 (-19.13%)
Mutual labels:  crawler, scraping
Chromium for spider
dynamic crawler for web vulnerability scanner
Stars: ✭ 220 (-20.58%)
Mutual labels:  crawler, spider
Laravel Crawler Detect
A Laravel wrapper for CrawlerDetect - the web crawler detection library
Stars: ✭ 227 (-18.05%)
Mutual labels:  crawler, spider
Strong Web Crawler
An advanced web crawler based on C#/.NET + PhantomJS + Selenium. It can execute JavaScript code, trigger various events, and manipulate the page's DOM structure.
Stars: ✭ 238 (-14.08%)
Mutual labels:  crawler, web-crawler
Jd mask robot
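A crawler that monitors face-mask stock on JD.com (not Selenium-based): QR-code login, price lookup, add to cart, order placement, and flash-sale purchasing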
Stars: ✭ 216 (-22.02%)
Mutual labels:  crawler, spider
Ppspider
Web spider framework built on puppeteer. Supports flexible task queues and task scheduling via decorators, convenient data storage with nedb / mongodb, and data visualization with user interaction.
Stars: ✭ 237 (-14.44%)
Mutual labels:  crawler, spider
Fast Lianjia Crawler
A blazing-fast crawler that fetches data directly through the Lianjia API, the fastest in the universe~~ 🚀
Stars: ✭ 247 (-10.83%)
Mutual labels:  crawler, spider
BaiduSpider
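This project has moved to https://github.com/BaiduSpider/BaiduSpider !! A crawler for Baidu search results, currently supporting Baidu web search, Baidu Images, Baidu Zhidao, Baidu Video, Baidu News, Baidu Wenku, Baidu Jingyan, and Baidu Baike search.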
Stars: ✭ 29 (-89.53%)
Mutual labels:  spider, crawling
scrape-github-trending
Tutorial for web scraping / crawling with Node.js.
Stars: ✭ 42 (-84.84%)
Mutual labels:  scraping, crawling
Webvideobot
Web crawler.
Stars: ✭ 214 (-22.74%)
Mutual labels:  crawler, spider
Magic google
Google search results crawler, get google search results that you need
Stars: ✭ 247 (-10.83%)
Mutual labels:  crawler, spider
PythonScrapyBasicSetup
Basic setup with random user agents and IP addresses for Python Scrapy Framework.
Stars: ✭ 57 (-79.42%)
Mutual labels:  scraping, web-scraping
core
The complete web scraping toolkit for PHP.
Stars: ✭ 1,110 (+300.72%)
Mutual labels:  crawling, web-scraping
socials
👨‍👩‍👦 Social account detection and extraction in Python, e.g. for crawling/scraping.
Stars: ✭ 37 (-86.64%)
Mutual labels:  scraping, crawling
trafilatura
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Stars: ✭ 711 (+156.68%)
Mutual labels:  scraping, web-scraping
ant
A web crawler for Go
Stars: ✭ 264 (-4.69%)
Mutual labels:  spider, web-crawler
scrapy-fieldstats
A Scrapy extension to log items coverage when the spider shuts down
Stars: ✭ 17 (-93.86%)
Mutual labels:  scraping, crawling
double-agent
A test suite of common scraper detection techniques. See how detectable your scraper stack is.
Stars: ✭ 123 (-55.6%)
Mutual labels:  scraping, crawling
diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
Stars: ✭ 53 (-80.87%)
Mutual labels:  scraping, crawling
ioweb
Web Scraping Framework
Stars: ✭ 31 (-88.81%)
Mutual labels:  scraping, web-scraping
selectorlib
A library to read a YML file with Xpath or CSS Selectors and extract data from HTML pages using them
Stars: ✭ 53 (-80.87%)
Mutual labels:  scraping, web-scraping
crawling-framework
Easily crawl news portals or blog sites using Storm Crawler.
Stars: ✭ 22 (-92.06%)
Mutual labels:  scraping, crawling
Bt Btt
Introduction to the magnet-link site U3C3, with domain updates
Stars: ✭ 261 (-5.78%)
Mutual labels:  crawler, spider
proxycrawl-python
ProxyCrawl Python library for scraping and crawling
Stars: ✭ 51 (-81.59%)
Mutual labels:  scraping, crawling