zhshch2002 / Goribot
Licence: apache-2.0
[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.
Stars: ✭ 190
Projects that are alternatives of or similar to Goribot
Crawlab
Distributed web crawler admin platform for managing spiders, regardless of language or framework.
Stars: ✭ 8,392 (+4316.84%)
Mutual labels: crawler, spider, scrapy
Scrapit
Scraping scripts for various websites.
Stars: ✭ 25 (-86.84%)
Mutual labels: crawler, spider, scraper
Python3 Spider
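Python crawlers in practice: simulated login for major websites, including but not limited to slider CAPTCHAs, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️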
Stars: ✭ 2,129 (+1020.53%)
Mutual labels: crawler, spider, scrapy
Spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Stars: ✭ 656 (+245.26%)
Mutual labels: crawler, spider, scraper
Crawler
A high performance web crawler in Elixir.
Stars: ✭ 781 (+311.05%)
Mutual labels: crawler, spider, scraper
Scrapingoutsourcing
ScrapingOutsourcing focuses on sharing crawler code and tries to add one update per week.
Stars: ✭ 164 (-13.68%)
Mutual labels: crawler, spider, scrapy
Haipproxy
💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis
Stars: ✭ 4,993 (+2527.89%)
Mutual labels: crawler, spider, scrapy
Geziyor
Geziyor, a fast web crawling & scraping framework for Go. Supports JS rendering.
Stars: ✭ 1,246 (+555.79%)
Mutual labels: crawler, spider, scraper
Marmot
💐Marmot | Web Crawler/HTTP protocol Download Package 🐭
Stars: ✭ 186 (-2.11%)
Mutual labels: crawler, spider, scrapy
Scrapoxy
Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!
Stars: ✭ 1,322 (+595.79%)
Mutual labels: crawler, scraper, scrapy
Icrawler
A multi-thread crawler framework with many builtin image crawlers provided.
Stars: ✭ 629 (+231.05%)
Mutual labels: crawler, spider, scrapy
Mailinglistscraper
A python web scraper for public email lists.
Stars: ✭ 19 (-90%)
Mutual labels: spider, scraper, scrapy
Crawlab Lite
Lite version of Crawlab, a crawler management platform.
Stars: ✭ 122 (-35.79%)
Mutual labels: crawler, spider, scrapy
Avbook
AV movie management system with crawlers for avmoo, javbus, and javlibrary; online AV film library and AV magnet-link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database
Stars: ✭ 8,133 (+4180.53%)
Mutual labels: crawler, spider, scraper
Crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
Stars: ✭ 440 (+131.58%)
Mutual labels: crawler, spider, scraper
Awesome Crawler
A collection of awesome web crawlers and spiders in different languages
Stars: ✭ 4,793 (+2422.63%)
Mutual labels: crawler, spider, scraper
Django Dynamic Scraper
Creating Scrapy scrapers via the Django admin interface
Stars: ✭ 1,024 (+438.95%)
Mutual labels: spider, scraper, scrapy
Not Your Average Web Crawler
A web crawler (for bug hunting) that gathers more than you can imagine.
Stars: ✭ 107 (-43.68%)
Mutual labels: crawler, spider, scraper
Goribot
A lightweight, distributed-friendly Golang crawler framework.
!! Warning !!
Goribot has moved to Gospider (github.com/zhshch2002/gospider). Gospider fixes some scheduling issues and splits the network request layer out into a separate repository. This repository will be kept, but new users are advised to use Gospider.
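🚀 Features
- Elegant API
- Clean documentation
- Fast (a single core processes >1K tasks/sec)
- Friendly distributed support
- Convenient details
  - Automatic conversion of relative links
  - Automatic decoding of character encodings
  - Automatic parsing of HTML and JSON
- Rich extension support
  - Request deduplication (👈 works in distributed setups)
  - Limits on requests, rate, and concurrency
  - JSON and CSV result storage
  - Robots.txt support
  - Logging of request errors
  - Random User-Agent and random proxy
  - Retry on failure
- Lightweight, suitable for learning or for getting started quickly out of the box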
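Two of the conveniences listed above, relative link conversion and request deduplication, can be illustrated with the Go standard library alone. The following is a minimal sketch and not Goribot's implementation: `resolveLink` and `seenSet` are hypothetical names, the URLs are placeholders, and the deduplicator is in-memory only (a distributed setup would need a shared store instead).

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLink (hypothetical helper) turns a possibly relative href into an
// absolute URL, using the URL of the page it was found on as the base.
func resolveLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", err
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", err
	}
	return base.ResolveReference(ref).String(), nil
}

// seenSet (hypothetical type) is an in-memory deduplicator.
// It is not safe for concurrent use and is not shared across processes.
type seenSet map[string]struct{}

// Add records u and reports whether it was new.
func (s seenSet) Add(u string) bool {
	if _, ok := s[u]; ok {
		return false
	}
	s[u] = struct{}{}
	return true
}

func main() {
	seen := seenSet{}
	page := "https://example.com/docs/index.html" // placeholder page URL
	for _, href := range []string{"../about", "/about", "guide.html"} {
		abs, err := resolveLink(page, href)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		if seen.Add(abs) {
			fmt.Println("queue:", abs)
		} else {
			fmt.Println("duplicate:", abs)
		}
	}
}
```

Resolving before deduplicating matters here: `../about` and `/about` are different strings but point to the same absolute URL, so the second one is reported as a duplicate.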
Version warning
Goribot only supports Go 1.13 and later.
👜 Get Goribot
go get -u github.com/zhshch2002/goribot
Goribot also has an earlier development version. If you need that version, pull the release tagged v0.0.1.
⚡ Create your first project
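package main

import (
	"fmt"

	"github.com/zhshch2002/goribot"
)

func main() {
	s := goribot.NewSpider()

	// Queue one GET request together with the callback that handles its response.
	s.AddTask(
		goribot.GetReq("https://httpbin.org/get"),
		func(ctx *goribot.Context) {
			// Print the response body as text.
			fmt.Println(ctx.Resp.Text)
			// Print the User-Agent field from the JSON body.
			fmt.Println(ctx.Resp.Json("headers.User-Agent"))
		},
	)

	s.Run()
}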
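To make the `headers.User-Agent` lookup in the example concrete, here is a standard-library sketch of what a dot-separated path selects from a JSON body. It is not Goribot's implementation: `lookup` is a hypothetical helper, the sample body is made up to resemble an httpbin.org/get response, and the sketch assumes the path is a chain of object keys (no arrays, no escaped dots).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lookup (hypothetical helper) walks a dot-separated path through nested
// JSON objects. It handles objects only, which is a deliberate simplification.
func lookup(data []byte, path string) (interface{}, bool) {
	var cur interface{}
	if err := json.Unmarshal(data, &cur); err != nil {
		return nil, false
	}
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false // current value is not an object
		}
		if cur, ok = obj[key]; !ok {
			return nil, false // key missing at this level
		}
	}
	return cur, true
}

func main() {
	// Made-up sample body; a real response will contain different values.
	body := []byte(`{"headers": {"User-Agent": "sample-agent/1.0"}, "url": "https://httpbin.org/get"}`)

	if v, ok := lookup(body, "headers.User-Agent"); ok {
		fmt.Println(v)
	} else {
		fmt.Println("path not found")
	}
}
```

In the Goribot example the same idea applies to the live response: the path names the `User-Agent` key inside the `headers` object of the returned JSON.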
🎉 Done
You can now use Goribot. To learn more, start with Getting Started.
🙏 Thanks
Many thanks to the projects above for their help 🙏.