phpgao / proxy_pool

Licence: MIT license
A simple proxy pool

proxy_pool

A simple proxy pool tool written in Go.

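Features

  • Periodically scrapes free public proxies from the internet
  • Periodically validates the proxies in the pool
  • Supports dynamic proxying (HTTPS is supported only through the CONNECT method)
  • Uses the collected proxies to access the proxy-listing sites themselves
  • Configured through command-line flags or environment variables
  • Falls back to local forwarding when no proxy IP is available

Dependencies

  • redis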

Usage

Build

go build

Download

# Version v0.3.3
wget https://github.com/phpgao/proxy_pool/releases/download/v0.3.3/proxy_pool_linux_amd64
chmod a+x proxy_pool_linux_amd64

Run

cp config_example.json config.json
# Edit the redis and port settings

# Thanks to ipip.net for the accurate IP data (bundled)
./proxy_pool

# Print the available options
./proxy_pool_linux_amd64 --help

# Set the configuration on the command line
./proxy_pool_linux_amd64 -host 8.8.8.8 -port 6379 -auth laogao

# Run in the background
nohup ./proxy_pool_linux_amd64 > /dev/null 2>&1 &

API

# Statistics
curl 127.0.0.1:8088
# Random proxy
curl 127.0.0.1:8088/random
# Get the list
curl 127.0.0.1:8088/get

Dynamic proxy

# http
curl http://cip.cc -x 127.0.0.1:8089
# https
curl https://cip.cc -x 127.0.0.1:8089
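
The same requests can be made from Go by pointing an `http.Transport` at the dynamic proxy port. This is only a sketch: the address `127.0.0.1:8089` and the target `http://cip.cc` come from the curl commands above, while `parseProxyAddr`, `newProxyClient`, and the 10-second timeout are illustrative choices rather than part of the project.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// parseProxyAddr turns "host:port" into an http:// proxy URL.
// Hypothetical helper for this example.
func parseProxyAddr(addr string) (*url.URL, error) {
	return url.Parse("http://" + addr)
}

// newProxyClient returns an HTTP client that sends every request through
// the proxy listening on addr. HTTPS targets are tunnelled with CONNECT
// by the standard library.
func newProxyClient(addr string) (*http.Client, error) {
	proxyURL, err := parseProxyAddr(addr)
	if err != nil {
		return nil, err
	}
	return &http.Client{
		Transport: &http.Transport{Proxy: http.ProxyURL(proxyURL)},
		Timeout:   10 * time.Second, // assumed value, tune as needed
	}, nil
}

func main() {
	client, err := newProxyClient("127.0.0.1:8089")
	if err != nil {
		fmt.Println("bad proxy address:", err)
		return
	}
	resp, err := client.Get("http://cip.cc")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```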

Some details

Flow chart

                                                                 +-------------------------+
                                                                 |                         |
                                                                 |                         |
+-------------+      +------------+         +-------------+      |                         |
|             |      |            |         |             |      |                         |
|   source    +------> new proxy  +--------->  validator  +------>                         |
|             +------>            +--------->             +------>                         |
|             |      |            |         |             |      |         The Pool        |
+-------------+      +------------+         +-------------+      |                         |
                                                                 |                         |
                             +----------------------------------->                         |
                             |           +/- score               |                         |
                             |                                   |                         |
                             |                                   |                         |
                             |                                   +-------------+-----------+
                             |                                                 |
                     +-------+------+        +--------------+                  |
                     |              |        |              |                  |
                     |              <--------+              |      cron        |
                     |  old proxy   <--------+  validator   +<-----------------+
                     |              |        |              |
                     |              |        |              |
                     +--------------+        +--------------+

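About the validation logic

  1. Proxy checks use a scoring mechanism: a new proxy starts at 60 points, the maximum is 100, each failed check deducts 30 points, and each successful check adds 10. When the score drops to 0 or below, the proxy address is deleted.
  2. A new proxy goes through three checks (tcp, http, https) before it is stored; passing the http test is enough for it to be added to the database.
  3. The scheduled check only tests tcp and the https CONNECT method, and it reclassifies proxies previously marked as http to https when they pass. If an https proxy fails the check, however, 20 points are deducted.
  4. The rules are not yet complete; discussion on how to improve proxy stability is welcome.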
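
The scoring rule in item 1 can be sketched in a few lines of Go. This is an illustration, not the project's code: `applyCheck` and the constant names are hypothetical, clamping the score at 100 is an assumption about how the maximum is enforced, and the separate 20-point deduction for https proxies from item 3 is not modelled.

```go
package main

import "fmt"

// Values taken from rule 1 above; the names are made up for this sketch.
const (
	initialScore = 60  // score given to a newly stored proxy
	maxScore     = 100 // upper bound on the score
	failPenalty  = 30  // deducted for each failed check
	successBonus = 10  // added for each successful check
)

// applyCheck returns the score after one check and whether the proxy
// should now be deleted (score <= 0).
func applyCheck(score int, ok bool) (int, bool) {
	if ok {
		score += successBonus
		if score > maxScore {
			score = maxScore // assumed: the score is clamped at the maximum
		}
	} else {
		score -= failPenalty
	}
	return score, score <= 0
}

func main() {
	score := initialScore
	for _, ok := range []bool{true, false, false, false} {
		var remove bool
		score, remove = applyCheck(score, ok)
		fmt.Printf("ok=%v score=%d remove=%v\n", ok, score, remove)
	}
}
```

Starting from 60, one success followed by three failures moves the score through 70, 40, 10, and -20, at which point the proxy is flagged for deletion.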

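About adding sources

  1. For demos, see the source folder in the source code.
  2. Spiders come in html, json, re, and text types.
  3. Spiders reuse the Spider struct; a new spider must implement the following methods:
  • Cron --> defines the run interval
  • Name --> defines the spider name
  • Run --> generic method that performs the download and parsing; copy it as-is
  • StartUrl --> returns the entry page of the target site
  • Parse --> receives the final fetched HTML of the proxy page and returns pointers to []model.HttpProxy instances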
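
To make the method list above concrete, here is a rough sketch of a text-type spider. Only the five method names come from the text. Everything else is assumed for illustration and should be checked against the demos in the source folder: the `Crawler` interface, all signatures, the `HttpProxy` fields, the `@every 5m` interval format, and the example.com URL.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// HttpProxy stands in for the project's model.HttpProxy; fields are assumed.
type HttpProxy struct {
	Ip   string
	Port string
}

// Crawler collects the five methods named above. Signatures are assumptions.
type Crawler interface {
	Cron() string
	Name() string
	Run()
	StartUrl() []string
	Parse(body string) ([]*HttpProxy, error)
}

// textSpider is a hypothetical spider for pages with one "ip:port" per line.
type textSpider struct{}

func (textSpider) Cron() string       { return "@every 5m" } // assumed format
func (textSpider) Name() string       { return "example_text" }
func (textSpider) StartUrl() []string { return []string{"https://example.com/proxies.txt"} }
func (textSpider) Run()               {} // stub; the text says to copy the real one

// Parse keeps every line that splits cleanly into host and port.
func (textSpider) Parse(body string) ([]*HttpProxy, error) {
	var out []*HttpProxy
	for _, line := range strings.Split(body, "\n") {
		host, port, err := net.SplitHostPort(strings.TrimSpace(line))
		if err != nil || host == "" || port == "" {
			continue // skip blank or malformed lines
		}
		out = append(out, &HttpProxy{Ip: host, Port: port})
	}
	return out, nil
}

// Compile-time check that textSpider satisfies the sketched interface.
var _ Crawler = textSpider{}

func main() {
	proxies, err := textSpider{}.Parse("1.2.3.4:8080\nnot-a-proxy\n5.6.7.8:3128\n")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	for _, p := range proxies {
		fmt.Printf("%s:%s\n", p.Ip, p.Port)
	}
}
```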

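About the dynamic proxy

  1. The core code is taken from "HTTP(S) Proxy in Golang in less than 100 lines of code", with some targeted optimizations.
  2. If no proxy is currently available, the software acts as a transparent proxy itself.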

About the traditional API proxy

  1. The endpoints are divided into statistics and retrieval.
  2. Queries support schema=http(s), source=spider.name, score=100, country=cn
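
As an illustration of the query parameters above, the following sketch builds a filtered request URL with `net/url`. It assumes the filters are passed as query-string parameters to the `/get` endpoint on port 8088 shown in the API section, which the text does not state explicitly. `buildGetURL` is a hypothetical helper, and whether `score` is matched exactly or as a threshold is not specified.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildGetURL appends the non-empty filters to base + "/get".
// Keys are expected to be the documented ones: schema, source, score, country.
func buildGetURL(base string, filters map[string]string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/get" // assumed endpoint for filtered retrieval
	q := url.Values{}
	for k, v := range filters {
		if v != "" {
			q.Set(k, v)
		}
	}
	u.RawQuery = q.Encode() // Encode sorts parameters by key
	return u.String(), nil
}

func main() {
	target, err := buildGetURL("http://127.0.0.1:8088", map[string]string{
		"schema":  "https",
		"score":   "100",
		"country": "cn",
	})
	if err != nil {
		fmt.Println("bad base URL:", err)
		return
	}
	fmt.Println(target)
}
```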

TODO

  • TCP pool
  • go test
  • Finer-grained timeout control
  • Master-slave mode
  • Proxy authentication

Feedback

Testing and feedback from everyone are very welcome!

More

If you use scrapy, why not give scrapy-random-useragent-pro a try?

Thanks
