ZKeeer / Ipproxy
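IP proxies for web crawlers: scrapes proxy IPs from nine websites, then validates, cleans, stores, and updates them, and provides an interface for retrieving them.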
IPProxy
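IP proxies for web crawlers: scrapes proxy IPs from the nine websites listed under "Data sources", then validates, cleans, stores, and updates them, and provides an interface for retrieving them.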
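So far tested only on Windows 10 (64-bit) with Python 3.5 and on Ubuntu Server 16.04.1 LTS (64-bit) with Python 3.5.
On machines with different specs, adjust the maximum thread count in Config.py. See the Config.py section below for details.
For usage, see demo.py.
Util.Refresh(): the database is not refreshed on its own; call this function explicitly to update it with new data.
Util.Get(): returns one usable proxy. The proxy returned by Util.Get() looks like this: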
{'http': 'http://115.159.152.130:81', 'https': 'https://115.159.152.130:81'}
requests can use it directly: requests.get(url, proxies=Util.Get(), headers={})
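A proxy that was usable when it was stored can still fail by the time it is used, so callers may want to retry with a fresh proxy. The following is a minimal sketch of that pattern and is not part of the project: fetch_with_proxy, get_proxy, and fetch are hypothetical names, and both callables are injected so the sketch does not depend on the network.

```python
from typing import Callable, Dict, Optional

Proxies = Dict[str, str]


def fetch_with_proxy(
    url: str,
    get_proxy: Callable[[], Proxies],
    fetch: Callable[[str, Proxies], object],
    attempts: int = 3,
) -> object:
    """Try `fetch` up to `attempts` times, asking for a fresh proxy each time.

    `get_proxy` stands in for Util.Get() and `fetch` for a call such as
    requests.get(url, proxies=proxies); both are assumptions of this sketch.
    """
    last_error: Optional[Exception] = None
    for _ in range(attempts):
        proxies = get_proxy()
        try:
            return fetch(url, proxies)
        except Exception as exc:  # a dead proxy usually surfaces as a connection error
            last_error = exc
    raise RuntimeError(f"all {attempts} proxy attempts failed") from last_error
```

Assuming Util is imported from the project and requests is installed, a call might look like fetch_with_proxy(url, Util.Get, lambda u, p: requests.get(u, proxies=p, timeout=10)).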
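Config.py section:
Set the maximum thread limit, MaxThreads. On a low-spec machine, set it to 16 or 32 and let it run slowly. If you are confident in your machine (say, an i7 or Xeon CPU, plenty of RAM, and fast network bandwidth), you can set it to 1024.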
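If you know of other proxy sites, add them to the Url_Regular dictionary.
Url_Regular holds proxy-site URLs and their corresponding regular expressions. Each regex must capture the IP and the port separately, for example [(192.168.1.1, 80), (192.168.1.1, 90),]
Only the first page of each site is scraped. To scrape later pages, post the links and regexes; for example, add the links for pages 1, 2, ... of a site, each with its regex, to the Url_Regular dictionary as separate entries.
Before adding a regex, make sure it passes in 站长工具-正则表达式在线测试 (the Webmaster Tools online regex tester).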
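The requirements above can be illustrated with a small sketch. The dictionary shape (URL mapped to a regex string), the example URL, the sample HTML, and the regex itself are assumptions made for illustration; the project's actual Config.py entries may differ. The pattern uses two capture groups, so re.findall returns (IP, port) pairs, and the thread pool is capped by MaxThreads.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical values in the spirit of Config.py; not copied from the project.
MaxThreads = 32
Url_Regular: Dict[str, str] = {
    "http://proxy-site.example/free/":
        r"<td>(\d{1,3}(?:\.\d{1,3}){3})</td>\s*<td>(\d{1,5})</td>",
}


def extract_proxies(html: str, pattern: str) -> List[Tuple[str, str]]:
    """Return (ip, port) string pairs; two capture groups keep them separate."""
    return re.findall(pattern, html)


def scrape_all(
    url_regular: Dict[str, str],
    download: Callable[[str], str],
    max_threads: int = MaxThreads,
) -> List[Tuple[str, str]]:
    """Download every configured page with at most `max_threads` workers.

    `download` is injected (for example a wrapper around requests.get) so the
    sketch stays independent of the network.
    """
    urls = list(url_regular)
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        pages = list(pool.map(download, urls))
    found: List[Tuple[str, str]] = []
    for url, html in zip(urls, pages):
        found.extend(extract_proxies(html, url_regular[url]))
    return found
```

Each extracted pair can then be combined into the dictionary format shown for Util.Get(). Lowering max_threads trades speed for a lighter load on the machine.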
Data sources:
http://www.kuaidaili.com/free/
http://www.66ip.cn/
http://www.xicidaili.com/nn/
http://www.ip3366.net/free/
http://www.proxy360.cn/Region/China
http://www.mimiip.com/
http://www.data5u.com/free/index.shtml
http://www.ip181.com/
http://www.kxdaili.com/
Feel free to add proxy sites you know of, so everyone can share resources.
Logical structure:

Issues and pull requests are welcome. The code is rough, so please go easy on it.