pangxiaobin / proxy_ip_pool

License: Apache-2.0
A proxy IP pool for web crawlers, written in Python.

proxy_ip_pool

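Note: built with the Django framework and the requests library. A live demo is available at http://47.102.205.85:9000/. It runs on a low-spec cloud server that holds only test data, so please do not send a large volume of requests.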

Requirements

  • Python 3 and a MySQL database

Download and usage

  • Download the source code
git clone https://github.com/pangxiaobin/proxy_ip_pool.git

Or download the zip file from https://github.com/pangxiaobin/proxy_ip_pool
  • Install the dependencies
pip install -i https://pypi.douban.com/simple/ -r requments.txt
  • Create the database
mysql -uroot -p
create database ippool charset=utf8;
  • Configure the project
# ProxyIPPool/settings.py: the main configuration file
# Database: uses MySQL
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'ippool',  # database name
        'USER': 'root',  # username
        'PASSWORD': 'password',  # password
        'HOST': 'localhost',
        'PORT': 3306,
    }
}

# uwsgi.ini
[uwsgi]
# IP address and port to listen on; change the port here
http=0.0.0.0:8000
# Project directory: the absolute path to the project
chdir=/path/to/proxy_ip_pool/ProxyIPPool/
# Path to the project's wsgi file, relative to the project directory
wsgi-file=ProxyIPPool/wsgi.py
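
The hard-coded credentials in the settings shown above are fine for a local test. As a hedged sketch, the same DATABASES dict could be built from environment variables so that the password stays out of version control. The IPPOOL_DB_* variable names and the build_databases helper are hypothetical and not part of the project; the defaults mirror the values shown in this README.

```python
import os


def build_databases(env=None):
    """Build a Django DATABASES dict from environment variables.

    The IPPOOL_DB_* names are hypothetical; the fallback values are the
    ones used in this README.
    """
    env = os.environ if env is None else env
    return {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'NAME': env.get('IPPOOL_DB_NAME', 'ippool'),
            'USER': env.get('IPPOOL_DB_USER', 'root'),
            'PASSWORD': env.get('IPPOOL_DB_PASSWORD', 'password'),
            'HOST': env.get('IPPOOL_DB_HOST', 'localhost'),
            # Environment values are strings, so convert the port explicitly.
            'PORT': int(env.get('IPPOOL_DB_PORT', '3306')),
        }
    }


# In ProxyIPPool/settings.py this assignment would replace the literal dict.
DATABASES = build_databases()
```

With no variables set, the result matches the literal configuration, so the change would be backward compatible under these assumptions.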

Generate and apply the migrations

python manage.py makemigrations
python manage.py migrate

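Starting the service

  • Option 1

    cd  ProxyIPPool  # change into the directory that contains manage.py
    python manage.py runserver
    # once started, visit http://127.0.0.1:8000
  • Option 2

    # start the service with uWSGI, which lets it run in the background
    uwsgi --ini uwsgi.ini

Use option 1 for debugging and option 2 for stable, long-running operation.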

  • Start the script that crawls proxy IPs

    # for debugging
    python run.py
    # on a server, run it in the background
    nohup python -u run.py >> crawler.out 2>&1 &

Note: create the log file inside the project:

/ProxyIPPool/log/log.txt
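
The log file can be created by hand, or with a few lines of Python. The following is a minimal sketch; it assumes the path above is relative to the repository root, and the ensure_log_file helper is a hypothetical name, not something the project ships.

```python
from pathlib import Path


def ensure_log_file(project_root):
    """Create ProxyIPPool/log/log.txt under project_root if it is missing."""
    log_file = Path(project_root) / 'ProxyIPPool' / 'log' / 'log.txt'
    # Create the log directory (and any missing parents) first.
    log_file.parent.mkdir(parents=True, exist_ok=True)
    # touch() leaves an existing file's contents untouched.
    log_file.touch(exist_ok=True)
    return log_file


# Example, run from the repository root:
# ensure_log_file('.')
```

Calling the function repeatedly is safe, since neither the directory nor the file is overwritten when it already exists.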

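API

  • Request method: GET

    • http://{server IP}/api/fetch/ returns one random proxy IP record

    • http://{server IP}/api/random/{count} returns the given number of random proxy IP records

  • The content shown on the home page can be changed in IPPool/views.py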

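# IPPool/views.py
from django.http import HttpResponse

# Edit context to change what the home page returns


def index(requests):
    """
    Return the instructions page.
    :param requests: the incoming HTTP request object
    :return: an HttpResponse containing the instructions as HTML
    """
    context = '<h3>1. Call http://{server IP}/api/fetch/ to get one random proxy IP record</h3> <br/>' \
              '<h3>2. Call http://{server IP}/api/random/{count} to get the given number of random proxy IP records</h3> <br/>'
    return HttpResponse(context)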
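
The two GET endpoints listed in the API section can be called from any HTTP client. Below is a hedged sketch using only the standard library. The build_api_url and call_api helpers are hypothetical names; the README does not document the response format, so the sketch returns the raw body as text and assumes it is UTF-8 encoded.

```python
from urllib.request import urlopen


def build_api_url(host, count=None):
    """Return the fetch URL, or the random URL when count is given."""
    base = f'http://{host}/api'
    if count is None:
        return f'{base}/fetch/'
    if count < 1:
        raise ValueError('count must be a positive integer')
    # The README shows this path without a trailing slash.
    return f'{base}/random/{count}'


def call_api(host, count=None, timeout=10):
    """GET the endpoint and return the raw response body as text."""
    with urlopen(build_api_url(host, count), timeout=timeout) as resp:
        # UTF-8 is an assumption; the README does not state the encoding.
        return resp.read().decode('utf-8')


# Example (needs a running server, so it is left commented out):
# print(call_api('127.0.0.1:8000'))           # one random proxy
# print(call_api('127.0.0.1:8000', count=5))  # five random proxies
```

Keeping URL construction separate from the network call makes the path logic easy to check without a live server.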