
Karmenzind / Fp Server

License: MIT
Free proxy server that continuously crawls and serves proxies, based on Tornado and Scrapy. Build your own local proxy pool.

Programming Languages

Python

Projects that are alternatives of or similar to Fp Server

Proxy pool
Python crawler proxy IP pool (proxy pool)
Stars: ✭ 13,964 (+8967.53%)
Mutual labels:  spider, proxy, proxypool

Ok ip proxy pool
🍿 A crawler proxy IP pool (proxy pool) in Python 🍟 a decent IP proxy pool
Stars: ✭ 196 (+27.27%)
Mutual labels:  spider, proxy, proxypool

Spoon
🥄 A package for building specific Proxy Pool for different Sites.
Stars: ✭ 173 (+12.34%)
Mutual labels:  spider, proxy, proxypool

Scrapy IPProxyPool
Free IP proxy pool; a plugin for the Scrapy crawler framework
Stars: ✭ 100 (-35.06%)
Mutual labels:  spider, scrapy, proxypool

Marmot
💐 Marmot | Web Crawler/HTTP protocol Download Package 🐭
Stars: ✭ 186 (+20.78%)
Mutual labels:  spider, scrapy, proxy

OpenScraper
An open source webapp for scraping: towards a public service for webscraping
Stars: ✭ 80 (-48.05%)
Mutual labels:  spider, tornado, scrapy

Capturer
Capture pictures from websites such as Sina, Lofter, Huaban, and so on
Stars: ✭ 76 (-50.65%)
Mutual labels:  spider, scrapy

Scrapoxy
Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests, so you can crawl without thinking about blacklisting!
Stars: ✭ 1,322 (+758.44%)
Mutual labels:  scrapy, proxy

Hive
Lots of spiders
Stars: ✭ 110 (-28.57%)
Mutual labels:  spider, scrapy

Free proxy website
A collection of websites offering free SOCKS/HTTPS/HTTP proxies
Stars: ✭ 119 (-22.73%)
Mutual labels:  spider, proxy

Django Dynamic Scraper
Creating Scrapy scrapers via the Django admin interface
Stars: ✭ 1,024 (+564.94%)
Mutual labels:  spider, scrapy

Scrala
Unmaintained 🐳 ☕️ 🕷 Scala crawler (spider) framework, inspired by Scrapy, created by @gaocegege
Stars: ✭ 113 (-26.62%)
Mutual labels:  spider, scrapy

Python3 Spider
Hands-on Python crawlers with simulated login for major websites, including but not limited to slider CAPTCHAs, Pinduoduo, Meituan, Baidu, Bilibili, Dianping, and Taobao. If you like it, please give it a star ❤️
Stars: ✭ 2,129 (+1282.47%)
Mutual labels:  spider, scrapy

Image Downloader
Download images from Google, Bing, and Baidu.
Stars: ✭ 1,173 (+661.69%)
Mutual labels:  spider, scrapy

Alipayspider Scrapy
AlipaySpider on Scrapy (uses Chrome driver); an Alipay crawler based on Scrapy
Stars: ✭ 70 (-54.55%)
Mutual labels:  spider, scrapy

Jcrandomproxy
Random proxies
Stars: ✭ 105 (-31.82%)
Mutual labels:  proxy, proxypool

Reptile
🏀 Hands-on Python 3 web crawlers (some with detailed tutorials) covering Maoyan, Tencent Video, Douban, Yanzhao (graduate admissions), Weibo, Biquge novels, Baidu hot topics, Bilibili, CSDN, NetEase Cloud Reading, Ali Literature, Baidu Stocks, Toutiao, WeChat official accounts, NetEase Cloud Music, Lagou, Youdao, Unsplash, Shixiseng, Autohome, LoL Box, Dianping, Lianjia, LPL schedules, typhoons, the Fantasy Westward Journey and Onmyoji CBG marketplaces, weather, Nowcoder, Baidu Wenku, bedtime stories, Zhihu, and Wish
Stars: ✭ 1,048 (+580.52%)
Mutual labels:  spider, scrapy

Copybook
Crawl every novel on a novel site, store them in a database, and build your own novel website from the crawled data
Stars: ✭ 117 (-24.03%)
Mutual labels:  spider, scrapy

Feapder
feapder is a Python crawler framework with support for distributed crawling, batch collection, task-loss protection, and rich alerting
Stars: ✭ 110 (-28.57%)
Mutual labels:  spider, scrapy

Proxy pool
IP proxy pool
Stars: ✭ 126 (-18.18%)
Mutual labels:  proxy, proxypool

fp-server

❗️ This code really sucks and is for reference only; I'll rewrite it when I have free time.



A free proxy server based on Tornado and Scrapy.

Build your own proxy pool!

Features:

  • continuously crawls and provides free proxies
  • asynchronous and high-performance
  • periodically rechecks stored proxies and discards unavailable ones
  • easy-to-use HTTP API


Read the Chinese documentation _(:ι」∠)_

This project has been tested on:

  • Arch Linux; Python 3.6.5
  • Debian (WSL, Raspbian); Python 3.5.3

It cannot run directly on Windows; Windows users can try Docker or WSL instead.


Get started

Choose one of the options below. After a successful deployment, use the APIs to fetch proxies.

Using Docker

The easiest way to run this project is with Docker. Install Docker, then run:

# download the image
docker pull karmenzind/fp-server:stable
# run the container
# don't forget to modify `-p` if you prefer another port
docker run -itd --name fpserver -p 12345:12345 karmenzind/fp-server:stable
# check the output inside the container
docker logs -f fpserver

For custom configuration, see this section.

Manual installation

  1. Install Redis and Python >= 3.5 (I use Python 3.6.5).
  2. Clone this repo.
  3. Install the Python packages:
pip install -r requirements.txt
  4. Read the config and modify it according to your needs.
  5. Start the server:
python ./src/main.py

Web APIs

Typical response:

{
    "code": 0,
    "msg": "ok",
    "data": {}
}
  • code: result of the event (not the HTTP status code); 0 means success
  • msg: message for the event
  • data: details of the successful event

Get proxies

GET /api/proxy/

Parameters (O = optional):

  • count (O): the number of proxies you need. Default: 1
  • scheme (O): HTTP or HTTPS. Default: both*
  • anonymity (O): transparent or anonymous. Default: both*
  • sort_by_speed (O, TODO): 1 for descending order, 0 for no order, -1 for ascending order. Default: 0

* both: include all types, not grouped

Example

  • To fetch 10 anonymous HTTP proxies:
    GET /api/proxy/?count=10&scheme=HTTP&anonymity=anonymous

    The response:
    {
        "code": 0,
        "msg": "ok",
        "data": {
            "count": 9,
            "items": [
            {
                "port": 2000,
                "ip": "xxx.xxx.xx.xxx",
                "scheme": "HTTP",
                "url": "http://xxx.xxx.xxx.xx:xxxx",
                "anonymity": "transparent"
            }
            ]
        }
    }
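
As a rough illustration, the same call from Python. This is a minimal sketch assuming the server listens on 127.0.0.1:12345 and that the third-party requests package is installed; it is not part of the project itself:

import requests

resp = requests.get(
    "http://127.0.0.1:12345/api/proxy/",
    params={"count": 10, "scheme": "HTTP", "anonymity": "anonymous"},
)
payload = resp.json()
assert payload["code"] == 0, payload["msg"]  # 0 means success

# print the usable proxy URLs
for item in payload["data"]["items"]:
    print(item["url"])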
    


Create a new proxy manually

POST /api/proxy/

Parameters (M = mandatory, O = optional; see the sketch below for an example call):

  • ip (M): e.g. 111.111.111.111
  • port (M): e.g. 12345
  • scheme (M): HTTP or HTTPS
  • anonymity (O): transparent or anonymous. Default: transparent
  • need_auth (O): 0 or 1
  • user (O)
  • password (O)
  • url (O): generated from the given scheme, ip, and port if omitted
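
For illustration, a hedged sketch of calling this endpoint from Python. This page does not show whether the endpoint expects form-encoded or JSON bodies, so the form-encoded variant below is an assumption, and the address and field values are placeholders:

import requests

resp = requests.post(
    "http://127.0.0.1:12345/api/proxy/",
    data={  # assumption: form-encoded fields
        "ip": "111.111.111.111",
        "port": 12345,
        "scheme": "HTTP",
        "anonymity": "anonymous",
    },
)
print(resp.json())  # expect {"code": 0, "msg": "ok", ...} on success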


Check status

Check the server status, including:

  • running spiders
  • stored proxies
GET /api/status/

No params.
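
For illustration, a quick status check from Python, again assuming a local server and the requests package:

import requests

status = requests.get("http://127.0.0.1:12345/api/status/").json()
print(status["data"])  # running spiders and stored-proxy statistics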


Config

Introduction

The configuration file is written in YAML. The definitions and default values of the supported items are:

# server's http port
HTTP_PORT: 12345

# redirect output to the console instead of the log file
CONSOLE_OUTPUT: 1

# Log
# dir and filename require `CONSOLE_OUTPUT: 0`
LOG: 
  level: 'debug'
  dir: './logs'
  filename: 'fp-server.log'

# redis database
REDIS:
  host: '127.0.0.1'
  port: 6379
  db: 0
  password:

# stop crawling new proxies
# once this many proxies are stored
PROXY_STORE_NUM: 500

# interval (seconds) between availability checks
# applies to each individual proxy, not the checker as a whole
PROXY_STORE_CHECK_SEC: 3600

Customization

  • If you use Docker:
    • Create a directory such as /x/config_dir and put your config.yml in it. Then modify the docker-run command like this:
      docker run -itd --name fpserver -p 12345:12345 -v "/x/config_dir":"/fps_config" karmenzind/fp-server:stable
      
    • External config.yml doesn't need to contain all config items. For example, it can be:
      PROXY_STORE_NUM: 100
      LOG:
          level: 'info'
      PROXY_STORE_CHECK_SEC: 7200
      
      All other items keep their default values.
    • If you need a log file, don't modify LOG: dir in config.yml. Instead, create a directory for the log file, such as /x/log_dir, and change the docker run command like:
      docker run -itd --name fpserver -p 12345:12345 -v "/x/config_dir":"/fps_config" -v "/x/log_dir":"/fp_server/logs" karmenzind/fp-server:stable
      
    • There's no need to modify the exposed port of the container. If you prefer publishing it on another host port (say, 9999), change the -p parameter in the docker run command to -p 9999:12345.
    • If you need to access Redis from the host, add another publish parameter, such as -p 6379:6379, to the docker run command.
  • If you manually deploy the project:
    • Modify the internal config file: src/config/common.py

Source websites

Growing…

If you know of good free-proxy websites, please tell me and I will add them to this project.

Supporting:

Thanks to: Golmic, Eric_Chan

FAQ

  • How about the availability and quality of the proxies?

    Before storing a new proxy, fp-server checks its availability, anonymity, and speed from your local network, so feel free to use the crawled proxies.

  • How large should I set PROXY_STORE_NUM? Is there any limit?

    Set it according to your actual needs. For a typical spider project, 300 to 500 is plenty. There is no hard limit for now; I stopped testing after storing 10,000 available proxies. The practical upper bound depends on the source websites, and I will add more of them if more people use this project.

  • How to use it in my project?

    See the next section.

Examples

These snippets can be copied directly into your project. Remember to adjust the configuration and settings first.

I will write more snippets when I have time, or you can tell me what examples you would like.

Use fp-server with the Python requests module

Here.
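
The linked snippet is not reproduced on this page. As a rough sketch of the same idea, assuming a local fp-server and the requests package (httpbin.org is just a convenient echo service):

import requests

API = "http://127.0.0.1:12345/api/proxy/"

def get_one_proxy(scheme="HTTP"):
    """Fetch a single proxy URL from fp-server, or None if none is stored."""
    data = requests.get(API, params={"count": 1, "scheme": scheme}).json()["data"]
    return data["items"][0]["url"] if data["count"] else None

proxy = get_one_proxy()
if proxy:
    resp = requests.get(
        "http://httpbin.org/ip",
        proxies={"http": proxy, "https": proxy},
        timeout=10,
    )
    print(resp.text)  # should show the proxy's IP rather than yours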

Use fp-server in a Scrapy project

Here is a middleware for Scrapy that fetches and applies a proxy for each request. Copy it into your middlewares.py and add its name to DOWNLOADER_MIDDLEWARES in your settings.py.
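
The actual middleware lives behind the link above and is not reproduced here. The sketch below only illustrates the general shape: the class and module names are made up, the server address is assumed, and a production version would avoid a blocking requests call inside Scrapy's event loop.

import requests

class FPServerProxyMiddleware:
    """Hypothetical Scrapy downloader middleware: one fresh proxy per request."""

    api = "http://127.0.0.1:12345/api/proxy/"

    def process_request(self, request, spider):
        # Scrapy's built-in HttpProxyMiddleware honours request.meta["proxy"]
        data = requests.get(self.api, params={"count": 1}).json()["data"]
        if data["count"]:
            request.meta["proxy"] = data["items"][0]["url"]

Then in settings.py (the module path and priority value are illustrative):

DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.FPServerProxyMiddleware": 543,
}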

If you want to keep a cookie pool for your proxies (an independent cookiejar for each IP), this middleware may help you.

Bugs and feature requests

I need your feedback to make it better.
Please create an issue for any problems or advice.

Known bugs:

  • Blocking when using Tornado 4.5.3
  • After a check, the Redis key might change

TODOs and ideas

  • Use ZSET
  • Add Supervisor
  • Split out the logging module
  • More detailed API
  • Web frontend via Bootstrap
  • Add a user-agent pool
  • The checker's scheduler:
    • Periodically calculate the average speed of check requests, then reschedule the checker based on this average and the number of stored proxies.
  • Provide region information.
  • Use Redis's HSET for calculations.