
xelzmm / Proxy_server_crawler

License: MIT
An awesome public proxy server crawler based on the Scrapy framework

Programming Languages

python
139335 projects - #7 most used programming language

Labels

Projects that are alternatives to or similar to Proxy_server_crawler

Django Dynamic Scraper
Creating Scrapy scrapers via the Django admin interface
Stars: ✭ 1,024 (+989.36%)
Mutual labels:  scrapy
Terpene Profile Parser For Cannabis Strains
Parser and database to index the terpene profile of different strains of Cannabis from online databases
Stars: ✭ 63 (-32.98%)
Mutual labels:  scrapy
Olxscraper
OLX Scraper in Python Scrapy
Stars: ✭ 76 (-19.15%)
Mutual labels:  scrapy
Wescraper
Crawls WeChat official account articles via Sogou search, built on Scrapy
Stars: ✭ 46 (-51.06%)
Mutual labels:  scrapy
Scrapy S3pipeline
Scrapy pipeline to store chunked items into Amazon S3 or Google Cloud Storage bucket.
Stars: ✭ 57 (-39.36%)
Mutual labels:  scrapy
Alipayspider Scrapy
AlipaySpider on Scrapy (uses Chrome driver); an Alipay crawler based on Scrapy
Stars: ✭ 70 (-25.53%)
Mutual labels:  scrapy
Articlespider
Source code for the imooc.com Python distributed crawler course, updated and maintained over the long term
Stars: ✭ 40 (-57.45%)
Mutual labels:  scrapy
Distributed Multi User Scrapy System With A Web Ui
Django based application that allows creating, deploying and running Scrapy spiders in a distributed manner
Stars: ✭ 88 (-6.38%)
Mutual labels:  scrapy
Warta Scrap
Indonesia Index News Crawler, including 10 online media
Stars: ✭ 57 (-39.36%)
Mutual labels:  scrapy
Capturer
Captures pictures from websites such as Sina, Lofter, Huaban, and others
Stars: ✭ 76 (-19.15%)
Mutual labels:  scrapy
Scrapy Pyppeteer
Pyppeteer integration for Scrapy
Stars: ✭ 48 (-48.94%)
Mutual labels:  scrapy
Scrapy Craigslist
Web Scraping Craigslist's Engineering Jobs in NY with Scrapy
Stars: ✭ 54 (-42.55%)
Mutual labels:  scrapy
Image Downloader
Download images from Google, Bing, Baidu.
Stars: ✭ 1,173 (+1147.87%)
Mutual labels:  scrapy
Pixiv Crawler
Multi-functional pixiv crawler built on the Scrapy framework
Stars: ✭ 46 (-51.06%)
Mutual labels:  scrapy
Email Extractor
The main functionality is to extract all the emails from one or several URLs
Stars: ✭ 81 (-13.83%)
Mutual labels:  scrapy
Crawlab
Distributed web crawler admin platform for managing spiders regardless of language and framework
Stars: ✭ 8,392 (+8827.66%)
Mutual labels:  scrapy
Taobao duoshou
Collects Taobao data with Scrapy and displays it with Flask
Stars: ✭ 63 (-32.98%)
Mutual labels:  scrapy
Scrapoxy
Scrapoxy hides your scraper behind a cloud. It starts a pool of proxies to send your requests. Now, you can crawl without thinking about blacklisting!
Stars: ✭ 1,322 (+1306.38%)
Mutual labels:  scrapy
Taiwan News Crawlers
Scrapy-based crawlers for Taiwanese news
Stars: ✭ 83 (-11.7%)
Mutual labels:  scrapy
Scrapy Examples
Some Scrapy and web.py examples
Stars: ✭ 71 (-24.47%)
Mutual labels:  scrapy

## Introduction

Proxy Server Crawler is a tool for crawling public proxy servers from proxy websites. When a proxy server (ip::port::type) is crawled, it automatically tests the functionality of that server.

Currently supported websites:

Currently supported tests (for HTTP proxies); a rough sketch of such a check follows this list:

  • SSL support
  • POST support
  • speed (tested against 10 frequently used sites)
  • type (high/anonymous/transparent)
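
The project runs these checks itself; as a minimal sketch of what such a test involves (not taken from this project, and the test URLs and threshold are assumptions), a proxy can be probed for liveness, speed, SSL support and POST support like this:

```python
# Illustrative only: rough proxy health check, not code from this project.
import time
import requests

def test_http_proxy(ip, port, timeout=10):
    """Rough check of liveness, speed, SSL support and POST support."""
    proxy_url = "http://{0}:{1}".format(ip, port)
    proxies = {"http": proxy_url, "https": proxy_url}
    result = {"speed": None, "ssl": False, "post": False}

    # Liveness and speed: time a plain-HTTP request routed through the proxy.
    try:
        start = time.time()
        requests.get("http://httpbin.org/ip", proxies=proxies, timeout=timeout)
        result["speed"] = int((time.time() - start) * 1000)  # milliseconds
    except requests.RequestException:
        return None  # proxy server not alive or healthy

    # SSL support: can the proxy tunnel an HTTPS request?
    try:
        requests.get("https://httpbin.org/ip", proxies=proxies, timeout=timeout)
        result["ssl"] = True
    except requests.RequestException:
        pass

    # POST support: does the proxy forward request bodies unchanged?
    try:
        r = requests.post("http://httpbin.org/post", data={"k": "v"},
                          proxies=proxies, timeout=timeout)
        result["post"] = r.ok and r.json().get("form", {}).get("k") == "v"
    except (requests.RequestException, ValueError):
        pass

    return result
```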

Requirements

  • Python >= 2.7
  • Scrapy 1.3.0 (not tested with lower versions)
  • node (for some sites, Node.js is needed to bypass a JavaScript-based WAF; see the sketch after this list)
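
As a hypothetical sketch of why node is needed (the function and workflow below are assumptions, not the project's actual code): some proxy sites wrap their pages in a JavaScript challenge, and one common workaround is to evaluate that script with node and reuse the value it computes in subsequent requests.

```python
# Hypothetical sketch, not taken from this project.
import subprocess

def solve_js_challenge(challenge_js):
    """Evaluate the site's anti-crawler script with node and return its output."""
    # node must be available on PATH; the script is expected to print the
    # token (e.g. a cookie value) that the real request has to carry.
    output = subprocess.check_output(["node", "-e", challenge_js])
    return output.decode("utf-8").strip()
```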

Usage

cd proxy_server_crawler
scrapy crawl chunzhen
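
For orientation, here is a minimal sketch of what a proxy-site spider in this style might look like. The spider name, start URL, and CSS selectors are assumptions for illustration; the real spiders (such as the chunzhen one above) live in the project's spiders package.

```python
# Illustrative sketch of a proxy-site spider; not code from this project.
import scrapy

class ExampleProxySpider(scrapy.Spider):
    name = "example_proxy_site"
    start_urls = ["http://example.com/free-proxy-list"]  # hypothetical site

    def parse(self, response):
        # A typical free-proxy page lists one server per table row.
        for row in response.css("table tr"):
            ip = row.css("td:nth-child(1)::text").extract_first()
            port = row.css("td:nth-child(2)::text").extract_first()
            if ip and port:
                # Each server (ip::port::type) is then tested automatically
                # for liveness, speed, SSL/POST support and anonymity level.
                yield {"ip": ip.strip(), "port": int(port), "type": "http"}
```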

[log]

[ result] ip: 59.41.214.218  , port: 3128 , type: http, proxy server not alive or healthy.
[ result] ip: 117.90.6.67    , port: 9000 , type: http, proxy server not alive or healthy.
[ result] ip: 117.175.183.10 , port: 8123 , speed: 984 , type: high
[ result] ip: 180.95.154.221 , port: 80   , type: http, proxy server not alive or healthy.
[ result] ip: 110.73.0.206   , port: 8123 , type: http, proxy server not alive or healthy.
[  proxy] ip: 124.88.67.54   , port: 80   , speed: 448 , type: high       , post: True , ssl: False
[ result] ip: 117.90.2.149   , port: 9000 , type: http, proxy server not alive or healthy.
[ result] ip: 115.212.165.170, port: 9000 , type: http, proxy server not alive or healthy.
[  proxy] ip: 118.123.22.192 , port: 3128 , speed: 769 , type: high       , post: True , ssl: False
[  proxy] ip: 117.175.183.10 , port: 8123 , speed: 908 , type: high       , post: True , ssl: True 

## License

The MIT License (MIT)
