liazylee / pronhubSpider

Licence: other
An imitation of the WebHubBot project, which crawls Pornhub. It is too slow, and I am looking for ways to improve it.

Programming Languages

python

Projects that are alternatives of or similar to pronhubSpider

Bgabanner Android
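Swipe navigation for onboarding screens + infinite carousel when there is one or more pages + carousel effects with various transition animations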
Stars: ✭ 4,060 (+10584.21%)
Mutual labels:  splash
PornHub-Downloader
A PornHub video download tool based on Aiohttp and Pyppeteer, with support for parallel multi-task downloads.
Stars: ✭ 20 (-47.37%)
Mutual labels:  pornhub
splash-screen
Android library for getting a nice and simple SplashScreen into your Android app
Stars: ✭ 107 (+181.58%)
Mutual labels:  splash
lgcrawl
Crawls job listings across the entire Lagou site with python+scrapy+splash
Stars: ✭ 22 (-42.11%)
Mutual labels:  splash
SmartPutty
Multi-Tabbed PuTTY written in Java
Stars: ✭ 34 (-10.53%)
Mutual labels:  splash
kodi-repo-gaymods
Kodi Repo Gay Mods
Stars: ✭ 77 (+102.63%)
Mutual labels:  pornhub
SharpGrabber
Download from YouTube, Vimeo, PornHub, HLS (M3U8 files) with .NET and JavaScript, Library and desktop app for downloading high quality media
Stars: ✭ 138 (+263.16%)
Mutual labels:  pornhub
ReactNativeStarterKits
Agiletech React Native Starter Kits
Stars: ✭ 21 (-44.74%)
Mutual labels:  splash
tinyPornManager
Made for pornhub. Fork from tinyMediaManager v3
Stars: ✭ 57 (+50%)
Mutual labels:  pornhub
ionic-resource-generator
Painless, Offline First, No Dependency, Ionic resources generator
Stars: ✭ 31 (-18.42%)
Mutual labels:  splash
just-tit
Adult video search engine
Stars: ✭ 60 (+57.89%)
Mutual labels:  pornhub
HisokaBOT-Whatsapp-Bot
Whatsapp Bot - Node Js.
Stars: ✭ 75 (+97.37%)
Mutual labels:  pornhub
godot-awesome-splash
Collection of splash screens in Godot
Stars: ✭ 137 (+260.53%)
Mutual labels:  splash
Python3 Spider
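Python crawlers in practice: simulated login for major websites, including but not limited to slider verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️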
Stars: ✭ 2,129 (+5502.63%)
Mutual labels:  splash
Splashy
Splash screen library for Android
Stars: ✭ 112 (+194.74%)
Mutual labels:  splash
Logoly
A Pornhub Flavour Logo Generator
Stars: ✭ 6,242 (+16326.32%)
Mutual labels:  pornhub
Android-Touch-Helper
Splash skip: an assistant that automatically skips splash-screen ads on Android
Stars: ✭ 488 (+1184.21%)
Mutual labels:  splash
SplashScreen
A demo project showcasing different methods to create splash screen in Android and discusses the details in the companion Medium article.
Stars: ✭ 37 (-2.63%)
Mutual labels:  splash
BNSBoost
A simple launcher for Blade & Soul patches. Working as of the Fire and Blood game update.
Stars: ✭ 19 (-50%)
Mutual labels:  splash
vue-splash
splash plugin for vue js
Stars: ✭ 120 (+215.79%)
Mutual labels:  splash

pronhubSpider

An imitation of the WebHubBot project, which crawls Pornhub. It is too slow, and I am looking for ways to improve it.

Disclaimer: This project is intended for studying the Scrapy spider framework and the MongoDB database. It must not be used for commercial or other personal purposes. Anyone who misuses it bears the responsibility themselves.

  • The project is mainly used for crawling PornHub.com, the largest site of its kind in the world. It retrieves each video's title, duration, mp4 file link, cover URL, and the direct URL of the video page on the website.
  • This project crawls PornHub.com slowly, but it has a simple structure.
  • This project is meant to crawl up to 5 million of the website's videos per day (this does not work in practice), depending on your network. Because of my limited bandwidth, my results are relatively slow.
  • The crawler runs 10 threads at a time, which is how it is meant to reach the speed mentioned above. If your network is faster, you can use more threads and crawl more videos per day. For the specific configuration, see [Pre-boot configuration].
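To put the 5 million per day figure in perspective, here is a rough back-of-the-envelope calculation. It assumes, purely for illustration, that each video costs exactly one request; the function names are hypothetical and not part of the project.

```python
# Back-of-the-envelope check of the "5 million videos per day" target.
# Assumption (not from the project): one video costs exactly one request.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def required_rate(videos_per_day: int) -> float:
    """Videos that must be fetched per second to hit the daily target."""
    return videos_per_day / SECONDS_PER_DAY


def max_seconds_per_request(videos_per_day: int, concurrency: int) -> float:
    """Longest average request time that still meets the target."""
    return concurrency / required_rate(videos_per_day)


if __name__ == "__main__":
    rate = required_rate(5_000_000)
    budget = max_seconds_per_request(5_000_000, concurrency=10)
    print(f"required rate: {rate:.1f} videos/s")
    print(f"time budget per request with 10 in flight: {budget:.3f} s")
```

Under that assumption the target works out to roughly 58 videos per second, so with 10 requests in flight each one would have to finish in about 0.17 seconds on average. That is hard to reach on a slow connection, which is consistent with the note above that the figure is not achieved in practice.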

Environment, Architecture

Language: Python 3.6

Environment: Ubuntu, 4 GB RAM

Database: MongoDB

  • Mainly uses the Scrapy crawler framework.
  • Each request is given a cookie, a User-Agent, and a Tor IP picked at random from the cookie pool, the UA pool, and the Tor IP pool. (If you are not in China, Tor is not needed.)
  • start_requests starts five Requests based on the website's categories and crawls one category.
  • Supports paginated crawling: each following page is added to the request queue.
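The random selection from the cookie and UA pools could look roughly like the sketch below. This is not the project's actual code: the class name, the pool contents, and the `rng` parameter are placeholders made up for illustration, and the Tor IP pool is left out. The only Scrapy convention relied on is that a downloader middleware exposes `process_request(request, spider)` and returns `None` to let the request continue.

```python
import random

# Placeholder pools; the real project's values are not shown in this README.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/2.0",
]
COOKIES = [
    {"platform": "pc"},
    {"platform": "mobile"},
]


class RandomIdentityMiddleware:
    """Scrapy-style downloader middleware (hypothetical name).

    Scrapy calls process_request() for every outgoing request; returning
    None tells it to keep processing the request as usual.
    """

    def __init__(self, user_agents=USER_AGENTS, cookies=COOKIES, rng=None):
        self.user_agents = list(user_agents)
        self.cookies = list(cookies)
        self.rng = rng or random.Random()

    def process_request(self, request, spider):
        # Pick one User-Agent and one cookie set independently at random.
        request.headers["User-Agent"] = self.rng.choice(self.user_agents)
        request.cookies = dict(self.rng.choice(self.cookies))
        return None
```

To take effect, a class like this would be registered in DOWNLOADER_MIDDLEWARES with a priority below 700, so that it runs before Scrapy's built-in CookiesMiddleware reads `request.cookies`.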

Instructions for use

Pre-boot configuration

  • Install MongoDB and start it; no configuration is needed.
  • Install the Python dependencies: Scrapy, pymongo, requests, scrapy-splash, or run pip install -r requirements.txt
  • Modify the configuration as needed, such as the interval time, the number of threads, etc.
  • Install Splash and run it with Docker.
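As a rough illustration of the settings involved in the steps above, a Scrapy settings.py might contain entries like the following. The values are examples, not the project's actual configuration. CONCURRENT_REQUESTS and DOWNLOAD_DELAY are standard Scrapy settings, the Splash entries follow the scrapy-splash README, and MONGO_URI and MONGO_DATABASE are hypothetical names for custom settings that a MongoDB pipeline would read.

```python
# settings.py (sketch). Values are illustrative, not the project's own.

# Built-in Scrapy settings
CONCURRENT_REQUESTS = 10  # the "number of threads" mentioned above
DOWNLOAD_DELAY = 0.5      # interval between requests in seconds (assumed value)

# scrapy-splash settings, as documented in the scrapy-splash README
SPLASH_URL = "http://localhost:8050"  # assumes Splash runs locally in Docker
DOWNLOADER_MIDDLEWARES = {
    "scrapy_splash.SplashCookiesMiddleware": 723,
    "scrapy_splash.SplashMiddleware": 725,
    "scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware": 810,
}
SPIDER_MIDDLEWARES = {
    "scrapy_splash.SplashDeduplicateArgsMiddleware": 100,
}
DUPEFILTER_CLASS = "scrapy_splash.SplashAwareDupeFilter"

# Hypothetical custom settings for a MongoDB pipeline
MONGO_URI = "mongodb://localhost:27017"  # default port of an unconfigured MongoDB
MONGO_DATABASE = "webhub"                # placeholder database name
```

Note that Scrapy handles concurrency asynchronously rather than with real threads, so CONCURRENT_REQUESTS is the closest equivalent to the thread count described in this README.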

Start up

  • cd WebHub
  • python quickstart.py

Database description

The table (a MongoDB collection) that holds the data is PhRes. Its fields are described below:

PhRes table:

video_title:     The title of the video; also used as the unique key.
link_url:        Link to the video's page on the website
image_url:       Link to the video's cover image
video_duration:  The length of the video, in seconds
quality_480p:    Download path of the 480p mp4 video
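A minimal sketch of how a record with these fields could be written so that video_title stays unique is shown below. The function names are hypothetical and this is not the project's pipeline code. The `collection` argument is assumed to be a pymongo Collection for PhRes; only its `update_one()` method is used.

```python
def to_phres_document(item: dict) -> dict:
    """Keep only the five PhRes fields described above."""
    fields = ("video_title", "link_url", "image_url",
              "video_duration", "quality_480p")
    return {name: item.get(name) for name in fields}


def save_video(collection, item: dict) -> None:
    """Insert the video, or update it if the title is already stored.

    `collection` is expected to behave like a pymongo Collection.
    """
    doc = to_phres_document(item)
    collection.update_one(
        {"video_title": doc["video_title"]},  # match on the unique title
        {"$set": doc},                        # overwrite the stored fields
        upsert=True,                          # insert when no match exists
    )
```

With pymongo, uniqueness can also be enforced by the database itself by calling `collection.create_index("video_title", unique=True)` once before crawling.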

Please let me know about possible improvements or anything that is unclear. Email, issues, and pushes are all welcome.
