
Halfish / Captcha-Cracking

Licence: other
Crack numeric and Chinese CAPTCHAs with both traditional and deep learning methods, based on Torch and Python.



Captcha-Cracking Program Using Torch

This program aims to crack the CAPTCHAs of several websites, using both traditional and deep learning methods.

1. Traditional Methods

With traditional methods, we first preprocess the image: remove the background noise and apply slant correction if the characters are rotated. Then we cut out each single character and train a classifier to recognize it.
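
The preprocessing and cutting steps can be sketched in plain Python as follows. This is only an illustration under simplifying assumptions, not the code used in this repository: the image is a small list of grayscale rows, noise is limited to isolated specks, characters are separated by empty columns, and slant correction and the classifier are left out. All function names are made up for the example.

```python
def binarize(gray, threshold=128):
    """Map grayscale pixels (0-255) to 1 for ink (dark) and 0 for background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def remove_isolated_pixels(binary):
    """Clear ink pixels that have no ink among their 8 neighbours (speckle noise)."""
    h, w = len(binary), len(binary[0])
    cleaned = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                cleaned[y][x] = 0
    return cleaned


def segment_columns(binary):
    """Split the image at empty columns; return (start, end) ranges, end exclusive."""
    width = len(binary[0])
    column_ink = [sum(row[x] for row in binary) for x in range(width)]
    spans, start = [], None
    for x, ink in enumerate(column_ink):
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, width))
    return spans


# Toy image: two dark bars (the "characters") plus one noise speck bottom right.
gray = [
    [255, 0, 0, 255, 255, 0, 0, 255, 255, 255],
    [255, 0, 0, 255, 255, 0, 0, 255, 255, 255],
    [255, 0, 0, 255, 255, 0, 0, 255, 255, 255],
    [255, 255, 255, 255, 255, 255, 255, 255, 255, 0],
]
noisy_spans = segment_columns(binarize(gray))
spans = segment_columns(remove_isolated_pixels(binarize(gray)))
print(spans)  # [(1, 3), (5, 7)]
```

Without the denoising step the speck would be cut out as a third "character", which is why preprocessing comes before segmentation.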

2. Deep Learning Methods

In this program, we mainly use a Convolutional Neural Network model developed by Google. It differs slightly from LeNet-5 and was originally designed to extract street view house numbers (SVHN) from Google Street View imagery. The original article is "Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks".
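
As the cited article describes it, such a model predicts a distribution over the sequence length plus one character distribution per position, and the answer is the combination with the highest joint probability. The sketch below shows only that decoding step, assuming the network outputs are already available as plain Python lists. The function names and the toy probabilities are invented for illustration, and this is not the repository's svhn code.

```python
import math


def peaked(cls, p, n_classes=10):
    """Toy distribution: probability p on class cls, the rest spread evenly."""
    rest = (1.0 - p) / (n_classes - 1)
    return [p if c == cls else rest for c in range(n_classes)]


def decode_sequence(length_probs, char_probs):
    """Pick the most probable character sequence.

    length_probs[k] is P(length == k); char_probs[i][c] is P(character i == c).
    The score of a length k is log P(k) plus the log-probabilities of the best
    character at each of the first k positions.
    """
    best_chars = [max(range(len(p)), key=p.__getitem__) for p in char_probs]
    best_score, best_len = -math.inf, 0
    for length, p_len in enumerate(length_probs):
        if p_len <= 0 or length > len(char_probs):
            continue
        score = math.log(p_len) + sum(
            math.log(char_probs[i][best_chars[i]]) for i in range(length)
        )
        if score > best_score:
            best_score, best_len = score, length
    return best_chars[:best_len]


# Three output positions; the length head strongly favours two characters.
char_probs = [peaked(4, 0.9), peaked(2, 0.7), peaked(7, 0.4)]
decoded = decode_sequence([0.0, 0.1, 0.8, 0.1], char_probs)
print(decoded)  # [4, 2]
```

Predicting the length jointly with the characters is what lets one network read a whole CAPTCHA without first cutting it into single characters.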

About Torch7 & OpenCV

Torch7 is a scientific computing framework based on Lua. It makes it easy to build complex deep learning models. Install Torch7 with the following commands:

git clone https://github.com/torch/distro.git ~/torch --recursive
cd ~/torch; bash install-deps
./install.sh
source ~/.zshrc

OpenCV is an open source computer vision library. We use OpenCV to preprocess the image before recognizing it, mainly through its Python interface. Install OpenCV through apt-get:

sudo apt-get install python-opencv

web service

Tornado is a Python web framework and asynchronous networking library. Install tornado-4.3 with pip, and use Redis to connect Tornado and Torch7.

sudo apt-get install tesseract-ocr-chi-sim
sudo pip install pytesseract
sudo pip install futures
sudo pip install tornado
sudo apt-get install redis-server
redis-server &
sudo pip install redis      # install redis for python interface
luarocks install redis-lua  # install redis for lua interface

# start service
git clone https://github.com/Halfish/Captcha-Cracking.git
cd captcha-web
nohup python upload.py > logpy.txt &
nohup th breaking.lua -gpuid 1 > loglua.txt &
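
One common way to connect a Python web process and a separate worker through Redis is a job list plus per-job result keys. The sketch below illustrates that pattern only. The key names, the JSON job format and the helper functions are hypothetical, and the repository's own protocol may differ. The functions accept any client offering lpush, rpop, set and get (a redis-py client does), and a small in-memory stand-in is included so the example runs without a server.

```python
import json
import uuid

QUEUE_KEY = "captcha:jobs"          # hypothetical key name
RESULT_PREFIX = "captcha:result:"   # hypothetical key prefix


class InMemoryClient:
    """Stand-in exposing the four Redis commands used below (no server needed)."""

    def __init__(self):
        self.lists, self.values = {}, {}

    def lpush(self, key, value):
        self.lists.setdefault(key, []).insert(0, value)

    def rpop(self, key):
        items = self.lists.get(key)
        return items.pop() if items else None

    def set(self, key, value):
        self.values[key] = value

    def get(self, key):
        return self.values.get(key)


def submit_job(client, image_path, captcha_type):
    """Web side: push one recognition job onto the list and return its id."""
    job_id = uuid.uuid4().hex
    job = {"id": job_id, "path": image_path, "type": captcha_type}
    client.lpush(QUEUE_KEY, json.dumps(job))
    return job_id


def worker_step(client, recognise):
    """Worker side (written in Python here for illustration): handle one job."""
    raw = client.rpop(QUEUE_KEY)
    if raw is None:
        return False
    job = json.loads(raw)
    client.set(RESULT_PREFIX + job["id"], recognise(job))
    return True


def fetch_result(client, job_id):
    """Web side: return the answer, or None if the worker has not replied yet."""
    value = client.get(RESULT_PREFIX + job_id)
    if value is None:
        return None
    return value.decode("utf-8") if isinstance(value, bytes) else value


client = InMemoryClient()
job_id = submit_job(client, "5000.png", 4)
pending = fetch_result(client, job_id)             # nothing yet
worker_step(client, lambda job: "stub-answer")     # stub recogniser
answer = fetch_result(client, job_id)
```

With a real server, `InMemoryClient()` would be replaced by a redis-py client such as `redis.StrictRedis(host="localhost", port=6379)`, and the worker role would be played by the Torch7 process.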

Model A: SVHN Model

For type1 to type10 CAPTCHAs, our model names are always prefixed with svhn, after the SVHN model described above. Follow these steps to train a CAPTCHA recognizer manually.

Step 0: Go to the ./src/ subdirectory

cd src/

Step 1: Generate synthetic pictures with labels

# to see how to use engine
python engine.py -h

# generate 1000 type 2 pictures, saving in ../trainpic/type2/
python engine.py -t 2 -n 1000 -d ../trainpic/type2
python engine.py -t 6 -n 1000 -d ../trainpic/type6

Step 2: Dump full data set

# to see how to dump data
th svhn_dump.lua -h

# dump 1000 type 2 pictures for every font
th svhn_dump.lua -persize 1000 -type 2 -datadir ../trainpic/type2 -savename type2_1000.dat
th svhn_dump.lua -persize 1000 -type 6 -datadir ../trainpic/type6 -savename type6_1000.dat

Step 3: Train the model

# to see how to train a model
th svhn_train.lua -h

# use GPU 2 (GPU ids start from 1) to train a CNN model for type1 CAPTCHA
th svhn_train.lua -gpuid 2 -type 1 -dataname type1_data.dat -savename model_type1.t7
th svhn_train.lua -gpuid 2 -type 6 -dataname type6_1000.dat -savename model_type6.t7

Model B: Simple Model

Some types of CAPTCHA place every character at a fixed position, so we can cut the characters out and recognize them with any simple classifier. Preprocessing is still essential. Our type4 CAPTCHA, which covers four websites belonging to four provinces, can be cracked this way: chongqing (chq), gansu (gs), ningxia (nx) and tianjin (tj). Here are some details.

Step 0: Go to the ./src/ subdirectory

cd src/

Step 1: Generate some pictures with labels

python type4_cutAndDump.py chq
python type4_cutAndDump.py gs
python type4_cutAndDump.py nx
python type4_cutAndDump.py tj

This script will generate some pictures under ./trainpic/type4/
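
The idea of cutting characters at fixed positions can be illustrated with a few lines of plain Python. This is a sketch under stated assumptions rather than the contents of type4_cutAndDump.py: the image is a list of pixel rows, and the box coordinates are invented for the example, since the real positions depend on each website's CAPTCHA layout.

```python
def crop(image, left, top, width, height):
    """Return the sub-image covered by one box; image is a list of pixel rows."""
    return [row[left:left + width] for row in image[top:top + height]]


def cut_fixed_positions(image, boxes):
    """Cut one piece per (left, top, width, height) box, in the given order."""
    return [crop(image, *box) for box in boxes]


# Toy 4x12 "image" whose pixel values encode their position (x + 100 * y).
image = [[x + 100 * y for x in range(12)] for y in range(4)]

# Hypothetical layout: four characters, each 3 pixels wide and 4 pixels tall.
boxes = [(0, 0, 3, 4), (3, 0, 3, 4), (6, 0, 3, 4), (9, 0, 3, 4)]

pieces = cut_fixed_positions(image, boxes)
print(len(pieces), pieces[1][0])  # 4 [3, 4, 5]
```

Because the boxes never change for a given website, no segmentation algorithm is needed, and each piece can go straight to a small per-character classifier.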

Step 2: Dump data before training

th type4_dump.lua -province chq -typename num
th type4_dump.lua -province chq -typename symb

th type4_dump.lua -province gs -typename num
th type4_dump.lua -province gs -typename symb

th type4_dump.lua -province nx -typename num
th type4_dump.lua -province nx -typename symb

th type4_dump.lua -province tj -typename num
th type4_dump.lua -province tj -typename symb

You can manually move the *.dat to ../data/ for better directory organization.

Step 3: Train the model

th type4_train.lua -maxiters 300 -model chq -type num -datpath ../data/type4_chq_num.dat

Step 4: Prediction

We have 200 unlabeled pictures prepared for prediction. You can also predict a single picture.

th type4_predict.py -province chq -picpath ../testpic/type4/chq/5000.png
th type4_predict.py -province chq 