ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation (CVPR20)

ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation

This is a PyTorch implementation of the paper "ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation".

Dependency

  • This work was tested with PyTorch 1.2.0, CUDA 9.0, Python 3.6 and Ubuntu 16.04.
  • Requirements are listed in the file environmentPytorch12.yml. To create the environment from this file, run: conda env create --name pytorch1.2 --file=environmentPytorch12.yml
  • To activate the environment, run: source activate pytorch1.2

Training

  • To view the results during the training process, start a visdom server on the port you will pass as --display_port: visdom -port 8192

Supervised Training

python train.py --name_prefix demo --dataname RIMEScharH32W16 --capitalize --display_port 8192 
  • Main arguments:
    • --name: unless specified in the arguments, the experiment name is determined by the name_prefix, the dataset and parameters different from the default ones (see code in options/base_options.py).
    • --name_prefix: the prefix to the automatically generated experiment name.
    • --dataname: name of the dataset, which determines the dataroot path according to data/dataset_catalog.py.
    • --lex: the lexicon used to generate the fake images. There is a default lexicon for English/French data specified in options/base_options.py.
    • --capitalize: randomly capitalize first letters of words in the lexicon used.
    • --display_port: visdom display port
    • --checkpoints_dir: the networks weights and sample images are saved to checkpoints_dir/experiment_name.
    • --use_rnn: whether to use LSTM
    • --seed: determine the seed for numpy and pytorch instead of using a random one.
    • --gb_alpha: the balance between the recognizer and discriminator loss. Higher alpha means larger weight for the recognizer.
  • Other arguments are explained in the file options/base_options.py and options/train_options.py.
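
To make the role of --gb_alpha more concrete, here is a minimal sketch of one way a recognizer gradient could be rescaled against a discriminator gradient. It assumes the balance is done by matching standard deviations; the function name and the plain-list gradients are hypothetical and are not taken from the repository, so check models/ for the actual implementation.

```python
from statistics import pstdev


def balance_recognizer_grad(grad_r, grad_d, gb_alpha=1.0, eps=1e-12):
    """Rescale grad_r so that std(grad_r) is about gb_alpha * std(grad_d).

    Hypothetical helper: gradients are plain lists of floats here,
    whereas real training code would operate on tensors.
    """
    sigma_r = pstdev(grad_r)  # spread of the recognizer gradient
    sigma_d = pstdev(grad_d)  # spread of the discriminator gradient
    scale = gb_alpha * sigma_d / (sigma_r + eps)
    return [g * scale for g in grad_r]


# Toy gradients: the recognizer gradient is much larger than the discriminator's.
grad_d = [0.5, -0.5, 1.0, -1.0]
grad_r = [10.0, -10.0, 20.0, -20.0]

# With gb_alpha=2.0 the recognizer gradient ends up twice as "loud"
# as the discriminator gradient, regardless of its original scale.
balanced = balance_recognizer_grad(grad_r, grad_d, gb_alpha=2.0)
```

Under this assumption, a higher gb_alpha gives the recognizer a larger share of the update, which matches the description of the flag above.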

Semi-Supervised Training

python train_semi_supervised.py --dataname IAMcharH32W16rmPunct --unlabeled_dataname CVLtrH32 --disjoint
  • Main arguments:

    • --dataname: name of dataset which will determine the labeled dataroot path according to data/dataset_catalog.py. This data is used to train only the Recognizer (in the disjoint case) or the Recognizer and the Discriminator networks.
    • --unlabeled_dataname: name of dataset which will determine the unlabeled dataroot path according to data/dataset_catalog.py. This data is used to train only the Discriminator network.
    • --disjoint: train the discriminator and the recognizer disjointly (the discriminator sees only the unlabeled data and the recognizer sees only the labeled data).
  • Other arguments are explained in the file options/base_options.py and options/train_options.py.
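
The routing of data described by these arguments can be summarized in a few lines. This is only an illustrative sketch of the rules stated above; the function name and the string labels are hypothetical and do not appear in the repository.

```python
def networks_trained_on(batch_source, disjoint):
    """Return the set of networks that learn from real images of this source.

    batch_source: "labeled" (from --dataname) or
                  "unlabeled" (from --unlabeled_dataname).
    disjoint:     True when --disjoint is passed.
    """
    if batch_source == "unlabeled":
        # Unlabeled data is used to train only the Discriminator.
        return {"discriminator"}
    if batch_source == "labeled":
        # Labeled data always trains the Recognizer; it also trains the
        # Discriminator unless disjoint training was requested.
        return {"recognizer"} if disjoint else {"recognizer", "discriminator"}
    raise ValueError(f"unknown batch source: {batch_source!r}")
```

For the example command above (with --disjoint), IAM images would reach only the Recognizer and CVL images only the Discriminator.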

LMDB file generation for training data

Before generating an LMDB file, download the desired dataset into the Datasets directory.

The structure of the directories should be:

  • Datasets
    • IAM
      • wordImages (the downloaded words dataset)
      • lineImages (the downloaded lines dataset)
      • original (the downloaded xml labels data)
      • original_partition (the downloaded partition)
        • te.lst
        • tr.lst
        • va1.lst
        • va2.lst
    • RIMES
      • orig (the downloaded dataset)
        • training_WR
        • groundtruth_training_icdar2011.txt
        • testdataset_ICDAR
        • ground_truth_test_icdar2011.txt
        • valdataset_ICDAR
        • ground_truth_validation_icdar2011.txt
    • CVL
      • cvl-database-1-1 (the downloaded dataset)
        • trainset
        • testset
        • readme.txt
    • Lexicon
      • english_words.txt
      • Lexique383.tsv
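
Before running the LMDB generation, it can help to verify that the layout above is in place. The following is a small sketch using only the standard library; the helper name missing_entries and the EXPECTED table are assumptions built from the tree above, not part of the repository.

```python
from pathlib import Path

# Expected entries per dataset folder, copied from the directory tree above.
EXPECTED = {
    "IAM": [
        "wordImages", "lineImages", "original",
        "original_partition/te.lst", "original_partition/tr.lst",
        "original_partition/va1.lst", "original_partition/va2.lst",
    ],
    "RIMES": [
        "orig/training_WR", "orig/groundtruth_training_icdar2011.txt",
        "orig/testdataset_ICDAR", "orig/ground_truth_test_icdar2011.txt",
        "orig/valdataset_ICDAR", "orig/ground_truth_validation_icdar2011.txt",
    ],
    "CVL": [
        "cvl-database-1-1/trainset", "cvl-database-1-1/testset",
        "cvl-database-1-1/readme.txt",
    ],
    "Lexicon": ["english_words.txt", "Lexique383.tsv"],
}


def missing_entries(top_dir="Datasets"):
    """Return the relative paths (POSIX style) that do not exist under top_dir."""
    top = Path(top_dir)
    return [
        (Path(dataset) / rel).as_posix()
        for dataset, rels in EXPECTED.items()
        for rel in rels
        if not (top / dataset / rel).exists()
    ]


if __name__ == "__main__":
    for path in missing_entries():
        print("missing:", path)
```

Only the datasets you actually plan to use need to be present, so a non-empty result is not necessarily an error.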

To generate an LMDB file of one of the datasets CVL/IAM/RIMES/GW for training use the code:

cd data
python create_text_data.py
  • Main arguments (determined inside the file):
    • create_Dict = False: create a dictionary of the generated dataset
    • dataset = 'IAM': CVL/IAM/RIMES/gw
    • mode = 'va2': tr/te/va1/va2/all
    • labeled = True: save the labels of the images or not.
    • top_dir = 'Datasets': The directory containing the folders with the different datasets.
    • words = False: relevant for IAM/RIMES. If True, use word images; otherwise use line images.
    • Parameters relevant for IAM:
      • offline = True: use offline images
      • author_number = -1: use only images of a specific writer. If the value is -1, use all writers; otherwise use the index of the specific writer
      • remove_punc = True: remove images which include only one punctuation mark from the list ['.', '', ',', '"', "'", '(', ')', ':', ';', '!']
    • Resize parameters:
      • resize = 'noResize': type of resize, one of charResize|keepRatio|noResize. charResize resizes so that each character's width falls in a specific range (within this range the width is chosen randomly); keepRatio resizes to a specific image height while keeping the height-width aspect ratio the same; noResize does not resize the image
      • imgH = 32: height of the resized image
      • init_gap = 0: insert a gap of this many pixels before the beginning of the text
      • charmaxW = 18: the maximum character width
      • charminW = 10: the minimum character width
      • h_gap = 0: insert a gap below and above the text
      • discard_wide = True: discard images whose character width is more than 3 times the maximum allowed character width (instead of resizing them). This helps discard outlier images
      • discard_narr = True: discard images whose character width is less than one third of the minimum allowed character width

The generated LMDB will be saved in the relevant dataset folder and the dictionary will be saved in the Lexicon folder.
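
The resize parameters listed above interact in a way that is easier to see in code. The sketch below computes only the target width of an image for each resize mode. It is an illustration under stated assumptions, not the logic of create_text_data.py: it assumes the per-character width is measured after scaling the image to imgH, that one random width is drawn per image, and it ignores init_gap and h_gap. The function name target_width is hypothetical.

```python
import random


def target_width(orig_w, orig_h, n_chars, resize="noResize", imgH=32,
                 charminW=10, charmaxW=18,
                 discard_wide=True, discard_narr=True, rng=random):
    """Return the new image width, or None if the image would be discarded."""
    if resize == "noResize":
        return orig_w

    # Width after scaling to height imgH while keeping the aspect ratio.
    ratio_w = max(1, round(orig_w * imgH / orig_h))
    if resize == "keepRatio":
        return ratio_w

    if resize == "charResize":
        char_w = ratio_w / n_chars  # average width of one character
        if discard_wide and char_w > 3 * charmaxW:
            return None  # outlier: characters far too wide
        if discard_narr and char_w < charminW / 3:
            return None  # outlier: characters far too narrow
        # Pick a per-character width inside the allowed range.
        return n_chars * rng.randint(charminW, charmaxW)

    raise ValueError(f"unknown resize mode: {resize!r}")
```

For a 5-character word, for example, charResize with the default range yields a width between 50 and 90 pixels, while an image whose characters average more than 54 pixels wide at height 32 would be dropped.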

Generating an LMDB file with GAN data

python generate_wordsLMDB.py --dataname IAMcharH32rmPunct --results_dir ./lmdb_files/IAM_concat --n_synth 100,200 --name model_name 
  • Main arguments:
    • --dataname: name of the dataset, which determines the dataroot path according to data/dataset_catalog.py. Note that this dataset will be concatenated with the generated images.
    • --no_concat_dataset: ignore "dataname" (the previous parameter) and do not concatenate
    • --results_dir: path to the results; it will be concatenated with "n_synth"
    • --n_synth: number of examples to generate in thousands
    • --name: name of model used to generate the images
    • --lex: lexicon used to generate the images
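
As a rough illustration of how --n_synth and --results_dir relate, the sketch below expands a value such as 100,200 into image counts and output paths. It assumes that a comma-separated value requests one output per entry, and the "_<n>K" suffix is a hypothetical naming scheme; the actual format is defined in generate_wordsLMDB.py.

```python
def plan_synthetic_runs(n_synth, results_dir):
    """Expand an --n_synth string into (image count, output path) plans.

    n_synth is given in thousands, e.g. "100,200".
    The output naming below is an assumption made for illustration only.
    """
    runs = []
    for token in n_synth.split(","):
        thousands = int(token.strip())
        runs.append({
            "n_images": thousands * 1000,               # thousands -> images
            "out_dir": f"{results_dir}_{thousands}K",   # hypothetical suffix
        })
    return runs


runs = plan_synthetic_runs("100,200", "./lmdb_files/IAM_concat")
```

With the example command above, this would correspond to one set of 100,000 generated images and one of 200,000.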

Main Folders

The structure of the code is based on the structure of the CycleGAN code.

  1. data/ - Folder containing functions relating to the data, including generation, data loading, alphabets and a catalog which translates dataset names into folder locations. The dataset_catalog should be updated according to the path to the LMDB you are using.
  2. models/ - Folder containing the models (with the forward, backward and optimization functions) and the network architectures. The generator and discriminator architectures are based on BigGAN. The recognizer architecture is based on crnn.
  3. options/ - Files containing the arguments for the training and data generation process.
  4. plots/ - Python notebook files with visualizations of the data.
  5. util/ - General functions that are used across packages, such as loss definitions.
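
Since the catalog in data/ must be edited to point at your own LMDB files, here is a minimal sketch of what a name-to-path catalog can look like. The dictionary name, the lookup helper and every path below are hypothetical placeholders; the real structure is the one in data/dataset_catalog.py.

```python
# Hypothetical catalog: dataset names used in this README mapped to
# placeholder LMDB locations. Replace the paths with your own.
DATASET_CATALOG = {
    "RIMEScharH32W16": "Datasets/RIMES/<your_lmdb_folder>",
    "IAMcharH32W16rmPunct": "Datasets/IAM/<your_lmdb_folder>",
    "CVLtrH32": "Datasets/CVL/<your_lmdb_folder>",
}


def get_dataroot(dataname, catalog=DATASET_CATALOG):
    """Translate a --dataname value into the folder holding its LMDB."""
    try:
        return catalog[dataname]
    except KeyError:
        known = ", ".join(sorted(catalog))
        raise KeyError(
            f"Unknown dataset {dataname!r}; known datasets: {known}"
        ) from None
```

Keeping the mapping in one place means the training commands only need the short dataset name, not a full path.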

Citation

If you use this code for your research, please cite our paper.

@inproceedings{fogel2020scrabblegan,
    title={ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation},
    author={Sharon Fogel and Hadar Averbuch-Elor and Sarel Cohen and Shai Mazor and Roee Litman},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2020}
}

License

ScrabbleGAN is released under the MIT license. See the LICENSE and THIRD-PARTY-NOTICES.txt files for more information.

Contributing

Your contributions are welcome!
See CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.
