
ZexinYan / Medical Report Generation

A PyTorch implementation of On the Automatic Generation of Medical Imaging Reports.

Programming Languages

python

Projects that are alternatives of or similar to Medical Report Generation

Cs231
Complete Assignments for CS231n: Convolutional Neural Networks for Visual Recognition
Stars: ✭ 317 (+217%)
Mutual labels:  image-captioning
Im2p
Tensorflow implementation of paper: A Hierarchical Approach for Generating Descriptive Image Paragraphs
Stars: ✭ 15 (-85%)
Mutual labels:  image-captioning
Cameramanager
Simple Swift class to provide all the configurations you need to create custom camera view in your app
Stars: ✭ 1,130 (+1030%)
Mutual labels:  image-captioning
Oscar
Oscar and VinVL
Stars: ✭ 396 (+296%)
Mutual labels:  image-captioning
Show Attend And Tell
TensorFlow Implementation of "Show, Attend and Tell"
Stars: ✭ 869 (+769%)
Mutual labels:  image-captioning
Bottom Up Attention
Bottom-up attention model for image captioning and VQA, based on Faster R-CNN and Visual Genome
Stars: ✭ 989 (+889%)
Mutual labels:  image-captioning
Adaptiveattention
Implementation of "Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning"
Stars: ✭ 303 (+203%)
Mutual labels:  image-captioning
Transformer image caption
Image Captioning based on Bottom-Up and Top-Down Attention model
Stars: ✭ 94 (-6%)
Mutual labels:  image-captioning
Neural Image Captioning
Implementation of Neural Image Captioning model using Keras with Theano backend
Stars: ✭ 12 (-88%)
Mutual labels:  image-captioning
Coco Cn
Enriching MS-COCO with Chinese sentences and tags for cross-lingual multimedia tasks
Stars: ✭ 57 (-43%)
Mutual labels:  image-captioning
Neuralmonkey
An open-source tool for sequence learning in NLP built on TensorFlow.
Stars: ✭ 400 (+300%)
Mutual labels:  image-captioning
Self Critical.pytorch
Unofficial PyTorch implementation of Self-critical Sequence Training for Image Captioning, and others.
Stars: ✭ 716 (+616%)
Mutual labels:  image-captioning
Image captioning
generate captions for images using a CNN-RNN model that is trained on the Microsoft Common Objects in COntext (MS COCO) dataset
Stars: ✭ 51 (-49%)
Mutual labels:  image-captioning
Virtex
[CVPR 2021] VirTex: Learning Visual Representations from Textual Annotations
Stars: ✭ 323 (+223%)
Mutual labels:  image-captioning
Image Text Papers
Image Caption and Text to Image papers.
Stars: ✭ 71 (-29%)
Mutual labels:  image-captioning
Scan
PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)
Stars: ✭ 306 (+206%)
Mutual labels:  image-captioning
Punny captions
An implementation of the NAACL 2018 paper "Punny Captions: Witty Wordplay in Image Descriptions".
Stars: ✭ 31 (-69%)
Mutual labels:  image-captioning
Arnet
CVPR 2018 - Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present
Stars: ✭ 94 (-6%)
Mutual labels:  image-captioning
Automatic Image Captioning
Generating Captions for images using Deep Learning
Stars: ✭ 84 (-16%)
Mutual labels:  image-captioning
Image Captioning
Image Captioning: Implementing the Neural Image Caption Generator with python
Stars: ✭ 52 (-48%)
Mutual labels:  image-captioning

On the Automatic Generation of Medical Imaging Reports

A PyTorch implementation of On the Automatic Generation of Medical Imaging Reports.

Details of the method can be found in the paper On the Automatic Generation of Medical Imaging Reports.

Performance

Results below are from the model checkpoint only_training/only_training/20180528-02:44:52/.

| Mode  | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | METEOR | ROUGE | CIDEr |
|-------|--------|--------|--------|--------|--------|-------|-------|
| Train | 0.386  | 0.275  | 0.215  | 0.176  | 0.187  | 0.369 | 1.075 |
| Val   | 0.303  | 0.182  | 0.118  | 0.077  | 0.143  | 0.256 | 0.214 |
| Test  | 0.316  | 0.190  | 0.123  | 0.081  | 0.148  | 0.264 | 0.221 |
| Paper | 0.517  | 0.386  | 0.306  | 0.247  | 0.217  | 0.447 | 0.327 |

Tags Prediction

(screenshot: predicted tags)

Comparison

(screenshot: comparison of results)

Visual Results

(screenshots: sample generated reports)

Training

usage: trainer.py [-h] [--patience PATIENCE] [--mode MODE]
                  [--vocab_path VOCAB_PATH] [--image_dir IMAGE_DIR]
                  [--caption_json CAPTION_JSON]
                  [--train_file_list TRAIN_FILE_LIST]
                  [--val_file_list VAL_FILE_LIST] [--resize RESIZE]
                  [--crop_size CROP_SIZE] [--model_path MODEL_PATH]
                  [--load_model_path LOAD_MODEL_PATH]
                  [--saved_model_name SAVED_MODEL_NAME] [--momentum MOMENTUM]
                  [--visual_model_name VISUAL_MODEL_NAME] [--pretrained]
                  [--classes CLASSES]
                  [--sementic_features_dim SEMENTIC_FEATURES_DIM] [--k K]
                  [--attention_version ATTENTION_VERSION]
                  [--embed_size EMBED_SIZE] [--hidden_size HIDDEN_SIZE]
                  [--sent_version SENT_VERSION]
                  [--sentence_num_layers SENTENCE_NUM_LAYERS]
                  [--dropout DROPOUT] [--word_num_layers WORD_NUM_LAYERS]
                  [--batch_size BATCH_SIZE] [--learning_rate LEARNING_RATE]
                  [--epochs EPOCHS] [--clip CLIP] [--s_max S_MAX]
                  [--n_max N_MAX] [--lambda_tag LAMBDA_TAG]
                  [--lambda_stop LAMBDA_STOP] [--lambda_word LAMBDA_WORD]

optional arguments:
  -h, --help            show this help message and exit
  --patience PATIENCE
  --mode MODE
  --vocab_path VOCAB_PATH
                        the path for vocabulary object
  --image_dir IMAGE_DIR
                        the path for images
  --caption_json CAPTION_JSON
                        path for captions
  --train_file_list TRAIN_FILE_LIST
                        the train array
  --val_file_list VAL_FILE_LIST
                        the val array
  --resize RESIZE       size for resizing images
  --crop_size CROP_SIZE
                        size for randomly cropping images
  --model_path MODEL_PATH
                        path for saving trained models
  --load_model_path LOAD_MODEL_PATH
                        The path of loaded model
  --saved_model_name SAVED_MODEL_NAME
                        The name of saved model
  --momentum MOMENTUM
  --visual_model_name VISUAL_MODEL_NAME
                        CNN model name
  --pretrained          not using pretrained model when training
  --classes CLASSES
  --sementic_features_dim SEMENTIC_FEATURES_DIM
  --k K
  --attention_version ATTENTION_VERSION
  --embed_size EMBED_SIZE
  --hidden_size HIDDEN_SIZE
  --sent_version SENT_VERSION
  --sentence_num_layers SENTENCE_NUM_LAYERS
  --dropout DROPOUT
  --word_num_layers WORD_NUM_LAYERS
  --batch_size BATCH_SIZE
  --learning_rate LEARNING_RATE
  --epochs EPOCHS
  --clip CLIP           gradient clip, -1 means no clip (default: 0.35)
  --s_max S_MAX
  --n_max N_MAX
  --lambda_tag LAMBDA_TAG
  --lambda_stop LAMBDA_STOP
  --lambda_word LAMBDA_WORD
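
For reference, a typical training invocation might look like the following; every path and hyperparameter value here is a placeholder, and the three lambda_* flags presumably weight the tag, stop, and word terms of the combined training loss from the paper.

```
python trainer.py --image_dir ./data/images \
                  --caption_json ./data/captions.json \
                  --vocab_path ./data/vocab.pkl \
                  --train_file_list ./data/train_data.txt \
                  --val_file_list ./data/val_data.txt \
                  --batch_size 16 --learning_rate 0.0001 --epochs 50
```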

Tester

usage: tester.py [-h] [--model_dir MODEL_DIR] [--image_dir IMAGE_DIR]
                 [--caption_json CAPTION_JSON] [--vocab_path VOCAB_PATH]
                 [--file_lits FILE_LITS] [--load_model_path LOAD_MODEL_PATH]
                 [--resize RESIZE] [--cam_size CAM_SIZE]
                 [--generate_dir GENERATE_DIR] [--result_path RESULT_PATH]
                 [--result_name RESULT_NAME] [--momentum MOMENTUM]
                 [--visual_model_name VISUAL_MODEL_NAME] [--pretrained]
                 [--classes CLASSES]
                 [--sementic_features_dim SEMENTIC_FEATURES_DIM] [--k K]
                 [--attention_version ATTENTION_VERSION]
                 [--embed_size EMBED_SIZE] [--hidden_size HIDDEN_SIZE]
                 [--sent_version SENT_VERSION]
                 [--sentence_num_layers SENTENCE_NUM_LAYERS]
                 [--dropout DROPOUT] [--word_num_layers WORD_NUM_LAYERS]
                 [--s_max S_MAX] [--n_max N_MAX] [--batch_size BATCH_SIZE]
                 [--lambda_tag LAMBDA_TAG] [--lambda_stop LAMBDA_STOP]
                 [--lambda_word LAMBDA_WORD]

optional arguments:
  -h, --help            show this help message and exit
  --model_dir MODEL_DIR
  --image_dir IMAGE_DIR
                        the path for images
  --caption_json CAPTION_JSON
                        path for captions
  --vocab_path VOCAB_PATH
                        the path for vocabulary object
  --file_lits FILE_LITS
                        the path for test file list
  --load_model_path LOAD_MODEL_PATH
                        The path of loaded model
  --resize RESIZE       size for resizing images
  --cam_size CAM_SIZE
  --generate_dir GENERATE_DIR
  --result_path RESULT_PATH
                        the path for storing results
  --result_name RESULT_NAME
                        the name of results
  --momentum MOMENTUM
  --visual_model_name VISUAL_MODEL_NAME
                        CNN model name
  --pretrained          not using pretrained model when training
  --classes CLASSES
  --sementic_features_dim SEMENTIC_FEATURES_DIM
  --k K
  --attention_version ATTENTION_VERSION
  --embed_size EMBED_SIZE
  --hidden_size HIDDEN_SIZE
  --sent_version SENT_VERSION
  --sentence_num_layers SENTENCE_NUM_LAYERS
  --dropout DROPOUT
  --word_num_layers WORD_NUM_LAYERS
  --s_max S_MAX
  --n_max N_MAX
  --batch_size BATCH_SIZE
  --lambda_tag LAMBDA_TAG
  --lambda_stop LAMBDA_STOP
  --lambda_word LAMBDA_WORD
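
A typical test run might look like this; all paths are placeholders (note that the flag is spelled --file_lits in the script):

```
python tester.py --model_dir ./report_models/<your-run> \
                 --image_dir ./data/images \
                 --caption_json ./data/captions.json \
                 --vocab_path ./data/vocab.pkl \
                 --file_lits ./data/test_data.txt \
                 --load_model_path <saved-model-file>
```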

Methods:

  • test(): computes the loss on the test set.
  • generate(): generates captions for each image and saves the results (JSON) to os.path.join(model_dir, result_path).
  • sample(img_name): generates a caption for a single image together with its class activation map (CAM) heatmap.
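
A minimal driver sketch of these methods, assuming tester.py exposes them on a class built from the parsed arguments; the Tester name and the image filename below are hypothetical:

```python
# Hypothetical sketch: only test(), generate(), and sample() are documented above;
# the Tester class name and its construction are assumptions about tester.py.
from tester import Tester   # assumed entry point

tester = Tester(args)                      # args parsed as in the usage block above
tester.test()                              # compute the loss on the test set
tester.generate()                          # write reports to os.path.join(model_dir, result_path)
tester.sample('CXR1000_IM-0003-1001.png')  # hypothetical image name; caption + CAM heatmap
```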

Quantify the model performance

python2 metric_performance.py
usage: metric_performance.py [-h] [--result_path RESULT_PATH]

optional arguments:
  -h, --help            show this help message and exit
  --result_path RESULT_PATH
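
This reports the same BLEU/METEOR/ROUGE/CIDEr numbers as the Performance table above. If you want to compute them outside the script, a sketch using pycocoevalcap (a Python 2 package, consistent with the python2 call above; whether metric_performance.py actually wraps it is an assumption) might look like this, with made-up sentences:

```python
# Sketch: compute the table's metrics with pycocoevalcap (Python 2 compatible).
from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.meteor.meteor import Meteor
from pycocoevalcap.rouge.rouge import Rouge
from pycocoevalcap.cider.cider import Cider

# image id -> list of tokenized sentences; the contents here are made up
gts = {'img1': ['the heart size is normal .']}          # ground truth
res = {'img1': ['heart size within normal limits .']}   # generated

for scorer, names in [(Bleu(4), ['BLEU-1', 'BLEU-2', 'BLEU-3', 'BLEU-4']),
                      (Meteor(), ['METEOR']),
                      (Rouge(), ['ROUGE']),
                      (Cider(), ['CIDEr'])]:
    scores, _ = scorer.compute_score(gts, res)          # Bleu returns a list of 4 scores
    scores = scores if isinstance(scores, list) else [scores]
    for name, score in zip(names, scores):
        print('%s: %.3f' % (name, score))
```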

Review generated captions

Open review_captions.ipynb in Jupyter to review the captions the model generated for each image.
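
If you would rather inspect the output without the notebook, a minimal sketch is below; the file path and the JSON layout (the pred/real keys) are assumptions, so adapt them to whatever generate() actually writes:

```python
# Sketch: print a few generated/reference caption pairs from the results JSON.
# The path and the 'pred'/'real' keys are assumptions about the output format.
import json

with open('results.json') as f:   # the file written by tester.py generate()
    results = json.load(f)

for image_id, item in list(results.items())[:5]:
    print(image_id)
    print('  generated :', item.get('pred'))
    print('  reference :', item.get('real'))
```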

Visualize the training procedure

Edit tensorboard.sh, changing tensorboard --logdir report_models to point at your own saved-models path, and you can visualize the training procedure:

./tensorboard.sh
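
For example, after editing, the script might contain nothing more than (the log directory is a placeholder):

```
tensorboard --logdir /path/to/your/saved_models
```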

Improve performance by changing the model

In utils/models, all models are implemented in a basic version; more powerful model structures should be able to improve performance further. So enjoy your work ^_^.
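
As one example of a stronger structure, the basic visual encoder could be swapped for a deeper torchvision backbone. The sketch below is illustrative only: the class name is not the repo's API, and it assumes the downstream co-attention consumes a set of regional feature vectors.

```python
# Illustrative sketch: a DenseNet-121 encoder producing regional features.
# The class name and output contract are assumptions, not the repo's actual API.
import torch
import torch.nn as nn
import torchvision.models as models

class DenseNetExtractor(nn.Module):
    def __init__(self, pretrained=True):
        super(DenseNetExtractor, self).__init__()
        densenet = models.densenet121(pretrained=pretrained)
        self.features = densenet.features    # convolutional trunk only
        self.out_features = 1024             # channels of the final feature map

    def forward(self, images):
        f = self.features(images)            # (N, 1024, 7, 7) for 224x224 input
        return f.flatten(2).transpose(1, 2)  # (N, 49, 1024): one vector per region

# quick check:
# x = torch.randn(2, 3, 224, 224)
# assert DenseNetExtractor(pretrained=False)(x).shape == (2, 49, 1024)
```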
