
Wentong-DST / Im2p

License: MIT
TensorFlow implementation of the paper: A Hierarchical Approach for Generating Descriptive Image Paragraphs

Programming Languages

lua

Projects that are alternatives of or similar to Im2p

Robot Surgery Segmentation
Winning solution and its improvement for MICCAI 2017 Robotic Instrument Segmentation Sub-Challenge
Stars: ✭ 528 (+3420%)
Mutual labels:  medical-imaging
Torchio
Medical image preprocessing and augmentation toolkit for deep learning
Stars: ✭ 708 (+4620%)
Mutual labels:  medical-imaging
Ganseg
Framework for medical image segmentation using deep neural networks
Stars: ✭ 18 (+20%)
Mutual labels:  medical-imaging
Pyradiomics
Open-source python package for the extraction of Radiomics features from 2D and 3D images and binary masks. Support: https://discourse.slicer.org/c/community/radiomics
Stars: ✭ 563 (+3653.33%)
Mutual labels:  medical-imaging
Dicom
⚡High Performance DICOM Medical Image Parser in Go.
Stars: ✭ 643 (+4186.67%)
Mutual labels:  medical-imaging
Self Critical.pytorch
Unofficial PyTorch implementation of "Self-critical Sequence Training for Image Captioning" and others
Stars: ✭ 716 (+4673.33%)
Mutual labels:  image-captioning
Omninet
Official Pytorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
Stars: ✭ 448 (+2886.67%)
Mutual labels:  image-captioning
Show Attend And Tell
TensorFlow Implementation of "Show, Attend and Tell"
Stars: ✭ 869 (+5693.33%)
Mutual labels:  image-captioning
Fo Dicom
Fellow Oak DICOM for .NET, .NET Core, Universal Windows, Android, iOS, Mono and Unity
Stars: ✭ 674 (+4393.33%)
Mutual labels:  medical-imaging
Slicergitsvnarchive
Multi-platform, free open source software for visualization and image computing.
Stars: ✭ 896 (+5873.33%)
Mutual labels:  medical-imaging
Kaggle ndsb2017
Kaggle Data Science Bowl 2017
Stars: ✭ 599 (+3893.33%)
Mutual labels:  medical-imaging
All About The Gan
All About the GANs (Generative Adversarial Networks) - Summarized lists for GANs
Stars: ✭ 630 (+4100%)
Mutual labels:  medical-imaging
Itk
Insight Toolkit (ITK) -- Official Repository. ITK builds on a proven, spatially-oriented architecture for processing, segmentation, and registration of scientific images in two, three, or more dimensions.
Stars: ✭ 801 (+5240%)
Mutual labels:  medical-imaging
Medicalzoopytorch
A pytorch-based deep learning framework for multi-modal 2D/3D medical image segmentation
Stars: ✭ 546 (+3540%)
Mutual labels:  medical-imaging
Mousemorph
Tools for MRI mouse brain morphometry
Stars: ✭ 19 (+26.67%)
Mutual labels:  medical-imaging
Ctk
A set of common support code for medical imaging, surgical navigation, and related purposes.
Stars: ✭ 498 (+3220%)
Mutual labels:  medical-imaging
Medicaltorch
A medical imaging framework for Pytorch
Stars: ✭ 716 (+4673.33%)
Mutual labels:  medical-imaging
Neural Image Captioning
Implementation of Neural Image Captioning model using Keras with Theano backend
Stars: ✭ 12 (-20%)
Mutual labels:  image-captioning
Medicaldetectiontoolkit
The Medical Detection Toolkit contains 2D + 3D implementations of prevalent object detectors such as Mask R-CNN, Retina Net, Retina U-Net, as well as a training and inference framework focused on dealing with medical images.
Stars: ✭ 917 (+6013.33%)
Mutual labels:  medical-imaging
Deepmedic
Efficient Multi-Scale 3D Convolutional Neural Network for Segmentation of 3D Medical Scans
Stars: ✭ 809 (+5293.33%)
Mutual labels:  medical-imaging

im2p

TensorFlow implementation of the paper: A Hierarchical Approach for Generating Descriptive Image Paragraphs.

Thanks to the original repo author chenxinpeng.

I haven't fine-tuned the parameters, but I achieve the metric scores reported by chenxinpeng: metric scores

Please feel free to ask questions in Issues.

Step 1

Configure the Torch running environment. Upgrade to TensorFlow v1.2 or above. Install Torch; I recommend the approach described in "Installing Torch without root privileges". Then set up the running environment by following the densecap instructions step by step.

To verify the running environment, run the script:

$ th check_lua_packages.lua

Also clone pycocoevalcap into the same directory. I have written some patches to fix some bugs, so replace [bleu.py, cider.py, meteor.py, rouge.py] with their corresponding files in the pycocoevalcap folder.
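
If you patch these files yourself instead, keep the scorer interface they expose unchanged: each scorer takes two dicts mapping an image id to a list of sentences and returns the corpus-level score. A minimal sketch of that interface (the image id and captions below are invented for illustration):

from pycocoevalcap.bleu.bleu import Bleu
from pycocoevalcap.cider.cider import Cider
from pycocoevalcap.meteor.meteor import Meteor
from pycocoevalcap.rouge.rouge import Rouge

# Both dicts map an image id to a list of sentences.
gts = {"img1": ["a man rides a horse on the beach"]}  # ground truth
res = {"img1": ["a man is riding a horse"]}           # generated

for name, scorer in [("BLEU", Bleu(4)), ("CIDEr", Cider()),
                     ("METEOR", Meteor()), ("ROUGE_L", Rouge())]:
    score, _ = scorer.compute_score(gts, res)
    print(name, score)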

Step 2

Download the Visual Genome dataset; this gives us the two image folders, VG_100K and VG_100K_2. Following the paper, also download the train, val, and test split JSON files. These three JSON files list the image names of the train, validation, and test data. Save them into the data folder.

Run the script:

$ python split_dataset.py

This selects the images from the Visual Genome dataset that the authors used in the paper.
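
For reference, here is a hypothetical sketch of what split_dataset.py amounts to; the split-file names and layout below are assumptions, and the script in this repo is authoritative:

import json, os, shutil

IMG_DIRS = ["VG_100K", "VG_100K_2"]  # the two Visual Genome image folders

for split in ["train", "val", "test"]:
    with open("data/%s_split.json" % split) as f:   # assumed file name
        img_ids = json.load(f)                      # assumed: a flat list of ids
    out_dir = os.path.join("data", split)
    if not os.path.exists(out_dir):
        os.makedirs(out_dir)
    for img_id in img_ids:
        fname = "%s.jpg" % img_id
        for d in IMG_DIRS:  # each image lives in exactly one of the two folders
            src = os.path.join(d, fname)
            if os.path.exists(src):
                shutil.copy(src, out_dir)
                break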

Step 3

Run the script:

$ python get_imgs_path.py

We will get three txt files: imgs_train_path.txt, imgs_val_path.txt, and imgs_test_path.txt, which list the paths of the train, val, and test images; a rough sketch of this step follows below.
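
As an illustration, get_imgs_path.py can be as simple as the following hypothetical sketch (the per-split folder layout is an assumption carried over from the previous step):

import os

for split in ["train", "val", "test"]:
    split_dir = os.path.abspath(os.path.join("data", split))
    with open("imgs_%s_path.txt" % split, "w") as f:
        for fname in sorted(os.listdir(split_dir)):
            f.write(os.path.join(split_dir, fname) + "\n")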

After this, we use densecap to extract features.

Step 4

Run the script:

$ ./download_pretrained_model.sh

This downloads the pre-trained model densecap-pretrained-vgg16.t7. Then, following the paper, we extract 50 boxes and their features from each image by running the script:

$ ./extract_features.sh

which runs the following command:

$ th extract_features.lua -boxes_per_image 50 -max_images -1 -input_txt imgs_train_path.txt \
                          -output_h5 ./data/im2p_train_output.h5 -gpu -1 -use_cudnn 0

Note that -gpu -1 means running on the CPU only; use it when cuDNN fails to run properly in Torch.

Also note that the hdf5 module always crashes for me in Torch, so I rewrote the feature-saving part of extract_features.lua to dump the features directly to disk first, then used h5py in Python to convert them into HDF5 format. Run this script:

$ ./convert-to-hdf5.sh
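
A hypothetical sketch of the Python side of that conversion, assuming the modified extract_features.lua dumps one file per image under ./feats/, each holding a (50, 4096) array of box features; the shapes, paths, and dataset name here are assumptions:

import os
import h5py
import numpy as np

with open("imgs_train_path.txt") as f:
    img_paths = [line.strip() for line in f if line.strip()]

with h5py.File("data/im2p_train_output.h5", "w") as h5:
    feats = h5.create_dataset("feats", (len(img_paths), 50, 4096), dtype="float32")
    for i, path in enumerate(img_paths):
        name = os.path.splitext(os.path.basename(path))[0]
        feats[i] = np.load(os.path.join("feats", name + ".npy"))  # assumed dump layout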

Step 5

Run the script:

$ python parse_json.py

In this step, we process the paragraphs_v1.json file for training and testing. It looks like this: paragraphs_v1.json

We get the img2paragraph file in the ./data directory. Its structure is like this: img2paragraph
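
For orientation, a hypothetical sketch of what parse_json.py does; the field names come from the public paragraph annotations, but the sentence splitting and output format here are assumptions:

import json
import pickle

with open("paragraphs_v1.json") as f:
    records = json.load(f)  # list of {"image_id": ..., "paragraph": ...}

img2paragraph = {}
for r in records:
    # naive sentence split on periods; the real script may tokenize differently
    sentences = [s.strip() for s in r["paragraph"].split(".") if s.strip()]
    img2paragraph[r["image_id"]] = sentences

with open("data/img2paragraph", "wb") as f:
    pickle.dump(img2paragraph, f)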

Step 6

Finally, we can train and test the model. In the terminal:

$ CUDA_VISIBLE_DEVICES=0 ipython
>>> import HRNN_paragraph_batch
>>> HRNN_paragraph_batch.train()

After training, we can test the model:

>>> HRNN_paragraph_batch.test()

And then compute all evaluation metrics:

>>> HRNN_paragraph_batch.eval()

Loss record

loss

Results

demo
