

Kaggle national datascience bowl 2017 2nd place code

This is the source code for my part of the 2nd place solution to the National Data Science Bowl 2017 hosted by Kaggle.com. For documentation about the approach go to: http://juliandewit.github.io/kaggle-ndsb2017/
Note that this is my part of the code.
The work of my teammate Daniel Hammack can be found here: https://github.com/dhammack/DSB2017

Dependencies & data

The solution is built using Keras with a TensorFlow backend on 64-bit Windows. In addition I used scikit-learn, pydicom, SimpleITK, BeautifulSoup, OpenCV and XGBoost. All in all it was quite an engineering effort.

General

The source is cleaned up as much as possible. However, I was afraid that results would not be 100% reproducible if I changed too much, so some pieces could be a bit cleaner. I also left in some bugs that I found while cleaning up (see the end of this document).

The solution relies on manual labels, generated labels and 2 resulting submissions from team member Daniel Hammack. These files are all in the "resources" directory. All other file locations can be configured in settings.py. The raw patient data must be downloaded from the Kaggle website and the LUNA16 website.

Trained models as provided to Kaggle after phase 1 are also provided through the following download: https://retinopaty.blob.core.windows.net/ndsb3/trained_models.rar

The solution is a combination of nodule detectors/malignancy regressors. My two parts are trained with LUNA16 data with a mix of positive and negative labels + malignancy info from the LIDC dataset. My second part also uses some manual annotations made on the NDSB3 trainset. Predictions are generated from the raw nodule/malignancy predictions combined with the location information and general “mass” information. Masses are not nodules but large regions of suspicious tissue present in the CT images. The masses are detected with a U-net trained with manual labels.

The 3rd and 4th parts of the solution come from Daniel Hammack. The final solution is a blend of the 4 different parts. Blending is done by taking a simple average.

Preprocessing

First run step1_preprocess_ndsb.py. This will extract all the NDSB DICOM files, scale them to 1x1x1 mm, and make a directory containing .png slice images. Lung segmentation mask images are also generated. They will be used later in the process for faster predicting. Then run step1_preprocess_luna16.py. This will extract all the LUNA source files, scale them to 1x1x1 mm, and make a directory containing .png slice images. Lung segmentation mask images are also generated. This step also generates various CSV files for positive and negative examples.
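To illustrate what scaling to 1x1x1 mm means, here is a minimal sketch that resamples a volume to isotropic voxels. It is not the code from step1_preprocess_ndsb.py: the function name `resample_to_isotropic`, the nearest-neighbour method and the example spacings are assumptions chosen to keep the example NumPy-only. A real pipeline would more likely use an interpolating resampler.

```python
import numpy as np


def resample_to_isotropic(volume, spacing_mm, target_mm=1.0):
    """Nearest-neighbour resampling of a (z, y, x) volume to isotropic voxels.

    spacing_mm holds the original voxel size per axis in millimetres.
    """
    volume = np.asarray(volume)
    out = volume
    for axis, spacing in enumerate(spacing_mm):
        old_size = volume.shape[axis]
        # Physical extent along this axis divided by the target voxel size.
        new_size = max(1, int(round(old_size * spacing / target_mm)))
        # For each output voxel centre, pick the nearest source voxel.
        centres = (np.arange(new_size) + 0.5) * target_mm / spacing - 0.5
        idx = np.clip(np.round(centres).astype(int), 0, old_size - 1)
        out = np.take(out, idx, axis=axis)
    return out


# Hypothetical scan: 4 slices spaced 2.5 mm apart, 0.5 mm in-plane pixels.
scan = np.zeros((4, 8, 8), dtype=np.int16)
resampled = resample_to_isotropic(scan, spacing_mm=(2.5, 0.5, 0.5))
print(resampled.shape)  # (10, 4, 4)
```

After resampling, one voxel corresponds to one cubic millimetre, so a fixed-size cube covers the same physical volume in every patient.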

The nodule detectors are trained on positive and negative 3D cubes which must be generated from the LUNA16 and NDSB datasets. step1b_preprocess_make_train_cubes.py takes the different CSV files and cuts out 3D cubes from the patient slices. The cubes are saved in different directories. resources/step1_preprocess_mass_segmenter.py generates the mass U-net trainset. The generated resized images and labels are provided in this archive, so this step does not need to be run. However, this file can be used to regenerate the training data.
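Cutting a cube out of a patient volume can be sketched as follows. This is an illustration rather than the logic of step1b_preprocess_make_train_cubes.py: the function name `extract_cube`, the cube size of 32 voxels and the zero-padding at the borders are all assumptions.

```python
import numpy as np


def extract_cube(volume, center_zyx, size=32):
    """Cut a size^3 cube centred on center_zyx, zero-padding at the borders."""
    half = size // 2
    padded = np.pad(volume, half, mode="constant", constant_values=0)
    # After padding by `half`, original voxel c sits at index c + half, so a
    # slice starting at c in padded coordinates is centred on that voxel.
    z, y, x = (int(c) for c in center_zyx)
    return padded[z:z + size, y:y + size, x:x + size]


# Hypothetical 64^3 volume and a nodule centre taken from a CSV row.
volume = np.arange(64 ** 3, dtype=np.int64).reshape(64, 64, 64)
cube = extract_cube(volume, center_zyx=(10, 20, 30), size=32)
print(cube.shape)  # (32, 32, 32)
```

Padding first keeps every cube the same shape even when a nodule lies close to the edge of the scan.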

Training neural nets

First train the 3D convnets that detect nodules and predict malignancy. This can be done by running the step2_train_nodule_detector.py file. This will train various combinations of positive and negative labels. The resulting models (NAMES) are stored in the ./workdir directory and the final results are copied to the models folder. The mass detector can be trained using step2_train_mass_segmenter.py. It trains 3 folds and the final models are stored in the models (names) folder. Training the 3D convnets takes around 10 hours each. The 3 mass detector folds take around 8 hours in total.

Predicting neural nets

Once trained or downloaded through the URL (https://retinopaty.blob.core.windows.net/ndsb3/trained_models.rar), the models are placed in the ./models/ directory. From there the nodule detector step3_predict_nodules.py can be run to detect nodules in a 3D grid per patient. The detected nodules and predicted malignancy are stored per patient in a separate directory. The mass detector is already run as part of step2_train_mass_segmenter.py, which stores a CSV with estimated masses per patient.
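Scanning a patient volume in a 3D grid could look roughly like the sketch below. It does not reproduce step3_predict_nodules.py: the names `iter_grid_cubes` and `scan_volume`, the cube size, the step and the threshold are assumptions, and `predict_fn` is a stand-in for the trained network.

```python
import numpy as np


def iter_grid_cubes(volume, cube_size=32, step=12):
    """Yield ((z, y, x), cube) for overlapping cubes laid out on a regular grid."""
    for z in range(0, volume.shape[0] - cube_size + 1, step):
        for y in range(0, volume.shape[1] - cube_size + 1, step):
            for x in range(0, volume.shape[2] - cube_size + 1, step):
                cube = volume[z:z + cube_size, y:y + cube_size, x:x + cube_size]
                yield (z, y, x), cube


def scan_volume(volume, predict_fn, cube_size=32, step=12, threshold=0.9):
    """Return (z, y, x, score) at the cube centre for every cube above threshold."""
    half = cube_size // 2
    hits = []
    for (z, y, x), cube in iter_grid_cubes(volume, cube_size, step):
        score = float(predict_fn(cube))
        if score > threshold:
            hits.append((z + half, y + half, x + half, score))
    return hits


# Toy volume with one bright block; the mean intensity stands in for a model.
volume = np.zeros((64, 64, 64), dtype=np.float64)
volume[12:44, 12:44, 12:44] = 1.0
hits = scan_volume(volume, predict_fn=lambda cube: cube.mean())
print(hits)  # [(28, 28, 28, 1.0)]
```

A step smaller than the cube size makes neighbouring cubes overlap, which reduces the chance that a nodule is split across cube borders.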

Training of submissions, combining submissions for final submission.

Based on the per-patient CSVs, the masses.csv and other metadata, we train an XGBoost model to generate submissions (step4_train_submissions.py). There are 3 levels of submissions. First come the per-model submissions (level 1). Different models are combined in level 2, and Daniel’s submissions are added. These level 2 submissions are then combined (averaged) into one final submission. Below are the different models that will be generated/combined.

Level 1:
Luna16_fs (trained on full luna16 set)
Luna16_ndsbposneg v1 (trained on luna16 + manual pos/neg labels in ndsb)
Luna16_ndsbposneg v2 (trained on luna16 + manual pos/neg labels in ndsb)
Daniel model 1
Daniel model 2
The posneg models and Daniel's models are each averaged into one level 2 model.
Level 2:
Luna16_fs
Luna16_ndsbposneg
Daniel

These 3 models will be averaged into 1 final_submission.csv.
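The level structure above can be sketched as two rounds of simple averaging. The prediction values below are made up, the dictionary keys are illustrative names, and grouping the two posneg versions and Daniel's two models before the final average is my reading of the list above rather than code taken from step4_train_submissions.py.

```python
import numpy as np


def average(*predictions):
    """Element-wise simple average of equally long prediction vectors."""
    return np.mean(np.stack(predictions), axis=0)


# Hypothetical level 1 predictions for two patients (cancer probability).
level1 = {
    "luna16_fs": np.array([0.2, 0.8]),
    "luna16_ndsbposneg_v1": np.array([0.1, 0.7]),
    "luna16_ndsbposneg_v2": np.array([0.3, 0.9]),
    "daniel_1": np.array([0.4, 0.6]),
    "daniel_2": np.array([0.2, 0.8]),
}

# Level 2: collapse each group of related models into a single model.
level2 = {
    "luna16_fs": level1["luna16_fs"],
    "luna16_ndsbposneg": average(level1["luna16_ndsbposneg_v1"],
                                 level1["luna16_ndsbposneg_v2"]),
    "daniel": average(level1["daniel_1"], level1["daniel_2"]),
}

# Final submission: simple average of the three level 2 models.
final = average(*level2.values())
print(final)
```

Averaging within groups first gives each level 2 model the same weight in the final blend, regardless of how many level 1 models fed into it.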

Bugs and suggestions.

First of all, during cleanup I noticed that I missed 10% of the LUNA16 patients because I overlooked subset0. That might be a $100,000 mistake. For reproducibility reasons I kept the bug in. In settings.py you can adjust the code to also take this subset into account.

Suggestions for improvement would be:

Take the 10% extra LUNA16 candidates.
Use different blends of the positive and negative labels.
Other neural network architectures.
Etc.
