
ikibardin / kaggle-camera-model-identification

Licence: other
Code for reproducing 2nd place solution for Kaggle competition IEEE's Signal Processing Society - Camera Model Identification

Programming Languages

python
139335 projects - #7 most used programming language
shell
77523 projects

Projects that are alternatives of or similar to kaggle-camera-model-identification

awesome-kaggle-kernels
Compilation of good Kaggle Kernels.
Stars: ✭ 51 (-20.31%)
Mutual labels:  kaggle
histopathologic cancer detector
CNN histopathologic tumor identifier.
Stars: ✭ 26 (-59.37%)
Mutual labels:  kaggle
fer
Facial Expression Recognition
Stars: ✭ 32 (-50%)
Mutual labels:  kaggle
Kaggle-Cdiscount-Image-Classification-Challenge
No description or website provided.
Stars: ✭ 15 (-76.56%)
Mutual labels:  kaggle
Data-Science-Articles
A collection of my blogs on Data Science and Machine learning.
Stars: ✭ 66 (+3.13%)
Mutual labels:  kaggle
open-solution-cdiscount-starter
Open solution to the Cdiscount’s Image Classification Challenge
Stars: ✭ 20 (-68.75%)
Mutual labels:  kaggle
kaggle-berlin
Material of the Kaggle Berlin meetup group!
Stars: ✭ 36 (-43.75%)
Mutual labels:  kaggle
gender-unbiased BERT-based pronoun resolution
Source code for the ACL workshop paper and Kaggle competition by Google AI team
Stars: ✭ 42 (-34.37%)
Mutual labels:  kaggle
kdsb17
Gaussian Mixture Convolutional AutoEncoder applied to CT lung scans from the Kaggle Data Science Bowl 2017
Stars: ✭ 18 (-71.87%)
Mutual labels:  kaggle
kaggledatasets
Collection of Kaggle Datasets ready to use for Everyone (Looking for contributors)
Stars: ✭ 44 (-31.25%)
Mutual labels:  kaggle
digit recognizer
CNN digit recognizer implemented in Keras Notebook, Kaggle/MNIST (0.995).
Stars: ✭ 27 (-57.81%)
Mutual labels:  kaggle
kaggle-airbnb
🌍 Where will a new guest book their first travel experience?
Stars: ✭ 53 (-17.19%)
Mutual labels:  kaggle
MSDS696-Masters-Final-Project
Earthquake Prediction Challenge with LightGBM and XGBoost
Stars: ✭ 58 (-9.37%)
Mutual labels:  kaggle
Kaggle-Quora-Question-Pairs
This is our team's solution report, which achieves top 10% (305/3307) in this competition.
Stars: ✭ 58 (-9.37%)
Mutual labels:  kaggle
open-solution-ship-detection
Open solution to the Airbus Ship Detection Challenge
Stars: ✭ 54 (-15.62%)
Mutual labels:  kaggle
Recruit-Restaurant-Visitor-Forecasting
6th place solution for Recruit-Restaurant-Visitor-Forecasting
Stars: ✭ 16 (-75%)
Mutual labels:  kaggle
kaggle-malware-classification
Kaggle "Microsoft Malware Classification Challenge". 6th place solution
Stars: ✭ 29 (-54.69%)
Mutual labels:  kaggle
PyData-Pseudolabelling-Keynote
Accompanying notebook and sources to "A Guide to Pseudolabelling: How to get a Kaggle medal with only one model" (Dec. 2020 PyData Boston-Cambridge Keynote)
Stars: ✭ 23 (-64.06%)
Mutual labels:  kaggle
Data-Science-Projects
Data Science projects on various problem statements and datasets using Data Analysis, Machine Learning Algorithms, Deep Learning Algorithms, Natural Language Processing, Business Intelligence concepts by Python
Stars: ✭ 28 (-56.25%)
Mutual labels:  kaggle
Dog-Breed-Identification-Gluon
Kaggle 120-breed dog classification, implemented in Gluon
Stars: ✭ 45 (-29.69%)
Mutual labels:  kaggle

Kaggle IEEE's Signal Processing Society - Camera Model Identification

Implementation of the camera model identification system by team "[ods.ai] GPU_muscles" (2nd place overall in the Kaggle competition IEEE's Signal Processing Society - Camera Model Identification, and 1st place among student-eligible teams).

If you have any questions about the solution, feel free to contact me on Telegram or by e-mail at [email protected]

Our team

Requirements

To train the models and get predictions, the following is required:

  • OS: Ubuntu 16.04
  • Python 3.6
  • Hardware:
    • Any decent modern computer with an x86-64 CPU
    • 32 GB RAM
    • 4 x Nvidia GeForce GTX 1080 Ti
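
A quick pre-flight check can catch a mismatched environment before a long training run. The sketch below is not part of the repository: the helper names are hypothetical, it treats Python 3.6 as a minimum rather than an exact requirement (an assumption), and it relies on `nvidia-smi -L` being available to list GPUs.

```python
import shutil
import subprocess
import sys


def python_ok(version_info=sys.version_info, required=(3, 6)):
    """Return True if the interpreter is at least the required version."""
    return tuple(version_info[:2]) >= required


def count_gpus(listing):
    """Count GPUs in the text printed by `nvidia-smi -L` (one 'GPU n: ...' line each)."""
    return sum(1 for line in listing.splitlines() if line.startswith("GPU "))


def detect_gpus():
    """Return the number of visible Nvidia GPUs, or 0 if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return 0
    result = subprocess.run(["nvidia-smi", "-L"], stdout=subprocess.PIPE,
                            universal_newlines=True)
    return count_gpus(result.stdout) if result.returncode == 0 else 0


if __name__ == "__main__":
    print("Python 3.6+:", python_ok())
    print("GPUs found:", detect_gpus(), "(the list above names 4)")
```

Fewer GPUs than listed may still work, but the training scripts were presumably tuned for the hardware above.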

Installation

  1. Install the required OS and Python.
  2. Install the packages with pip install -r requirements.txt
  3. Create a data folder at the root of the repository. Place the train dataset from the Kaggle competition in data/train, the test dataset in data/test, and the additional validation images in data/val_images.
  4. Place se_resnet50.pth and se_resnext50.pth in the imagenet_pretrain folder.
  5. Place the following final weights in the final_weights folder:
    • densenet161_28_0.08377413648371115.pth
    • densenet161_55_0.08159203971706519.pth
    • densenet161_45_0.0813179751742137.pth
    • dpn92_tune_11_0.1398952918197271.pth
    • dpn92_tune_23_0.12260739478774665.pth
    • dpn92_tune_29_0.14363511492280367.pth
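
Since the installation depends on several manually placed files, it can help to verify the layout before running anything. The following is a minimal sketch, not a script shipped with the repository; the function name `missing_paths` is hypothetical, while the folder and file names are the ones listed above.

```python
from pathlib import Path

# Folders and files named in the installation steps above.
DATA_DIRS = ["data/train", "data/test", "data/val_images"]
EXPECTED_FILES = {
    "imagenet_pretrain": ["se_resnet50.pth", "se_resnext50.pth"],
    "final_weights": [
        "densenet161_28_0.08377413648371115.pth",
        "densenet161_55_0.08159203971706519.pth",
        "densenet161_45_0.0813179751742137.pth",
        "dpn92_tune_11_0.1398952918197271.pth",
        "dpn92_tune_23_0.12260739478774665.pth",
        "dpn92_tune_29_0.14363511492280367.pth",
    ],
}


def missing_paths(root="."):
    """Return the expected folders and files that are absent under root."""
    root = Path(root)
    missing = [d for d in DATA_DIRS if not (root / d).is_dir()]
    for folder, names in EXPECTED_FILES.items():
        for name in names:
            if not (root / folder / name).is_file():
                missing.append(folder + "/" + name)
    return missing


if __name__ == "__main__":
    for path in missing_paths():
        print("missing:", path)
```

Run it from the repository root; an empty output means every listed path is in place.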

Producing the final submission

Run bash final_submit.sh -d <folder with test images> -o <output .csv filename>
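
If the submission step needs to be driven from Python (for example from a larger pipeline), the command above can be wrapped as sketched here. The wrapper functions are hypothetical; only the script name and the -d and -o flags come from this README.

```python
import subprocess
from pathlib import Path


def submit_command(test_dir, output_csv, script="final_submit.sh"):
    """Build the argument list for the submission script described above."""
    return ["bash", script, "-d", str(test_dir), "-o", str(output_csv)]


def run_submission(test_dir, output_csv):
    """Run the script from the repository root and raise if it fails."""
    if not Path(test_dir).is_dir():
        raise FileNotFoundError("test folder not found: " + str(test_dir))
    subprocess.run(submit_command(test_dir, output_csv), check=True)
```

For example, `run_submission("data/test", "submission.csv")` is equivalent to running the bash command with those two arguments.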

Training ensemble from scratch

This section describes the steps required to train our ensemble.

1. Download external dataset

Images from both Yandex.Fotki and Flickr are essential for reproducing our solution.

Downloading images from Yandex.Fotki

Run bash download_from_yandex.sh

Downloading images from Flickr

Unfortunately, this step involves some manual actions.

  1. cd into downloader/flickr
  2. For every phone model, open its group page listed in flickr_groups.txt. Scroll each gallery page to the end and save it as an .html file in the corresponding folder. The result is a set of folders inside html_pages, one per phone model, each containing the saved .html files.
  3. Run python pages_to_image_links.py. The script creates a links folder of .csv files with links to the photos of each phone model.
  4. Run python download_from_links.py to download the images from those links. Steps 2 and 3 can be skipped, because the links folder already contains the necessary files.
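
To illustrate the idea behind turning saved gallery pages into link files, here is a minimal sketch using only the standard library. It is not the repository's pages_to_image_links.py: it assumes that photo URLs appear as the src attribute of <img> tags in the saved pages, and the function names are hypothetical.

```python
import csv
from html.parser import HTMLParser
from pathlib import Path


class ImageLinkParser(HTMLParser):
    """Collect the src attribute of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.links.append(src)


def extract_links(html_text):
    """Return unique image links in first-seen order."""
    parser = ImageLinkParser()
    parser.feed(html_text)
    return list(dict.fromkeys(parser.links))


def pages_to_csv(model_dir, csv_path):
    """Write one link per row for all .html files saved for one phone model."""
    links = []
    for page in sorted(Path(model_dir).glob("*.html")):
        links.extend(extract_links(page.read_text(encoding="utf-8", errors="ignore")))
    unique = list(dict.fromkeys(links))
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows([link] for link in unique)
    return len(unique)
```

Calling `pages_to_csv` once per model folder would yield one .csv per phone model, mirroring the layout of the links folder described above.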

2. Filter external dataset

Run bash filter.sh

3. Train the ensemble

  1. Download and filter the external dataset as described above.
  2. Run bash init_train.sh to train 9 models.
  3. Run bash make_pseudo.sh to get predictions from these models for the images in data/test and create pseudo labels.
  4. Run bash final_train.sh to train the same 9 models again, this time using the pseudo labels.
  5. Run bash predict.sh -d <folder with test images> -o <output .csv filename> to get predictions from the ensemble.
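
The pseudo-labelling step can be pictured with the sketch below. It is an illustration only, not the logic of make_pseudo.sh: averaging the per-model probabilities and keeping predictions above a confidence threshold are assumptions, as are the function names and the threshold value of 0.9.

```python
def average_probs(model_probs):
    """Average per-class probabilities over models for one image."""
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    return [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]


def pseudo_labels(predictions, threshold=0.9):
    """Assign a class index to each image the ensemble is confident about.

    predictions maps an image name to a list of probability vectors,
    one vector per model. The threshold is an assumed value.
    """
    labels = {}
    for name, model_probs in predictions.items():
        mean = average_probs(model_probs)
        best = max(range(len(mean)), key=mean.__getitem__)
        if mean[best] >= threshold:
            labels[name] = best
    return labels


if __name__ == "__main__":
    preds = {
        "a.jpg": [[0.9, 0.1], [1.0, 0.0]],  # models agree, kept
        "b.jpg": [[0.6, 0.4], [0.4, 0.6]],  # models disagree, dropped
    }
    print(pseudo_labels(preds))
```

Test images that pass the filter can then be added to the training set with their predicted labels for the second training round.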