Wrosinski / Kaggle Quora

Licence: MIT
Kaggle Quora Question Pairs Competition

14th place solution (my part). The code has not been cleaned up; the latest versions are uploaded. Not every feature that the feature notebooks can create was included in the final model. The idea of this repository is to give an overview of the methods that were used and of those that could be used for similar problems.

Big thanks to the authors of all the kernels and posts, which were a great inspiration; some features were derived from them.

Features

  • Data Encoding:
    • Pipeline for text cleaning using Textacy
    • Lemmatization
    • Stemming
    • NER Encoding (based on a Kaggle Kernel)
  • NLP Features:
    • Features based on Kaggle Kernels & Discussions posts by: Abhishek, SRK, Jared Turkewitz, the_1owl, Mephistopheles & more
    • Latent Semantic Analysis, Latent Dirichlet Allocation, tSVD
    • Word2Vec
    • Doc2Vec
    • Distances based on data transformations - similarity measures
    • Textacy-based features
    • KNN-based features
  • Magic Features:
    • Jared Turkewitz's frequency features
    • NetworkX features

& some more.
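
As a rough illustration of the "magic" features listed above, the sketch below counts how often each question appears across all pairs (the frequency idea) and how many neighbours two questions share in the question graph (one feature a graph library such as NetworkX can provide). It is not the code from the notebooks: the function name `magic_features`, the feature names and the toy data are assumptions, and plain adjacency sets stand in for a NetworkX graph to keep the example dependency-free.

```python
from collections import Counter, defaultdict


def magic_features(pairs):
    """Compute simple leak-style features for (question1, question2) pairs.

    `pairs` is assumed to hold the pairs of train and test combined,
    so that counts reflect the whole dataset.
    """
    freq = Counter()              # how many times each question occurs
    neighbors = defaultdict(set)  # question -> questions it was paired with

    for q1, q2 in pairs:
        freq[q1] += 1
        freq[q2] += 1
        neighbors[q1].add(q2)
        neighbors[q2].add(q1)

    rows = []
    for q1, q2 in pairs:
        rows.append({
            "q1_freq": freq[q1],
            "q2_freq": freq[q2],
            # Questions that were paired with both q1 and q2.
            "common_neighbors": len(neighbors[q1] & neighbors[q2]),
        })
    return rows


# Hypothetical toy data: letters stand in for question texts.
pairs = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "a")]
features = magic_features(pairs)
```

Each row of `features` lines up with the pair at the same index, so the values can be appended as columns to the feature matrix used by the tree models.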

Models

  • XGB & LGBM models
    • Training
    • BayesianOptimization
    • Test Predictions
  • SpaCy Decomposable Attention Model on Quora data
  • LSTM Experiments
  • MLP models
  • Stacking
    • Sklearn Models Ensemble
    • Stacking with LGBM
    • Finding weights for ensemble using Scipy minimize function in-fold
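
The last stacking item, finding ensemble weights with SciPy's `minimize`, can be sketched roughly as follows. This is a minimal example under stated assumptions, not the repository's code: the helper names `log_loss` and `find_blend_weights`, the SLSQP method, the non-negative weights that sum to 1, and the synthetic predictions are all illustrative choices. Log loss is used as the objective because it was the competition metric. In an in-fold setup, the same search would be repeated on each fold's held-out predictions.

```python
import numpy as np
from scipy.optimize import minimize


def log_loss(y_true, p, eps=1e-15):
    """Binary log loss with clipping to avoid log(0)."""
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))


def find_blend_weights(preds, y_true):
    """Find blending weights for model predictions.

    preds: array of shape (n_models, n_samples) with predicted probabilities.
    Returns weights in [0, 1] that sum to 1 (an assumed constraint).
    """
    n_models = preds.shape[0]
    x0 = np.full(n_models, 1.0 / n_models)  # start from a plain average
    result = minimize(
        lambda w: log_loss(y_true, w @ preds),
        x0,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n_models,
        constraints={"type": "eq", "fun": lambda w: np.sum(w) - 1.0},
    )
    return result.x


# Synthetic stand-ins for two models' held-out predictions.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500).astype(float)
good = np.clip(y * 0.8 + 0.1 + rng.normal(0, 0.05, 500), 0.01, 0.99)
noisy = rng.uniform(0.01, 0.99, 500)
preds = np.vstack([good, noisy])

weights = find_blend_weights(preds, y)
blended = weights @ preds  # weighted average of the two models
```

Constraining the weights to a convex combination keeps the blend a valid probability; dropping the constraint is also possible but then the output needs clipping before scoring.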