zixuanweeei / tianchi-diabetes

Licence: other
Tianchi Precision Medicine Competition: AI-Assisted Prediction of Genetic Risk for Diabetes, Season 1

Programming languages: Jupyter Notebook, Python


Tianchi Precision Medicine Competition: AI-Assisted Prediction of Genetic Risk for Diabetes, Season 1

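The task in Season 1 of the competition is to predict a patient's blood glucose value from clinical data and physical-examination indicators. The training data provided by the organizers contains the patient's sex, the examination date, and indicators from routine blood tests, kidney function tests, and similar examinations. Each indicator is stored as a separate column in the data table, and the last column is the blood glucose value to be predicted.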

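This repo records the data exploration, feature engineering, feature selection, cross-validated models, and online submission models from my participation in the competition. The final result was not as good as I had hoped, but I learned a great deal from two experienced competitors, doufu and wufei.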

basic_analysis & offline

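These folders contain the initial data exploration and the offline cross-validation models. The data exploration gave a rough picture of how the data is distributed.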

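The offline model started out by splitting the data by sex and training and predicting on each subset separately. It later evolved into training and predicting on the full data set with sex used directly as a feature. Along the way, the cross-validation-based LightGBM ensemble model that doufu open-sourced was a major inspiration. I believe quite a few teams on the leaderboard built their solutions on top of that open-source code. It really does achieve a lot with very little: it obtained good results using only the raw features.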
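The cross-validation ensembling idea mentioned above can be sketched roughly as follows. This is not doufu's code: the function names, the fold count, and the mean-predicting stand-in learner are all assumptions chosen so the example runs with the standard library alone. In practice `fit_fn` would wrap a LightGBM regressor.

```python
import random
from statistics import mean


def kfold_indices(n_samples, n_splits=5, seed=0):
    """Shuffle row indices and split them into n_splits nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_splits] for i in range(n_splits)]


def fit_mean_model(X_train, y_train):
    """Hypothetical stand-in learner: always predicts the training mean.

    A real pipeline would fit a gradient-boosting model here instead.
    """
    train_mean = mean(y_train)
    return lambda X: [train_mean for _ in X]


def cv_ensemble(X, y, X_test, fit_fn, n_splits=5, seed=0):
    """Return out-of-fold predictions and fold-averaged test predictions."""
    oof = [0.0] * len(X)
    test_sum = [0.0] * len(X_test)
    for valid_idx in kfold_indices(len(X), n_splits, seed):
        valid_set = set(valid_idx)
        train_idx = [i for i in range(len(X)) if i not in valid_set]
        model = fit_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        # Each training row is predicted by the one model that never saw it.
        for i, pred in zip(valid_idx, model([X[i] for i in valid_idx])):
            oof[i] = pred
        # Every fold model also predicts the test set; sums are averaged below.
        for j, pred in enumerate(model(X_test)):
            test_sum[j] += pred
    test_pred = [s / n_splits for s in test_sum]
    return oof, test_pred


# Toy data: ten rows with one feature each.
X = [[float(i)] for i in range(10)]
y = [5.0 + 0.1 * i for i in range(10)]
oof, test_pred = cv_ensemble(X, y, [[0.0], [1.0]], fit_mean_model, n_splits=5)
```

The out-of-fold predictions give an offline score that no model has "seen", while averaging the per-fold test predictions is what makes the result an ensemble rather than a single model.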

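Later, after seeing wufei's ensemble solution, I borrowed its nn model and fed the 26 batch-normalized outputs of the nn model's last hidden layer into LightGBM as features, which also improved the offline score. I also blended the LightGBM predictions with the nn model's own predictions.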
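A minimal sketch of the two steps described above: appending hidden-layer activations to the raw features, and blending two sets of predictions. The README does not say how the blend was computed, so the weighted average and the `weight_a` parameter are assumptions, and both function names are hypothetical.

```python
def add_hidden_features(raw_rows, hidden_rows):
    """Append each row's hidden-layer activations to its raw features."""
    if len(raw_rows) != len(hidden_rows):
        raise ValueError("raw and hidden features must cover the same rows")
    return [list(raw) + list(hidden) for raw, hidden in zip(raw_rows, hidden_rows)]


def blend(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two prediction lists (an assumed blending rule)."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Two rows with two raw features and 26 hidden activations each (toy values).
raw = [[1.0, 2.0], [3.0, 4.0]]
hidden = [[0.1] * 26, [0.2] * 26]
augmented = add_hidden_features(raw, hidden)  # each row now has 28 columns

# Equal-weight blend of hypothetical nn and LightGBM predictions.
blended = blend([5.0, 6.0], [7.0, 6.0], weight_a=0.5)
```

The augmented rows would then be passed to the tree model in place of the raw features, and the blend weight would normally be tuned on out-of-fold predictions.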

online

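Contains the models used for online submissions. The final submission used lgb_nn_ensembing.py, which blends the nn model with LightGBM and also adds the nn model's hidden-layer outputs to LightGBM as features.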

util

Contains the feature engineering, the evaluation function, and the model parameters.
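For illustration, an evaluation function for this task could look like the sketch below. It assumes the competition score was half the mean squared error, which is not stated in this README, so treat the formula and the name `half_mse` as assumptions rather than as the repo's actual code.

```python
def half_mse(y_true, y_pred):
    """Half the mean squared error: sum((pred - true)^2) / (2 * m)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((p - t) ** 2 for t, p in zip(y_true, y_pred))
    return sum(squared_errors) / (2 * len(y_true))


# Toy blood glucose values and predictions.
score = half_mse([5.0, 6.0, 7.0], [5.5, 6.0, 6.0])
```

Lower is better, and the factor of one half only rescales the score, so model rankings match those under plain MSE.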

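Missing values are imputed with a random forest. The correlations among the features are much higher than the correlations between any individual feature and the blood glucose value.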
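Random-forest imputation can be sketched as below, assuming NumPy and scikit-learn are available. The README does not describe the exact procedure, so this shows one common variant: train a forest on the rows where a column is observed, using the other columns as inputs, then predict the missing entries. The function name and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def rf_impute_column(X, col, n_estimators=50, seed=0):
    """Fill NaNs in column `col` with a random forest fit on the other columns.

    Assumes the other columns contain no NaNs. Returns a new array.
    """
    X = np.array(X, dtype=float)  # np.array copies, so the input is untouched
    missing = np.isnan(X[:, col])
    if not missing.any() or missing.all():
        return X  # nothing to fill, or nothing to learn from
    others = np.delete(X, col, axis=1)
    forest = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
    forest.fit(others[~missing], X[~missing, col])
    X[missing, col] = forest.predict(others[missing])
    return X


# Synthetic data: column 2 depends on columns 0 and 1.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3))
data[:, 2] = data[:, 0] + 0.5 * data[:, 1]
data_missing = data.copy()
data_missing[:20, 2] = np.nan  # knock out 20 values

filled = rf_impute_column(data_missing, col=2)
```

This approach fits the observation above: when features are strongly correlated with one another, the other columns carry enough signal to predict a missing one.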
