mratsim / Mckinsey Smartcities Traffic Prediction

Adventure into using multi-attention recurrent neural networks for time series (city traffic) for the 2017-11-18 McKinsey IronMan (24h non-stop) prediction challenge

Projects that are alternatives of or similar to Mckinsey Smartcities Traffic Prediction

Mydatascienceportfolio
Applying Data Science and Machine Learning to Solve Real World Business Problems
Stars: ✭ 227 (+363.27%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
Data Science
Collection of useful data science topics along with code and articles
Stars: ✭ 315 (+542.86%)
Mutual labels:  jupyter-notebook, data-science, time-series
Pycaret
An open-source, low-code machine learning library in Python
Stars: ✭ 4,594 (+9275.51%)
Mutual labels:  jupyter-notebook, data-science, time-series
Radio
RadIO is a library for data science research of computed tomography imaging
Stars: ✭ 198 (+304.08%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
Sciblog support
Support content for my blog
Stars: ✭ 694 (+1316.33%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
Unsupervisedscalablerepresentationlearningtimeseries
Unsupervised Scalable Representation Learning for Multivariate Time Series: Experiments
Stars: ✭ 205 (+318.37%)
Mutual labels:  jupyter-notebook, time-series, neural-networks
Deltapy
DeltaPy - Tabular Data Augmentation (by @firmai)
Stars: ✭ 344 (+602.04%)
Mutual labels:  jupyter-notebook, data-science, time-series
Scipy con 2019
Tutorial Sessions for SciPy Con 2019
Stars: ✭ 142 (+189.8%)
Mutual labels:  jupyter-notebook, data-science, time-series
Tsfresh
Automatic extraction of relevant features from time series:
Stars: ✭ 6,077 (+12302.04%)
Mutual labels:  jupyter-notebook, data-science, time-series
Edward
A probabilistic programming language in TensorFlow. Deep generative models, variational inference.
Stars: ✭ 4,674 (+9438.78%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
Lstm anomaly thesis
Anomaly detection for temporal data using LSTMs
Stars: ✭ 178 (+263.27%)
Mutual labels:  jupyter-notebook, time-series, neural-networks
Awesome Ai Ml Dl
Awesome Artificial Intelligence, Machine Learning and Deep Learning as we learn it. Study notes and a curated list of awesome resources of such topics.
Stars: ✭ 831 (+1595.92%)
Mutual labels:  jupyter-notebook, time-series, neural-networks
Fixy
Our aim is to create an open source spelling assistant/checker that solves many different problems in the Turkish NLP literature at once, proposes unique approaches, and addresses the shortcomings of existing work. It fixes spelling errors in users' texts with a deep learning approach, and also performs semantic analysis of the texts to detect and correct errors arising in that context.
Stars: ✭ 165 (+236.73%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
Tutorials
AI-related tutorials. Access any of them for free → https://towardsai.net/editorial
Stars: ✭ 204 (+316.33%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
Ml Workspace
🛠 All-in-one web-based IDE specialized for machine learning and data science.
Stars: ✭ 2,337 (+4669.39%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
Probability
Probabilistic reasoning and statistical analysis in TensorFlow
Stars: ✭ 3,550 (+7144.9%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
Codesearchnet
Datasets, tools, and benchmarks for representation learning of code.
Stars: ✭ 1,378 (+2712.24%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
Sigmoidal ai
Python, Data Science, Machine Learning and Deep Learning tutorials - Sigmoidal
Stars: ✭ 103 (+110.2%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
Edward2
A simple probabilistic programming language.
Stars: ✭ 419 (+755.1%)
Mutual labels:  jupyter-notebook, data-science, neural-networks
H1st
The AI Application Platform We All Need. Human AND Machine Intelligence. Based on experience building AI solutions at Panasonic: robotics predictive maintenance, cold-chain energy optimization, Gigafactory battery mfg, avionics, automotive cybersecurity, and more.
Stars: ✭ 697 (+1322.45%)
Mutual labels:  jupyter-notebook, data-science, time-series

McKinsey-SmartCities-Traffic-Prediction

Adventure into using neural networks for time series for the 2017-11-18 McKinsey IronMan (24h non-stop) prediction challenge

This is the code I wrote, without sleeping, for the following challenge: https://datahack.analyticsvidhya.com/contest/mckinsey-analytics-hackathon/

Problem statement

Mission: You are working with the government to transform your city into a smart city. The vision is to convert it into a digital and intelligent city to improve the efficiency of services for the citizens. One of the problems faced by the government is traffic. You are a data scientist working to manage the traffic of the city better and to provide input on infrastructure planning for the future.

The government wants to implement a robust traffic system for the city by being prepared for traffic peaks. They want to understand the traffic patterns of the four junctions of the city. Traffic patterns on holidays, as well as on various other occasions during the year, differ from normal working days. This is important to take into account for your forecasting.

Your task: To predict traffic patterns in each of these four junctions for the next 4 months.

Data: The sensors on each of these junctions were collecting data at different times, hence you will see traffic data from different time periods. To add to the complexity, some of the junctions have provided limited or sparse data requiring thoughtfulness when creating future projections. Depending upon the historical data of 20 months, the government is looking to you to deliver accurate traffic projections for the coming four months. Your algorithm will become the foundation of a larger transformation to make your city smart and intelligent.

The evaluation metric for the competition is RMSE. Public-Private split for the competition is 25:75.
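
Since RMSE is also used as the training loss later in this README (the model code below references a root_mean_squared_error function), a minimal Keras-backend implementation could look like the sketch below; this is not the exact function from the original notebook.

from keras import backend as K

def root_mean_squared_error(y_true, y_pred):
    # RMSE = sqrt(mean((y_pred - y_true)^2)), written with backend ops
    # so Keras can use it directly as a loss function.
    return K.sqrt(K.mean(K.square(y_pred - y_true), axis=-1))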

Exploratory Data Analysis (EDA)

See here

We have 48120 points of training data (hourly data from 2015-11-01 to 2017-06-30 for the 4 junctions) and 11808 points to predict.
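
A quick way to sanity-check those counts and see how unevenly the junctions are covered (the file and column names train.csv, DateTime and Junction are assumptions, not taken from this README):

import pandas as pd

# Assumed layout: one row per junction per hour
train = pd.read_csv('train.csv', parse_dates=['DateTime'])

print(len(train))   # expected: 48120 hourly observations in total
print(train.groupby('Junction')['DateTime'].agg(['min', 'max', 'count']))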

Approach

Instead of using the time-series classics, ARMA (auto-regressive moving average) and ARIMA (auto-regressive integrated moving average) models, or the Kaggle competition classic XGBoost, I chose to try my hand at neural networks.

Given the time constraint, I had to use Keras for quicker prototyping and better documentation, even though my preferred framework is PyTorch.

The direct consequence is an unoptimized seq2seq architecture, as I couldn't share weights between RNNs in Keras at the time (November 2017).

Architecture

I used a multi-attention Recurrent Neural Network, defined below, to capture lag features.


# Imports assumed for this snippet: Keras 2.x with the TensorFlow backend
from keras.models import Model
from keras.layers import (Input, Dense, Permute, Lambda, SeparableConv2D,
                          Concatenate, CuDNNGRU)
from keras import backend as K


def attention_n_days_ago(inputs, days_ago):
    # inputs.shape = (batch_size, time_steps, input_dim)
    time_steps = days_ago * 24
    suffix = str(days_ago) +'_days'

    # We compute the attention over the seq_len
    a = Permute((2, 1),
                name='Attn_Permute1_' + suffix)(inputs)
    a = Dense(time_steps,
              activation='softmax',
              name='Attn_DenseClf_' + suffix)(a)
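    # NB: `a` is built here but never used below; only the separable-convolution
    # branch (`avg`) actually feeds into the returned tensor.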

    # Now we convolute so that it averages over the whole time window
    feats_depth = int(inputs.shape[2])
    avg = Lambda(lambda x: K.expand_dims(x, axis = 1),
                 name='Attn_Unsqueeze_' + suffix)(inputs)
    avg = SeparableConv2D(feats_depth, (1,1),
                          name='Attn_DepthConv_' + suffix)(avg)
    avg = Lambda(lambda x: K.squeeze(x, 1),
                 name='Attn_Squeeze_' + suffix)(avg)


    # A distinct layer name is needed here: Keras rejects duplicate layer names
    a_probs = Permute((2, 1),
                      name='Attn_Permute2_' + suffix)(avg)
    # out = Multiply(name='Attn_mul_'+ suffix)([inputs, a_probs])
    out = Concatenate(name='Attn_cat_'+ suffix)([inputs, a_probs])
    return out

def Net(num_feats, seq_len, num_hidden, num_outputs):
    x = Input(shape=(seq_len, num_feats))

    # Encoder RNNs
    enc = CuDNNGRU(seq_len,
                   return_sequences=True,
                   stateful = False,
                   name = 'Encoder_RNN')(x)

    # Attention decoders (lag features)
    attention_0d = attention_n_days_ago(enc, 0)
    attention_1d = attention_n_days_ago(enc, 1)
    attention_2d = attention_n_days_ago(enc, 2)
    attention_4d = attention_n_days_ago(enc, 4)
    attention_1w = attention_n_days_ago(enc, 7)
    attention_2w = attention_n_days_ago(enc, 14)
    attention_1m = attention_n_days_ago(enc, 30)
    attention_2m = attention_n_days_ago(enc, 60)
    attention_1q = attention_n_days_ago(enc, 92)
    attention_6m = attention_n_days_ago(enc, 184)
    attention_3q = attention_n_days_ago(enc, 276)
    attention_1y = attention_n_days_ago(enc, 365)

    att = Concatenate(name='attns_cat', axis = 1)([attention_0d,
                                                   attention_1d,
                                                   attention_2d,
                                                   attention_4d,
                                                   attention_1w,
                                                   attention_2w,
                                                   attention_1m,
                                                   attention_2m,
                                                   attention_1q,
                                                   attention_6m,
                                                   attention_3q,
                                                   attention_1y])

    # How to merge? concat, mul, add, use Dense Layer or convolution ?

    att = Dense(seq_len, activation=None, name='Dense_merge_attns')(att)
    # att = Lambda(lambda x: softmax(x, axis = 1),
    #              name='Dense_merge_3D_softmax')(att) # Flatten along the concat axis

    # Decoder RNN
    dec = CuDNNGRU(num_hidden,
                   return_sequences=False,
                   stateful = False,
                   name='Decoder_RNN')(att)

    # Regressor
    # Note that Dense is automatically TimeDistributed in Keras 2
    out = Dense(num_outputs, activation=None,
                name = 'Classifier')(dec) # no activation for regression

    model = Model(inputs=x, outputs=out)

    # `optim` (e.g. an Adam instance) and the custom `root_mean_squared_error`
    # loss are expected to be defined elsewhere in the notebook.
    model.compile(loss=root_mean_squared_error, optimizer=optim)
    return model

Important note: make sure to use CuDNNGRU and CuDNNLSTM. The default GRU and LSTM layers use TensorFlow's generic implementation, while the CuDNN variants use Nvidia's cuDNN kernels; the generic version is much slower on GPU.
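
For illustration, here is how the model could be instantiated and trained; the hyper-parameter values and the dummy data below are placeholders (and a GPU is required because of the CuDNN layers), not the actual competition settings:

import numpy as np
from keras.optimizers import Adam

# Placeholder settings, not the ones used in the competition
optim = Adam(lr=1e-3)                # `optim` is picked up by Net() at compile time
seq_len, num_feats = 24 * 7, 16      # one week of hourly history, 16 engineered features

model = Net(num_feats=num_feats, seq_len=seq_len, num_hidden=128, num_outputs=1)
model.summary()

# Dummy windows standing in for the real (samples, seq_len, num_feats) training set
X_train = np.random.rand(1024, seq_len, num_feats).astype('float32')
y_train = np.random.rand(1024, 1).astype('float32')
model.fit(X_train, y_train, batch_size=128, epochs=2, validation_split=0.1)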

Results

Teacher forcing by predicting 48 hours (based on real historical values):

Predicting the whole 3 months based on previous predictions:

Note: the choice of optimizer has a big influence on the long-range part. Using RMSprop instead of Adam gave some response during the first week and then predicted zero traffic for the rest of the horizon.
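
For context, the 3-month forecast above is generated autoregressively: each predicted hour is appended to the input window and fed back into the model. Below is a minimal sketch of such a rollout loop; the function and the assumption that the traffic count sits in column 0 are illustrative, not taken from the notebook.

import numpy as np

def rollout(model, history, n_steps):
    # history: (seq_len, num_feats) array holding the last observed window.
    # Only the traffic-count feature (assumed to be column 0) is overwritten
    # with the prediction; in practice calendar features must also be advanced.
    window = history.copy()
    preds = []
    for _ in range(n_steps):
        y = model.predict(window[np.newaxis, ...])[0, 0]  # next-hour forecast
        preds.append(y)
        next_row = window[-1].copy()
        next_row[0] = y
        window = np.vstack([window[1:], next_row])        # slide the window one hour
    return np.array(preds)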

I then ran out of time to debug why my model was predicting a sinusoid.

Future work

I should probably reimplement this in a dynamic framework like PyTorch to share state between the RNNs. Furthermore, ARMA/ARIMA capture the general trend, but as shown by the 48-hour prediction, my model captures fast changes quite well, so stacking both plus an XGBoost model should improve the results a lot.
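
One simple way to combine such models is to blend their forecasts; the weights below are placeholders that would be tuned on a validation split:

import numpy as np

# pred_arima, pred_rnn, pred_xgb: forecasts of shape (n_steps,) from the three models
def blend(pred_arima, pred_rnn, pred_xgb, weights=(0.3, 0.5, 0.2)):
    w = np.asarray(weights) / np.sum(weights)   # normalise so the weights sum to 1
    return w[0] * pred_arima + w[1] * pred_rnn + w[2] * pred_xgb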

An alternative approach would be to use WaveNet and pure CNNs instead of RNNs.
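
For reference, the core WaveNet idea is a stack of causal 1-D convolutions with exponentially increasing dilation; a minimal Keras sketch (not part of this project's code) could look like this:

from keras.models import Model
from keras.layers import Input, Conv1D, Dense, Lambda

def wavenet_like(seq_len, num_feats, filters=32, dilation_depth=6):
    x = Input(shape=(seq_len, num_feats))
    h = x
    for i in range(dilation_depth):
        # 'causal' padding ensures the output at time t only sees inputs up to t
        h = Conv1D(filters, kernel_size=2, padding='causal',
                   dilation_rate=2 ** i, activation='relu')(h)
    h = Lambda(lambda t: t[:, -1, :])(h)        # keep only the last time step
    out = Dense(1, activation=None)(h)          # regression head, no activation
    return Model(inputs=x, outputs=out)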
