dipanjanS / Deep_transfer_learning_nlp_dhs2019

Licence: gpl-3.0
Contains the code and deck for the presentation on Applying Deep Transfer Learning for NLP at Analytics Vidhya's DataHack Summit 2019

Applying Deep Transfer Learning for Natural Language Processing (NLP)

Tough real-world problems in Natural Language Processing (NLP) include class imbalance and the lack of enough labeled data for training. Thanks to recent advances in deep transfer learning for NLP, we have been able to make rapid strides not only in tackling these problems but also in leveraging these models for diverse downstream NLP tasks.

The intent of this hack session is two-fold: we will first look at various SOTA models in deep transfer learning for NLP with hands-on examples, and then talk about how these models were used in a real-world industry use case around proactive detection of security vulnerabilities.

Part 1 - Deep Transfer Learning Techniques for NLP

In the first part of this hands-on hack session, we will take a trip through the various advances in deep transfer learning for NLP, including the following:

  • Pre-trained word embeddings for Deep Learning Models (FastText with CNNs / Bi-directional LSTMs + Attention)
  • Universal Embeddings (Sentence Encoders, NNLMs)
  • Transformers (BERT, DistilBERT)
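As a minimal sketch of the first item, pre-trained word vectors are usually loaded into an embedding matrix whose rows line up with the model's vocabulary indices. The helper below is illustrative and not taken from the session notebooks: `build_embedding_matrix`, `word_index`, and the toy vectors are hypothetical names and values, and index 0 is assumed to be reserved for padding.

```python
from typing import Dict, List


def build_embedding_matrix(
    word_index: Dict[str, int],
    pretrained: Dict[str, List[float]],
    dim: int,
) -> List[List[float]]:
    """Return a (max_index + 1) x dim matrix aligned with word_index.

    Row 0 is left as zeros for padding (an assumption of this sketch);
    words without a pre-trained vector also stay as zero rows.
    """
    n_rows = max(word_index.values(), default=0) + 1
    matrix = [[0.0] * dim for _ in range(n_rows)]
    for word, idx in word_index.items():
        vector = pretrained.get(word)
        if vector is not None and len(vector) == dim:
            matrix[idx] = list(vector)
    return matrix


# Toy 3-dimensional vectors standing in for real FastText vectors.
toy_vectors = {"good": [0.1, 0.2, 0.3], "bad": [-0.1, -0.2, -0.3]}
toy_word_index = {"good": 1, "bad": 2, "unseen": 3}
embedding_matrix = build_embedding_matrix(toy_word_index, toy_vectors, dim=3)
```

A matrix like this would typically be passed as the initial weights of an embedding layer, which can then be kept frozen or fine-tuned along with the rest of the model.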

We will take a benchmark classification dataset and train and compare the performance of these models. All examples will be showcased using Python and leveraging the latest and best of TensorFlow 2.0.
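To make the comparison step concrete, here is a rough sketch of scoring several models' predictions against the same held-out labels. The metrics chosen (accuracy and macro-averaged F1), the function names, and the toy predictions are assumptions for illustration; the session does not specify them here.

```python
from typing import Dict, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """F1 score treating `label` as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def compare_models(
    y_true: Sequence[int], predictions: Dict[str, Sequence[int]]
) -> Dict[str, Dict[str, float]]:
    """Score every model's predictions on the same test labels."""
    labels = sorted(set(y_true))
    report = {}
    for name, y_pred in predictions.items():
        macro_f1 = sum(f1_for_class(y_true, y_pred, l) for l in labels) / len(labels)
        report[name] = {"accuracy": accuracy(y_true, y_pred), "macro_f1": macro_f1}
    return report


# Hypothetical predictions from two models on six test examples.
y_true = [1, 0, 1, 1, 0, 0]
report = compare_models(
    y_true,
    {"model_a": [1, 0, 1, 0, 0, 0], "model_b": [1, 1, 1, 1, 0, 1]},
)
```

Reporting macro F1 alongside accuracy is a design choice of this sketch: it keeps a model from looking good simply by favouring the majority class.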

Part 2 - Industry Case Study: Proactive Identification of Software Dependency Vulnerabilities

The second part of this hack session will briefly cover a real-world industry use case around proactive detection of security vulnerabilities in software. The idea here is that open-source and third-party libraries (dependencies) can cost an enterprise dearly, since teams are often unaware of potential vulnerabilities present in these dependencies. Can we leverage deep learning to proactively find and flag dependencies showing signs of a potential vulnerability before it becomes a serious issue? (Example: the requests library for Python was one of the most vulnerable dependencies in the recent past, which a lot of developers were not even aware of!)

This solution uses state-of-the-art deep learning models in NLP, like BERT, to go through public data, including GitHub events data, Bugzilla, and mailing list conversations, to predict probable security vulnerabilities. This should give the audience an idea of how we leveraged deep transfer learning for NLP in a very unique domain and also tackled problems like extreme class imbalance.
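The text above does not say how the class imbalance was handled in the case study. One common option, shown here purely as an assumption, is inverse-frequency class weighting, the same heuristic scikit-learn uses for `class_weight="balanced"`: `n_samples / (n_classes * count)`. The function name and the 95/5 label split are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical dataset: 95 non-vulnerable (0) vs. 5 vulnerable (1) examples.
toy_labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(toy_labels)
```

A dictionary of this shape can be passed to Keras via the `class_weight` argument of `model.fit`, so that errors on the rare class contribute more to the loss.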

Key Takeaways from this Hack Session

  • Learn to train and fine-tune pre-trained SOTA models including BERT and DistilBERT for downstream NLP tasks like classification
  • Examples showcased using the latest and best in TensorFlow 2.0, TF-Hub and the excellent Transformers framework
  • Learn about a real-world industry use-case on predicting software dependency vulnerabilities using these techniques

Acknowledgements

  • Hugging Face for the awesome transformers framework
  • Google for giving us TensorFlow 2.0
  • Sebastian Ruder for a lot of excellent images, resources and his thesis on transfer learning
  • Jay Alammar for his excellent interpretations of transformers, BERT, GPT-2 and more!
  • All the researchers and practitioners who worked hard to build all the models leveraged in this tutorial
  • The entire team at Red Hat CodeReady Analytics who I worked with for the showcased case-study on probable vulnerability prediction