
dipanjanS / nlp_workshop_odsc_europe20

License: GPL-3.0
Extensive tutorials for the Advanced NLP Workshop in Open Data Science Conference Europe 2020. We will leverage machine learning, deep learning and deep transfer learning to learn and solve popular tasks using NLP including NER, Classification, Recommendation \ Information Retrieval, Summarization, Language Translation, Q&A and T…

Programming Languages

Jupyter Notebook

Projects that are alternatives to or similar to nlp_workshop_odsc_europe20

Text Analytics With Python
Learn how to process, classify, cluster, summarize, understand syntax, semantics and sentiment of text data with the power of Python! This repository contains code and datasets used in my book, "Text Analytics with Python" published by Apress/Springer.
Stars: ✭ 1,132 (+791.34%)
Mutual labels:  scikit-learn, spacy, nltk, gensim
resume tailor
An unsupervised analysis combining topic modeling and clustering to preserve an individual's work history and credentials while tailoring their resume towards a new career field.
Stars: ✭ 15 (-88.19%)
Mutual labels:  scikit-learn, nltk, gensim
converse
Conversational text Analysis using various NLP techniques
Stars: ✭ 147 (+15.75%)
Mutual labels:  scikit-learn, transformers, spacy
Practical Machine Learning With Python
Master the essential skills needed to recognize and solve complex real-world problems with Machine Learning and Deep Learning by leveraging the highly popular Python Machine Learning Eco-system.
Stars: ✭ 1,868 (+1370.87%)
Mutual labels:  scikit-learn, spacy, nltk
Adam qas
ADAM - A Question Answering System. Inspired by IBM Watson.
Stars: ✭ 330 (+159.84%)
Mutual labels:  scikit-learn, spacy, gensim
Product-Categorization-NLP
Multi-Class Text Classification for products based on their description with Machine Learning algorithms and Neural Networks (MLP, CNN, Distilbert).
Stars: ✭ 30 (-76.38%)
Mutual labels:  transformers, nltk, gensim
adapt
Awesome Domain Adaptation Python Toolbox
Stars: ✭ 46 (-63.78%)
Mutual labels:  scikit-learn, transfer-learning
Autogluon
AutoGluon: AutoML for Text, Image, and Tabular Data
Stars: ✭ 3,920 (+2986.61%)
Mutual labels:  scikit-learn, transfer-learning
Crime Analysis
Association Rule Mining from Spatial Data for Crime Analysis
Stars: ✭ 20 (-84.25%)
Mutual labels:  scikit-learn, gensim
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (+2584.25%)
Mutual labels:  transformers, transfer-learning
Ryuzaki bot
Simple chatbot in Python using NLTK and scikit-learn
Stars: ✭ 28 (-77.95%)
Mutual labels:  scikit-learn, nltk
Doc2vec
📓 Long(er) text representation and classification using Doc2Vec embeddings
Stars: ✭ 92 (-27.56%)
Mutual labels:  scikit-learn, gensim
Arch-Data-Science
Archlinux PKGBUILDs for Data Science, Machine Learning, Deep Learning, NLP and Computer Vision
Stars: ✭ 92 (-27.56%)
Mutual labels:  scikit-learn, spacy
Text Classification
Machine Learning and NLP: Text Classification using python, scikit-learn and NLTK
Stars: ✭ 239 (+88.19%)
Mutual labels:  scikit-learn, nltk
Shallowlearn
An experiment about re-implementing supervised learning models based on shallow neural network approaches (e.g. fastText) with some additional exclusive features and nice API. Written in Python and fully compatible with Scikit-learn.
Stars: ✭ 196 (+54.33%)
Mutual labels:  scikit-learn, gensim
udacity-cvnd-projects
My solutions to the projects assigned for the Udacity Computer Vision Nanodegree
Stars: ✭ 36 (-71.65%)
Mutual labels:  nltk, transfer-learning
ParsBigBird
Persian Bert For Long-Range Sequences
Stars: ✭ 58 (-54.33%)
Mutual labels:  transformers, transfer-learning
pygrams
Extracts key terminology (n-grams) from any large collection of documents (>1000) and forecasts emergence
Stars: ✭ 52 (-59.06%)
Mutual labels:  scikit-learn, nltk
Bert Sklearn
a sklearn wrapper for Google's BERT model
Stars: ✭ 182 (+43.31%)
Mutual labels:  scikit-learn, transfer-learning
Semantic-Textual-Similarity
Natural Language Processing using NLTK and Spacy
Stars: ✭ 30 (-76.38%)
Mutual labels:  spacy, nltk

ODSC Europe 2020 Workshop

Advanced NLP: From Essentials to Deep Transfer Learning

Abstract:

Being specialized in domains like computer vision and natural language processing is no longer a luxury but a necessity expected of any data scientist in today’s fast-paced world! Through a hands-on, interactive approach, we will cover essential concepts in NLP, with extensive examples to master state-of-the-art tools, techniques and methodologies for applying NLP to solve real-world problems. We will leverage machine learning, deep learning and deep transfer learning to learn and solve popular NLP tasks including NER, Classification, Recommendation \ Information Retrieval, Summarization, Language Translation, Q&A and Topic Models.


Session Outline

Module 1: NLP Essentials

Here we start with the basics of how to process and work with text data and strings. We look at the essential components of an NLP pipeline and get started on some of its key pieces, including POS tagging, Named Entity Recognition and text pre-processing. We will look at traditional approaches as well as newer deep transfer learning based approaches for a few of these components.

Key Focus Areas: Text Pre-processing, NER, POS Tagging
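For instance, a minimal sketch of these three steps using spaCy (one of the libraries this workshop relies on) might look like the snippet below; the sample sentence is illustrative, and the en_core_web_sm model is assumed to be installed (python -m spacy download en_core_web_sm).

```python
import spacy

# Assumes the small English model is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

# Text pre-processing: lowercase lemmas with stop words and punctuation removed
clean_tokens = [tok.lemma_.lower() for tok in doc
                if not tok.is_stop and not tok.is_punct]
print(clean_tokens)

# POS tagging: coarse-grained part-of-speech tag per token
print([(tok.text, tok.pos_) for tok in doc])

# Named Entity Recognition: entity spans with their labels
print([(ent.text, ent.label_) for ent in doc.ents])
```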


Module 2: Text Representation

Text can't be consumed directly by downstream machine learning and deep learning models, since these are, at heart, math-based models. The key focus of this module is to cover both traditional statistical methodologies and newer representation-learning methodologies that use deep learning to represent text data, including bag of words, n-grams, word embeddings, universal embeddings and contextual embeddings.

Key Focus Areas: Count-based Representations (Bag of Words, N-grams, TF-IDF), Similarity, Topics, Word Embeddings (Word2Vec, GloVe, FastText), Universal Embeddings, Contextual Embeddings (Transformers)
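As a rough illustration, the sketch below builds count-based and embedding-based representations for a tiny toy corpus; the corpus is an assumption, get_feature_names_out() requires scikit-learn >= 1.0, and the vector_size / epochs argument names assume gensim 4.x.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from gensim.models import Word2Vec

# Toy corpus purely for illustration
corpus = [
    "the sky is blue",
    "the sun is bright",
    "the sun in the sky is bright",
]

# Bag of Words with unigrams and bigrams -> sparse document-term matrix of raw counts
bow = CountVectorizer(ngram_range=(1, 2))
X_bow = bow.fit_transform(corpus)
print(bow.get_feature_names_out())

# TF-IDF re-weights counts by how informative each term is across the corpus
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(corpus)
print(X_tfidf.toarray().round(2))

# Word embeddings: train a small Word2Vec model on the tokenized corpus
tokenized = [doc.split() for doc in corpus]
w2v = Word2Vec(sentences=tokenized, vector_size=50, window=3, min_count=1, epochs=50)
print(w2v.wv["sky"][:5])
```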


Module 3: NLP Applications (Machine Learning \ Deep Learning)

We will look at several popular applications of NLP in this module and go through hands-on examples. These include movie recommendation systems using similarity, topic modeling analysis of research papers, text document summarization, language translation, text classification and sentiment analysis.

Key Focus Areas: Topic Models, Similarity \ Information Retrieval, Summarization (TextRank \ Transformers), Language Translation (seq2seq \ attention), Classification (machine learning & deep learning models)
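As one concrete example, the similarity-based recommender idea above can be sketched with TF-IDF vectors and cosine similarity; the toy movie plots below are assumptions, not workshop data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

plots = [
    "A detective hunts a serial killer in a rainy city.",
    "A rogue cop chases a murderer through dark alleys.",
    "Two friends take a road trip across the country.",
]

# Vectorize the plot descriptions and compute pairwise cosine similarities
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(plots)
sim = cosine_similarity(X)

# Recommend the most similar other title for the first movie (skip the movie itself)
query_idx = 0
best_match = sim[query_idx].argsort()[::-1][1]
print(f"Most similar to item {query_idx}: item {best_match} "
      f"(score {sim[query_idx, best_match]:.2f})")
```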


Module 4: NLP Applications with Deep Transfer Learning

We finally dive into some of the latest and best advancements of the last few years in the world of NLP, thanks to deep transfer learning. We will build a deep conceptual understanding of the transformer architecture and look at hands-on examples of text classification and multi-task NLP using transformers, where we solve NER, Q&A, sentiment analysis, summarization and translation using effective constructs like the transformers pipeline.

Key Focus Areas: Text Classification (with pre-trained embeddings, universal sentence encoders and transformers), Multi-task NLP with transformer pipelines (sentiment analysis, NER, text generation, summarization, question-answering, translation), Fine-tuning \ training transformers (tips \ guidelines) with examples, e.g. NER
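A minimal sketch of the transformers pipeline construct for a few of these tasks is shown below; each call downloads a default pre-trained model on first use, the example inputs are illustrative, and aggregation_strategy assumes a reasonably recent version of the transformers library.

```python
from transformers import pipeline

# Sentiment analysis with a default pre-trained model
sentiment = pipeline("sentiment-analysis")
print(sentiment("The workshop material was clear and genuinely useful."))

# Named Entity Recognition; aggregation_strategy groups word pieces into whole entities
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("The workshop was presented at ODSC Europe in London."))

# Question answering over a short context passage
qa = pipeline("question-answering")
print(qa(question="What does the workshop cover?",
         context="The workshop covers NLP essentials, text representation, "
                 "NLP applications and deep transfer learning."))
```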


Background Knowledge

Skills: Basic understanding of Machine Learning and Deep Learning (though we will cover some essentials)
Tools \ Languages: Python, TensorFlow \ Keras \ PyTorch, Scikit-Learn (basics)
