
denocris / MHPC-Natural-Language-Processing-Lectures

Licence: other
This is the second part of the Deep Learning Course for the Master in High-Performance Computing (SISSA/ICTP).

Programming Languages

Jupyter Notebook

Projects that are alternatives of or similar to MHPC-Natural-Language-Processing-Lectures

Laser
The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
Stars: ✭ 191 (+478.79%)
Mutual labels:  high-performance-computing
GLUE-bert4keras
GLUE benchmark code based on bert4keras
Stars: ✭ 59 (+78.79%)
Mutual labels:  natural-language-understanding
finetune-gpt2xl
Guide: Finetune GPT2-XL (1.5 Billion Parameters) and finetune GPT-NEO (2.7 B) on a single GPU with Huggingface Transformers using DeepSpeed
Stars: ✭ 353 (+969.7%)
Mutual labels:  huggingface-transformers
Libflame
High-performance object-based library for DLA computations
Stars: ✭ 197 (+496.97%)
Mutual labels:  high-performance-computing
Tf Quant Finance
High-performance TensorFlow library for quantitative finance.
Stars: ✭ 2,925 (+8763.64%)
Mutual labels:  high-performance-computing
FUTURE
A private, free, open-source search engine built on a P2P network
Stars: ✭ 19 (-42.42%)
Mutual labels:  natural-language-understanding
Qmcpack
Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids.
Stars: ✭ 160 (+384.85%)
Mutual labels:  high-performance-computing
auto-gfqg
Automatic Gap-Fill Question Generation
Stars: ✭ 17 (-48.48%)
Mutual labels:  natural-language-understanding
vpic
Vector Particle-In-Cell (VPIC) Project
Stars: ✭ 124 (+275.76%)
Mutual labels:  high-performance-computing
ModelDeployment
CRAN Task View: Model Deployment with R
Stars: ✭ 19 (-42.42%)
Mutual labels:  high-performance-computing
Relion
Image-processing software for cryo-electron microscopy
Stars: ✭ 219 (+563.64%)
Mutual labels:  high-performance-computing
Feelpp
💎 Feel++: Finite Element Embedded Language and Library in C++
Stars: ✭ 229 (+593.94%)
Mutual labels:  high-performance-computing
opensbli
A framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures.
Stars: ✭ 56 (+69.7%)
Mutual labels:  high-performance-computing
Sundials
SUNDIALS is a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. This is a mirror of current releases, and development will move here eventually. Pull requests are welcome for bug fixes and minor changes.
Stars: ✭ 194 (+487.88%)
Mutual labels:  high-performance-computing
mathinmse.github.io
Applied Mathematical Methods in Materials Engineering
Stars: ✭ 24 (-27.27%)
Mutual labels:  lecture-material
Libhermit
HermitCore: A C-based, lightweight unikernel
Stars: ✭ 190 (+475.76%)
Mutual labels:  high-performance-computing
COCO-LM
[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining
Stars: ✭ 109 (+230.3%)
Mutual labels:  natural-language-understanding
BBFN
This repository contains the implementation of the paper -- Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis
Stars: ✭ 42 (+27.27%)
Mutual labels:  huggingface-transformers
colmena
Library for steering campaigns of simulations on supercomputers
Stars: ✭ 32 (-3.03%)
Mutual labels:  high-performance-computing
Guided Missile Simulation
Guided Missile, Radar and Infrared EOS Simulation Framework written in Fortran.
Stars: ✭ 33 (+0%)
Mutual labels:  high-performance-computing

Natural Language Processing - Cristiano De Nobili

This is the second part of the Deep Learning Course for the Master in High-Performance Computing (SISSA/ICTP). It is about Natural Language Processing, in particular recent progress involving transformer-based models. I must thank the innovative startup AINDO for its support.

Cristiano holds a Ph.D. in Theoretical Physics (SISSA) and has been actively working in Deep Learning for four years. In particular, he is now part of the Bixby project, Samsung's vocal assistant. He is also a TEDx speaker (here is his talk about AI, humans, and their future) and a civil pilot (PPL). Here are his contacts:

  • If you are interested in science and tech news: LinkedIn & Twitter;
  • On my website you can find all my lectures, workshops, and talks;
  • My Instagram is about flying, traveling, and adventure. It is the social platform that I use the most.

Also have a look at the first part of the course, Introduction to Neural Networks (with PyTorch), by Alessio Ansuini, and at the third part, Deep generative models with TensorFlow 2, by Piero Coronica.

Course Outline

You can find the videos of the lectures here. This year I decided to use PyTorch as the main Deep Learning library.

  • Lecture 1: intro to NLP, text preprocessing, spaCy, common NLP tasks (NER, POS tagging, sentence classification, ...), non-contextual word embeddings, SkipGram Word2Vec coded from scratch, pre-trained GloVe with Gensim, intro to contextual word embeddings and the (self-)attention mechanism.

  • Lecture 2: the main concepts of transfer learning, transformer-based models, how BERT-like models are pre-trained and fine-tuned on downstream tasks, intro to Hugging Face's Transformers library, tokenization, language modeling with English and non-English (Italian GilBERTo and UmBERTo) pre-trained AutoModels, and some examples of NLP problems solved with the Transformers Pipeline.

  • Lecture 3: fine-tuning a pre-trained Italian RoBERTa for word-sense disambiguation, embedding geometry, clustering, and visualization (t-SNE and UMAP); this lecture is a bit more advanced. Part of this notebook uses PyTorch Lightning.
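
As a taste of Lecture 1's from-scratch SkipGram Word2Vec, here is a minimal sketch of the idea in NumPy (the lectures themselves use PyTorch; the toy corpus, window size, and full-softmax objective here are illustrative simplifications): each word is trained to predict its neighbours within a context window.

```python
import numpy as np

def skipgram_pairs(tokens, window=2):
    """Generate (center, context) index pairs: each word predicts its neighbours."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

corpus = "the cat sat on the mat".split()
vocab = sorted(set(corpus))
word2idx = {w: k for k, w in enumerate(vocab)}
tokens = [word2idx[w] for w in corpus]
pairs = skipgram_pairs(tokens)

# Two embedding matrices, as in the original Word2Vec:
# one for center words, one for context words.
rng = np.random.default_rng(0)
V, d = len(vocab), 8
W_in = rng.normal(scale=0.1, size=(V, d))
W_out = rng.normal(scale=0.1, size=(V, d))

def sgd_step(W_in, W_out, center, context, lr=0.05):
    """One full-softmax SkipGram update: increase p(context | center)."""
    h = W_in[center]                      # (d,) center-word embedding
    scores = W_out @ h                    # (V,) logits over the vocabulary
    p = np.exp(scores - scores.max())
    p /= p.sum()                          # softmax
    grad = p.copy()
    grad[context] -= 1.0                  # dL/dscores for cross-entropy
    grad_in = W_out.T @ grad              # dL/dh
    W_out -= lr * np.outer(grad, h)       # in-place array update
    W_in[center] -= lr * grad_in

for _ in range(50):                       # a few passes over the toy corpus
    for c, ctx in pairs:
        sgd_step(W_in, W_out, c, ctx)
```

In practice, Word2Vec replaces the full softmax with negative sampling to avoid the O(V) cost per update.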

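The (self-)attention mechanism that closes Lecture 1 and powers the transformer-based models of Lecture 2 fits in a few lines. Here is a single-head, scaled dot-product sketch in NumPy (shapes and variable names are illustrative; the lectures work in PyTorch):

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one head.

    X: (seq_len, d_model) token embeddings; Wq/Wk/Wv: (d_model, d_k) projections.
    Returns the attended outputs and the attention weight matrix.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # (seq_len, seq_len)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V, weights

rng = rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 16, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
out, attn = self_attention(X, Wq, Wk, Wv)
# Each row of `attn` is a distribution over which tokens to attend to.
```

A full transformer layer runs several such heads in parallel, concatenates their outputs, and follows them with a position-wise feed-forward network.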
Useful links and references are inside each notebook. For any doubts or questions, feel free to contact me!
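
The embedding-geometry part of Lecture 3 ultimately comes down to comparing vectors by cosine similarity: contextual embeddings of occurrences with the same word sense cluster together. A toy NumPy sketch (the 2-D vectors here are made up; the lecture extracts real, high-dimensional ones from a pre-trained Italian RoBERTa):

```python
import numpy as np

def nearest(query, embeddings, k=2):
    """Indices of the k rows of `embeddings` most cosine-similar to `query`."""
    E = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    return np.argsort(-(E @ q))[:k]

# Toy 2-D "contextual embeddings": two word senses forming two clusters.
emb = np.array([[1.0, 0.1],    # sense A
                [0.9, 0.2],    # sense A
                [0.1, 1.0],    # sense B
                [0.05, 0.9]])  # sense B
query = np.array([1.0, 0.0])   # a new occurrence, clearly sense A
neighbours = nearest(query, emb)  # both neighbours come from the sense-A cluster
```

For visualization, such high-dimensional vectors are projected to 2-D with t-SNE or UMAP, as done in the lecture notebook.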

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].