All Projects → abhimishra91 → Insight

abhimishra91 / Insight

Licence: gpl-3.0
Repository for Project Insight: NLP as a Service

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Insight

Bertviz
Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.)
Stars: ✭ 3,443 (+1299.59%)
Mutual labels:  natural-language-processing, transformer
Gpt2
PyTorch Implementation of OpenAI GPT-2
Stars: ✭ 64 (-73.98%)
Mutual labels:  natural-language-processing, transformer
Question generation
Neural question generation using transformers
Stars: ✭ 356 (+44.72%)
Mutual labels:  natural-language-processing, transformer
Laravel5 Jsonapi
Laravel 5 JSON API Transformer Package
Stars: ✭ 313 (+27.24%)
Mutual labels:  microservice, transformer
Symfony Jsonapi
JSON API Transformer Bundle for Symfony 2 and Symfony 3
Stars: ✭ 114 (-53.66%)
Mutual labels:  microservice, transformer
Awesome Bert Nlp
A curated list of NLP resources focused on BERT, attention mechanism, Transformer networks, and transfer learning.
Stars: ✭ 567 (+130.49%)
Mutual labels:  natural-language-processing, transformer
Vietnamese Electra
Electra pre-trained model using Vietnamese corpus
Stars: ✭ 55 (-77.64%)
Mutual labels:  natural-language-processing, transformer
Nlp Tutorial
Natural Language Processing Tutorial for Deep Learning Researchers
Stars: ✭ 9,895 (+3922.36%)
Mutual labels:  natural-language-processing, transformer
Transformers
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Stars: ✭ 55,742 (+22559.35%)
Mutual labels:  natural-language-processing, transformer
Multimodal Toolkit
Multimodal model for text and tabular data with HuggingFace transformers as building block for text data
Stars: ✭ 78 (-68.29%)
Mutual labels:  natural-language-processing, transformer
Hardware Aware Transformers
[ACL 2020] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Stars: ✭ 206 (-16.26%)
Mutual labels:  natural-language-processing, transformer
Transformers.jl
Julia Implementation of Transformer models
Stars: ✭ 173 (-29.67%)
Mutual labels:  natural-language-processing, transformer
Spacy Api Docker
spaCy REST API, wrapped in a Docker container.
Stars: ✭ 222 (-9.76%)
Mutual labels:  microservice, natural-language-processing
Mitie
MITIE: library and tools for information extraction
Stars: ✭ 2,693 (+994.72%)
Mutual labels:  natural-language-processing
Pytorch Sentiment Analysis
Tutorials on getting started with PyTorch and TorchText for sentiment analysis.
Stars: ✭ 3,209 (+1204.47%)
Mutual labels:  natural-language-processing
Deepnlp Models Pytorch
Pytorch implementations of various Deep NLP models in cs-224n(Stanford Univ)
Stars: ✭ 2,760 (+1021.95%)
Mutual labels:  natural-language-processing
Yivnet
Yivnet is a microservice game server base on go-kit
Stars: ✭ 237 (-3.66%)
Mutual labels:  microservice
Clean Architecture Manga
🌀 Clean Architecture with .NET6, C#10 and React+Redux. Use cases as central organizing structure, completely testable, decoupled from frameworks
Stars: ✭ 3,104 (+1161.79%)
Mutual labels:  microservice
Loc Framework
本项目是完全基于Spring Boot2和Springcloud Finchley所进行了开发的,目的是简化和统一公司内部使用微服务框架的使用方法
Stars: ✭ 238 (-3.25%)
Mutual labels:  microservice
Relational Rnn Pytorch
An implementation of DeepMind's Relational Recurrent Neural Networks in PyTorch.
Stars: ✭ 236 (-4.07%)
Mutual labels:  transformer

Project Insight

NLP as a Service

Project Insight

GitHub issues GitHub forks Github Stars GitHub license Code style: black

Contents

  1. Introduction
  2. Installation
  3. Project Details
  4. License

Introduction

Project Insight is designed to create NLP as a service with code base for both front end GUI (streamlit) and backend server (FastApi) the usage of transformers models on various downstream NLP task.

The downstream NLP tasks covered:

  • News Classification

  • Entity Recognition

  • Sentiment Analysis

  • Summarization

  • Information Extraction To Do

The user can select different models from the drop down to run the inference.

The users can also directly use the backend fastapi server to have a command line inference.

Features of the solution

  • Python Code Base: Built using Fastapi and Streamlit making the complete code base in Python.
  • Expandable: The backend is desinged in a way that it can be expanded with more Transformer based models and it will be available in the front end app automatically.
  • Micro-Services: The backend is designed with a microservices architecture, with dockerfile for each service and leveraging on Nginx as a reverse proxy to each independently running service.
    • This makes it easy to update, manitain, start, stop individual NLP services.

Installation

  • Clone the Repo.
  • Run the Docker Compose to spin up the Fastapi based backend service.
  • Run the Streamlit app with the streamlit run command.

Setup and Documentation

  1. Download the models

    • Download the models from here
    • Save them in the specific model folders inside the src_fastapi folder.
  2. Running the backend service.

    • Go to the src_fastapi folder
    • Run the Docker Compose comnand
    $ cd src_fastapi
    src_fastapi:~$ sudo docker-compose up -d
    
  3. Running the frontend app.

    • Go to the src_streamlit folder
    • Run the app with the streamlit run command
    $ cd src_streamlit
    src_streamlit:~$ streamlit run NLPfily.py
    
  4. Access to Fastapi Documentation: Since this is a microservice based design, every NLP task has its own seperate documentation

Project Details

Demonstration

Project Insight Demo

Directory Details

  • Front End: Front end code is in the src_streamlit folder. Along with the Dockerfile and requirements.txt

  • Back End: Back End code is in the src_fastapi folder.

    • This folder contains directory for each task: Classification, ner, summary...etc
    • Each NLP task has been implemented as a microservice, with its own fastapi server and requirements and Dockerfile so that they can be independently mantained and managed.
    • Each NLP task has its own folder and within each folder each trained model has 1 folder each. For example:
    - sentiment
        > app
            > api
                > distilbert
                    - model.bin
                    - network.py
                    - tokeniser files
                >roberta
                    - model.bin
                    - network.py
                    - tokeniser files
    
    • For each new model under each service a new folder will have to be added.

    • Each folder model will need the following files:

      • Model bin file.
      • Tokenizer files
      • network.py Defining the class of the model if customised model used.
    • config.json: This file contains the details of the models in the backend and the dataset they are trained on.

How to Add a new Model

  1. Fine Tune a transformer model for specific task. You can leverage the transformers-tutorials

  2. Save the model files, tokenizer files and also create a network.py script if using a customized training network.

  3. Create a directory within the NLP task with directory_name as the model name and save all the files in this directory.

  4. Update the config.json with the model details and dataset details.

  5. Update the <service>pro.py with the correct imports and conditions where the model is imported. For example for a new Bert model in Classification Task, do the following:

    • Create a new directory in classification/app/api/. Directory name bert.

    • Update config.json with following:

      "classification": {
      "model-1": {
          "name": "DistilBERT",
          "info": "This model is trained on News Aggregator Dataset from UC Irvin Machine Learning Repository. The news headlines are classified into 4 categories: **Business**, **Science and Technology**, **Entertainment**, **Health**. [New Dataset](https://archive.ics.uci.edu/ml/datasets/News+Aggregator)"
      },
      "model-2": {
          "name": "BERT",
          "info": "Model Info"
      }
      }
      
    • Update classificationpro.py with the following snippets:

      Only if customized class used

      from classification.bert import BertClass
      

      Section where the model is selected

      if model == "bert":
          self.model = BertClass()
          self.tokenizer = BertTokenizerFast.from_pretrained(self.path)
      

License

This project is licensed under the GPL-3.0 License - see the LICENSE.md file for details

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].