
allenai / Allennlp

Licence: apache-2.0
An open-source NLP research library, built on PyTorch.

An Apache 2.0 NLP research library, built on PyTorch, for developing state-of-the-art deep learning models on a wide variety of linguistic tasks.


Getting Started Using the Library

If you're interested in using AllenNLP for model development, we recommend you check out the AllenNLP Guide for a thorough introduction to the library, followed by our more advanced guides on GitHub Discussions.

When you're ready to start your project, we've created a couple of template repositories that you can use as a starting place:

  • If you want to use allennlp train and config files to specify experiments, use this template. We recommend this approach.
  • If you'd prefer to use Python code to configure your experiments and run your training loop, use this template. A few things are currently a little harder in this setup (loading a saved model and using distributed training), but otherwise it's functionally equivalent to the config files setup.

In addition, there are external tutorials, including those on the AI2 AllenNLP blog.

Plugins

AllenNLP supports loading "plugins" dynamically. A plugin is just a Python package that provides custom registered classes or additional allennlp subcommands.
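To make the idea of a "registered class" concrete, here is a minimal, self-contained sketch of a name-to-class registry. It is an illustration of the general pattern only, not AllenNLP's implementation; `Registry`, `MODELS`, `MyTagger`, and the name `"my_tagger"` are all hypothetical.

```python
from typing import Callable, Dict, Type


class Registry:
    """Toy name-to-class registry (illustrative only, not AllenNLP's code)."""

    def __init__(self) -> None:
        self._classes: Dict[str, Type] = {}

    def register(self, name: str) -> Callable[[Type], Type]:
        """Return a class decorator that stores the class under `name`."""

        def decorator(cls: Type) -> Type:
            if name in self._classes:
                raise ValueError(f"{name!r} is already registered")
            self._classes[name] = cls
            return cls

        return decorator

    def by_name(self, name: str) -> Type:
        """Look up a previously registered class by its name."""
        return self._classes[name]


MODELS = Registry()  # hypothetical registry used only in this sketch


@MODELS.register("my_tagger")  # hypothetical name a plugin might choose
class MyTagger:
    def describe(self) -> str:
        return "custom tagger provided by a plugin package"
```

Importing the module that contains `MyTagger` is what triggers the registration in this sketch, which is why a plugin mechanism needs to know which modules to import.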

There is an ecosystem of open source plugins, some of which are maintained by the AllenNLP team here at AI2, and some of which are maintained by the broader community.

Plugin             Maintainer        CLI  Description
allennlp-models    AI2               No   A collection of state-of-the-art models
allennlp-semparse  AI2               No   A framework for building semantic parsers
allennlp-server    AI2               Yes  A simple demo server for serving models
allennlp-optuna    Makoto Hiramatsu  Yes  Optuna integration for hyperparameter optimization

AllenNLP will automatically find any official AI2-maintained plugins that you have installed, but for AllenNLP to find personal or third-party plugins you've installed, you also have to create either a local plugins file named .allennlp_plugins in the directory where you run the allennlp command, or a global plugins file at ~/.allennlp/plugins. The file should list the plugin modules that you want to be loaded, one per line.
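As a rough sketch of the file format described above, the following Python helpers write and read a local plugins file with one module name per line. The helper names and the module names `my_plugin` and `my_other_plugin` are hypothetical, and the reader is only an assumption about how such a file could be parsed, not AllenNLP's own loader.

```python
from pathlib import Path
from typing import Iterable, List

LOCAL_PLUGINS_FILENAME = ".allennlp_plugins"  # local file name from the text above


def write_plugins_file(directory: Path, modules: Iterable[str]) -> Path:
    """Write one plugin module name per line and return the file path."""
    path = directory / LOCAL_PLUGINS_FILENAME
    path.write_text("".join(f"{module}\n" for module in modules), encoding="utf-8")
    return path


def read_plugins_file(path: Path) -> List[str]:
    """Return the non-empty lines of a plugins file, stripped of whitespace."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]
```

For example, `write_plugins_file(Path.cwd(), ["my_plugin", "my_other_plugin"])` would create the file in the directory where you then run the allennlp command.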

To test that your plugins can be found and imported by AllenNLP, you can run the allennlp test-install command. Each discovered plugin will be logged to the terminal.

For more information about plugins, see the plugins API docs. And for information on how to create a custom subcommand to distribute as a plugin, see the subcommand API docs.

Package Overview

allennlp           An open-source NLP research library, built on PyTorch
allennlp.commands  Functionality for the CLI
allennlp.common    Utility modules that are used across the library
allennlp.data      A data processing module for loading datasets and encoding strings as integers for representation in matrices
allennlp.fairness  A module for bias mitigation and fairness algorithms and metrics
allennlp.modules   A collection of PyTorch modules for use with text
allennlp.nn        Tensor utility functions, such as initializers and activation functions
allennlp.training  Functionality for training models

Installation

AllenNLP requires Python 3.6.1 or later and PyTorch.

We support AllenNLP on Mac and Linux environments. We presently do not support Windows but are open to contributions.

Installing via conda-forge

The simplest way to install AllenNLP is using conda:

conda install -c conda-forge python=3.8 allennlp

All plugins mentioned above are similarly installable, e.g.

conda install -c conda-forge allennlp-models allennlp-semparse allennlp-server allennlp-optuna

Installing via pip

It's recommended that you install the PyTorch ecosystem before installing AllenNLP by following the instructions on pytorch.org.

After that, just run pip install allennlp.

⚠️ If you're using Python 3.7 or greater, make sure the PyPI version of dataclasses is not installed after running the above command, as it can cause issues on certain platforms. You can quickly check this by running pip freeze | grep dataclasses. If you see something like dataclasses==0.6 in the output, run pip uninstall -y dataclasses.
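The same check can be sketched in Python instead of the shell. This is a minimal example that assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library; the helper name `installed_version` is hypothetical. The built-in dataclasses module ships without distribution metadata, so a hit here should indicate the PyPI backport.

```python
from importlib import metadata  # standard library in Python 3.8+
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("dataclasses")
if version is None:
    print("No PyPI 'dataclasses' package found.")
else:
    # Mirrors the advice above; the uninstall itself is left to the user.
    print(f"Found dataclasses=={version}; consider: pip uninstall -y dataclasses")
```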

If you need pointers on setting up an appropriate Python environment or would like to install AllenNLP using a different method, see below.

Setting up a virtual environment

Conda can be used to set up a virtual environment with the version of Python required for AllenNLP. If you already have a Python 3 environment you want to use, you can skip to the 'installing via pip' section.

  1. Download and install Conda.

  2. Create a Conda environment with Python 3.7 (3.6 or 3.8 would work as well):

    conda create -n allennlp_env python=3.7
    
  3. Activate the Conda environment. You will need to activate the Conda environment in each terminal in which you want to use AllenNLP:

    conda activate allennlp_env
    

Installing the library and dependencies

Installing the library and dependencies is simple using pip.

pip install allennlp

Looking for bleeding-edge features? You can install nightly releases directly from PyPI.

AllenNLP installs a script when you install the python package, so you can run allennlp commands just by typing allennlp into a terminal. For example, you can now test your installation with allennlp test-install.

You may also want to install allennlp-models, which contains the NLP constructs to train and run our officially supported models, many of which are hosted at https://demo.allennlp.org.

pip install allennlp-models

Installing using Docker

Docker provides a containerized environment with everything set up to run AllenNLP, whether you use a GPU or just run on a CPU. It offers more isolation and consistency, and also makes it easy to distribute your environment to a compute cluster.

AllenNLP provides official Docker images with the library and all of its dependencies installed.

Once you have installed Docker, you should also install the NVIDIA Container Toolkit if you have GPUs available.

Then run the following command to get an environment that will run on GPU:

mkdir -p $HOME/.allennlp/
docker run --rm --gpus all -v $HOME/.allennlp:/root/.allennlp allennlp/allennlp:latest

You can test the Docker environment with

docker run --rm --gpus all -v $HOME/.allennlp:/root/.allennlp allennlp/allennlp:latest test-install 

If you don't have GPUs available, just omit the --gpus all flag.

Building your own Docker image

You may need to build your own AllenNLP Docker image, for example if you need a different version of PyTorch. To do so, just run make docker-image from the root of your local clone of AllenNLP.

By default this builds an image with the tag allennlp/allennlp, but you can change this to anything you want by setting the DOCKER_IMAGE_NAME flag when you call make. For example, make docker-image DOCKER_IMAGE_NAME=my-allennlp.

If you want to use a different version of Python or PyTorch, set the flags DOCKER_PYTHON_VERSION and DOCKER_TORCH_VERSION to something like 3.9 and 1.9.0-cuda10.2, respectively. These flags together determine the base image that is used. You can see the list of valid combinations in this GitHub Container Registry: github.com/allenai/docker-images/pkgs/container/pytorch.
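If you script your builds, the make invocation described above can be assembled programmatically. The sketch below only builds the argument list from the three variables named in this section; the function name `docker_image_make_command` is hypothetical, and `shlex.join` assumes Python 3.8 or later.

```python
import shlex
from typing import List, Optional


def docker_image_make_command(
    image_name: Optional[str] = None,
    python_version: Optional[str] = None,
    torch_version: Optional[str] = None,
) -> List[str]:
    """Build the `make docker-image` argument list, skipping unset overrides."""
    command = ["make", "docker-image"]
    overrides = {
        "DOCKER_IMAGE_NAME": image_name,
        "DOCKER_PYTHON_VERSION": python_version,
        "DOCKER_TORCH_VERSION": torch_version,
    }
    command += [f"{key}={value}" for key, value in overrides.items() if value is not None]
    return command


command = docker_image_make_command("my-allennlp", "3.9", "1.9.0-cuda10.2")
print(shlex.join(command))
```

The resulting list could then be passed to `subprocess.run(command, check=True, cwd=...)` with `cwd` set to the root of your local clone.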

After building the image you should be able to see it listed by running docker images allennlp.

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
allennlp/allennlp   latest              b66aee6cb593        5 minutes ago       2.38GB

Installing from source

You can also install AllenNLP by cloning our git repository:

git clone https://github.com/allenai/allennlp.git

Create a Python 3.7 or 3.8 virtual environment, and install AllenNLP in editable mode by running:

pip install -U pip setuptools wheel
pip install --editable .
pip install -r dev-requirements.txt

This makes allennlp available on your system while using the sources from your local clone of the repository.

You can test your installation with allennlp test-install. See https://github.com/allenai/allennlp-models for instructions on installing allennlp-models from source.

Running AllenNLP

Once you've installed AllenNLP, you can run the command-line interface with the allennlp command (whether you installed from pip or from source). allennlp has various subcommands such as train, evaluate, and predict. To see the full usage information, run allennlp --help.

You can test your installation by running allennlp test-install.

Issues

Everyone is welcome to file issues with either feature requests, bug reports, or general questions. As a small team with our own internal goals, we may ask for contributions if a prompt fix doesn't fit into our roadmap. To keep things tidy we will often close issues we think are answered, but don't hesitate to follow up if further discussion is needed.

Contributions

The AllenNLP team at AI2 (@allenai) welcomes contributions from the community. If you're a first-time contributor, we recommend you start by reading our CONTRIBUTING.md guide. Then have a look at our issues with the tag Good First Issue.

If you would like to contribute a larger feature, we recommend first creating an issue with a proposed design for discussion. This will prevent you from spending significant time on an implementation which has a technical limitation someone could have pointed out early on. Small contributions can be made directly in a pull request.

Pull requests (PRs) must have one approving review and no requested changes before they are merged. As AllenNLP is primarily driven by AI2, we reserve the right to reject or revert contributions that we don't think are good additions.

Citing

If you use AllenNLP in your research, please cite AllenNLP: A Deep Semantic Natural Language Processing Platform.

@inproceedings{Gardner2017AllenNLP,
  title={AllenNLP: A Deep Semantic Natural Language Processing Platform},
  author={Matt Gardner and Joel Grus and Mark Neumann and Oyvind Tafjord
    and Pradeep Dasigi and Nelson F. Liu and Matthew Peters and
    Michael Schmitz and Luke S. Zettlemoyer},
  year={2017},
  Eprint = {arXiv:1803.07640},
}

Team

AllenNLP is an open-source project backed by the Allen Institute for Artificial Intelligence (AI2). AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering. To learn more about who specifically contributed to this codebase, see our contributors page.
