CAMeL-Lab / Camel_tools

License: MIT
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

Programming Languages

python

Projects that are alternatives to or similar to CAMeL Tools

Bertweet
BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)
Stars: ✭ 282 (+127.42%)
Mutual labels:  sentiment-analysis, named-entity-recognition
Sudachi
A Japanese Tokenizer for Business
Stars: ✭ 496 (+300%)
Mutual labels:  morphological-analysis, nlp-library
Informers
State-of-the-art natural language processing for Ruby
Stars: ✭ 306 (+146.77%)
Mutual labels:  sentiment-analysis, named-entity-recognition
simple NER
simple rule based named entity recognition
Stars: ✭ 29 (-76.61%)
Mutual labels:  named-entity-recognition, nlp-library
Qutuf
Qutuf (قُطُوْف): An Arabic Morphological analyzer and Part-Of-Speech tagger as an Expert System.
Stars: ✭ 84 (-32.26%)
Mutual labels:  arabic, morphological-analysis
Chatbot ner
chatbot_ner: Named Entity Recognition for chatbots.
Stars: ✭ 273 (+120.16%)
Mutual labels:  named-entity-recognition, nlp-library
Awesome Persian Nlp Ir
Curated List of Persian Natural Language Processing and Information Retrieval Tools and Resources
Stars: ✭ 460 (+270.97%)
Mutual labels:  morphological-analysis, named-entity-recognition
ar-embeddings
Sentiment Analysis for Arabic Text (tweets, reviews, and standard Arabic) using word2vec
Stars: ✭ 83 (-33.06%)
Mutual labels:  sentiment-analysis, arabic
Sentiment Analyser
ML that can extract german and english sentiment
Stars: ✭ 35 (-71.77%)
Mutual labels:  sentiment-analysis, nlp-library
Harvesttext
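Text mining and preprocessing tools (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.) using unsupervised or weakly supervised methods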
Stars: ✭ 956 (+670.97%)
Mutual labels:  sentiment-analysis, named-entity-recognition
rosette-elasticsearch-plugin
Document Enrichment plugin for Elasticsearch
Stars: ✭ 25 (-79.84%)
Mutual labels:  sentiment-analysis, named-entity-recognition
Pynlp
A pythonic wrapper for Stanford CoreNLP.
Stars: ✭ 103 (-16.94%)
Mutual labels:  sentiment-analysis, named-entity-recognition
farasapy
A Python implementation of Farasa toolkit
Stars: ✭ 69 (-44.35%)
Mutual labels:  named-entity-recognition, arabic
Danlp
DaNLP is a repository for Natural Language Processing resources for the Danish Language.
Stars: ✭ 111 (-10.48%)
Mutual labels:  named-entity-recognition, nlp-library
GrammarEngine
Grammatical Dictionary of the Russian Language (+ English, Japanese, etc.)
Stars: ✭ 68 (-45.16%)
Mutual labels:  nlp-library, morphological-analysis
Spacy
💫 Industrial-strength Natural Language Processing (NLP) in Python
Stars: ✭ 21,978 (+17624.19%)
Mutual labels:  named-entity-recognition, nlp-library
lima
The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C++ toolkit.
Stars: ✭ 75 (-39.52%)
Mutual labels:  named-entity-recognition, nlp-library
teanaps
An open-source Python library for natural language processing and text analysis.
Stars: ✭ 91 (-26.61%)
Mutual labels:  named-entity-recognition, morphological-analysis
Kagome
Self-contained Japanese Morphological Analyzer written in pure Go
Stars: ✭ 554 (+346.77%)
Mutual labels:  morphological-analysis, nlp-library
Turkish Bert Nlp Pipeline
Bert-base NLP pipeline for Turkish, Ner, Sentiment Analysis, Question Answering etc.
Stars: ✭ 85 (-31.45%)
Mutual labels:  sentiment-analysis, named-entity-recognition

CAMeL Tools
===========

.. image:: https://img.shields.io/pypi/v/camel-tools.svg
   :target: https://pypi.org/project/camel-tools
   :alt: PyPI Version

.. image:: https://img.shields.io/pypi/pyversions/camel-tools.svg
   :target: https://pypi.org/project/camel-tools
   :alt: PyPI Python Version

.. image:: https://readthedocs.org/projects/camel-tools/badge/?version=latest
   :target: https://camel-tools.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

.. image:: https://img.shields.io/pypi/l/camel-tools.svg
   :target: https://opensource.org/licenses/MIT
   :alt: MIT License

|

.. image:: camel_tools_logo.png
   :target: camel_tools_logo.png
   :alt: CAMeL Tools Logo

Introduction
------------

CAMeL Tools is a suite of Arabic natural language processing tools developed by
the `CAMeL Lab <http://camel-lab.com>`_ at
`New York University Abu Dhabi <http://nyuad.nyu.edu/>`_.

**Please use** `GitHub Issues <https://github.com/CAMeL-Lab/camel_tools/issues>`_
**to report a bug or if you need help using CAMeL Tools.**

Installation
------------

You will need Python 3.6 or later (64-bit).
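
The interpreter requirement can be checked before installing. The following is
a minimal sketch using only the standard library; the helper name
``is_supported`` is hypothetical and is not part of CAMeL Tools.

.. code-block:: python

   import struct
   import sys


   def is_supported(version=sys.version_info[:2],
                    pointer_bits=struct.calcsize("P") * 8):
       """Return True for a 64-bit interpreter running Python 3.6 or later."""
       # struct.calcsize("P") is the size of a C pointer in bytes, so
       # multiplying by 8 gives the bitness of the running interpreter.
       return tuple(version) >= (3, 6) and pointer_bits == 64


   print("Python", sys.version.split()[0], "supported:", is_supported())

Run it with the same interpreter that ``pip`` installs into, since a machine
may have several Python versions side by side.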

Linux/macOS
~~~~~~~~~~~


.. _linux-macos-install-pip:

Install using pip
^^^^^^^^^^^^^^^^^

.. code-block:: bash

   pip install camel-tools

   # or run the following if you already have camel_tools installed
   pip install camel-tools --upgrade --force-reinstall
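
To confirm that the package is importable after installation, a quick check
along the following lines may help. This is only a sketch: ``is_installed`` is
a hypothetical helper, and it assumes the package is imported under the name
``camel_tools``.

.. code-block:: python

   import importlib.util


   def is_installed(module_name):
       """Return True if a top-level module can be found on the import path."""
       # find_spec returns None when no top-level module of that name exists.
       return importlib.util.find_spec(module_name) is not None


   # Assumption: the import name is camel_tools (the PyPI name is camel-tools).
   print("camel_tools importable:", is_installed("camel_tools"))

If this prints ``False``, ``pip`` probably installed into a different Python
environment than the one running the check.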


.. _linux-macos-install-source:

Install from source
^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   # Clone the repo
   git clone https://github.com/CAMeL-Lab/camel_tools.git
   cd camel_tools

   # Install from source
   pip install .

   # or run the following if you already have camel_tools installed
   pip install --upgrade --force-reinstall .

.. _linux-macos-install-data:

Installing data
^^^^^^^^^^^^^^^

To install the data sets required by CAMeL Tools components, run one of the
following commands:

.. code-block:: bash

   # To install all data sets
   camel_data full

   # or for a lightweight package for morphology and MLE disambiguation only
   camel_data light

See `Available Packages <https://camel-tools.readthedocs.io/en/latest/cli/camel_data.html#available-packages>`_
for a comparison.

By default, data is stored in ``~/.camel_tools``.
Alternatively, if you would like to install the data in a different location,
you need to set the :code:`CAMELTOOLS_DATA` environment variable to the desired
path.

Add the following to your :code:`.bashrc`, :code:`.zshrc`, :code:`.profile`,
etc:

.. code-block:: bash

   export CAMELTOOLS_DATA=/path/to/camel_tools_data
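
The lookup order described above (environment variable first, then the default
location) can be sketched as follows. This mirrors the documented behavior on
Linux/macOS only; ``resolve_data_dir`` is a hypothetical helper, not the
function CAMeL Tools uses internally.

.. code-block:: python

   import os


   def resolve_data_dir(environ=None):
       """Return the data directory implied by the documented rules."""
       env = os.environ if environ is None else environ
       override = env.get("CAMELTOOLS_DATA")
       if override:
           # An explicitly configured path takes precedence.
           return override
       # Documented default on Linux/macOS: ~/.camel_tools
       return os.path.join(os.path.expanduser("~"), ".camel_tools")


   print("Data directory:", resolve_data_dir())

Running this in a new shell is a quick way to check that the ``export`` line
was picked up from your shell start-up file.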

Windows
~~~~~~~

**Note:** CAMeL Tools has been tested on Windows 10. The Dialect Identification
component is not available on Windows at this time.

.. _windows-install-pip:

Install using pip
^^^^^^^^^^^^^^^^^

.. code-block:: bash

   pip install camel-tools -f https://download.pytorch.org/whl/torch_stable.html

   # or run the following if you already have camel_tools installed
   pip install --upgrade --force-reinstall -f https://download.pytorch.org/whl/torch_stable.html camel-tools

.. _windows-install-source:

Install from source
^^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   # Clone the repo
   git clone https://github.com/CAMeL-Lab/camel_tools.git
   cd camel_tools

   # Install from source
   pip install -f https://download.pytorch.org/whl/torch_stable.html .

   # or run the following if you already have camel_tools installed
   pip install --upgrade --force-reinstall -f https://download.pytorch.org/whl/torch_stable.html .

.. _windows-install-data:

Installing data
^^^^^^^^^^^^^^^

To install the data packages required by CAMeL Tools components, run one of the
following commands:

.. code-block:: bash

   # To install all data sets
   camel_data full

   # or for a lightweight package for morphology and MLE disambiguation only
   camel_data light

See `Available Packages <https://camel-tools.readthedocs.io/en/latest/cli/camel_data.html#available-packages>`_
for a comparison.

By default, data is stored in
``C:\Users\your_user_name\AppData\Roaming\camel_tools``.
Alternatively, if you would like to install the data in a different location,
you need to set the ``CAMELTOOLS_DATA`` environment variable to the desired
path. Below are the instructions to do so (on Windows 10):

* Press the **Windows** button and type ``env``.
* Click on **Edit the system environment variables (Control panel)**.
* Click on the **Environment Variables...** button.
* Click on the **New...** button under the **User variables** panel.
* Type ``CAMELTOOLS_DATA`` in the **Variable name** input box and the
  desired data path in **Variable value**. Alternatively, you can browse for the
  data directory by clicking on the **Browse Directory...** button.
* Click **OK** on all the opened windows.


Documentation
-------------

You can find the
`full online documentation here <https://camel-tools.readthedocs.io>`_ for both
the command-line tools and the Python API.

Alternatively, you can build your own local copy of the documentation as
follows:

.. code-block:: bash

   # Install dependencies
   pip install sphinx recommonmark sphinx-rtd-theme

   # Go to docs subdirectory
   cd docs

   # Build HTML docs
   make html

This should compile all the HTML documentation into ``docs/build/html``.


Citation
--------

If you find CAMeL Tools useful in your research, please cite
`our paper <https://www.aclweb.org/anthology/2020.lrec-1.868/>`_:

.. code-block:: bibtex

   @inproceedings{obeid-etal-2020-camel,
      title = "{CAM}e{L} Tools: An Open Source Python Toolkit for {A}rabic Natural Language Processing",
      author = "Obeid, Ossama  and
         Zalmout, Nasser  and
         Khalifa, Salam  and
         Taji, Dima  and
         Oudah, Mai  and
         Alhafni, Bashar  and
         Inoue, Go  and
         Eryani, Fadhl  and
         Erdmann, Alexander  and
         Habash, Nizar",
      booktitle = "Proceedings of the 12th Language Resources and Evaluation Conference",
      month = may,
      year = "2020",
      address = "Marseille, France",
      publisher = "European Language Resources Association",
      url = "https://www.aclweb.org/anthology/2020.lrec-1.868",
      pages = "7022--7032",
      abstract = "We present CAMeL Tools, a collection of open-source tools for Arabic natural language processing in Python. CAMeL Tools currently provides utilities for pre-processing, morphological modeling, Dialect Identification, Named Entity Recognition and Sentiment Analysis. In this paper, we describe the design of CAMeL Tools and the functionalities it provides.",
      language = "English",
      ISBN = "979-10-95546-34-4",
   }


License
-------

CAMeL Tools is available under the MIT license.
See the `LICENSE file
<https://github.com/CAMeL-Lab/camel_tools/blob/master/LICENSE>`_
for more info.


Contribute
----------

If you would like to contribute to CAMeL Tools, please read the
`CONTRIBUTING.rst
<https://github.com/CAMeL-Lab/camel_tools/blob/master/CONTRIBUTING.rst>`_
file.


Contributors
------------

* `Ossama Obeid <https://github.com/owo>`_
* `Go Inoue <https://github.com/go-inoue>`_
* `Bashar Alhafni <https://github.com/balhafni>`_
* `Salam Khalifa <https://github.com/slkh>`_
* `Dima Taji <https://github.com/dima-taji>`_
* `Nasser Zalmout <https://github.com/nzal>`_
* `Nizar Habash <https://github.com/nizarhabash1>`_