AnasAito / SkillNER

Licence: MIT license
A (smart) rule based NLP module to extract job skills from text

Just looking to test out SkillNer? Check out our demo.

SkillNer is an NLP module that automatically extracts skills and certifications from unstructured job postings, texts, and applicants' resumes.

SkillNer uses the EMSI database (an open source skill database) as a knowledge base linker to prevent skill duplication.

Useful links

  • Visit our website to learn about SkillNer features, how it works, and particularly explore our roadmap
  • Get started with SkillNer and get to know its API by visiting the Documentation
  • Test our Demo to see some of SkillNer's capabilities

Installation

It is easy to get started with SkillNer and take advantage of its features.

  1. First, install SkillNer with pip:
pip install skillNer
  2. Next, run the following command to install the spaCy model en_core_web_lg, which is one of the main plugins of SkillNer. Thanks to its modular nature, you can customize SkillNer's behavior just by plugging modules in or out. Don't worry about these details; they will be covered in an upcoming Tutorial section.
python -m spacy download en_core_web_lg

Note: The second installation may take a while to complete, since the spaCy en_core_web_lg model is fairly large (about 800 MB). You only need to do it once.
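
To confirm that both installation steps succeeded, one option is to check that the packages can be located before running the example. The sketch below uses only the Python standard library. The helper name missing_packages is made up for illustration, and the package names (spacy, skillNer, en_core_web_lg) are assumed from the commands above and the imports used in the usage example.

```python
from importlib.util import find_spec

# Assumed top-level package names, taken from the install commands
# and the import statements shown in this README.
REQUIRED = ("spacy", "skillNer", "en_core_web_lg")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be located as importable packages."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the packages can be found; it does not load the model, so it returns quickly even though en_core_web_lg is large.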

Example of usage

With these initial steps done, let’s dive a bit deeper into SkillNer through a worked example.

Let’s say you want to extract skills from the following job posting:

“You are a Python developer with a solid experience in web development and can manage projects. 
You quickly adapt to new environments and speak fluently English and French”

Annotating skills

We start by importing the modules we need, in particular spacy and SkillExtractor. Note that if you are using SkillNer for the first time, it might take a while to download SKILL_DB.

SKILL_DB is SkillNer's default skills database. It was built upon the EMSI skills database.

# imports
import spacy
from spacy.matcher import PhraseMatcher

# load default skills data base
from skillNer.general_params import SKILL_DB
# import skill extractor
from skillNer.skill_extractor_class import SkillExtractor

# init params of skill extractor
nlp = spacy.load("en_core_web_lg")
# init skill extractor
skill_extractor = SkillExtractor(nlp, SKILL_DB, PhraseMatcher)

# extract skills from job_description
job_description = """
You are a Python developer with a solid experience in web development
and can manage projects. You quickly adapt to new environments
and speak fluently English and French
"""

annotations = skill_extractor.annotate(job_description)

Exploit annotations

Voilà! You can now inspect the results by rendering the text with the annotated skills, using the .describe method. Note that the output of this method is literally an HTML document that gets rendered in your notebook.

(Image: example output of skillNer)

You can also work with the raw annotations directly. Below is the value of the annotations variable from the code above.

# output
{
    'text': 'you are a python developer with a solid experience in web development and can manage projects you quickly adapt to new environments and speak fluently english and french',
    'results': {
        'full_matches': [
            {
                'skill_id': 'KS122Z36QK3N5097B5JH', 
                'doc_node_value': 'web development', 
                'score': 1,
                'doc_node_id': [10, 11]
            }
        ],
        'ngram_scored': [
            {
                'skill_id': 'KS125LS6N7WP4S6SFTCK', 
                'doc_node_id': [3], 
                'doc_node_value': 'python', 
                'type': 'fullUni', 
                'score': 1, 
                'len': 1
            }, 
        # the other annotated skills
        # ...
        ]
    }
}
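
As a minimal sketch of working with this structure, the snippet below collects one entry per distinct skill_id from both result groups. It runs on a hand-copied sample limited to the two matches shown above, so it does not need SkillNer installed. The helper name collect_skills and its min_score parameter are hypothetical, not part of SkillNer, and the sketch assumes every match carries the skill_id, doc_node_value, and score keys seen in the output.

```python
# Sample copied from the output above, limited to the two matches it shows.
annotations = {
    "text": (
        "you are a python developer with a solid experience in web "
        "development and can manage projects you quickly adapt to new "
        "environments and speak fluently english and french"
    ),
    "results": {
        "full_matches": [
            {
                "skill_id": "KS122Z36QK3N5097B5JH",
                "doc_node_value": "web development",
                "score": 1,
                "doc_node_id": [10, 11],
            }
        ],
        "ngram_scored": [
            {
                "skill_id": "KS125LS6N7WP4S6SFTCK",
                "doc_node_id": [3],
                "doc_node_value": "python",
                "type": "fullUni",
                "score": 1,
                "len": 1,
            }
        ],
    },
}


def collect_skills(annotations, min_score=1):
    """Return (skill_id, matched text) pairs, one per distinct skill_id.

    min_score is a hypothetical filter: matches scoring below it are skipped.
    """
    seen = {}
    results = annotations["results"]
    for group in ("full_matches", "ngram_scored"):
        for match in results.get(group, []):
            if match["score"] >= min_score:
                # Keep the first text seen for each skill_id.
                seen.setdefault(match["skill_id"], match["doc_node_value"])
    return list(seen.items())


print(collect_skills(annotations))
# [('KS122Z36QK3N5097B5JH', 'web development'), ('KS125LS6N7WP4S6SFTCK', 'python')]
```

Keying on skill_id rather than on the matched text means that two different phrasings linked to the same database entry are counted once.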

Contribute

SkillNer is the first open source skill extractor. It is a tool dedicated to the community, and it relies on community contributions to evolve.

We did our best to make SkillNer usable and fixed many of its bugs, and we believe its key features make it ready for a variety of use cases. However, it has not yet reached 100% stability. SkillNer needs the community's help to be adapted further and to broaden its usage.

You can contribute to SkillNer either by

  1. Reporting issues. If you encounter a problem while using SkillNer, do not hesitate to report it in the issue section of our GitHub repository. You can also open an issue to suggest new features.

  2. Pushing code to our repository through pull requests, if you have fixed an issue or want to extend SkillNer's features.

  3. A third (friendly and non-technical) way to contribute to SkillNer will be released soon. So, stay tuned...

Finally, make sure to read our guidelines carefully before contributing. They specify the standards to follow so that we can understand what you want to say.

They will also help you set up SkillNer on your local machine if you plan to push code.