explosion / Projects

Licence: MIT
🪐 End-to-end NLP workflows from prototype to production

🪐 Project Templates

spaCy projects let you manage and share end-to-end spaCy workflows for different use cases and domains, and orchestrate training, packaging and serving your custom pipelines. You can start off by cloning a pre-defined project template, adjust it to fit your needs, load in your data, train a pipeline, export it as a Python package, upload your outputs to a remote storage and share your results with your team.
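
Each template is driven by a project.yml file that lists named commands and groups them into workflows. The following is only a minimal sketch of that idea in Python: the dictionary is a simplified, hypothetical stand-in for a parsed project.yml, and the script paths and the `resolve_workflow` helper are assumptions made for illustration, not part of spaCy.

```python
# Simplified, hypothetical stand-in for a parsed project.yml.
# Real templates define more fields (assets, vars, deps, outputs, ...).
project = {
    "workflows": {"all": ["preprocess", "train", "evaluate"]},
    "commands": [
        {"name": "preprocess", "script": ["python scripts/preprocess.py"]},
        {
            "name": "train",
            "script": ["python -m spacy train configs/config.cfg --output training/"],
        },
        {
            "name": "evaluate",
            "script": ["python -m spacy evaluate training/model-best corpus/dev.spacy"],
        },
    ],
}


def resolve_workflow(project, workflow):
    """Return the shell lines a workflow would run, in order (illustrative helper)."""
    commands = {cmd["name"]: cmd for cmd in project["commands"]}
    steps = []
    for name in project["workflows"][workflow]:
        if name not in commands:
            raise KeyError(f"workflow {workflow!r} refers to unknown command {name!r}")
        steps.extend(commands[name]["script"])
    return steps


for line in resolve_workflow(project, "all"):
    print(line)
```

The sketch only shows the ordering: a workflow is a list of command names, and each command expands to one or more script lines.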

⚠️ spaCy project templates require spaCy v3.0. You can install it from pip with pip install spacy or conda with conda install spacy -c conda-forge. Make sure to use a fresh virtual environment.

See the master branch for the previous version of this repo.

🗃 Categories

| Name | Description |
| --- | --- |
| pipelines | Templates for training NLP pipelines with different components on different corpora. |
| tutorials | Templates that work through a specific NLP use case end-to-end. |
| integrations | Templates showing integrations with third-party libraries and tools for managing your data and experiments, iterating on demos and prototypes and shipping your models into production. |
| benchmarks | Templates to reproduce our benchmarks and produce quantifiable results that are easy to compare against other systems or versions of spaCy. |
| experimental | Experimental workflows and other cutting-edge stuff to use at your own risk. |

🚀 Quickstart

Projects can be used via the new spacy project CLI. To find out more about a command, add --help. For detailed instructions, see the usage guide.

  1. Clone the project template you want to use.
    python -m spacy project clone tutorials/ner_fashion_brands
    
  2. Fetch assets (data, weights) defined in the project.yml.
    cd ner_fashion_brands
    python -m spacy project assets
    
  3. Run a command defined in the project.yml.
    python -m spacy project run preprocess
    
  4. Run a workflow of multiple steps in order.
    python -m spacy project run all
    
  5. Adjust the template for your specific use case, load in your own data, adjust the settings and model and share the result with your team.
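
If you would rather drive steps 1 to 4 from a script, they can be sketched in Python roughly as follows. This is a hedged example, not an official API: `quickstart_steps` and `run_quickstart` are hypothetical helper names, and the sketch assumes spaCy v3 is installed in the current environment and that the cloned directory is named after the last part of the template path, as in the `cd ner_fashion_brands` step above.

```python
import subprocess
import sys
from pathlib import PurePosixPath


def quickstart_steps(template):
    """Build (working_dir, argv) pairs mirroring quickstart steps 1 to 4."""
    # Assumption: the clone lands in a directory named after the template.
    project_dir = PurePosixPath(template).name
    spacy_project = [sys.executable, "-m", "spacy", "project"]
    return [
        (None, spacy_project + ["clone", template]),           # step 1
        (project_dir, spacy_project + ["assets"]),              # step 2
        (project_dir, spacy_project + ["run", "preprocess"]),   # step 3
        (project_dir, spacy_project + ["run", "all"]),          # step 4
    ]


def run_quickstart(template):
    """Run each step in order and stop at the first failing command."""
    for cwd, argv in quickstart_steps(template):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_quickstart("tutorials/ner_fashion_brands")` would then issue the same commands as the list above. Keeping the command construction separate from the execution makes it easy to print or inspect the steps before running anything.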

👷‍♀️ Repository maintenance

To keep the project templates and their documentation up to date, this repo contains several scripts:

| Script | Description |
| --- | --- |
| update_docs.py | Update all auto-generated docs in the given root. Calls into spacy project document and only replaces the auto-generated sections, not any custom content before or after. |
| update_category_docs.py | Update the auto-generated README.md in the category directories listing the available project templates. |
| update_configs.py | Update and auto-fill all config.cfg files included in the repo, similar to spacy init fill-config. Can be used to keep the configs up to date with changes in spaCy. |
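
The idea of replacing only the auto-generated section while leaving custom content untouched can be illustrated with a small marker-based sketch. The marker strings and the `replace_generated` function below are hypothetical and chosen for illustration; they are not taken from update_docs.py or from spaCy itself.

```python
# Hypothetical markers delimiting the auto-generated part of a README.
START = "<!-- AUTO-GENERATED DOCS START -->"
END = "<!-- AUTO-GENERATED DOCS END -->"


def replace_generated(existing, generated):
    """Swap the text between the markers, keeping everything before and after."""
    block = f"{START}\n{generated.strip()}\n{END}"
    start = existing.find(START)
    end = existing.find(END)
    if start == -1 or end == -1 or end < start:
        # No marked section yet: append the block after the existing content.
        if not existing.strip():
            return block + "\n"
        return existing.rstrip() + "\n\n" + block + "\n"
    return existing[:start] + block + existing[end + len(END):]


readme = f"Intro\n\n{START}\nold docs\n{END}\n\nCustom notes\n"
print(replace_generated(readme, "new docs"))
```

Only the text between the two markers changes, so hand-written sections before or after them survive repeated regeneration.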