InstantDomain / instant-segment

Licence: Apache-2.0 License
Fast English word segmentation in Rust

Instant Segment: fast English word segmentation in Rust

Instant Segment is a fast Apache-2.0 library for English word segmentation. It is based on the Python wordsegment project written by Grant Jenks, which is in turn based on code from Peter Norvig's chapter Natural Language Corpus Data from the book Beautiful Data (Segaran and Hammerbacher, 2009).

For the microbenchmark included in this repository, Instant Segment is ~100x faster than the Python implementation. The API was carefully constructed so that multiple segmentations can share the underlying state to allow parallel usage.

How it works

Instant Segment splits a string into words by choosing the segmentation with the highest probability, given a corpus of words and their occurrence counts.

For instance, provided that choose and spain occur more frequently than chooses and pain, and that the pair choose spain occurs more frequently than chooses pain, Instant Segment can help identify the domain choosespain.com as ChooseSpain.com, which more likely matches user intent.
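
The idea can be sketched in a few lines of plain Python. This is a toy illustration in the spirit of Norvig's approach, not the code Instant Segment actually runs: the counts in UNIGRAMS and BIGRAMS are invented for the example, and the helper names word_logp and segment, as well as the unknown-word penalty, are assumptions made here.

```python
import math
from functools import lru_cache

# Hypothetical toy counts, invented for illustration only.
UNIGRAMS = {"choose": 800, "spain": 600, "chooses": 100, "pain": 500}
BIGRAMS = {("choose", "spain"): 40, ("chooses", "pain"): 1}
TOTAL = sum(UNIGRAMS.values())
MAX_WORD_LEN = max(len(word) for word in UNIGRAMS)


def word_logp(word, prev=None):
    """Log-probability of `word`, conditioned on `prev` when the pair is known."""
    if prev in UNIGRAMS and (prev, word) in BIGRAMS:
        return math.log(BIGRAMS[(prev, word)] / UNIGRAMS[prev])
    if word in UNIGRAMS:
        return math.log(UNIGRAMS[word] / TOTAL)
    # Assumed penalty for unknown words: it grows with the word's length.
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


def segment(text):
    """Return the highest-scoring split of `text` under the toy model."""

    @lru_cache(maxsize=None)
    def best(start, prev):
        # Best (score, words) for text[start:], given the previous word.
        if start == len(text):
            return 0.0, ()
        candidates = []
        last_end = min(len(text), start + MAX_WORD_LEN)
        for end in range(start + 1, last_end + 1):
            word = text[start:end]
            rest_score, rest_words = best(end, word)
            score = word_logp(word, prev) + rest_score
            candidates.append((score, (word,) + rest_words))
        return max(candidates)

    return list(best(0, None)[1])


print(segment("choosespain"))  # ['choose', 'spain']
```

With these made-up counts, the split choose + spain scores higher than chooses + pain, both because the individual words are more frequent and because the pair is more frequent.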

Read about how we built and improved Instant Segment for use in production at Instant Domain Search to help our users find relevant domains they can register.

Using the library

Python (>= 3.9)

pip install instant-segment

Rust

[dependencies]
instant-segment = "0.8.1"

Examples

The following examples assume that unigrams and bigrams already exist. The examples (Rust, Python) show how to construct these objects.

import instant_segment

segmenter = instant_segment.Segmenter(unigrams, bigrams)
search = instant_segment.Search()
segmenter.segment("instantdomainsearch", search)
print([word for word in search])

--> ['instant', 'domain', 'search']

use instant_segment::{Search, Segmenter};
use std::collections::HashMap;

let segmenter = Segmenter::from_maps(unigrams, bigrams);
let mut search = Search::default();
let words = segmenter
    .segment("instantdomainsearch", &mut search)
    .unwrap();
println!("{:?}", words.collect::<Vec<&str>>());

--> ["instant", "domain", "search"]

Check out the tests for more thorough examples: Rust, Python

Testing

To run the Rust tests, use:

cargo t -p instant-segment --all-features

You can also test the Python bindings with:

make test-python