
featureform / embeddinghub

License: MPL-2.0
A vector database for machine learning embeddings.

Programming Languages

Go
JavaScript
C++
Python
Starlark
Shell

Projects that are alternatives of or similar to embeddinghub

Milvus
An open-source vector database for embedding similarity search and AI applications.
Stars: ✭ 9,015 (+1297.67%)
Mutual labels:  vector-database, embeddings-similarity
CODER
CODER: Knowledge infused cross-lingual medical term embedding for term normalization. [JBI, ACL-BioNLP 2022]
Stars: ✭ 24 (-96.28%)
Mutual labels:  embeddings
wolfram-notebook-embedder
JavaScript embedder for Wolfram Cloud notebooks
Stars: ✭ 48 (-92.56%)
Mutual labels:  embeddings
whatlies
Toolkit to help understand "what lies" in word embeddings. Also benchmarking!
Stars: ✭ 351 (-45.58%)
Mutual labels:  embeddings
lda2vec
Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this paper https://arxiv.org/abs/1605.02019
Stars: ✭ 27 (-95.81%)
Mutual labels:  embeddings
SentimentAnalysis
Sentiment Analysis: Deep Bi-LSTM+attention model
Stars: ✭ 32 (-95.04%)
Mutual labels:  embeddings
word2vec-tsne
Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE.
Stars: ✭ 59 (-90.85%)
Mutual labels:  embeddings
milvus cli
Milvus Command Line
Stars: ✭ 19 (-97.05%)
Mutual labels:  vector-database
relation-network
TensorFlow implementation of Relation Networks for the bAbI QA task, detailed in "A Simple Neural Network Module for Relational Reasoning" [https://arxiv.org/abs/1706.01427] by Santoro et al.
Stars: ✭ 45 (-93.02%)
Mutual labels:  embeddings
code-compass
a contextual search engine for software packages built on import2vec embeddings (https://www.code-compass.com)
Stars: ✭ 33 (-94.88%)
Mutual labels:  embeddings
attu
Milvus management GUI
Stars: ✭ 51 (-92.09%)
Mutual labels:  vector-database
Network-Embedding-Resources
Network Embedding Survey and Resources
Stars: ✭ 43 (-93.33%)
Mutual labels:  embeddings
EmbeddedScrollView
Embedded UIScrollView for iOS.
Stars: ✭ 55 (-91.47%)
Mutual labels:  embeddings
Awesome-Machine-Learning-Papers
📖Notes and remarks on Machine Learning related papers
Stars: ✭ 35 (-94.57%)
Mutual labels:  embeddings
Text and Audio classification with Bert
Text Classification in Turkish Texts with Bert
Stars: ✭ 34 (-94.73%)
Mutual labels:  embeddings
ncc
Neural Code Comprehension: A Learnable Representation of Code Semantics
Stars: ✭ 162 (-74.88%)
Mutual labels:  embeddings
Keras-Application-Zoo
Reference implementations of popular DL models missing from keras-applications & keras-contrib
Stars: ✭ 31 (-95.19%)
Mutual labels:  embeddings
sentiment-analysis-of-tweets-in-russian
Sentiment analysis of tweets in Russian using Convolutional Neural Networks (CNN) with Word2Vec embeddings.
Stars: ✭ 51 (-92.09%)
Mutual labels:  embeddings
reach
Load embeddings and featurize your sentences.
Stars: ✭ 17 (-97.36%)
Mutual labels:  embeddings
Recommender-Systems-with-Collaborative-Filtering-and-Deep-Learning-Techniques
Implemented User Based and Item based Recommendation System along with state of the art Deep Learning Techniques
Stars: ✭ 41 (-93.64%)
Mutual labels:  embeddings

featureform


What is Embeddinghub?

Embeddinghub is a database built for machine learning embeddings. It is designed with four goals in mind:

  • Store embeddings durably and with high availability
  • Allow for approximate nearest neighbor operations
  • Enable other operations like partitioning, sub-indices, and averaging
  • Manage versioning, access control, and rollbacks painlessly





Features

  • Supported Operations: Run approximate nearest neighbor lookups, average multiple embeddings, partition tables (spaces), cache locally while training, and more.
  • Storage: Store and index billions of vector embeddings in our storage layer.
  • Versioning: Create, manage, and roll back different versions of your embeddings.
  • Access Control: Encode different business logic and user management directly into Embeddinghub.
  • Monitoring: Keep track of how embeddings are being used, latency, throughput, and feature drift over time.
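
To make the "average multiple embeddings" operation above concrete, here is a minimal sketch in plain Python. It is not Embeddinghub's implementation and does not use its API. The function name `average_embeddings` is hypothetical, and the sketch assumes that averaging means the element-wise mean of vectors with the same number of dimensions.

```python
from typing import List, Sequence


def average_embeddings(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Return the element-wise mean of several equal-length vectors.

    Hypothetical helper for illustration only; not part of Embeddinghub.
    """
    if not vectors:
        raise ValueError("need at least one vector")
    dims = len(vectors[0])
    if any(len(v) != dims for v in vectors):
        raise ValueError("all vectors must have the same number of dimensions")
    # zip(*vectors) groups the values of each dimension together.
    return [sum(column) / len(vectors) for column in zip(*vectors)]


if __name__ == "__main__":
    # Averaging two 3-dimensional vectors.
    print(average_embeddings([[1, 0, 0], [1, 1, 0]]))  # [1.0, 0.5, 0.0]
```

An averaged vector like this is one common way to represent a group of items, for example a user by the items they interacted with, as a single point in the same space.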

What is an Embedding?

Embeddings are dense numerical representations of real-world objects and relationships, expressed as vectors. The vector space quantifies the semantic similarity between categories: embedding vectors that are close to each other are considered similar. Sometimes, embeddings are used directly, for example to power a “Similar items to this” section in an e-commerce store. Other times, they are passed to other models. In those cases, the model can share learnings across similar items rather than treating them as completely unrelated categories, as is the case with one-hot encodings. For this reason, embeddings can accurately represent sparse data like clickstreams, text, and e-commerce purchases as features for downstream models.
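
The contrast with one-hot encodings can be shown with a small sketch. It assumes cosine similarity as the closeness measure, which is a common choice but only one of several. The item names and the dense vector values below are made up for illustration and do not come from any trained model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# One-hot encodings: every item gets its own axis, so any two distinct
# items are orthogonal and look equally unrelated.
one_hot = {
    "sneaker": [1, 0, 0],
    "running_shoe": [0, 1, 0],
    "blender": [0, 0, 1],
}

# Dense embeddings (illustrative values): related items point in
# similar directions, unrelated items do not.
dense = {
    "sneaker": [0.9, 0.1],
    "running_shoe": [0.8, 0.2],
    "blender": [-0.1, 0.9],
}

if __name__ == "__main__":
    for name, table in (("one-hot", one_hot), ("dense", dense)):
        shoes = cosine_similarity(table["sneaker"], table["running_shoe"])
        mixed = cosine_similarity(table["sneaker"], table["blender"])
        print(f"{name}: sneaker~running_shoe={shoes:.2f}, sneaker~blender={mixed:.2f}")
```

With the one-hot vectors, both pairs score zero, so a downstream model gets no hint that the two shoes are related. With the dense vectors, the shoe pair scores much higher than the shoe and blender pair.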

Further Reading



Getting Started

Step 1: Install Embeddinghub client

Install the Python SDK via pip

pip install embeddinghub

Step 2: Deploy Docker container (optional)

The Embeddinghub client can be used without a server. This is useful when using embeddings in a research environment where a database server is not necessary. If that’s the case for you, skip ahead to the next step.

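Otherwise, we can use this Docker command to run Embeddinghub locally and map the container's main port to the same port on our host. Note that the -p option must come before the image name; anything after the image name is passed to the container as arguments.

docker run -p 7462:7462 featureformcom/embeddinghub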

Step 3: Initialize Python Client

If you deployed a Docker container, you can initialize the Python client.

import embeddinghub as eh

hub = eh.connect(eh.Config())

Otherwise, you can use a LocalConfig to store and index embeddings locally.

hub = eh.connect(eh.LocalConfig("data/"))

Step 4: Create a Space

Embeddings are written to and retrieved from Spaces. When creating a Space, we can also specify a version; otherwise, a default version is used.

space = hub.create_space("quickstart", dims=3)

Step 5: Upload Embeddings

We will create a dictionary of four embeddings and upload them to our new quickstart space.

embeddings = {
    "apple": [1, 0, 0],
    "orange": [1, 1, 0],
    "potato": [0, 1, 0],
    "chicken": [-1, -1, 0],
}
space.multiset(embeddings)

Step 6: Get nearest neighbors

Now we can compare apples to oranges and get the nearest neighbors.

neighbors = space.nearest_neighbors(key="apple", num=2)
print(neighbors)
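
To see which keys a nearest neighbor lookup should favor for this data, the sketch below runs an exact, brute-force search in plain Python. It is not how Embeddinghub works internally: Embeddinghub performs approximate search, and its distance metric and return format are not described here. The function name `brute_force_neighbors`, the use of Euclidean distance, and the exclusion of the query key from the results are all assumptions made for illustration.

```python
import math


def brute_force_neighbors(embeddings, key, num):
    """Return the `num` keys closest to `key` by Euclidean distance.

    Hypothetical helper: an exact search over every stored vector,
    skipping the query key itself.
    """
    query = embeddings[key]
    distances = [
        (math.dist(query, vector), name)
        for name, vector in embeddings.items()
        if name != key
    ]
    distances.sort()  # smallest distance first
    return [name for _, name in distances[:num]]


if __name__ == "__main__":
    embeddings = {
        "apple": [1, 0, 0],
        "orange": [1, 1, 0],
        "potato": [0, 1, 0],
        "chicken": [-1, -1, 0],
    }
    print(brute_force_neighbors(embeddings, key="apple", num=2))  # ['orange', 'potato']
```

A brute-force scan compares the query against every vector, which is fine for four entries but scales linearly with the size of the space. Approximate nearest neighbor indices trade a small amount of accuracy for much faster lookups on large collections.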

Contributing


Report Issues

Please help us by reporting any issues you may have while using Embeddinghub.


License
