jina-ai / Jina

Licence: apache-2.0
Cloud-native neural search framework for 𝙖𝙣𝙮 kind of data

Programming Languages

python
139335 projects - #7 most used programming language
HTML
75241 projects
shell
77523 projects
CSS
56736 projects
Dockerfile
14818 projects
EJS
674 projects
javascript
184084 projects - #8 most used programming language

Projects that are alternatives of or similar to Jina

Kratos
A modular-designed and easy-to-use microservices framework in Go.
Stars: ✭ 15,844 (+25.57%)
Mutual labels:  framework, microservice, cloud-native
Milvus
An open-source vector database for embedding similarity search and AI applications.
Stars: ✭ 9,015 (-28.55%)
Mutual labels:  cloud-native, image-search, video-search
Light 4j
A fast, lightweight and more productive microservices framework
Stars: ✭ 3,303 (-73.82%)
Mutual labels:  microservice, cloud, cloud-native
Micro
Micro is a distributed cloud operating system
Stars: ✭ 10,778 (-14.58%)
Mutual labels:  cloud, cloud-native, framework
Go Chassis
a microservice framework for rapid development of micro services in Go with rich eco-system
Stars: ✭ 2,428 (-80.76%)
Mutual labels:  microservice, cloud-native, framework
Haystack
🔍 Haystack is an open source NLP framework that leverages Transformer models. It enables developers to implement production-ready neural search, question answering, semantic document search and summarization for a wide range of applications.
Stars: ✭ 3,409 (-72.98%)
Mutual labels:  search, semantic-search, neural-search
Vald
Vald. A Highly Scalable Distributed Vector Search Engine
Stars: ✭ 158 (-98.75%)
Mutual labels:  cloud, cloud-native
Discord Player
🎧 Complete framework to simplify the implementation of music commands using discords.js v12
Stars: ✭ 161 (-98.72%)
Mutual labels:  search, framework
Myapp
🖥️ How to build a Dockerized RESTful API application using Go.
Stars: ✭ 171 (-98.64%)
Mutual labels:  microservice, cloud-native
Stack
Golang RPC development framework
Stars: ✭ 178 (-98.59%)
Mutual labels:  microservice, framework
Wookiee
Scala based lightweight service framework using akka and other popular technologies.
Stars: ✭ 132 (-98.95%)
Mutual labels:  microservice, framework
Inclavare Containers
A novel container runtime, aka confidential container, for cloud-native confidential computing and enclave runtime ecosystem.
Stars: ✭ 173 (-98.63%)
Mutual labels:  cloud, cloud-native
Loc Framework
This project is built entirely on Spring Boot 2 and Spring Cloud Finchley; its goal is to simplify and unify how the microservice framework is used within the company.
Stars: ✭ 238 (-98.11%)
Mutual labels:  microservice, framework
Oci Cloudnative
MuShop - Cloud Native microservices demo for Oracle Cloud Infrastructure
Stars: ✭ 147 (-98.83%)
Mutual labels:  microservice, cloud
Devspace
DevSpace - The Fastest Developer Tool for Kubernetes ⚡ Automate your deployment workflow with DevSpace and develop software directly inside Kubernetes.
Stars: ✭ 2,559 (-79.72%)
Mutual labels:  microservice, cloud-native
Booster
Booster Cloud Framework
Stars: ✭ 136 (-98.92%)
Mutual labels:  cloud-native, framework
Externalsecret Operator
An operator to fetch secrets from cloud services and inject them in Kubernetes
Stars: ✭ 177 (-98.6%)
Mutual labels:  cloud, cloud-native
Cloud Ops Sandbox
Cloud Operations Sandbox is an open source tool that helps practitioners to learn Service Reliability Engineering practices from Google and apply them on their cloud services using Cloud Operations suite of tools.
Stars: ✭ 191 (-98.49%)
Mutual labels:  cloud, cloud-native
Spring Cloud Azure
Spring Cloud integration with Azure services
Stars: ✭ 197 (-98.44%)
Mutual labels:  microservice, cloud-native
Metalk8s
An opinionated Kubernetes distribution with a focus on long-term on-prem deployments
Stars: ✭ 217 (-98.28%)
Mutual labels:  cloud, cloud-native

Cloud-Native Neural Search Framework for Any Kind of Data

Jina is a neural search framework that empowers anyone to build SOTA and scalable deep learning search applications in minutes.

⏱️ Save time - The design pattern of neural search systems. Native support for PyTorch/Keras/ONNX/Paddle. Build solutions in just minutes.

🌌 All data types - Process, index, query, and understand videos, images, long/short text, audio, source code, PDFs, etc.

🌩️ Local & cloud friendly - Distributed architecture, scalable & cloud-native from day one. Same developer experience on both local and cloud.

🍱 Own your stack - Keep end-to-end stack ownership of your solution. Avoid integration pitfalls you get with fragmented, multi-vendor, generic legacy tools.

Install

pip install -U jina

More install options, including Conda, Docker, and Windows, are covered in the docs.

Documentation

Get Started

We promise you can build a scalable ResNet-powered image search service in 20 minutes or less, from scratch. If not, you can forget about Jina.

Basic Concepts

Document, Executor, and Flow are three fundamental concepts in Jina.

  • Document is the basic data type in Jina;
  • Executor is how Jina processes Documents;
  • Flow is how Jina streamlines and distributes Executors.
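
To make the division of labour concrete, here is a plain-Python toy that mimics the three roles. It does not use Jina at all: ToyDocument, ToyExecutor, FakeEmbedder, and ToyFlow are hypothetical names invented for this sketch, and the "embedding" is a made-up placeholder rather than anything a real model would produce.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyDocument:
    """Stand-in for a Document: a piece of data plus fields derived from it."""
    uri: str
    embedding: Optional[List[float]] = None


class ToyExecutor:
    """Stand-in for an Executor: processes a list of documents in place."""

    def process(self, docs: List[ToyDocument]) -> None:
        raise NotImplementedError


class FakeEmbedder(ToyExecutor):
    def process(self, docs: List[ToyDocument]) -> None:
        for d in docs:
            # Placeholder "embedding": URI length and number of digits in the URI.
            d.embedding = [float(len(d.uri)), float(sum(c.isdigit() for c in d.uri))]


class ToyFlow:
    """Stand-in for a Flow: runs the added executors one after another."""

    def __init__(self) -> None:
        self.executors: List[ToyExecutor] = []

    def add(self, executor: ToyExecutor) -> "ToyFlow":
        self.executors.append(executor)
        return self  # returning self allows chained .add(...) calls

    def post(self, docs: List[ToyDocument]) -> List[ToyDocument]:
        for executor in self.executors:
            executor.process(docs)
        return docs


docs = ToyFlow().add(FakeEmbedder()).post([ToyDocument(uri='img/00021.jpg')])
```

The real classes do far more (serialization, networking, scaling), but the shape is the same: data goes in as Documents, Executors transform them, and a Flow decides the order.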

Leveraging these three components, let's build an app that finds similar images using ResNet50.

ResNet50 Image Search in 20 Lines

💡 Preliminaries: download dataset, install PyTorch & Torchvision

from jina import DocumentArray, Document

def preproc(d: Document):
    return (d.load_uri_to_image_blob()  # load
             .set_image_blob_normalization()  # normalize color 
             .set_image_blob_channel_axis(-1, 0))  # switch color axis
docs = DocumentArray.from_files('img/*.jpg').apply(preproc)

import torchvision
model = torchvision.models.resnet50(pretrained=True)  # load ResNet50
docs.embed(model, device='cuda')  # embed on the GPU to speed things up

q = (Document(uri='img/00021.jpg')  # build query image & preprocess
     .load_uri_to_image_blob()
     .set_image_blob_normalization()
     .set_image_blob_channel_axis(-1, 0))
q.embed(model)  # embed
q.match(docs)  # find top-20 nearest neighbours, done!

Done! Now print q.matches and you'll see the URIs of the most similar images.
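
If you are curious what a nearest-neighbour match boils down to, here is a minimal pure-Python sketch. It assumes cosine distance (1 minus cosine similarity) as the metric and a brute-force scan, and it is not Jina's implementation; cosine_distance and top_k are names made up for this example.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          index: Sequence[Sequence[float]],
          k: int = 20) -> List[Tuple[int, float]]:
    """Return (position, distance) pairs of the k closest indexed vectors."""
    scored = [(i, cosine_distance(query, vec)) for i, vec in enumerate(index)]
    scored.sort(key=lambda pair: pair[1])  # smallest distance first
    return scored[:k]


# Three toy 2-d "embeddings" and a query pointing along the first axis.
index = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
nearest = top_k([1.0, 0.0], index, k=2)
```

A brute-force scan like this is fine for a folder of images; larger collections usually call for an approximate nearest-neighbour index.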

Add three lines of code to visualize them:

for m in q.matches:
    m.set_image_blob_channel_axis(0, -1).set_image_blob_inv_normalization()
q.matches.plot_image_sprites()
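
The two calls in the loop undo the preprocessing so the pictures look normal again. The pure-Python sketch below shows the idea on nested lists. It assumes the common ImageNet convention (scale to [0, 1], subtract a per-channel mean, divide by a per-channel std) and a height x width x channel layout; the constants and function names are assumptions for illustration and may differ from what Jina does internally.

```python
from typing import List

Image = List[List[List[float]]]

# Widely used ImageNet statistics (assumed here, not read from Jina).
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)


def normalize(img_hwc: Image) -> Image:
    """Scale 0-255 pixels to [0, 1], then standardize each colour channel."""
    return [[[(px[c] / 255.0 - MEAN[c]) / STD[c] for c in range(3)]
             for px in row] for row in img_hwc]


def inv_normalize(img_hwc: Image) -> Image:
    """Reverse normalize(): back to the 0-255 pixel range."""
    return [[[(px[c] * STD[c] + MEAN[c]) * 255.0 for c in range(3)]
             for px in row] for row in img_hwc]


def hwc_to_chw(img: Image) -> Image:
    """Move the channel axis from last to first, like axis -1 -> 0."""
    h, w = len(img), len(img[0])
    return [[[img[y][x][c] for x in range(w)] for y in range(h)] for c in range(3)]


def chw_to_hwc(img: Image) -> Image:
    """Move the channel axis from first to last, like axis 0 -> -1."""
    h, w = len(img[0]), len(img[0][0])
    return [[[img[c][y][x] for c in range(3)] for x in range(w)] for y in range(h)]


# A 2x2 RGB image: preprocess for the model, then restore it for plotting.
image = [[[255.0, 0.0, 0.0], [0.0, 255.0, 0.0]],
         [[0.0, 0.0, 255.0], [128.0, 128.0, 128.0]]]
model_input = hwc_to_chw(normalize(image))
restored = inv_normalize(chw_to_hwc(model_input))
```

Note the order: the channel axis is moved back first and the normalization is inverted second, mirroring the loop above.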

Sweet! FYI, you can use Keras, ONNX, or PaddlePaddle for the embedding model. Jina supports them well.

As-a-Service in 10 Extra Lines

With an extremely trivial refactoring and ten extra lines of code, you can make the local script a ready-to-serve service:

  1. Import what we need.

    from jina import Document, DocumentArray, Executor, Flow, requests
  2. Copy-paste the preprocessing step and wrap it via Executor:

    class PreprocImg(Executor):
        @requests
        def foo(self, docs: DocumentArray, **kwargs):
            for d in docs:
                (d.load_uri_to_image_blob()  # load
                 .set_image_blob_normalization()  # normalize color
                 .set_image_blob_channel_axis(-1, 0))  # switch color axis
  3. Copy-paste the embedding step and wrap it via Executor:

    class EmbedImg(Executor):
        def __init__(self, **kwargs):
            super().__init__(**kwargs)
            import torchvision
            self.model = torchvision.models.resnet50(pretrained=True)        
    
        @requests
        def foo(self, docs: DocumentArray, **kwargs):
            docs.embed(self.model)
  4. Wrap the matching step into an Executor:

    class MatchImg(Executor):
        _da = DocumentArray()
    
        @requests(on='/index')
        def index(self, docs: DocumentArray, **kwargs):
            self._da.extend(docs)
            docs.clear()  # clear content to save bandwidth
    
        @requests(on='/search')
        def foo(self, docs: DocumentArray, **kwargs):
            docs.match(self._da)
            for d in docs.traverse_flat('r,m'):  # only required for visualization
                d.convert_uri_to_datauri()  # convert to datauri
                d.pop('embedding', 'blob')  # remove unnecessary fields to save bandwidth
  5. Connect all Executors in a Flow, scale embedding to 3:

    f = (Flow(port_expose=12345, protocol='http')
         .add(uses=PreprocImg)
         .add(uses=EmbedImg, replicas=3)
         .add(uses=MatchImg))

    Plot it via f.plot('flow.svg') to visualize the Flow.

  6. Index image data and serve REST query publicly:

    with f:
        f.post('/index', DocumentArray.from_files('img/*.jpg'), show_progress=True, request_size=8)
        f.block()
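
One detail in steps 2 to 4 is easy to miss: a bare @requests handler runs for every endpoint, while @requests(on='/index') and @requests(on='/search') run only for their own endpoint. The toy below imitates that routing in plain Python. It is a sketch of the idea, not Jina's code: toy_requests, ToyExecutor, ToyPreproc, ToyMatch, and run are hypothetical names, strings stand in for Documents, and prefix matching stands in for embedding search.

```python
from typing import Callable, Dict, List, Optional


def toy_requests(func: Optional[Callable] = None, *, on: Optional[str] = None):
    """Tag a method with the endpoint it serves; '*' means any endpoint."""
    def bind(f: Callable) -> Callable:
        f._endpoint = on or '*'
        return f
    return bind(func) if func is not None else bind


class ToyExecutor:
    def dispatch(self, endpoint: str, docs: List) -> List:
        # Collect tagged methods from the class hierarchy.
        handlers: Dict[str, Callable] = {}
        for cls in reversed(type(self).__mro__):
            for attr in vars(cls).values():
                bound_to = getattr(attr, '_endpoint', None)
                if bound_to is not None:
                    handlers[bound_to] = attr
        handler = handlers.get(endpoint) or handlers.get('*')
        if handler is not None:
            handler(self, docs)
        return docs  # with no matching handler, docs pass through unchanged


class ToyPreproc(ToyExecutor):
    @toy_requests  # no endpoint given: runs for '/index' and '/search' alike
    def foo(self, docs: List) -> None:
        docs[:] = [d.lower() for d in docs]


class ToyMatch(ToyExecutor):
    def __init__(self) -> None:
        self.stored: List[str] = []

    @toy_requests(on='/index')
    def index(self, docs: List) -> None:
        self.stored.extend(docs)
        docs.clear()  # nothing needs to travel back to the caller

    @toy_requests(on='/search')
    def search(self, docs: List) -> None:
        docs[:] = [(q, [s for s in self.stored if s.startswith(q)]) for q in docs]


def run(executors: List[ToyExecutor], endpoint: str, docs: List) -> List:
    """Send docs through every executor in order, like a linear Flow."""
    for executor in executors:
        docs = executor.dispatch(endpoint, docs)
    return docs


pipeline = [ToyPreproc(), ToyMatch()]
indexed = run(pipeline, '/index', ['Cat.jpg', 'car.jpg', 'Dog.jpg'])
found = run(pipeline, '/search', ['CA'])
```

This is why the same Flow can both index and search: the preprocessing and embedding steps are shared, and only the last Executor behaves differently per endpoint.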

Done! Now query it via curl and you get the most similar images.
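
If you prefer Python's standard library over curl, the request can be sketched as below. The JSON body shape ({"data": [{"uri": ...}]}) is an assumption about the HTTP gateway's schema, so check it against the Swagger UI of your running service before relying on it. build_search_request and search are helper names invented for this example.

```python
import json
import urllib.request


def build_search_request(image_uri: str,
                         host: str = 'localhost',
                         port: int = 12345) -> urllib.request.Request:
    """Build a POST request for the /search endpoint of the Flow above."""
    payload = {'data': [{'uri': image_uri}]}  # ASSUMED schema, verify via /docs
    return urllib.request.Request(
        url=f'http://{host}:{port}/search',
        data=json.dumps(payload).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


def search(image_uri: str) -> dict:
    """Send the request and decode the JSON reply (requires a running service)."""
    with urllib.request.urlopen(build_search_request(image_uri)) as resp:
        return json.load(resp)


request = build_search_request('img/00021.jpg')
```

Calling search('img/00021.jpg') only works while the Flow from step 6 is running; building the request itself needs no network.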

Or go to http://0.0.0.0:12345/docs and test requests via the Swagger UI.

Or use a Python client to access the service:

from jina import Client, Document
from jina.types.request import Response

def print_matches(resp: Response):  # callback invoked once the request completes
    for idx, d in enumerate(resp.docs[0].matches):  # print each match of the query
        print(f'[{idx}]{d.scores["cosine"].value:.2f}: "{d.uri}"')

c = Client(protocol='http', port=12345)  # connect to localhost:12345
c.post('/search', Document(uri='img/00021.jpg'), on_done=print_matches)

By now you have probably spent about 15 minutes, and the result is an image search service with rich features:

  • Solution as microservices
  • Scale in/out any component
  • Query via HTTP/WebSocket/gRPC/Client
  • Distribute/Dockerize components
  • Async/non-blocking I/O
  • Extendable REST interface

Deploy to Kubernetes in 7 Minutes

Have another seven minutes? We'll show you how to bring your service to the next level by deploying it to Kubernetes.

  1. Create a Kubernetes cluster and get credentials (this example uses GCP; the docs list more K8s providers):
    gcloud container clusters create test --machine-type e2-highmem-2  --num-nodes 1 --zone europe-west3-a
    gcloud container clusters get-credentials test --zone europe-west3-a --project jina-showcase
  2. Move each Executor class to a separate folder with one Python file in each:
    • PreprocImg -> 📁 preproc_img/exec.py
    • EmbedImg -> 📁 embed_img/exec.py
    • MatchImg -> 📁 match_img/exec.py
  3. Push all Executors to Jina Hub:
    jina hub push preproc_img
    jina hub push embed_img
    jina hub push match_img
    You will get three Hub Executors that can be used via Docker containers.
  4. Adjust Flow a bit and open it:
    f = (Flow(name='readme-flow', port_expose=12345, infrastructure='k8s')
         .add(uses='jinahub+docker://PreprocImg')
         .add(uses='jinahub+docker://EmbedImg', replicas=3)
         .add(uses='jinahub+docker://MatchImg'))
    with f:
        f.block()

Intrigued? Find out more about Jina in our docs.

Run Quick Demo

Support

Join Us

Jina is backed by Jina AI and licensed under Apache-2.0. We are actively hiring AI engineers and solution engineers to build the next neural search ecosystem in open source.

Contributing

We welcome all kinds of contributions from the open-source community, individuals and partners. We owe our success to your active involvement.
