
apache / Submarine

Licence: apache-2.0
Submarine is a Cloud Native Machine Learning Platform.

Programming Languages

java
68154 projects - #9 most used programming language

Projects that are alternatives of or similar to Submarine

Polyaxon
Machine Learning Platform for Kubernetes (MLOps tools for experimentation and automation)
Stars: ✭ 2,966 (+612.98%)
Mutual labels:  ai, notebook
Image classifier
CNN image classifier implemented in Keras Notebook 🖼️.
Stars: ✭ 139 (-66.59%)
Mutual labels:  ai, notebook
Xlearning
AI on Hadoop
Stars: ✭ 1,709 (+310.82%)
Mutual labels:  ai, yarn
Basic Mathematics For Machine Learning
The motive behind Creating this repo is to feel the fear of mathematics and do what ever you want to do in Machine Learning , Deep Learning and other fields of AI
Stars: ✭ 300 (-27.88%)
Mutual labels:  ai, notebook
Katrain
Improve your Baduk skills by training with KataGo!
Stars: ✭ 390 (-6.25%)
Mutual labels:  ai
Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+816.59%)
Mutual labels:  ai
W.i.l.l
A python written personal assistant
Stars: ✭ 377 (-9.37%)
Mutual labels:  ai
Npminstall
Make `npm install` fast and easy.
Stars: ✭ 374 (-10.1%)
Mutual labels:  yarn
Lockfile Lint
Lint an npm or yarn lockfile to analyze and detect security issues
Stars: ✭ 411 (-1.2%)
Mutual labels:  yarn
Tagui
Free RPA tool by AI Singapore
Stars: ✭ 4,257 (+923.32%)
Mutual labels:  ai
Convnetsharp
Deep Learning in C#
Stars: ✭ 390 (-6.25%)
Mutual labels:  ai
Js Stack From Scratch
🛠️⚡ Step-by-step tutorial to build a modern JavaScript stack.
Stars: ✭ 18,814 (+4422.6%)
Mutual labels:  yarn
Gank.io Unofficial Android Client
An unofficial gank.io android client
Stars: ✭ 394 (-5.29%)
Mutual labels:  notebook
Movement Tracking
UP - DOWN - LEFT - RIGHT movement tracking.
Stars: ✭ 379 (-8.89%)
Mutual labels:  ai
Kglib
Grakn Knowledge Graph Library (ML R&D)
Stars: ✭ 405 (-2.64%)
Mutual labels:  ai
Nlp Python Deep Learning
NLP in Python with Deep Learning
Stars: ✭ 374 (-10.1%)
Mutual labels:  notebook
Sourcery
Refactor Python using AI. ⭐ this repo and Sourcery Starbot will send you a PR
Stars: ✭ 372 (-10.58%)
Mutual labels:  ai
D2l Vn
An interactive book on deep learning with source code, math, and discussion. Covers many popular frameworks (TensorFlow, Pytorch & MXNet) and is used at 175 universities.
Stars: ✭ 402 (-3.37%)
Mutual labels:  notebook
Benchmarks Of Javascript Package Managers
Benchmarks of JavaScript Package Managers
Stars: ✭ 388 (-6.73%)
Mutual labels:  yarn
Awesome Npm
Awesome npm resources and tips
Stars: ✭ 3,894 (+836.06%)
Mutual labels:  yarn


What is Apache Submarine?

Apache Submarine (Submarine for short) is an end-to-end machine learning platform that lets data scientists build complete machine learning workflows. On Submarine, data scientists can carry out every stage of the ML model lifecycle: data exploration, data pipeline creation, model training, serving, and monitoring.

Why Submarine?

Some open-source and commercial projects are trying to build an end-to-end ML platform. What's the vision of Submarine?

Problems

  1. Many platforms lack easy-to-use user interfaces (API, SDK, IDE, etc.).
  2. Within the same company, data scientists on different teams often spend a lot of time re-developing feature sets and models that already exist.
  3. Data scientists focus on domain-specific tasks (e.g. click-through-rate prediction), yet they have to implement their models from scratch with the SDKs that existing platforms provide.
  4. Many platforms lack a unified workbench to manage each component in the ML lifecycle.

Theodore Levitt once said:

“People don’t want to buy a quarter-inch drill. They want a quarter-inch hole.”

Goals of Submarine

Model Training (Experiment)

  • Run and track distributed training experiments on-premises or in the cloud via an easy-to-use UI/API/SDK.
  • Make it easy for data scientists to manage experiment versions and environment dependencies.
  • Support popular machine learning frameworks, including TensorFlow, PyTorch, Horovod, and MXNet.
  • Provide pre-defined templates so that data scientists can implement domain-specific tasks easily (e.g. using the DeepFM template to build a CTR prediction model).
  • Support a range of compute resources (e.g. CPU and GPU).
  • Support Kubernetes and YARN.
  • Pipelines for training are on the backlog and will be looked into in the future.

Notebook Service

  • Submarine aims to provide a notebook service (e.g. Jupyter notebook) which allows users to manage notebook instances running on the cluster.

Model Management (Serving/versioning/monitoring, etc.)

  • Model management for model-serving/versioning/monitoring is on the roadmap.

Easy-to-use User Interface

As mentioned above, Submarine aims to provide a data-scientist-friendly UI that gives data scientists a good user experience. Here are some examples.

Example: Submit a distributed TensorFlow experiment via the Submarine Python SDK

Run a TensorFlow MNIST experiment

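# Import paths assume the 0.5.x Python SDK layout; adjust them to your installed version.
import submarine
from submarine.experiment.models.environment_spec import EnvironmentSpec
from submarine.experiment.models.experiment_meta import ExperimentMeta
from submarine.experiment.models.experiment_spec import ExperimentSpec
from submarine.experiment.models.experiment_task_spec import ExperimentTaskSpec

# Create a client for the Submarine server
submarine_client = submarine.ExperimentClient(host='http://localhost:8080')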

# The experiment's environment, which can be based on a Docker image or a Conda environment
environment = EnvironmentSpec(image='apache/submarine:tf-dist-mnist-test-1.0')

# Specify the experiment's name, the framework it uses, the namespace it runs in,
# and the entry point. Environment variables can also be passed.
# For a PyTorch job, the framework should be 'Pytorch'.
experiment_meta = ExperimentMeta(name='mnist-dist',
                                 namespace='default',
                                 framework='Tensorflow',
                                 cmd='python /var/tf_dist_mnist/dist_mnist.py --train_steps=100')
# 1 PS task with 2 CPUs and 1024 MB of memory
ps_spec = ExperimentTaskSpec(resources='cpu=2,memory=1024M',
                             replicas=1)
# 1 Worker task with the same resources
worker_spec = ExperimentTaskSpec(resources='cpu=2,memory=1024M',
                                 replicas=1)

# Wrap up the meta, environment, and task specs into an experiment.
# For a PyTorch job, the task spec keys would be 'Master' and 'Worker'.
experiment_spec = ExperimentSpec(meta=experiment_meta,
                                 environment=environment,
                                 spec={'Ps': ps_spec, 'Worker': worker_spec})

# Submit the experiment to the Submarine server
experiment = submarine_client.create_experiment(experiment_spec=experiment_spec)

# Get the experiment ID (named experiment_id to avoid shadowing the built-in id())
experiment_id = experiment['experimentId']

Query a specific experiment

submarine_client.get_experiment(experiment_id)

Wait for the experiment to finish

submarine_client.wait_for_finish(experiment_id)

Get the experiment's log

submarine_client.get_log(experiment_id)

List all running experiments

submarine_client.list_experiments(status='running')
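
The example above relies on two conventions that are easy to get wrong: the resource string format ('cpu=2,memory=1024M') and the task role names, which depend on the framework ('Ps'/'Worker' for Tensorflow, 'Master'/'Worker' for Pytorch). As a rough sketch, a local check before submission might look like the following. The helpers parse_resources and check_roles and the EXPECTED_ROLES table are hypothetical and are not part of the Submarine SDK. The table lists only the role names mentioned in this README, so a real deployment may accept more.

```python
# Hypothetical pre-submission checks; none of these names come from the Submarine SDK.

# Role names per framework, taken only from the comments in the example above.
EXPECTED_ROLES = {
    "Tensorflow": {"Ps", "Worker"},
    "Pytorch": {"Master", "Worker"},
}


def parse_resources(resources: str) -> dict:
    """Split a string such as 'cpu=2,memory=1024M' into a key/value dict."""
    parsed = {}
    for item in resources.split(","):
        key, sep, value = item.strip().partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed resource entry: {item!r}")
        parsed[key] = value
    return parsed


def check_roles(framework: str, roles) -> None:
    """Raise ValueError if a role is not listed for the given framework."""
    expected = EXPECTED_ROLES.get(framework)
    if expected is None:
        raise ValueError(f"unknown framework: {framework!r}")
    unexpected = set(roles) - expected
    if unexpected:
        raise ValueError(f"{framework} does not use roles {sorted(unexpected)}")


# Values mirror the TensorFlow example above.
print(parse_resources("cpu=2,memory=1024M"))
check_roles("Tensorflow", ["Ps", "Worker"])
check_roles("Pytorch", ["Master", "Worker"])
```

Running a check like this on the client side catches a mistyped role or resource entry before the request reaches the server. The server remains the authority on what is actually valid.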

For a quick-start, see Submarine On K8s

Example: Submit a pre-defined experiment template job

Example: Submit an experiment via Submarine UI

(Available on 0.6.0, see Roadmap)

Architecture, Design, and Requirements

To learn more about Submarine's architecture, components, requirements, and design, see Architecture-and-requirement.

Detailed design documentation and implementation notes can be found at: Implementation notes

Apache Submarine Community

Read the Apache Submarine Community Guide

How to contribute: Contributing Guide

Issue Tracking: https://issues.apache.org/jira/projects/SUBMARINE

User Document

See User Guide Home Page

Developer Document

See Developer Guide Home Page

Roadmap

Want to know more about what's coming for Submarine? Check out the roadmap: https://cwiki.apache.org/confluence/display/SUBMARINE/Roadmap

License

The Apache Submarine project is licensed under the Apache 2.0 License. See the LICENSE file for details.
