NVIDIA / Aistore

License: MIT
AIStore: scalable storage for AI applications

Programming Languages

go

Projects that are alternatives of or similar to Aistore

Awesome Graal
A curated list of awesome resources for Graal, GraalVM, Truffle and related topics
Stars: ✭ 302 (-17.71%)
Mutual labels:  high-performance
Windterm
A quicker and better cross-platform SSH/Sftp/Shell/Telnet/Serial client.
Stars: ✭ 345 (-5.99%)
Mutual labels:  high-performance
Numpy Stl
Simple library to make working with STL files (and 3D objects in general) fast and easy.
Stars: ✭ 356 (-3%)
Mutual labels:  high-performance
Tarsjava
Java language framework rpc source code implementation
Stars: ✭ 321 (-12.53%)
Mutual labels:  high-performance
Webkettle
A professional browser/server (B/S) tool for distributed scheduling, management, and ETL development, built on the web version of Kettle
Stars: ✭ 334 (-8.99%)
Mutual labels:  etl
Gobeansdb
Distributed object storage server from Douban Inc.
Stars: ✭ 351 (-4.36%)
Mutual labels:  object-storage
Ocbarrage
OCBarrage: a high-performance barrage (bullet-comment) rendering engine for iOS. It stays smooth while rendering 5,000 barrages at once, and it is lightweight, extensible, easy to use, and supports highly customizable animations.
Stars: ✭ 294 (-19.89%)
Mutual labels:  high-performance
Infinit
The Infinit policy-based software-defined storage platform.
Stars: ✭ 363 (-1.09%)
Mutual labels:  object-storage
Kore
Kore (https://kore.io) is an easy to use web application platform for writing scalable web APIs in C. Its main goals are security, scalability and allowing rapid development and deployment of such APIs.
Stars: ✭ 3,477 (+847.41%)
Mutual labels:  high-performance
Zenko
Zenko is the open source multi-cloud data controller: own and keep control of your data on any cloud.
Stars: ✭ 353 (-3.81%)
Mutual labels:  object-storage
Geojs
High-performance visualization and interactive data exploration of scientific and geospatial location aware datasets
Stars: ✭ 323 (-11.99%)
Mutual labels:  high-performance
Bloom Filter Scala
Bloom filter for Scala, the fastest for JVM
Stars: ✭ 333 (-9.26%)
Mutual labels:  high-performance
Gramework
Fast and Reliable Golang Web Framework
Stars: ✭ 354 (-3.54%)
Mutual labels:  high-performance
Saea
SAEA.Socket is a high-performance IOCP-based TCP framework built on .NET Standard 2.0. Src contains its application test scenarios, such as WebSocket, RPC, Redis driver, MVC WebAPI, lightweight message server, and ultra-large file transmission.
Stars: ✭ 318 (-13.35%)
Mutual labels:  high-performance
Pgagroal
High-performance connection pool for PostgreSQL
Stars: ✭ 362 (-1.36%)
Mutual labels:  high-performance
Exprtk
C++ Mathematical Expression Parsing And Evaluation Library
Stars: ✭ 301 (-17.98%)
Mutual labels:  high-performance
Dataform
Dataform is a framework for managing SQL based data operations in BigQuery, Snowflake, and Redshift
Stars: ✭ 342 (-6.81%)
Mutual labels:  etl
Wedatasphere
WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (+1.36%)
Mutual labels:  etl
Metorikku
A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (-1.63%)
Mutual labels:  etl
Aero
🚄 High-performance web server for Go.
Stars: ✭ 354 (-3.54%)
Mutual labels:  high-performance

AIStore is a lightweight object storage system with the capability to linearly scale-out with each added storage node and a special focus on petascale deep learning.


AIStore (AIS for short) is a built-from-scratch, lightweight storage stack tailored for AI apps. AIS consistently shows balanced I/O distribution and linear scalability across arbitrary numbers of clustered servers, producing performance charts that look as follows:

[Figure: I/O distribution]

The picture above comprises 120 HDDs.

The ability to scale linearly with each added disk was, and remains, one of the main incentives behind AIStore. Much of the development is also driven by the idea of offloading dataset transformation and other I/O-intensive stages of ETL pipelines to the storage cluster.

Features

  • scale-out with no downtime and no limitation;
  • comprehensive HTTP REST API to GET and PUT objects, create, destroy, list and configure buckets, and more;
  • Amazon S3 API to run unmodified S3 apps;
  • FUSE client (aisfs) to access AIS objects as files;
  • arbitrary number of extremely lightweight access points;
  • easy-to-use CLI that supports TAB auto-completions;
  • automated cluster rebalancing upon: changes in cluster membership, drive failures and attachments, bucket renames;
  • N-way mirroring (RAID-1), Reed–Solomon erasure coding, end-to-end data protection;
  • ETL offload: running user-defined extract-transform-load workloads on (and by) performance-optimized storage cluster.
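
As an illustration of the REST API item above, here is a minimal Go sketch that PUTs and then GETs an object over HTTP. The `/v1/objects/<bucket>/<object>` path layout, the bucket and object names, and the helpers `objectURL`, `putObject`, `getObject`, and `newStubServer` are assumptions made for this example and are not taken from this README; please check the AIS HTTP API documentation before relying on them. To keep the sketch runnable without a cluster, it talks to an in-memory stub server rather than to AIS.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// objectURL builds an object URL. The "/v1/objects/<bucket>/<object>" layout
// is an assumption of this sketch, not something stated in this README.
func objectURL(endpoint, bucket, object string) string {
	return endpoint + "/v1/objects/" + bucket + "/" + object
}

// putObject uploads data under bucket/object and expects HTTP 200.
func putObject(c *http.Client, endpoint, bucket, object string, data []byte) error {
	req, err := http.NewRequest(http.MethodPut, objectURL(endpoint, bucket, object), bytes.NewReader(data))
	if err != nil {
		return err
	}
	resp, err := c.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("PUT %s/%s: %s", bucket, object, resp.Status)
	}
	return nil
}

// getObject downloads bucket/object and returns its content.
func getObject(c *http.Client, endpoint, bucket, object string) ([]byte, error) {
	resp, err := c.Get(objectURL(endpoint, bucket, object))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s/%s: %s", bucket, object, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// newStubServer is a hypothetical in-memory stand-in for a cluster endpoint:
// it stores PUT bodies by URL path and serves them back on GET.
func newStubServer() *httptest.Server {
	var (
		mu    sync.Mutex
		store = map[string][]byte{}
	)
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		defer mu.Unlock()
		switch r.Method {
		case http.MethodPut:
			body, err := io.ReadAll(r.Body)
			if err != nil {
				http.Error(w, err.Error(), http.StatusBadRequest)
				return
			}
			store[r.URL.Path] = body
		case http.MethodGet:
			body, ok := store[r.URL.Path]
			if !ok {
				http.NotFound(w, r)
				return
			}
			w.Write(body)
		default:
			http.Error(w, "unsupported method", http.StatusMethodNotAllowed)
		}
	}))
}

func main() {
	srv := newStubServer()
	defer srv.Close()

	c := srv.Client()
	if err := putObject(c, srv.URL, "mybucket", "hello.txt", []byte("hello, AIS")); err != nil {
		panic(err)
	}
	data, err := getObject(c, srv.URL, "mybucket", "hello.txt")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s\n", data)
}
```

To try the same calls against a real deployment, replace `srv.URL` with the address of an AIS access point and drop the stub.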

Also, AIStore:

  • can be deployed on any commodity hardware;
  • supports single-command infrastructure and software deployment on Google Cloud Platform via ais-k8s GitHub repo;
  • supports Amazon S3, Google Cloud, and Microsoft Azure backends (and all S3, GCS, and Azure-compliant object storages);
  • provides unified global namespace across (ad-hoc) connected AIS clusters;
  • can be used as a fast cache for GCS and S3; can be populated on-demand and/or via prefetch and download APIs;
  • can be used as a standalone highly-available protected storage;
  • includes MapReduce extension for massively parallel resharding of very large datasets;
  • supports existing PyTorch and TensorFlow-based training models.

AIS runs natively on Kubernetes and stores data in an open format, so you are free to copy or move your data out of AIS at any time using familiar Linux tools such as tar(1), scp(1), and rsync(1).

For AIStore white paper and design philosophy, for introduction to large-scale deep learning and the most recently added features, please see AIStore Overview (where you can also find six alternative ways to work with existing datasets). Videos and animated presentations can be found at videos. To get started with AIS, please click on Getting Started.

Introduction

AIStore supports numerous deployment options covering a spectrum from a single laptop to petascale bare-metal clusters of any size. This includes:

| Deployment option | Targeted audience and objective |
| --- | --- |
| Local playground | AIS developers and development, Linux or Mac OS |
| Minimal production-ready deployment | This option utilizes a preinstalled docker image and targets first-time users or researchers (who could immediately start training their models on smaller datasets) |
| Easy automated GCP/GKE deployment | Developers, first-time users, AI researchers |
| Large-scale production deployment | Requires Kubernetes and is provided (documented, automated) via a separate repository: ais-k8s |

For detailed information on these and other supported options, and for step-by-step instructions, please refer to Getting Started.

Monitoring

As is usually the case with storage clusters, there are multiple ways to monitor their performance.

AIStore includes aisloader, a tool to stress-test and benchmark storage performance. For background, command-line options, and usage, please see Load Generator and How To Benchmark AIStore.

For starters, AIS collects and logs a fairly large and growing number of counters that describe all aspects of its operation, including (but not limited to) those that reflect cluster recovery/rebalancing, all extended long-running operations, and, of course, object storage transactions.

In particular:

The logging interval is called stats_time (default 10s) and is configurable both per node and for the entire cluster.

As for monitoring AIS remotely, the two most obvious ways are AIS CLI and Graphite/Grafana.

As for Graphite/Grafana, AIS integrates with these popular backends via StatsD, a daemon for easy but powerful stats aggregation. StatsD can be connected to Graphite, which can then be used as a data source for Grafana to get a visual overview of the statistics and metrics.

The scripts for easy deployment of both Graphite and Grafana are included (see below).

For local non-containerized deployments, use ./deploy/dev/local/deploy_grafana.sh to start Graphite and Grafana containers. Local deployment scripts will automatically "notice" the presence of the containers and will send statistics to Graphite.

For local docker-compose based deployments, make sure to use the -grafana command-line option. The ./deploy/dev/docker/deploy_docker.sh script will then spin up Graphite and Grafana containers.

In both of these cases, Grafana will be accessible at localhost:3000.

For information on AIS statistics, please see Statistics, Collected Metrics, Visualization.

Configuration

AIS configuration is consolidated in a single JSON template. The configuration sections and the knobs within them are meant to be self-explanatory, and most of them (all but a few) have pre-assigned default values. The template serves as a single source for all deployment-specific configurations, examples of which can be found under the folder that consolidates both containerized-development and production deployment scripts.

AIS production deployment, in particular, requires careful consideration of at least some of the configurable aspects. For example, AIS supports three logical networks and will therefore benefit, performance-wise, if provisioned with up to three isolated physical networks or VLANs. The logical networks are:

  • user (aka public)
  • intra-cluster control
  • intra-cluster data

with the corresponding JSON names, respectively:

  • hostname
  • hostname_intra_control
  • hostname_intra_data
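
The mapping above can be sketched in Go as a small struct with the three JSON names as tags. The `netConf` type, the `parseNetConf` helper, and the sample addresses are hypothetical. The fallback rule in the sketch (an unset intra-cluster hostname reuses the public one) is an assumption made for illustration and is not stated in this README, nor is the place where these keys sit inside the full configuration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// netConf carries the three JSON names listed above, one per logical network.
type netConf struct {
	Hostname             string `json:"hostname"`               // user (aka public)
	HostnameIntraControl string `json:"hostname_intra_control"` // intra-cluster control
	HostnameIntraData    string `json:"hostname_intra_data"`    // intra-cluster data
}

// parseNetConf decodes the fragment and fills unset intra-cluster hostnames
// with the public one. This fallback is an assumption of the sketch.
func parseNetConf(raw []byte) (netConf, error) {
	var c netConf
	if err := json.Unmarshal(raw, &c); err != nil {
		return netConf{}, err
	}
	if c.HostnameIntraControl == "" {
		c.HostnameIntraControl = c.Hostname
	}
	if c.HostnameIntraData == "" {
		c.HostnameIntraData = c.Hostname
	}
	return c, nil
}

func main() {
	// Sample values are made up: a public address plus a dedicated data network.
	raw := []byte(`{"hostname": "10.0.0.11", "hostname_intra_data": "10.2.0.11"}`)
	c, err := parseNetConf(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("user:         ", c.Hostname)
	fmt.Println("intra-control:", c.HostnameIntraControl)
	fmt.Println("intra-data:   ", c.HostnameIntraData)
}
```

With the sample input, the control network reuses the public address while data traffic gets its own, which corresponds to provisioning two of the three possible isolated networks.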

Assorted Tips

  • To enable an optional AIStore authentication server, execute $ AUTH_ENABLED=true make deploy. For information on AuthN server, please see AuthN documentation.
  • In addition to AIStore - the storage cluster, you can also deploy aisfs - to access AIS objects as files, and AIS CLI - to monitor, configure and manage AIS nodes and buckets.
  • AIS CLI is an easy-to-use command-line management tool supporting a growing number of commands and options (one of the first ones you may want to try could be ais show cluster - show the state and status of an AIS cluster). The CLI is documented in the readme; getting started with it boils down to running make cli and following the prompts.
  • For more testing commands and options, please refer to the testing README.
  • For aisnode command-line options, see: command-line options.
  • For helpful links and/or background on Go, AWS, GCP, and Deep Learning: helpful links.
  • Finally, run make help to find out how to build, run, and test AIStore and tools.

License

MIT

Author

Alex Aizman (NVIDIA)
