All Projects → ngageoint → Scale

ngageoint / Scale

Licence: apache-2.0
Processing framework for containerized algorithms

Programming Languages

python
139335 projects - #7 most used programming language

Projects that are alternatives of or similar to Scale

Dcos
DC/OS - The Datacenter Operating System
Stars: ✭ 2,316 (+2216%)
Mutual labels:  mesos, dcos
dcos-k8s-rust-skaffold-demo
A demo of pipelining Rust application development to Kubernetes on DC/OS with Skaffold.
Stars: ✭ 40 (-60%)
Mutual labels:  mesos, dcos
marathon-slack
Integration for Marathon's Event Bus with Slack
Stars: ✭ 42 (-58%)
Mutual labels:  mesos, dcos
Dcos Commons
DC/OS SDK is a collection of tools, libraries, and documentation for easy integration of technologies such as Kafka, Cassandra, HDFS, Spark, and TensorFlow with DC/OS.
Stars: ✭ 162 (+62%)
Mutual labels:  mesos, dcos
Linkerdcosdockerfile
Linker Dcos DockerFile&DockerCompose yml file
Stars: ✭ 8 (-92%)
Mutual labels:  mesos, dcos
microservices-demo.github.io
The Microservices Demo website.
Stars: ✭ 65 (-35%)
Mutual labels:  mesos, dcos
container-orchestration
A Benchmark for Container Orchestration Systems
Stars: ✭ 19 (-81%)
Mutual labels:  mesos, dcos
dcos-deploy
Deploy, manage and orchestrate services and apps on DC/OS
Stars: ✭ 21 (-79%)
Mutual labels:  mesos, dcos
Marathon Lb
Marathon-lb is a service discovery & load balancing tool for DC/OS
Stars: ✭ 449 (+349%)
Mutual labels:  mesos, dcos
dcos-autoscaler
Autoscaler for DC/OS hosted in a cloud provider
Stars: ✭ 12 (-88%)
Mutual labels:  mesos, dcos
Vault Gatekeeper
A small service for securely delivering Vault authorization keys to Mesos tasks and ECS containers.
Stars: ✭ 83 (-17%)
Mutual labels:  mesos, dcos
Etcd Mesos
self-healing etcd on mesos!
Stars: ✭ 68 (-32%)
Mutual labels:  mesos, dcos
Acs Engine
WE HAVE MOVED: Please join us at Azure/aks-engine!
Stars: ✭ 1,049 (+949%)
Mutual labels:  mesos, dcos
Csilvm
A LVM2 CSI plugin
Stars: ✭ 49 (-51%)
Mutual labels:  dcos
Smoothrefreshlayout
一款支持上下拉刷新、越界回弹、二级刷新、横向刷新、拉伸回弹、平滑滚动、嵌套滚动的多功能刷新控件
Stars: ✭ 1,166 (+1066%)
Mutual labels:  scale
Guitarscalechart
A guitar scale chart for many popular scales, modes, and keys.
Stars: ✭ 45 (-55%)
Mutual labels:  scale
Mesos Threshold Oversubscription
Simple, threshold-based oversubscription modules for Apache Mesos
Stars: ✭ 43 (-57%)
Mutual labels:  mesos
Flyte
Accelerate your ML and Data workflows to production. Flyte is a production grade orchestration system for your Data and ML workloads. It has been battle tested at Lyft, Spotify, freenome and others and truly open-source.
Stars: ✭ 1,242 (+1142%)
Mutual labels:  scale
Tweeter
A tiny Twitter clone for DC/OS
Stars: ✭ 68 (-32%)
Mutual labels:  dcos
Ansible Dcos
[DEPRECATED] Please consider using the Ansible Roles for DC/OS maintained by the Mesosphere SRE team
Stars: ✭ 41 (-59%)
Mutual labels:  dcos

Scale

Join the chat at https://gitter.im/ngageoint/scale Build Status

Scale is a system that provides management of automated processing on a cluster of machines. It allows users to define jobs, which can be any type of script or algorithm. These jobs run on ingested source data and produce product files. The produced products can be disseminated to appropriate users and/or used to evaluate the producing algorithm in terms of performance and accuracy.

Mesos and Nodes

Scale runs across a cluster of networked machines (called nodes) that process the jobs. Scale utilizes Apache Mesos, a free and open source project, for managing the available resources on the nodes. Mesos informs Scale of available computing resources and Scale schedules jobs to run on those resources.

Ingest

Scale ingests source files using a Scale component called Strike. Strike is a process that monitors an ingest directory into which source data files are being copied. After a new source data file has been ingested, Scale produces and places jobs on the queue depending on the type of the ingested file. Many Strike processes can be run simultaneously, allowing Scale to monitor many different ingest directories.

Jobs

Scale creates jobs based on its known job types. A job type defines key characteristics about an algorithm that Scale needs to know in order to run it (what command to run, the algorithm.s inputs and outputs, etc.) Job types are labeled with versions, allowing Scale to run multiple versions of the same algorithm. Jobs may be created automatically due to an event, such as the ingest of a particular type of source data file, or they may be created manually by a user. Jobs that need to be executed are placed onto and prioritized within a queue before being scheduled onto an available node. When multiple jobs need to be run in a serial or parallel sequence, a recipe can be created that defines the job workflow.

Products

Jobs can produce products as a result of their successful execution. Products may be disseminated to users or used to analyze and improve the algorithms that produced them. Scale allows the creation of different workspaces. A workspace defines a separate location for storing source or product files. When a job is created, it is given a workspace to use for storing its results, allowing a user to control whether the job.s results are available to a wider audience or are restricted to a private workspace for the user's own use.

Scale Dependencies

Scale requires several external components to run as intended. PostgreSQL is used to store all internal system state and must be accessible to both the scheduler and web server processes. Fluentd along with Elasticsearch are used to collect and store all algorithm logs. A message broker is required for in-flight storage of internal Scale messages and must be accessible to all system components. The following versions of these services are required to support Scale:

  • Elasticsearch 6.6.2
  • Fluentd 1.4
  • PostgreSQL 9.4+
  • PostGIS 2.0+
  • Message Broker (RabbitMQ 3.6+ or Amazon SQS)

Note: We strongly recommend using managed services for PostgreSQL (AWS RDS), Messaging (AWS SQS) and Elasticsearch (AWS Elasticsearch Service), if available to you. Use of these services in Docker containers should be avoided in all but development environments. Reference the Architecture documentation for additional details on configuring supporting services.

Quick Start

While Scale can be entirely run on a pure Apache Mesos cluster, we strongly recommend using Data Center Operating System (DC/OS). DC/OS provides service discovery, load-balancing and fail-over for Scale, as well as deployment scripts for nearly all imaginable target infrastructures. This stack allows Scale users to focus on use of the framework while minimizing effort spent on deployment and configuration. A complete quick start guide can be found at:

https://ngageoint.github.io/scale/quickstart.html

Algorithm Development

Scale is designed to allow development of recipes and jobs for your domain without having to concern yourself with the complexities of cluster scheduling or data flow management. As long as your processing can be accomplished with discrete inputs on a Linux command line, it can be run in Scale. Simple examples of a complete processing chain can be found within the above quick start or you can refer to our in-depth documentation for step-by-step Scale integration:

https://ngageoint.github.io/scale/docs/algorithm_integration/index.html

Scale Development

If you want to contribute to the actual Scale open source project, we welcome your contributions. There are 2 primary components of Scale:

The links provide specific development environment setup instructions for each individual component.

Build

Scale is tested and built using a combination of Travis CI and Docker Hub. All unit test execution and documentation generation are done using Travis CI. We require that any pull request fully pass unit test checks prior to being merged. Docker Hub builds are saved to x.x.x-snapshot image tags between releases and on release tags are matched to release version.

A new release can be cut using the generate-release.sh shell script from a cloned Scale repository (where numbers refer to MAJOR MINOR PATCH versions respectively):

./generate-release.sh 4 0 0 

There is no direct connection between the Travis CI and Docker Hub builds, but both are launched via push to the GitHub repository.

Contributing

Scale was developed at the National Geospatial-Intelligence Agency (NGA). The government has "unlimited rights" and is releasing this software to increase the impact of government investments by providing developers with the opportunity to take things in new directions. The software use, modification, and distribution rights are stipulated within the Apache 2.0 license.

All pull request contributions to this project will be released under the Apache 2.0 or compatible license. Software source code previously released under an open source license and then modified by NGA staff is considered a "joint work" (see 17 USC § 101); it is partially copyrighted, partially public domain, and as a whole is protected by the copyrights of the non-government authors and must be released according to the terms of the original open source license.

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].