skale-me / Skale

Licence: apache-2.0

High performance distributed data processing engine

Programming Languages

javascript


Labels

nodejs machine-learning cluster aws-s3 parquet azure-storage

Projects that are alternatives of or similar to Skale

Udacity Data Engineering Projects

Few projects related to Data Engineering including Data Modeling, Infrastructure setup on cloud, Data Warehousing and Data Lake development.

Stars: ✭ 458 (+17.44%)

Mutual labels: aws-s3, cluster

Goofys

a high-performance, POSIX-ish Amazon S3 file system written in Go

Stars: ✭ 3,932 (+908.21%)

Mutual labels: aws-s3, azure-storage

databricks-notebooks

Collection of Databricks and Jupyter Notebooks

Stars: ✭ 19 (-95.13%)

Mutual labels: parquet, azure-storage

Storage

💿 Storage abstractions with implementations for .NET/.NET Standard

Stars: ✭ 380 (-2.56%)

Mutual labels: aws-s3, azure-storage

node-storage

📬 A unified file storage library for storage in cloud or on premise

Stars: ✭ 29 (-92.56%)

Mutual labels: aws-s3, azure-storage

BlobHelper

BlobHelper is a common, consistent storage interface for Microsoft Azure, Amazon S3, Komodo, Kvpbase, and local filesystem written in C#.

Stars: ✭ 23 (-94.1%)

Mutual labels: aws-s3, azure-storage

Zenko

Zenko is the open source multi-cloud data controller: own and keep control of your data on any cloud.

Stars: ✭ 353 (-9.49%)

Mutual labels: aws-s3, azure-storage

Sparklens

Qubole Sparklens tool for performance tuning Apache Spark

Stars: ✭ 345 (-11.54%)

Mutual labels: cluster

Diplomat

A HTTP Ruby API for Consul

Stars: ✭ 358 (-8.21%)

Mutual labels: cluster

Ckss Certified Kubernetes Security Specialist

This repository is a collection of resources to prepare for the Certified Kubernetes Security Specialist (CKSS) exam.

Stars: ✭ 333 (-14.62%)

Mutual labels: cluster

S3mock

A simple mock implementation of the AWS S3 API startable as Docker image, JUnit 4 rule, or JUnit Jupiter extension

Stars: ✭ 332 (-14.87%)

Mutual labels: aws-s3

Tensorflowonspark

TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.

Stars: ✭ 3,748 (+861.03%)

Mutual labels: cluster

Nebula

Nebula is a powerful framework for building highly concurrent, distributed, and resilient message-driven applications for C++.

Stars: ✭ 385 (-1.28%)

Mutual labels: cluster

Dotnext

Next generation API for .NET

Stars: ✭ 379 (-2.82%)

Mutual labels: cluster

Azure Spring Boot

Spring Boot Starters for Azure services

Stars: ✭ 352 (-9.74%)

Mutual labels: azure-storage

Parquet Cpp

Apache Parquet

Stars: ✭ 339 (-13.08%)

Mutual labels: parquet

Kontraktor

distributed Actors for Java 8 / JavaScript

Stars: ✭ 333 (-14.62%)

Mutual labels: cluster

Nodejsstarterkit

Starter Kit for Node.js v14.x, minimum dependencies 🚀

Stars: ✭ 348 (-10.77%)

Mutual labels: cluster

Swarmlet

A self-hosted, open-source Platform as a Service that enables easy swarm deployments, load balancing, automatic SSL, metrics, analytics and more.

Stars: ✭ 373 (-4.36%)

Mutual labels: cluster

Oap

Optimized Analytics Package for Spark* Platform

Stars: ✭ 343 (-12.05%)

Mutual labels: parquet


High performance distributed data processing and machine learning.

Skale provides a high-level API in JavaScript and an optimized parallel execution engine on top of Node.js.

Features

Pure JavaScript implementation of a Spark-like engine
Multiple data sources: filesystems, databases, cloud (S3, Azure)
Multiple data formats: CSV, JSON, columnar (Parquet)...
50 high-level operators to build parallel apps
Machine learning: scalable classification, regression, clustering
Run interactively in a Node.js REPL shell
Docker ready, simple local mode or full distributed mode
Very fast, see benchmark

Quickstart

npm install skale

Word count example:

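var sc = require('skale').context();

sc.textFile('/my/path/*.txt')
  .flatMap(line => line.split(' '))   // one record per word
  .map(word => [word, 1])             // pair each word with a count of 1
  .reduceByKey((a, b) => a + b, 0)    // sum the counts for each word
  .count(function (err, result) {     // number of [word, total] pairs
    if (err) console.error(err);
    else console.log(result);
    sc.end();
  });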
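
To make the data flow of this pipeline concrete, here is a minimal sketch in plain JavaScript that mimics the same steps on an in-memory array. It does not call Skale at all: `wordCounts` is a hypothetical helper written for this illustration, and it only shows what `flatMap`, `map` and a by-key reduction with an initial value of 0 would compute on a single machine.

```javascript
// Plain-JavaScript stand-in for the pipeline above; no Skale calls are used.
// `wordCounts` is a hypothetical helper, not part of any library.
function wordCounts(lines) {
  const pairs = lines
    .flatMap(line => line.split(' '))  // one entry per word
    .map(word => [word, 1]);           // pair each word with a count of 1

  // Equivalent of a reduce-by-key with (a, b) => a + b and initial value 0.
  const totals = new Map();
  for (const [word, n] of pairs) {
    totals.set(word, (totals.get(word) || 0) + n);
  }
  return totals;
}

const totals = wordCounts(['a b a', 'b c']);
console.log([...totals]);  // the [word, total] pairs
console.log(totals.size);  // how many distinct words were seen
```

In this sketch, the size of the resulting map plays the role of the final count: it is the number of distinct keys left after the reduction.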

Local mode

In local mode, worker processes are forked automatically and communicate with the app through the child process IPC channel. This is the simplest way to operate, and it lets the app use all available cores on the machine.

To run in local mode, just execute your app script:

node my_app.js

or with debug traces:

SKALE_DEBUG=2 node my_app.js

Distributed mode

In distributed mode, a cluster server process and worker processes must be started before the app. Processes communicate with each other over raw TCP or WebSockets.

To run in distributed cluster mode, first start a cluster server on server_host:

./bin/server.js

On each worker host, start a worker controller process which connects to the server:

./bin/worker.js -H server_host

Then run your app, setting the cluster server host in the environment:

SKALE_HOST=server_host node my_app.js

The same with debug traces:

SKALE_HOST=server_host SKALE_DEBUG=2 node my_app.js
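
The commands above select the mode purely through environment variables. As a rough sketch, an app could log which settings it was started with. `describeRun` is a hypothetical helper, not part of Skale, and it rests on two assumptions: that an unset `SKALE_HOST` means local mode, as the local-mode commands suggest, and that `SKALE_DEBUG` is an integer level (only the value 2 appears in this document).

```javascript
// Hypothetical helper: summarizes the environment settings used in the commands above.
// It does not reproduce how Skale itself interprets these variables.
function describeRun(env = process.env) {
  const host = env.SKALE_HOST || null;
  // Assumption: SKALE_DEBUG is an integer level; anything unparsable counts as 0.
  const debug = Number.parseInt(env.SKALE_DEBUG || '0', 10) || 0;
  return { mode: host ? 'distributed' : 'local', host, debug };
}

console.log(describeRun());
```

Calling it at the top of `my_app.js` gives a quick check that the variables reached the process as intended.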

Resources

Contributing guide
Documentation
Gitter for support and discussion
Mailing list for discussion about use and development

Authors

The original authors of skale are Cedric Artigue and Marc Vertes.

List of all contributors

License

Apache-2.0

Credits

Logo icon made by Smashicons from www.flaticon.com, licensed under CC BY 3.0.


Stars: ✭ 390
