grailbio / Bigslice
Licence: apache-2.0
A serverless cluster computing system for the Go programming language
Stars: ✭ 469
Projects that are alternatives to or similar to Bigslice
Big Data Engineering Coursera Yandex
Big Data for Data Engineers Coursera Specialization from Yandex
Stars: ✭ 71 (-84.86%)
Mutual labels: bigdata, mapreduce
Bigdata Interview
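🎯 🌟 [Big data interview questions] A collection of big data interview questions gathered from around the web, together with my own summarized answers. Currently covers interview knowledge summaries for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.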
Stars: ✭ 857 (+82.73%)
Mutual labels: bigdata, mapreduce
gan deeplearning4j
Automatic feature engineering with Generative Adversarial Networks, built on Deeplearning4j and Apache Spark.
Stars: ✭ 19 (-95.95%)
Mutual labels: bigdata, machinelearning
Panther
Detect threats with log data and improve cloud security posture
Stars: ✭ 885 (+88.7%)
Mutual labels: bigdata, etl
Aws Etl Orchestrator
A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (-47.76%)
Mutual labels: bigdata, etl
Dpark
Python clone of Spark, a MapReduce-like framework in Python
Stars: ✭ 2,668 (+468.87%)
Mutual labels: bigdata, mapreduce
Mobius
C# and F# language binding and extensions to Apache Spark
Stars: ✭ 929 (+98.08%)
Mutual labels: bigdata, mapreduce
big data
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Stars: ✭ 34 (-92.75%)
Mutual labels: bigdata, mapreduce
Vaex
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualize and explore big tabular data at a billion rows per second 🚀
Stars: ✭ 6,793 (+1348.4%)
Mutual labels: bigdata, machinelearning
Aws Auto Terminate Idle Emr
AWS Auto Terminate Idle AWS EMR Clusters Framework is an AWS-based solution that uses AWS CloudWatch and an AWS Lambda function, written in Python with Boto3, to terminate AWS EMR clusters that have been idle for a specified period of time.
Stars: ✭ 21 (-95.52%)
Mutual labels: bigdata, etl
lectures-hse-spark
Scalable machine learning and big data analysis with Apache Spark
Stars: ✭ 20 (-95.74%)
Mutual labels: bigdata, mapreduce
Arvados
An open source platform for managing and analyzing biomedical big data
Stars: ✭ 274 (-41.58%)
Mutual labels: bigdata, cluster
Pglogical
Logical Replication extension for PostgreSQL 13, 12, 11, 10, 9.6, 9.5, 9.4 (Postgres), providing much faster replication than Slony, Bucardo or Londiste, as well as cross-version upgrades.
Stars: ✭ 455 (-2.99%)
Mutual labels: etl
Openpbs
An HPC workload manager and job scheduler for desktops, clusters, and clouds.
Stars: ✭ 427 (-8.96%)
Mutual labels: cluster
Bigslice
Bigslice is a serverless cluster data processing system for Go. Bigslice exposes a composable API that lets the user express data processing tasks as a series of data transformations that invoke user code. The Bigslice runtime then transparently parallelizes and distributes the work, using the Bigmachine library to create an ad hoc cluster on a cloud provider.
- website: bigslice.io
- API documentation: godoc.org/github.com/grailbio/bigslice
- issue tracker: github.com/grailbio/bigslice/issues
Developing Bigslice
Bigslice uses Go modules to capture its dependencies; no tooling other than the base Go install is required.
$ git clone https://github.com/grailbio/bigslice
$ cd bigslice
$ GO111MODULE=on go test
If tests fail with "socket: too many open files" errors, try increasing the maximum number of open files:
$ ulimit -n 2000