Alternatives and detailed information of Alluxio

Alluxio / Alluxio

Licence: apache-2.0

Alluxio, data orchestration for analytics and machine learning in the cloud

Programming Languages

java

68154 projects - #9 most used programming language

typescript

32286 projects

shell

77523 projects

31211 projects - #10 most used programming language

C++

36643 projects - #6 most used programming language

Mustache

554 projects

Projects that are alternatives of or similar to Alluxio

Dockerfiles

50+ DockerHub public images for Docker & Kubernetes - Hadoop, Kafka, ZooKeeper, HBase, Cassandra, Solr, SolrCloud, Presto, Apache Drill, Nifi, Spark, Consul, Riak, TeamCity and DevOps tools built on the major Linux distros: Alpine, CentOS, Debian, Fedora, Ubuntu

Stars: ✭ 847 (-84.25%)

Mutual labels: spark, hadoop, presto

Bigdata docker

Big Data Ecosystem Docker

Stars: ✭ 161 (-97.01%)

Mutual labels: spark, hadoop, presto

leaflet heatmap

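A simple visualization of Huzhou call data. Assuming the data volume is too large to draw a heatmap directly in the browser, the heatmap rendering step is moved to offline computation and analysis. After the data is computed in parallel with Apache Spark, Apache Spark is used again to render the heatmap, and leafletjs then loads the OpenStreetMap layer and the heatmap layer for good interactivity. The rendering is currently implemented with Apache Spark; either Apache Spark is not well suited to this kind of computation or I did not design the algorithm well, because the parallel computation is slower than running on a single machine. The Apache Spark heatmap rendering and computation code is at https://github.com/yuanzhaokang/ParallelizeHeatmap.git .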

Stars: ✭ 13 (-99.76%)

Mutual labels: spark, hadoop, data-analysis

Ytk Learn

Ytk-learn is a distributed machine learning library which implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).

Stars: ✭ 337 (-93.73%)

Mutual labels: spark, hadoop

Elasticluster

Create clusters of VMs on the cloud and configure them with Ansible.

Stars: ✭ 298 (-94.46%)

Mutual labels: spark, hadoop

Zat

Zeek Analysis Tools (ZAT): Processing and analysis of Zeek network data with Pandas, scikit-learn, Kafka and Spark

Stars: ✭ 303 (-94.37%)

Mutual labels: spark, data-analysis

bigkube

Minikube for big data with Scala and Spark

Stars: ✭ 16 (-99.7%)

Mutual labels: spark, presto

Bigdl

Building Large-Scale AI Applications for Distributed Big Data

Stars: ✭ 3,813 (-29.11%)

Mutual labels: spark, hadoop

Wedatasphere

WeDataSphere is a financial level one-stop open-source suitcase for big data platforms. Currently the source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!

Stars: ✭ 372 (-93.08%)

Mutual labels: spark, hadoop

Iceberg

Iceberg is a table format for large, slow-moving tabular data

Stars: ✭ 393 (-92.69%)

Mutual labels: spark, hadoop

Marmaray

Generic Data Ingestion & Dispersal Library for Hadoop

Stars: ✭ 414 (-92.3%)

Mutual labels: spark, hadoop

Trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Stars: ✭ 4,581 (-14.84%)

Mutual labels: hadoop, presto

basin

Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser

Stars: ✭ 25 (-99.54%)

Mutual labels: spark, hadoop

Spline

Data Lineage Tracking And Visualization Solution

Stars: ✭ 306 (-94.31%)

Mutual labels: spark, hadoop

Spotify-Song-Recommendation-ML

UC Berkeley team's submission for RecSys Challenge 2018

Stars: ✭ 70 (-98.7%)

Mutual labels: spark, data-analysis

Sqlpad

Web-based SQL editor run in your own private cloud. Supports MySQL, Postgres, SQL Server, Vertica, Crate, ClickHouse, Trino, Presto, SAP HANA, Cassandra, Snowflake, BigQuery, SQLite, and more with ODBC

Stars: ✭ 4,113 (-23.54%)

Mutual labels: data-analysis, presto

Devops Python Tools

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Function, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

Stars: ✭ 406 (-92.45%)

Mutual labels: spark, hadoop

Yanagishima

Web UI for Trino, Presto, Hive, Elasticsearch, SparkSQL

Stars: ✭ 424 (-92.12%)

Mutual labels: spark, presto

Data Science Ipython Notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Stars: ✭ 22,048 (+309.89%)

Mutual labels: spark, hadoop

visions

Type System for Data Analysis in Python

Stars: ✭ 136 (-97.47%)

Mutual labels: spark, data-analysis


What is Alluxio

Alluxio (formerly known as Tachyon) is a virtual distributed storage system. It bridges the gap between computation frameworks and storage systems, enabling computation applications to connect to numerous storage systems through a common interface. Read more in the Alluxio Overview.

The Alluxio project originated from a research project called Tachyon at AMPLab, UC Berkeley, which was the data layer of the Berkeley Data Analytics Stack (BDAS). For more details, please refer to Haoyuan Li's PhD dissertation Alluxio: A Virtual Distributed File System.

Who Uses Alluxio

Alluxio is used in production to manage petabytes of data at many leading companies, with the largest deployment exceeding 3,000 nodes. You can find more use cases at Powered by Alluxio or visit our first community conference (Data Orchestration Summit) to learn from other community members!

Who Owns and Manages Alluxio Project

The Alluxio Open Source Foundation is the owner of the Alluxio project. The project is operated by the Alluxio Project Management Committee (PMC). You can check out more details about its structure and how to join the Alluxio PMC here.

Community and Events

Please use the following to reach members of the community:

Alluxio Community Slack Channel: post here if you need help with general questions or issues using Alluxio.
Special Interest Groups (SIG) for Alluxio users and developers
Community Events: upcoming online office hours, meetups and webinars
Meetup Groups: Global Online Meetup, Bay Area Meetup, New York Meetup, Beijing Alluxio Meetup, Austin Meetup
Alluxio Twitter; Alluxio Youtube Channel; Alluxio Mailing List

Download Alluxio

Binary download

Prebuilt binaries are available to download at https://www.alluxio.io/download .

Docker

Download and start an Alluxio master and a worker. More details can be found in the documentation.

# Create a network for connecting Alluxio containers
$ docker network create alluxio_nw
# Create a volume for storing ufs data
$ docker volume create ufs
# Launch the Alluxio master
$ docker run -d --net=alluxio_nw \
    -p 19999:19999 \
    --name=alluxio-master \
    -v ufs:/opt/alluxio/underFSStorage \
    alluxio/alluxio master
# Launch the Alluxio worker
$ export ALLUXIO_WORKER_RAMDISK_SIZE=1G
$ docker run -d --net=alluxio_nw \
    --shm-size=${ALLUXIO_WORKER_RAMDISK_SIZE} \
    --name=alluxio-worker \
    -v ufs:/opt/alluxio/underFSStorage \
    -e ALLUXIO_JAVA_OPTS="-Dalluxio.worker.ramdisk.size=${ALLUXIO_WORKER_RAMDISK_SIZE} -Dalluxio.master.hostname=alluxio-master" \
    alluxio/alluxio worker
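
After the two containers are launched, the master may need a few seconds before it accepts connections. As a minimal sketch, assuming bash and curl on the host and assuming the master answers HTTP requests on the published port 19999, a small retry helper can wait for it. The name wait_for is hypothetical and is not part of Alluxio or Docker.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Alluxio): retry a command until it
# succeeds or the attempt budget is used up.
# Usage: wait_for <max_attempts> <delay_seconds> <command> [args...]
wait_for() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Run the command quietly; return success as soon as it exits with 0.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  # Every attempt failed.
  return 1
}

# Example (assumes the containers above are running and curl is installed):
#   wait_for 30 2 curl -fsS http://localhost:19999/ && echo "master is reachable"
```

If the helper gives up, docker logs alluxio-master and docker logs alluxio-worker are the usual first places to look.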

MacOS Homebrew

$ brew install alluxio

Quick Start

Please follow the Guide to Get Started to run a simple example with Alluxio.

Report a Bug

To report bugs, suggest improvements, or request new features, please open a GitHub issue. If you are not sure whether you have run into a bug, or you simply have general questions about Alluxio, post your questions on the Alluxio Slack channel.

Depend on Alluxio

The Alluxio project provides several client artifacts that external projects can depend on to use the Alluxio client:

The alluxio-shaded-client artifact is generally recommended for projects that use the Alluxio client. Its jar is self-contained (it includes all dependencies in shaded form to prevent dependency conflicts) and is therefore larger than the following two artifacts.
The alluxio-core-client-fs artifact provides the Alluxio Java file system API for accessing all Alluxio-specific functionality. This artifact is included in alluxio-shaded-client.
The alluxio-core-client-hdfs artifact provides the HDFS-compatible file system API. This artifact is included in alluxio-shaded-client.

Here is an example of declaring the dependency on alluxio-shaded-client using Maven:

<dependency>
  <groupId>org.alluxio</groupId>
  <artifactId>alluxio-shaded-client</artifactId>
  <version>2.6.0</version>
</dependency>

Contributing

Contributions via GitHub pull requests are gladly accepted from their original author. Along with any pull requests, please state that the contribution is your original work and that you license the work to the project under the project's open source license. Whether or not you state this explicitly, by submitting any copyrighted material via pull request, email, or other means you agree to license the material under the project's open source license and warrant that you have the legal authority to do so. For a more detailed step-by-step guide, please read how to contribute to Alluxio. New contributors are encouraged to start with two new contributor tasks.

For advanced feature requests and contributions, the Alluxio core team hosts regular online meetings with community users and developers to iterate on the project in two special interest groups:

Alluxio and AI workloads: e.g., running TensorFlow or PyTorch on Alluxio through the POSIX API. Check out the meeting notes.
Alluxio and Presto workloads: e.g., running Presto on Alluxio or running the Alluxio catalog service. Check out the meeting notes.

Subscribe to our public calendar to join us.

Useful Links
