edp963 / Moonbox

Licence: other
Moonbox is a DVtaaS (Data Virtualization as a Service) Platform

Programming Languages

javascript

Projects that are alternatives of or similar to Moonbox

Wedatasphere
WeDataSphere is a financial-grade, one-stop open-source suite for big data platforms. The source code of Scriptis and Linkis has already been released to the open-source community. WeDataSphere, Big Data Made Easy!
Stars: ✭ 372 (-12.26%)
Mutual labels:  spark, hive
Bigdata docker
Big Data Ecosystem Docker
Stars: ✭ 161 (-62.03%)
Mutual labels:  spark, hive
Quicksql
A Flexible, Fast, Federated(3F) SQL Analysis Middleware for Multiple Data Sources
Stars: ✭ 1,821 (+329.48%)
Mutual labels:  spark, hive
Bigdata Notes
Getting-started guide to big data ⭐
Stars: ✭ 10,991 (+2492.22%)
Mutual labels:  spark, hive
spark-acid
ACID Data Source for Apache Spark based on Hive ACID
Stars: ✭ 91 (-78.54%)
Mutual labels:  spark, hive
Cube.js
📊 Cube — Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+2726.18%)
Mutual labels:  spark, hive
Linkis
Linkis helps easily connect to various back-end computation/storage engines(Spark, Python, TiDB...), exposes various interfaces(REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,323 (+447.88%)
Mutual labels:  spark, hive
Dataspherestudio
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Stars: ✭ 1,195 (+181.84%)
Mutual labels:  spark, hive
swordfish
Open-source distributed workflow scheduling tool that also supports streaming tasks.
Stars: ✭ 35 (-91.75%)
Mutual labels:  spark, hive
Hadoop Docker
Docker-based Hadoop development and test environment, including Hadoop, Hive, HBase, and Spark
Stars: ✭ 238 (-43.87%)
Mutual labels:  spark, hive
Repository
Personal study knowledge base covering data warehouse modeling, real-time computing, big data, Java, algorithms, and more.
Stars: ✭ 92 (-78.3%)
Mutual labels:  spark, hive
incubator-linkis
Linkis helps easily connect to various back-end computation/storage engines(Spark, Python, TiDB...), exposes various interfaces(REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
Stars: ✭ 2,459 (+479.95%)
Mutual labels:  spark, hive
Hops Examples
Examples for Deep Learning/Feature Store/Spark/Flink/Hive/Kafka jobs and Jupyter notebooks on Hops
Stars: ✭ 84 (-80.19%)
Mutual labels:  spark, hive
Hadoopcryptoledger
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Stars: ✭ 126 (-70.28%)
Mutual labels:  spark, hive
Hadoop cookbook
Cookbook to install Hadoop 2.0+ using Chef
Stars: ✭ 82 (-80.66%)
Mutual labels:  spark, hive
Spark Authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark
Stars: ✭ 141 (-66.75%)
Mutual labels:  spark, hive
Luigi Warehouse
A luigi powered analytics / warehouse stack
Stars: ✭ 72 (-83.02%)
Mutual labels:  spark, hive
Apache Spark Hands On
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Stars: ✭ 74 (-82.55%)
Mutual labels:  spark, hive
Xsql
Unified SQL Analytics Engine Based on SparkSQL
Stars: ✭ 176 (-58.49%)
Mutual labels:  spark, hive
BigData-News
Real-time big data system project for a news site, based on Spark 2.2
Stars: ✭ 36 (-91.51%)
Mutual labels:  spark, hive

Moonbox - DVtaaS (Data Virtualization as a Service) Solution

Document

Introduction

Moonbox is built on the concept of data virtualization and offers batch and interactive computing services. It hides the details and complexity of accessing the underlying data sources, so users can run hybrid computations across disparate data systems and write the results out using SQL. Moonbox also provides basic services such as data service, data management, data tools, and data development, making data application architectures and the practice of logical data warehousing more agile and flexible.

Features

  • Multi-tenant Support
    Moonbox establishes a complete user architecture and introduces the concept of an Organization to divide user space. The system administrator uses the ROOT account to create multiple Organizations and assigns one or more SAs (super admins) to each of them. An SA creates and manages Users. Moonbox abstracts six capabilities for a User: whether the User can execute Account statements, DDL statements, and DCL statements, and whether the User can authorize other users to execute Account statements, DDL statements, and DCL statements. Freely combining these capabilities yields user architecture models that meet different demands and implements multi-tenancy.

  • Hybrid Calculation across Multiple Data Sources
    Using Apache Spark as its calculation engine, Moonbox supports hybrid calculation across multiple data sources, such as MySQL, Oracle, Hive, Kudu, HDFS, and MongoDB, and it can be extended with custom data sources.

  • Unified SQL Support
    Spark SQL is the standard query language of Moonbox. On top of Spark SQL, Moonbox adds its own DDL and DCL statements, covering creating, deleting, and authorizing users; access authorization for tables and columns; mounting and unmounting physical data sources and tables; and creating or deleting logical databases, time-scheduling events, and UDFs/UDAFs.

  • Optimization Strategy Support
    Moonbox supports hybrid calculation based on Apache Spark, and Spark SQL supports multiple data sources. However, when pulling data, Spark SQL pushes down only the project and filter operators and does not otherwise use the computing capabilities of the data source. Moonbox further optimizes the LogicalPlan that the Spark Optimizer has already optimized: it splits off the subtrees that can be pushed to a data source, translates each subtree into that data source's query language, and pulls the results back to Spark for further calculation. If the whole LogicalPlan can be pushed to a data source, Moonbox runs the translated query directly on that data source, which reduces the cost of distributed computation and saves computing resources.

  • Column Permission Control
    Moonbox defines DCL statements to implement column permission control. The system administrator uses DCL to grant users access to tables or columns, and Moonbox saves the permission relationships between users and tables/columns in the catalog. When a user executes a SQL query, Moonbox intercepts the SQL and checks whether it references unauthorized tables or columns. If it does, Moonbox reports an error to the user.

  • Diversified UDF/UDAF
    Moonbox supports creating UDFs/UDAFs not only from JAR files but also from source code written in Java or Scala, which makes developing and verifying UDFs more convenient.

  • Time-Scheduling Event Support
    Moonbox provides a time-scheduling event function. Users define a time-scheduling event with DDL and specify its scheduling strategy with a crontab expression. Moonbox embeds Quartz in the backend to run the scheduled events.
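
The column permission check described above can be illustrated with a small Python sketch. This is not Moonbox's implementation: the Catalog class, the check_query function, and the sample user and table names are all hypothetical, and the sketch assumes the tables and columns referenced by a query have already been extracted, so it skips SQL parsing entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    # Hypothetical store: maps (user, table) to the columns that user may read.
    grants: dict = field(default_factory=dict)

    def grant(self, user, table, columns):
        """Record that `user` may read `columns` of `table`."""
        self.grants.setdefault((user, table), set()).update(columns)

    def unauthorized(self, user, table, columns):
        """Return the requested columns the user may not read, sorted."""
        allowed = self.grants.get((user, table), set())
        return sorted(set(columns) - allowed)


def check_query(catalog, user, table, columns):
    """Raise PermissionError if the query touches unauthorized columns."""
    denied = catalog.unauthorized(user, table, columns)
    if denied:
        raise PermissionError(
            f"user {user!r} may not read {table}.{', '.join(denied)}"
        )


catalog = Catalog()
catalog.grant("alice", "orders", ["id", "amount"])
check_query(catalog, "alice", "orders", ["id"])  # authorized, returns silently
```

Calling check_query for a column that was never granted raises PermissionError, which stands in for the error that Moonbox reports to the user.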
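
The subtree splitting behind the optimization strategy can be sketched in the same spirit. The Python below is a toy model, not Moonbox's optimizer and not Spark's LogicalPlan API: the Node class, the operator names, and the `supported` table of per-source operators are assumptions made for illustration. It finds the largest subtrees whose operators a single data source can all execute.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    op: str                       # operator name, e.g. "filter" or "join"
    source: str = ""              # data source name, set on "scan" leaves only
    children: list = field(default_factory=list)


def pushable_source(node, supported):
    """Return the single source that can run the whole subtree, else None."""
    if node.op == "scan":
        return node.source
    sources = {pushable_source(child, supported) for child in node.children}
    if len(sources) != 1 or None in sources:
        return None               # several sources, or a child is not pushable
    (src,) = sources
    return src if node.op in supported.get(src, set()) else None


def split(node, supported):
    """Collect the maximal subtrees that can be pushed to one source."""
    src = pushable_source(node, supported)
    if src is not None:
        return [(src, node)]
    found = []
    for child in node.children:
        found.extend(split(child, supported))
    return found


# Hypothetical capabilities: which operators each source can run itself.
supported = {"mysql": {"filter", "project", "aggregate"}, "hive": {"filter"}}
mysql_side = Node("aggregate", children=[
    Node("filter", children=[Node("scan", "mysql")]),
])
plan = Node("join", children=[mysql_side, Node("scan", "hive")])
parts = split(plan, supported)    # the join stays in the engine
```

Here the join spans two sources, so it remains in the engine, while the aggregate-over-filter subtree goes to the MySQL side as one unit. When `split` returns the root itself, the whole plan can run on the data source, which corresponds to the case where Moonbox bypasses distributed execution.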

Latest Release

Please download the latest RELEASE

Get Help

You are welcome to join our WeChat group "edpstack" for online discussion.

License

Please refer to LICENSE file.
