Qihoo360 / Quicksql

License: MIT
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources

Programming Languages

Java

Quicksql is a SQL query product that can query a single datastore or correlate data across multiple datastores. It supports relational databases, non-relational databases, and even datastores that do not support SQL natively (such as Elasticsearch and Druid). A single SQL query can also join or union data from multiple datastores. For example, you can run one unified SQL query when part of the data is stored in Elasticsearch and the rest is stored in Hive. Most importantly, QSQL does not depend on any intermediate compute engine: users only need to focus on their data and a unified SQL grammar to complete statistics and analysis.

Architecture

An architecture diagram makes Quicksql easier to understand.

[Figure: Quicksql architecture diagram]

The QSQL architecture consists of three layers:

  • Parsing Layer: parses, validates, and optimizes SQL statements, splits mixed SQL, and finally generates a query plan;

  • Computing Layer: routes the query plan to a specific execution plan, which is then translated into executable code for the target storage or engine (such as an Elasticsearch JSON query or Hive HQL);

  • Storage Layer: handles extraction and storage of the prepared data.
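
As a rough illustration of the "splitting of mixed SQL" step, the sketch below groups qualified table names by their source prefix and reports whether a query spans more than one source. The `SourceRouter` class, its methods, and the `source.table` naming rule are assumptions made for this example; they are not Quicksql APIs, and the actual parsing layer produces a full query plan rather than a list of names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the class name and routing rule are assumptions, not Quicksql code.
final class SourceRouter {
    /** Groups qualified table names ("source.table") by their source prefix. */
    static Map<String, List<String>> groupBySource(List<String> qualifiedTables) {
        Map<String, List<String>> bySource = new LinkedHashMap<>();
        for (String name : qualifiedTables) {
            int dot = name.indexOf('.');
            if (dot <= 0 || dot == name.length() - 1) {
                throw new IllegalArgumentException("expected source.table, got: " + name);
            }
            bySource.computeIfAbsent(name.substring(0, dot), s -> new ArrayList<>())
                    .add(name.substring(dot + 1));
        }
        return bySource;
    }

    /** A query that touches one source can run there as-is; otherwise it must be split. */
    static boolean needsSplit(List<String> qualifiedTables) {
        return groupBySource(qualifiedTables).size() > 1;
    }
}
```

For instance, `es_raw.profile`, `hive_db.employee`, and `hive_db.action` fall into two groups, so a query over all three would need to be split, while a query over the two Hive tables alone would not.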

Basic Features

In the vast majority of cases, we want to use a single language for data analysis without worrying about anything unrelated to the analysis itself. Quicksql was created for exactly this purpose.

The goal of Quicksql is to provide three functions:

1. Unify all structured data queries under one SQL grammar

  • Use Only SQL

In Quicksql, you can query Elasticsearch like this:

SELECT state, pop FROM geo_mapping WHERE state = 'CA' ORDER BY state

Aggregation queries work as well:

SELECT approx_count_distinct(city), state FROM geo_mapping GROUP BY state LIMIT 10

No more frustration over mismatched brackets in a JSON query ;)

  • Eliminate Dialects

Previously, a statement with the same semantics had to be rewritten in the dialect of each engine, for example:

SELECT * FROM geo_mapping                       -- MySQL Dialect
LIMIT 10 OFFSET 10                              
SELECT * FROM geo_mapping                       -- Oracle Dialect
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY          

In Quicksql, relational databases no longer have the concept of dialects. You can use the grammar of Quicksql to query any engine, just like this:

SELECT * FROM geo_mapping LIMIT 10 OFFSET 10    -- Run Anywhere
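
To make the dialect problem concrete, here is a minimal sketch of the per-engine rewriting that a unified grammar spares the user. It renders the same paging request in the two dialects shown above. The `PagingDialects` class and its methods are hypothetical names for illustration only and do not reflect how Quicksql generates SQL internally.

```java
// Hypothetical sketch: illustrative names only, not part of Quicksql.
final class PagingDialects {
    enum Dialect { MYSQL, ORACLE }

    /** Renders a paging clause in the syntax of the given dialect. */
    static String pagingClause(Dialect dialect, int limit, int offset) {
        if (limit < 0 || offset < 0) {
            throw new IllegalArgumentException("limit and offset must be non-negative");
        }
        switch (dialect) {
            case MYSQL:
                return "LIMIT " + limit + " OFFSET " + offset;
            case ORACLE:
                return "OFFSET " + offset + " ROWS FETCH NEXT " + limit + " ROWS ONLY";
            default:
                throw new IllegalArgumentException("unsupported dialect: " + dialect);
        }
    }

    /** Appends the dialect-specific paging clause to a base query. */
    static String paginate(String baseQuery, Dialect dialect, int limit, int offset) {
        return baseQuery + " " + pagingClause(dialect, limit, offset);
    }

    public static void main(String[] args) {
        String base = "SELECT * FROM geo_mapping";
        for (Dialect d : Dialect.values()) {
            System.out.println(d + ": " + paginate(base, d, 10, 10));
        }
    }
}
```

With a single grammar, the caller writes only the first form and leaves any such translation to the middleware.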

2. Hide the isolation between different data sources

Joining tables that live in different engines, or in different clusters, is normally troublesome.

In Quicksql, however, you can write a query like this:

SELECT * FROM
    (SELECT * FROM es_raw.profile AS profile    -- index.type on Elasticsearch
        WHERE note IS NOT NULL) AS es_profile
INNER JOIN
    (SELECT * FROM hive_db.employee AS emp      -- database.table on Hive
    INNER JOIN hive_db.action AS act            -- database.table on Hive
    ON emp.name = act.name) AS tmp
ON es_profile.prefer = tmp.prefer
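
One generic way to evaluate such a cross-source join is to fetch each side from its own engine and combine the rows with an in-memory hash join. The sketch below shows that technique on plain Java maps. It is an illustration under assumptions, not Quicksql's actual execution strategy, and the `CrossSourceJoin` class and its helpers are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a generic hash join, not Quicksql's execution engine.
final class CrossSourceJoin {
    /** Inner-joins two row lists on a shared column, using a hash index built from the right side. */
    static List<Map<String, String>> innerJoin(List<Map<String, String>> left,
                                               List<Map<String, String>> right,
                                               String key) {
        Map<String, List<Map<String, String>>> index = new HashMap<>();
        for (Map<String, String> row : right) {
            String k = row.get(key);
            if (k != null) { // as in SQL, a NULL key never matches in an equi-join
                index.computeIfAbsent(k, unused -> new ArrayList<>()).add(row);
            }
        }
        List<Map<String, String>> joined = new ArrayList<>();
        for (Map<String, String> leftRow : left) {
            String k = leftRow.get(key);
            List<Map<String, String>> matches = (k == null) ? null : index.get(k);
            if (matches == null) {
                continue; // no partner on the right side
            }
            for (Map<String, String> rightRow : matches) {
                Map<String, String> merged = new HashMap<>(rightRow);
                merged.putAll(leftRow); // on a column-name clash, the left value wins
                joined.add(merged);
            }
        }
        return joined;
    }

    /** Builds a row from alternating column names and values. */
    static Map<String, String> row(String... columnsAndValues) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < columnsAndValues.length; i += 2) {
            m.put(columnsAndValues[i], columnsAndValues[i + 1]);
        }
        return m;
    }
}
```

Here the left list would stand in for rows fetched from Elasticsearch and the right list for rows fetched from Hive, joined on the `prefer` column.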

3. Choose the most appropriate way to execute the query

A query involving multiple engines can be executed in a variety of ways. Quicksql aims to combine the strengths of each engine and choose the most appropriate execution path.

Getting Started

For instructions on building Quicksql from source, see Getting Started.

Reporting Issues

If you find a bug or have a suggestion, please file a GitHub issue.

Once an issue is approved, a committer prefixes its description with a [QSQL-ID] label so that commits can be matched to it. For example:

[QSQL-1002]: Views generated after splitting logical plan are redundant.
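
A small helper can check whether an issue title or commit message carries such a label. The sketch below assumes the format `[QSQL-<digits>]: <description>`, inferred from the single example above; the `IssueLabel` class is hypothetical and the project may accept other variants.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: the label format is inferred from one example and may be incomplete.
final class IssueLabel {
    // Assumed format: "[QSQL-<1 to 9 digits>]: <non-empty description>"
    private static final Pattern LABEL =
            Pattern.compile("^\\[QSQL-(\\d{1,9})\\]:\\s*(.+)$");

    /** Returns the numeric issue ID, or -1 if the title does not carry a label. */
    static int issueId(String title) {
        Matcher m = LABEL.matcher(title);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }
}
```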

Contributing

We welcome contributions.

If you are interested in Quicksql, you can download the source code from GitHub and run the following Maven command in the project root directory:

mvn -DskipTests clean package

If you are planning to make a large contribution, talk to us first! It helps to agree on the general approach. Open an issue on GitHub for your proposed feature.

Fork the GitHub repository, and create a branch for your feature.

Develop your feature and test cases, and make sure that mvn install succeeds. (Run extra tests if your change warrants it.)

Commit your change to your branch.

If your change has multiple commits, use git rebase -i master to squash them into a single commit and to bring your code up to date with the latest on the main line.

Then push your commit(s) to GitHub, and create a pull request from your branch to the QSQL master branch. Update the corresponding issue to reference your pull request, and a committer will review your changes.

The pull request may need to be updated (after its submission) for two main reasons:

  1. you identified a problem after submitting the pull request;
  2. the reviewer requested further changes.

In order to update the pull request, you need to commit the changes in your branch and then push the commit(s) to GitHub. You are encouraged to use regular (non-rebased) commits on top of previously existing ones.

Join us

Slack | GitHub | QQ
