datafuselabs / Datafuse

Licence: apache-2.0
Datafuse is a free cloud-native analytics DBMS (inspired by ClickHouse) implemented in Rust.

Projects that are alternatives to or similar to Datafuse

Gorose
GoRose (Go ORM) is a mini database ORM for Golang, inspired by Eloquent from the PHP framework Laravel. It aims to be friendly to PHP, Python, and Ruby developers. It currently provides six database drivers: mysql, sqlite3, postgres, oracle, mssql, and clickhouse.
Stars: ✭ 947 (+189.6%)
Mutual labels:  sql, database, clickhouse
Crate
CrateDB is a distributed SQL database that makes it simple to store and analyze massive amounts of data in real-time.
Stars: ✭ 3,254 (+895.11%)
Mutual labels:  sql, olap, database
Questdb
An open source SQL database designed to process time series data, faster
Stars: ✭ 7,544 (+2207.03%)
Mutual labels:  sql, database, simd
Omniscidb
OmniSciDB (formerly MapD Core)
Stars: ✭ 2,601 (+695.41%)
Mutual labels:  sql, olap, database
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-54.13%)
Mutual labels:  sql, database, distributed-computing
Clickhouse Go
Golang driver for ClickHouse
Stars: ✭ 1,234 (+277.37%)
Mutual labels:  sql, database, clickhouse
Clickhouse
ClickHouse® is a free analytics DBMS for big data
Stars: ✭ 21,089 (+6349.24%)
Mutual labels:  sql, olap, clickhouse
Radon
RadonDB is an open source, cloud-native MySQL database for building global, scalable cloud services
Stars: ✭ 1,584 (+384.4%)
Mutual labels:  sql, olap, database
Bats
A unified SQL engine for OLTP, OLAP, batch processing, and stream processing scenarios
Stars: ✭ 152 (-53.52%)
Mutual labels:  sql, olap, database
Duckdb
DuckDB is an in-process SQL OLAP Database Management System
Stars: ✭ 4,014 (+1127.52%)
Mutual labels:  sql, olap, database
H2database
H2 is an embeddable RDBMS written in Java.
Stars: ✭ 3,078 (+841.28%)
Mutual labels:  sql, database
Bedquilt Core
A JSON document store on PostgreSQL
Stars: ✭ 256 (-21.71%)
Mutual labels:  sql, database
Nodejs Bigquery
Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.
Stars: ✭ 268 (-18.04%)
Mutual labels:  sql, database
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+1300.92%)
Mutual labels:  sql, database
Dataux
Federated mysql compatible proxy to elasticsearch, mongo, cassandra, big-table, google datastore
Stars: ✭ 268 (-18.04%)
Mutual labels:  sql, database
Preql
An interpreted relational query language that compiles to SQL.
Stars: ✭ 257 (-21.41%)
Mutual labels:  sql, database
Bitnami Docker Mariadb
Bitnami MariaDB Docker Image
Stars: ✭ 251 (-23.24%)
Mutual labels:  sql, database
Text2sql Data
A collection of datasets that pair questions with SQL queries.
Stars: ✭ 287 (-12.23%)
Mutual labels:  sql, database
Squeal
A Swift wrapper for SQLite databases
Stars: ✭ 303 (-7.34%)
Mutual labels:  sql, database
Php Sql Query Builder
An elegant lightweight and efficient SQL Query Builder with fluid interface SQL syntax supporting bindings and complicated query generation.
Stars: ✭ 313 (-4.28%)
Mutual labels:  sql, database

FuseQuery

FuseQuery is a real-time Cloud Query Engine implemented in Rust.

Inspired by ClickHouse and powered by Arrow.

Features

  • High Performance

    • Everything is Parallel
  • High Scalability

    • Everything is Distributed
  • High Reliability

    • True Separation of Storage and Compute

Architecture

DataFuse Architecture

Performance

  • Measures in-memory SIMD vector processing performance only
  • Dataset: 100,000,000,000 (100 billion) rows
  • Hardware: AMD Ryzen 7 PRO 4750U, 8 CPU cores, 16 threads
  • Rust: rustc 1.49.0 (e1884a8e3 2020-12-29)
  • Built with link-time optimization and CPU-specific instructions
  • ClickHouse server version 21.2.1 revision 54447

| Query | FuseQuery (v0.1) | ClickHouse (v21.2.1) |
|-------|------------------|----------------------|
| SELECT avg(number) FROM system.numbers_mt | (3.11 s.) | ×3.14 slow, (9.77 s.), 10.24 billion rows/s., 81.92 GB/s. |
| SELECT sum(number) FROM system.numbers_mt | (2.96 s.) | ×2.02 slow, (5.97 s.), 16.75 billion rows/s., 133.97 GB/s. |
| SELECT min(number) FROM system.numbers_mt | (3.57 s.) | ×3.90 slow, (13.93 s.), 7.18 billion rows/s., 57.44 GB/s. |
| SELECT max(number) FROM system.numbers_mt | (3.59 s.) | ×4.09 slow, (14.70 s.), 6.80 billion rows/s., 54.44 GB/s. |
| SELECT count(number) FROM system.numbers_mt | (1.76 s.) | ×2.22 slow, (3.91 s.), 25.58 billion rows/s., 204.65 GB/s. |
| SELECT sum(number+number+number) FROM numbers_mt | (23.14 s.) | ×5.47 slow, (126.67 s.), 789.47 million rows/s., 6.32 GB/s. |
| SELECT sum(number) / count(number) FROM system.numbers_mt | (3.09 s.) | ×1.96 slow, (6.07 s.), 16.48 billion rows/s., 131.88 GB/s. |
| SELECT sum(number) / count(number), max(number), min(number) FROM system.numbers_mt | (6.73 s.) | ×4.01 slow, (27.59 s.), 3.62 billion rows/s., 28.99 GB/s. |
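
The throughput and "×N slow" figures above can be rederived from the dataset size and the reported timings. The sketch below is only a sanity check, not part of FuseQuery. The function names are hypothetical, and the 8 bytes per row is an assumption inferred from the ratio of GB/s to rows/s in the ClickHouse column (for example 81.92 / 10.24 = 8).

```rust
/// Rows scanned per second, given a row count and the elapsed time in seconds.
fn rows_per_second(rows: f64, seconds: f64) -> f64 {
    rows / seconds
}

/// Bytes scanned per second.
/// Assumption: each row is a single 8-byte value (inferred from the figures above).
fn bytes_per_second(rows: f64, seconds: f64) -> f64 {
    rows_per_second(rows, seconds) * 8.0
}

/// How many times slower the `slow_seconds` run is than the `fast_seconds` run.
fn slowdown(fast_seconds: f64, slow_seconds: f64) -> f64 {
    slow_seconds / fast_seconds
}

fn main() {
    let rows = 100_000_000_000.0_f64; // dataset size: 100 billion rows
    let fuse_seconds = 3.11; // avg(number), FuseQuery
    let clickhouse_seconds = 9.77; // avg(number), ClickHouse

    println!(
        "ClickHouse: {:.2} billion rows/s",
        rows_per_second(rows, clickhouse_seconds) / 1e9
    );
    println!(
        "ClickHouse: {:.2} GB/s",
        bytes_per_second(rows, clickhouse_seconds) / 1e9
    );
    println!("slowdown: x{:.2}", slowdown(fuse_seconds, clickhouse_seconds));
}
```

The recomputed values agree with the reported ones up to rounding of the published timings.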

Note:

  • ClickHouse system.numbers_mt uses 16-way parallel processing
  • FuseQuery system.numbers_mt uses 16-way parallel processing
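
To illustrate what 16-way parallel processing of a number range means, here is a minimal sketch using only the Rust standard library. It is not FuseQuery's actual implementation. The function name `parallel_sum` and the chunking strategy are assumptions made for illustration: the range is split into one chunk per worker, each chunk is summed on its own thread, and the partial sums are combined.

```rust
use std::thread;

/// Sum the integers in `0..n` by splitting the range into `workers` chunks,
/// summing each chunk on its own thread, and adding up the partial sums.
/// Hypothetical helper for illustration only.
fn parallel_sum(n: u64, workers: u64) -> u64 {
    assert!(workers > 0, "need at least one worker");
    let chunk = (n + workers - 1) / workers; // ceiling division
    let handles: Vec<_> = (0..workers)
        .map(|w| {
            // Clamp both bounds to `n` so trailing workers get empty ranges.
            let start = (w * chunk).min(n);
            let end = ((w + 1) * chunk).min(n);
            thread::spawn(move || (start..end).sum::<u64>())
        })
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum()
}

fn main() {
    let n: u64 = 1_000_000;
    let total = parallel_sum(n, 16); // 16 workers, matching the note above
    // Closed form for 0 + 1 + ... + (n - 1).
    assert_eq!(total, n * (n - 1) / 2);
    println!("sum(0..{}) = {}", n, total);
}
```

A real query engine would typically use a thread pool and vectorized kernels rather than spawning one thread per chunk. The sketch only shows the split-and-combine shape of the computation.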

Status

General

  • [x] SQL Parser
  • [x] Query Planner
  • [x] Query Optimizer
  • [x] Predicate Push Down
  • [ ] Projection Push Down (TODO)
  • [ ] Limit Push Down (TODO)
  • [x] Type coercion
  • [x] Parallel Query Execution
  • [x] Distributed Query Execution
  • [ ] Sorting (WIP)
  • [ ] GroupBy (TODO)
  • [ ] Joins (TODO)

SQL Support

  • [x] Projection
  • [x] Filter (WHERE)
  • [x] Limit
  • [x] Aggregate Functions
  • [x] Scalar Functions
  • [x] UDF Functions
  • [ ] Sorting (WIP)
  • [ ] SubQueries (TODO)
  • [ ] Joins (TODO)
  • [ ] Window (TODO)

Getting Started

Learn FuseQuery

Try FuseQuery

Roadmap

  • [x] 0.1 Support aggregation select (2021.02)
  • [x] 0.2 Support distributed query (2021.03)
  • [ ] 0.3 Support order by
  • [ ] 0.5 Support group by
  • [ ] 0.6 Support sub queries
  • [ ] 0.7 Support join
  • [ ] 0.8 Support TPC-H benchmark

Contributing

You can learn more about contributing to the FuseQuery project by reading our Contribution Guide and by viewing our Code of Conduct.

License

FuseQuery is licensed under Apache 2.0.
