datafuselabs / Datafuse
Licence: apache-2.0
Datafuse is a free Cloud-Native Analytics DBMS (inspired by ClickHouse) implemented in Rust.
Stars: ✭ 327
Projects that are alternatives to or similar to Datafuse
Gorose
GoRose (Go ORM) is a mini database ORM for Golang, inspired by Eloquent from the well-known PHP framework Laravel. It aims to be friendly to PHP, Python, and Ruby developers. It currently provides six major database drivers: mysql, sqlite3, postgres, oracle, mssql, and clickhouse.
Stars: ✭ 947 (+189.6%)
Mutual labels: sql, database, clickhouse
Crate
CrateDB is a distributed SQL database that makes it simple to store and analyze
massive amounts of data in real-time.
Stars: ✭ 3,254 (+895.11%)
Mutual labels: sql, olap, database
Questdb
An open source SQL database designed to process time series data, faster
Stars: ✭ 7,544 (+2207.03%)
Mutual labels: sql, database, simd
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-54.13%)
Mutual labels: sql, database, distributed-computing
Clickhouse Go
Golang driver for ClickHouse
Stars: ✭ 1,234 (+277.37%)
Mutual labels: sql, database, clickhouse
Clickhouse
ClickHouse® is a free analytics DBMS for big data
Stars: ✭ 21,089 (+6349.24%)
Mutual labels: sql, olap, clickhouse
Radon
RadonDB is an open source, cloud-native MySQL database for building global, scalable cloud services
Stars: ✭ 1,584 (+384.4%)
Mutual labels: sql, olap, database
Duckdb
DuckDB is an in-process SQL OLAP Database Management System
Stars: ✭ 4,014 (+1127.52%)
Mutual labels: sql, olap, database
H2database
H2 is an embeddable RDBMS written in Java.
Stars: ✭ 3,078 (+841.28%)
Mutual labels: sql, database
Nodejs Bigquery
Node.js client for Google Cloud BigQuery: A fast, economical and fully-managed enterprise data warehouse for large-scale data analytics.
Stars: ✭ 268 (-18.04%)
Mutual labels: sql, database
Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+1300.92%)
Mutual labels: sql, database
Dataux
Federated mysql compatible proxy to elasticsearch, mongo, cassandra, big-table, google datastore
Stars: ✭ 268 (-18.04%)
Mutual labels: sql, database
Preql
An interpreted relational query language that compiles to SQL.
Stars: ✭ 257 (-21.41%)
Mutual labels: sql, database
Bitnami Docker Mariadb
Bitnami MariaDB Docker Image
Stars: ✭ 251 (-23.24%)
Mutual labels: sql, database
Text2sql Data
A collection of datasets that pair questions with SQL queries.
Stars: ✭ 287 (-12.23%)
Mutual labels: sql, database
Php Sql Query Builder
An elegant lightweight and efficient SQL Query Builder with fluid interface SQL syntax supporting bindings and complicated query generation.
Stars: ✭ 313 (-4.28%)
Mutual labels: sql, database
FuseQuery
FuseQuery is a real-time Cloud Query Engine implemented in Rust.
It is inspired by ClickHouse and powered by Arrow.
Features
- High Performance
  - Everything is Parallelism
- High Scalability
  - Everything is Distributed
- High Reliability
  - True Separation of Storage and Compute
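To make "Everything is Parallelism" a little more concrete, here is a minimal sketch of partitioned aggregation: a range of numbers is split into contiguous chunks, each chunk is summed on its own thread, and the partial sums are merged. This is an illustration only, not FuseQuery's actual execution engine; the function name `parallel_sum` and the choice of 16 workers are assumptions made for the example.

```rust
use std::thread;

/// Sums the integers in `0..n` by splitting the range into `workers`
/// contiguous partitions, summing each partition on its own thread,
/// and merging the partial sums.
/// Hypothetical helper for illustration; not a FuseQuery API.
fn parallel_sum(n: u64, workers: u64) -> u64 {
    assert!(workers > 0, "need at least one worker");
    // Ceiling division so the partitions cover the whole range.
    let chunk = (n + workers - 1) / workers;

    let handles: Vec<thread::JoinHandle<u64>> = (0..workers)
        .map(|w| {
            let start = (w * chunk).min(n);
            let end = (start + chunk).min(n);
            // Each worker owns its sub-range and returns a partial sum.
            thread::spawn(move || (start..end).sum::<u64>())
        })
        .collect();

    // Merge step: add up the partial sums from every worker.
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .sum()
}

fn main() {
    let n: u64 = 1_000_000;
    let total = parallel_sum(n, 16);
    // Closed form for 0 + 1 + ... + (n - 1).
    assert_eq!(total, n * (n - 1) / 2);
    println!("sum(0..{}) = {}", n, total);
}
```

A real engine would typically use a thread pool and vectorized kernels rather than one OS thread per partition; the sketch only shows the split, aggregate, and merge shape.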
Architecture
Performance
- Memory SIMD-Vector processing performance only
- Dataset: 100,000,000,000 (100 Billion)
- Hardware: AMD Ryzen 7 PRO 4750U, 8 CPU Cores, 16 Threads
- Rust: rustc 1.49.0 (e1884a8e3 2020-12-29)
- Build with Link-time Optimization and Using CPU Specific Instructions
- ClickHouse server version 21.2.1 revision 54447
Query | FuseQuery (v0.1) | ClickHouse (v21.2.1) |
---|---|---|
SELECT avg(number) FROM system.numbers_mt | (3.11 s.) | ×3.14 slow, (9.77 s.) 10.24 billion rows/s., 81.92 GB/s. |
SELECT sum(number) FROM system.numbers_mt | (2.96 s.) | ×2.02 slow, (5.97 s.) 16.75 billion rows/s., 133.97 GB/s. |
SELECT min(number) FROM system.numbers_mt | (3.57 s.) | ×3.90 slow, (13.93 s.) 7.18 billion rows/s., 57.44 GB/s. |
SELECT max(number) FROM system.numbers_mt | (3.59 s.) | ×4.09 slow, (14.70 s.) 6.80 billion rows/s., 54.44 GB/s. |
SELECT count(number) FROM system.numbers_mt | (1.76 s.) | ×2.22 slow, (3.91 s.) 25.58 billion rows/s., 204.65 GB/s. |
SELECT sum(number+number+number) FROM numbers_mt | (23.14 s.) | ×5.47 slow, (126.67 s.) 789.47 million rows/s., 6.32 GB/s. |
SELECT sum(number) / count(number) FROM system.numbers_mt | (3.09 s.) | ×1.96 slow, (6.07 s.) 16.48 billion rows/s., 131.88 GB/s. |
SELECT sum(number) / count(number), max(number), min(number) FROM system.numbers_mt | (6.73 s.) | ×4.01 slow, (27.59 s.) 3.62 billion rows/s., 28.99 GB/s. |
Note:
- ClickHouse system.numbers_mt uses 16-way parallel processing
- FuseQuery system.numbers_mt uses 16-way parallel processing
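The slowdown factor and throughput figures in the table can be derived from the dataset size and the elapsed times. The sketch below redoes that arithmetic for the `avg(number)` row. The helper names are hypothetical, and the 8 bytes per row is an assumption (a 64-bit `number` column) that is not stated in the benchmark setup.

```rust
/// Rows processed per second, given a row count and elapsed seconds.
fn rows_per_sec(rows: f64, secs: f64) -> f64 {
    rows / secs
}

/// How many times slower `slow_secs` is than `fast_secs`.
fn slowdown(slow_secs: f64, fast_secs: f64) -> f64 {
    slow_secs / fast_secs
}

fn main() {
    // Dataset size from the benchmark setup: 100 billion rows.
    const ROWS: f64 = 100_000_000_000.0;
    // Assumption: each row is one 64-bit integer (8 bytes).
    const BYTES_PER_ROW: f64 = 8.0;

    // Elapsed times for `SELECT avg(number) FROM system.numbers_mt`.
    let fuse_secs = 3.11;
    let clickhouse_secs = 9.77;

    let rps = rows_per_sec(ROWS, clickhouse_secs);
    println!("slowdown: x{:.2}", slowdown(clickhouse_secs, fuse_secs));
    println!("throughput: {:.2} billion rows/s", rps / 1e9);
    println!("bandwidth: {:.2} GB/s", rps * BYTES_PER_ROW / 1e9);
}
```

Under these assumptions the results land close to the reported figures for that row; small differences in the last digit are expected because the table's inputs are themselves rounded.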
Status
General
- [x] SQL Parser
- [x] Query Planner
- [x] Query Optimizer
- [x] Predicate Push Down
- [ ] Projection Push Down (TODO)
- [ ] Limit Push Down (TODO)
- [x] Type coercion
- [x] Parallel Query Execution
- [x] Distributed Query Execution
- [ ] Sorting (WIP)
- [ ] GroupBy (TODO)
- [ ] Joins (TODO)
SQL Support
- [x] Projection
- [x] Filter (WHERE)
- [x] Limit
- [x] Aggregate Functions
- [x] Scalar Functions
- [x] UDF Functions
- [ ] Sorting (WIP)
- [ ] SubQueries (TODO)
- [ ] Joins (TODO)
- [ ] Window (TODO)
Getting Started
Learn FuseQuery
Try FuseQuery
Roadmap
- [x] 0.1 Support aggregation select (2021.02)
- [x] 0.2 Support distributed query (2021.03)
- [ ] 0.3 Support order by
- [ ] 0.5 Support group by
- [ ] 0.6 Support sub queries
- [ ] 0.7 Support join
- [ ] 0.8 Support TPC-H benchmark
Contributing
You can learn more about contributing to the FuseQuery project by reading our Contribution Guide and by viewing our Code of Conduct.
License
FuseQuery is licensed under Apache 2.0.