microsoft / Hyperspace

Licence: apache-2.0
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.

Programming Languages

scala

Projects that are alternatives of or similar to Hyperspace

Trino
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Stars: ✭ 4,581 (+1762.2%)
Mutual labels:  analytics, big-data, databases
Spark With Python
Fundamentals of Spark with Python (using PySpark), code examples
Stars: ✭ 150 (-39.02%)
Mutual labels:  spark, analytics, big-data
Delta
An open-source storage layer that brings scalable, ACID transactions to Apache Spark™ and big data workloads.
Stars: ✭ 3,903 (+1486.59%)
Mutual labels:  spark, analytics, big-data
awesome-AI-kubernetes
❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Stars: ✭ 95 (-61.38%)
Mutual labels:  big-data, spark, analytics
Logisland
Scalable stream processing platform for advanced real-time analytics on top of Kafka and Spark. LogIsland also supports MQTT and Kafka Streams (Flink is on the roadmap). The platform performs complex event processing and is suitable for time series analysis. A large set of ready-to-use processors, data sources, and sinks is available.
Stars: ✭ 97 (-60.57%)
Mutual labels:  spark, analytics, big-data
Feast
Feature Store for Machine Learning
Stars: ✭ 2,576 (+947.15%)
Mutual labels:  spark, big-data
Gaffer
A large-scale entity and relation database supporting aggregation of properties
Stars: ✭ 1,642 (+567.48%)
Mutual labels:  spark, big-data
Opaque
An encrypted data analytics platform
Stars: ✭ 129 (-47.56%)
Mutual labels:  spark, analytics
Sparkling Graph
SparklingGraph provides an easy-to-use set of features that let you process large-scale graphs using Spark and GraphX.
Stars: ✭ 139 (-43.5%)
Mutual labels:  spark, big-data
Maha
A framework for rapid reporting API development, with out-of-the-box support for high-cardinality dimension lookups with Druid.
Stars: ✭ 101 (-58.94%)
Mutual labels:  analytics, big-data
Spark On Lambda
Apache Spark on AWS Lambda
Stars: ✭ 137 (-44.31%)
Mutual labels:  spark, big-data
Fili
Easily make RESTful web services for time series reporting with Big Data analytics engines like Druid and SQL Databases.
Stars: ✭ 151 (-38.62%)
Mutual labels:  analytics, big-data
Openuba
A robust, and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
Stars: ✭ 127 (-48.37%)
Mutual labels:  spark, analytics
Cube.js
📊 Cube — Open-Source Analytics API for Building Data Apps
Stars: ✭ 11,983 (+4771.14%)
Mutual labels:  spark, analytics
Spark
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Stars: ✭ 1,721 (+599.59%)
Mutual labels:  spark, analytics
Bigdataclass
Two-day workshop that covers how to use R to interact with databases and Spark
Stars: ✭ 110 (-55.28%)
Mutual labels:  spark, big-data
Geopyspark
GeoTrellis for PySpark
Stars: ✭ 167 (-32.11%)
Mutual labels:  spark, big-data
Geni
A Clojure dataframe library that runs on Spark
Stars: ✭ 152 (-38.21%)
Mutual labels:  spark, big-data
Data Science Live Book
An open source book to learn data science, data analysis and machine learning, suitable for all ages!
Stars: ✭ 193 (-21.54%)
Mutual labels:  analytics, big-data
Sparkrdma
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Stars: ✭ 215 (-12.6%)
Mutual labels:  spark, big-data

Hyperspace

aka.ms/hyperspace

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

Please review our contribution guide.

Inspiration and Special Thanks

This project would not have been possible without the outstanding work from the following communities:

  • Apache Spark: Unified Analytics Engine for Big Data, the engine that Hyperspace builds on top of.
  • Delta Lake: An open-source storage layer that brings ACID transactions to Apache Spark™ and big data workloads. Hyperspace draws considerable inspiration from the way the Delta Lake community operates and from its pioneering of related ideas in the context of data lakes (e.g., its novel use of optimistic concurrency).
  • Databricks: Unified analytics platform. Many thanks to all the inspiration they have provided us.
  • .NET for Apache Spark™: Hyperspace offers .NET bindings for developers, thanks to the efforts of this team in collaborating and releasing the bindings just-in-time.
  • Minimal Mistakes: The awesome theme behind Hyperspace documentation.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

Apache License 2.0, see LICENSE.
