369 open source projects that are alternatives or similar to Carbondata

Pretzel
Javascript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (-97.75%)
Mutual labels:  big-data
Storm
Mirror of Apache Storm
Stars: ✭ 6,297 (+443.78%)
Mutual labels:  big-data
Analysispreservation.cern.ch
Source code for the CERN Analysis Preservation portal
Stars: ✭ 37 (-96.8%)
Mutual labels:  big-data
Accumulo
Apache Accumulo
Stars: ✭ 857 (-25.99%)
Mutual labels:  big-data
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+388.43%)
Mutual labels:  big-data
Moosefs
MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (-11.49%)
Mutual labels:  big-data
Sqoop
Mirror of Apache Sqoop
Stars: ✭ 817 (-29.45%)
Mutual labels:  big-data
Docker Spark Cluster
A Spark cluster setup running on Docker containers
Stars: ✭ 57 (-95.08%)
Mutual labels:  big-data
Samza
Mirror of Apache Samza
Stars: ✭ 676 (-41.62%)
Mutual labels:  big-data
Skymap
High-throughput gene to knowledge mapping through massive integration of public sequencing data.
Stars: ✭ 29 (-97.5%)
Mutual labels:  big-data
Dremio Oss
Dremio - the missing link in modern data
Stars: ✭ 862 (-25.56%)
Mutual labels:  big-data
Scanner
Efficient video analysis at scale
Stars: ✭ 569 (-50.86%)
Mutual labels:  big-data
Trck
Query engine for TrailDB
Stars: ✭ 48 (-95.85%)
Mutual labels:  big-data
Dataflowjavasdk
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (-26.25%)
Mutual labels:  big-data
Attic Lens
Mirror of Apache Lens
Stars: ✭ 58 (-94.99%)
Mutual labels:  big-data
Bandar Log
Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 19 (-98.36%)
Mutual labels:  big-data
Attaca
Robust, distributed version control for large files.
Stars: ✭ 41 (-96.46%)
Mutual labels:  big-data
Titanoboa
Titanoboa makes complex workflows easy. It is a low-code workflow orchestration platform for JVM - distributed, highly scalable and fault tolerant.
Stars: ✭ 787 (-32.04%)
Mutual labels:  big-data
Spark Doc Zh
Chinese edition of the official Apache Spark documentation
Stars: ✭ 1,126 (-2.76%)
Mutual labels:  big-data
Cython
The most widely used Python to C compiler
Stars: ✭ 6,588 (+468.91%)
Mutual labels:  big-data
Metrics
Measure behavior of Java applications
Stars: ✭ 35 (-96.98%)
Mutual labels:  big-data
Sdc
Intel® Scalable Dataframe Compiler for Pandas*
Stars: ✭ 623 (-46.2%)
Mutual labels:  big-data
Lifion Kinesis
A native Node.js producer and consumer library for Amazon Kinesis Data Streams
Stars: ✭ 54 (-95.34%)
Mutual labels:  big-data
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+376.08%)
Mutual labels:  big-data
Awesome Scalability
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Stars: ✭ 36,688 (+3068.22%)
Mutual labels:  big-data
Phoenix
Mirror of Apache Phoenix
Stars: ✭ 867 (-25.13%)
Mutual labels:  big-data
Pachyderm
Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+358.12%)
Mutual labels:  big-data
Datumbox Framework
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Stars: ✭ 1,063 (-8.2%)
Mutual labels:  big-data
Sparkjni
A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-99.05%)
Mutual labels:  big-data
Verticapy
VerticaPy is a Python library that exposes scikit-like functionality to conduct data science projects on data stored in Vertica, thus taking advantage of Vertica’s speed and built-in analytics and machine learning capabilities.
Stars: ✭ 59 (-94.91%)
Mutual labels:  big-data
Hazelcast Jet
Distributed Stream and Batch Processing
Stars: ✭ 855 (-26.17%)
Mutual labels:  big-data
Traildb
TrailDB is an efficient tool for storing and querying series of events
Stars: ✭ 1,029 (-11.14%)
Mutual labels:  big-data
Autodl
Automated Deep Learning without ANY human intervention. First-place solution for AutoDL.
Stars: ✭ 854 (-26.25%)
Mutual labels:  big-data
Cloud Volume
Read and write Neuroglancer datasets programmatically.
Stars: ✭ 63 (-94.56%)
Mutual labels:  big-data
Pyspark Setup Demo
Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
Stars: ✭ 24 (-97.93%)
Mutual labels:  big-data
Couchdb Couch
Mirror of Apache CouchDB
Stars: ✭ 43 (-96.29%)
Mutual labels:  big-data
Hadoop For Geoevent
ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-99.57%)
Mutual labels:  big-data
Ymcache
YMCache is a lightweight object caching solution for iOS and Mac OS X that is designed for highly parallel access scenarios.
Stars: ✭ 58 (-94.99%)
Mutual labels:  big-data
Parquet Format
Apache Parquet
Stars: ✭ 800 (-30.92%)
Mutual labels:  big-data
Egads
A Java package to automatically detect anomalies in large scale time-series data
Stars: ✭ 997 (-13.9%)
Mutual labels:  big-data
Rakam Api
📈 Collect customer event data from your apps. (Note that this project only includes the API collector, not the visualization platform)
Stars: ✭ 772 (-33.33%)
Mutual labels:  big-data
Flink Shaded
Apache Flink shaded artifacts repository
Stars: ✭ 67 (-94.21%)
Mutual labels:  big-data
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (-35.66%)
Mutual labels:  big-data
Esper Tv
Esper instance for TV news analysis
Stars: ✭ 37 (-96.8%)
Mutual labels:  big-data
Sciblog support
Support content for my blog
Stars: ✭ 694 (-40.07%)
Mutual labels:  big-data
Kibble 1
Apache Kibble - a tool to collect, aggregate and visualize data about any software project
Stars: ✭ 54 (-95.34%)
Mutual labels:  big-data
Data Science Career
Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
Stars: ✭ 630 (-45.6%)
Mutual labels:  big-data
Predictionio Template Text Classifier
Text Classification Engine
Stars: ✭ 30 (-97.41%)
Mutual labels:  big-data
Kafka Streams
equivalent to kafka-streams 🐙 for nodejs ✨🐢🚀✨
Stars: ✭ 613 (-47.06%)
Mutual labels:  big-data
Warp
Convert and analyze large data sets at light speed, on Mac and iOS.
Stars: ✭ 62 (-94.65%)
Mutual labels:  big-data
Oozie
Mirror of Apache Oozie
Stars: ✭ 602 (-48.01%)
Mutual labels:  big-data
Qcportal
A client interface to the QCArchive Project (read-only image of QCFractal)
Stars: ✭ 29 (-97.5%)
Mutual labels:  big-data
Giraph
Mirror of Apache Giraph
Stars: ✭ 569 (-50.86%)
Mutual labels:  big-data
Macro ml
Course Website on Macroeconomic Analysis with Machine Learning and Big Data
Stars: ✭ 53 (-95.42%)
Mutual labels:  big-data
Spark
Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+2630.4%)
Mutual labels:  big-data
Hazelcast Cpp Client
Hazelcast IMDG C++ Client
Stars: ✭ 67 (-94.21%)
Mutual labels:  big-data
Rsparkling
RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)
Stars: ✭ 65 (-94.39%)
Mutual labels:  big-data
Nabhash
An extremely fast Non-crypto-safe AES Based Hash algorithm for Big Data
Stars: ✭ 62 (-94.65%)
Mutual labels:  big-data
Oodt
Mirror of Apache OODT
Stars: ✭ 52 (-95.51%)
Mutual labels:  big-data
K8s Ingress Claim
An admission control policy that safeguards against accidental duplicate claiming of Hosts/Domains.
Stars: ✭ 14 (-98.79%)
Mutual labels:  big-data