369 open source projects that are alternatives to, or similar to, Carbondata

Pretzel

Javascript full-stack framework for Big Data visualisation and analysis

Stars: ✭ 26 (-97.75%)

Mutual labels: big-data

Storm

Mirror of Apache Storm

Stars: ✭ 6,297 (+443.78%)

Mutual labels: big-data

Analysispreservation.cern.ch

Source code for the CERN Analysis Preservation portal

Stars: ✭ 37 (-96.8%)

Mutual labels: big-data

Accumulo

Apache Accumulo

Stars: ✭ 857 (-25.99%)

Mutual labels: big-data

H2o 3

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

Stars: ✭ 5,656 (+388.43%)

Mutual labels: big-data

Moosefs

MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)

Stars: ✭ 1,025 (-11.49%)

Mutual labels: big-data

Sqoop

Mirror of Apache Sqoop

Stars: ✭ 817 (-29.45%)

Mutual labels: big-data

Docker Spark Cluster

A Spark cluster setup running on Docker containers

Stars: ✭ 57 (-95.08%)

Mutual labels: big-data

Samza

Mirror of Apache Samza

Stars: ✭ 676 (-41.62%)

Mutual labels: big-data

Skymap

High-throughput gene to knowledge mapping through massive integration of public sequencing data.

Stars: ✭ 29 (-97.5%)

Mutual labels: big-data

Dremio Oss

Dremio - the missing link in modern data

Stars: ✭ 862 (-25.56%)

Mutual labels: big-data

Scanner

Efficient video analysis at scale

Stars: ✭ 569 (-50.86%)

Mutual labels: big-data

Trck

Query engine for TrailDB

Stars: ✭ 48 (-95.85%)

Mutual labels: big-data

Dataflowjavasdk

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

Stars: ✭ 854 (-26.25%)

Mutual labels: big-data

Attic Lens

Mirror of Apache Lens

Stars: ✭ 58 (-94.99%)

Mutual labels: big-data

Bandar Log

Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.

Stars: ✭ 19 (-98.36%)

Mutual labels: big-data

Attaca

Robust, distributed version control for large files.

Stars: ✭ 41 (-96.46%)

Mutual labels: big-data

Titanoboa

Titanoboa makes complex workflows easy. It is a low-code workflow orchestration platform for JVM - distributed, highly scalable and fault tolerant.

Stars: ✭ 787 (-32.04%)

Mutual labels: big-data

Spark Doc Zh

Chinese edition of the official Apache Spark documentation

Stars: ✭ 1,126 (-2.76%)

Mutual labels: big-data

Cython

The most widely used Python to C compiler

Stars: ✭ 6,588 (+468.91%)

Mutual labels: big-data

Metrics

Measure behavior of Java applications

Stars: ✭ 35 (-96.98%)

Mutual labels: big-data

Sdc

Intel® Scalable Dataframe Compiler for Pandas*

Stars: ✭ 623 (-46.2%)

Mutual labels: big-data

Lifion Kinesis

A native Node.js producer and consumer library for Amazon Kinesis Data Streams

Stars: ✭ 54 (-95.34%)

Mutual labels: big-data

Zeppelin

Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.

Stars: ✭ 5,513 (+376.08%)

Mutual labels: big-data

Awesome Scalability

The Patterns of Scalable, Reliable, and Performant Large-Scale Systems

Stars: ✭ 36,688 (+3068.22%)

Mutual labels: big-data

Phoenix

Mirror of Apache Phoenix

Stars: ✭ 867 (-25.13%)

Mutual labels: big-data

Pachyderm

Reproducible Data Science at Scale!

Stars: ✭ 5,305 (+358.12%)

Mutual labels: big-data

Datumbox Framework

Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.

Stars: ✭ 1,063 (-8.2%)

Mutual labels: big-data

Sparkjni

A heterogeneous Apache Spark framework.

Stars: ✭ 11 (-99.05%)

Mutual labels: big-data

Verticapy

VerticaPy is a Python library that exposes scikit-like functionality to conduct data science projects on data stored in Vertica, thus taking advantage of Vertica’s speed and built-in analytics and machine learning capabilities.

Stars: ✭ 59 (-94.91%)

Mutual labels: big-data

Hazelcast Jet

Distributed Stream and Batch Processing

Stars: ✭ 855 (-26.17%)

Mutual labels: big-data

Traildb

TrailDB is an efficient tool for storing and querying series of events

Stars: ✭ 1,029 (-11.14%)

Mutual labels: big-data

Autodl

Automated Deep Learning without ANY human intervention. 1st Solution for AutoDL

Stars: ✭ 854 (-26.25%)

Mutual labels: big-data

Cloud Volume

Read and write Neuroglancer datasets programmatically.

Stars: ✭ 63 (-94.56%)

Mutual labels: big-data

Pyspark Setup Demo

Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks

Stars: ✭ 24 (-97.93%)

Mutual labels: big-data

Couchdb Couch

Mirror of Apache CouchDB

Stars: ✭ 43 (-96.29%)

Mutual labels: big-data

Hadoop For Geoevent

ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.

Stars: ✭ 5 (-99.57%)

Mutual labels: big-data

Ymcache

YMCache is a lightweight object caching solution for iOS and Mac OS X that is designed for highly parallel access scenarios.

Stars: ✭ 58 (-94.99%)

Mutual labels: big-data

Parquet Format

Apache Parquet

Stars: ✭ 800 (-30.92%)

Mutual labels: big-data

Egads

A Java package to automatically detect anomalies in large scale time-series data

Stars: ✭ 997 (-13.9%)

Mutual labels: big-data

Rakam Api

📈 Collect customer event data from your apps. (Note that this project only includes the API collector, not the visualization platform)

Stars: ✭ 772 (-33.33%)

Mutual labels: big-data

Flink Shaded

Apache Flink shaded artifacts repository

Stars: ✭ 67 (-94.21%)

Mutual labels: big-data

Spark Movie Lens

An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset

Stars: ✭ 745 (-35.66%)

Mutual labels: big-data

Esper Tv

Esper instance for TV news analysis

Stars: ✭ 37 (-96.8%)

Mutual labels: big-data

Sciblog support

Support content for my blog

Stars: ✭ 694 (-40.07%)

Mutual labels: big-data

Kibble 1

Apache Kibble - a tool to collect, aggregate and visualize data about any software project

Stars: ✭ 54 (-95.34%)

Mutual labels: big-data

Data Science Career

Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository

Stars: ✭ 630 (-45.6%)

Mutual labels: big-data

Predictionio Template Text Classifier

Text Classification Engine

Stars: ✭ 30 (-97.41%)

Mutual labels: big-data

Kafka Streams

equivalent to kafka-streams 🐙 for nodejs ✨🐢🚀✨

Stars: ✭ 613 (-47.06%)

Mutual labels: big-data

Warp

Convert and analyze large data sets at light speed, on Mac and iOS.

Stars: ✭ 62 (-94.65%)

Mutual labels: big-data

Oozie

Mirror of Apache Oozie

Stars: ✭ 602 (-48.01%)

Mutual labels: big-data

Qcportal

A client interface to the QCArchive Project (read-only image of QCFractal)

Stars: ✭ 29 (-97.5%)

Mutual labels: big-data

Giraph

Mirror of Apache Giraph

Stars: ✭ 569 (-50.86%)

Mutual labels: big-data

Macro ml

Course Website on Macroeconomic Analysis with Machine Learning and Big Data

Stars: ✭ 53 (-95.42%)

Mutual labels: big-data

Spark

Apache Spark - A unified analytics engine for large-scale data processing

Stars: ✭ 31,618 (+2630.4%)

Mutual labels: big-data

Hazelcast Cpp Client

Hazelcast IMDG C++ Client

Stars: ✭ 67 (-94.21%)

Mutual labels: big-data

Rsparkling

RSparkling: Use H2O Sparkling Water from R (Spark + R + Machine Learning)

Stars: ✭ 65 (-94.39%)

Mutual labels: big-data

Nabhash

An extremely fast Non-crypto-safe AES Based Hash algorithm for Big Data

Stars: ✭ 62 (-94.65%)

Mutual labels: big-data

Oodt

Mirror of Apache OODT

Stars: ✭ 52 (-95.51%)

Mutual labels: big-data

K8s Ingress Claim

An admission control policy that safeguards against accidental duplicate claiming of Hosts/Domains.

Stars: ✭ 14 (-98.79%)

Mutual labels: big-data

1-60 of 369 similar projects