lens: Mirror of Apache Lens
Stars: ✭ 57 (-1.72%)
K8s Ingress Claim: An admission control policy that safeguards against accidental duplicate claiming of Hosts/Domains.
Stars: ✭ 14 (-75.86%)
Rakam Api: 📈 Collect customer event data from your apps. (Note that this project only includes the API collector, not the visualization platform)
Stars: ✭ 772 (+1231.03%)
Kafka Streams: equivalent to kafka-streams 🐙 for nodejs ✨🐢🚀✨
Stars: ✭ 613 (+956.9%)
Hadoop For Geoevent: ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-91.38%)
Skymap: High-throughput gene to knowledge mapping through massive integration of public sequencing data.
Stars: ✭ 29 (-50%)
Accumulo: Apache Accumulo
Stars: ✭ 857 (+1377.59%)
Giraph: Mirror of Apache Giraph
Stars: ✭ 569 (+881.03%)
Couchdb: Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
Stars: ✭ 5,166 (+8806.9%)
Pyspark Setup Demo: Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
Stars: ✭ 24 (-58.62%)
Steamvr Undistort: SteamVR lens distortion adjustment utility for spherical lenses
Stars: ✭ 33 (-43.1%)
Traildb: TrailDB is an efficient tool for storing and querying series of events
Stars: ✭ 1,029 (+1674.14%)
Spark Movie Lens: An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+1184.48%)
Awesome Scalability: The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Stars: ✭ 36,688 (+63155.17%)
Data Science Career: Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
Stars: ✭ 630 (+986.21%)
Macro ml: Course Website on Macroeconomic Analysis with Machine Learning and Big Data
Stars: ✭ 53 (-8.62%)
Oozie: Mirror of Apache Oozie
Stars: ✭ 602 (+937.93%)
Dremio Oss: Dremio - the missing link in modern data
Stars: ✭ 862 (+1386.21%)
Pachyderm: Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+9046.55%)
Egads: A Java package to automatically detect anomalies in large scale time-series data
Stars: ✭ 997 (+1618.97%)
Dataflowjavasdk: Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+1372.41%)
Arkime: Arkime (formerly Moloch) is an open source, large scale, full packet capturing, indexing, and database system.
Stars: ✭ 4,994 (+8510.34%)
Onlinestats.jl: Single-pass algorithms for statistics
Stars: ✭ 507 (+774.14%)
Pretzel: JavaScript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (-55.17%)
Metrics: Measure behavior of Java applications
Stars: ✭ 35 (-39.66%)
Bandar Log: Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 19 (-67.24%)
Trck: Query engine for TrailDB
Stars: ✭ 48 (-17.24%)
Sqoop: Mirror of Apache Sqoop
Stars: ✭ 817 (+1308.62%)
Titanoboa: Titanoboa makes complex workflows easy. It is a low-code workflow orchestration platform for JVM - distributed, highly scalable and fault tolerant.
Stars: ✭ 787 (+1256.9%)
Lifion Kinesis: A native Node.js producer and consumer library for Amazon Kinesis Data Streams
Stars: ✭ 54 (-6.9%)
Storm: Mirror of Apache Storm
Stars: ✭ 6,297 (+10756.9%)
Qcportal: A client interface to the QCArchive Project (read-only image of QCFractal)
Stars: ✭ 29 (-50%)
Cython: The most widely used Python to C compiler
Stars: ✭ 6,588 (+11258.62%)
Moosefs: MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Stars: ✭ 1,025 (+1667.24%)
Samza: Mirror of Apache Samza
Stars: ✭ 676 (+1065.52%)
Spark: Apache Spark - A unified analytics engine for large-scale data processing
Stars: ✭ 31,618 (+54413.79%)
Sdc: Intel® Scalable Dataframe Compiler for Pandas*
Stars: ✭ 623 (+974.14%)
H2o 3: H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+9651.72%)
Phoenix: Mirror of Apache Phoenix
Stars: ✭ 867 (+1394.83%)
Zeppelin: Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+9405.17%)
Attaca: Robust, distributed version control for large files.
Stars: ✭ 41 (-29.31%)
Scanner: Efficient video analysis at scale
Stars: ✭ 569 (+881.03%)
Sparkjni: A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-81.03%)
Nipype: Workflows and interfaces for neuroimaging packages
Stars: ✭ 557 (+860.34%)
Oodt: Mirror of Apache OODT
Stars: ✭ 52 (-10.34%)
Thrill: Thrill - An EXPERIMENTAL Algorithmic Distributed Big Data Batch Processing Framework in C++
Stars: ✭ 528 (+810.34%)
Hazelcast Jet: Distributed Stream and Batch Processing
Stars: ✭ 855 (+1374.14%)
Beam: Apache Beam is a unified programming model for Batch and Streaming
Stars: ✭ 5,149 (+8777.59%)
Magellan: Geo Spatial Data Analytics on Spark
Stars: ✭ 507 (+774.14%)
Autodl: Automated Deep Learning without ANY human intervention. 1st Solution for AutoDL [email protected]
Stars: ✭ 854 (+1372.41%)
Ymcache: YMCache is a lightweight object caching solution for iOS and Mac OS X that is designed for highly parallel access scenarios.
Stars: ✭ 58 (+0%)
Kibble 1: Apache Kibble - a tool to collect, aggregate and visualize data about any software project
Stars: ✭ 54 (-6.9%)
Datumbox Framework: Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Stars: ✭ 1,063 (+1732.76%)
Esper Tv: Esper instance for TV news analysis
Stars: ✭ 37 (-36.21%)