Stream Framework
Stream Framework is a Python library that lets you build news feeds, activity streams and notification systems using Cassandra and/or Redis. The authors of Stream-Framework also provide a cloud service for feed technology:
Stars: ✭ 4,576 (+32585.71%)
Circosjs
d3 library to build circular graphs
Stars: ✭ 436 (+3014.29%)
Sdc
Intel® Scalable Dataframe Compiler for Pandas*
Stars: ✭ 623 (+4350%)
Beam
Apache Beam is a unified programming model for Batch and Streaming
Stars: ✭ 5,149 (+36678.57%)
External Dns
Configure external DNS servers (AWS Route53, Google CloudDNS and others) for Kubernetes Ingresses and Services
Stars: ✭ 4,749 (+33821.43%)
Courses
Quiz & Assignment of Coursera
Stars: ✭ 454 (+3142.86%)
Hadoop For Geoevent
ArcGIS GeoEvent Server sample Hadoop connector for storing GeoEvents in HDFS.
Stars: ✭ 5 (-64.29%)
Opendata.cern.ch
Source code for the CERN Open Data portal
Stars: ✭ 411 (+2835.71%)
Zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Stars: ✭ 5,513 (+39278.57%)
Thrill
Thrill - An EXPERIMENTAL Algorithmic Distributed Big Data Batch Processing Framework in C++
Stars: ✭ 528 (+3671.43%)
Bigdl
Building Large-Scale AI Applications for Distributed Big Data
Stars: ✭ 3,813 (+27135.71%)
Spark Movie Lens
An on-line movie recommender using Spark, Python Flask, and the MovieLens dataset
Stars: ✭ 745 (+5221.43%)
Magellan
Geo Spatial Data Analytics on Spark
Stars: ✭ 507 (+3521.43%)
Pyspark Setup Demo
Demo of PySpark and Jupyter Notebook with the Jupyter Docker Stacks
Stars: ✭ 24 (+71.43%)
Redislite
Redis in a python module.
Stars: ✭ 464 (+3214.29%)
Gimbal
Gimbal is an ingress load balancing platform capable of routing traffic to multiple Kubernetes and OpenStack clusters. Built by Heptio in partnership with Actapio.
Stars: ✭ 637 (+4450%)
Data Science Ipython Notebooks
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Stars: ✭ 22,048 (+157385.71%)
Hazelcast Jet
Distributed Stream and Batch Processing
Stars: ✭ 855 (+6007.14%)
H2o 3
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Stars: ✭ 5,656 (+40300%)
Mockneat
MockNeat is a Java 8+ library that facilitates the generation of arbitrary data for your applications.
Stars: ✭ 410 (+2828.57%)
Orc
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Stars: ✭ 389 (+2678.57%)
Scanner
Efficient video analysis at scale
Stars: ✭ 569 (+3964.29%)
Couchdb
Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
Stars: ✭ 5,166 (+36800%)
Hive
Apache Hive
Stars: ✭ 4,031 (+28692.86%)
Storm
Mirror of Apache Storm
Stars: ✭ 6,297 (+44878.57%)
Arkime
Arkime (formerly Moloch) is an open source, large scale, full packet capturing, indexing, and database system.
Stars: ✭ 4,994 (+35571.43%)
Pretzel
Javascript full-stack framework for Big Data visualisation and analysis
Stars: ✭ 26 (+85.71%)
Onlinestats.jl
Single-pass algorithms for statistics
Stars: ✭ 507 (+3521.43%)
Cython
The most widely used Python to C compiler
Stars: ✭ 6,588 (+46957.14%)
Pgm Index
🏅State-of-the-art learned data structure that enables fast lookup, predecessor, range searches and updates in arrays of billions of items using orders of magnitude less space than traditional indexes
Stars: ✭ 499 (+3464.29%)
Accumulo
Apache Accumulo
Stars: ✭ 857 (+6021.43%)
Fit Sne
Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE)
Stars: ✭ 485 (+3364.29%)
Samza
Mirror of Apache Samza
Stars: ✭ 676 (+4728.57%)
Hazelcast
Open-source distributed computation and storage platform
Stars: ✭ 4,662 (+33200%)
Bandar Log
Monitoring tool to measure flow throughput of data sources and processing components that are part of Data Ingestion and ETL pipelines.
Stars: ✭ 19 (+35.71%)
Conjure Up
Deploying complex solutions, magically.
Stars: ✭ 454 (+3142.86%)
Data Science Career
Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
Stars: ✭ 630 (+4400%)
Application Gateway Kubernetes Ingress
This is an ingress controller that can be run on Azure Kubernetes Service (AKS) to allow an Azure Application Gateway to act as the ingress for an AKS cluster.
Stars: ✭ 448 (+3100%)
Dremio Oss
Dremio - the missing link in modern data
Stars: ✭ 862 (+6057.14%)
Cortx
CORTX Community Object Storage is 100% open source object storage uniquely optimized for mass capacity storage devices.
Stars: ✭ 426 (+2942.86%)
Kafka Streams
equivalent to kafka-streams 🐙 for nodejs ✨🐢🚀✨
Stars: ✭ 613 (+4278.57%)
Datascience Ai Machinelearning Resources
Alex Castrounis' curated set of resources for artificial intelligence (AI), machine learning, data science, internet of things (IoT), and more.
Stars: ✭ 414 (+2857.14%)
Sqoop
Mirror of Apache Sqoop
Stars: ✭ 817 (+5735.71%)
Cogcomp Nlp
CogComp's Natural Language Processing libraries and Demos:
Stars: ✭ 410 (+2828.57%)
Oozie
Mirror of Apache Oozie
Stars: ✭ 602 (+4200%)
Decentralized Internet
A SDK/library for decentralized web and distributing computing projects
Stars: ✭ 406 (+2800%)
Dataflowjavasdk
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+6000%)
Giraph
Mirror of Apache Giraph
Stars: ✭ 569 (+3964.29%)
Ignite
Apache Ignite
Stars: ✭ 4,027 (+28664.29%)
Titanoboa
Titanoboa makes complex workflows easy. It is a low-code workflow orchestration platform for JVM - distributed, highly scalable and fault tolerant.
Stars: ✭ 787 (+5521.43%)
Pachyderm
Reproducible Data Science at Scale!
Stars: ✭ 5,305 (+37792.86%)
Phoenix
Mirror of Apache Phoenix
Stars: ✭ 867 (+6092.86%)
Sparkjni
A heterogeneous Apache Spark framework.
Stars: ✭ 11 (-21.43%)
Autodl
Automated Deep Learning without ANY human intervention. 1st Solution for AutoDL [email protected]
Stars: ✭ 854 (+6000%)
Rakam Api
📈 Collect customer event data from your apps. (Note that this project only includes the API collector, not the visualization platform)
Stars: ✭ 772 (+5414.29%)
Nipype
Workflows and interfaces for neuroimaging packages
Stars: ✭ 557 (+3878.57%)