Arkime: Arkime (formerly Moloch) is an open-source, large-scale, full packet capturing, indexing, and database system.
Stars: ✭ 4,994 (+2501.04%)
data-viz-utils: Functions for easily making publication-quality figures with matplotlib.
Stars: ✭ 16 (-91.67%)
Feast: Feature Store for Machine Learning
Stars: ✭ 2,576 (+1241.67%)
pyspark-algorithms: PySpark Algorithms Book (https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2)
Stars: ✭ 72 (-62.5%)
Onlinestats.jl: Single-pass algorithms for statistics
Stars: ✭ 507 (+164.06%)
lidbox: End-to-end spoken language identification out of the box.
Stars: ✭ 39 (-79.69%)
Uproot4: ROOT I/O in pure Python and NumPy.
Stars: ✭ 80 (-58.33%)
awesome-tools: Curated list of awesome tools and libraries for specific domains
Stars: ✭ 31 (-83.85%)
Pgm Index: 🏅 State-of-the-art learned data structure that enables fast lookup, predecessor, and range searches, as well as updates, in arrays of billions of items, using orders of magnitude less space than traditional indexes
Stars: ✭ 499 (+159.9%)
Bigdata Playground: A complete example of a big data application using Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Stars: ✭ 177 (-7.81%)
leetspeek: Open and collaborative content from leet hackers!
Stars: ✭ 11 (-94.27%)
Fit Sne: Fast Fourier Transform-accelerated Interpolation-based t-SNE (FIt-SNE)
Stars: ✭ 485 (+152.6%)
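awesome-coder-resources: A refueling station on the programming road! [Continuously updated... stars welcome, come back often] [Contents: programming/learning/reading resources, open-source projects, interview questions, websites, books, blogs, tutorials, and more]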
Stars: ✭ 54 (-71.87%)
Setl: A simple Spark-powered ETL framework that just works 🍺
Stars: ✭ 79 (-58.85%)
couchdb-pkg: Apache CouchDB Packaging support files
Stars: ✭ 24 (-87.5%)
Hazelcast: Open-source distributed computation and storage platform
Stars: ✭ 4,662 (+2328.13%)
Quantitative-Big-Imaging-2018: The material for the Quantitative Big Imaging course at ETHZ for the Spring Semester 2018 (latest semester at https://github.com/kmader/Quantitative-Big-Imaging-2019)
Stars: ✭ 50 (-73.96%)
Richdem: High-performance Terrain and Hydrology Analysis
Stars: ✭ 127 (-33.85%)
mmtf-spark: Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Stars: ✭ 20 (-89.58%)
Conjure Up: Deploying complex solutions, magically.
Stars: ✭ 454 (+136.46%)
Clustering4Ever: C4E, a JVM-friendly library written in Scala for both local and distributed (Spark) clustering.
Stars: ✭ 126 (-34.37%)
Circosjs: d3 library to build circular graphs
Stars: ✭ 436 (+127.08%)
Metamodel: Mirror of Apache Metamodel
Stars: ✭ 143 (-25.52%)
bagri: XML/Document DB on top of a distributed cache
Stars: ✭ 40 (-79.17%)
Labs: Research on distributed systems
Stars: ✭ 73 (-61.98%)
Opendata.cern.ch: Source code for the CERN Open Data portal
Stars: ✭ 411 (+114.06%)
masc: Microsoft's contributions for Spark with Apache Accumulo
Stars: ✭ 20 (-89.58%)
Mockneat: MockNeat is a Java 8+ library that facilitates the generation of arbitrary data for your applications.
Stars: ✭ 410 (+113.54%)
Clickhouse: ClickHouse® is a free analytics DBMS for big data
Stars: ✭ 21,089 (+10883.85%)
Vue Virtual Scroll List: ⚡️ A Vue component that supports large data lists with high rendering performance and efficiency.
Stars: ✭ 3,201 (+1567.19%)
Data Accelerator: Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
Stars: ✭ 247 (+28.65%)
Fluo: Apache Fluo
Stars: ✭ 159 (-17.19%)
Aws Etl Orchestrator: A serverless architecture for orchestrating ETL jobs in arbitrarily complex workflows using AWS Step Functions and AWS Lambda.
Stars: ✭ 245 (+27.6%)
Ignite: Apache Ignite
Stars: ✭ 4,027 (+1997.4%)
Kafka Ui: Open-source web GUI for Apache Kafka management
Stars: ✭ 230 (+19.79%)
Appdocs: Application Performance Optimization Summary
Stars: ✭ 1,169 (+508.85%)
Eland: Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Stars: ✭ 235 (+22.4%)
Hive: Apache Hive
Stars: ✭ 4,031 (+1999.48%)
Lite Virtual List: Virtual list component library for Vue with support for waterfall-flow layouts
Stars: ✭ 223 (+16.15%)
Metorikku: A simplified, lightweight ETL Framework based on Apache Spark
Stars: ✭ 361 (+88.02%)
Gun: An open-source cybersecurity protocol for syncing decentralized graph data.
Stars: ✭ 15,172 (+7802.08%)
Flume: Mirror of Apache Flume
Stars: ✭ 2,200 (+1045.83%)
Attic Predictionio: PredictionIO, a machine learning server for developers and ML engineers.
Stars: ✭ 12,522 (+6421.88%)
Fili: Easily make RESTful web services for time series reporting with Big Data analytics engines like Druid and SQL databases.
Stars: ✭ 151 (-21.35%)
Dataflowjavasdk: Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Stars: ✭ 854 (+344.79%)
storm-ml: An online learning algorithm library for Storm
Stars: ✭ 18 (-90.62%)