Library for multi-dimensional, multi-sensor, uni/multivariate time series data analysis, unsupervised feature selection, unsupervised deep anomaly detection, and a prototype of explainable AI for anomaly detectors

Stars: ✭ 80 (+3.9%)

Mutual labels: feature-engineering

kaggle-berlin

Material of the Kaggle Berlin meetup group!

Stars: ✭ 36 (-53.25%)

Mutual labels: feature-engineering

zdh web

Big data collection and extraction platform

Stars: ✭ 292 (+279.22%)

Mutual labels: bigdata

soda-spark

Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes

Stars: ✭ 58 (-24.68%)

Mutual labels: pyspark

traefik-ondemand-service

Traefik ondemand service for the traefik ondemand plugin

Stars: ✭ 35 (-54.55%)

Mutual labels: scale

intersect

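Thoughts on an interview question - finding the intersection of 60 million data packets and 3 million data packets within a 50 MB memory limit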

Stars: ✭ 54 (-29.87%)

Mutual labels: bigdata

StreamBench

Measuring the performance of popular streaming engines with Yahoo's Streaming Benchmark

Stars: ✭ 52 (-32.47%)

Mutual labels: bigdata

clink

Clink is a library that provides APIs and infrastructure to facilitate the development of parallelizable feature engineering operators that can be used in both C++ and Java runtimes.

Stars: ✭ 24 (-68.83%)

Mutual labels: feature-engineering

pyspark-algorithms

PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2

Stars: ✭ 72 (-6.49%)

Mutual labels: pyspark

feng

feng - feature engineering for machine-learning champions

Stars: ✭ 27 (-64.94%)

Mutual labels: feature-engineering

hayabusa

Hayabusa: Simple and Fast Full-Text Search Engine for Massive System Log Data

Stars: ✭ 43 (-44.16%)

Mutual labels: bigdata

jhdf

A pure Java HDF5 library

Stars: ✭ 83 (+7.79%)

Mutual labels: bigdata

dot

distributed data sync with operational transformation/transforms

Stars: ✭ 73 (-5.19%)

Mutual labels: transformation

jgit-spark-connector

jgit-spark-connector is a library for running scalable data retrieval pipelines that process any number of Git repositories for source code analysis.

Stars: ✭ 71 (-7.79%)

Mutual labels: pyspark

awesome-coder-resources

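A refueling station on the programming road! [Continuously updated... stars welcome, come back often] [Contents: programming/learning/reading resources, open-source projects, interview questions, websites, books, blogs, tutorials, and more]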

Stars: ✭ 54 (-29.87%)

Mutual labels: bigdata

dt-sql-parser

SQL Parsers for BigData, built with antlr4.

Stars: ✭ 135 (+75.32%)

Mutual labels: bigdata

einet

Uncertainty and causal emergence in complex networks

Stars: ✭ 77 (+0%)

Mutual labels: scale

gintonic

A declarative transformation language for GraphQL 🍸

Stars: ✭ 27 (-64.94%)

Mutual labels: transformation

stargan2

StarGAN2 for practice

Stars: ✭ 89 (+15.58%)

Mutual labels: transformation

163-bigdate-note

bigdata note

Stars: ✭ 38 (-50.65%)

Mutual labels: bigdata

learn-by-examples

Real-world Spark pipeline examples

Stars: ✭ 84 (+9.09%)

Mutual labels: pyspark

flask-spark-docker

Just a boilerplate for PySpark and Flask

Stars: ✭ 32 (-58.44%)

Mutual labels: pyspark

hedgedhttp

Hedged HTTP client which helps to reduce tail latency at scale.

Stars: ✭ 103 (+33.77%)

Mutual labels: scale

greycat

GreyCat - Data Analytics, Temporal data, What-if, Live machine learning

Stars: ✭ 104 (+35.06%)

Mutual labels: bigdata

Clustering4Ever

C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.

Stars: ✭ 126 (+63.64%)

Mutual labels: bigdata

ReinforcementLearning Sutton-Barto Solutions

Solutions and figures for problems from Reinforcement Learning: An Introduction by Sutton & Barto

Stars: ✭ 20 (-74.03%)

Mutual labels: feature-engineering

bigquery-data-lineage

Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.

Stars: ✭ 112 (+45.45%)

Mutual labels: bigdata

2019 egu workshop jupyter notebooks

Short course on interactive analysis of Big Earth Data with Jupyter Notebooks

Stars: ✭ 29 (-62.34%)

Mutual labels: bigdata

young-examples

Sample code for typical application scenarios encountered in Java learning and projects

Stars: ✭ 21 (-72.73%)

Mutual labels: bigdata

PubMed-Best-Match

Machine-learning based pipeline relying on LambdaMART currently used in PubMed for relevance (Best Match) searches

Stars: ✭ 36 (-53.25%)

Mutual labels: feature-engineering

ASV

[CVPR16] Accumulated Stability Voting: A Robust Descriptor from Descriptors of Multiple Scales

Stars: ✭ 26 (-66.23%)

Mutual labels: scale

go-hx711

Golang HX711 interface using periph.io driver

Stars: ✭ 15 (-80.52%)

Mutual labels: scale

GEAN

This toolkit deals with GEnomic sequence and genome structure ANnotation files between inbreeding lines and species.

Stars: ✭ 36 (-53.25%)

Mutual labels: transformation

exemplary-ml-pipeline

Exemplary, annotated machine learning pipeline for any tabular data problem.

Stars: ✭ 23 (-70.13%)

Mutual labels: feature-engineering

scale

📦 Toolkit for mapping abstract data into visual representation.

Stars: ✭ 53 (-31.17%)

Mutual labels: scale

kafka-twitter-spark-streaming

Counting Tweets Per User in Real-Time

Stars: ✭ 38 (-50.65%)

Mutual labels: pyspark

FIFA-2019-Analysis

A project based on the FIFA World Cup 2019 that analyzes the performance and efficiency of teams, players, countries, and related topics using data analysis and data visualization

Stars: ✭ 28 (-63.64%)

Mutual labels: feature-engineering

the-apache-ignite-book

All code samples, scripts, and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above

Stars: ✭ 65 (-15.58%)

Mutual labels: bigdata

nitroml

NitroML is a modular, portable, and scalable model-quality benchmarking framework for Machine Learning and Automated Machine Learning (AutoML) pipelines.

Stars: ✭ 40 (-48.05%)

Mutual labels: scale

skrobot

skrobot is a Python module for designing, running, and tracking Machine Learning experiments / tasks. It is built on top of the scikit-learn framework.

Stars: ✭ 22 (-71.43%)

Mutual labels: feature-engineering

isarn-sketches-spark

Routines and data structures for using isarn-sketches idiomatically in Apache Spark

Stars: ✭ 28 (-63.64%)

Mutual labels: pyspark

Spark-MLlib-Tutorial

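A comprehensive walkthrough of the basic algorithms in Spark MLlib, the machine learning library of the Spark big data framework, with a complete set of test files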

Stars: ✭ 32 (-58.44%)

Mutual labels: bigdata

amas

Amas is a recursive acronym for “Amas, monitor alert system”.

Stars: ✭ 77 (+0%)

Mutual labels: bigdata

lectures-hse-spark

Scalable machine learning and big data analysis with Apache Spark

Stars: ✭ 20 (-74.03%)

Mutual labels: bigdata

BetterDummy

Unlock your displays on your Mac! Smooth scaling, HiDPI unlock, XDR/HDR extra brightness upscale, DDC, brightness and dimming, dummy displays, PIP and lots more!

Stars: ✭ 9,601 (+12368.83%)

Mutual labels: scale

pyspark-cassandra

pyspark-cassandra is a Python port of the awesome @datastax Spark Cassandra connector. Compatible w/ Spark 2.0, 2.1, 2.2, 2.3 and 2.4

Stars: ✭ 70 (-9.09%)

Mutual labels: pyspark

1-60 of 464 similar projects
