joshday / OnlineStatsBase.jl

Licence: other

Base types for OnlineStats.


Projects that are alternatives of or similar to OnlineStatsBase.jl

Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsights or Databricks while enabling the full power of the Spark engine.

Stars: ✭ 247 (+850%)

Mutual labels: big-data, streaming-data

OnlineStats.jl

Single-pass algorithms for statistics

Stars: ✭ 507 (+1850%)

Mutual labels: big-data, streaming-data

nebula

A distributed block-based data storage and compute engine

Stars: ✭ 127 (+388.46%)

Mutual labels: big-data

ByteSlice

"Byteslice: Pushing the envelop of main memory data processing with a new storage layout" (SIGMOD'15)

Stars: ✭ 24 (-7.69%)

Mutual labels: big-data

xcast

A High-Performance Data Science Toolkit for the Earth Sciences

Stars: ✭ 28 (+7.69%)

Mutual labels: big-data

cloudberry

Big Data Visualization

Stars: ✭ 89 (+242.31%)

Mutual labels: big-data

awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

Stars: ✭ 11,093 (+42565.38%)

Mutual labels: streaming-data

sparkucx

A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer

Stars: ✭ 32 (+23.08%)

Mutual labels: big-data

godsend

A simple and eloquent workflow for streaming messages to micro-services.

Stars: ✭ 15 (-42.31%)

Mutual labels: streaming-data

bigquery-kafka-connect

☁️ nodejs kafka connect connector for Google BigQuery

Stars: ✭ 17 (-34.62%)

Mutual labels: big-data

Big-Data-Demo

A data visualization demo project built with Vue, three.js, and echarts, featuring 3D model import and interaction, 3D model annotation, and more.

Stars: ✭ 146 (+461.54%)

Mutual labels: big-data

arrow-datafusion

Apache Arrow DataFusion SQL Query Engine

Stars: ✭ 2,360 (+8976.92%)

Mutual labels: big-data

insightedge

InsightEdge Core

Stars: ✭ 22 (-15.38%)

Mutual labels: big-data

talaria

TalariaDB is a distributed, highly available, and low latency time-series database for Presto

Stars: ✭ 148 (+469.23%)

Mutual labels: big-data

incubator-liminal

Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.

Stars: ✭ 117 (+350%)

Mutual labels: big-data

MLBD

Materials for "Machine Learning on Big Data" course

Stars: ✭ 20 (-23.08%)

Mutual labels: big-data

beekeeper

Service for automatically managing and cleaning up unreferenced data

Stars: ✭ 43 (+65.38%)

Mutual labels: big-data

LoL-Match-Prediction

Win probability predictions for League of Legends matches using neural networks

Stars: ✭ 34 (+30.77%)

Mutual labels: big-data

meetups-archivos

Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through meetups, workshops, and seed programs …

Stars: ✭ 60 (+130.77%)

Mutual labels: big-data

Twitter-Stream-API-Dataset

Twitter Dynamic Dataset Api. Create any dataset YOU want.

Stars: ✭ 20 (-23.08%)

Mutual labels: streaming-data


OnlineStatsBase

This package defines the basic types and interface for OnlineStats.

Interface

Required

_fit!(stat, y): Update the "sufficient statistics" of the estimator from a single observation y.

Required (with Defaults)

value(stat, args...; kw...) = <first field of struct>: Calculate the value of the estimator from the "sufficient statistics".
nobs(stat) = stat.n: Return the number of observations.
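The defaults above only fit when the estimate is stored in the first field. As a minimal sketch of the other case, the hypothetical MyVariance type below (not part of OnlineStatsBase) keeps a running mean and sum of squared deviations as its sufficient statistics and overrides value to compute the sample variance from them. It keeps a field named n, so the default nobs still applies. The update rule is Welford's algorithm, chosen here purely for illustration.

```julia
using OnlineStatsBase

# Hypothetical example type: field names mu and m2 are illustrative choices.
mutable struct MyVariance <: OnlineStat{Number}
    mu::Float64   # running mean
    m2::Float64   # running sum of squared deviations from the mean
    n::Int        # observation count, read by the default nobs(stat)
    MyVariance() = new(0.0, 0.0, 0)
end

# Welford's single-pass update for one observation y.
function OnlineStatsBase._fit!(o::MyVariance, y)
    o.n += 1
    delta = y - o.mu
    o.mu += delta / o.n
    o.m2 += delta * (y - o.mu)
end

# Override the default (first field) with a value computed from the
# sufficient statistics; undefined for fewer than two observations.
OnlineStatsBase.value(o::MyVariance) = o.n > 1 ? o.m2 / (o.n - 1) : NaN

o = fit!(MyVariance(), [1.0, 2.0, 3.0, 4.0])
value(o)  # sample variance of the four observations
nobs(o)   # number of observations, via the default nobs
```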

Optional

_merge!(stat1, stat2): Merge stat2 into stat1 (an error by default in OnlineStatsBase versions >= 1.5).
Base.empty!(stat): Return the stat to its initial state (an error by default).

Example

Make a subtype of OnlineStat and give it a _fit!(::OnlineStat{T}, y::T) method.
T is the type of a single observation. Make sure T is wide enough to cover every observation you expect to fit (for example, Number rather than Float64).

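using OnlineStatsBase

# OnlineStat{Number}: each single observation is a Number.
mutable struct MyMean <: OnlineStat{Number}
    value::Float64  # first field, so the default value(stat) returns it
    n::Int          # observation count, read by the default nobs(stat)
    MyMean() = new(0.0, 0)
end
# Update the running mean with one observation y.
function OnlineStatsBase._fit!(o::MyMean, y)
    o.n += 1
    o.value += (1 / o.n) * (y - o.value)
end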

That's all there is to it!

y = randn(1000)

o = fit!(MyMean(), y)
# MyMean: n=1_000 | value=0.0530535
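The optional methods from the interface list can be added to the same type. The following is a sketch, not the package's own implementation: it repeats the MyMean definition so it runs on its own, merges two means by weighting each with its observation count, and resets both fields in empty!. It assumes that merge! on an OnlineStat dispatches to _merge!, as the interface description suggests.

```julia
using OnlineStatsBase

mutable struct MyMean <: OnlineStat{Number}
    value::Float64
    n::Int
    MyMean() = new(0.0, 0)
end

function OnlineStatsBase._fit!(o::MyMean, y)
    o.n += 1
    o.value += (1 / o.n) * (y - o.value)
end

# Merge b into a: the combined mean is the count-weighted average.
function OnlineStatsBase._merge!(a::MyMean, b::MyMean)
    n = a.n + b.n
    n == 0 && return a          # nothing to merge
    a.value += (b.n / n) * (b.value - a.value)
    a.n = n
    return a
end

# Return the stat to its initial state.
function Base.empty!(o::MyMean)
    o.value = 0.0
    o.n = 0
    return o
end

a = fit!(MyMean(), [1.0, 2.0, 3.0])
b = fit!(MyMean(), [4.0, 5.0])
merge!(a, b)   # a now summarizes all five observations
```

Merging this way lets separate chunks of data be fitted independently (for example, in parallel) and combined afterwards without revisiting the observations.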
