All Projects → dgryski → Go Fastquantiles

dgryski / Go Fastquantiles

approximate streaming quantiles

Programming Languages

go
31211 projects - #10 most used programming language

Approximate streaming quantiles

NOTE: THIS CODE IS CURRENTLY BROKEN

For a fast streaming quantile implementation, please see http://github.com/dgryski/go-gk

My goal is to implement:

"A Fast Algorithm for Approximate Quantiles in High Speed Data Streams" (Qi Zhang and Wei Wang, 2007).

http://www.cs.unc.edu/~zhangq/PUBLICATIONS/SSDBM07-Qi%20Zhang-FastQStream.pdf

Course notes: http://www.mathcs.emory.edu/~cheung/Courses/584-StreamDB/Syllabus/08-Quantile/Zhang.html http://www.mathcs.emory.edu/~cheung/Courses/584-StreamDB/Syllabus/08-Quantile/Zhang2.html

Zhang/Wang is based on the algorithm in "Space-Efficient Online Computation of Quantile Summaries" (Greenwald, Khanna, 2001) http://infolab.stanford.edu/~datar/courses/cs361a/papers/quantiles.pdf

So I decided to start with GK. There's a good overview at http://papercruncher.com/2010/03/02/stream-algorithms-order-statistics/

and more detailed notes at http://www.mathcs.emory.edu/~cheung/Courses/584-StreamDB/Syllabus/08-Quantile/Greenwald.html http://www.mathcs.emory.edu/~cheung/Courses/584-StreamDB/Syllabus/08-Quantile/Greenwald2.html

The MERGE algorithm is taken from "Continuously Maintaining Quantile Summaries of the Most Recent N Elements over a Data Stream" (Lin, H. Lu, J. Xu, and J.X. Yu, 2004) http://www.cs.ubc.ca/~xujian/paper/quant.pdf

Note that the project description data, including the texts, logos, images, and/or trademarks, for each open source project belongs to its rightful owner. If you wish to add or remove any projects, please contact us at [email protected].