mrjgreen / HyperLogLog

Licence: other
PHP implementation of the HyperLogLog algorithm. Based on Antirez/Redis implementation.

HyperLogLog & MinHash

Note!

This version has been tuned to work with a P value of 14, which gives 2^14 = 16,384 registers. At one byte per register, that is 2^14 bytes = 16 KB.
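The arithmetic behind that figure can be sketched in a few lines. This is an illustrative Python snippet, not part of the PHP library: the function names are made up for the example, and the one-byte-per-register default is an assumption inferred from the 16 KB figure above. The 1.04 / sqrt(m) standard error is the usual approximation from the HyperLogLog literature.

```python
import math


def register_count(p: int) -> int:
    """Number of registers m = 2^p for precision p."""
    return 1 << p


def memory_bytes(p: int, bits_per_register: int = 8) -> int:
    """Storage needed for all registers, rounded up to whole bytes.

    bits_per_register=8 is an assumption matching the 16 KB figure;
    a packed 6-bit layout would need less.
    """
    return (register_count(p) * bits_per_register + 7) // 8


def standard_error(p: int) -> float:
    """Approximate relative standard error, 1.04 / sqrt(m)."""
    return 1.04 / math.sqrt(register_count(p))


for p in (14, 16, 20):
    print(p, register_count(p), memory_bytes(p), round(standard_error(p), 5))
```

For P = 14 this gives 16,384 registers, 16,384 bytes, and a standard error of roughly 0.8%. Raising P lowers the error but multiplies the memory use by 2 for each step.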

A large bias, visible in the graphs below, appears once the set cardinality reaches about 2.5 * 2^P. Polynomial regression has been used to calculate bias offsets, BUT ONLY FOR P = 14. You are free to change the P value, but the bias offsets will then not be applied. Check the code for more information.
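To see where the bias should show up for a given P, the 2.5 * 2^P threshold can be computed directly. The snippet below is a small Python illustration with a hypothetical helper name (`bias_onset`); it does not call the PHP library. In the original HyperLogLog algorithm, 2.5 * m is also the point at which the estimator switches from linear counting to the raw estimate, which may be related, though this README does not say so.

```python
def bias_onset(p: int, factor: float = 2.5) -> int:
    """Cardinality around which the bias is reported to begin: factor * 2^p."""
    return int(factor * (1 << p))


for p in (14, 16, 20):
    # Thousands separators make it easy to compare against the graph axes.
    print(f"P={p}: bias expected near {bias_onset(p):,}")
```

For P = 14, 16 and 20 this yields 40,960, 163,840 and 2,621,440 respectively. Only the P = 14 case has bias offsets applied in this version.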

Some Professional Looking Graphs

#### HyperLogLog

P=14 (graph: HyperLogLog P = 14)

P=16 Note the offset bias around 2.5 * 2^16 ~= 165,000 (graph: HyperLogLog P = 16)

P=20 Note the offset bias around 2.5 * 2^20 ~= 2,600,000 (graph: HyperLogLog P = 20)

#### MinHash

K=8192 (graph: MinHash K = 8192)
