HyperLogLog

From wikieduonline

HyperLogLog (see wikipedia:HyperLogLog) is an algorithm for the count-distinct problem: approximating the number of distinct elements in a multiset.

Calculating the exact cardinality of a multiset requires memory proportional to the cardinality, which is impractical for very large data sets. The HyperLogLog algorithm can estimate cardinalities of more than 10^9 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory.
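To make the idea concrete, here is a minimal Python sketch of the basic HyperLogLog estimator. It is an illustration under stated assumptions, not a production implementation: the class name `HyperLogLog`, the choice of SHA-1 truncated to 64 bits as the hash, and the default `p = 11` (2048 registers) are all choices made for this example. It includes the small-range (linear counting) correction but none of the HyperLogLog++ refinements.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch with 2**p registers."""

    def __init__(self, p: int = 11):
        if not 4 <= p <= 16:
            raise ValueError("p must be between 4 and 16")
        self.p = p
        self.m = 1 << p                  # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant alpha_m from the original analysis.
        if self.m == 16:
            self.alpha = 0.673
        elif self.m == 32:
            self.alpha = 0.697
        elif self.m == 64:
            self.alpha = 0.709
        else:
            self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value: str) -> None:
        # 64-bit hash taken from SHA-1 (an arbitrary choice for this sketch).
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        width = 64 - self.p
        idx = h >> width                  # first p bits select a register
        rest = h & ((1 << width) - 1)     # remaining bits feed the rank
        # Rank = 1-based position of the leftmost 1-bit in `rest`
        # (width + 1 if `rest` is zero).
        rank = width - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self) -> float:
        # Raw estimate: normalized harmonic mean of 2**register.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return estimate


if __name__ == "__main__":
    hll = HyperLogLog(p=11)
    for i in range(100_000):
        hll.add(f"user-{i}")
    print(round(hll.count()))  # an estimate close to 100000, not the exact value
```

Adding the same value twice never changes the registers, which is why duplicates do not affect the estimate. Because the hash here is 64 bits wide, the sketch omits the large-range correction that the original 32-bit formulation needs.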

/etc/redis.conf

HyperLogLog++ is a refinement of HyperLogLog that computes its estimate from hashes of the values and has the following properties:

  • Configurable precision, which controls the trade-off between memory and accuracy
  • Excellent accuracy on low-cardinality sets
  • Fixed memory usage: no matter if there are tens or billions of unique values, memory usage only depends on the configured precision.
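The precision trade-off can be worked through numerically. The sketch below assumes a dense layout of 2^p registers at 6 bits each and uses the relative standard error of about 1.04 / sqrt(m) from the original HyperLogLog analysis. The helper name `hll_budget` and the 6-bit register width are assumptions for this example; real HyperLogLog++ implementations add sparse encodings and bias correction, so their actual memory use and accuracy can differ.

```python
import math
from typing import Tuple


def hll_budget(p: int, bits_per_register: int = 6) -> Tuple[int, float]:
    """Return (dense memory in bytes, relative standard error) for 2**p registers."""
    m = 1 << p                                  # number of registers
    memory_bytes = m * bits_per_register // 8   # dense register array only
    std_error = 1.04 / math.sqrt(m)             # approximate relative error
    return memory_bytes, std_error


if __name__ == "__main__":
    for p in (10, 11, 12, 14, 16):
        memory_bytes, std_error = hll_budget(p)
        print(f"p={p:2d}  memory={memory_bytes:6d} B  std_error={std_error:.2%}")
```

Under these assumptions, p = 11 gives 1536 bytes and an error of roughly 2.3%, in line with the 1.5 kB and 2% figures quoted earlier. p = 14 gives 12288 bytes and roughly 0.81%, which matches the dense representation Redis uses. In both cases the memory depends only on p, not on how many values are counted.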


