This ipython notebook is a brief, interactive introduction to the Count Min (CM) Sketch, a method of estimating the number of occurances of a particular item in a set. The CM sketch is well-suited to estimating "heavy-hitters" in a streaming data set.
First, the notebook implements an algorithm for the CM sketch, taken from here. Exact and approximate frequencies for the heavy hitters are calculated and compared for various configurations of the CM sketch. Some timing tests are done.