Concurrent Bloom filters

Bloom filters are probabilistic data structures for determining whether an element is in a set. Such a data structure offers only two methods: add and mightContain. Google's Guava library offers a nice implementation, but unfortunately this implementation (like every implementation I've found) is not concurrent. Concurrent reads are no problem …
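
To make the two-method interface concrete, here is a minimal sketch of a Bloom filter whose bits live in an AtomicLongArray, so that writes from several threads do not lose updates. This is only an illustration under stated assumptions: the class name SketchBloomFilter, the double-hashing scheme built on hashCode(), and the constructor parameters are all hypothetical, and this is neither Guava's implementation nor necessarily the approach taken in the full post.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical illustration only: not Guava's BloomFilter and not the post's code.
class SketchBloomFilter {
    private final AtomicLongArray words; // bit storage, 64 bits per slot
    private final int numBits;           // total number of bits in the filter
    private final int numHashes;         // number of bit positions set per element

    SketchBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.numBits = numBits;
        this.numHashes = numHashes;
        this.words = new AtomicLongArray((numBits + 63) / 64);
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    // Assumption: hashCode() is good enough for a demonstration.
    private int bitIndex(Object element, int i) {
        int h1 = element.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1; // force an odd step
        return Math.floorMod(h1 + i * h2, numBits);
    }

    // Atomically OR each selected bit into its slot, so concurrent adds
    // that touch the same slot cannot overwrite each other.
    void add(Object element) {
        for (int i = 0; i < numHashes; i++) {
            int bit = bitIndex(element, i);
            long mask = 1L << (bit & 63);
            words.getAndUpdate(bit >>> 6, w -> w | mask);
        }
    }

    // Returns false only if the element was definitely never added;
    // true means "possibly added" (false positives are expected).
    boolean mightContain(Object element) {
        for (int i = 0; i < numHashes; i++) {
            int bit = bitIndex(element, i);
            if ((words.get(bit >>> 6) & (1L << (bit & 63))) == 0) {
                return false;
            }
        }
        return true;
    }
}
```

Bits are only ever set, never cleared, so a reader can at worst miss an element whose add is still in progress; it can never see a previously added element disappear.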

more ...


How to lie without statistics - ProPublica edition

almost statistically significant

I've recently discovered a four-step process for lying without statistics. Here it is:

  1. Write down the conclusion.
  2. Run a statistical analysis.
  3. If the statistical analysis agrees with your conclusion, publish it.
  4. If the statistical analysis disagrees with your conclusion, write some anecdotes and allude to the fact that you …

more ...

Robots didn't take our jobs

I live a strange life: due to a funny personal situation, I travel regularly between India and the US. One of the starkest differences I observe is how many jobs machines have taken in the wealthy US, leaving the US with a dire surplus of labor.

A day in the life …

more ...