Revenge Porn is the Price of a Free Society

Sorry to break into a non-programming rant. No python today.

Over the past week or so, there has been a small war within journalism. Charlotte Laws, a reporter, discovered that another reporter was writing about her daughter. Specifically, this reporter (Hunter Moore) discovered, via dubious sources, compromising information about her …

more ...


How to measure a changing conversion rate (with python code)

As the owner of the spamblog http://www.iwishiwastaller.com, I've run into the following problem. I'm selling some height-enhancing pills full of organic free-range snake oil. I've come up with several different calls to action:

  • Tired of finding pants that fit? Click here for the solution.
  • Click …
more ...




Postgres NOTIFY for cache busting and more

"There are only two hard things in Computer Science: cache invalidation and naming things."

Phil Karlton

For those of us using Postgres as a data store, cache invalidation has become significantly easier. Postgres has introduced the NOTIFY command, which can be used to tell the cache when an entry needs to be invalidated.
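
To make the idea concrete, here is a minimal sketch of the listening side, assuming psycopg2 as the driver and a plain dict as the cache. The channel name `cache_invalidation`, the helper names, and the convention that the notification payload is the cache key are all my own assumptions for illustration, not anything Postgres prescribes.

```python
import select

CHANNEL = "cache_invalidation"  # hypothetical channel name


def apply_invalidation(cache, payload):
    """Drop the cache entry named by a notification payload, if present."""
    cache.pop(payload, None)


def listen_forever(dsn, cache, timeout=5.0):
    """Wait on a Postgres connection and evict keys as notifications arrive."""
    # Imported here so the rest of the sketch runs without psycopg2 installed.
    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect(dsn)
    # Autocommit, so the LISTEN is committed immediately and
    # notifications start being delivered to this session.
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cur = conn.cursor()
    cur.execute("LISTEN " + CHANNEL + ";")

    while True:
        # Sleep until the connection's socket is readable or the timeout hits.
        if select.select([conn], [], [], timeout) == ([], [], []):
            continue
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            apply_invalidation(cache, note.payload)
```

On the writing side, something like `NOTIFY cache_invalidation, 'user:42';` (issued by the application or from a trigger) would then cause the listener to drop the `user:42` entry. Keeping `apply_invalidation` separate from the connection loop means the eviction logic can be exercised without a database.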

The old …

more ...

Compound Aggregates in Hadoop/Scalding

Consider the following problem. I have an extremely large number of servers, each of which uploads their logs to a Hadoop cluster. Each line of the log file contains a server IP address, and represents a single message in Hadoop. I'm investigating a network intrusion. One of my network admins …

more ...

Don't use Hadoop - your data isn't that big

image possibly inspired by this post

"So, how much experience do you have with Big Data and Hadoop?" they asked me. I told them that I use Hadoop all the time, but rarely for jobs larger than a few TB. I'm basically a big data neophyte - I know the concepts, I've written code, but never at …

more ...

java.lang.OutOfMemoryError, GC overhead limit exceeded

One annoying error which I often see when running Hadoop jobs is this:

java.lang.OutOfMemoryError: GC overhead limit exceeded

The cause of this error is that Java is spending most of its time inside the garbage collector while freeing up very little memory. When this error …

more ...