Sorry, I thought I replied yesterday, but apparently my comment got lost.
Using a (hash)set instead of a Bloom filter caused memory usage to explode, which led to thrashing/swapping, which in turn led to slow processing times.
The rest of the code runs in constant time per item (well, linear in the size of the item, with num(items) >> size(items[n])), and it did work fine for smaller inputs.
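To illustrate the memory trade-off: a Bloom filter stores only a fixed-size bit array instead of the items themselves, at the cost of occasional false positives. This is a generic sketch of the data structure, not the actual implementation discussed here; the sizes and hash scheme are arbitrary choices for the example.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a fixed bit array plus k salted hashes.
    Total memory is O(m) bits regardless of how many items are added,
    whereas a hash set must keep a copy of every item."""

    def __init__(self, num_bits=1 << 20, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive k bit positions from salted SHA-1 digests of the item.
        for salt in range(self.num_hashes):
            digest = hashlib.sha1(b"%d:" % salt + item.encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # Can return a false positive, but never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter()
bf.add("seen-item")
print("seen-item" in bf)  # always True: no false negatives
print("new-item" in bf)   # False with very high probability
```

The false-positive rate is tunable via `num_bits` and `num_hashes`; for dedup-style workloads a small, bounded error rate is usually an acceptable price for keeping the whole structure resident in RAM.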
Hard to estimate, I'm sorry. The first naive implementation had been running for more than 15 minutes before I decided that was too long and killed it. The rewritten version ran in about 5.