Sorry, I thought I replied yesterday, but apparently my comment got lost.
Using a (hash)set instead of a Bloom filter caused memory usage to explode, which led to thrashing/swapping, which in turn led to slow processing times.
The rest of the code runs in constant time per item (well, linear in the size of the item, with num(items) >> size(items[n])), and it did work fine for smaller inputs.
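To illustrate the memory trade-off: a Bloom filter stores only a fixed-size bit array instead of the items themselves, at the cost of occasional false positives. This is a generic sketch of the data structure, not the actual implementation discussed here; the sizes and hash scheme are arbitrary choices for the example.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a fixed bit array plus k salted hashes.
    Total memory is O(m) bits regardless of how many items are added,
    whereas a hash set must keep a copy of every item."""

    def __init__(self, num_bits=1 << 20, num_hashes=5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, item):
        # Derive k bit positions from salted SHA-1 digests of the item.
        for salt in range(self.num_hashes):
            digest = hashlib.sha1(b"%d:" % salt + item.encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # Can return a false positive, but never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter()
bf.add("seen-item")
print("seen-item" in bf)  # always True: no false negatives
print("new-item" in bf)   # False with very high probability
```

The false-positive rate is tunable via `num_bits` and `num_hashes`; for dedup-style workloads a small, bounded error rate is usually an acceptable price for keeping the whole structure resident in RAM.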
Hard to estimate, I'm sorry. The first naive implementation had been running for more than 15 minutes before I decided that was too long and killed it. The rewritten version ran in about 5.