Have to be careful there. A jihad against duplication means that poor-quality scans will drive out good ones, or prevent them from ever being created. Especially if you're misguided enough to optimize for minimum file size.
I agree with samatman's position below: as long as the format is the slightest bit lossy -- and it always will be -- aggressive deduplication has more downsides than upsides.
Deduplication doesn't have to mean removal. It could just mean tagging. It would be very nice to be able to fetch the "best filesize" version of the entire collection, then pull down the "best quality" editions of only the few things I'm particularly interested in.
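To make the tagging idea concrete, here is a rough sketch of how it might look. Everything in it is an assumption for illustration: the `Scan` record, the `work_id` that groups scans of the same work, and the numeric `quality` score are all hypothetical, and how you'd actually score scan quality is the hard part this glosses over.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    work_id: str     # hypothetical: shared by every scan of the same work
    filename: str
    size_bytes: int
    quality: float   # hypothetical score, higher is better


def tag_editions(scans):
    """Group scans by work and tag the notable ones; nothing is deleted."""
    by_work = defaultdict(list)
    for scan in scans:
        by_work[scan.work_id].append(scan)
    return {
        work_id: {
            "best_filesize": min(group, key=lambda s: s.size_bytes),
            "best_quality": max(group, key=lambda s: s.quality),
        }
        for work_id, group in by_work.items()
    }


def select(tags, default="best_filesize", overrides=None):
    """Pick one edition per work: the default tag, unless overridden."""
    overrides = overrides or {}
    return {
        work_id: editions[overrides.get(work_id, default)]
        for work_id, editions in tags.items()
    }
```

With that, `select(tags)` gives you the small version of everything, and `select(tags, overrides={"some_work": "best_quality"})` swaps in the good scan for just the works you care about. Every scan stays in the collection either way.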