While all this stuff around deepfakes and their detection is interesting, I'm still left wondering why we don't use cryptography to prove whether an image is authentic.
Because the world is impure and the link between the real world and the mathematical domain will never be ironclad.
For instance, say that Joe makes a deepfake and then signs it with his key. Sure, it's beyond doubt (assuming his keys weren't leaked, etc.) that Joe either took or created the picture, but that doesn't in itself tell you whether Joe made a deepfake.
It's the same as in supposed blockchain logistics operations. If Fred the farmer says he harvested x bags' worth of grain but some were stolen before he could ship them, there's no way to mathematically verify whether the theft actually took place or the harvest just came up short.
In both cases you're going to need some kind of monitoring, and that's the purpose deepfake detection algorithms would serve.
I had the same realization several months ago. In the arms race between deepfake detection and generation it's widely understood that generation will win. That's why I started working on Tovera[0].
Will signing come from the camera? At that point you run into DRM.
Besides, what will the camera sign? The produced JPG? The raw file? What about rescaling? Do we want ZKPs proving that a JPG was obtained by nothing more than rescaling and tone-mapping another JPG or RAW file? Those ZKPs are going to be massive and slow to verify.
My main concern is the faking of official statements by public figures; having those statements signed would address the deepfake issue there. Of course, the infrastructure for this has to be in place, and people must know that they can verify the signatures.
If we get to a point in society where unsigned images, or images signed by a questionable source, are viewed skeptically, then that alone is progress. I don't claim to know the final solution, but cryptography seems like a step in the right direction.