Call me stupid, but why do you have to disassemble the program to get the symbol pointers when you just finished compiling the thing in the first place?
Couldn't this intermediate step of pointer collection be part of the prelink process and skip all this guesswork?
If the source is in a single language, compiled by a single tool at a single version, and entirely linked by a single linker strictly at the end of the process, then that might be feasible. But that's probably not the case. If there is much heterogeneity at all, getting proper symbol information may be a lot more work than analysing the binary for control and data flow. For example, intermediate linking steps may resolve symbol fixups internally and leave behind only a list of relocations, or use strictly relative addresses within what looks to the final linker like a monolithic blob.
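To make that concrete, here's a minimal sketch (Python) of the kind of heuristic a binary-analysis pass can fall back on when symbol information isn't available or trustworthy: scan a flat image for 32-bit little-endian values that land inside the image's own load range and treat those as pointer candidates. The load address and image bytes are invented for illustration; this is a toy heuristic, not any real tool's actual algorithm.

    import struct

    def find_pointer_candidates(image, load_base):
        """Return offsets of 32-bit LE values that fall inside the
        image's own address range -- plausible absolute pointers,
        recoverable with no symbol table at all."""
        lo, hi = load_base, load_base + len(image)
        hits = []
        for off in range(len(image) - 3):
            (val,) = struct.unpack_from("<I", image, off)
            if lo <= val < hi:
                hits.append(off)
        return hits

    # Tiny demo: an 8-byte "image" loaded at a made-up base address,
    # whose first word points back into the image itself.
    img = struct.pack("<II", 0x400004, 0xDEADBEEF)
    print(find_pointer_candidates(img, 0x400000))  # -> [0]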
If we're talking about a way to generate an upgrade image for a single architecture, taking a known version # of an executable to another known version #, built on a production build machine, I would think that could cover the case easily.
That kind of led to my second question: this makes really small images, but only from one known version to another, right? What happens if the target to be upgraded is 80 revisions behind?
I can't speak to what Chrome does, but usually there would be rollup patches to take advantage of, cutting down the number of patches to apply dramatically. E.g. to get from 23 to 76 you might only need to apply 23->25->75->76. This would be slightly less efficient than going directly, both in terms of patching time and patch size, but not terribly so (one can assume most patches are decently disjoint). Most importantly, it keeps the total number of supported patches manageable.
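As a sketch of how an updater might pick a chain like that, here's a toy (Python) shortest-chain search over a hypothetical set of published patches. The version numbers and the patch graph are invented to match the example above; this is not a claim about how Chrome's updater actually works.

    from collections import deque

    # Hypothetical published patches (from_version, to_version),
    # including rollups like 25->75; numbers are illustrative only.
    PATCHES = {(23, 25), (25, 75), (75, 76), (23, 24), (24, 25)}

    def patch_chain(current, target):
        """BFS over available patches to find the chain with the
        fewest hops from `current` to `target` (None if unreachable)."""
        queue = deque([(current, [])])
        seen = {current}
        while queue:
            version, chain = queue.popleft()
            if version == target:
                return chain
            for src, dst in PATCHES:
                if src == version and dst not in seen:
                    seen.add(dst)
                    queue.append((dst, chain + [(src, dst)]))
        return None

    print(patch_chain(23, 76))  # -> [(23, 25), (25, 75), (75, 76)]

Because BFS explores by hop count, it always finds the fewest-patches route through whatever rollups exist, which is the property the scheme above relies on.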