I agree it's a fantastic PoC. Jobs' voice sounds very heavily derived from keynote speeches. People speak with different intonation and cadence in different situations, and the uncanny valley effect here comes from having him speak casually on a podcast but in his keynote voice. It'd be like training a Paul McCartney AI voice solely on Beatles songs - it wouldn't sound anything like McCartney's real-life speech.
Joe Rogan, I can guarantee you, just had a MUCH larger amount of background data to feed this thing - I doubt we have nearly enough high-quality audio of Steve Jobs speaking casually to make a convincing facsimile.
A future Steve Jobs, though, in a world where everyone has a smartphone and far more is recorded... that is kind of scary.
Stuff like this kind of makes me want to secretly record myself in very high quality in tons of situations and then put it all in a safe with a clause in my will to have it all donated to some AI research firm.