The goal wasn’t to get the best possible summary, which could be done with a more elaborate prompt detailing exactly how the summary should go. I was just demonstrating that it can get to the end of 14K chars of text and still remember both the task at hand and enough information to solve it.
And it failed at that. As the text gets longer and longer, the lack of synthesis across the contexts you've glued together becomes ever more glaring. It's a decent hack but not a solution. More research is needed into how to forget properly across long contexts.
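For concreteness, here's a minimal sketch of the chunk-and-stitch hack being criticised: split the text, summarise each chunk independently, then summarise the concatenated partial summaries. The `summarize` function is a stand-in for a real model call (e.g. GPT-3); here it just truncates so the pipeline runs without an API key, and the chunk size is an arbitrary assumption.

```python
def summarize(text: str, max_chars: int = 200) -> str:
    # Placeholder for an LLM call; a real version would prompt the model.
    return text[:max_chars]

def chunk(text: str, size: int = 2000) -> list[str]:
    # Naive fixed-size split; real pipelines would split on sentence
    # or paragraph boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]

def recursive_summary(text: str, size: int = 2000) -> str:
    # Summarise each chunk in isolation -- there is no cross-chunk
    # synthesis, which is exactly the weakness noted above.
    partials = [summarize(c) for c in chunk(text, size)]
    combined = " ".join(partials)
    # If the stitched summaries still exceed one context, recurse.
    if len(combined) > size:
        return recursive_summary(combined, size)
    return summarize(combined)

long_text = "word " * 3000  # ~15K chars, past a single-context budget
print(len(recursive_summary(long_text)))
```

Each chunk is summarised with no knowledge of the others, so anything that only becomes salient across chunk boundaries is lost before the final pass ever sees it.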
To be fair, one of the earlier comments said "I wouldn't build a summarisation startup on top of GPT3". But it seems like GPT3 is more than capable of producing summaries that are passable, and it would be far cheaper than humans. It does seem feasible that one could build a startup based on that.
Familiar with it. But as they themselves say, the fine-tuned model (not released) achieves a 5/7 rating only 15% of the time. So 85% of the time the results are not satisfactory.