
The question was about movie scripts and children's stories, not about paragraphs. Sure, it works OK for paragraphs (even then it fails many times), but the moment you cross the 2,000-token limit, it does not work. When GPT-3 was first released we ran a lot of experiments on summarisation and discussed them at length in the OpenAI Slack, but no one could come up with the right prompt for summarisation. Yes, it works for toy paragraphs and is fun to show off. But I wouldn't build a summarisation startup on top of GPT-3. Yet.


Do it in stages. Chop the text up into sections (using GPT-3 if you have to) and then summarize each page/section/chapter in isolation. Then concatenate the summaries and summarize again. 2048 tokens is more than 5KB of text; it's not that limiting.
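For concreteness, here's a minimal sketch of that staged approach, assuming the pre-1.0 openai-python client and text-davinci-002 (current as of this thread). The chunk size, prompt wording, and helper names (complete, summarize, summarize_long) are illustrative assumptions, not a tested recipe:

    import openai

    openai.api_key = "sk-..."  # assumed: your API key

    def complete(prompt, max_tokens=256):
        # Single completion call; model and sampling parameters are illustrative.
        resp = openai.Completion.create(
            model="text-davinci-002",
            prompt=prompt,
            max_tokens=max_tokens,
            temperature=0.2,
        )
        return resp.choices[0].text.strip()

    def summarize(text):
        return complete(
            "Summarize the following text in a few sentences:\n\n"
            + text + "\n\nSummary:"
        )

    def summarize_long(text, chunk_chars=5000):
        # Base case: the text already fits comfortably in one call.
        if len(text) <= chunk_chars:
            return summarize(text)
        # Stage 1: split into sections and summarize each in isolation.
        # (A real splitter would respect paragraph/chapter boundaries.)
        chunks = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
        partials = [summarize(c) for c in chunks]
        # Stage 2: concatenate the partial summaries and summarize again,
        # recursing if the concatenation is itself still too long.
        return summarize_long("\n".join(partials), chunk_chars)

Splitting on raw character counts is the crudest option; splitting on chapter or paragraph boundaries would preserve more context across chunks.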


The context is lost. We tried all of that back when usage was free; now it is too expensive. If you believe it can be done, please do go ahead. There is a huge market out there for summarizing novels, and no one has cracked it yet.


The GPT-3 you had access to when it was free is quite different from what's deployed today, and the most apparent change is its ability to handle long-form inputs. I've gotten it to give good summaries of ~5,000-character texts, though I admit I haven't gone longer than the context length.

Edit: Here’s an example summarizing 8,294 characters of Harry Potter fan-fiction: https://twitter.com/goodside/status/1561213457374011392?s=21...


Update: I got this working for all 14,410 characters of the first chapter of that fanfic. See reply in same link above.


Maybe if you've read the story it'd seem good, but as someone who hasn't, I consider it a poor summary, especially the latter half.


The goal wasn't to get the best possible summary, which could be done with a more elaborate prompt detailing exactly how the summary should go. I was just demonstrating that it can get to the end of 14K chars of text and still remember both the task at hand and enough information to solve it.


And it failed at that. As the text gets longer and longer, the lack of synthesis across the contexts you've glued together becomes ever more glaring. It's a decent hack but not a solution. More research is needed into how to forget properly across long contexts.


To be fair, one of the earlier comments said "I wouldn't build a summarisation startup on top of GPT3". But it seems like GPT-3 is more than capable of producing summaries that are passable, and it would be far cheaper than humans. It does seem feasible that one could build a startup based on that.


I assume you’re familiar with this, but if not: https://openai.com/blog/summarizing-books/


Familiar with it. But as they themselves say, the fine-tuned model (not released) achieves a 5/7 rating only 15% of the time. So 85% of the time the results are not satisfactory.


Wow, didn’t notice it was that low. Thanks for walking me through this — I might try to tackle this problem next. Very interesting.



