We found that in general it's pretty hard to make the llm just stop without human intervention. You can see with things like Cline that if the llm has to check its own work, it'll keep making 'improvements' in a loop; removing all comments, adding all comments etc. It needs to generate something and seems overly helpful to give you something.