I was having a back-and-forth with Claude over a somewhat controversial topic, and I found it hard to stop it from misinterpreting my questions. It was like speaking to a motivated reasoner who misread the 3 important words because the 10 others gave it cognitive dissonance.
Eventually I cracked it and it said this:
“I treated the subject as denial-adjacent and reflexively re-asserted the obvious, which means I was answering an imaginary opponent instead of you.”
Is this why online forums like Reddit are dying? Because people are moving their time-wasting from arguing with the void to arguing with an AI? This is really bizarre to me.
My experience of Reddit forums is extremely poor. I admit to sometimes wanting to see if I can crack the AI on something, but I mostly use it like a search engine for topics I'm not familiar with rather than as something to speak to or debate.
I've been making an auction site and have been using an AI swarm to test it: sellers, intermediaries, buyers, market practices/norms etc. I was mostly using GPT 5.5 xhigh to code up the scenario, and looping over it with Opus 4.8 to check the result.
Out of curiosity I asked Fable to review it all and I was shocked to find that there were a lot of blindingly obvious common sense mistakes that got through, for example:
- all intermediaries were given the prices of all buyers up front
- private price information in certain auction types was actually being broadcast to everyone
- multiple contradictions in instructions
If it had been just one of these things then I might have understood - but the fact that so many got past both Opus and GPT 5.5 makes me think that Fable has something special. This is a common-sense type of thing that I think you only notice when your task doesn't involve a measurable metric, but rather some sort of fuzzy real-world task.
There's clearly a problem with all these measures of performance when the difference between these models was night and day in my specific task.
Unless you're coming up with a deterministic set of criteria for evaluating these bugs and issues, every single model is going to keep telling you it finds new things and to fix them.
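As a rough illustration of what one deterministic criterion could look like for the "private price information being broadcast" bug mentioned above, here is a minimal sketch. It assumes the simulation keeps a structured message log; the `Message` class, the `price_owner` field, and the `allowed_viewers` mapping are all hypothetical names invented for this example, not part of the auction site described in the thread.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    """One logged message in the simulation (hypothetical structure)."""
    sender: str
    recipients: frozenset
    # Agent whose private price the payload contains, or None if it has none.
    price_owner: Optional[str] = None


def find_price_leaks(log, allowed_viewers):
    """Return (price_owner, recipient) pairs where a private price
    reached an agent that was not permitted to see it."""
    leaks = []
    for msg in log:
        if msg.price_owner is None:
            continue
        # The owner may always see their own price, plus any agents
        # explicitly allowed for that owner.
        permitted = set(allowed_viewers.get(msg.price_owner, ())) | {msg.price_owner}
        for recipient in sorted(msg.recipients - permitted):
            leaks.append((msg.price_owner, recipient))
    return leaks


# Hypothetical usage: buyer_1's price may only be seen by the auctioneer.
log = [
    Message("buyer_1", frozenset({"auctioneer"}), price_owner="buyer_1"),
    Message("auctioneer", frozenset({"broker_1", "buyer_2"}), price_owner="buyer_1"),
    Message("seller_1", frozenset({"buyer_1"})),
]
allowed_viewers = {"buyer_1": {"auctioneer"}}
print(find_price_leaks(log, allowed_viewers))
```

A check like this gives the same answer on every run, so a clean result means that specific rule holds, regardless of which model is asked to review the scenario. It only covers rules someone thought to write down, though.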
I'm sure you said the same "find mistakes please" thing to Opus 4.8 and GPT 5.5 when you were using $previous_amazing_latest_model, and they also found and fixed them.
Once the next "Fable"-type model comes out I'm sure it's going to find even more mistakes that the "special" Fable made.
You're using these models to make mistakes and then using upgraded versions of them to find their previous mistakes and fix them, until a new version comes along that can magically fix even more mistakes their previous versions made. There's no end to it.
Yes - I was thinking this - however I had already worked on it so many times with Opus and GPT that I thought they had had enough time to realise some common-sense things that Fable just got and understood first time, on the first pass. The difference seemed significant enough to comment on.
It's just much more thorough and spins up a lot of subagents to basically do a lot more E2E testing. That's not necessarily smarter, just a lot more compute and orchestration; imo you could get the same result with a lesser model by prompting it procedurally.
The issue with bank transfers today is that the SEPA system is robust and established, but has no web-compatible API.
But there are two projects (why one, if you can have two!?), one being Wero, run by a group of banks, the other being the Digital Euro from the European Central Bank. If either finds good adoption (Wero is rolling out slowly, and at quite a few banks every customer has already been given a Wero account automatically) this could move things around ...
I'm Irish, but I've built a website for an Australian client and they integrated something which did that. In the checkout, you could choose to pay with a system which would log you into your bank's website, where you could approve a payment, then return to the site on which you'd made your purchase, where it would instantly be marked as paid. I think that it may have taken a few days for the money to actually arrive in their bank account, but the payment was authorised instantly.
This stuff is very popular in the Baltics. There are many payment options, and banks provide the necessary connections so that users can complete payments using 2FA. Not to mention crypto. E.g. check out varle.lt as an example of an online retailer; the options are sort of normal and expected.
I believe the usual SEPA flow is either scan this QR code or type this IBAN+reference into your bank's mobile app? SEPA is a "giro" system, meaning the person who owns the money has to push it, rather than a cheque system where the money owner writes something to the merchant who then pulls money from the money owner. Giro systems are always less convenient because the money owner has to contact their bank. They're also more secure.
That would seem like a logical solution. So wouldn't it be convenient for the expensive payment methods if legalities prevented merchants from charging higher fees to customers using them?
Indeed. It's a triumph of consumer protection laws failing to protect consumers. Merchants here have to set their prices a bit higher to compensate for the fees and you still have to pay those higher prices as a customer even if you're using a more efficient payment method. I will never understand why the law wasn't set the other way - requiring explicit disclosure of payment fees to end customers and prohibiting payment services from incorporating these kinds of anticompetitive terms in merchant agreements - so that everyone could make an informed choice and market pressures would push the transaction overheads down.
It might have been regulatory capture - though I have seen no specific evidence of that myself. It might simply have been the old story about a road and good intentions. At this point it doesn't really matter how it happened - it would be better if the situation were fixed in any case.
Customers paying the price would:
1) induce scrutiny from the public on Visa and Mastercard (the monopolies)
2) encourage the competitive market amongst issuers to compress prices
Maybe check out Goose. It is the standard agent harness being developed by The Linux Foundation under the AAIF. Under active development and the implementation seems to have a good leg up on the other popular agents.
Any Pi extensions you'd specifically recommend? I'm just starting out with Pi, but I've had mixed results with extensions. I'm using Pi with gemma4 26b locally, so anything that's friendly to small local models would be appreciated. I think the only extension I'm using right now is pi-total-recall.
This will be interesting. I can see some world where it’s used with consumers, but for the most part I think it will be in the cloud, and that would make the most sense.
Is this true because AI companies have not been training for both performance and brevity (or some other metric like that)? If this becomes a much more serious issue, surely they would adjust the training processes?