
> You need to set up your fine-tuning and prompts and then test well for consistent results.

Tell that to Google...

Seriously, it is well established that these systems hallucinate. Claiming otherwise means you're pushing something that just isn't true.

They can be right, yes. But when they are wrong they can be catastrophically wrong. You could be wasting time looking into the wrong problem with something like this.



If you're curious what the state of the art in multi-agent is looking like, I really recommend https://thinkwee.top/multiagent_ebook/


This looks great! Unfortunately it doesn't work well on Firefox, but I take that as being Mozilla's fault nowadays.


The most common hallucinations I've seen are phantom GitHub repos and issues, and this usually appears when I ask for a source.


It's definitely an overblown problem. In practice it's not a big issue.


Yeah.... no it's really not overblown.

It is a serious problem when these tools are being pushed as trustworthy when they are anything but.

On an almost daily basis I deal with some sort of hallucination, whether in code or in a summary, and we see it constantly on social media when people try to use Google's AI summary as a source of truth.

Let's not lie about what these models can do to push an agenda. They are very powerful, but they make mistakes, quite often, and there is zero question about that.

The problem isn't just that they hallucinate, the problem is that we have comments like yours trying to downplay it. Then we have people for whom it is right just enough times that they start trusting it without double checking.

That is the problem: it is right enough times that you just start accepting the answers. That leads to things like scripts that grab data and put it into a database without checking. That's fine if it is not business-critical data, but it's not really fine when we are talking about health care data or, say, police records like a recent post was talking about.

If you are going to use it for your silly little project, or you're going to bring down your own company's infrastructure, go for it. But let's not pretend the problem doesn't exist and shove this technology into far more sensitive areas.


Yeah.... no it's overblown.

I think you're exaggerating. You're imagining the worst, but your argument basically boils down to not trusting that people can handle it, and to calling me a liar. Good one.


> Tell that to Google...

Yeah, because Google's LLMs have a completely open question/answer space.

For a Kubernetes AI, for example, you can nowadays just feed in the whole Kubernetes docs plus a few reference Helm charts, tell it to stick close to the material, and you'll have next to no hallucinations. The same goes for simple data extraction tasks, where in the past you couldn't use LLMs because they would hallucinate data into the output that wasn't in the input (e.g. completely mangling an ID); nowadays that's essentially a non-issue.

As soon as you restrict the space in which the LLM acts, you have plenty of options to tune it so that hallucinations are no longer a major issue.
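
To make the "stick close to the material" idea concrete, here's a minimal sketch of what grounding a prompt in supplied documentation can look like. It's not any vendor's actual API: the doc chunks, the toy keyword retriever, and the call_llm reference are all placeholders for whatever retrieval step and model client you actually use.

    # Minimal sketch: restrict the model's answer space by stuffing retrieved
    # documentation into the prompt and instructing it to refuse when the
    # excerpts don't cover the question.

    DOC_CHUNKS = [
        "A Deployment provides declarative updates for Pods and ReplicaSets.",
        "A Service exposes an application running on a set of Pods.",
        # ... in practice: the full Kubernetes docs plus a few reference Helm charts
    ]

    SYSTEM_PROMPT = (
        "You are a Kubernetes assistant. Answer ONLY from the documentation "
        "excerpts provided. If they do not contain the answer, say you don't know."
    )

    def retrieve_chunks(question: str, k: int = 3) -> list[str]:
        """Toy keyword-overlap retrieval; a real setup would use embeddings."""
        words = set(question.lower().split())
        scored = sorted(DOC_CHUNKS, key=lambda c: -len(words & set(c.lower().split())))
        return scored[:k]

    def build_prompt(question: str) -> str:
        """Assemble the grounded user prompt from the retrieved excerpts."""
        context = "\n\n---\n\n".join(retrieve_chunks(question))
        return f"Documentation excerpts:\n{context}\n\nQuestion: {question}"

    # call_llm(SYSTEM_PROMPT, build_prompt("How do I expose a set of Pods?"))
    # where call_llm is whichever model client you actually use.

The point is just that the refusal instruction plus the narrow context is what shrinks the answer space; the retrieval mechanism itself can be as simple or as fancy as you like.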



