Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

A big limitation for skills (or agents using browsers) is that the LLM is working against raw html/DOM/pixels. The new WebMCP API solves this: apps register schema-validated tools via navigator.modelContext, so the agent has structured JSON to work with and can be way more reliable.

WebMCP is currently being incubated in W3C [1], so if it lands as a proper browser standard, this becomes a endpoint every website can expose.

I think browser agents/skills+WebMCP might actually be the killer app for local-first apps [2]. Remote APIs need hand-crafted endpoints for every possible agent action. A local DB exposed via WebMCP gives the agent generic operations (query, insert, upsert, delete) it can freely compose multiple steps of read and writes, at zero latency, offline-capable. The agent operates directly on a data model rather than orchestrating UI interactions, which is what makes complex things actually reliable.

For example the user can ask "Archive all emails I haven't opened in 30 days except from these 3 senders" and the agent then locally runs the nosql query and updates.

- [1] https://webmachinelearning.github.io/webmcp/

- [2] https://rxdb.info/webmcp.html

 help



>Remote APIs need hand-crafted endpoints for every possible agent action.

They already need a remote API for every possible user action. MCP is just duplicate work.


What's the difference with complying to OpenAPI specification and providing an endpoint?

OpenAPI is primarily for machine-to-machine which needs determinism and optimized for some cases (e.g. time in unix format with ms accuracy). MCP is optimized for another use case where LLM has many limitations but has good "understanding" of text. instead of sending `{ user: {id: 123123123123, first_name: "XYZYZYZ", "last_name": "SDFSDF", "gender": "..."..... } }` you could return "Mr XYZYZYZ" or "Mrs XYZYZYZ"

llm doesn't need all these and can't parse it anyway without additional tools (e.g. why should it spend tokens even trying to convert unix timestamp to understand the time)


I thought the whole point of structuring the data was to avoid the LLM from hallucinating/forcing it to conform to a spec?



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: