> The question is whether an LLM will run with usable performance at that scale.

For the self-attention mechanism, compute and memory-bandwidth requirements scale roughly quadratically with sequence length: every token's query has to be scored against every other token's key, so the score matrix alone has n² entries.
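A minimal NumPy sketch of where that quadratic term shows up (illustrative shapes only, not any production kernel; n and d below are made-up values):

    import numpy as np

    def naive_attention(q, k, v):
        # q, k, v: (n, d). The score matrix below is (n, n):
        # this is the quadratic term in both compute and memory.
        scores = (q @ k.T) / np.sqrt(q.shape[-1])        # (n, n)
        scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
        return weights @ v                               # (n, d)

    n, d = 8192, 128                                     # hypothetical sizes
    rng = np.random.default_rng(0)
    q, k, v = (rng.standard_normal((n, d), dtype=np.float32) for _ in range(3))
    out = naive_attention(q, k, v)
    # The (n, n) fp32 score matrix is 8192^2 * 4 bytes ≈ 256 MiB,
    # and it quadruples every time the context length doubles.

Fused kernels such as FlashAttention avoid materializing that matrix to cut bandwidth, though the quadratic compute remains.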

Someone has got to be working on a better method than that. Hundreds of billions are at stake.