Yes, and plenty of others do too. Quantized. Join us at r/localllama. My largest...

danilocesar · 2026-04-24T22:55:39 1777071339

Is your house's heating system based on H100s?

Liftyee · 2026-04-24T19:09:35 1777057775

What hardware do you use?

MezzoDelCammin · 2026-04-25T10:24:43 1777112683

I think the answer to this is: "yes"

Terretta · 2026-04-27T12:36:19 1777293379

Most of those have custom quants for Mac Studio M3 Ultra 512GB. You'll typically see them mention it by name.

All of that list but the last three run at these sizes. For the last three, look for a custom quant, e.g. 9.5 bits, and/or a mention of the M3 Ultra 512GB.

I'm not sure in which direction I'm surprised, but a MacBook Pro M5 Max ticks over models at the same speed. With "only" 128GB, look for models of 116 GB or less (116 GB being the absolute max that retains reasonable stability).
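For a rough sense of the arithmetic behind those size limits, here is a minimal sketch. It assumes a quantized model's weight size is roughly parameter count times bits per weight, and it ignores KV cache, context length, and runtime overhead, so treat the result as a lower bound. The function names and the example parameter counts are hypothetical; only the 116 GB budget comes from the comment above.

```python
def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes) for a quantized model.

    Assumption: size ~= parameters * bits_per_weight / 8 bytes.
    Real quant files vary because of mixed-precision layers and metadata.
    """
    return params_billions * bits_per_weight / 8


def fits(params_billions: float, bits_per_weight: float,
         budget_gb: float = 116.0) -> bool:
    """True if the estimated weights fit in the given memory budget.

    The 116 GB default is the figure quoted above for a 128GB machine.
    """
    return quantized_size_gb(params_billions, bits_per_weight) <= budget_gb


if __name__ == "__main__":
    # Hypothetical model sizes, chosen only to illustrate the calculation.
    for params, bits in [(70, 8), (230, 4), (400, 9.5)]:
        size = quantized_size_gb(params, bits)
        print(f"{params}B @ {bits} bits: ~{size:.1f} GB, fits in 116 GB: {fits(params, bits)}")
```

By this estimate, a model at 9.5 bits per weight needs about 1.19 GB per billion parameters, which is why the largest ones only show up as custom quants for the 512GB machine.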

CoolThings · 2026-04-25T11:04:55 1777115095

A Beowulf cluster of 256 x Raspberry Pi 3.

hhh · 2026-04-26T13:09:51 1777208991

I used to maintain a cluster of 2000 Pi 4s, before LLMs were relevant, with around 6 GB of free RAM per node. I wonder what I could have done with something like this.

tclancy · 2026-04-25T03:00:03 1777086003

All of it.

chid · 2026-04-25T13:08:40 1777122520

Even quantised, those are HUGE.