
somewhatrandom9 · 2026-06-05T18:09:00 1780682940

Could these quantized models make MTP (Multi-Token Prediction) significantly faster when used as drafters for larger regular Gemma 4 models?

dist-epoch · 2026-06-05T19:01:47 1780686107

Google already released specialized drafters for Gemma 4.

Havoc · 2026-06-05T23:23:44 1780701824

The E2B ones? Or what do you mean by specialized drafters?

int_19h · 2026-06-06T01:22:37 1780708957

They have -assistant in the name, so e.g.: https://huggingface.co/google/gemma-4-31B-it-assistant

Havoc · 2026-06-06T08:32:42 1780734762

Thanks

girvo · 2026-06-06T00:42:29 1780706549

The “-assistant” models released by Google are specialised tiny MTP draft models :)

31b-it-assistant is what enables MTP
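For readers unfamiliar with draft models, here is a minimal sketch of the generic draft-and-verify idea the thread is discussing. It is not how the Gemma "-assistant" models or any real inference engine are implemented: `toy_target`, `toy_draft`, and `speculative_decode` are hypothetical names, the "models" are plain functions over token lists, and only greedy decoding is shown.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from a token prefix to the next token (greedy).
Model = Callable[[List[int]], int]

def plain_greedy(target: Model, prompt: List[int], max_new: int) -> List[int]:
    """Baseline: one target call per generated token."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out[len(prompt):]

def speculative_decode(target: Model, draft: Model, prompt: List[int],
                       max_new: int, k: int = 3) -> Tuple[List[int], int]:
    """Return (generated tokens, number of target verification passes)."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < max_new:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2. The target checks every proposed position. A real engine would do
        #    this in one batched forward pass, so we count it as a single pass.
        passes += 1
        accepted: List[int] = []
        for i in range(k):
            expected = target(out + proposal[:i])
            accepted.append(expected)
            if expected != proposal[i]:
                break  # first mismatch: keep the target's token, drop the rest
        else:
            # Every draft token matched, so the target's next token comes for free.
            accepted.append(target(out + proposal))
        out.extend(accepted)
    return out[len(prompt):len(prompt) + max_new], passes

# Toy models (assumptions for illustration): the target counts modulo 10,
# and the draft agrees except right after token 4.
def toy_target(prefix: List[int]) -> int:
    return (prefix[-1] + 1) % 10

def toy_draft(prefix: List[int]) -> int:
    return 0 if prefix[-1] == 4 else (prefix[-1] + 1) % 10

if __name__ == "__main__":
    tokens, passes = speculative_decode(toy_target, toy_draft, [0], max_new=12, k=3)
    print(tokens, passes)
```

The output always matches plain greedy decoding with the target alone; the draft only changes how many target passes are needed. The speedup therefore depends on how often the draft agrees with the target and on how cheap the draft is to run.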

somewhatrandom9 · 2026-05-15T01:27:57 1778808477

With "intelligence" (or whatever you want to call it) and speed both seeming to ramp up quickly with local models, I wonder what the growth rate and ceiling(?) might be in this space. Will this kind of IQ and performance work with just e.g. 16GB RAM in a couple of years? Is there a new kind of Moore's law to be defined here?

hadlock · 2026-05-15T03:16:58 1778815018

640gb ought to be enough for anybody

famouswaffles · 2026-05-15T04:49:44 1778820584

Squeezing a model like this, complete with 'big model smell', into 16GB... Honestly, it's not feasible today.

It'll require some kind of:

- breakthrough in architecture or

- breakthrough in hardware or

- some breakthrough quantisation technique

The problem is that all the parameters need to be in memory, even the ones that aren't active (say, for Mixture of Experts models), because switching parameters in and out of RAM is far too slow.
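As a rough back-of-the-envelope check on the memory argument, weight storage is approximately parameter count times bits per weight divided by 8. The sketch below uses a hypothetical 30B-total-parameter model (not a figure from this thread) and ignores the KV cache, activations, and per-block quantisation overhead, so real requirements are higher.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal GB: parameters * bits / 8 bytes.

    For a Mixture of Experts model, pass the TOTAL parameter count, since
    inactive experts still have to stay resident in memory.
    """
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

def fits(params_billion: float, bits_per_weight: float, ram_gb: float) -> bool:
    """True if the weights alone fit in ram_gb (no headroom for anything else)."""
    return weight_gb(params_billion, bits_per_weight) <= ram_gb

if __name__ == "__main__":
    # Hypothetical 30B-total model at a few common precisions.
    for bits in (16, 8, 4, 2):
        size = weight_gb(30, bits)
        print(f"{bits:>2}-bit: {size:5.1f} GB, fits in 16 GB: {fits(30, bits, 16)}")
```

Under these assumptions, the 4-bit case comes to 15 GB, which technically fits in 16 GB but leaves almost nothing for the KV cache or the operating system.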

marci · 2026-05-15T06:56:51 1778828211

"That’s where EMO comes in.

We show that EMO – a 1B-active, 14B-total-parameter (8-expert active, 128-expert total) MoE trained on 1 trillion tokens – supports selective expert use: for a given task or domain, we can use only a small subset of experts (just 12.5% of total experts) while retaining near full-model performance."

https://allenai.org/blog/emo

lwansbrough · 2026-05-15T02:04:25 1778810665

The people working at the leading edge of this stuff seem to believe that there is a need for parallel models that solve different problems.

A crow exhibits some degree of intelligence in what is a very small brain compared to humans. There is overlap in the problem solving skills of the dumbest humans and the smartest crows.

So the question is: what is that? Yann LeCun seems to think it’s what we now call world models. World models predict behaviour as opposed to predicting structured data (like language.)

If your model can predict how some world works (how you define world largely depends on the size of your training data), then in theory it is able to reason about cause and effect.

If you can combine cause and effect reasoning with language, you might get something truly intelligent.

That’s where things seem to be going. Once we have a prototype of that system, there will be many questions about how much data you really need. We’ve seen how even shrinking LLMs with 1-bit quantization can lead to models that exhibit a fairly strong understanding of language.

I don’t think it’s unreasonable to expect to see some very intelligent low (relatively) memory AI systems in the next couple years.

somewhatrandom9 · 2026-05-02T16:50:01 1777740601

There are other ways to mount S3, but you may want to check out Amazon's new product, S3 files: https://aws.amazon.com/about-aws/whats-new/2026/04/amazon-s3...

https://aws.amazon.com/s3/features/files/

somewhatrandom9 · 2026-04-19T00:15:40 1776557740

byroot sets a great example sharing his code optimization expertise. His blog has many great improvements like this. A 7x improvement in Dir.join and similar calls?! Thank you, byroot!

somewhatrandom9 · 2026-03-11T17:32:26 1773250346

You may want to also look into AWS's OpenClaw offering (I was surprised to see this): https://aws.amazon.com/blogs/aws/introducing-openclaw-on-ama...

somewhatrandom9 · 2026-01-21T23:29:51 1769038191

"Over four months, LLM users consistently underperformed at neural, linguistic, and behavioral levels. These results raise concerns about the long-term educational implications of LLM reliance and underscore the need for deeper inquiry into AI's role in learning."

somewhatrandom9 · 2025-10-19T15:13:35 1760886815

https://replacement.ai/safety

somewhatrandom9 · 2025-09-13T23:56:01 1757807761

This piece (at a website I know nothing about) claims AWS next generation data centers will use "closed loop" water systems: https://www.aboutamazon.com/news/aws/aws-liquid-cooling-data...

somewhatrandom9 · 2025-08-20T18:45:19 1755715519

The tech specs (https://store.google.com/us/product/pixel_10_specs?hl=en-US) say it has VPN capabilities... But then there is a footnote:

>12. Restrictions apply. Some data is not transmitted through VPN.... See https://g.co/pixel/vpn for details.

Does anyone know what data doesn't go through the VPN?

On the positive side, it lists a 24+ hour battery life!! This is huge for me!! ...but it has a footnote as well:

> 6. Battery life depends upon many factors and usage of certain features will decrease battery life. Actual battery life may be lower. Over time, Pixel software will manage battery performance to help maintain battery health as your battery ages. See https://g.co/pixel/battery-tests and https://g.co/pixel/batteryhealth for details.

Which I guess is understandable

Disparallel · 2025-08-20T19:14:59 1755717299

The help section article lists

# Data that isn’t protected by the VPN

Not all network data from your device is protected by the VPN. Examples of data that aren’t protected by the VPN include:

- Tethering traffic

  - This includes USB and Wi-Fi hotspot.

- Push notifications

- Wi-Fi calling and other IMS services

- Work profile app traffic

  - This applies if a work profile is configured on your device.

- Data traffic from an app that routes traffic directly over the Wi-Fi or a cellular connection

All of which make sense to me except push notifications. My guess is they might mean syncing notifications to e.g. a watch.

chippiewill · 2025-08-20T20:36:26 1755722186

I think it might be because push notifications use long-lived connections that are already open when the VPN is turned on.

red369 · 2025-08-21T02:49:17 1755744557

I wonder why tethering traffic doesn't go through the VPN. I could be wrong, but I think it works the same way with iPhones.

I might test that later, but this (old) SE question seems to confirm my memory: https://apple.stackexchange.com/questions/266871/is-there-a-...

tucnak · 2025-08-21T06:19:49 1755757189

FWIW, all tethered traffic in GrapheneOS goes through a VPN.

epolanski · 2025-08-20T20:58:24 1755723504

I bought a Xiaomi 14T and bought my gf a Pixel 9.

I like my phone more, but battery life on hers is way better, to the point I regret buying mine. It barely lasted a day out when on vacation. I'm not a super heavy phone user, but I look for restaurants, open maps, take pictures, and ask Gemini stuff, and I'd be at 50% by the time she was at 75%.

OneDeuxTriSeiGo · 2025-08-20T19:15:36 1755717336

> Does anyone know what data doesn't go through the vpn?

I can't speak to exactly what data doesn't go through their VPN, but I know carrier apps tend not to play nice with VPNs, especially the Google Fi app (it relies on its connection and what IP it's on to coordinate switching between their various carrier contracts, and that seems to break under a VPN).

Also, Wi-Fi calling has seemingly been problematic over VPN for as long as I can remember, so that's usually a safe bet for exclusion.

somewhatrandom9 · 2025-07-04T02:27:04 1751596024

The Google Pixel 6a was released on July 21, 2022. A perfectly fine phone artificially obsoleted in 3 years? Time to switch to Apple?