devttyeu · 2026-06-03T13:33:29 1780493609

If you believe a 128 GB machine that is essentially a DGX Spark in a laptop chassis can run models comparable to SOTA, you have either never run open models on hard tasks, or you aren't even scratching the surface of SOTA closed LLM capability in how you're using them.

f311a · 2026-06-03T13:45:45 1780494345

Can you show me an example of a hard task that can't be achieved using light models, assuming we don't want the model to work on autopilot with no code review at all? Even SOTA models will produce garbage code if you don't guide them all the time.

Hard tasks require a lot of guidance and code review, unless you are creating another throwaway project where correctness, maintainability and code understanding do not matter.

devttyeu · 2026-05-25T20:58:40 1779742720

More like $1M at current prices at this scale / level of performance.

If you go with HDD arrays probably $50k

arjie · 2026-05-25T21:20:11 1779744011

Boy, pricing is pretty nuts these days. I have half a petabyte in Seagate enterprise drives myself and I didn’t pay anything close to that to acquire it. Such a pity about the flash storage. 2 years ago we built 200 TiB or something of flash using Samsung PM1633 or something, and it was a fraction of the cost per gigabyte that $1M would imply.

rcbdev · 2026-05-26T06:12:02 1779775922

We're in the boom phase of the cycle. The bust on these chips always comes.

devttyeu · 2026-05-25T17:12:43 1779729163

Well if you're a devshop just billing hours of mostly low impact work then hours are very much equal to productivity.

saghm · 2026-05-25T17:30:50 1779730250

Next time you're going to work for an hour, ping me, and I bet I can surprise you with how much less productive I am than you

devttyeu · 2026-05-20T21:03:27 1779311007

FWIW, if you trained an LLM in an RL sandbox that required it to have goals, the resulting LLM probably would "have goals".

devttyeu · 2026-05-20T20:56:04 1779310564

That's here for anyone wondering - https://cdn.openai.com/pdf/1625eff6-5ac1-40d8-b1db-5d5cf925d...

noobermin · 2026-05-21T12:36:31 1779366991

So, I'm not really a mathematician, but the first 3-8 pages read like nonsense and a bunch of unrelated facts. A bit surreal, maybe, but if this is the norm for this kind of thing, I'm amazed it arrives at any useful result at all.

chiwilliams · 2026-05-22T21:29:25 1779485365

It didn't seem like nonsense to me. (Recently graduated undergrad with a math degree; probably could have gone on to a PhD.) It seemed like the AI was cycling through a bunch of different possible approaches to tackle the problem. Eventually it finds one and makes more progress there until it reaches the solution.

gilgoomesh · 2026-05-21T05:31:12 1779341472

I'm disappointed only that the chain of thought needed to be rewritten. Need to train these LLMs to natively communicate in LaTeX research paper format.

bigzyg33k · 2026-05-21T11:17:19 1779362239

I believe they rewrite the chain of thought to protect their IP, i.e. the chain of thought reveals information about how the model works in a manner that may aid replication.

devttyeu · 2026-05-11T22:03:09 1778536989

Cargo is spiritually based on NPM so it's not much better.

go get is closer to always locking dependencies: versions stay put unless you explicitly upgrade them with a go get, so it's much, much better in my view.

Yes, you can lock deps in NPM/Cargo/etc. but that's not the default. It is the default in Go.

In Go projects my policy for upgrading dependencies includes running a full AI audit of all code changed across all dependencies; it comes out to ~$200 in tokens every time, but it gives those warm 'not likely to get pwned' vibes. And it comes with a nice report of likely breaking changes etc.

nine_k · 2026-05-11T22:08:55 1778537335

> comes out to ~$200 in tokens every time

BTW a curated mirror of <whatever ecosystem> packages, where every package is guaranteed to have been analyzed and tested, could be an easy sell now. Also relatively easy to create, with the help of AI. Paying $200 every time is less pleasant than, say, $100/mo for the entire org.

Docker does something vaguely similar for Docker images, for free though.

AgentME · 2026-05-11T22:12:03 1778537523

People are already scanning npm constantly. You can limit yourself to pre-scanned packages by setting npm's minimum release age setting to 1 or 2 days (a timeframe that all the recent high-profile malicious package versions were unpublished within).
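
A minimal sketch of what such an age gate does, for illustration only. It assumes the npm registry's package metadata, which carries a `time` map from each version to its publish timestamp (plus `created` and `modified` bookkeeping keys). The function name, the sample versions, and the timestamps below are made up, and this is not how npm itself implements the setting.

```python
from datetime import datetime, timedelta, timezone


def versions_older_than(time_map, min_age_days, now=None):
    """Return versions published at least `min_age_days` before `now`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=min_age_days)
    eligible = []
    for version, stamp in time_map.items():
        if version in ("created", "modified"):
            continue  # bookkeeping keys, not real versions
        # Registry timestamps end in "Z"; normalise for fromisoformat().
        published = datetime.fromisoformat(stamp.replace("Z", "+00:00"))
        if published <= cutoff:
            eligible.append(version)
    return eligible


# Hypothetical metadata for an imaginary package.
sample_time_map = {
    "created": "2025-01-01T00:00:00.000Z",
    "modified": "2026-05-11T10:00:00.000Z",
    "1.0.0": "2026-01-01T00:00:00.000Z",
    "1.0.1": "2026-05-11T10:00:00.000Z",  # published hours ago
}
fixed_now = datetime(2026, 5, 11, 22, 0, tzinfo=timezone.utc)
print(versions_older_than(sample_time_map, 2, now=fixed_now))
```

With a two-day gate, the version published a few hours earlier is skipped and only the older release stays installable.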

nine_k · 2026-05-11T22:15:02 1778537702

Note to self: the test suite for vetting a package should include setting the system date some time in the future, to check if an exploit is trying to sleep long enough to defeat the age limit.

chickensong · 2026-05-12T08:00:33 1778572833

https://www.chainguard.dev/

voxl · 2026-05-11T22:10:37 1778537437

It's insane to me you spend $200 on a report you likely rarely read in detail or double check for correctness, yet you're doing it to feel good about security.

devttyeu · 2026-05-11T22:35:52 1778538952

If it runs in a harness that will alert me when something dodgy is detected I'm fine to stay at that level.

I don't read it in detail because reading in detail is precisely what I delegate to the harness. The alternative is that I delegate all this trust to package managers and the maintainers which quite clearly is a bad idea.

Whether the price tag is worth it is... relative. Also, in Go you don't update all that often: really only when something breaks or there is a legitimate security reason to do so, which in deep systems software is quite infrequent.

Funnily enough for frontend NPM code our policy was to never ever upgrade and run with locked dependencies, running few years old JS deps. For internal dashboards it was perfectly fine, never missed a feature and never had a supply chain close call.

crab_galaxy · 2026-05-11T23:15:13 1778541313

> running few years old JS deps

What do you do when a critical vulnerability gets discovered and you have to update a package? How many critical/high severity vulnerabilities are you running with in production every day to avoid supply chain attacks?

devttyeu · 2026-05-12T01:30:47 1778549447

For the stuff in more sensitive deployments it's really quite simple: just set up CORS etc. properly and don't do anything overly fancy on the frontend. Worst case, the user may force some internal function to eval some JS by pasting scripts into the browser's debug console.

Critical severity vulnerabilities are only critical when they are reachable, and are completely meaningless if your application doesn't touch that code at all. It's objectively more risky to "patch" those by updating dependencies than to just let them be.
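
The reachability argument can be sketched as a walk over a call graph. This is a toy illustration under stated assumptions: the graph, the function names, and the "vulnerable" set are all hypothetical, and a real static call graph is only an approximation (dynamic dispatch, reflection and eval can hide edges).

```python
from collections import deque


def reachable(call_graph, entry_points):
    """Breadth-first walk: every function reachable from the entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


# Hypothetical graph: the app never calls the YAML parser at all.
call_graph = {
    "app.main": ["lib.render", "lib.format_date"],
    "lib.render": ["lib.escape_html"],
    "lib.parse_yaml": ["lib.unsafe_load"],
}
vulnerable = {"lib.unsafe_load"}  # made-up advisory target

live = reachable(call_graph, ["app.main"])
exposed = live & vulnerable
print(sorted(exposed))  # empty: the flagged function is never reached
```

If `exposed` is empty, the advisory applies to code the application never executes, which is the case the comment describes.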

throawayonthe · 2026-05-12T00:25:03 1778545503

they said internal dashboards

nine_k · 2026-05-12T00:53:39 1778547219

Anyone who gets into the security perimeter may be in for a feast then.

n_e · 2026-05-11T22:51:32 1778539892

> Yes, you can lock deps in NPM/Cargo/etc. but that's not the default. It is the default in Go.

How is it not the default in npm?

chuckadams · 2026-05-11T23:15:33 1778541333

It is the default in both cargo and npm, but "npm install" stupidly enough still updates the lockfile, and you need "npm ci" to actually respect it. I think there's some flag to make install work sanely, but long-term I find the best approach is to use anything other than npm.

I ditched npm for yarn years ago because it had saner dependency resolution (npm's peer dependency algorithm was a constantly moving target), and now I've switched from yarn to bun because it doesn't run hooks in dependencies by default. It also helps that it installs dependencies 10x faster.

cluckindan · 2026-05-11T23:32:00 1778542320

"npm install" does not update the lockfile in any current major version.

At least not if you haven’t edited your package.json manually.

devttyeu · 2026-05-06T10:49:55 1778064595

Enterprise NVMe on the high end is now starting to ship batches at $1000/TB with existing stock around $500/TB. No consumer is going to pay that.

But if you're buying a $500k GPU server, putting 100TB of NVMe in there for $50-100k is justifiable.
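
The arithmetic behind that range, as a quick sketch using only the figures quoted in this comment ($500/TB and $1000/TB, 100TB, a $500k server); the helper name is made up.

```python
def storage_cost(capacity_tb, price_per_tb):
    """Total cost in dollars for a given capacity at a flat $/TB price."""
    return capacity_tb * price_per_tb


SERVER_PRICE = 500_000  # the $500k GPU server from the comment

low = storage_cost(100, 500)     # existing stock at ~$500/TB
high = storage_cost(100, 1000)   # new batches at ~$1000/TB

print(low, high)                                 # 50000 100000
print(low / SERVER_PRICE, high / SERVER_PRICE)   # 0.1 0.2
```

So the flash adds roughly 10-20% on top of the server price at these rates.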

therealmarv · 2026-05-06T11:18:18 1778066298

There was once a 2.5" SSD Mushkin Source 16TB SATA drive. At its cheapest it was ~1700 USD (or 1500 EUR). That was mid 2023 (like 3 years ago!).

Nowadays it feels like that era and price range are decades away in the future. I was hoping I could store more data on modern tech like SSDs as time went on, not less.

radicality · 2026-05-06T16:00:21 1778083221

Yeah it sucks :( Almost exactly a year ago, I got a brand new 15.36TB Kioxia CD-6R (U.3 PCIe 4.0 x4 drive) for $1450+tax from serverpartdeals.com - that same drive is now listed for ~$4600 (and it’s also out of stock there).

devttyeu · 2026-04-28T10:18:32 1777371512

403 B is the revenue - https://www.sec.gov/Archives/edgar/data/1652044/000165204426...

devttyeu · 2026-03-02T12:43:30 1772455410

Also not able to access Gemini API.

At least our local GPU server still serves Kimi K2.5 to my team just fine.

devttyeu · 2026-01-22T15:47:45 1769096865

Also like some popular youtubers and popular speakers.

pixl97 · 2026-01-22T16:25:53 1769099153

Hmm, wonder where they got their training data from?