This is not how I've seen the term "meta-harness" used. The common usage I've seen is for a meta-harness to be a wrapper around an existing agent that gives that agent a new UI or new abilities.
I'm curious why I've seen this sentiment repeated in so many places. I learned Rust once, 5 years ago, and I haven't had to learn any new idioms; there have been no backwards-incompatible changes that required migrating any of my code.
I think people don't like the JavaScript treadmill. People want to think about using tools and getting proficient with them rather than relearning them. I'm not saying Rust is like that, but I do feel that way about Python and JavaScript. Those are dynamic languages, but that's what all this editions stuff evokes. It's an "if it were stable, it wouldn't be changing" sort of thing.
> using tools and getting proficient with them rather than relearning tools
This attitude works in carpentry, but not in software. You need to get proficient, but your tools will keep evolving, like everything else in the software world.
This attitude doesn't even work in carpentry. Depending on the timeframe you look at, carpentry tools have changed over time too. You can still use a hand saw where a table saw would be just as suitable, or get a SawStop(tm) and reduce the likelihood of losing a finger.
In carpentry, you still do a lot of work with a hammer, which hasn't changed materially in the last 70 years. Programming tools have changed very, very much since 1956, even though some still retain a recognizable shape (e.g. Lisp or Fortran).
That's exactly the point. This is not normal even in software.
You can, in fact, learn C exactly once. Or any number of other languages. The entire argument being made here is that the world you're suggesting is a problem. Software developers should not have to continually relearn their tools and it is abnormal to suggest they should.
I've seen C written by people who learned it "exactly once", in, let's say, the 2000s. They're the same people who insist that all the safety and linting introduced since then is pointless.
I'll take C written by people who've learned and improved since then any & every day of the week.
I have never even heard of the linked repo, and it does not appear to be overly popular. Nor have I ever heard of "witness types" or seen code that attempts to make use of them. And no, any new borrow checker would not require some new approach to iterators. This entire comment reads like a non sequitur. Where on Earth did you get any of this from?
To be very fair, there are legitimate gripes here. They're small, but they're worth covering, and then there's a big piece of nonsense.
L1: The edition system allows Rust to literally mutate the language. The 2024 edition (what you get if you begin a new Rust project today) has different rules from the 2021 edition, the 2018 edition, and the Rust 1.0 "2015 edition". These changes aren't exactly huge, but they are real, and at corporate scale you would probably want to add, say, a one-day internal seminar on what's new in an edition before adopting it. For example, we hope the 2027 edition will swap out the 1..=10 syntax to be sugar for the new core::range::RangeInclusive<i32> rather than today's core::ops::RangeInclusive<i32>, and this swap delivers some nice improvements.
L2: Unlike C++'s, the Rust stdlib unconditionally grows for everybody in new compiler releases. So even if you stuck with the 2015 edition the whole time since Rust 1.0, when you use a brand-new Rust compiler you get the standard library as it exists today in 2026, not as it was in 2015 when you began coding. If you decided you needed a "strip_suffix" method for the string slice reference type &str, you might have written a Rust trait, say ImprovedString, and implemented it for &str to give it your strip_suffix method. Meanwhile, in Rust 1.45 the standard library's &str also gained a method with the same name and purpose, and now what you've written won't compile due to ambiguity. You will need to modify your software to compile it on Rust 1.45 and later.
L3: Because Rust is a language with type inference, changes that seem quite subtle and of no consequence for existing code can make something old you wrote ambiguous: what once had a single obvious type no longer does. This is more surprising than the L2 case because now it seems as though the code should never have compiled at all. Types A and B already existed; before, the compiler inferred type A, and now it insists B might also be possible, and it may be quite a tangle to discover why B was not a possibility until this new version of Rust. If the compiler had rejected your code as ambiguous when you wrote it in 2015, you'd have grunted and written what you meant, but at this distance in time it may be hard to remember: did you mean B here?
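The L3 case is about inference, but the same "a second candidate appears and old code stops compiling" flavor is easiest to see with method resolution. A minimal invented sketch (the traits and names here are mine, not from any real library):

```rust
trait Fast { fn speed(&self) -> u32; }
trait Slow { fn speed(&self) -> u32; }

struct Car;
impl Fast for Car { fn speed(&self) -> u32 { 200 } }
// Imagine this second impl only appears in a later library version:
impl Slow for Car { fn speed(&self) -> u32 { 20 } }

fn main() {
    // With both impls in scope, plain `Car.speed()` is rejected as
    // ambiguous (error[E0034]: multiple applicable items in scope),
    // even though it compiled fine when only `Fast` existed.
    // Fully qualified syntax picks one candidate explicitly:
    let s = <Car as Fast>::speed(&Car);
    println!("{s}");
}
```

The fix is mechanical once you know it, but as the comment above notes, remembering which candidate you originally meant is the hard part.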
Now the nonsense: there's a vague superstition that Rust is constantly changing while good old C is absolutely stable. Neither is true, by a wide margin. If you really need certainty you should freeze the actual hardware and software, or at the very least build a VM; then nothing changes, because you changed nothing. If you'd have been comfortable upgrading to a new CC version, you shouldn't be scared of upgrading the Rust tools.
strip_suffix won't break with new compiler versions. Anything explicitly imported takes precedence over the prelude; otherwise everything would be a breaking change and would have to wait for an edition.
Switch to Rust 1.50 and now it's calling the stdlib strip_suffix, silently. I actually wasn't expecting it to be silent, and obviously if the two have exactly the same behaviour (mine panics instead, to show which one is being called) you wouldn't even notice. But it is a change.
Oh, wow. I am wrong. So much of the Rust community must be wrong too, as this is commonly mentioned when discussing breakage. This is awful.
But on the other hand, it could be a bug, as the trait resolver is commonly mentioned as the buggiest part of the language. I'm scared of the breakage if they fix it, though.
Probably the key thing you misunderstood is that &str wasn't from the prelude. It's a type in the actual Rust language; that's why it has a lowercase name like u16 or bool.
So we didn't bring str::strip_suffix in from the prelude in preference to our custom trait; we made a string literal, and those have type &'static str, an immutable reference to a string which lives forever. So the "prelude doesn't win" rule does not apply for &str, because the method didn't come from the prelude.
If we were talking about a type which implements Iterator, for example, new Iterator features would come from Iterator, which is in the prelude; since you didn't specifically ask for Iterator, the things you did ask for beat it. But here the language's primitive type grew new methods, a thing which Rust does but many languages don't: Rust has methods on pointers, bytes, and everything else, whereas a language like Java or C++ can only put methods on "classes", not on the ordinary types.
The reason it works with `String` is that trait methods get priority over applying autoderef (which is needed to go from `&String` to `&str` and select `str::strip_suffix`). If, however, you already have a `&str`, then autoderef isn't needed and the inherent method wins over the trait method. At no point does the prelude come into play.
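A minimal sketch reconstructing the experiment described upthread (the panicking body is just a tracer, and this assumes a compiler new enough to have the inherent stdlib method, i.e. 1.45+):

```rust
// A user-defined extension trait whose method name collides with the
// inherent str::strip_suffix that the stdlib gained in Rust 1.45.
trait ImprovedString {
    fn strip_suffix(&self, suffix: &str) -> Option<String>;
}

impl ImprovedString for str {
    fn strip_suffix(&self, _suffix: &str) -> Option<String> {
        // Panic so it's obvious if this version is ever the one called.
        panic!("custom strip_suffix called");
    }
}

fn which_wins() -> Option<&'static str> {
    let s: &str = "hello.rs";
    // The receiver is already &str, so no autoderef is needed and the
    // inherent str::strip_suffix beats the trait method: on 1.45+ this
    // silently resolves to the stdlib version and does not panic.
    s.strip_suffix(".rs")
}

fn main() {
    println!("{:?}", which_wins());
}
```

On a pre-1.45 compiler the same call would have hit the trait method (and here, the panic), which is the silent behaviour change being discussed.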
But why would anybody choose blue? There is no moral benefit to doing so.
If you altered the game to say that only some fraction of the population gets the choice, and everyone who doesn't get the choice is assumed blue (or is killed if fewer than 50% of voters choose blue), then there's some question to be explored here. But as it stands there is literally no reason to choose blue.
Choosing red is choosing to survive knowing that there will always be people who choose blue, potentially an amount that would mean you don't survive if you didn't take explicit action against it.
They didn't cause the peril, but knowing that their choice is a possibility, if I don't make a decision to protect myself now, their decisions may then be the cause of my not surviving.
To me, the whole point of the riddle is that it reveals one's innermost bias toward the self or toward others: whether you do things for society or for yourself. Blues don't understand reds; reds don't understand blues. The bias is invisible to the self, but it is clearly there given the huge contrast in people's opinions.
You fail to see how anyone could choose blue, even though there are plenty of people on the internet and even in the comments here who are stating they would choose blue?
Depends on the scenario… or the number of people in the experiment. A sufficiently large number of people will guarantee votes in both bins. The specific scenario (reading this outside of a vacuum) will also have knock-on effects.
Eg: reading this into the current political landscape in the US vs reading this into another toy problem about jumping off a cliff or not will have very different outcomes and ethics.
The article makes a good point with their reframing.
"Give everyone a magic gun. They may choose to shoot themselves in the head. If more than 50% of people choose to shoot themselves, all the guns jam. The person also has the option to put the gun down and not shoot it."
The "dilemma" is asking to what lengths we should go to save people choosing to commit suicide, and does that change when they are unintentionally choosing suicide due to being "tricked" into it.
I guess that just underlines how reframing can really muddy or clarify a problem. The original problem can be mapped onto many varied scenarios with wildly different ethics.
Practically speaking, at least one person will choose blue, for lulz, out of curiosity, or as a moral stand. Shall we punish them? How does that affect the survival of the whole population in the long term?
There’s a moral benefit to choosing blue if you think there’s a chance that the end result will be split 50-50 and you’ll be the deciding vote between a blue majority and a red majority.
If everyone picks red everyone lives, nobody needs saving by picking blue. Picking blue obliges others to pick blue to prevent your death, risking their own life in turn. Red is the moral option.
There is no topic in which you'll get 100% of people to agree with you, and this is no different. There will always be people who choose blue. Arguing that you could ever get 100% of people to pick red is a coping mechanism to deal with the knowledge that your choice to pick red will result in some deaths (i.e., unless blue wins).
That isn't to say I categorically judge anyone who would choose red.
If there's good reason to believe a majority, and especially a supermajority, would choose red over blue, then choosing red is indeed the only rational choice, and convincing others to do the same is the only way to save lives.
What I like about the question is that it can be used to measure whether a society is low trust (majority red) or high trust (majority blue).
However, where I take issue with the article is the assertion that it's impossible to get a blue majority, especially in the face of polling that suggests such a majority already exists. The article's claim that choosing red is the only moral choice seems at best to be self-delusion.
The utility of choosing red, and the morality of convincing others to follow suit, grows as the expected pool of red voters gets larger, sure. Choosing blue likewise has less and less personal downside the larger the expected blue majority. But the morality of choosing blue is maximized the closer you get to an even split, since it's the product of the potential lives saved by going blue and the likelihood that your individual vote will push it over the edge.
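That product can be made explicit with a rough expected-value sketch (my notation, not from the thread: n voters, n odd, each voting blue independently with probability p):

```latex
% Expected lives saved by one blue vote: the chance your vote is the
% pivotal one, times the number of blue voters who would otherwise die.
E[\text{lives saved}] \approx P(\text{pivotal}) \cdot N_{\text{blue}}
% Being pivotal among the other n-1 voters means an exact 50/50 split;
% this binomial term peaks as p approaches 1/2:
P(\text{pivotal}) = \binom{n-1}{(n-1)/2}\, p^{(n-1)/2}\,(1-p)^{(n-1)/2}
```

The pivot probability collapses toward zero as p moves away from 1/2, which is the formal version of "your vote only matters near an even split".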
Personally, I'd choose blue. I'd rather sacrifice myself than be party to the deaths of billions of people, so if there's even some hope at convincing the majority to go blue, I'd feel obligated to stay with it even if pre-polling suggests things initially tip toward red. I'd also be a bit wary of living in a society now devoid of anyone willing to self-sacrifice. I'm not convinced most people choosing red give that any thought.
> However, where I take issue with the article is the assertion that it's impossible to get a blue majority, especially in the face of polling that suggests such a majority already exists.
The people saying they'd vote blue would never actually do it. People support lots of altruistic things in the abstract, but almost nobody does it when it involves real risk and sacrifice. The cost of saving a kid in Africa by donating malaria medicine and insecticidal nets is only about $5,000. How many people do you know who will cancel their Hawaii vacation and donate that money to an African charity?
Every time you choose to take a vacation, or get a tricked out Macbook Pro, etc., you are in a real way choosing to allow some kid in Africa to die. But you do it anyway.
You're thinking of this like a game where the only point is to "win". That's not how this would actually work in practice.
Blue is the only moral and logical choice. If red gets over 50% and you picked it, therefore contributing to the "red" outcome, you are now effectively a murderer. Plus you now get to live in a world where everyone else alive are sociopaths that picked red, where everyone with a conscience is now dead.
You also can't count on everyone picking red, nor fall back on "if you picked blue, then you voted for suicide".
It's reasonable to assume that, leading up to the button-press event, the usual low-trust, "every man for himself" types will rally for red, with the usual excuses, while high-trust societies will make it clear that it's your moral duty to pick blue, to get the votes to the 50% threshold and ensure no one dies. Around the world there would be nonstop debates permeating every social circle and family. You'd have huge arguments where the typical selfish types scream at their family members, "How dare you say you're going to press blue, do you want to leave your poor mother alone without her only child?", only pushing red-leaning voters further into red and blue-leaning voters further into blue.
Plus, if you look at the possible outcomes:
- Red wins, you picked red: Depending on where you live, anywhere from a reasonable portion to the large majority of the population is now dead. The ones left alive have, by definition, a strong bias towards individualism and noncooperation. It's extremely likely civilisation will collapse. Pick your favourite fictional dystopia and it has a reasonable chance of becoming somewhat real.
- Red wins, you picked blue: You are now dead, but at least you don't have to live in the world above.
- Blue wins, you picked blue: Things carry on as normal and your conscience is safe in knowing that you didn't vote to kill and that over 50% of your fellow humans also didn't vote to kill.
- Blue wins, you picked red: Things carry on as normal, but you now have a guilty conscience, or, if your vote was made public, people around you know you would have killed them to save your skin.
By picking red you didn't contribute to anything at all; this button does absolutely nothing in practice. If you remove the red button, leaving the choice between pressing blue and not participating at all, the choice not to participate seems quite obvious. The red button adds some "weight" to the decision, but it's materially the same.
> Depending on where you live, a reasonable portion to the large majority of the population is now dead. The ones alive have, by definition, a strong bias towards individualism and noncooperation.
Anyone who picked blue gambled their own lives over nothing. There is nothing altruistic about pressing the blue button and especially nothing altruistic about trying to convince people to press the blue button. The altruistic thing is to convince everyone that they don't need to kill themselves by pressing the blue button.
You're ignoring the dimension of universalism versus insularity. In practice, high-trust, high-cooperation communities are also insular. They cooperate within their community, but not with people outside it. Those communities can ensure the survival of their members by using their social infrastructure to ensure everyone votes red.
Assuming that the red/blue choice doesn't have a theological valence, you'd have a lot of tight-knit Mormon, Muslim, and Orthodox Jewish communities surviving in the red scenario. I suspect also all the highly authoritarian Asian countries.
That's still not really a dilemma. It would be a dilemma if it were up to me to save those people who choose blue. But it's not up to me - it's up to a massive gamble that over 50% of people (over 4 BILLION people) will vote with me as well. Like... huh? Are we being serious here? We want to play poker with the lives of billions?
Maybe if the required percentage was lower this would compute better in my brain lol
The technique Anthropic uses was demonstrated by Nicholas Carlini in a talk he gave two weeks ago, and it's very simple: when asking LLMs to review code, ask them to focus their review on one file per session. Here is the video with the timestamp (watch through to ~5:30; they show two different ways of prompting Claude).
IMO the big "innovation" being shown by Mythos is the effectiveness of prompting LLMs to look for security vulnerabilities by focusing on specific files one at a time, and automating this prompting with a simple script.
Prompting Mythos to focus on a single file per session is, I suspect, why it cost Anthropic $20k to find some of the bugs in these codebases. I know this same technique is effective with Opus 4.6 and GPT 5.4 because I've been using it on my own code. If you just ask the agent to review your PR with a low-effort prompt, it won't be exhaustive; it will not actually read each changed file and look at how it interacts with the system as a whole. If the entire session is devoted to reviewing the changes to a single file, the LLM will do much more work reviewing it.
Edit: I changed my phrasing. It's not about restricting its entire context to one file, but focusing it on one file while still allowing it to look at how other files interact with it.
Instead of asking the model "Here's this codebase; report any vulnerability.", you ask "Here's this codebase; report any vulnerability in module\main.c".
The model can still explore references and other files inside the codebase, but you start over with a new context/session for each file.
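The loop described above can be sketched in a few lines. This is my own sketch, not Anthropic's script: `run_review_session` (commented out) is a placeholder for however you actually invoke the agent, not a real API, and the `src` directory and prompt wording are assumptions.

```rust
use std::fs;

// Build the per-file prompt: scope the review to one file, but allow
// the agent to read the rest of the codebase for context.
fn build_prompt(path: &str) -> String {
    format!(
        "Here's this codebase. Report any vulnerability in {path}. \
         You may read other files to see how they interact with it."
    )
}

fn main() {
    if let Ok(entries) = fs::read_dir("src") {
        for entry in entries.flatten() {
            let path = entry.path();
            if path.extension().is_some_and(|e| e == "rs") {
                let prompt = build_prompt(&path.display().to_string());
                // Start a *fresh* context/session for each file, e.g.:
                // run_review_session(&prompt);
                println!("{prompt}");
            }
        }
    }
}
```

The key design point is that nothing carries over between files: each session gets a clean context whose entire budget is spent on one file.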
Honestly, that's the only way I've ever been able to trust the output. Once you go beyond the scope of one file it really degrades. But within a single file I've seen amazing results.
Aren't you supposed to include as many _preconditions_ as you can (in the form of test cases, or function constraints like the `assert` macro in C) in your prompt, describing the input for a particular program file, before asking the AI to analyze the file?
Please read my reply to one of the authors of Angr, a binary analysis tool. Here is an excerpt:
> A "brute-force" algorithm (an exhaustive search, in other words) is the easiest way to find an answer to almost any engineering problem. But it often must be optimized before being computed. The optimization may be done by an AI agent based on neural nets, or a learning Mealy machine.
> Isn't it interesting what is more efficient: neural nets or a learning Mealy machine?
...Then I describe what a learning Mealy machine is. And then:
> Some interesting engineering (and scientific) problems are:
> - finding an input for a program that hacks it;
> - finding a machine code for a controller of a bipedal robot, which makes it able to work in factories;