> Reading and understanding other people's code is much harder than writing code.
I keep seeing this sentiment repeated in discussions around LLM coding, and I'm baffled by it.
For the kind of function that takes me a morning to research and write, it takes me probably 10 or 15 minutes to read and review. It's obviously easier to verify something is correct than come up with the correct thing in the first place.
And obviously, if it took longer to read code than to write it, teams would be spending the majority of their time in code review, but they don't.
Five hours ago I was reviewing some failed tests in a PR. The affected code was probably 300 lines, total source for the project ~1200 lines. Reading the code, I couldn't figure out what the hell was going on... and I wrote all the code. Why would that be failing? This all looks totally fine. <changes some lines> There that should fix it! <runs test suite; 6 new broken tests> Fuck.
When you write code, your brain follows a logical series of steps to produce the code, based on a context you pre-loaded in your brain in order to be capable of writing it that way. The reader does not have that context pre-loaded in their brain; they have to reverse-engineer the context in order to understand the code, and that can be time-consuming, laborious, and (as in my case) erroneous.
I worked with people who defended the idea that code should not have comments, that the code should explain itself.
I am not a developer and I completely disagree with that. The Python scripts I wrote, the Ansible playbooks, they all have comments, because a month down the road I no longer remember why I did what I did. Was that a system limitation, a software limitation, or just the easiest solution at the time?
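As a sketch of that point (a hypothetical snippet, not from any real script): a comment that records the *why*, not the *what*, is exactly what saves you a month later. The 4 MB limit and the vendor constraint here are made up for illustration.

```python
# Chunk uploads to 4 MB: the vendor's API rejects larger payloads.
# This was a limitation of their system at the time, NOT a design
# choice on our side -- revisit if they lift the cap.
CHUNK_SIZE = 4 * 1024 * 1024

def chunked(data: bytes, size: int = CHUNK_SIZE):
    """Yield successive chunks of `data` no larger than `size` bytes."""
    for start in range(0, len(data), size):
        yield data[start:start + size]
```

The code alone would tell a reader *that* uploads are chunked at 4 MB; only the comment tells them it was someone else's limitation rather than a deliberate choice.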
I like to think of it as the distinction between editor and reader. Like you said, it's quite easy to read code; I heavily agree with this. I don't professionally write C, but I can read and kinda infer what C devs are doing.
But as an "editor," I actually take the time to understand code paths, tweak the code to see what could be better, and actually try different refactoring approaches while editing. Literally seeing how this can be rewritten or reworked to be better takes considerable effort, and it's not the same as reading.
We need better words for this than "editor" and "reader," something with a dev classification to it.
When the code is written, it's all laid out nicely for the reader to understand quickly and verify. Everything is pre-organized, just for you the reader.
But in order to write the code, you might have to try 4 different top-level approaches until you figure out the one that works, try integrating with a function from 3 different packages until you find the one that works properly, hunt down documentation on another function you have to integrate with, and make a bunch of mistakes that you need to debug until it produces the correct result across unit test coverage.
There's so much time spent on false starts and plumbing and dead ends and looking up documentation and debugging when you code. In contrast, when you read code that already has passing tests... you skip all that stuff. You just ensure it does what it claims and is well-written and look for logic or engineering errors or missing tests or questionable judgment. Which is just so, so much faster.
> But in order to write the code, you might have to try 4 different top-level approaches until you figure out the one that works, try integrating with a function from 3 different packages until you find the one that works properly
If you haven’t spent the time to try the different approaches yourself, tried the different packages, etc., you can’t really judge if the code you’re reading is really the appropriate thing. It may look superficially plausible and pass some existing tests, but you haven’t deeply thought through it, and you can’t judge how much of the relevant surface area the tests are actually covering. The devil tends to be in the details, and you have to work with the code and with the libraries for a while to gain familiarity and get a feeling for them. The false starts and dead ends, the reading of documentation, those teach you what is important; without them you can only guess. Without having explored the territory, it’s difficult to tell if the place you’ve been teleported to is really the one you want to be in.
The goal isn't usually to determine whether the function is the perfect, optimal version of the function that could ever exist, whether the package it integrates with is the best possible package out of the 4 mainstream options, or to become totally and intimately familiar with them to ensure it's as idiomatic as possible or whatever.
You're just making sure it works correctly and that you understand how. Not superficially, but thinking through it indeed. That the tests are covering it. It doesn't take that long.
What you're describing sounds closer to studying the Talmud than to reading and reviewing most code.
Like, the kind of stuff you're describing is not most code. And when it is, then you've got code that requires design documents where the approach is described in great detail. But again, as a reader you just read those design documents first. That's what they're there for, so other people don't have to waste time trying out all the false starts and dead ends and incorrect architectures. If the code needs this massive understanding, then that understanding needs to be documented. Fortunately, most functions don't need anything like that.
> And when it is, then you've got code that requires design documents where the approach is described in great detail
And how do you write those design documents? First, you need to understand the landscape, and that means reading code, building experiments and trying out different variants, which then allows you to specify a design.
Our job isn't writing code; our job is to gain the understanding required to write specifications and/or optimal code.
And while AI may be a better typewriter, it obscures the actually hard part of our job, the actual engineering, and the reason why others pay us to consult them.
Most human written code has 0 (ZERO!) docs. And if it has them, they're inaccurate or out of date or both.
Lots of code is simple and boring, but a fair amount isn't, and reading it is non-trivial: you basically need to run it in your head or do step-by-step debugging in multiple scenarios.
> If I got code like Joel describes for a code review, I'm sending it back asking for it to be clearly commented.
I'd argue that most code written today is never code reviewed. Heck, most code today isn't even read by another human being, and I'm talking about code generated by other humans! :-)
Most code written today (and most likely most code ever written) is poorly written code.
These days there are many companies with good engineering practices, though, so there are lots of islands where this isn't the case.
I can read a line of code and tell you that it's storing a pointer in this array cell and removing this other pointer and incrementing this integer by 6 and so on. None of that tells me if that is the correct thing to be doing.
Obvious programming errors, like forgetting to check for an error case, forgetting to free a resource, or using an array where a set should be, are usually easy to detect, and frequently a machine can and will point them out.
Knowing that when you add a transaction to this account you always need to add an inverse transaction to a different account to keep them in sync is unlikely to be obvious from the code. Or that you can't schedule an appointment on may 25th because it's memorial day. Or whatever other sorts of actually major bugs tend to cause real business problems.
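As a sketch of that point (a hypothetical ledger, not from any real system): the snippet below reads cleanly and would sail through a quick review, but nothing in the code itself tells a reviewer that the double-entry rule exists. A version that forgot the inverse transaction would look just as plausible line by line.

```python
from dataclasses import dataclass, field

@dataclass
class Account:
    name: str
    transactions: list = field(default_factory=list)

    def add(self, amount: float) -> None:
        self.transactions.append(amount)

    def balance(self) -> float:
        return sum(self.transactions)

def transfer(src: Account, dst: Account, amount: float) -> None:
    # The business rule: every credit needs a matching inverse debit
    # so the two accounts stay in sync. A reviewer who doesn't already
    # know this rule can't tell that dropping the next line is a bug.
    src.add(-amount)
    dst.add(amount)
```

The broken variant, one that only calls `dst.add(amount)`, produces money from nothing, and no amount of careful line-by-line reading reveals it unless you already know the invariant.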
I mean, sure, if someone documented those requirements clearly and concisely and they were easy to find from the section of code you were reviewing such that you knew you needed to read them first, then yes, it becomes a lot easier. My experience as a professional programmer is this happens approximately never, but I suppose I could be an outlier.
And yes if you want to be extremely literal, some code is easier to read than write. But no one cares about that type of code.
Outside of life-saving critical software or military-spec software, no one needs to review so hard that they understand it at the level you're describing, and no one does.
There is a mathematical principle that verifying a proof is easier than finding one. The same is true of code.
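A minimal everyday instance of that asymmetry: producing a sorted list takes O(n log n) comparisons, but verifying that a list is sorted is a single O(n) pass.

```python
def is_sorted(xs: list) -> bool:
    """Verify sortedness in one linear pass -- cheap compared to
    actually sorting, which is the produce/verify gap in miniature."""
    return all(a <= b for a, b in zip(xs, xs[1:]))
```

Whether that gap carries over from checking *outputs* to reviewing *code* is, of course, exactly what this thread is arguing about.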
I mean, it's even easier to just not read the code in the first place. I'm not sure what that proves, other than perhaps an implicit corollary to the original quote: "reading code is quite hard (so people rarely bother)."
Well said, you’re absolutely right. In practice code review is orders of magnitude faster than code creation, and it always has been; it's baffling that anyone argues otherwise. Perhaps they’ve never worked in a real organisation, or they’ve only worked on safety-critical code, or something?
Sometimes code review is so fast it's literally instant (because people aren't actually reading the code).
I think it's one of those sort of, dunno, wink wink situations where we all know that doing real in depth code reviews would take way more time than the managers will give (and generally isn't worth it anyways) so we just scan for obvious things and whatever happens to interact with our particular speciality in the code base.
I’ve rarely seen this happen personally but it does happen from time to time. Even thorough code reviews don’t take anywhere near as long as the writing though.
Reading and thinking you understand other people's code is trivially easy
Reading and actually understanding other peoples' code is an unsolved problem
You draw an analogy from the function you wrote to a similar one. Maybe by someone who shared a social role similar to one you had in the past.
It just so happens that most times you think you understand something, you don't. Because bugs still exist, we know that reading and understanding code can't be easier than writing it. Also, in the past it would have taken you less than a morning, since the compiler was nicer. Anyway, it sounds like most of your "writing" process was spent reading and understanding code.
>It's obviously easier to verify something is correct than come up with the correct thing in the first place.
You are missing the biggest root cause of the problem you describe: People write code differently!
There are "cough" developers whose code is copy/paste from all over the internet. I am not even getting into the AI folks going full copy/paste mode.
When investigating said code, you will be left wondering: why is this code in here??
You can tell when a Python script contains different styles of logic, for example.
Sure, 50 lines will be easy to read, but expand that to 100 lines and you'll be left on life support.
I think this originated from old arguments that say that the total _cumulative_ time spent reading code will be higher than the time spent writing it. But then people just warped it in their heads that it takes more time to read and understand code than it takes to write it in general, which is obviously false.
I think people want to believe this because it is a lot of effort to read and truly understand some pieces of code. They would just rather write the code themselves, so this is convenient to believe.
The reason I don't spend the majority of my time in code review is that when I'm reviewing my teammates' code I trust that the code has already been substantially verified already by that teammate in the process of writing it and testing it. Like 90% verified already. I see code review as just one small stage in the verification process, not the whole of it.
The way I approach it, it's really more about checking for failures, rather than verifying success. Like a smoke test. I scan over the code and if anything stands out to me as wrong, I point it out. I don't expect to catch everything that's wrong, and indeed I don't (as demonstrated by the fact that other members of the team will review the code and find issues I didn't notice). When the code has failed review, that means there's definitely an issue, but when the code has passed review, my confidence that there are no issues is still basically the same as it was before, only a little bit higher. Maybe I'm doing it wrong, I don't know.
If I had to fully verify that the code was correct when reviewing, applying the same level of scrutiny that I apply to my own code when I'm writing it, I feel like I'd spend much longer on it, a similar time to what I'd spend writing it.
Now with LLM coding, I guess opinions will differ as to how far one needs to fully verify LLM-generated code. If you see LLMs as stochastic parrots without any "real" intelligence, you'll probably have no trust in them and you'll see the code generated by the LLM as being 0% verified, and so as the user of the LLM you then have to do a "review" which is really going from 0% to 100%, not 90% to 100% and so is a much more challenging task. On the other hand, if you see LLMs as genuine intelligences you'd expect that LLMs are verifying the code to some extent as they write it, since after all it's pretty dumb to write a bunch of code for somebody without checking that it works. So in that case, you might see the LLM-generated code as 90% verified already, just as if it was generated by a trusted teammate, and then you can just do your normal review process.
So where is this idea coming from?