Hacker News

Well, I have this idea that smaller programs (in the sense of Kolmogorov complexity) are, pretty much, well-written and thus maintainable. A more precise argument is as follows:

You take a very simple functional language - say, Haskell Core. Consider a large and representative enough (finite) set of programs in this language. We find the smallest possible (or at least very small) representation of this set. This will involve defining some commonly used functions. Let's call all the functions in this representation "the basic library".

Then, for a new program, we try to find the shortest representation we can, where using the functions from the basic library comes for free (so you don't count the size of the basic-library functions towards the total size of the new program).

So if we can write a program that finds a smaller representation using the definitions (functions) we already generated from that representative set, I think it would essentially refactor code automatically to be more readable and maintainable.
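To make this concrete, here is a toy sketch (the token-list encoding and library contents are invented for illustration - this is not the proposed system itself): a representation of a program is its main expression plus any helper definitions it introduces, and basic-library definitions are billed only at their call sites, never for their bodies.

```python
# Toy model of the "free basic library" idea. Every identifier counts
# as one token; the encoding and library names are invented here.

BASIC_LIBRARY = {"map", "filter", "compose"}  # assumed library names

def billed_size(main_tokens, helper_defs, library=BASIC_LIBRARY):
    """Size of a representation: the main expression plus the bodies of
    any helpers it defines itself. Library bodies come for free."""
    size = len(main_tokens)
    for name, body in helper_defs.items():
        if name not in library:  # only non-library definitions are billed
            size += len(body)
    return size

def pick_shortest(representations, library=BASIC_LIBRARY):
    """Among semantically equivalent representations, prefer the one
    with the smallest billed size - the automatic refactoring step."""
    return min(representations,
               key=lambda r: billed_size(r[0], r[1], library))

# An inlined loop vs. a call into the free library:
inlined = (["for", "x", "in", "xs", "append", "f", "x"], {})
with_map = (["map", "f", "xs"], {})
```

Here `pick_shortest([inlined, with_map])` selects the `map` version, since its billed size is 3 tokens against 7.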



This might get you the shortest code but it wouldn't necessarily get you the cleanest code.

There's often a trade-off between coupling and terseness, which requires (human) judgement to balance the competing concerns correctly. There's also the matter of using appropriate metaphors.

I have a feeling that a "cleanliness" optimizer like this would get you a program full of the equivalent of Perl one-liners, assuming it's even possible.


I don't know what the generated code would look like, but I am sure humans wouldn't like it. But that's not a big deal, because for every piece of code out there, there is somebody who doesn't like it.

Proper metaphors (better to call them abstractions) would be generated at that "basic library" creation step. Basically, those are functions that are commonly useful. Some of them would correspond to abstractions known by many programmers, some would be different, but altogether they would let you write more compact code (because that's what they were selected for). Ultimately, knowing these abstractions amortizes and eases understanding of the code that uses them.

We humans do the same thing - the code that is easy to understand is the code that uses commonly known abstractions, such as control structures, standard library functions, etc. What makes this possible is that we hold these shared abstractions in our heads, and create them as a part of programming culture by working with different code bases.
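As a small illustration of that point (function names invented here): both definitions below compute the same thing, but the second leans on abstractions most readers already hold in their heads, so the amount of new code they must absorb is smaller.

```python
def total_lengths_explicit(words):
    # spells out the iteration and accumulation by hand
    total = 0
    for w in words:
        total = total + len(w)
    return total

def total_lengths_shared(words):
    # reuses abstractions the reader already knows: map and sum
    return sum(map(len, words))
```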

There is no trade-off. You cannot write a terse function with high coupling, because then you cannot use the shared abstractions very efficiently inside it, nor can you build other functions easily by reusing it. So terseness IMHO behaves very differently once you account for the fact that you can create new definitions and put those definitions to use. (And especially if those definitions come from some "basic library" for free - they are not included in the cost you have to pay to understand the program, because they have already been understood.)


Abstractions are different to metaphors. You can have an abstraction with a name 'x' that you can use to send messages over an SMTP server. The metaphor could be "mailer", "mailbox", "email sender", "postman" or many other different things with subtly different implications.

Moreover, the "basic library" creation step that you allude to is the point where you decide where the borders between different modules of code lie. Creating those borders at the appropriate point - deciding the coupling - is arguably the most important part of refactoring, and the part I have found hardest to clean up in legacy code. I am currently rewriting a library which in retrospect I now realize should have been two libraries.

"You cannot write a terse function with high coupling"

Oh, you absolutely can. In fact, after a certain point terseness tends to correlate with higher coupling. There is often a trade-off to be made between coupling one block of code to another (e.g. introducing a dependency) and sacrificing terseness.


> Abstractions are different to metaphors. You can have an abstraction with a name 'x' that you can use to send messages over an SMTP server.

How is that different from a function (or set of functions) that abstracts the technicalities of sending messages over SMTP?

I am not sure why a specific metaphor is so helpful here. I don't think it helps in understanding the function - on the contrary.

> Moreover, the "basic library" creation step that you allude to is the point where you decide where the borders between different modules of code lie.

Not quite. The important thing to understand is that I am not optimizing for the size of a single program, but for a large set of different programs. The same thing that humans do.

> There is often a trade-off to be made between coupling one block of code to another (e.g. introducing a dependency) and sacrificing terseness.

Well, I understand "high coupling" differently than you, then. For me, it means things like doing several unrelated things in a single function (with the result that it's harder to reuse it). It doesn't mean that you cannot call other functions.

I don't consider a function that avoids a library call, instead inlining the functionality, to be well-written. (If that's what you mean by the trade-off - maybe some example would help.)
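For instance (an invented example, not from the thread), the two versions below show the inlining-vs-calling choice in question. Calling the helper adds a dependency edge; inlining duplicates knowledge of the format.

```python
import re

# Illustrative pattern only - deliberately not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def is_valid_email(addr):
    return EMAIL_RE.fullmatch(addr) is not None

def register_inlined(addr):
    # inlines the check: no dependency, but the format is stated twice
    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", addr) is None:
        raise ValueError("bad address")
    return {"email": addr}

def register_reusing(addr):
    # calls the shared helper: one more coupling edge, one source of truth
    if not is_valid_email(addr):
        raise ValueError("bad address")
    return {"email": addr}
```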


>How is that different from a function (or set of functions) that abstracts the technicalities of sending messages over SMTP?

>I am not sure why a specific metaphor is so helpful here. I don't think it helps in understanding the function - on the contrary.

It isn't a different function, but "send_email" is better than "send_message", which is in turn better than a function just called "x", which is shorter. In each case a different metaphor is used. "x" is a shorter name, but code that uses names like that everywhere is NOT cleaner. It's horrible.

>Not quite. The important thing to understand is that I am not optimizing for size of a single program, but large set of different programs.

Optimizing purely for size will end up increasing coupling to an insane degree. I've seen this done by people who get religious about DRY.

>Well, I understand "high coupling" differently than you, then. For me, it means things like doing different unrelated things in a single function (so the result is that it's harder to reuse it). It doesn't mean that you cannot call other functions.

If you're using one function to call another, you have coupled them.

Coupling is not about what you can or cannot do; it's about what is.

Cohesion (or rather, a lack of it) is about doing different things in the same function (or class, file, project, etc.).
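A small invented example of the distinction: the first function mixes two unrelated jobs in one place; splitting it leaves two cohesive functions whose single call between them is the coupling.

```python
def report_and_save(data, path):
    # low cohesion: formatting a report and writing a file in one function
    text = "\n".join(f"{k}: {v}" for k, v in sorted(data.items()))
    with open(path, "w") as f:
        f.write(text)
    return text

# Higher cohesion: each function does one job. The call from save_report
# to format_report is the (deliberate, minimal) coupling between them.
def format_report(data):
    return "\n".join(f"{k}: {v}" for k, v in sorted(data.items()))

def save_report(data, path):
    text = format_report(data)
    with open(path, "w") as f:
        f.write(text)
    return text
```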


Well, I agree that naming is a problem, but let's put that aside for this discussion. I should have explained, however, that I consider all identifiers to have the same length for the purpose of making the representation of a program "compact". So what matters is the number of functions being called inside a function (in FP, even constants are considered functions).
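That size measure can be stated precisely. In this toy version (the expression encoding is invented here), every identifier counts as one node, so name length never affects the result.

```python
def size(expr):
    # expr is either an identifier (str) or an application tuple
    # (function, arg, ...); every identifier costs exactly 1
    if isinstance(expr, str):
        return 1
    return sum(size(part) for part in expr)
```

For example, `size(("compose", "f", "g"))` and `size(("veryLongName", "accumulate", "basket"))` are both 3: only the number of calls matters, not the spelling of the names.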

OK, I get your point about coupling. But I am not sure why you consider high coupling to be a bad thing, especially in the case where you have an automated refactoring system.

It seems to me that to "decrease coupling" you have to add nodes to dependency graph, and this increases overall complexity.

Perhaps you could provide some example where high coupling (calling different functions from a single function) is bad, and how you would like to have it resolved.

Update: Actually, with automated refactoring, you wouldn't have to modify any existing functions (except maybe to call your new functions). You would just write new functions that incorporate the new functionality, perhaps using the existing building blocks. Then the refactorer would sort things out in a DRY manner, as needed. So you don't need to care whether the code is easy to refactor; it just needs to be easy to read.


>But I am not sure why you consider high coupling to be a bad thing

Isolation of code enables you to more easily understand it, test it, replace it and re-use it.

Linear increases in coupling lead to a combinatorial explosion in the difficulty of all of the above.

Coupling is actually more important than DRY.


I can see why you intuitively think that's the case, but is it really?

I don't understand how you imagine you can "isolate code" while making it less DRY. If I need certain functionality in some function, I need either to call another function to do it (if you already have a function that can do it) or to copy that functionality into the function itself. The first approach is DRY, the second isn't. But I don't think the second approach is a good idea, like, at all. So if I assume you don't mean that, what do you mean?

I can understand that you can make code easier to test by making it less DRY. Let's put that aside, because it's not really a big deal in this discussion. But for the other three, I would love to see some specific example where making code less DRY increases the ease of understanding it, and possibly also of replacing and re-using it (provided we don't care about the elegance of the result, since that would be taken care of by subsequent refactoring, as I already explained - so we can, for instance, copy-paste existing code for the purpose of the change).





