Hacker News

> not a single piece of large software (except maybe a compiler or two) has ever been written using the skills/concepts of the higher levels of this chart.

I wonder: to what extent is this because those higher-level concepts tend to shrink programs in the first place? I mean, big is bad, there's no debating that. Big programs can only arise from necessity or stupidity. And if those fancy features are any use, they must reduce the need for big.

I would ask myself a slightly different question: what kinds of programs still have to be big, even if you have all these fine concepts at your disposal?

And of course, big is risky, and that tends to make us choose more conservative options. Why use Haskell when C++ has shown in the past that it can do that kind of job? Not to mention the network effects: even if Haskell as a language were better than C++ at some specific job, you might still choose C++ because of libraries, commercial support, and developer availability.

> When it comes to small (<100KLOC) and non-large (<1MLOC) programs

Oh, so that's how you're calibrated… My, to me, small means <10KLOC. 100KLOC is already big, and 1MLOC is gigantic. Besides, I've seen a couple multi-million lines programs, and I don't believe for a second they had to exceed 100KLOC. Such programs are more about piling historical accident on top of historical accident than about encoding a complex problem domain.

A small compiler takes a couple thousand lines of code. With the proper tools, it can often be squeezed into 2KLOC (Source: the STEPS project from http://vpri.org). It won't optimize like GCC, but you rarely have to anyway. Now you have a DSL for writing an executable specification. What kind of specification is so complex that it cannot even fit in 50 books?
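To make the "DSL as executable specification" idea concrete, here is a toy sketch in Haskell. It is purely illustrative and nothing like the STEPS languages: a tiny arithmetic language plus its interpreter, showing how a few lines of data type and evaluator turn a specification into something runnable.

```haskell
-- A minimal "executable specification": the data type IS the grammar,
-- and the evaluator IS the semantics.
data Expr
  = Lit Int
  | Add Expr Expr
  | Mul Expr Expr
  deriving Show

eval :: Expr -> Int
eval (Lit n)   = n
eval (Add a b) = eval a + eval b
eval (Mul a b) = eval a * eval b

main :: IO ()
main = print (eval (Add (Lit 2) (Mul (Lit 3) (Lit 4))))  -- prints 14
```

A real DSL of the kind STEPS describes adds a parser and far richer semantics, but the shape is the same: the language definition stays small while the programs written in it shrink.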

Of course you have to keenly understand the problem domain to pull that off, and that often means solving it the crappy way the first time around. Which means you're never going to rewrite it the proper way: it would be way too risky. So of course it has seldom been put to the test. One does not just hinge an entire business on original research.

> You're equating upfront effort with a particular choice of technique, and one that has never been put to the test.

No, not never. I have at least one example: one of my uncles once had to write a number of database transactions. It was one of his first jobs. He had 1 year to do it. Seeing how tedious it would be, he first thought about the problem, then devised a DSL in which he could write the damn transactions. Since he wasn't exactly senior, and had crappy tools (he used B), the DSL took him about 6-7 months to perfect. (During which management was on the verge of panic, because no transaction had been done yet.)

At the 8 month mark, all transactions were done, tested, and had a surprisingly low bug count. The client was very pleased and the contract was renewed for another batch of transactions. Same size, but to be done in 8 months this time. That was given to a co-worker, who used my uncle's DSL to perform 8 months' worth of contracted work in 1 month.

DSLs sometimes work.
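As a purely hypothetical sketch (the original DSL was written in B and its details are lost to this story), a transaction language of that kind could today be built as a small embedded DSL. The account names and the string output below are made up; a real interpreter would emit SQL or drive a database instead.

```haskell
-- Hypothetical embedded DSL for database transactions.
data Tx
  = Debit  String Int   -- account name, amount
  | Credit String Int
  | Seq Tx Tx           -- run one step, then the next

-- Composite operations are built from the primitives.
transfer :: String -> String -> Int -> Tx
transfer from to amount = Seq (Debit from amount) (Credit to amount)

-- A toy interpreter; swapping this out changes the backend
-- without touching any transaction definitions.
run :: Tx -> [String]
run (Debit a n)  = ["debit "  ++ a ++ " " ++ show n]
run (Credit a n) = ["credit " ++ a ++ " " ++ show n]
run (Seq x y)    = run x ++ run y

main :: IO ()
main = mapM_ putStrLn (run (transfer "alice" "bob" 100))
```

The payoff in the story above is exactly this separation: once the DSL exists, each new transaction is a few lines in the high-level language rather than pages of boilerplate.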



> to what extent is this because those higher-level concepts tend to shrink programs in the first place?

None at all. No one, AFAIK, even claims a reduction of even a single order of magnitude in code size.

> I would ask myself a slightly different question: what kinds of programs still have to be big, even if you have all these fine concepts at your disposal?

All those that are big now. I'm not talking about a single executable, but about an entire system (specification, development and debugging have always been done modularly, regardless of whether there's a single process or a distributed system). I see no way to implement the requirements of, say, an air-traffic control system, complex avionics or a banking system in software that isn't very big.

> 100KLOC is already big

A mid-sized enterprise software system is ~5MLOC. Google and Facebook have codebases that are measured in the hundreds of millions of LOC. You can use whatever definitions you like, but if you look at software actually being constructed, much (if not most) of the development effort in the industry goes into systems that are around 5MLOC.

> What kind of specification is so complex that it cannot even fit in 50 books?

My guess? The majority of software written today is part of such specifications. I once worked on a medium-sized air-traffic control system designed for a relatively small area and number of planes, whose informal functional specification was ~10 books. The specifications for the avionics software of a fighter jet developed in the 80s were ~2000 pages of structured natural language (source: http://www.wisdom.weizmann.ac.il/~harel/papers/Statecharts.H...)


> No one, AFAIK, even claims a reduction of even a single order of magnitude in code size.

http://vpri.org claims about 3 orders of magnitude, though most of that is due not to the languages but to the removed redundancies. The languages do seem to be responsible for at least 1 order of magnitude.

About the rest of what you say, I won't claim anything. It just makes me feel… uneasy. Okay, those specs are that big. Do they have to be, though? I've seen a big fat list of requirements at my last gig, and many were duplicated into 2 slightly different versions. As were some pieces of the code. And that's the obvious stuff. There was more subtle waste, where simpler alternatives would have fit the bill but weren't applied, for historical reasons (I asked the architect; there was a reason for everything).

Almost everywhere I look, I see a wasteland of useless code, and even the specs aren't that clean to begin with. It feels like proper DRY alone would have reduced the size of this stuff by a factor of 2 or 3. Maybe I was unlucky enough to work in especially crappy environments. But from what I hear, that's the norm. So far, you're the only one I've met who challenges that perception.
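A contrived Haskell illustration of the duplication pattern described above (the discount functions and rates are invented for the example): two "slightly different versions" of the same requirement, and the single definition they collapse into once the variation is named.

```haskell
-- Before: two near-duplicate requirements, differing only in a constant.
applyStandardDiscount :: Double -> Double
applyStandardDiscount price = price - price * 0.05

applyLoyaltyDiscount :: Double -> Double
applyLoyaltyDiscount price = price - price * 0.10

-- After: one definition, with the variation made an explicit parameter.
applyDiscount :: Double -> Double -> Double
applyDiscount rate price = price - price * rate

main :: IO ()
main = print (applyDiscount 0.05 100 == applyStandardDiscount 100)
```

Trivial in isolation, but repeated across a large spec and codebase, this is exactly the factor-of-2-or-3 shrinkage the DRY argument points at.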


> The languages do seem to be responsible for at least 1 order of magnitude.

Compared to what? C? I'm not talking about C, but about any modern language.

> Do they have to be, though?

Yes. Or, at least, everybody (including me when I first saw them) says "this cannot possibly be this complicated", and after understanding them, everybody says, "oh, OK". But let me put it another way: if the modern world really only requires simple software, then our work is nearly done. Some would use Python, some would use Java, some would use Haskell -- but if the largest software system needs to be ~100KLOC, then none of it matters too much. Writing such a piece of software isn't hard regardless of what language you use (or, rather, the difficulty is in the essential complexity; there's not much accidental complexity in 100KLOC). Even assuming one methodology were 15% better than another, it wouldn't make much difference to the bottom line, because producing 100KLOC of software is cheap anyway. And if that were the case, investing in programming languages would be an even bigger waste, as investing in simplifying specifications would have a much bigger impact and would cost a lot less.

But just to get a sense of scale: GHC -- which is a compiler, and, as you've noted, compilers tend to be small -- is about 400KLOC of Haskell. The Linux kernel is over 15MLOC of C. Reduce that by an order of magnitude and you still get 1.5MLOC, and that's just for an OS kernel.

> So far, you're the only one I've met who challenges that perception.

I don't argue that there isn't a lot of waste. I argue that even without all that waste, we'd still need software that's very large (or that, alternatively, waste is unavoidable). There are no signs that Haskell reduces this waste at all, or that it dramatically reduces the size of programs.


> Compared to what? C? I'm not talking about C, but about any modern language.

The domain they tackle is personal computing, which means Kernel, Windowing system, remote communication (web/mail), multimedia… So, yeah: mostly C and C++, by the look of currently popular programs.

That said, much of the (apparently) needlessly complex stuff I have seen was written in C++, and it did look like they didn't have the real-time requirements or resource constraints that would justify the use of such a monster of a language.

> Or, at least, everybody (including me when I first saw them) says "this cannot possibly be this complicated", and after understanding them, everybody says, "oh, OK".

I have yet to reach the second stage. And on one occasion, I did reach a reasonable understanding of the whole system. It definitely had to be very complex to match the specification, but the specifications themselves didn't match the end user's needs.

> (or that, alternatively, waste is unavoidable)

That's the alternative I'm most scared of. I cannot comprehend unavoidable waste, but I can't rule it out either.

Alternatively, I've come across the idea of not solving some problems [1], because the return on investment is just crap. Okay, when safety is involved, you probably cannot do that. Still, the idea that the 80/20 rule is sometimes more like 99.9/0.1 is enticing. Sometimes, full automation is not best. For instance, last I checked, the best chess player ever is a human-computer team, not a computer.

[1] Stop Over-Engineering https://www.youtube.com/watch?v=GRr4xeMn1uU


> I've come across the idea of not solving some problems

That, too, cannot be solved by a programming language :)

But much of complex software deals with things that could not possibly have been solved more efficiently (or even efficiently enough) by humans (like sensor fusion, package tracking, manufacturing control etc.). Also, we're pretty far from full automation. Nobody trusts computers to make the decisions in air-traffic control systems, power plant control software, or even in ERP systems.

The reason we keep building large software is that -- in spite of many problems -- it really does work (in the sense of achieving its goal of higher throughput; whether a higher throughput of flights or of business deals is good or bad for humanity is an entirely separate question).


> if the largest software system needs to be ~100KLOC, then none of it matters too much. Writing such a piece of software isn't hard regardless of what language you use

I think this is absolutely the difference between you and proponents of Haskell. Personally I have found working on 100k LOC codebases very hard in Python and easy in Haskell.
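One hedged illustration of why some people report this (the order-status type below is invented for the example): in a large Haskell codebase, extending a sum type makes GHC point at every function that has not yet handled the new case, whereas the analogous Python refactoring relies on tests or grep.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}

data OrderStatus = Placed | Shipped | Delivered
  -- Adding "| Returned" here would make GHC warn that "describe"
  -- below no longer covers every case.

describe :: OrderStatus -> String
describe Placed    = "order received"
describe Shipped   = "in transit"
describe Delivered = "arrived"

main :: IO ()
main = putStrLn (describe Shipped)  -- prints "in transit"
```

This is a sketch of one mechanism, not a measurement; whether it outweighs Haskell's adoption costs at 100KLOC scale is exactly the open question of this thread.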


I wouldn't know, because I've never written anything in Python. How many 100KLOC non-compiler programs would you say have been written in Haskell? It would be great if the team behind one of them wrote a technical report, so we'd at least slowly get to find out whether it's actually easier to work with Haskell than with Python, and if so, by how much.


I agree that would be great. I can't help thinking that you're holding Haskell up to a standard to which you do not hold Rust, Go or Julia. Have you asked their proponents for technical reports on Hacker News threads?


It is Haskell that holds itself to a higher standard. Rust, Julia and Go don't make claims that are anywhere near as extreme as those of Haskell's fans (e.g., see on this page the claim that Haskell's high abstractions -- optical profunctors or whatever -- are "load-bearing materials" to other languages' "mud"). Rust has presented itself as a safe alternative to C/C++; Go presents itself as a high-performance language with simple concurrency (or Java without the JVM); Julia presents itself as a high-performance alternative to Matlab, R or NumPy. But Haskell has sort of painted itself into a corner. Because the approach is so foreign and the learning curve so steep -- i.e., the adoption cost is high -- if it didn't make outlandish claims (like "if it compiles, it works") then no one in industry would consider using it. I don't need to know by how much Go lowers development costs because it makes no claims that it does, so I simply assume that it doesn't. That it is faster than Python, easy to learn in a day or two, and that it compiles down to a native executable -- these are all trivial to verify. If you will, all of its claims are supported by plentiful data.

But other than that: yes! It is trivial to verify that Rust indeed fulfills its claims, but it is not trivial to verify that that's enough -- namely, that the overall cost of developing in Rust is lower than in C++ -- which is why I wouldn't switch from C++ to Rust without seeing that the costs are at least comparable. I'm also eagerly waiting for data about Rust's concurrency approach.

I would also tell you this: while I find the pure-functional approach aesthetically unattractive for interactive programs, and am very much impressed by Clojure's and Erlang's approaches to state -- and while I've personally written a semi-popular Clojure library -- I have expressed skepticism about Clojure's suitability for large-scale projects. How well a language works in big software is something you simply cannot extrapolate from experience in small projects, and as much as I like Clojure, I have serious doubts about its applicability, and I would not use it to write a large system without the kind of data I expect from Haskell.

But all those languages (Erlang and Clojure included) have social advantages that make this data much easier to come by: they are used by people who aren't enamored with the language itself, who aren't PL enthusiasts, and who are much more goal-oriented. Their judgment stems mostly from how well things work, not from how interesting the language is. So articles with pertinent data (maybe not enough to risk a large project on, but certainly more than those you find for Haskell) are more abundant.

So I don't need to beg for technical reports as much because 1. no wild claims that aren't supported by data are made, and 2. there's data out there (for Go, Erlang, Clojure) or it's coming (Rust).


I think the simplest solution to all of this is for you to ignore the outlandish claims, or at the very least pay attention only to the claims of senior and respectable Haskell proponents, such as Simon Peyton Jones, Simon Marlow, Don Stewart, Lennart Augustsson, Neil Mitchell, etc. Otherwise you're simply allowing yourself to be gently trolled.

By the way "if it compiles it works" is supposed to be somewhat tongue in cheek, as is "avoid success at all costs".


That's not a solution, because my problem isn't with Haskell itself (and I do familiarize myself with the theory when I find it interesting, and I'm well aware that most of the researchers aren't like that), but with the tendency of people in the industry to market things so enthusiastically that they basically encourage a suspension of critical thinking (and this is doubly annoying when what they evangelize isn't something as cheap as a new profiler, but something as expensive as a whole new language with a new, unproven programming paradigm). Ignoring it wouldn't solve the problem. And in case you think this isn't an actual problem, consider that over-excitement about promises that couldn't be kept has caused at least two "research winters" in CS: one in AI and one in formal methods. The industry started believing its own hype, academia didn't mind the extra funding, but when the industry ended up disappointed, funding dried up completely and research slowed to a crawl.


So you think you are singlehandedly saving "the" industry from the insidious promotion of Haskell?


Singlehandedly? Absolutely not. Why, am I the only Haskell skeptic you know? There are more on this page alone. There are even PLT researchers who warn against overselling results in PLT, and especially typed FP.

And we're not saving the industry; the industry as a whole isn't suicidal and will never bet too much money on unproven technologies. The industry isn't so fragile, but research is. Given the amount of money in the industry, even small parts that do bet on unproven tech can cause a funding surge followed by a research winter when the disappointment hits. We're saving Haskell.

If those who adopt Haskell do it with clear vision and not by believing messianic claims, there would be no great disappointment and no research winter. Nothing is more dangerous to research than wild claims. You need to promise less and deliver more, not the other way around. And because the industry is competitive, it is actually quick to adopt ideas once they show an actual significant benefit -- i.e., once they're ready. The industry adopted garbage collection almost overnight; it was almost 40 years after it had been invented, but almost immediately after it had been productized well enough to provide a significant advantage.

If pure-FP is really the way to go in general-purpose programming, eventually the industry will adopt it in some well-productized form. Getting there does require some early adopters, but they're already here and overselling doesn't help to get more good ones. It helps to get precisely those adopters who will end up causing a research winter.

----

What bothers me in terms of harm to the industry is the waste caused by people switching from one language to another, rewriting libraries over and over and thinning out the effort. I think we're losing years of progress. But there's really nothing I can do about that because even if I can convince someone that a language that is 1% "better" than another does not justify the effort of porting libraries over, there are already too many languages and platforms around, and there's always the question of, so which few languages should we pick, and why should it be the ones you like and not the ones I like. Haskell actually does not contribute much to this problem because its adoption rate doesn't yet make a dent.



