
There are various studies of defect density (bugs per 1000 lines of code) versus whatever factor someone thought would influence bugs. "Software defect density" seems to be the magic search term for finding actual studies instead of blog posts.
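For concreteness, the metric itself is just a ratio; a minimal sketch (the numbers are made up):

    def defect_density(defects: int, lines_of_code: int) -> float:
        # Defects per 1000 lines of code (KLOC).
        return defects / (lines_of_code / 1000)

    # e.g. 37 known bugs in a 25,000-line codebase -> 1.48 defects/KLOC
    print(defect_density(37, 25_000))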

I unfortunately can't find it, but I remember there being a study (by Microsoft?) of defect density for projects using different methodologies (scrum, TDD, whatever), which found that lines of code correlated with defect rate more strongly than anything else they tried. They took that as a failure; I've always taken it to mean that reducing lines of code is the most important first step (or, probably more accurately, reducing tokens of executable code; one-liners don't help and type annotations don't hurt).
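To make the "tokens, not lines" point concrete, here's a rough sketch (not from any study, just an illustration using Python's tokenize module; the example snippets are made up):

    import io
    import tokenize

    def executable_token_count(source: str) -> int:
        # Count tokens that contribute to executable code, ignoring
        # comments and pure layout tokens - a crude size measure that
        # one-liner golfing can't game the way line counts can.
        ignore = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
                  tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}
        tokens = tokenize.generate_tokens(io.StringIO(source).readline)
        return sum(1 for tok in tokens if tok.type not in ignore)

    one_liner = "x = [i * i for i in range(10) if i % 2 == 0]\n"
    spread_out = "x = []\nfor i in range(10):\n    if i % 2 == 0:\n        x.append(i * i)\n"
    # The one-liner has a quarter of the lines but not far fewer tokens.
    print(executable_token_count(one_liner), executable_token_count(spread_out))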

While attempting to find that study, I found a study [1] that claims to have found an empirical sweet spot in defect density as a function of module size - apparently 400 lines for assembly modules and 200 lines for Ada modules. Those are some weird languages to have numbers for, but maybe something else in that paper's family tree has something more relatable.
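Not the paper's method, but if you had per-module line and defect counts you could look for that kind of sweet spot with something like this (the bucket width and input format are assumptions, and the example numbers are invented):

    from collections import defaultdict

    def density_by_size_bucket(modules, bucket_width=100):
        # modules: list of (lines_of_code, defect_count) pairs.
        # Returns average defect density (defects per KLOC) per size
        # bucket; a sweet spot would show up as a minimum.
        buckets = defaultdict(lambda: [0, 0])  # bucket -> [defects, loc]
        for loc, defects in modules:
            bucket = (loc // bucket_width) * bucket_width
            buckets[bucket][0] += defects
            buckets[bucket][1] += loc
        return {b: d / (l / 1000)
                for b, (d, l) in sorted(buckets.items()) if l}

    # e.g. density_by_size_bucket([(180, 2), (210, 1), (420, 9)])
    #   -> roughly {100: 11.1, 200: 4.8, 400: 21.4}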

[1] http://www.cs.colostate.edu/~malaiya/p/denton_2000.pdf



Even reducing tokens isn't a very good measure - lines of code is just a proxy for the complexity of the implementation. I think a better measure (albeit a heuristic one) is how concise the implementation is relative to the complexity of the problem - or, in other words, the ratio of accidental to essential complexity.



