Hacker News

Why? Non-ASCII characters usually belong to translations, and translations usually don't belong in the codebase.


Documentation stored in your repo, unit tests, and comments. If your README.md includes code bracketed with backticks, you've got non-ASCII characters in your repo.

Things like Péter Rózsa or Kg.m² or ±0.5 PPM or C11 Standard §6.3.1.3/2 all work with my toolchain. Why mangle them into ASCII when there's no need to?


Maybe it works in your English codebase, but I'd be very hesitant about rolling it out everywhere. For example, Perl 6 supports Unicode in identifiers, so it's perfectly valid to write your code in Japanese. Another increasingly common example I've seen in Python is the use of emoji in command-line tools.
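
To make the identifier point concrete, here's a minimal sketch in Python 3 rather than Perl 6, since Python 3 also accepts non-ASCII letters in identifiers (PEP 3131). The function and variable names are made up purely for illustration:

```python
# Python 3 allows non-ASCII letters in identifiers (PEP 3131).
# All names below are hypothetical, chosen only to illustrate the point.

def 合計(数値):
    """Return the sum of the given numbers ("合計" means "total")."""
    return sum(数値)

価格 = [100, 250, 400]  # "価格" means "prices"
print(合計(価格))  # prints 750
```

It runs fine, but anyone who can't type or read those characters will struggle to maintain it, which is the reason for my hesitation.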


> For example, Perl 6 supports Unicode in identifiers, so it's perfectly valid to write your code in Japanese.

Raku/Perl 6 also has non-ASCII Unicode operators (they all have multi-character ASCII aliases, but the Unicode characters are usually more readable).


They definitely belong in comments though. Names, non-English languages and the like.


This (primarily American) attitude to encodings is why we're in this situation.


Just off the top of my head, test cases would be a valid reason to have non-ASCII in your codebase.
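
As a rough sketch of what I mean, in Python (`normalize_name` is a hypothetical helper invented for this example, not from any real codebase): a test that checks Unicode normalization is much easier to read with the expected string written out literally.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Hypothetical helper: trim whitespace and NFC-normalize a name."""
    return unicodedata.normalize("NFC", name.strip())


def test_normalize_name_composes_accents():
    # Input uses combining acute accents (U+0301) after plain "e" and "o".
    decomposed = "  Pe\u0301ter Ro\u0301zsa "
    # The expected value is written with precomposed non-ASCII literals.
    assert normalize_name(decomposed) == "Péter Rózsa"


test_normalize_name_composes_accents()
```

You could write the expected value with escapes too, but then the test no longer shows the reader what string it is actually checking.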



