
Huh? It works in C (I just tried) and I bet in most other languages.


Works in JS as well. Though I have never seen anyone put whitespace around field accesses like that, in any language.


Newlines are pretty common. Especially in chained cases. No?


I guess I forgot about the newline case. I had just never seen anyone put spaces in like that without newlines, so I figured it didn't exist. Strange that it's even allowed.


That’s because C-like languages don’t distinguish newlines from other whitespace. If you want

  foo()
    .filter(...)
    .map(...)
then you have to allow

  foo() .filter(...) .map(...)
as well, as a matter of principle.


But this doesn't hold absolutely for JavaScript, since although

    (1) foo()
        .bar()
is equivalent to

    (2) foo() .bar()
there is a difference between

    (3) foo()
        bar()
and

    (4) foo() bar()
((3) is valid code, because automatic semicolon insertion ends the statement at the newline, whereas (4) is not).

But trying to argue that (2) should be syntactically invalid would be a very hard ask indeed.


Yes. Though even Python, which does treat newlines differently from other whitespace, allows `object . someMethod` with spaces on either side.
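
A quick way to check that claim, as a minimal sketch: parse both spellings with the standard `ast` module and compare the resulting trees. The names `obj` and `some_method` are placeholders for illustration only.

```python
import ast

def same_parse(a: str, b: str) -> bool:
    """Return True if both source strings parse to the same expression tree."""
    # ast.dump leaves out line/column attributes by default,
    # so only the structure of the two trees is compared.
    return ast.dump(ast.parse(a, mode="eval")) == ast.dump(ast.parse(b, mode="eval"))

# Spaces on either side of the dot do not change the parse.
spaced_matches = same_parse("obj . some_method", "obj.some_method")

# Inside parentheses a newline before the dot is accepted too,
# which is how chained calls are usually wrapped in Python.
newline_matches = same_parse("(obj\n    .some_method)", "obj.some_method")

print(spaced_matches, newline_matches)
```

Both comparisons should come out equal, since whitespace around the dot is dropped by the tokenizer before the parser ever sees it.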


Works in Python, too.


The spaces are mandatory in Python, no?

    >>> 2020 . to_bytes(4, 'little')
    b'\xe4\x07\x00\x00'
    >>> 2020.to_bytes(4, 'little')
      File "<stdin>", line 1
        2020.to_bytes(4, 'little')
             ^
    SyntaxError: invalid syntax
JavaScript also requires the spaces:

    > 2020 . toFixed(2)
    '2020.00'
    > 2020.toFixed(2)
    2020.toFixed(2)
    ^^^^^

    Uncaught SyntaxError: Invalid or unexpected token


Numeric literals are a special case (in most languages), since the dot could also be a decimal separator. You have to use a space or parentheses to disambiguate.

But for decimals there is no ambiguity:

  >>> 2020.50.is_integer()
  False
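
The cases above can be checked without a REPL. This is a small sketch that asks `compile()` whether each spelling is accepted as an expression; the helper name `parses` is made up for this example.

```python
def parses(source: str) -> bool:
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

candidates = [
    "2020.to_bytes(4, 'little')",    # '2020.' is lexed as a float, then a name follows
    "2020 .to_bytes(4, 'little')",   # the space ends the integer literal before the dot
    "(2020).to_bytes(4, 'little')",  # parentheses end the literal as well
    "2020.50.is_integer()",          # the literal already has its decimal point
    "2020..is_integer()",            # '2020.' is a float, the second dot is attribute access
]

for source in candidates:
    print(f"{source!r:34} {'ok' if parses(source) else 'SyntaxError'}")
```

Only the first spelling should be rejected; the other four give the lexer an unambiguous place to end the number.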


You say disambiguate, but there is no ambiguity. Only one parse could possibly be valid.


The lexer could disambiguate, but that would require lookahead. I believe Python has a deliberate policy of keeping the parser simple.

But I just noticed C# supports this, so it is not the same for all languages. Java doesn't, but then again you can't call methods on numbers in Java anyway.


It's a real lexing ambiguity because '0.' is a valid numeric literal (so '0..toString()' parses ok, somewhat counterintuitively). In principle, yes, you could lex '.' as a separate token even in numeric literals and have the parser figure it all out.


No, it is not a lexing ambiguity. You just need one extra character of lookahead after encountering a decimal point.


Clearly one character of lookahead is not sufficient, because of e.g. '0. toString()' (note the space). There's no question that the lexer could in principle disambiguate with unbounded lookahead, but it would be a bit hacky, as you'd effectively be implementing part of the parser in the lexer (by attempting to figure out if it was a method call, which is really the parser's job).

So basically, you could easily write a parser that allowed '0.toString()', but you'd either have to piece numeric literals together in the parser or add nasty hacks to the lexer.


> There's no question that the lexer could in principle disambiguate with unbounded lookahead, but it would be a bit hacky, as you'd effectively be implementing part of the parser in the lexer (by attempting to figure out if it was a method call, which is really the parser's job).

This is actually not hacky. It's just a rule that the "." cannot be followed by [ \t]*\w, which is a simple negative lookahead assertion. Replace \w with whatever you use at the start of identifiers.

It is extremely common for languages to have corner cases like this in the lexer to make the language more usable. For example, consider the rules in JavaScript or Go concerning where you can put line breaks. Or the rules for JavaScript concerning regular expression literals, which must be disambiguated from division.

> So basically, you could easily write a parser that allowed '0.toString()', but you'd either have to piece numeric literals together in the parser or add nasty hacks to the lexer.

This is factually incorrect. As I explained, you would only need one character of lookahead. There is no need to parse "0. toString()" successfully. If you wanted to parse "0. toString()" correctly, you could use unbounded lookahead, which is fairly simple in practice (speaking as a sometimes parser writer). I don't get why you say it is hacky; this is all just a bunch of regular expression stuff (in the traditional sense of "regular").
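
To make the negative-lookahead rule concrete, here is a toy tokenizer in Python. It is only a sketch under stated assumptions: the token set is invented for this example, identifiers start with `[A-Za-z_]`, exponents such as `1.e5` are ignored, and it is not how any real JavaScript or Python lexer is written.

```python
import re

# A number may absorb a trailing "." only if that dot is NOT followed by
# optional spaces/tabs and then an identifier start character.
TOKEN = re.compile(
    r"""
      (?P<number> \d+ (?: \. (?![ \t]*[A-Za-z_]) \d* )? )
    | (?P<name>   [A-Za-z_]\w* )
    | (?P<punct>  [.()] )
    | (?P<space>  [ \t]+ )
    """,
    re.VERBOSE,
)

def tokenize(source: str) -> list[tuple[str, str]]:
    """Split `source` into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for example in ["0.5", "0.toString()", "0. toString()", "0..toString()"]:
    print(f"{example:16} {tokenize(example)}")
```

With this rule, `0.toString()` and `0. toString()` both lex as the number `0` followed by a separate `.` token, while `0.5` stays a single number and `0..toString()` still lexes `0.` as the literal. The `[ \t]*` part is what lets the lookahead skip an arbitrary run of spaces, which is the "unbounded lookahead" the thread is arguing about.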


>If you wanted to parse "0. toString()" correctly, you could use unbounded lookahead

Right, which is what I said. If you agree that unbounded lookahead is required then we don't really disagree, except on the somewhat subjective question of how 'hacky' that is.

If I understand correctly, you suggest that unbounded lookahead could be avoided by allowing '0.toString()' but not '0. toString()', while still allowing both '(0).toString()' and '(0). toString()' and both 'foo.bar' and 'foo. bar'. That would produce highly counterintuitive results in some instances:

    Parsed as one expression:
    {}.
      foo

    Parsed as two statements:
    0.
      toString()
But again, it is really a subjective judgment. Obviously you could modify JavaScript in this way, and on that point there is no disagreement.


Wow, I never knew Python supported dots after numbers. Thanks!


But prefer (55).foo() as you shouldn't expect your reader to have memorized the Python grammar.



