Huh? It works in C (I just tried) and I bet in most other languages.

agency · on Sept 21, 2020

Works in JS as well. Though I have never seen anyone put whitespace around field accesses like that, in any language.

taeric · on Sept 21, 2020

Newlines are pretty common. Especially in chained cases. No?

platz · on Sept 21, 2020

I guess I forgot about the newline case, and had just never seen anyone put spaces in like that without newlines, so I figured it didn't exist. Strange that it's even allowed.

codeflo · on Sept 21, 2020

That’s because C-like languages don’t distinguish newlines from other whitespace. If you want

  foo()
    .filter(...)
    .map(...)

then you have to allow

  foo() .filter(...) .map(...)

as well, as a matter of principle.

squiggleblaz · on Sept 21, 2020

But this doesn't hold absolutely in JavaScript, since although

    (1) foo()
        .bar()

is equivalent to

    (2) foo() .bar()

there is a difference between

    (3) foo()
        bar()

and

    (4) foo() bar()

((3) is valid code, because a semicolon is inserted automatically at the line break, whereas (4) is not).

But trying to argue that (2) should be syntactically invalid would be a very hard ask indeed.

eru · on Sept 21, 2020

Yes. Though even Python, which does treat newlines differently from other whitespace, allows `object . someMethod` with spaces on either side.
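A quick way to check this in Python. The variable names here are only for illustration; the second half assumes the usual rule that newlines are insignificant inside parentheses.

```python
greeting = "hello"

# Whitespace on either side of the dot is not significant.
spaced = greeting . upper ()

# Inside parentheses newlines are not significant either, so a chain can wrap.
chained = (
    greeting
    .upper()
    .replace("L", "l")
)
```

Outside parentheses, the wrapped form would be a syntax error, which is the sense in which Python treats newlines differently.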

eru · on Sept 21, 2020

Works in Python, too.

klodolph · on Sept 21, 2020

The spaces are mandatory in Python, no?

    >>> 2020 . to_bytes(4, 'little')
    b'\xe4\x07\x00\x00'
    >>> 2020.to_bytes(4, 'little')
      File "<stdin>", line 1
        2020.to_bytes(4, 'little')
             ^
    SyntaxError: invalid syntax

JavaScript also requires the spaces:

    > 2020 . toFixed(2)
    '2020.00'
    > 2020.toFixed(2)
    2020.toFixed(2)
    ^^^^^

    Uncaught SyntaxError: Invalid or unexpected token

goto11 · on Sept 21, 2020

Numeric literals are a special case (in most languages), since the dot could also be a decimal separator. You have to use a space or parentheses to disambiguate.

But for decimals there is no ambiguity:

  >>> 2020.50.is_integer()
  False
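A small sketch for checking which spellings Python accepts, without relying on the REPL. The helper `is_valid_expression` is a name made up for this example; it only asks the built-in `compile` whether the text parses as an expression.

```python
def is_valid_expression(source):
    """Return True if Python accepts `source` as an expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


candidates = [
    "2020.to_bytes(4, 'little')",    # the dot is consumed by the literal
    "2020 . to_bytes(4, 'little')",  # the space ends the literal first
    "(2020).to_bytes(4, 'little')",  # the parentheses end the literal first
    "2020.50.is_integer()",          # the first dot is the decimal point
]
results = {source: is_valid_expression(source) for source in candidates}

for source, accepted in results.items():
    print(f"{source!r}: {'ok' if accepted else 'SyntaxError'}")
```

Only the first candidate should be rejected; the space and the parentheses both work as ways to end the integer literal before the dot.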

klodolph · on Sept 21, 2020

You say disambiguate, but there is no ambiguity. Only one parse could possibly be valid.

goto11 · on Sept 21, 2020

The lexer could disambiguate but that would require lookahead. I believe Python has a deliberate policy of keeping the parser simple.

But I just noticed C# supports this, so it is not the same for all languages. Java doesn't, but then again you can't call methods on numbers in Java anyway.

foldr · on Sept 21, 2020

It's a real lexing ambiguity because '0.' is a valid numeric literal (so '0..toString()' parses ok, somewhat counterintuitively). In principle, yes, you could lex '.' as a separate token even in numeric literals and have the parser figure it all out.
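The comment above is about JavaScript, but the same trailing-dot literal exists in Python, so the double-dot form can be tried there as well. This is only an illustration in a different language, using `float.is_integer` in place of `toString`.

```python
# A trailing dot with no digits after it still makes a float literal.
trailing_dot = 0.

# The first dot below belongs to the literal; the second is attribute access.
double_dot = 0..is_integer()

print(type(trailing_dot).__name__, double_dot)
```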

klodolph · on Sept 21, 2020

No, it is not a lexing ambiguity. You just need one extra character of lookahead after encountering a decimal point.

foldr · on Sept 22, 2020

Clearly one character of lookahead is not sufficient, because of cases like '0. toString()' (note the space). There's no question that the lexer could in principle disambiguate with unbounded lookahead, but it would be a bit hacky, as you'd effectively be implementing part of the parser in the lexer (by attempting to figure out if it was a method call, which is really the parser's job).

So basically, you could easily write a parser that allowed '0.toString()', but you'd either have to piece numeric literals together in the parser or add nasty hacks to the lexer.

klodolph · on Sept 23, 2020

> There's no question that the lexer could in principle disambiguate with unbounded lookahead, but it would be a bit hacky, as you'd effectively be implementing part of the parser in the lexer (by attempting to figure out if it was a method call, which is really the parser's job).

This is actually not hacky. It's just a rule that the "." cannot be followed by [ \t]*\w, which is a simple negative lookahead assertion. Replace \w with whatever you use at the start of identifiers.
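The rule can be sketched as a toy regex-based lexer in Python. Everything here is hypothetical (the `TOKEN` pattern, the `lex` function, the tiny token set); it is not how any real JavaScript or Python lexer is written, only an illustration of the negative lookahead, with `[A-Za-z_]` assumed as the start of an identifier.

```python
import re

# Toy token grammar for illustration only.
TOKEN = re.compile(
    r"""
    (?P<NUMBER>\d+(?:\.(?![ \t]*[A-Za-z_])\d*)?)  # dot joins the literal unless an identifier follows
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<OP>[().])
  | (?P<WS>[ \t]+)
    """,
    re.VERBOSE,
)


def lex(source):
    """Split `source` into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for text in ["0.toString()", "0. toString()", "0..toString()", "0.5"]:
    print(text, "->", lex(text))
```

With the `[ \t]*` in the lookahead, '0.toString()' and '0. toString()' both lex as the number `0` followed by a separate `.`; dropping `[ \t]*` gives the one-character variant that handles only the first form.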

It is extremely common for languages to have corner cases like this in the lexer to make the language more usable. For example, consider the rules in JavaScript or Go concerning where you can put line breaks. Or the rules for JavaScript concerning regular expression literals, which must be disambiguated from division.

> So basically, you could easily write a parser that allowed '0.toString()', but you'd either have to piece numeric literals together in the parser or add nasty hacks to the lexer.

This is factually incorrect. As I explained, you would only need one character of lookahead. There is no need to parse "0. toString()" successfully. If you wanted to parse "0. toString()" correctly, you could use unbounded lookahead, which is fairly simple in practice (speaking as a sometimes parser writer). I don't get why you say it is hacky; this is all just a bunch of regular expression stuff (in the traditional sense of "regular").

foldr · on Sept 24, 2020

>If you wanted to parse "0. toString()" correctly, you could use unbounded lookahead

Right, which is what I said. If you agree that unbounded lookahead is required then we don't really disagree, except on the somewhat subjective question of how 'hacky' that is.

If I understand correctly, you suggest that unbounded lookahead could be avoided by allowing '0.toString()' but not '0. toString()', while still allowing both '(0).toString()' and '(0). toString()' and both 'foo.bar' and 'foo. bar'. That would produce highly counterintuitive results in some instances:

    Parsed as one expression:
    {}.
      foo

    Parsed as two statements:
    0.
      toString()

But again, it is really a subjective judgment. Obviously you could modify JavaScript in this way, and on that point there is no disagreement.

thomasahle · on Sept 21, 2020

Wow, I never knew Python supported dots after numbers. Thanks!

ben509 · on Sept 23, 2020

But prefer (55).foo() as you shouldn't expect your reader to have memorized the Python grammar.