
Exactly. In Java I couldn't even inline my functions, and I had zero control over the underlying memory layout.


> i couldn't even inline my functions

You could, manually :). Either way, if they're hot, the JIT inlines them.

One common trick for open-addressing maps in Java I don't see in your implementations is to keep an array of keys zipped with values (or two parallel arrays, one for keys, one for values) instead of an array of Entry objects. This improves locality and removes a level of indirection. Obviously more is needed for even more speed.

(Mandatory "in the future it will be better": Valhalla is coming in a few years, that's when we'll have much more control over the memory layout.)
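A minimal sketch of the parallel-array trick (all names are illustrative, not from the linked implementations; no resizing or deletion, and capacity must be a power of two):

```java
// ParallelArrayMap.java
// Open-addressing map storing keys and values in two parallel arrays
// instead of an array of Entry objects.
import java.util.Objects;

public final class ParallelArrayMap {
    private final Object[] keys;   // keys[i] and values[i] belong together
    private final Object[] values;
    private final int mask;

    public ParallelArrayMap(int capacityPowerOfTwo) {
        keys = new Object[capacityPowerOfTwo];
        values = new Object[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    public void put(Object key, Object value) {
        int i = Objects.hashCode(key) & mask;
        // Linear probing walks adjacent slots of the same flat array --
        // better locality than chasing Entry pointers scattered on the heap.
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1) & mask;
        }
        keys[i] = key;
        values[i] = value;
    }

    public Object get(Object key) {
        int i = Objects.hashCode(key) & mask;
        while (keys[i] != null) {
            if (keys[i].equals(key)) return values[i];
            i = (i + 1) & mask;
        }
        return null;
    }

    public static void main(String[] args) {
        ParallelArrayMap m = new ParallelArrayMap(16);
        m.put("a", 1);
        m.put("b", 2);
        System.out.println(m.get("a")); // 1
        System.out.println(m.get("c")); // null
    }
}
```

Until Valhalla, this is about as flat as the layout gets: the Entry indirection is gone, but the keys and values themselves are still pointers.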


I assumed they were inlined initially, until I profiled and found that one of the swap methods was not. I believe there's also a limit on the amount of bytecode to inline: no matter how hot a method is, if its bytecode exceeds a certain threshold it won't be inlined. I need to double-check, though.


Indeed: max bytecode count (FreqInlineSize and MaxInlineSize) and inlining depth (MaxInlineLevel). Your GC choice will also slightly modify the inlining decisions, and obviously GraalVM will behave completely differently.
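On HotSpot you can read the effective values of these flags at runtime through the diagnostic MXBean (a sketch; the bean is HotSpot-specific and not available on every JVM):

```java
// InlineFlags.java
// Prints the HotSpot inlining thresholds for the running JVM.
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public final class InlineFlags {
    public static void main(String[] args) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // MaxInlineSize: cold-call inline limit; FreqInlineSize: hot-call
        // limit; MaxInlineLevel: maximum inlining depth.
        for (String flag : new String[] {
                "MaxInlineSize", "FreqInlineSize", "MaxInlineLevel"}) {
            System.out.println(flag + " = "
                    + bean.getVMOption(flag).getValue());
        }
    }
}
```

To see the actual per-call-site decisions, run with `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining`.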


Is there a full list of the differences between the Graal JIT and C2? I was only aware of Graal having better partial escape analysis than C2's conservative EA.


It's simply a completely different implementation. Some of the optimization passes are the same, obviously, but overall it simply performs ... differently.


But inlining will not cause better performance in every case, will it?


Correct! Inlining obviously costs CPU time and code-cache space, and it makes the caller bigger, so the caller itself becomes less likely to be inlined. And if a deoptimization ever occurs, the inlining effort was pretty much wasted.


However, it must be pointed out that inlining enables further optimisations.

The tradeoffs are more complicated for a JIT, but for an AOT compiler it's one of the most important optimisations in the pipeline.
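A classic example of inlining enabling a follow-up optimisation (illustrative names; whether the JIT actually applies it depends on the compiler and warmup):

```java
// InliningEnables.java
// After the JIT inlines Point's constructor and accessors into distSq,
// escape analysis can prove neither Point escapes and scalar-replace both
// allocations, leaving straight-line arithmetic with no heap traffic.
// Across a real call boundary, none of that is possible.
public final class InliningEnables {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        int x() { return x; }
        int y() { return y; }
    }

    static int distSq(int ax, int ay, int bx, int by) {
        Point a = new Point(ax, ay); // candidate for scalar replacement
        Point b = new Point(bx, by); // once the accessors below are inlined
        int dx = a.x() - b.x();
        int dy = a.y() - b.y();
        return dx * dx + dy * dy;
    }

    public static void main(String[] args) {
        System.out.println(distSq(0, 0, 3, 4)); // 25
    }
}
```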


Out of curiosity, weren't your objects ending up neatly compacted after a GC cycle?


I didn't perform any deletes, and I've tried to keep everything clean without allocating intermediary objects. But one of the pitfalls is that I can't easily play with the memory layout in Java. For example, in the entries-and-bins implementation I had to keep two separate arrays. In C I would have kept a single array, with a compact zone and a sparse zone, iterating through it using an offset.
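For what it's worth, the single-array layout can be approximated in Java when both zones hold the same primitive type (a sketch with assumed names, not the commenter's code; with differently-typed zones, such as entries vs. bins, you are indeed forced into two arrays):

```java
// SingleArrayZones.java
// One backing array split into a compact zone [0, split) and a sparse zone
// [split, length), addressed via an offset -- the C-style layout described
// above.
public final class SingleArrayZones {
    private final long[] data;
    private final int split; // start of the sparse zone

    public SingleArrayZones(int compactSize, int sparseSize) {
        data = new long[compactSize + sparseSize];
        split = compactSize;
    }

    long compactGet(int i)          { return data[i]; }
    void compactSet(int i, long v)  { data[i] = v; }

    long sparseGet(int i)           { return data[split + i]; }
    void sparseSet(int i, long v)   { data[split + i] = v; }

    public static void main(String[] args) {
        SingleArrayZones z = new SingleArrayZones(4, 8);
        z.compactSet(0, 42);
        z.sparseSet(0, 7);
        System.out.println(z.compactGet(0)); // 42
        System.out.println(z.sparseGet(0));  // 7
    }
}
```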



