
Exactly. In Java I couldn't even inline my functions, and I had zero control over the underlying memory layout.


> i couldn't even inline my functions

You could, manually :). Either way, if they're hot, the JIT inlines them.

One common trick for open-addressing maps in Java I don't see in your implementations is to keep an array of keys zipped with values (or two parallel arrays, one for keys, one for values) instead of an array of Entry objects. This improves locality and removes a level of indirection. Obviously more is needed for even more speed.

(Mandatory "in the future it will be better": Valhalla is coming in a few years, that's when we'll have much more control over the memory layout.)
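A minimal sketch of the parallel-array trick (all names are illustrative, not from the linked implementations; no resizing or deletion, and capacity must be a power of two):

```java
// ParallelArrayMap.java
// Open-addressing map storing keys and values in two parallel arrays
// instead of an array of Entry objects.
import java.util.Objects;

public final class ParallelArrayMap {
    private final Object[] keys;   // keys[i] and values[i] belong together
    private final Object[] values;
    private final int mask;

    public ParallelArrayMap(int capacityPowerOfTwo) {
        keys = new Object[capacityPowerOfTwo];
        values = new Object[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    public void put(Object key, Object value) {
        int i = Objects.hashCode(key) & mask;
        // Linear probing walks adjacent slots of the same flat array --
        // better locality than chasing Entry pointers scattered on the heap.
        while (keys[i] != null && !keys[i].equals(key)) {
            i = (i + 1) & mask;
        }
        keys[i] = key;
        values[i] = value;
    }

    public Object get(Object key) {
        int i = Objects.hashCode(key) & mask;
        while (keys[i] != null) {
            if (keys[i].equals(key)) return values[i];
            i = (i + 1) & mask;
        }
        return null;
    }

    public static void main(String[] args) {
        ParallelArrayMap m = new ParallelArrayMap(16);
        m.put("a", 1);
        m.put("b", 2);
        System.out.println(m.get("a")); // 1
        System.out.println(m.get("c")); // null
    }
}
```

Until Valhalla, this is about as flat as the layout gets: the Entry indirection is gone, but the keys and values themselves are still pointers.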


I assumed they were inlined initially, until I profiled and found that one of the swap methods was not. I believe there's also a limit on the amount of bytecode to inline: no matter how hot a method is, if its bytecode exceeds a certain threshold it won't be inlined. I need to double-check, though.


Indeed: max bytecode count (FreqInlineSize and MaxInlineSize) and inlining depth (MaxInlineLevel). Your GC choice will also slightly modify the inlining decisions, and obviously GraalVM will behave completely differently.
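On HotSpot you can read the effective values of these flags at runtime through the diagnostic MXBean (a sketch; the bean is HotSpot-specific and not available on every JVM):

```java
// InlineFlags.java
// Prints the HotSpot inlining thresholds for the running JVM.
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public final class InlineFlags {
    public static void main(String[] args) {
        HotSpotDiagnosticMXBean bean =
            ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // MaxInlineSize: cold-call inline limit; FreqInlineSize: hot-call
        // limit; MaxInlineLevel: maximum inlining depth.
        for (String flag : new String[] {
                "MaxInlineSize", "FreqInlineSize", "MaxInlineLevel"}) {
            System.out.println(flag + " = "
                    + bean.getVMOption(flag).getValue());
        }
    }
}
```

To see the actual per-call-site decisions, run with `-XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining`.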


Is there a full list of the differences between the Graal JIT and C2? I was only aware of Graal having better partial escape analysis than C2's conservative EA.


It's simply a completely different implementation. Some of the optimization passes are the same, obviously, but overall it simply performs ... differently.


But inlining will not cause better performance in every case, will it?


Correct! Inlining obviously costs CPU time and code-cache space, and it makes the caller bigger, so the caller itself becomes less likely to be inlined. And if a deoptimization ever occurs, the inlining effort was pretty much wasted.


However, it must be pointed out that inlining enables further optimisations.

The tradeoffs are more complicated for a JIT, but for an AOT compiler it's one of the most important optimisations in the pipeline.
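A classic example of inlining enabling a follow-up optimisation (illustrative names; whether the JIT actually applies it depends on the compiler and warmup):

```java
// InliningEnables.java
// After the JIT inlines Point's constructor and accessors into distSq,
// escape analysis can prove neither Point escapes and scalar-replace both
// allocations, leaving straight-line arithmetic with no heap traffic.
// Across a real call boundary, none of that is possible.
public final class InliningEnables {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
        int x() { return x; }
        int y() { return y; }
    }

    static int distSq(int ax, int ay, int bx, int by) {
        Point a = new Point(ax, ay); // candidate for scalar replacement
        Point b = new Point(bx, by); // once the accessors below are inlined
        int dx = a.x() - b.x();
        int dy = a.y() - b.y();
        return dx * dx + dy * dy;
    }

    public static void main(String[] args) {
        System.out.println(distSq(0, 0, 3, 4)); // 25
    }
}
```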


Out of curiosity, weren't your objects ending up neatly compacted after a GC cycle?


I didn't perform any deletes, and I've tried to keep everything clean without allocating intermediary objects. But one of the pitfalls is that I can't easily play with the memory layout in Java. For example, in the entries-and-bins implementation I had to keep two separate arrays. In C I would have kept a single array, with a compact zone and a sparse zone, iterating through it using an offset.
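For what it's worth, the single-array layout can be approximated in Java when both zones hold the same primitive type (a sketch with assumed names, not the commenter's code; with differently-typed zones, such as entries vs. bins, you are indeed forced into two arrays):

```java
// SingleArrayZones.java
// One backing array split into a compact zone [0, split) and a sparse zone
// [split, length), addressed via an offset -- the C-style layout described
// above.
public final class SingleArrayZones {
    private final long[] data;
    private final int split; // start of the sparse zone

    public SingleArrayZones(int compactSize, int sparseSize) {
        data = new long[compactSize + sparseSize];
        split = compactSize;
    }

    long compactGet(int i)          { return data[i]; }
    void compactSet(int i, long v)  { data[i] = v; }

    long sparseGet(int i)           { return data[split + i]; }
    void sparseSet(int i, long v)   { data[split + i] = v; }

    public static void main(String[] args) {
        SingleArrayZones z = new SingleArrayZones(4, 8);
        z.compactSet(0, 42);
        z.sparseSet(0, 7);
        System.out.println(z.compactGet(0)); // 42
        System.out.println(z.sparseGet(0));  // 7
    }
}
```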



