But what about other SIMD extensions? Are all packages really distributed with the expectation that, for example, SSE5 and similar are supported?


Most of the SSE3, SSSE3, SSE4.1, and SSE4.2 instructions (there is no SSE5 in any released processor) aren't particularly amenable to automatic vectorization: they're mostly horizontal vector operations or oddball instructions that are pretty task-specific (hi, PCMPESTRI). You might see them come up in SLP vectorization, but in my last experience with LLVM's SLP vectorizer, it did a poor job of taking advantage of these kinds of instructions anyway.

For hot kernels (say, memcpy), many projects definitely do ship several different implementations and pick the version best suited to the CPU they're actually running on. See https://sourceware.org/git/?p=glibc.git;a=tree;f=sysdeps/x86... for the different variants of common functions in glibc.
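
To make the "several variants, pick one at run time" idea concrete, here is a minimal sketch of how a project might do it by hand. It is not how glibc does it (glibc selects its variants through IFUNC resolvers); this just uses a plain function pointer. The kernel (`max_i32`), the helper names, and the `HAVE_X86_DISPATCH` macro are all made up for illustration, and it assumes GCC or Clang, since it relies on their `__builtin_cpu_supports` builtin and `target` attribute.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: GCC or Clang on x86. Anywhere else only the fallback is built. */
#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#define HAVE_X86_DISPATCH 1
#include <smmintrin.h> /* SSE4.1 intrinsics */
#else
#define HAVE_X86_DISPATCH 0
#endif

/* Portable fallback: maximum of n >= 1 signed 32-bit values. */
static int32_t max_i32_generic(const int32_t *v, size_t n) {
    int32_t best = v[0];
    for (size_t i = 1; i < n; i++)
        if (v[i] > best) best = v[i];
    return best;
}

#if HAVE_X86_DISPATCH
/* SSE4.1 variant: PMAXSD (_mm_max_epi32) compares four lanes per step.
   The target attribute lets this one function use SSE4.1 even when the
   rest of the file is compiled for baseline x86. */
__attribute__((target("sse4.1")))
static int32_t max_i32_sse41(const int32_t *v, size_t n) {
    if (n < 4) return max_i32_generic(v, n);
    __m128i acc = _mm_loadu_si128((const __m128i *)v);
    size_t i = 4;
    for (; i + 4 <= n; i += 4)
        acc = _mm_max_epi32(acc, _mm_loadu_si128((const __m128i *)(v + i)));
    /* Horizontal reduction: fold the four lanes down into lane 0. */
    acc = _mm_max_epi32(acc, _mm_shuffle_epi32(acc, _MM_SHUFFLE(1, 0, 3, 2)));
    acc = _mm_max_epi32(acc, _mm_shuffle_epi32(acc, _MM_SHUFFLE(2, 3, 0, 1)));
    int32_t best = _mm_cvtsi128_si32(acc);
    for (; i < n; i++) /* scalar tail for the leftover elements */
        if (v[i] > best) best = v[i];
    return best;
}
#endif

typedef int32_t (*max_i32_fn)(const int32_t *, size_t);

/* Ask the CPU what it supports and return the best available variant. */
static max_i32_fn select_max_i32(void) {
#if HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.1"))
        return max_i32_sse41;
#endif
    return max_i32_generic;
}

/* Public entry point: resolves the variant on first call.
   Kept simple on purpose; the lazy initialisation is not thread-safe. */
int32_t max_i32(const int32_t *v, size_t n) {
    static max_i32_fn impl;
    if (!impl) impl = select_max_i32();
    return impl(v, n);
}
```

Callers only ever see `max_i32`, so the binary still runs on a CPU without SSE4.1 and simply takes the generic path there. That is the reason a distribution can ship one package without assuming the newer extensions are present.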



