Stream: helpdesk (published)

Topic: Detecting SIMD capability


view this post on Zulip Jakob Nybo Nissen (Jun 30 2023 at 18:18):

Is there currently a solid way of checking whether it's safe to precompile code making use of specific CPU instructions?
I have a package that uses llvmcall and picks which code to emit by checking which CPU instructions the host CPU supports. The problem is that when the user sets export JULIA_CPU_TARGET="generic", it produces the wrong code and crashes.
How does LoopVectorization handle this? cc @chriselrod

view this post on Zulip Mason Protter (Jun 30 2023 at 18:35):

Looks like the JuliaSIMD packages use https://github.com/JuliaSIMD/HostCPUFeatures.jl

view this post on Zulip Mason Protter (Jun 30 2023 at 18:38):

e.g.

julia> using HostCPUFeatures

julia> HostCPUFeatures.simd_integer_register_size()
static(32)

julia> HostCPUFeatures.pick_vector_width(Float64)
static(4)

julia> HostCPUFeatures.pick_vector_width(Float16)
static(8)

view this post on Zulip Jakob Nybo Nissen (Jun 30 2023 at 18:51):

Looks like it, but then there is https://github.com/JuliaSIMD/HostCPUFeatures.jl/issues/12 in which it's not quite clear if it was fixed.
Anyway, I can try it!

view this post on Zulip Mason Protter (Jun 30 2023 at 18:54):

It seems like the PR that closed the issue should fix it, right?

view this post on Zulip Jakob Nybo Nissen (Jun 30 2023 at 19:02):

Oh yeah it does seem to work actually :smiley:

view this post on Zulip Mason Protter (Jun 30 2023 at 19:03):

nice!

view this post on Zulip Jakob Nybo Nissen (Jun 30 2023 at 19:36):

Never mind, it still doesn't work

julia> using ScanByte
[ Info: Precompiling ScanByte [7b38b023-a4d7-4c5e-8d43-3f3097f304eb]
LLVM ERROR: Do not know how to split the result of this operator!

view this post on Zulip Mason Protter (Jun 30 2023 at 19:47):

Ah that's too bad. I'd maybe open a new issue or try to get that old one re-opened

view this post on Zulip Sukera (Jul 01 2023 at 08:46):

it's basically branching on what Base.Sys claims

view this post on Zulip Jakob Nybo Nissen (Jul 01 2023 at 09:24):

As far as I can tell, it is just not possible to robustly write ISA-specific code in Julia, because there is no documented or even semi-legit looking method of checking which code can be validly emitted :frown: Too bad

view this post on Zulip Sukera (Jul 01 2023 at 09:25):

yup

view this post on Zulip Sukera (Jul 01 2023 at 09:26):

there's no API checking for the kind of architecture you're actually being compiled for

view this post on Zulip Sukera (Jul 01 2023 at 09:26):

Base.Sys only tells you the host system, after all
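
To illustrate the host-versus-target gap, here is a minimal sketch that prints what Base.Sys reports next to the target the user asked for. The helper name `host_and_requested_target` is hypothetical; `Sys.ARCH`, `Sys.CPU_NAME` and the `JULIA_CPU_TARGET` environment variable are the only real names used, and the sketch does not claim to recover the target Julia is actually compiling for.

```julia
# Hypothetical helper: collect what Base.Sys says about the host, plus the
# CPU target requested via the environment (if any).
function host_and_requested_target(env = ENV)
    return (
        arch = Sys.ARCH,           # host architecture, a Symbol
        cpu_name = Sys.CPU_NAME,   # host CPU name, a String
        requested = get(env, "JULIA_CPU_TARGET", nothing),  # `nothing` if unset
    )
end

info = host_and_requested_target()
println("host arch:        ", info.arch)
println("host CPU:         ", info.cpu_name)
println("requested target: ", something(info.requested, "(unset)"))
```

The first two fields describe the machine Julia is running on and do not change when `JULIA_CPU_TARGET` is set, which is the mismatch discussed above.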

view this post on Zulip chriselrod (Jul 02 2023 at 00:34):

Missed this --
HostCPUFeatures can only detect the host's features (which is why it's not called TargetCPUFeatures.jl).
We don't have an API for actually detecting Julia's LLVM's JIT target, even though one has easy/direct access to this from within LLVM. Which is another one of the motivations for working as an LLVM pass; then things like sys image multiversioning, setting pkg image targets, etc should all "just work", as they do with every other LLVM-level optimization.

view this post on Zulip chriselrod (Jul 02 2023 at 00:35):

So the JuliaSIMD approach is to generate code for the host, and on __init__ try to detect if the host is wrong. If so, it'll try and @eval some methods to fix things, and invalidate any cached compiled code that may now be invalid.

view this post on Zulip chriselrod (Jul 02 2023 at 00:36):

(This is why all queries are functions, rather than global consts, so we get backedges)
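
A minimal sketch of that pattern, not the actual HostCPUFeatures code: every name in the module below is hypothetical, and host detection is simulated with a `Ref` rather than a real CPU query. It only shows why a query defined as a function can be corrected at load time, since redefining the method invalidates callers that were compiled against the old answer.

```julia
module FeatureQuerySketch

# Hypothetical stand-in for real host detection. A real package would query
# the CPU; here the "detected" value is just read from a Ref.
const SIMULATED_HOST_BYTES = Ref(32)
detect_register_bytes() = SIMULATED_HOST_BYTES[]

# The query is a function, not a global const, so callers record a backedge
# to it. This is the value that was baked in when the code was first compiled.
register_bytes() = 16

# Example caller whose compiled code depends on the query.
lanes(::Type{T}) where {T} = div(register_bytes(), sizeof(T))

# What a package would run from its `__init__` at load time.
function refresh!()
    actual = detect_register_bytes()
    if actual != register_bytes()
        # Redefining the method invalidates code compiled with the old value.
        @eval register_bytes() = $actual
    end
    return nothing
end

end # module

FeatureQuerySketch.refresh!()
println(FeatureQuerySketch.lanes(Float64))
```

With the simulated value of 32 bytes, `lanes(Float64)` is recompiled after `refresh!()` and reflects the new register size instead of the baked-in 16.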

view this post on Zulip chriselrod (Jul 02 2023 at 00:37):

Even though sysimages for generic targets have existed for ages now, I didn't pay enough attention to the fact that precompilation like that could fail. I guess we weren't running/actually compiling enough code, even for sysimage use cases, to run into that.

view this post on Zulip chriselrod (Jul 02 2023 at 00:42):

The most correct thing to do would be to get builtin support for the queries:

julia> have_fma(::Type{T}) where {T} = Core.Intrinsics.have_fma(T)
have_fma (generic function with 1 method)

julia> have_fma(Float64)
true

julia> Core.Intrinsics.have_fma(Float64)
false

Note that the call actually needs to be compiled to avoid getting the safe/conservative/possibly pessimizing answer.

view this post on Zulip chriselrod (Jul 02 2023 at 00:43):

But failing that, as discussed in the issue, I would set it up to check on precompilation whether the ENV variable has been set. If so, load the appropriate generic set of capabilities.
This is pretty bad, but better than nothing.
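
A rough sketch of that fallback, under stated assumptions: the function names and the byte values are hypothetical, and treating any target other than "native" as "use the generic capabilities" is a deliberately conservative simplification, since `JULIA_CPU_TARGET` can hold more complex target strings.

```julia
# Hypothetical check: has the user asked for something other than the host?
# An unset variable is treated the same as "native".
function wants_conservative_target(env = ENV)
    target = get(env, "JULIA_CPU_TARGET", "native")
    return target != "native"
end

# Hypothetical capability query: fall back to a small generic value when a
# non-native target was requested, otherwise use the host-detected value.
function register_bytes_for(env = ENV; host_bytes = 32, generic_bytes = 16)
    return wants_conservative_target(env) ? generic_bytes : host_bytes
end

println(register_bytes_for(Dict("JULIA_CPU_TARGET" => "generic")))
println(register_bytes_for(Dict{String,String}()))
```

Passing the environment as an argument keeps the check easy to exercise without touching the real `ENV`.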

view this post on Zulip chriselrod (Jul 02 2023 at 00:45):

Note also that LLVM is able to legalize many, but not all, generic intrinsics.
It probably cannot legalize instruction-specific intrinsics.
And some generic ones can also cause aborts, until LLVM actually implements legalizing fallbacks.

For example, the matmul intrinsics, or the compressed stores/expand loads on older versions of LLVM (recent LLVM versions can scalarize these; I haven't checked if they can scalarize or vectorize the matmul intrinsics, but maybe; these are obviously aimed at tensor cores).

view this post on Zulip Jakob Nybo Nissen (Jul 02 2023 at 07:03):

Ok - I'll try to make a PR to HostCPUFeatures today or in the next few days, but to me it looks like a relatively easy feature to add to Base itself. I'll also make an issue in Julia itself.

view this post on Zulip Sukera (Jul 02 2023 at 07:07):

wait, which feature is easy to add to Base?

view this post on Zulip Sukera (Jul 02 2023 at 07:07):

"detecting SIMD capability" can mean a whole lot of things

view this post on Zulip chriselrod (Jul 02 2023 at 14:51):

Sukera said:

"detecting SIMD capability" can mean a whole lot of things

I haven't looked at how have_fma is implemented, but at least there's a template we should be able to follow.


Last updated: Oct 02 2023 at 04:34 UTC