When writing a GPU kernel, is it a bad idea to rely on constant propagation?
For example, if I have the kernel helper function u01 from PhiloxRNG.jl:
@inline function u01(::Type{Float32}, u::UInt32)::Float32
    fma(Float32(u), Float32(2)^(-32), Float32(2)^(-33))
end
Is it better to instead do:
const _f32_2_to_the_neg_32 = Float32(2)^(-32)
const _f32_2_to_the_neg_33 = Float32(2)^(-33)
@inline function u01(::Type{Float32}, u::UInt32)::Float32
    fma(Float32(u), _f32_2_to_the_neg_32, _f32_2_to_the_neg_33)
end
Is that because, in the first case, the ^ would otherwise have to be computed on the GPU if it is not constant-folded at compile time?
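One way to investigate this without a GPU is to compare the optimized IR of the two variants on the host. This is just a sketch; `u01_a`, `u01_b`, `K32`, and `K33` are hypothetical names, not from PhiloxRNG.jl. If the compiler folds the power, both variants should compile down to a single fma with literal constants; otherwise variant A would retain a runtime pow call.

```julia
# Variant A: constants written as powers inside the function body.
u01_a(u::UInt32) = fma(Float32(u), Float32(2)^(-32), Float32(2)^(-33))

# Variant B: powers evaluated once, at global scope, as in the second snippet.
const K32 = Float32(2)^(-32)
const K33 = Float32(2)^(-33)
u01_b(u::UInt32) = fma(Float32(u), K32, K33)

# Both variants should agree bit-for-bit on any input.
@assert u01_a(0x00000000) == u01_b(0x00000000)
@assert u01_a(0xffffffff) == u01_b(0xffffffff)

# To see whether a pow call survives in variant A, inspect the IR:
# code_llvm(u01_a, Tuple{UInt32})
# code_llvm(u01_b, Tuple{UInt32})
```

For device code the same kind of inspection is possible through the GPU backend's reflection macros rather than plain `code_llvm`.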
I see that the C++ version of this function also needed some workarounds to make this work.
Background: I am trying to understand why https://github.com/JuliaGPU/OpenCL.jl/pull/428 was needed, but this seems like a more general question about how to write GPU kernels in Julia.
Last updated: Apr 21 2026 at 06:18 UTC