I'm dabbling with some GPU kernel programming and hitting some issues with Float64s that crop up in the math. Specifically, I've got some integers, and when I divide them I get a Float64. E.g.

julia> /(Int32(4), Int32(5)) |> typeof
Float64

Any tips for handling this, such as an alternative to / that will keep the results as Float32? Do I just have to wrap every / operation with Float32(...)?
You might want to convert these integers to Float32 before performing the division; that is essentially what the method /(::Int32, ::Int32) does, except it uses float, which converts them to Float64.
# Convert to Float32 first so the division stays in single precision.
function d(a::Integer, b::Integer)
    Float32(a) / b
end

# Generic fallback: divide as-is for non-integer arguments.
function d(a, b)
    a / b
end
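With those definitions, the original example should come out as Float32:

julia> d(Int32(4), Int32(5))
0.8f0

julia> d(Int32(4), Int32(5)) |> typeof
Float32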
Alec said:

Any tips for handling this, such as an alternative to / that will keep the results as Float32?
Keeping the result as Float32 sounds inaccurate if neither argument was Float32 to begin with. Sometimes you can get away with reordering your computations, e.g. if x is Float32, change 4 / 5 * x to 4 * x / 5, but usually it's easiest to add explicit casts where needed or to use Float32 literals, e.g. 4f0 / 5 * x.
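A quick REPL check of the promotion behaviour, assuming x is Float32:

julia> x = 1.0f0;

julia> typeof(4 / 5 * x)    # 4 / 5 is Float64, so the product promotes
Float64

julia> typeof(4 * x / 5)    # reordered: stays Float32 throughout
Float32

julia> typeof(4f0 / 5 * x)  # Float32 literal: also stays Float32
Float32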
You also have the option to express the divisions as rationals, e.g. 4 // 5 * x. In the case of literals this is optimized to just a Float32 multiplication.
julia> f(x) = 4 // 5 * x
f (generic function with 1 method)
julia> @code_typed f(1.0f0)
CodeInfo(
1 ─ %1 = Base.mul_float(0.8f0, x)::Float32
└── return %1
) => Float32
Gunnar Farnebäck said:

You also have the option to express the divisions as rationals, e.g. 4 // 5 * x

This is the general solution to avoid spurious promotion, at least as long as you're only doing basic arithmetic. Related to this performance tip from the manual: https://docs.julialang.org/en/v1/manual/style-guide/#Avoid-using-floats-for-numeric-literals-in-generic-code-when-possible.
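To illustrate the point, here's a small comparison (the names g and h are just for the example): a Float64 literal forces promotion, while a rational literal keeps Float32 inputs as Float32:

julia> g(x) = 4 // 5 * x;   # rational literal: Float32 input stays Float32

julia> h(x) = 0.8 * x;      # Float64 literal: Float32 input gets promoted

julia> g(1.0f0) |> typeof
Float32

julia> h(1.0f0) |> typeof
Float64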
There's also zero, one, and oneunit, which are helpful if you're doing more than plain arithmetic and want to keep your code generic. For example, if x is a float of some type and I want to do x * sqrt(3), I'll write x * sqrt(3one(x)) to avoid spurious promotion. But for GPU kernel programming, you're probably not all that concerned about keeping your code generic.
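A quick sketch of the difference, assuming x is Float32:

julia> x = 2.0f0;

julia> typeof(x * sqrt(3))         # sqrt(3) is Float64, so the product promotes
Float64

julia> typeof(x * sqrt(3one(x)))   # 3one(x) is 3.0f0, so everything stays Float32
Float32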
Thanks all, very helpful!
Alec has marked this topic as resolved.