Why might this use of the @turbo macro fail and fall back to @inbounds @simd? We're trying to compute mul!(y, A, x[:, col]) without copying the column of x.
using LoopVectorization  # provides @turbo

function matmul_column!(y, A, x, col)
    @turbo for i in axes(A, 1)
        yi = 0.
        # dot product of row i of A with column col of x
        for j in axes(A, 2)
            yi += A[i, j] * x[j, col]
        end
        y[i] = yi
    end
end
What are the element types?
Note that @view(x[:,col]) should also let you avoid copying the column of x.
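A minimal sketch of the @view suggestion, using only the LinearAlgebra standard library. The 2×2 values and the variable name xcol are made up for illustration; whether this is faster than the hand-written loop for a tiny x is not something this sketch measures.

```julia
using LinearAlgebra  # stdlib; provides mul!

A = [1.0 2.0; 3.0 4.0]
x = [1.0 10.0; 2.0 20.0]   # illustrative data, not from the thread
y = zeros(2)
col = 2

# @view builds a SubArray that aliases column `col` of x instead of copying it.
xcol = @view x[:, col]

# In-place product: y = A * x[:, col], without materializing the column.
mul!(y, A, xcol)
```

The view shares memory with x, so later writes to x[:, col] are visible through xcol.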
Turns out the element types were SVectors, so no wonder it wasn't working. Reshaping things to be a 3d SArray rather than an SMatrix of SVectors fixed it.
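A rough sketch of how one might spot this kind of problem up front, using only Base. Everything here is an assumption for illustration: has_plain_eltype is a hypothetical helper, the list of "plain" types is deliberately short, and NTuple elements stand in for SVector elements so that no packages are needed. LoopVectorization's fallback warning points at its own LoopVectorization.check_args, which is the authoritative check; this only mimics the idea.

```julia
# Stand-in for a matrix whose elements are small static vectors.
# (The thread used SVector elements; NTuple is used here to avoid dependencies.)
nested = [(Float64(i), Float64(j), Float64(i + j)) for i in 1:2, j in 1:2]

# Hypothetical helper: is the element type a plain hardware number?
has_plain_eltype(A::AbstractArray) =
    eltype(A) <: Union{Float32, Float64, Int32, Int64}

has_plain_eltype(nested)   # false: the elements are tuples, not numbers

# Reinterpret the same memory as a 3-d array of Float64, with the tuple
# components along a new leading dimension.
flat = reinterpret(reshape, Float64, nested)

size(flat)                 # (3, 2, 2)
has_plain_eltype(flat)     # true
```

With the flattened layout, flat[k, i, j] is component k of nested[i, j], so a loop over plain numbers can index all three dimensions directly.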
The matrix x is tiny enough that I don't want the overhead of the @view, but that's something I should measure too.
Last updated: Oct 27 2025 at 04:47 UTC