The Transducers.jl tcollect function is super useful, but it seems to always return a Vector regardless of the input type.
Is there a way to preserve the input type with Transducers.jl or Base.Threads?
julia> using Transducers
julia> using CircularArrays
julia> tcollect((1,2,3))
3-element Vector{Int64}:
1
2
3
julia> tcollect([1,2,3])
3-element Vector{Int64}:
1
2
3
julia> tcollect(CircularVector([1,2,3]))
3-element Vector{Int64}:
1
2
3
The underlying issue is that we have a couple of map(fun, iter) calls that we would like to parallelize, but replacing map with tcollect doesn't work as expected.
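For reference, with Base.Threads we can hand-roll a type-preserving threaded map by writing into similar(A). A minimal sketch (tmap_preserving is a made-up name; it assumes a non-empty input whose similar returns the same wrapper type, as CircularArrays' does):

function tmap_preserving(f, A::AbstractArray)
    # similar(A, T) returns the same wrapper type for arrays like
    # CircularVector, so writing into it preserves the input type.
    out = similar(A, typeof(f(first(A))))  # assumes A is non-empty
    Threads.@threads for i in eachindex(A)
        @inbounds out[i] = f(A[i])
    end
    return out
end

But maintaining this ourselves loses the composability of tcollect, which is why we'd rather not.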
Yeah, this is something Transducers.jl doesn't really handle well. I think the right function here is tcopy, since that doesn't try to follow the behaviour of collect. But trying your examples with tcopy, none of them work either, which I guess we should classify as bugs that need to be fixed.
The circular arrays one could potentially be a package extension, but this is in general a hard problem.
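For CircularArrays, the extension could be as thin as re-wrapping the Vector that tcollect already produces. Purely illustrative (preserve_collect is a made-up name, not an actual Transducers API):

using Transducers, CircularArrays

# Run tcollect as usual, then re-wrap its Vector in the original type.
preserve_collect(itr::CircularVector) = CircularVector(tcollect(itr))
preserve_collect(itr) = tcollect(itr)  # fallback: today's behaviour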
I would suggest, though, that if your problem is time-consuming enough to require parallelism, then maybe it's not so bad to have to do a conversion afterwards, i.e.
julia> Tuple(tcollect((1,2,3)))
(1, 2, 3)
julia> CircularVector(tcollect(CircularVector([1,2,3])))
3-element CircularVector(::Vector{Int64}):
1
2
3
but I get that that's also annoying
Yes, we ended up following this route.
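Concretely, a small wrapper along these lines does the job (a sketch, not our exact code; Map is Transducers' mapping transducer, and the method list just covers the container types you actually use):

using Transducers, CircularArrays

# Hypothetical tmap: parallel map via tcollect, converting back after.
tmap(f, itr::Tuple)          = Tuple(tcollect(Map(f), itr))
tmap(f, itr::CircularVector) = CircularVector(tcollect(Map(f), itr))
tmap(f, itr)                 = tcollect(Map(f), itr)  # others stay Vector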
It would be super nice if these things worked. Also, is it too big of a dream to imagine a future where we can just replace map by a pmap and choose the form of parallelism? Threads vs GPU threads vs processes...
I wish we had a general pmap that performed all sorts of parallelism with keyword options
That's what Transducers.jl / Folds.jl already does. They provide sequential, threaded, and distributed backends
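For example, the executor argument is what picks the backend (SequentialEx, ThreadedEx, and DistributedEx are exported by Transducers.jl and accepted by the Folds.jl API):

using Folds, Transducers

xs = [1, 2, 3]

Folds.map(x -> x^2, xs)                  # default executor (threaded)
Folds.map(x -> x^2, xs, SequentialEx())  # plain sequential loop
Folds.map(x -> x^2, xs, ThreadedEx())    # Base.Threads tasks
Folds.map(x -> x^2, xs, DistributedEx()) # Distributed.jl processes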
Taka worked on a GPU backend, but it's broken now; it would be great if we could revive it.
Perhaps it is an issue of documentation, then. I never saw the GPU case mentioned, for instance.
https://github.com/JuliaFolds/FoldsCUDA.jl
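The documented usage was along these lines: FoldsCUDA provides a CUDAEx executor, so the same Folds call runs on the GPU (I haven't re-run this given the package's current state):

using CUDA, Folds, FoldsCUDA

xs = CUDA.rand(10^6)  # a CuArray living on the GPU

# Same Folds API as above; CUDAEx() dispatches the fold to the GPU.
Folds.sum(sin, xs, CUDAEx())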
It was always experimental though
but yes, the idea is that Transducers can give us a very general way of doing parallelism that can be re-implemented for many different backends