Anyone know how to deal with this? I'm trying to use remotecall (or any other form of distributed computing) from within a package, and I just can't get it to work because it complains about not being able to find the package.
Here's an MWE:
julia> module Foos
           struct Foo end
           using Distributed
           f() = let p = only(addprocs(1))
               x = remotecall_fetch(p) do
                   Foo()
               end
               rmprocs([p])
               x
           end
       end
Main.Foos
julia> Foos.f()
ERROR: On worker 2:
UndefVarError: `Foos` not defined
Stacktrace:
[1] deserialize_module
@ ~/julia-1.10/usr/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:997
[2] handle_deserialize
@ ~/julia-1.10/usr/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:896
You defined Foo in your process (1), but not in the newly added process. Since Julia is nominally typed and the type's full name is Foos.Foo, you'll get an error.
You'd have to import shared types with @everywhere.
I don't think that's right. Notice that the error isn't that Foo isn't defined, it's that Foos is not defined.
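For the REPL-style case, where the module lives in Main, the @everywhere suggestion could look roughly like the sketch below. It assumes the worker is added at top level before the module is defined (unlike f() above, which adds the worker internally), and that defining the module through @everywhere at top level is acceptable. The field x and the helper make() are hypothetical, added only for illustration.

```julia
using Distributed

# Add the worker first so that @everywhere also reaches it.
p = only(addprocs(1))

# Define the module in Main on every process (main process and worker alike).
@everywhere module Foos
    struct Foo
        x          # hypothetical field, just so the result is inspectable
    end
    make() = Foo(1)  # hypothetical helper
end

# The worker can now resolve Main.Foos, both to find `make`
# and to send a Foos.Foo back to the main process.
result = remotecall_fetch(Foos.make, p)

rmprocs(p)
```

This does not cover the package case below, where the worker is created inside a function that belongs to the package itself.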
Let's step away from the REPL version with a local module and do a package:
> cat Foos/src/Foos.jl
module Foos
using Distributed
f() = let p = only(addprocs(1))
    x = remotecall_fetch(p) do
        1
    end
    rmprocs([p])
    x
end
end # module Foos
> cat Foos/Project.toml
name = "Foos"
uuid = "378b42d5-b922-4ee0-b574-33ae726347f8"
version = "0.1.0"
[deps]
Distributed = "8ba89e20-285c-5b6f-9357-94700520ee1b"
i.e. a very simple little package that just creates a process, runs a function that does nothing but return 1, and then deletes that process.
Now here's what I see if I try to use it:
> julia -q --project=./Foos
julia> using Foos
julia> Foos.f()
ERROR: On worker 2:
KeyError: key Foos [378b42d5-b922-4ee0-b574-33ae726347f8] not found
Stacktrace:
[1] getindex
@ ./dict.jl:498 [inlined]
[2] macro expansion
@ ./lock.jl:267 [inlined]
[3] root_module
@ ./loading.jl:1878
[4] deserialize_module
@ ~/julia-1.10/usr/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:994
[5] handle_deserialize
@ ~/julia-1.10/usr/share/julia/stdlib/v1.10/Serialization/src/Serialization.jl:896
[...]
Maybe you need to do using Foos on the worker first?
Both your REPL and package MWEs have in common that the name Foos already exists: in the REPL example because it's in Main, in the package example because you ran using Foos first.
Changing the function to
f() = let p = only(addprocs(1))
    x = remotecall_fetch(p) do
        eval(:(using Foos))
        1
    end
    rmprocs([p])
    x
end
doesn't seem to eliminate the error.
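One possible reading of the stack trace (deserialize_module failing on Foos) is that the do-block closure is itself owned by Foos, so the worker has to resolve Foos just to deserialize the closure, before the eval inside it ever runs. That is an interpretation, not something confirmed in this thread. A small sketch of the ownership, with make_closure as a hypothetical helper:

```julia
module Foos
# Returns an anonymous function created inside Foos (hypothetical helper).
make_closure() = () -> 1
end

c = Foos.make_closure()

# The closure's type belongs to Foos even though its body never mentions Foos.
owner = parentmodule(typeof(c))
println(owner)
```

If that reading is right, the using Foos has to reach the worker through something that is not defined in Foos, such as a plain expression.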
Hm, okay, so I managed to get my actual use case working by using what feels like an excessive amount of indirection. I did something like this:
module Foos
using Distributed
struct Foo
    x
end
function _f(foo)
    Foo(foo.x + 1)
end
f(foo) = let p = only(addprocs(1))
    Distributed.remotecall_eval(Main, p, :(using Foos))
    x = Distributed.remotecall_eval(Main, p, :(Foos._f($foo)))
    rmprocs([p])
    x
end
end # module Foos
Without all that indirection, I guess it was unhappy about me passing around structs that were defined in Foos; in particular, I needed totally separate calls to remotecall_eval to make it work.
very weird
Mason Protter has marked this topic as resolved.
yeah, that's what I was thinking of; you need the additional remotecall_eval for world age purposes, I think
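The two-step pattern can be tried without the Foos package. The sketch below uses the LinearAlgebra standard library as a stand-in for Foos (an assumption made only so the snippet is self-contained) and the unexported Distributed.remotecall_eval, the same function used in the working version above.

```julia
using Distributed

p = only(addprocs(1))

# Step 1: load the module on the worker. Only an expression is sent,
# so nothing owned by the module has to be deserialized yet.
Distributed.remotecall_eval(Main, p, :(using LinearAlgebra))

# Step 2: a separate call that uses the freshly loaded module.
# `$v` interpolates the local value into the expression before it is sent.
v = [3.0, 4.0]
result = Distributed.remotecall_eval(Main, p, :(LinearAlgebra.norm($v)))

rmprocs(p)
```

Since remotecall_eval is not exported from Distributed, it may be worth treating it as an internal helper whose behavior could change between Julia versions.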
Last updated: Nov 06 2024 at 04:40 UTC