The following code fails with a segmentation fault:
import PyCall
so = PyCall.pyimport("scipy.optimize")

function f(x)
    (x - 2) * x * (x + 2)^2
end

function do_optimize(fn)
    result = so.minimize_scalar(fn)
end

let
    out = Vector{Any}(undef, 20)
    Threads.@threads for i in 1:20
        println(i)
        out[i] = do_optimize(f)
    end
end
What is the proper way to do this? If it is not possible, do I have to use multiprocessing with Distributed instead?
I also found an unresolved question about this on Discourse: https://discourse.julialang.org/t/using-pycall-from-threads/32742
Python doesn't have multithreading, so none of its data structures are thread-safe
Spinning up multiple Julia instances won't help, I think; instead, you need multiple Python instances that you're sending data to
I see. Looks like the only way to do it is to use Python's own multiprocessing module, which technically spawns independent processes (with identical setups)?
yep
PyCall might have its own way to spawn multiple Pythons, but I don't know
Thank you! I will investigate whether PyCall can spawn multiple Pythons.
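One mitigation sometimes tried before reaching for separate processes is to serialize every call into Python behind a single lock. Below is a minimal sketch of that idea (the PY_LOCK constant and do_optimize_locked helper are names invented here for illustration); note that even fully serialized calls may still crash, since PyCall does not manage Python's per-thread interpreter state, which is consistent with the advice above that separate Python processes are needed.

import PyCall
so = PyCall.pyimport("scipy.optimize")

f(x) = (x - 2) * x * (x + 2)^2

# A single lock guarding every entry into the Python interpreter.
const PY_LOCK = ReentrantLock()

function do_optimize_locked(fn)
    # Only one Julia thread at a time may call into Python here.
    lock(PY_LOCK) do
        so.minimize_scalar(fn)
    end
end

let
    out = Vector{Any}(undef, 20)
    Threads.@threads for i in 1:20
        out[i] = do_optimize_locked(f)
    end
    println(out)
end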
I tried Python's multiprocessing and still got a segmentation fault, which also slowed my laptop to a crawl. I managed to recover by running pkill julia in a non-X terminal (Ctrl-Alt-F6).
import PyCall
so = PyCall.pyimport("scipy.optimize")
mp = PyCall.pyimport("multiprocessing")

function f(x)
    (x - 2) * x * (x + 2)^2
end

function do_optimize(fn)
    result = so.minimize_scalar(fn)
end

let
    procs = []
    for x in 1:20
        println(x)
        proc = mp.Process(target=f, args=())
        proc.start()
        push!(procs, proc)
    end
    for p in procs
        p.join()
    end
end
My hypothesis is that the proc object gets duplicated indefinitely when passed over to Julia? (Note that I didn't even use the do_optimize function.)
EDIT: I forgot to add -t2 to the julia arguments. With threads enabled, the code below failed with a segmentation fault as well!
This solution works:
import PyCall
so = PyCall.pyimport("scipy.optimize")

function f(x)
    (x - 2) * x * (x + 2)^2
end

let
    out = Vector{Any}(undef, 20)
    Threads.@threads for i in 1:20
        function do_optimize(fn)
            result = so.minimize_scalar(fn)
        end
        out[i] = do_optimize(f)
    end
    println(out)
end
Not sure why.
EDIT: I forgot to add -t2 to the julia arguments. With threads enabled, the code below failed with a segmentation fault as well!
This also works:
import PyCall
so = PyCall.pyimport("scipy.optimize")

function f(x)
    (x - 2) * x * (x + 2)^2
end

function do_optimize(fn)
    result = so.minimize_scalar(fn)
end

let
    out = Vector{Any}(undef, 20)
    Threads.@threads for i in 1:20
        _do_optimize = do_optimize
        out[i] = _do_optimize(f)
    end
    println(out)
end
I have also tested the reverse direction: a Python script that uses Python's multiprocessing and calls Julia functions segfaults as well. Looks like my only viable path is either to go full Julia or to stay full Python. Too bad the only Python function I need is the LBFGSB code from scipy.optimize.
why don't you use a native lbfgs implementation?
LBFGS isn't bounded. It has to be wrapped in the generic Optim.Fminbox, and so is less efficient than LBFGSB, I think. See https://discourse.julialang.org/t/optim-jl-vs-scipy-optimize-once-again/61661/35
cool, I didn't know about lbfgs-B :)
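For context, box-constrained optimization in Optim.jl wraps an unconstrained inner optimizer in Fminbox, which is exactly the extra barrier-method layer that L-BFGS-B avoids. A minimal sketch for the 1-D objective from this thread, with hypothetical bounds chosen only for illustration:

using Optim

f(x) = (x[1] - 2) * x[1] * (x[1] + 2)^2

lower = [-4.0]  # hypothetical bounds, for illustration only
upper = [4.0]
x0 = [0.0]

# Fminbox wraps the unconstrained LBFGS inner optimizer in a barrier method.
result = optimize(f, lower, upper, x0, Fminbox(LBFGS()))
println(Optim.minimizer(result))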
@distributed with PyCall should work; just add @everywhere to the package loading and the function definitions
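A minimal sketch of that suggestion, assuming workers are added with addprocs (or by starting Julia with -p 4): each worker process loads its own copy of PyCall, and therefore runs its own Python interpreter, so no Python state is ever shared.

using Distributed
addprocs(4)  # or start Julia with -p 4

@everywhere begin
    import PyCall
    const so = PyCall.pyimport("scipy.optimize")
    f(x) = (x - 2) * x * (x + 2)^2
    # Return only the minimizer as a plain Float64 so that no
    # Python-owned object has to be serialized back to the master.
    do_optimize(fn) = convert(Float64, so.minimize_scalar(fn).x)
end

out = @distributed (vcat) for i in 1:20
    do_optimize(f)
end
println(out)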
@Paulito Palmes yes, I confirm that @distributed with PyCall works. Though by now I have already used LBFGSB.jl.
Using LBFGSB.jl results in fewer allocations than using SciPy's LBFGSB.
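For anyone landing on this thread, LBFGSB.jl usage looks roughly like the sketch below. The call signature and the 3-row bounds encoding (row 1 is the bound type, where 2 means both bounds are active; rows 2 and 3 are the lower and upper bounds) are based on the package's README example and may differ between versions, and the bounds here are hypothetical; double-check against the current docs.

using LBFGSB

# Objective and in-place gradient for f(x) = (x - 2) x (x + 2)^2 in 1-D.
func(x) = (x[1] - 2) * x[1] * (x[1] + 2)^2
function grad!(g, x)
    g[1] = (x[1] + 2) * (4x[1]^2 - 2x[1] - 4)  # derivative of the quartic above
end

optimizer = L_BFGS_B(1, 10)  # max problem dimension 1, max memory size 10
x0 = [0.0]
bounds = zeros(3, 1)
bounds[1, 1] = 2     # 2 => both lower and upper bounds are active
bounds[2, 1] = -4.0  # hypothetical lower bound
bounds[3, 1] = 4.0   # hypothetical upper bound

fout, xout = optimizer(func, grad!, x0, bounds, m=5, factr=1e7, pgtol=1e-5, iprint=-1)
println((fout, xout))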