Stream: helpdesk (published)

Topic: ✔ Default library path for `ccall` and `llvmcall`


view this post on Zulip Brenhin Keller (Jul 24 2022 at 17:47):

Say I write something like

julia> ccall(:printf, Int, (Cstring,), "hello there!\n")
hello there!
13

i.e., without explicitly specifying a path to a library… what determines the default set of paths Julia will search for the printf function?

view this post on Zulip Brenhin Keller (Jul 24 2022 at 17:48):

(x-posted on slack, will update either if I get an answer)

view this post on Zulip Sukera (Jul 24 2022 at 17:56):

it should check the usual library paths, since iirc it just delegates that to dlsym

view this post on Zulip Sukera (Jul 24 2022 at 17:56):

or LD_LIBRARY_PATH on linux, or the other usual places for libraries, which is system dependent

view this post on Zulip Sukera (Jul 24 2022 at 17:56):

e.g. Windows also checks the current directory, IIRC

view this post on Zulip Mosè Giordano (Jul 24 2022 at 17:57):

Symbols in libraries dlopened with RTLD_GLOBAL are available without specifying the library (which is also a good way to get clashes between different libraries providing same symbols)

view this post on Zulip Brenhin Keller (Jul 24 2022 at 18:08):

Oh cool

view this post on Zulip Mosè Giordano (Jul 24 2022 at 18:10):

Sukera said:

or LD_LIBRARY_PATH on linux, or the other usual places for libraries, which is system dependent

I'm not sure this is relevant. the library must be already dlopened in order to call a symbol without specifying the path, it isn't like ccall starts randomly dlopening all possible libraries it can find in LD_LIBRARY_PATH
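
A minimal sketch of the point above, assuming a Unix-like system where libc is already loaded into the Julia process: a symbol can be called without naming a library because it is looked up among libraries that are already loaded, not by searching LD_LIBRARY_PATH. `strlen` is chosen here only as a convenient libc function; it does not come from the thread.

```julia
# No library is named, so the symbol must already be visible in the process.
# strlen lives in libc, which the Julia process has loaded at startup
# (assumption: a typical Linux or macOS build of Julia).
n = ccall(:strlen, Csize_t, (Cstring,), "hello there!")
println(n)
```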

view this post on Zulip Mosè Giordano (Jul 24 2022 at 18:14):

Mosè Giordano said:

Symbols in libraries dlopened with RTLD_GLOBAL are available without specifying the library (which is also a good way to get clashes between different libraries providing same symbols)

example:

shell> cat foo.c
int bar(void) {
    return 42;
}

shell> cc -shared -fPIC -o libfoo.so foo.c

julia> using Libdl

julia> dlopen("./libfoo.so", RTLD_GLOBAL)
Ptr{Nothing} @0x0000000002aa3fd0

julia> @ccall bar()::Cint
42

julia>
% julia -q
julia> using Libdl

julia> dlopen("./libfoo.so")
Ptr{Nothing} @0x0000000000faf670

julia> @ccall bar()::Cint
ERROR: could not load symbol "bar":
julia: undefined symbol: bar
Stacktrace:
 [1] top-level scope
   @ ./REPL[3]:1
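
If RTLD_GLOBAL is not wanted, one alternative (a sketch, not something proposed in the thread) is to keep the handle returned by dlopen and look the symbol up through it with dlsym. The helper name `call_bar` is hypothetical, and the code assumes the `libfoo.so` built from `foo.c` above sits in the current directory.

```julia
using Libdl

# Call `bar` through an explicit library handle, so the lookup does not
# depend on the library having been opened with RTLD_GLOBAL.
function call_bar(path::AbstractString)
    handle = dlopen(path)        # default flags, no RTLD_GLOBAL
    fptr = dlsym(handle, :bar)   # throws if the symbol is missing
    return ccall(fptr, Cint, ())
end

# Only run when the example library from above is actually present.
if isfile("./libfoo.so")
    println(call_bar("./libfoo.so"))
end
```

The same effect can be had by naming the library in the call itself, e.g. `@ccall "./libfoo.so".bar()::Cint`.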

view this post on Zulip Brenhin Keller (Jul 24 2022 at 18:19):

Success! :tada:

julia> using Libdl, StaticMPI

julia> dlopen("libmpi", RTLD_GLOBAL)
Ptr{Nothing} @0x00007ff9385475a0

julia> MPI_Init() == MPI_SUCCESS
true

view this post on Zulip Mosè Giordano (Jul 24 2022 at 18:22):

I must say I'm quite interested in the possibility of having a standalone MPI application written in julia. on Fugaku I had lots of jobs dying because the system couldn't load a random library, especially in large jobs (~400 nodes, which they actually still call "small" jobs)

view this post on Zulip Sukera (Jul 24 2022 at 18:23):

just would be nice if it were the regular MPI package :/

view this post on Zulip Mosè Giordano (Jul 24 2022 at 18:24):

sure, it'd be nice if this whole static compilation wasn't an alien dialect :grimacing:

view this post on Zulip Brenhin Keller (Jul 24 2022 at 18:24):

So far it seems to be working FWIW! All the usual caveats about no allocations, no dynamic dispatch, etc of course apply, but in HPC obsessively micromanaging your memory is kinda standard anyways... The one thing I'd still like is to intercept errors and return a custom exit code instead of having to avoid things that could error altogether.

view this post on Zulip Mosè Giordano (Jul 24 2022 at 18:25):

BTW, MPI.jl does dlopen libmpi with RTLD_GLOBAL on unix systems: https://github.com/JuliaParallel/MPI.jl/blob/a8d4d6400d9f91677c72f261ae0eb4db4b04e1b2/src/MPI.jl#L110

view this post on Zulip Brenhin Keller (Jul 24 2022 at 18:26):

For posterity re the initial question: dlopen was the key thing I was missing for making the symbols available to be ccall'd / llvmcall'd directly, while LD_LIBRARY_PATH seems to be what gets used during the dlopen to find libmpi. (I.e., if I didn't already have /opt/local/lib/mpich-mp, my local MPICH install, in my LD_LIBRARY_PATH, I think I'd have to write out the full path in the dlopen above.)
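
A hedged sketch of that lookup step: `Libdl.find_library` tries a library name in the default search locations and in any extra directories given, and `Libdl.dlpath` reports where a loaded library came from. The directory below is the one mentioned in the message above and is only an illustration; whether libmpi is found depends entirely on the local system.

```julia
using Libdl

# Extra directory to try in addition to the system defaults
# (illustrative; taken from the message above, adjust for your system).
extra_dirs = ["/opt/local/lib/mpich-mp"]

# Returns a name/path that dlopen accepts, or "" if nothing was found.
name = find_library(["libmpi"], extra_dirs)

if isempty(name)
    println("libmpi not found in the default search path or in ", extra_dirs)
else
    handle = dlopen(name, RTLD_LAZY | RTLD_GLOBAL)
    println("loaded libmpi from ", dlpath(handle))
end
```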

view this post on Zulip Notification Bot (Jul 24 2022 at 18:27):

Brenhin Keller has marked this topic as resolved.

view this post on Zulip Mason Protter (Jul 24 2022 at 21:06):

You should be able to just stick the dlopen in your __init__()

view this post on Zulip Brenhin Keller (Jul 24 2022 at 21:13):

Ooh, maybe I should just have StaticMPI.jl do that by default with MPICH_jll so that people don't get segfaults when they paste code into the repl without considering the implications :sweat_smile:

view this post on Zulip Brenhin Keller (Jul 24 2022 at 21:15):

A tradeoff between convenience and flexibility (if people want to dlopen their cluster's libmpi)

view this post on Zulip Mason Protter (Jul 24 2022 at 21:16):

You could make the behaviour depend on an environment variable
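
A minimal sketch of the two suggestions above combined, under stated assumptions: the module name `LibmpiLoader` and the environment variable `EXAMPLE_LIBMPI_PATH` are made up for illustration and are not part of StaticMPI.jl or MPI.jl. The `__init__` function opens libmpi with RTLD_GLOBAL when the module is loaded and falls back to the plain name `libmpi` if the variable is unset.

```julia
module LibmpiLoader  # hypothetical module, for illustration only

using Libdl

# Holds the handle once the library has been opened; C_NULL until then.
const libmpi_handle = Ref{Ptr{Cvoid}}(C_NULL)

function __init__()
    # EXAMPLE_LIBMPI_PATH is an assumed variable name, not a real convention.
    path = get(ENV, "EXAMPLE_LIBMPI_PATH", "libmpi")
    # throw_error=false makes dlopen return `nothing` instead of throwing.
    handle = dlopen(path, RTLD_LAZY | RTLD_GLOBAL; throw_error=false)
    if handle === nothing
        @warn "could not dlopen libmpi" path
    else
        libmpi_handle[] = handle
    end
end

end # module
```

Doing the dlopen in `__init__` rather than at the top level of the module matters because a handle is only valid in the process that opened it, so it cannot be baked into a precompiled module.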

view this post on Zulip Fredrik Ekre (Jul 24 2022 at 21:46):

Could you use MPItrampoline?

view this post on Zulip Brenhin Keller (Jul 24 2022 at 21:52):

Hmm, possibly — I should look into that

view this post on Zulip Brenhin Keller (Jul 25 2022 at 01:28):

Huh, looks like MPItrampoline_jll.is_available() == false on my system. I wonder how similar the MPItrampoline libmpi interface is to the MPICH one, since that's what I've been targeting so far

view this post on Zulip Brenhin Keller (Jul 31 2022 at 21:44):

What I ended up going with:

ccall(:dlsym, Ptr{Cvoid}, (Ptr{Nothing}, Cstring), handle, "MPI_Init") == C_NULL

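where handle is the (OS-dependent) pseudo-pointer RTLD_DEFAULT, so the comparison is true when MPI_Init cannot be found among the libraries already loaded into the process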
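
A sketch of how such a check might be wrapped up, assuming a Unix-like system. The values used for RTLD_DEFAULT (a null pointer on Linux, `(void *) -2` on macOS) come from those platforms' C headers and are assumptions here, as is the helper name `symbol_loaded`; none of this is taken from StaticMPI.jl.

```julia
# Assumed platform values of the C macro RTLD_DEFAULT:
# NULL on Linux, (void *) -2 on macOS.
const RTLD_DEFAULT_HANDLE = Sys.isapple() ? Ptr{Cvoid}(-2) : C_NULL

# True if `name` resolves among the libraries already loaded globally
# into the current process.
function symbol_loaded(name::AbstractString)
    ptr = ccall(:dlsym, Ptr{Cvoid}, (Ptr{Cvoid}, Cstring),
                RTLD_DEFAULT_HANDLE, name)
    return ptr != C_NULL
end

println(symbol_loaded("printf"))    # libc symbol
println(symbol_loaded("MPI_Init"))  # depends on whether libmpi was opened with RTLD_GLOBAL
```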


Last updated: Nov 06 2024 at 04:40 UTC