Stream: helpdesk (published)

Topic: Threaded `randn` and benchmarking error


Alec (Jul 23 2022 at 21:14):

While benchmarking some code for EconomicScenarioGenerators.jl, I ran into an error that I've only seen when benchmarking. When a single Random.MersenneTwister is shared across threads, randn ends up calling log on a negative number and throws a DomainError. Here's a minimal reproducing example:

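using Random, BenchmarkTools

# Constructs a fresh MersenneTwister inside every iteration, so no RNG object is shared
function g()
    v = zeros(100000)
    Threads.@threads for i = 1:length(v)
        v[i] = randn(Random.MersenneTwister(1))
    end
    v
end

# Uses one RNG object from all threads at once
function g(RNG)
    v = zeros(100000)
    Threads.@threads for i = 1:length(v)
        v[i] = randn(RNG)
    end
    v
end

# these do not error
@btime g()
g()
g(Random.MersenneTwister(1))

# this errors
@btime g(Random.MersenneTwister(1))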

Is this indeed an issue worth filing? If so, with Julia or with BenchmarkTools?

Alec (Jul 23 2022 at 21:15):

Full stacktrace on Mac M1, Julia 1.8 RC3

julia> @btime g(Random.MersenneTwister(1))
ERROR: TaskFailedException
Stacktrace:
  [1] wait
    @ ./task.jl:345 [inlined]
  [2] threading_run(fun::var"#128#threadsfor_fun#31"{var"#128#threadsfor_fun#30#32"{MersenneTwister, Vector{Float64}, UnitRange{Int64}}}, static::Bool)
    @ Base.Threads ./threadingconstructs.jl:38
  [3] macro expansion
    @ ./threadingconstructs.jl:89 [inlined]
  [4] g
    @ ./Untitled-1:13 [inlined]
  [5] var"##core#393"()
    @ Main ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:489
  [6] var"##sample#394"(::Tuple{}, __params::BenchmarkTools.Parameters)
    @ Main ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:497
  [7] _lineartrial(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; maxevals::Int64, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ BenchmarkTools ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:161
  [8] _lineartrial(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters)
    @ BenchmarkTools ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:152
  [9] #invokelatest#2
    @ ./essentials.jl:729 [inlined]
 [10] invokelatest
    @ ./essentials.jl:726 [inlined]
 [11] #lineartrial#46
    @ ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:35 [inlined]
 [12] lineartrial
    @ ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:35 [inlined]
 [13] tune!(b::BenchmarkTools.Benchmark, p::BenchmarkTools.Parameters; progressid::Nothing, nleaves::Float64, ndone::Float64, verbose::Bool, pad::String, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ BenchmarkTools ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:251
 [14] tune! (repeats 2 times)
    @ ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:247 [inlined]
 [15] top-level scope
    @ ~/.julia/packages/BenchmarkTools/7xSXH/src/execution.jl:576

    nested task error: DomainError with -1.0:
    log will only return a complex result if called with a complex argument. Try log(Complex(x)).
    Stacktrace:
     [1] throw_complex_domainerror(f::Symbol, x::Float64)
       @ Base.Math ./math.jl:33
     [2] _log(x::Float64, base::Val{:ℯ}, func::Symbol)
       @ Base.Math ./special/log.jl:292
     [3] log
       @ ./special/log.jl:257 [inlined]
     [4] randn_unlikely(rng::MersenneTwister, idx::Int64, rabs::Int64, x::Float64)
       @ Random ~/prog/julia-dd/usr/share/julia/stdlib/v1.8/Random/src/normal.jl:73
     [5] randn
       @ ~/prog/julia-dd/usr/share/julia/stdlib/v1.8/Random/src/normal.jl:54 [inlined]
     [6] macro expansion
       @ ./Untitled-1:14 [inlined]
     [7] (::var"#128#threadsfor_fun#31"{var"#128#threadsfor_fun#30#32"{MersenneTwister, Vector{Float64}, UnitRange{Int64}}})(tid::Int64; onethread::Bool)
       @ Main ./threadingconstructs.jl:84
     [8] #128#threadsfor_fun
       @ ./threadingconstructs.jl:51 [inlined]
     [9] (::Base.Threads.var"#1#2"{var"#128#threadsfor_fun#31"{var"#128#threadsfor_fun#30#32"{MersenneTwister, Vector{Float64}, UnitRange{Int64}}}, Int64})()
       @ Base.Threads ./threadingconstructs.jl:30

Alec (Jul 23 2022 at 21:16):

FWIW I hit the same thing (error only while benchmarking) with ThreadsX, but the example above uses only Base to narrow down the issue.

Michael Fiano (Jul 23 2022 at 21:37):

You should probably use the setup argument or interpolate the RNG into @btime. I think it is creating too many RNGs. I'm not sure what is to blame yet, but I would expect the same issue with @belapsed and @benchmark unless you set up the RNG first.
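For reference, a minimal sketch of the two BenchmarkTools forms mentioned here. It assumes BenchmarkTools is installed, and `draw_sum` is a hypothetical single-threaded workload made up for illustration, so no RNG is used from more than one thread. Both forms only control when the RNG is constructed relative to the timed code.

```julia
using Random
using BenchmarkTools  # third-party package, assumed to be installed

# Hypothetical single-threaded workload: sums n normal draws from the given RNG.
draw_sum(rng, n) = sum(randn(rng) for _ in 1:n)

# Interpolation: the RNG is built once, outside the benchmark, and its value
# is spliced into the benchmarked expression with `$`.
rng = MersenneTwister(1)
@btime draw_sum($rng, 1_000)

# setup: the RNG is built by the setup expression before each sample,
# and its construction is not included in the timing.
@btime draw_sum(r, 1_000) setup=(r = MersenneTwister(1))
```

Without either form, the constructor call `MersenneTwister(1)` is part of the benchmarked expression itself.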

Michael Fiano (Jul 23 2022 at 21:40):

Also might want to try the newer RNG algorithm. MT has a lot of bad properties in general.

Mason Protter (Jul 23 2022 at 21:48):

I’d assume this is an M1 Mac bug

Alec (Jul 24 2022 at 01:21):

@Michael Fiano what would you recommend? I was using MT because I thought it was the new thread-safe option

Mosè Giordano (Jul 24 2022 at 01:37):

MT is the old thread-unsafe RNG

Brenhin Keller (Jul 24 2022 at 01:55):

The new one is thread-local Xoshiro256++
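A minimal sketch of what relying on that default looks like, assuming Julia 1.7 or later. The function name `g_default` is made up for illustration; the only change from the original example is that `randn` is called without an RNG argument, so each task draws from its own default RNG state.

```julia
using Random

# Hypothetical variant of g() that never passes an RNG object around.
function g_default(n = 100_000)
    v = zeros(n)
    Threads.@threads for i in eachindex(v)
        v[i] = randn()  # no RNG argument: uses the default RNG of the current task
    end
    return v
end

v = g_default()
```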

Alec (Jul 24 2022 at 02:11):

Thanks. I was using MT based on the recommendations on the Julia blog.

Brenhin Keller (Jul 24 2022 at 02:16):

Ah yes, a good bit has changed in the last three years on this front

Brenhin Keller (Jul 24 2022 at 02:17):

The new one probably could have been better advertised

Sukera (Jul 24 2022 at 08:12):

@Alec The blogpost you've linked only talked about rand(), i.e. random calls that use the thread/task specific default RNG. It does not imply that sharing MersenneTwister instances is thread safe.

Sukera (Jul 24 2022 at 08:13):

Since that blogpost was published, the default RNG was changed to Xoshiro, because it is faster and uses less memory. I'm not sure if sharing instances of that is thread safe.

Jameson Nash (Jul 31 2022 at 18:16):

To the original question, it is forbidden to share objects with mutable state (instances of Random.MersenneTwister or Xoshiro, for example) between threads. This will create a data race and will lead to undefined behavior.
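One way to follow this rule while still using an explicitly seeded RNG is to give every task its own instance. The sketch below assumes Julia 1.7 or later (for `Xoshiro`). The function name `g_per_task`, the chunking, and the `seed + k` seeding scheme are illustrative choices only, and the sketch makes no claim about the statistical independence of the resulting streams.

```julia
using Random

# Hypothetical helper: splits the index range into chunks and fills each chunk
# in its own task, using an RNG that no other task ever touches.
function g_per_task(seed::Integer; n = 100_000, ntasks = Threads.nthreads())
    v = zeros(n)
    chunks = Iterators.partition(eachindex(v), cld(n, ntasks))
    tasks = map(enumerate(chunks)) do (k, idxs)
        Threads.@spawn begin
            rng = Xoshiro(seed + k)  # one RNG per task (illustrative seeding)
            for i in idxs
                v[i] = randn(rng)
            end
        end
    end
    foreach(wait, tasks)
    return v
end

v = g_per_task(1; ntasks = 4)
```

Each task writes only to its own chunk of `v` and reads only its own RNG, so no mutable state is shared. For a fixed `seed`, `n`, and `ntasks`, the output is reproducible.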

Alec (Aug 04 2022 at 03:34):

The current (1.7.3) docs say:

In a multi-threaded program, you should generally use different RNG objects from different threads or tasks in order to be thread-safe. However, the default RNG is thread-safe as of Julia 1.3 (using a per-thread RNG up to version 1.6, and per-task thereafter).

Would the following clarify the situation? I currently read that and think that since 1.3, the default RNG is thread-safe, and now that Xoshiro is the default, it's thread-safe.

In a multi-threaded program, you should generally use different RNG objects from different threads or tasks in order to be thread-safe. However, the default global RNG is thread-safe as of Julia 1.3 (using a per-thread RNG up to version 1.6, and per-task thereafter). The global RNG refers to per-thread RNG instances; if you instantiate a non-global RNG, you should create a new RNG instance for each thread to avoid possible data races.

chriselrod (Aug 04 2022 at 11:44):

julia> using Random

julia> Random.default_rng()
TaskLocalRNG()

This is a special object. The core RNG isn't thread safe.
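A small sketch of why that distinction matters, assuming Julia 1.7 or later. The helper name `fill_randn!` is made up for illustration. As far as I understand it, `TaskLocalRNG()` holds no generator state of its own and each call resolves to the state of whichever task is running, so passing this handle into a threaded loop is different from passing a `MersenneTwister` or `Xoshiro` instance.

```julia
using Random

# Hypothetical helper that takes an explicit RNG argument, like g(RNG) above.
function fill_randn!(v, rng)
    Threads.@threads for i in eachindex(v)
        v[i] = randn(rng)
    end
    return v
end

rng = Random.default_rng()
is_task_local = rng isa TaskLocalRNG  # expected to hold on Julia 1.7+

# The handle is passed to every task, but each task draws from its own state.
v = fill_randn!(zeros(1_000), rng)
```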

Sundar R (Aug 04 2022 at 12:00):

That seems worth explicitly mentioning, since "the default" seems bound to be misinterpreted as the core/"default" RNG.

However, the default global RNG (obtained via Random.default_rng()) is thread-safe as of Julia 1.3


Last updated: Nov 22 2024 at 04:41 UTC