Stream: helpdesk (published)

Topic: Threading in library code


mbaz (May 18 2022 at 23:33):

Say I'm writing a package and there's a bit of code that measurably benefits from multithreading with Threads.@threads. Should I use multithreading?

My concern is that doing so may clash with multithreading that the package user has set up in their own code, leading to overall slower execution.

Brenhin Keller (May 18 2022 at 23:46):

It's a good question... one option might be to add an optional keyword argument multithreaded=true or false to whatever relevant function... optionally plus or minus some heuristics to pick the fastest option by default (e.g. https://github.com/JuliaSIMD/VectorizedStatistics.jl/blob/4839b730f88f1d39091624f9433c8a9009e9a83e/src/vcov.jl#L41-L48)
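A minimal sketch of the keyword-argument idea, not taken from the linked package. The function name `sumsq`, the keyword name `multithreaded`, and the 4096-element cutoff are all assumptions made up for illustration; a real heuristic would need benchmarking.

```julia
using Base.Threads: @threads, nthreads

# Hypothetical library function: sum of squares of `x`.
# The default for `multithreaded` is a placeholder heuristic (the 4096 cutoff
# is arbitrary); callers can always override it explicitly.
function sumsq(x::AbstractVector{T};
               multithreaded::Bool = length(x) >= 4096 && nthreads() > 1) where {T<:AbstractFloat}
    (!multithreaded || isempty(x)) && return sum(abs2, x; init = zero(T))
    # Split the index range into at most nthreads() contiguous chunks.
    chunksize = cld(length(x), nthreads())
    chunks = collect(Iterators.partition(firstindex(x):lastindex(x), chunksize))
    # One slot per chunk, so no two tasks write to the same location.
    partials = zeros(T, length(chunks))
    @threads for c in eachindex(chunks)
        partials[c] = sum(abs2, view(x, chunks[c]))
    end
    return sum(partials)
end
```

A user who already threads their own outer loop could then call `sumsq(x; multithreaded = false)` to opt out.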

Brenhin Keller (May 18 2022 at 23:47):

Not sure I'm 100% satisfied with that approach, but it's an option and I haven't thought of a better one yet in that case

Chris Foster (May 18 2022 at 23:55):

Threads.@threads uses the composable task-level parallelism now, so in theory it composes well with other multithreaded user code. I think Threads.@threads still depends on Threads.nthreads() to determine the number of tasks to use, and this could be a bad heuristic depending on your problem size. (But if this is a problem you can do manual decomposition of your problem into tasks with @spawn.)

Overall I think it's quite reasonable to try just using @threads unconditionally. It would be best to do some tests with some plausible outer threaded loop — such as might be written by a user — around your library to reassure yourself.
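The manual decomposition with @spawn mentioned above might look roughly like this. It is only a sketch: `spawn_mapsum` and its `ntasks` keyword are hypothetical names, and the default of 8 tasks is arbitrary. The point is that the number of tasks is chosen from the problem, not from Threads.nthreads().

```julia
using Base.Threads: @spawn

# Hypothetical helper: computes sum(f, x) by splitting `x` into roughly
# `ntasks` chunks and spawning one task per chunk. `ntasks` is chosen by
# the caller and does not depend on Threads.nthreads().
function spawn_mapsum(f, x::AbstractVector; ntasks::Integer = 8)
    isempty(x) && throw(ArgumentError("x must be non-empty"))
    chunksize = cld(length(x), ntasks)
    tasks = map(Iterators.partition(firstindex(x):lastindex(x), chunksize)) do r
        @spawn sum(f, view(x, r))
    end
    # Wait for every task and combine the partial sums.
    return sum(fetch, tasks)
end
```

For example, `spawn_mapsum(abs2, collect(1:100); ntasks = 3)` splits the input into three chunks regardless of how many threads Julia was started with.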

Chris Foster (May 19 2022 at 00:00):

still depends on Threads.nthreads() to determine the number of tasks to use

Actually, this is no longer true in 1.9-DEV — it uses dynamic scheduling instead which is great.

Chris Foster (May 19 2022 at 00:01):

Or... scratch that, it has some dynamic scheduling aspects, but the number of tasks is still the number of threads.

Chris Foster (May 19 2022 at 00:02):

Anyway, I'd say try it and see!
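One way to "try it and see" is to wrap the library call in an outer threaded loop, as a user might, and time it with the inner threading on and off. The sketch below uses a made-up stand-in routine (`inner_work`) and a made-up driver (`user_loop`); the sizes are arbitrary and the timings will depend on the Julia version and thread count.

```julia
using Base.Threads: @threads

# Stand-in for a library routine with optional internal threading.
function inner_work(x::Vector{Float64}; multithreaded::Bool)
    out = similar(x)
    if multithreaded
        @threads for i in eachindex(x)
            out[i] = sqrt(abs(x[i]))
        end
    else
        for i in eachindex(x)
            out[i] = sqrt(abs(x[i]))
        end
    end
    return out
end

# Outer threaded loop, as user code calling the library might be written.
function user_loop(inputs; multithreaded::Bool)
    results = Vector{Vector{Float64}}(undef, length(inputs))
    @threads for k in eachindex(inputs)
        results[k] = inner_work(inputs[k]; multithreaded = multithreaded)
    end
    return results
end

inputs = [rand(10_000) for _ in 1:32]
user_loop(inputs; multithreaded = true)   # warm up (compilation)
user_loop(inputs; multithreaded = false)
t_nested = @elapsed user_loop(inputs; multithreaded = true)
t_serial_inner = @elapsed user_loop(inputs; multithreaded = false)
println("inner threaded: ", t_nested, " s, inner serial: ", t_serial_inner, " s")
```

For steadier numbers a benchmarking package would be a better fit than a single `@elapsed` call.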

Mason Protter (May 19 2022 at 00:31):

Chris Foster said:

Threads.@threads uses the composable task-level parallelism now, so in theory it composes well with other multithreaded user code. I think Threads.@threads still depends on Threads.nthreads() to determine the number of tasks to use, and this could be a bad heuristic depending on your problem size. (But if this is a problem you can do manual decomposition of your problem into tasks with @spawn.)

Overall I think it's quite reasonable to try just using @threads unconditionally. It would be best to do some tests with some plausible outer threaded loop — such as might be written by a user — around your library to reassure yourself.

Well.. if 'now' means on the bleeding edge non-released versions then yeah

Mason Protter (May 19 2022 at 00:32):

I'd recommend strongly against shipping packages that use Threads.@threads without a way to disable it

Chris Foster (May 19 2022 at 00:45):

if 'now' means on the bleeding edge non-released versions then yeah

1.6 is the LTS and it uses the task system?

Oh... waait a minute. No, it kind of uses it, but I was mistaken in exactly how. Looking closer, it actually subverts the scheduler and statically assigns tasks to threads.

In that case yeah, I totally agree.

mbaz (May 19 2022 at 01:21):

Thanks everyone for the pointers! I think I'll be conservative and disable threads by default, but use a keyword argument to allow users to enable it.

Mason Protter (May 19 2022 at 01:55):

Chris Foster said:

if 'now' means on the bleeding edge non-released versions then yeah

1.6 is the LTS and it uses the task system?

Oh... waait a minute. No, it kind of uses it, but I was mistaken in exactly how. Looking closer, it actually subverts the scheduler and statically assigns tasks to threads.

In that case yeah, I totally agree.

Yeah, there's a lot of confusion about this and it's bad. It wasn't until a couple months ago that the PR was merged to make it dynamic rather than static. Jeff was on the fence about it. My argument in favour of the dynamic scheduler was that many people right now are under the mistaken assumption that it is dynamic, so that's what we should provide by default.
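To make the static assignment concrete, here is a small sketch that records which thread ran each iteration under the explicit `:static` schedule. The function name `static_owners` is made up for illustration.

```julia
using Base.Threads: @threads, threadid

# Record the id of the thread that executed each iteration when the
# loop is run with the explicit static schedule.
function static_owners(n::Integer)
    owners = zeros(Int, n)
    @threads :static for i in 1:n
        owners[i] = threadid()
    end
    return owners
end
```

Calling `static_owners(12)` from the REPL with several threads should show each thread owning one contiguous block of iterations, which is the fixed split the dynamic schedule moves away from.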

Mason Protter (May 19 2022 at 02:18):

This is why I constantly tell people "don't use Base threads, just use ThreadsX.jl unless you know exactly what you're doing and why you need something different."

Sukera (May 19 2022 at 10:59):

I kind of hate that it took this long since 1.3 to get proper dynamic and nested threading...

Sukera (May 19 2022 at 11:01):

the initial multithreading blogpost was almost 3 years ago, and while the @spawn tasks were easily nestable & composable, @threads just wasn't

Sukera (May 19 2022 at 11:01):

https://julialang.org/blog/2019/07/multithreading/#how_to_use_it


Last updated: Oct 02 2023 at 04:34 UTC