Stream: helpdesk (published)

Topic: excessive startup cost


view this post on Zulip Maarten (Jun 20 2022 at 13:15):

I have been using Julia for the past 5 years, and I keep running into problems where my codebase suddenly takes an excessive amount of time to start up. I have started multiple projects where this issue cropped up at one point or another, and I even had to abandon one and rewrite it in another language.

How is one supposed to debug this? What am I doing wrong? Which heuristics are getting derailed? If it is type instability, can we make the default behavior to just give up after failing to infer for too long?

For a silly example, this is a work-in-progress package: https://github.com/quantumghent/PEPSKit.jl/tree/djeezus (you really need that branch)

using PEPSKit,TensorKit
peps = InfinitePEPS(2,3);
leading_boundary(peps,CTMRG(truncdim(5),1e-12))

takes an absolutely unusable amount of time to run

view this post on Zulip Sukera (Jun 20 2022 at 13:17):

Do you know about https://github.com/timholy/SnoopCompile.jl ?

view this post on Zulip Sukera (Jun 20 2022 at 13:20):

Also seems like that branch can't even be added by itself:

(jl_vbpKOh) pkg> add https://github.com/quantumghent/PEPSKit.jl#djeezus
    Updating git-repo `https://github.com/quantumghent/PEPSKit.jl`
   Resolving package versions...
ERROR: Unsatisfiable requirements detected for package MPSKit [bb1c41ca]:
 MPSKit [bb1c41ca] log:
 ├─possible versions are: 0.1.0-0.7.0 or uninstalled
 ├─restricted to versions 0.7 by PEPSKit [52969e89], leaving only versions 0.7.0
  └─PEPSKit [52969e89] log:
    ├─possible versions are: 0.1.0 or uninstalled
    └─PEPSKit [52969e89] is fixed to version 0.1.0
 └─restricted by julia compatibility requirements to versions: 0.1.0-0.4.0 or uninstalled  no versions left

Is there a circular dependency between your packages?

view this post on Zulip Sukera (Jun 20 2022 at 13:21):

ah, my julia version is too new

view this post on Zulip Maarten (Jun 20 2022 at 13:36):

I have tried SnoopCompile in the past, but nothing obvious showed up. For example, I did not see particularly many method invalidations. That said, the package has apparently changed a lot, so I will try it again.

I can change the package (or remove the mpskit dependency) if you want to get it to run

view this post on Zulip Sukera (Jun 20 2022 at 13:38):

I was just trying to see where the time of your example was actually spent

view this post on Zulip Maarten (Jun 20 2022 at 13:39):

(I'm now running snoopi_deep on it)

view this post on Zulip Sukera (Jun 20 2022 at 13:39):

is it package loading or runtime? If it's package loading, SnoopCompile can catch that. If it's runtime, you'll have to optimize your algorithms. If it's compile time, you can try to use precompile statements to reduce latency after package loading (though SnoopCompile can help find the right invocations for that)

view this post on Zulip Sukera (Jun 20 2022 at 13:40):

so the first thing I usually do when trying to optimize performance is figuring out _where_ the time is spent in the first place - I usually start with runtime, since that's where people most often write suboptimal code
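A minimal sketch of how the three costs mentioned above (load time, compile time, runtime) can be separated with nothing but Base. The function `work` is a hypothetical stand-in for an expensive entry point such as `leading_boundary`; it is not taken from PEPSKit.

```julia
# Load time would be measured at top level, for example:
#   @time using SomePackage      # SomePackage is a placeholder name

# Hypothetical stand-in for the real entry point.
work(x) = sum(abs2, x) / length(x)

x = rand(1000)

# The first call compiles `work` for Vector{Float64} and then runs it.
t_first = @elapsed work(x)

# The second call reuses the compiled code, so this is mostly runtime.
t_second = @elapsed work(x)

println("first call (compile + run): ", t_first, " s")
println("second call (run only):     ", t_second, " s")
```

If the first call is far slower than the second, the time is going into compilation rather than the algorithm. On recent Julia versions `@time` also prints the share of time spent compiling, which gives the same information in one line.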

view this post on Zulip Maarten (Jun 20 2022 at 13:41):

Loading is near instantaneous, runtime is always good, compile time takes very long. In the past it has often been a single type instability (for example due to captured variables in closures) which takes the compile time from 30 seconds to 10 minutes.
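To illustrate the kind of closure-related instability mentioned here, the sketch below shows a captured variable that is reassigned inside a closure, plus one common workaround. The function names are made up for the example and are not from the package under discussion.

```julia
# A captured variable that is reassigned inside a closure gets boxed
# (stored in a Core.Box), so inference loses track of its type.
function count_boxed(k)
    n = 0
    foreach(_ -> (n += 1), 1:k)   # assignment to `n` inside the closure
    return n
end

# Workaround: capture a Ref with a concrete element type. The binding `n`
# is never reassigned, only its contents are mutated.
function count_ref(k)
    n = Ref(0)
    foreach(_ -> (n[] += 1), 1:k)
    return n[]
end

# Compare the inferred return types; the boxed version typically infers as Any.
println(Base.return_types(count_boxed, (Int,)))
println(Base.return_types(count_ref, (Int,)))
```

Both functions return the same value; only the inferred types differ. `@code_warntype` from InteractiveUtils shows the same information in more detail.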

view this post on Zulip Sukera (Jun 20 2022 at 13:42):

where does leading_boundary come from? I can't seem to find it in either PEPSKit or TensorKit

view this post on Zulip Sukera (Jun 20 2022 at 13:43):

have you run JET.jl on your code, to find those type instabilities more easily?

view this post on Zulip Maarten (Jun 20 2022 at 13:45):

Sukera said:

where does leading_boundary come from? I can't seem to find it in either PEPSKit or TensorKit

It's defined in MPSKit, extended in PEPSKit for the particular case that is being called (in src/algorithms)

view this post on Zulip Maarten (Jun 20 2022 at 13:46):

Sukera said:

have you run JET.jl on your code, to find those type instabilities more easily?

Haven't tried JET on this code yet. I tried it in the past, and then Julia segfaulted.

view this post on Zulip Maarten (Jun 20 2022 at 13:48):

snoopcompile results are in:

InferenceTimingNode: 0.304494/237.488271 on Core.Compiler.Timings.ROOT() with 4 direct children

view this post on Zulip Sukera (Jun 20 2022 at 13:51):

yep, sounds like A LOT of specialization is happening for what you're doing

view this post on Zulip Sukera (Jun 20 2022 at 13:52):

you may want to look into what exactly is eating up all that time

view this post on Zulip Sukera (Jun 20 2022 at 13:52):

but precompile statements will certainly help. it'll mostly shift the time to package install time, but people usually feel better about it being spent there :shrug:
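A rough sketch of what such precompile statements can look like, using only `Base.precompile`. The module `LatencyDemo` and the function `normalize_rows!` are hypothetical names chosen for this example. In a real package the directives sit at the end of the module so that the compiled code is cached when the package is precompiled; in a script like this one they simply compile the methods right away.

```julia
module LatencyDemo   # hypothetical package module

export normalize_rows!

# Scale every row of A in place so that it sums to one.
function normalize_rows!(A::AbstractMatrix)
    for i in axes(A, 1)
        row = view(A, i, :)
        row ./= sum(row)
    end
    return A
end

# One directive per concrete signature that users are expected to hit.
# When several element types matter, the directives can be emitted in a loop.
for T in (Float32, Float64)
    precompile(normalize_rows!, (Matrix{T},))
end

end # module

using .LatencyDemo

A = normalize_rows!([1.0 3.0; 2.0 2.0])
println(A)
```

`precompile` returns `true` when it was able to compile the requested signature, which makes it easy to check that a directive has not gone stale after a refactor.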

view this post on Zulip Sukera (Jun 20 2022 at 13:55):

in your specific case, introducing some function barriers may just help enough already

view this post on Zulip Sukera (Jun 20 2022 at 13:56):

not everything in your big left_move function for example relies on PType, right?

view this post on Zulip Sukera (Jun 20 2022 at 13:56):

like the inner hot loops for example

view this post on Zulip Sukera (Jun 20 2022 at 13:57):

not a perfect fit from the example in the docs/your code, but I think https://docs.julialang.org/en/v1/manual/performance-tips/#kernel-functions may apply
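For reference, a small sketch of the function-barrier pattern from that section of the manual, lightly adapted. The names are illustrative only and have nothing to do with `left_move` or `PType`; the point is just that the hot loop lives in its own small function.

```julia
# Kernel: the hot loop lives in its own function, so it is compiled once per
# concrete array type, independently of whatever the caller is doing.
function fill_twos!(a)
    for i in eachindex(a)
        a[i] = 2
    end
    return a
end

# Outer function: the element type is only known at run time here, so this
# body is type-unstable. The call to the kernel acts as the function barrier.
function strange_twos(n)
    T = rand(Bool) ? Int64 : Float64
    a = Vector{T}(undef, n)
    return fill_twos!(a)
end

a = strange_twos(4)
println(typeof(a), " ", a)
```

Applied to a large function, the same idea means moving inner loops that do not depend on every type parameter of the caller into small helpers, so that they are not recompiled for each new combination of caller types.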

view this post on Zulip Maarten (Jun 20 2022 at 14:37):

I still think there's an issue here. The code I showed does not run (it's work in progress) and takes forever to start running. Yet if I now simply complete the algorithm (implement a few extra methods, without changing the existing code) so that it runs successfully, the actual startup cost goes from 200+ seconds to around 30 seconds.

I will try to further reduce startup time with function barriers (because left_move is indeed rather clunky), but I will also try to construct a minimal example that illustrates this long startup time.

view this post on Zulip Sukera (Jun 20 2022 at 15:11):

very interesting

view this post on Zulip Sukera (Jun 20 2022 at 15:12):

when you say you had to implement extra methods to make it work, what were those methods for?

view this post on Zulip Sukera (Jun 20 2022 at 15:12):

is it possible you hit some fallback from one of your dependencies, which could have resulted in undue specialization?

view this post on Zulip Rik Huijzer (Jun 20 2022 at 18:35):

For finding compilation time, you can also use the normal profiler with StatProfilerHTML. That works great too and is easier to use IMO
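A minimal sketch of that approach using only the `Profile` standard library. The workload `work` is a hypothetical placeholder. Profiling the first call means the samples cover compilation as well as execution, and an HTML viewer such as StatProfilerHTML can then be used to browse the collected data instead of the plain-text report.

```julia
using Profile

# Hypothetical workload; any not-yet-compiled call can be used instead.
work(n) = sum(sqrt(abs(sin(i))) for i in 1:n)

Profile.clear()

# Profile the *first* call so that compilation is part of what gets sampled.
@profile work(10^6)

# C = true also shows frames from the Julia runtime and compiler, which is
# where time spent on inference and code generation appears.
Profile.print(C = true, mincount = 5)
```

For a call this cheap the report may contain very few samples; with a slow-to-compile entry point the compiler frames tend to dominate the output.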

view this post on Zulip Rik Huijzer (Jun 20 2022 at 18:37):

Adding precompile directives can help, but maybe not for overspecialisation, which seems to be the case here? Then you would need to run precompile directives in a loop

view this post on Zulip Rik Huijzer (Jun 20 2022 at 18:38):

Maarten, make sure you’re running the newest Julia version possible. The compiler has gotten a lot better, especially in Julia 1.8


Last updated: Oct 02 2023 at 04:34 UTC