I have been using Julia for the past 5 years, and I keep running into problems where my codebase suddenly takes an extremely long time to start up. I have started multiple projects where at one point or another this issue cropped up, and even had to abandon one - rewriting it in another language.
How is one supposed to debug this? What am I doing wrong? Which heuristics are getting derailed? If it is type instability, could the default behavior be to just give up after failing to infer for too long?
For a silly example, this is a work-in-progress package: https://github.com/quantumghent/PEPSKit.jl/tree/djeezus (you really need that branch)
using PEPSKit, TensorKit
peps = InfinitePEPS(2, 3);
leading_boundary(peps, CTMRG(truncdim(5), 1e-12))
takes an absolutely unusable amount of time to run
Do you know about https://github.com/timholy/SnoopCompile.jl ?
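(For reference, a minimal sketch of how such an inference profile could be collected with SnoopCompile on the example above, assuming the package and the djeezus branch are installed; this is not the exact invocation used later in the thread.)

using SnoopCompileCore
using PEPSKit, TensorKit            # load the packages first, so only the workload's inference is timed
tinf = @snoopi_deep begin           # records the full inference-time tree for this block
    peps = InfinitePEPS(2, 3)
    leading_boundary(peps, CTMRG(truncdim(5), 1e-12))
end
using SnoopCompile                  # load the analysis half only after measuring
flatten(tinf)                       # per-frame inference timings, most expensive last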
Also seems like that branch can't even be added by itself:
(jl_vbpKOh) pkg> add https://github.com/quantumghent/PEPSKit.jl#djeezus
Updating git-repo `https://github.com/quantumghent/PEPSKit.jl`
Resolving package versions...
ERROR: Unsatisfiable requirements detected for package MPSKit [bb1c41ca]:
MPSKit [bb1c41ca] log:
├─possible versions are: 0.1.0-0.7.0 or uninstalled
├─restricted to versions 0.7 by PEPSKit [52969e89], leaving only versions 0.7.0
│ └─PEPSKit [52969e89] log:
│ ├─possible versions are: 0.1.0 or uninstalled
│ └─PEPSKit [52969e89] is fixed to version 0.1.0
└─restricted by julia compatibility requirements to versions: 0.1.0-0.4.0 or uninstalled — no versions left
Is there a circular dependency between your packages?
ah, my julia version is too new
I have tried SnoopCompile in the past, but nothing obvious showed up. For example, I did not see particularly many method invalidations. That said, the package has apparently changed a lot, so I will try it again.
I can change the package (or remove the MPSKit dependency) if you want to get it to run
I was just trying to see where the time of your example was actually spent
(I'm now running snoopi_deep on it)
Is it package loading or runtime? If it's package loading, SnoopCompile can catch that. If it's runtime, you'll have to optimize your algorithms. If it's compile time, you can try to use precompile statements to reduce latency after package loading (though SnoopCompile can help find the right invocations for that)
so the first thing I usually do when trying to optimize performance is figuring out _where_ the time is spent in the first place - I usually start with runtime, since that's where people most often write suboptimal code
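(A rough way to separate those three buckets for the example above - a sketch, not the measurement actually used in this thread:)

@time using PEPSKit, TensorKit       # package loading time

peps = InfinitePEPS(2, 3)
alg = CTMRG(truncdim(5), 1e-12)
@time leading_boundary(peps, alg)    # first call: compile time + runtime (Julia >= 1.8 also reports "% compilation time")
@time leading_boundary(peps, alg)    # second call: essentially pure runtime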
Loading is near instantaneous, runtime is always good, compile time takes very long. In the past it has often been a single type instability (for example due to captured variables in closures) which takes the compile time from 30 seconds to 10 minutes.
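(As a generic toy illustration of the captured-variables problem mentioned above - not code from PEPSKit:)

# `x` is reassigned after the closure is created, so Julia boxes it and the
# captured value is inferred as `Any`, which can blow up inference time downstream
function captured_unstable()
    x = 1
    f = () -> x + 1      # `x` ends up in a Core.Box
    x = 2.0
    return f()
end

# typical fix: rebind with `let` so the captured variable is never reassigned
function captured_stable()
    x = 1
    f = let x = x
        () -> x + 1      # captures a fixed binding, inference stays concrete
    end
    x = 2.0
    return f()
end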
Where does leading_boundary come from? I can't seem to find it in either PEPSKit or TensorKit
have you run JET.jl on your code, to find those type instabilities more easily?
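(For the record, a minimal sketch of what such a JET check could look like on the example above, assuming JET.jl is installed:)

using JET, PEPSKit, TensorKit

peps = InfinitePEPS(2, 3)
# reports runtime-dispatch sites and failed inference along this call path
@report_opt leading_boundary(peps, CTMRG(truncdim(5), 1e-12))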
Sukera said:
Where does leading_boundary come from? I can't seem to find it in either PEPSKit or TensorKit
It's defined in MPSKit, extended in PEPSKit for the particular case that is being called (in src/algorithms)
Sukera said:
have you run JET.jl on your code, to find those type instabilities more easily?
I haven't tried JET on this code yet. I tried it in the past, and then Julia segfaulted.
SnoopCompile results are in: InferenceTimingNode: 0.304494/237.488271 on Core.Compiler.Timings.ROOT() with 4 direct children
yep, sounds like A LOT of specialization is happening for what you're doing
you may want to look into what exactly is eating up all that time
but precompile statements will certainly help. it'll mostly shift the time to package precompilation time, but people usually feel better about it being spent there :shrug:
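(As a generic sketch of what such a precompile statement looks like inside a package - illustrative types, not PEPSKit's actual signatures:)

module MyToyPkg

struct State
    data::Matrix{Float64}
end

expensive_step(s::State) = sum(abs2, s.data)

# compile this specialization at package-precompilation time instead of on first call
precompile(expensive_step, (State,))

end # module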
in your specific case, introducing some function barriers may just help enough already
not everything in your big left_move function, for example, relies on PType, right?
like the inner hot loops for example
the example in the docs isn't a perfect fit for your code, but I think https://docs.julialang.org/en/v1/manual/performance-tips/#kernel-functions may apply
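(A generic sketch of that kernel-function / function-barrier pattern - the names here are made up, not PEPSKit's actual left_move code:)

# the outer function can stay loosely typed and clunky; keep the hot work out of it
function outer_move(env, tensors)
    for t in tensors
        env = absorb(env, t)        # function barrier: one dynamic dispatch per iteration
    end
    return env
end

# the kernel sees `t` with a concrete type, so its body is cheap to infer and compile
absorb(env::AbstractMatrix, t::AbstractMatrix) = env .+ t * transpose(t)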
I still think there's an issue here. The code I showed does not run - it's work in progress - and takes forever to start running. Yet if I now simply complete the algorithm (implement a few extra methods, without changing the existing code) to make it run successfully, then the actual startup cost goes from 200+ seconds to like 30 seconds.
I will try to further reduce startup time with function barriers (because left_move is indeed rather clunky), but I will also try to construct a minimal example to illustrate this long-startup-time behavior.
very interesting
when you say you had to implement extra methods to make it work, what were those methods for?
is it possible you hit some fallback from one of your dependencies, which could have resulted in undue specialization?
For finding compilation time, you can also use the normal profiler with StatProfilerHTML. That works great too and is easier to use, IMO.
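(A minimal sketch of that approach, assuming StatProfilerHTML is installed; the resulting flame graph then shows how much of the first call is spent inside the compiler:)

using StatProfilerHTML
using PEPSKit, TensorKit

peps = InfinitePEPS(2, 3)
# profile the first (compiling) call and write an HTML flame-graph report
@profilehtml leading_boundary(peps, CTMRG(truncdim(5), 1e-12))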
Adding precompile directives can help, but not for overspecialization like seems to be the case here? Then you would need to run the precompile directives in a loop.
Maarten, make sure you're running the newest Julia version possible. The compiler has gotten a lot better, especially in Julia 1.8.