Stream: helpdesk (published)

Topic: julia memory usage


view this post on Zulip Maarten (Aug 28 2021 at 15:31):

I have a julia script that somehow requires 60+ gb of memory. However, running

varinfo(imported=true,sortby=:size,all=true)

shows something which adds up to 100mb. If I turn recursive=true on in varinfo, the process allocates an extra 40gb and runs out of memory...

How can I find out how this memory is wasted? Are there any "usual suspects"?

view this post on Zulip Chad Scherrer (Aug 28 2021 at 17:44):

I'd guess you may have lots of intermediate allocations that are freed, but the garbage collector isn't keeping up. You could try @allocated on different pieces of the code, or annotate the code with TimerOutputs.jl
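A minimal sketch of the `@allocated` suggestion, measuring two stages separately. The functions `build_chunks` and `reduce_chunks` are hypothetical stand-ins for pieces of a larger script, not code from this thread.

```julia
# Hypothetical stage 1: allocates n fresh vectors of 1000 Float64 values.
build_chunks(n) = [rand(1000) for _ in 1:n]

# Hypothetical stage 2: reduces the chunks to a single number.
reduce_chunks(chunks) = sum(sum, chunks)

# Warm up first so that compilation is not counted in the measurements.
reduce_chunks(build_chunks(1))

# @allocated returns the number of bytes allocated while evaluating the expression.
bytes_build = @allocated build_chunks(100)

chunks = build_chunks(100)
bytes_reduce = @allocated reduce_chunks(chunks)

println("build_chunks:  ", bytes_build, " bytes")
println("reduce_chunks: ", bytes_reduce, " bytes")
```

The stage with the larger number is the one worth looking at first. TimerOutputs.jl reports similar per-section allocation totals for code annotated with its macros.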

view this post on Zulip Maarten (Aug 28 2021 at 18:00):

I'm not sure, if I run just half of the script and get back repl access, then after GC.gc(), there shouldn't be any intermediary variables right? yet if I do this, the memory usage remains ridiculously high. Nothing is running, and the variables I see in varinfo are quite modest in size.

view this post on Zulip Maarten (Aug 28 2021 at 18:02):

it's also bugging me that varinfo goes and takes up 40+ gigabytes before crashing.

view this post on Zulip Jakob Nybo Nissen (Aug 29 2021 at 08:00):

Even if no variables are leaked into the REPL? Generated code is not GCd, do you by any chance do a lot of codegen?

In either case, it would be nice if you could share the script as a MWE of a potential GC bug

view this post on Zulip Maarten (Aug 29 2021 at 08:10):

at the moment it's not very minimal. There is a bit of generated code; can I check the size of the compiled code? There are a few variables leaked into the repl, but those I can see with varinfo, and they amount to 100mb. I can try to reduce the script a bit, and then share it.

view this post on Zulip Maarten (Aug 29 2021 at 09:01):

I modified varinfo to spit out the intermediary results and I see lines like:

MPSKit.MPSKit.LinearAlgebra.MPSKit.MPSKit.LinearAlgebra.BLAS.MPSKit.MPSKit.LinearAlgebra.MPSKit.MPSKit.LinearAlgebra.BLAS.LinearAlgebra.*

which implies that it revisits modules that it already visited, so it's pretty bugged.

view this post on Zulip Fredrik Bagge Carlson (Aug 29 2021 at 10:05):

Is the script performing disk IO? If so, is your issue similar to https://github.com/JuliaData/CSV.jl/issues/850?

view this post on Zulip Maarten (Aug 30 2021 at 11:37):

a tiny bit of disk io, so it's not similar

view this post on Zulip Maarten (Aug 30 2021 at 11:38):

I wrote a dirty-patched version of varinfo (derp.txt), and the total memory usage known about by julia is estimated at 600mb

view this post on Zulip c42f (Sep 01 2021 at 01:38):

MPSKit.MPSKit.LinearAlgebra.MPSKit

This does seem troubling — how does the binding LinearAlgebra.MPSKit come to exist?

view this post on Zulip Maarten (Sep 01 2021 at 05:22):

That turned out to be a problem in varinfo itself, which I think I have patched locally. However, I'm not entirely sure how the function was meant to work (why did it filter out Base,Main and Core), so I'm not sure if I should open a pull request

view this post on Zulip Maarten (Sep 01 2021 at 05:24):

there was also a "mistake" in base, summarysize on bitarrays failed as size appears undefined

view this post on Zulip Maarten (Sep 01 2021 at 05:40):

whut

julia> total = 0; for (k,v) in SUNRepresentations.CGCCACHE[(3,Float64)]
       total+=Base.summarysize(v)
       end

julia> total
2184221472

julia> Base.summarysize(SUNRepresentations.CGCCACHE[(3,Float64)])
463032

view this post on Zulip Maarten (Sep 01 2021 at 05:42):

I understand summarysize is a lower bound, but this seems a bit too crazy
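For context, there is one benign way the per-entry sum can exceed the size of the whole container: `Base.summarysize` counts each reachable object once, so values shared between entries are counted once for the container but once per entry when summed separately. The sketch below uses made-up data to show that effect; the thread does not establish that aliasing played any role in the numbers above.

```julia
# One 1 MB buffer referenced by every entry of the dictionary (made-up data).
shared = zeros(UInt8, 10^6)
d = Dict(i => shared for i in 1:10)

# Measuring each value separately counts the shared buffer ten times.
per_entry = sum(Base.summarysize(v) for v in values(d))

# Measuring the container counts the shared buffer only once.
whole = Base.summarysize(d)

println("sum over entries: ", per_entry, " bytes")
println("whole dictionary: ", whole, " bytes")
```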

view this post on Zulip Maarten (Sep 01 2021 at 06:09):

aha, perhaps this is related https://github.com/JuliaLang/julia/issues/41941

view this post on Zulip c42f (Sep 03 2021 at 02:41):

This does seem related and it was fixed in nightly already - https://github.com/JuliaLang/julia/pull/41492
Though I'm not sure that will fix all your varinfo problems here.

view this post on Zulip Maarten (Sep 03 2021 at 07:09):

indeed it was! I changed varinfo a bit, I'm not sure if I should open a pull request as it does things differently from the other varinfo, but in combination with the latest julia it now shows nicely where the memory usage is coming from! :D

view this post on Zulip Maarten (Sep 03 2021 at 09:48):

the varinfo thing has apparently also been fixed https://github.com/JuliaLang/julia/pull/42061

view this post on Zulip Jameson Nash (Sep 09 2021 at 05:12):

why did it filter out Base,Main and Core

I don't know either (author of that fix here), but I did give it a lovely fake name in my patch :)

view this post on Zulip Maarten (Sep 09 2021 at 15:55):

thanks a lot for the fix, the code is now also much nicer to read :) we also resolved the underlying issue in our own code, and now everything works again with much less memory usage

view this post on Zulip Maarten (Aug 24 2022 at 00:53):

I'm on julia 1.8.0 and am running into the same issue once again. Julia takes up 34.1G, the output of varinfo is

julia> varinfo(all=true,minsize=1024,imported=true,sortby=:size)
  name                    size summary
  –––––––––––––––– ––––––––––– –––––––––––––––––––––––––––––––
  Base                         Module
  Core                         Module
  Main                         Module
  TensorKit        122.409 MiB Module
  LinearAlgebra      4.727 MiB Module
  JLD2               3.486 MiB Module
  MPSKit             2.533 MiB Module
  TensorOperations   2.154 MiB Module
  KrylovKit          1.668 MiB Module
  Test               1.205 MiB Module
  Printf             1.092 MiB Module
  OptimKit         973.529 KiB Module
  DelimitedFiles   957.035 KiB Module
  Parameters       952.041 KiB Module
  FastClosures     879.606 KiB Module
  InteractiveUtils 326.286 KiB Module
  err                4.919 KiB 1-element Base.ExceptionStack
  ##meta#58          3.129 KiB IdDict{Any, Any} with 2 entries
  ans                1.469 KiB Markdown.MD

How do I even start debugging such a thing? The script takes a while to run before reaching this level of memory usage, so it's not easy to make a minimal working example

view this post on Zulip Sukera (Aug 24 2022 at 07:36):

have you done any allocation/performance optimization on your code? Are you doing everything in global scope or are you working in functions?

view this post on Zulip Michael Fiano (Aug 24 2022 at 07:37):

Julia 1.8.0 will use a lot more memory for LinearAlgebra

view this post on Zulip Michael Fiano (Aug 24 2022 at 07:45):

(but it should be faster)

view this post on Zulip Maarten (Aug 24 2022 at 07:58):

Sukera said:

have you done any allocation/performance optimization on your code? Are you doing everything in global scope or are you working in functions?

I have not - the code heavily relies on krylov eigenvalue solvers which temporarily may allocate a lot (it's pretty much all you see if you run pprof, it's the dominating contribution). However, after the simulation is done this shouldn't matter - GC.gc() should clear it up. The simulation uses functions.

I was hoping that whatever it will turn out to be would turn up in varinfo(all=true,imported=true,recursive=true)

view this post on Zulip Maarten (Aug 24 2022 at 07:59):

Michael Fiano said:

Julia 1.8.0 will use a lot more memory for LinearAlgebra

What exactly changed there? Varinfo claims it doesn't really use too much:

julia> varinfo(LinearAlgebra,all=true,minsize=1024,imported=true,sortby=:size,recursive=true)
  name                             size summary
  ––––––––––––––––––––––––– ––––––––––– –––––––––––––––––––––––––––––––––
  Broadcast                   6.830 MiB Module
  LinearAlgebra               4.682 MiB Module
  BLAS.LinearAlgebra          4.682 MiB Module
  BLAS                        1.216 MiB Module
  BLAS.BLAS                   1.216 MiB Module
  LAPACK                    997.065 KiB Module
  LAPACK.LAPACK             997.065 KiB Module
  ##meta#58                 359.730 KiB IdDict{Any, Any} with 174 entries
  LAPACK.##meta#58          131.451 KiB IdDict{Any, Any} with 90 entries
  BLAS.##meta#58             82.208 KiB IdDict{Any, Any} with 56 entries
  Libdl                       2.386 KiB Module
  BlasHessenbergQ             1.297 KiB UnionAll
  StridedMaybeAdjOrTransMat   1.281 KiB UnionAll
  AdjOrTransStridedMat        1.094 KiB UnionAll

view this post on Zulip Sukera (Aug 24 2022 at 07:59):

it's possible that you're running into the GC not returning the memory to the system after it was allocated
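One rough way to check this hypothesis is to compare what the garbage collector considers live with what the operating system reports for the process. The sketch below is only a hint, not a precise measurement: `Sys.maxrss()` is the peak resident set size rather than the current one, and `Base.gc_live_bytes()` is an internal, unexported function.

```julia
# Run a full collection so that unreachable objects are not counted as live.
GC.gc()

# Bytes the garbage collector currently considers live (internal Base function).
live = Base.gc_live_bytes()

# Peak resident set size of this process, as reported by the OS.
peak = Sys.maxrss()

println("GC live bytes: ", round(live / 2^20; digits=1), " MiB")
println("peak RSS:      ", round(peak / 2^20; digits=1), " MiB")
```

A live size far below the resident size would be consistent with freed memory still being held by the process.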

view this post on Zulip Sukera (Aug 24 2022 at 08:00):

shouldn't matter too much, as it's all virtual memory

view this post on Zulip Sukera (Aug 24 2022 at 08:00):

gimme a sec, there was an issue about this...

view this post on Zulip Michael Fiano (Aug 24 2022 at 08:00):

A lot more BLAS threads by default

view this post on Zulip Michael Fiano (Aug 24 2022 at 08:01):

Tune it with BLAS.set_num_threads() if you need to

view this post on Zulip Maarten (Aug 24 2022 at 08:02):

I set the blas threads to 1, as they conflict with julia multithreading
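Pinning BLAS to a single thread, as described above, could look like the following sketch. Reading the previous value back first is an addition for illustration only.

```julia
using LinearAlgebra

# Number of threads the BLAS library was using before the change.
previous = BLAS.get_num_threads()

# Use one BLAS thread so that it does not compete with Julia's own threads.
BLAS.set_num_threads(1)

println("BLAS threads: ", previous, " -> ", BLAS.get_num_threads())
println("Julia threads: ", Threads.nthreads())
```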

view this post on Zulip Sukera (Aug 24 2022 at 08:03):

there we go https://github.com/JuliaLang/julia/issues/30653

view this post on Zulip Sukera (Aug 24 2022 at 08:04):

and now that I read that issue, this one as well https://github.com/JuliaLang/julia/issues/42566

view this post on Zulip Maarten (Aug 24 2022 at 08:07):

Sukera said:

there we go https://github.com/JuliaLang/julia/issues/30653

thanks a ton!!!!

ccall(:malloc_trim, Cvoid, (Cint,), 0)

clears it right up! So it's nothing inherently wrong in my code, I'm perfectly fine with julia not returning that memory to the os
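A slightly more defensive sketch of the same call, using the types from the C prototype `int malloc_trim(size_t pad)`. `malloc_trim` is a glibc extension, so the function name `release_freed_memory`, the Linux guard, and the `try` block are assumptions added here for systems where the symbol is missing.

```julia
# Hypothetical helper: collect garbage, then ask glibc to hand freed memory back to the OS.
function release_freed_memory()
    GC.gc()
    # malloc_trim only exists in glibc, so skip it on other platforms.
    Sys.islinux() || return nothing
    try
        # Returns 1 if memory was actually released to the system, 0 otherwise.
        return ccall(:malloc_trim, Cint, (Csize_t,), 0)
    catch
        # Symbol not found, for example with a different libc.
        return nothing
    end
end

result = release_freed_memory()
println("malloc_trim result: ", result)
```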

view this post on Zulip Michael Fiano (Aug 24 2022 at 08:08):

I don't think that would be Julia

view this post on Zulip Michael Fiano (Aug 24 2022 at 08:08):

That would be your libc implementation

view this post on Zulip Sukera (Aug 24 2022 at 08:14):

indeed

view this post on Zulip Sukera (Aug 24 2022 at 08:15):

julia has already called free on that memory, malloc just then was like "nah I don't need it yet, you keep it" in its internal table

view this post on Zulip Sukera (Aug 24 2022 at 08:17):

what lots of people don't know is that malloc is not just a simple table mapping pointers to ranges of memory, but a somewhat complex system for managing allocations systemwide.. and it's optimized for returning fast, so if you have another metric to optimize for, it may not be optimal

view this post on Zulip Maarten (Aug 24 2022 at 09:35):

there's still something weird going on - no other threads are running, yet I see a fluctuating summarysize. Probably nothing to worry about?

julia> for i in 1:10; @show Base.summarysize(envs)/(1024*1024*1024); end;
Base.summarysize(envs) / (1024 * 1024 * 1024) = 0.6670088469982147
Base.summarysize(envs) / (1024 * 1024 * 1024) = 3.7884459123015404
Base.summarysize(envs) / (1024 * 1024 * 1024) = 2.5368779748678207
Base.summarysize(envs) / (1024 * 1024 * 1024) = 0.6451764479279518
Base.summarysize(envs) / (1024 * 1024 * 1024) = 2.2040650248527527
Base.summarysize(envs) / (1024 * 1024 * 1024) = 2.928450770676136
Base.summarysize(envs) / (1024 * 1024 * 1024) = 0.72379120439291
Base.summarysize(envs) / (1024 * 1024 * 1024) = 0.8031216561794281
Base.summarysize(envs) / (1024 * 1024 * 1024) = 3.0689545273780823
Base.summarysize(envs) / (1024 * 1024 * 1024) = 2.5368198826909065

Last updated: Nov 06 2024 at 04:40 UTC