I've heard that Enzyme.jl supports array mutation and I'm trying to refine my understanding of what that means. The Enzyme docs get fairly dense pretty quickly.
And to hit the XY problem on the head, I'm trying to do this in the context of some simulations which add/remove elements from arrays and trying to see if I can get the benefit of AD in solving for parameters. Here's a contrived example, which Enzyme fails on with the error below:
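function f(cashflows, r)
    assets = Float64[]
    for cf in cashflows            # iterate over the argument (the original loop referenced an undefined `cfs`)
        push!(assets, cf)          # push one cashflow at a time, not the whole vector
        assets .*= (1 + r)         # accrue one period of interest on everything held so far
    end
    sum(assets)
end
gradient(Forward, x -> f([1, 1, 1, 1], x), 0.05)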
errors with:
MethodError: no method matching similar(::Float64)
So I think my questions are:
This example with the tmp seems overly contrived; I'm not sure where I'd see that type of thing in practice?
All package code does that because it's the most performant way to do things. Not sure it's contrived, given that passing in a cache struct is one of the most common coding patterns :sweat_smile:. Normally you'd put it at the front and mark the function name with !, and normally you'd also use the whole thing, but this example is showing some of the edge cases associated with mutation and how to handle them, so it's a bit more complicated than the standard case.
XY problem question: Any experience doing simulations of assets and liabilities and getting AD to work on the end-to-end simulation? (I don't really expect answers here)
In terms of SDEs, yes quite a bit.
What type of mutations does enzyme support and not support?
I'm not sure there are any that aren't generally supported? Mutable structs, arrays, etc. are all fine. Memory might not be supported yet, but that's because it's not even released yet :shrug:; it should get support when it's released. Mutation is actually the easy thing for Enzyme. Allocation (i.e. handling GC) is harder.
In fact with Enzyme, generally if you're having issues the best thing to do is to remove allocations and rely on mutation more, given that the thing it trips on the most is autodiff of the GC.
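To make "remove allocations and rely on mutation" concrete, here is a minimal plain-Julia sketch of the same accrual step written both ways. The names accrue and accrue! are made up for illustration and are not from Enzyme or from the thread; nothing here calls Enzyme.

```julia
# Allocating style: every call builds a fresh vector for the result.
accrue(assets, r) = assets .* (1 + r)

# Mutating style: the caller owns the buffer and the function overwrites it in place.
function accrue!(assets, r)
    assets .*= (1 + r)
    return assets
end

a = [100.0, 200.0]
b = accrue(a, 0.05)    # new vector; `a` is untouched at this point
accrue!(a, 0.05)       # `a` itself is updated, no new vector is created
```

Both versions compute the same numbers; the second just keeps the garbage collector out of the hot path, which is the part the reply above says Enzyme trips on most.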
Your problem there is just that you wrote down a description that doesn't make mathematical sense. The gradient isn't defined on a scalar function. The gradient is a vector-function operation. Yes it's homeomorphic to the operation defined on a vector of one thing, but that has a very different computational representation. So you should be asking for the derivative if you want the derivative, not asking for the gradient if you want the derivative.
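For the example in this thread the distinction can be written out by hand. This is only a worked sketch, assuming unit cashflows [1, 1, 1, 1] and that each cashflow is pushed once per loop step, so the function collapses to a polynomial in the single scalar r:

```latex
% Assumption: four unit cashflows, each accrued once per remaining step.
f(r) = \sum_{k=1}^{4} (1+r)^{k}
\qquad\Longrightarrow\qquad
\frac{df}{dr} = \sum_{k=1}^{4} k\,(1+r)^{k-1}

\left.\frac{df}{dr}\right|_{r=0.05} = 1 + 2(1.05) + 3(1.05)^2 + 4(1.05)^3 = 11.038

% A gradient, by contrast, collects the partial derivatives of a function of a vector:
g : \mathbb{R}^n \to \mathbb{R}, \qquad
\nabla g(x) = \left( \frac{\partial g}{\partial x_1}, \dots, \frac{\partial g}{\partial x_n} \right)
```

So with a scalar r the quantity being asked for is the single number df/dr, which is why requesting a gradient at 0.05 ends up calling similar on a Float64.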
I think it makes more sense... a way to read this:
Activity of temporary storage. If you pass in any temporary storage which may be involved in an active computation to a function you want to differentiate, you must also pass in a duplicated temporary storage for use in computing the derivatives.
Is that you need to avoid allocating memory inside the function, and the tmp array is really just the usual pattern for pre-allocating some storage for use in the function.
You are right that the lack of ! and not putting the temporary container as the first argument threw me off here.
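I was able to get my example to work the way I intended with:
function f(cfs, assets, r)
    for i in eachindex(cfs)
        push!(assets, cfs[i])      # add the new cashflow to the caller-supplied buffer
        assets .*= (1 + r)         # accrue one period of interest on everything held so far
    end
    sum(assets)
end
# Start the buffer and its shadow empty: an `undef` vector would feed uninitialized
# memory into the sum, and a reverse-mode shadow should not start from garbage values.
Enzyme.autodiff(Reverse, f, Const(fill(1., 4)), Duplicated(Float64[], Float64[]), Active(0.05))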
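For anyone wanting to sanity-check whatever number AD returns here, below is a plain-Julia sketch that does not call Enzyme. It rewrites the function in the conventional form discussed above (buffer first, name ending in !) and compares a central finite difference against the closed-form derivative. The names f!, dfdr_exact and dfdr_fd, the empty! call, and the step size h are my own assumptions for illustration, not something from the thread or from Enzyme.

```julia
# Hypothetical rewrite using the `!` convention: the mutated buffer comes first.
function f!(assets, cfs, r)
    empty!(assets)              # assumption: reuse the caller's storage from a clean state
    for cf in cfs
        push!(assets, cf)       # add the new cashflow
        assets .*= (1 + r)      # accrue one period of interest on everything held
    end
    return sum(assets)
end

# Closed form for n unit cashflows: f(r) = sum_{k=1}^{n} (1+r)^k,
# so df/dr = sum_{k=1}^{n} k * (1+r)^(k-1).
dfdr_exact(n, r) = sum(k * (1 + r)^(k - 1) for k in 1:n)

# Central finite difference as an AD-independent reference value.
function dfdr_fd(cfs, r; h = 1e-6)
    buf = Float64[]             # one buffer, reused for both evaluations
    return (f!(buf, cfs, r + h) - f!(buf, cfs, r - h)) / (2h)
end

cfs = fill(1.0, 4)
println("closed form:       ", dfdr_exact(4, 0.05))
println("finite difference: ", dfdr_fd(cfs, 0.05))
```

The two values should agree to several digits (about 11.038 for four unit cashflows at r = 0.05), which gives a target to compare the r component of the autodiff result against.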
Last updated: Nov 06 2024 at 04:40 UTC