Is there any way to detect or get a callback when the world age is advanced?
Context is that I'm generating struct definitions from serialized data. I was previously using Dicts
but users complained it was too slow. It is workable as is, with the caveat that the user is responsible for either returning to top level or using Base.invokelatest, but the former is not very user friendly and the latter has significant overhead (e.g. 100 ns becomes 300 ns).
Ideally I would like to either hide the defined structs until the world age advances (or until some other trigger indicates that they are available) or, preferably, have some hook which removes Base.invokelatest once the world age has advanced.
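For readers who have not run into this before, here is a minimal sketch of the problem being described. The helper `define_point_type` and the struct names are made up for illustration and are not from the package in question. A struct defined with eval inside a function cannot be constructed by a direct call from that same function, while Base.invokelatest works:

```julia
# Hypothetical helper: defines a struct at runtime and returns the new type.
# Wrapping in Expr(:toplevel, ...) makes eval hand back the type as its last value.
function define_point_type(name::Symbol)
    def = :(struct $name
        x::Int
        y::Int
    end)
    return Core.eval(@__MODULE__, Expr(:toplevel, def, name))
end

# Define and construct in the same call: the constructor is newer than the
# world this function is running in, so the direct call throws.
function define_and_construct_directly()
    T = define_point_type(:DemoPointA)
    try
        return T(1, 2)
    catch err
        return err
    end
end

# Same thing, but routed through Base.invokelatest.
function define_and_construct_latest()
    T = define_point_type(:DemoPointB)
    return Base.invokelatest(T, 1, 2)
end

direct = define_and_construct_directly()  # a MethodError (method too new)
latest = define_and_construct_latest()    # an instance of DemoPointB
```

Once control is back at top level, a plain `DemoPointA(1, 2)` works without invokelatest.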
Not that I know of
generating struct definitions from serialized data
Are you certain that you need to generate the definitions at runtime, not just parse them into existing structs?
Unfortunately yes :(
It is a horrible format where the struct definitions are embedded in the data
So the approach is to take a pass over the data and generate the struct definitions.
I.. uh.. huh.
I'd advise to run, but that doesn't sound like an option
is the format really undefined, or "just" arbitrarily nested, but consisting of some simple base blocks everything is built out of?
Nope, it is basically C structs (unions, dynamically sized arrays, the whole shebang) with the struct definition basically embedded in a header
dynamically sized arrays stored inline in the struct? hoh boy
do you have a concrete example?
Maybe dynamically sized is sloppy terminology. What I mean is that the size is not a hardcoded constant but is instead given as a member of the struct (which of course must be placed before the array, or else the struct would be unreadable even in C, I guess).
but is the array actually stored inline in the C struct or rather as a pointer to a block of memory of the correct size?
No idea, but in the versions I have, the data is serialized, so everything is in one consecutive stream
that sounds like a serialization strategy, rather than actually how the data is laid out in memory in C
you don't want to serialize pointers for someone else to consume - they can't do anything with your pointers after all
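As a rough illustration of the "size stored as a member before the array" layout in a serialized stream, here is a sketch that assumes a made-up record of one UInt32 count followed by that many Float64 values in host byte order. The real format is not shown in this thread, so the field types are assumptions:

```julia
# Hypothetical layout: a UInt32 element count, then `count` Float64 values.
function read_counted_array(io::IO)
    n = read(io, UInt32)                  # the length member comes first
    values = Vector{Float64}(undef, n)
    read!(io, values)                     # then the elements, inline in the stream
    return values
end

# Build a small stream in memory and decode it again.
buf = IOBuffer()
write(buf, UInt32(3))
write(buf, [1.0, 2.0, 3.0])
seekstart(buf)
decoded = read_counted_array(buf)
```

On the Julia side the result is an ordinary Vector, so the array length never has to be part of the struct type.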
Writing the decoding was pretty straightforward. The main problem is that nested Dicts turned out to be too slow. I have considered NamedTuples, but I'd like to be able to dispatch on the result, so I thought I should go all the way and generate structs.
ok, let me ask differently - are the number of fields in the C structs fixed?
on a per-struct basis
that sounds like a serialization strategy, rather than actually how the data is laid out in memory in C
Yes, I don't care how the data is laid out in the C application.
but julia does, since that's probably what you want to mimic for deserializing (albeit not 1:1, since it sounds like you want to parse an array of known length and just have a struct with a Vector typed field, instead of a Ptr)
ok, let me ask differently - are the number of fields in the C structs fixed?
Yes, for a certain type of struct. There are different types and even different versions of the same type (e.g. one from rev A of the software and then the same struct in rev B of the software).
ok, then that's all you need for the structs on the julia side, no?
but julia does, since that's probably what you want to mimic for deserializing
Absolutely, and that part I have gotten down already.
ok, then that's all you need for the structs on the julia side, no?
Yup, that is what the current code does.
for any given revision, the number of fields and their sizes is known
what I'm saying is that you can generate the structs for each version ahead of time, since their sizes are known
for any given revision, the number of fields and their sizes is known
Sure. For example, in the current code I cache the struct definitions based on revision and identity so the number of struct definitions is much smaller than the total number of structs I need to deserialize.
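A cache like the one described could look roughly like the sketch below. The names `TYPE_CACHE` and `cached_type`, and the choice of a (revision, name) tuple as the key, are assumptions for illustration. The actual package code is not shown in this thread.

```julia
# Hypothetical cache: (revision, struct identity) => generated type.
const TYPE_CACHE = Dict{Tuple{String,Symbol},DataType}()

# `fields` is a list of name => type pairs recovered from the embedded header.
function cached_type(revision::String, name::Symbol, fields::AbstractVector{<:Pair})
    get!(TYPE_CACHE, (revision, name)) do
        # Only runs on a cache miss: build and evaluate the struct definition.
        typename = Symbol(name, "_", revision)
        body = Expr(:block, (:($f::$T) for (f, T) in fields)...)
        def = Expr(:struct, false, typename, body)
        Core.eval(@__MODULE__, Expr(:toplevel, def, typename))
    end
end

header_fields = [:count => UInt32, :gain => Float64]
A1 = cached_type("revA", :Header, header_fields)
A2 = cached_type("revA", :Header, header_fields)    # cache hit, no new eval
B  = cached_type("revB", :Header, [:count => UInt32])
```

Only the first lookup per key calls eval, which is why the number of definitions stays far below the number of deserialized structs.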
as in, the bit sizes. That the arrays are of arbitrary length doesn't really matter, since they're (from what you told me) not stored inline with the struct itself
what I'm saying is that you can generate the structs for each version ahead of time, since their sizes are known
Yes, this is what I'm doing now. I'm investigating if there is some way I can improve the user experience here since I guarantee that the first thing a user will do when I release this improvement is to call both the struct generating call as well as the deserialization call in the same function.
Why are you insisting on generating the struct when the user requests it, instead of ahead of time, before the user even gets your code?
You seemingly know the library & the struct sizes that are possible, can't you generate the struct definitions way before then and just load them as regular julia source code?
Why are you insisting on generating the struct when the user requests it, instead of ahead of time, before the user even gets your code?
There are way too many different possible structs and software revisions to do this practically.
Are your users expected to work with two different revisions of the software at the same time?
Yes. In particular they are expected to get a blob of data to analyze without any knowledge about which version of the software generated it.
..Oo
short of generating a lookup table based on some version identifier early on in the stream, I can't think of how you'd even do that in any language
dynamic languages like julia or python have an advantage in that they don't generally have to care about the layout of stuff in memory ("just type it Any and look it up at runtime"), but that leads to bad performance, as you found out
I agree. I guess Base.invokelatest still gives me about a 200 or so times speedup compared to the Dict approach, so it is a decent fallback.
What you're describing is more or less arbitrary ABI conversion, which in general is not possible without a stringent contract about the stream of data and being VERY CAREFUL not to have incompatible regressions.
Are there any statically compiled libraries doing the same task you're trying to do?
They'll generally have the same limitations as julia here, if you want performance.
Yes, but they are doing it in runtime as well and are about 1000 times slower than my package
Either way, I'd suggest structuring your code in such a way that users have to neither run invokelatest nor generate the types themselves.
Yes, but they are doing it in runtime as well and are about 1000 times slower than my package
Then you're running into fundamentally unsolved problems in computer engineering, I'm afraid
you can consider yourself lucky that the approach is already as fast as it is
I'm fully aware that there is no magic to get the types known at runtime.
The thing that annoys me just a little bit is that 99% users will run it like this from the REPL
julia> result = readfile("somefile.blob")
julia> plot(get(result, :thisOrThatStruct.somedata))
Where generating the structs during readfile but not using them until get is called would work without performance penalty.
I am very lucky indeed. Just as the Julia creators I'm also greedy :)
is readfile your function?
if not, I'd suggest putting the generation code there
Yes, bad name example maybe. Real function has a more obtuse name
either way, you will not get around world age
there is no world age callback because the increased world age is only visible to existing code once you return to top level scope
so even if you define new structs with eval, no existing code will ever be able to call that, unless you go the invokelatest route or return to top level scope
if we had a callback like that, we'd more or less have to allow arbitrary code to run at each eval, which would be very scary and a great source of indeterminism
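To make the "only visible once you return to top level" point concrete, here is a small sketch that compares the world age of the running code with the global world counter around an eval. The function and method names are made up. `Base.get_world_counter` and the `jl_get_tls_world_age` ccall are internals that may change between Julia versions.

```julia
function worlds_around_eval()
    before_task   = ccall(:jl_get_tls_world_age, UInt, ())  # world this code runs in
    before_global = Base.get_world_counter()                # latest world overall
    Core.eval(@__MODULE__, :(demo_new_method() = 1))        # defining a method bumps the counter
    after_task    = ccall(:jl_get_tls_world_age, UInt, ())
    after_global  = Base.get_world_counter()
    return (; before_task, after_task, before_global, after_global)
end

w = worlds_around_eval()
```

The global counter moves forward, but the world that the already running function sees stays where it was, which is why a direct call to `demo_new_method` from inside that function would fail.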
Yeah, I realize there might be some hard constraints here on what looks like a solvable problem from the outside. I mean, someone must know the world age or else it would not be possible to assert on it.
An ugly solution I have thought about is to just have a try catch, maybe come up with some way to cache the result of the try-catch from the entry point if possible.
that would not help you - you'd have to return to the top level again or call invokelatest in the catch block (which is going to be slower than just calling invokelatest directly, since you have to try first, only offering a speedup when you have the struct already)
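The try/catch idea and its limitation can be sketched as follows. `call_with_fallback` and `FallbackDemo` are made-up names, and this is only an illustration of the trade-off discussed here, not code from the package:

```julia
# Try the direct call first and fall back to invokelatest on a MethodError.
function call_with_fallback(f, args...)
    try
        return f(args...)
    catch err
        err isa MethodError || rethrow()
        # Too-new methods land here, so the slow path pays for the failed
        # attempt on top of the invokelatest call itself.
        return Base.invokelatest(f, args...)
    end
end

# Worst case: the type is defined and used inside the same function call.
function define_and_build()
    def = :(struct FallbackDemo; value::Int; end)
    T = Core.eval(@__MODULE__, Expr(:toplevel, def, :FallbackDemo))
    return call_with_fallback(T, 7)
end

obj = define_and_build()
```

A MethodError caused by genuinely wrong arguments also takes the fallback path and is simply thrown again by invokelatest, so the catch cannot tell the two situations apart.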
if you want a taste of why this is a difficult problem and are not afraid of dissertations, I can recommend Jeff Bezanson's PhD thesis and another paper that formalizes the world age mechanism into a calculus
It would help me in the 99% of cases when people do as above. For the remaining 1% I could print a warning that things might be a bit slower than they have to be. Fully aware that I can't cheat the world age here.
Anyways, I think I can work with what I have here. Maybe just having a manual command to clean out invokelatest will be good enough.
Couldn't you also use named tuples everywhere?
I imagine that would prevent convenient dispatch
This is indeed the main reason why I didn't go for them. I was also a bit worried about exploding compile times (I have seen structs which cover a few pages when fully unrolled), but TBH I don't know if structs really have an advantage over tuples w.r.t. this. I think the generation code I have can easily be converted to making a named tuple, so I should perhaps just give it a shot. That will have to be an exercise for Monday though...
at some point, large structs do perform better than the equivalent tuple would, I think
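For comparison, here is a sketch of the named tuple route with made-up field names. No eval is involved, so there is no world age boundary to cross, and a limited form of dispatch on the field names is still possible as long as the methods are written ahead of time:

```julia
# Build a NamedTuple from field names and values that are only known at runtime.
function decode_record(names::Tuple{Vararg{Symbol}}, values::Tuple)
    return NamedTuple{names}(values)
end

# Dispatch on the field names; these methods exist before any data is read.
describe(::NamedTuple{(:count, :gain)}) = "header"
describe(::NamedTuple) = "unknown record"

# Decoding and using the record in the same function works without invokelatest.
function decode_and_describe()
    rec = decode_record((:count, :gain), (UInt32(3), 1.5))
    return describe(rec), rec.gain
end

result = decode_and_describe()
```

The catch is that every distinct set of field names is its own type, so dispatch means spelling out the names, and very wide records may still be costly to compile.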
Fwiw, I ended up doing this for now:
currentworld = ccall(:jl_get_tls_world_age, UInt, ())
all(methods(f)) do fmethod
    fmethod.primary_world <= currentworld
end
Context: Firstly, this was never about trying to cheat the world age boundary. It's just about providing convenience to end users so I don't need to educate them about world ages. Previously the cost of this was perpetually slow speed, and now it is me having to maintain things like the above shenanigans, which seems like a good tradeoff for the speed improvement I got.
The above code is only called for functions wrapped in another callable struct. Due to the nature of the problem, I maintain a cache mapping metadata to struct constructors, so as soon as the expression above returns true, the constructors in the cache are unwrapped. If the check fails, it does Base.invokelatest.
There is also a way to call the constructors in the wrapped struct so that the above check is bypassed. Some mock examples:
julia> result = read_horrible_struct_format_file("some.file") # this generates struct definitions
julia> plot(get(result.someData)) # Will do the above world age check, and unwrap all constructors in the cache
julia> plot(get(result.someOtherData)) # No checks performed, call constructor directly
julia> function readandplot(file)
           result = read_horrible_struct_format_file(file)
           plot(get(result.someData)) # Warns the user about slow speed, including some tips for mitigation, then does Base.invokelatest
           plot(get(NoWorldCheck(), result.someData)) # Just calls Base.invokelatest
       end
The package which does this is pretty much a "leaf" package meant for interactive analysis, so the first example above is how almost all users use it.
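A stripped-down sketch of the wrap, check and unwrap scheme described above might look like this. All names (`Pending`, `CONSTRUCTORS`, `register!`, `construct`) are made up. It is also a variation rather than the exact code from this thread: instead of inspecting `primary_world` on each method, it records the global world counter right after the eval and compares the caller's world against that.

```julia
# Hypothetical wrapper: a constructor plus the world counter seen right after it was defined.
struct Pending{F}
    f::F
    defined_world::UInt
end

const CONSTRUCTORS = Dict{Symbol,Any}()

function register!(key::Symbol, def::Expr, typename::Symbol)
    T = Core.eval(@__MODULE__, Expr(:toplevel, def, typename))
    CONSTRUCTORS[key] = Pending(T, Base.get_world_counter())
    return nothing
end

function construct(key::Symbol, args...)
    c = CONSTRUCTORS[key]
    c isa Pending || return c(args...)            # already unwrapped: plain call
    if ccall(:jl_get_tls_world_age, UInt, ()) >= c.defined_world
        CONSTRUCTORS[key] = c.f                   # unwrap once, no checks afterwards
        return c.f(args...)
    end
    return Base.invokelatest(c.f, args...)        # caller is still in an older world
end

# Defining and constructing in one function takes the invokelatest path.
function register_and_construct()
    register!(:sample, :(struct SampleRecord; v::Int; end), :SampleRecord)
    r = construct(:sample, 5)
    return r, CONSTRUCTORS[:sample] isa Pending
end

r1, still_wrapped = register_and_construct()
r2 = construct(:sample, 6)   # from top level: check passes and the entry is unwrapped
```

This sketch unwraps only the entry being called. Unwrapping every cached constructor at once, as described above, would be a loop over the Dict at the same point.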
Last updated: Nov 06 2024 at 04:40 UTC