Is there a AbstractVector
implementation somewhere that takes a vector of vectors and returns a single reduce(vcat, vecs)
without copying the data?
Are the vectors all the same size?
If so, any of these https://github.com/mcabbott/LazyStack.jl?tab=readme-ov-file#other-stack-like-packages combined with vec
.
If not, then maybe I only know of CatViews.jl
We are actually trying to reduce the vertices of geometries lazily here:
https://github.com/JuliaGeometry/Meshes.jl/issues/1127
But it seems that this is not possible?
It seems like a really hard problem. As a practical matter I expect that the only solution is to have MultiPolytope
store its vertices in this form (i.e. with the vertices already inlined but with a wrapper that gives it the interface of an array of arrays). I don't know if that's compatible with what you need the representation to do elsewhere, at the very least I'm guessing it's not compatible with how the individual primitives are stored. Perhaps a better approach would be trying to reduce the number of times this needs to be done?
In case it's not already abundantly clear: Array{<:AbstractArray}
simply does not store data this way, the elements are actually pointers, so avoiding copying is impossible in the general case. That pretty much leaves you with a wrapper type that makes an Array
behave like an Array{<:AbstractArray}
, but that'll inevitably affect your implementation of MultiPolytope
.
Does it need to be a Vector{Vector}
or can it be a Tuple of vectors?
i.e. is it really large, and is its size stable?
The Multi
geometry represents regions in the world like countries that are made of polygons in the continent plus polygons in the ocean (islands). It is common to have as many as a couple of 100s of polygons sometimes inside a Multi
geom.
Perhaps what you want is more like a lazy iterator over vertices? I think that would be pretty efficient even if you did it very naively. Collecting them efficiently, by contrast, is hard.
We thought of lazy iterators, but it is not user-friendly. Perhaps we will need to introduce a new function to return interators.
So vertices
remains as is, and a new vertexiterator
returns an iterator
If the primitives in MultiPolytope
were all of the same type, you could efficiently calculate the address of each vertex, in which case you could return a wrapper object that behaves exactly like your statement, but I'm guessing this is not the case. If that's not the case you could potentially index the primitive types and then do something like CatArray
from LazyArrays.jl but that sounds like way more trouble than a lazy iterator.
Might also be worth reminding yourself that it's really easy for users to get your explicit array out of a lazy iterator, i.e. collect(lazyiterator)
.
All the comments are really helpful. Thank you. I think we will go with two functions in the API: vertices
and vertexiterator
.
That way we introduce a new lazy function in a non-breaking way.
Júlio Hoffimann has marked this topic as resolved.
I like this problem because it seems like a paradigmatic example of one of maybe 5 problems that are all that seem to occur over and over again in software engineering: procedure is good for representation , but bad for representation , but procedure is good for representation but bad for representation , and converting between and is expensive, then how can I compute both and ? I think it was weinberg or someone like that who said that the peculiar feature of reality is plurality of description.
After this object is built, would you want to be able to make insertions into the individual sub-vectors later on, or would its structure be "frozen" after it's created?
that is, do you ever need to push!
/ append!
?
These geometries are often "frozen" after creation. We can modify the underlying vectors of vertices, but that is not what happens in most algorithms. We tend to implement algorithms that avoid mutation (thinking of parallel jobs).
If that's the case, you might be able to do something like
struct FlattenedVector{T} <: AbstractVector{T}
vov::Vector{Vector{T}}
inds::Vector{Tuple{Int, Int}}
len::Int
function FlattenedVector(vov::Vector{Vector{T}}) where {T}
len = sum(length, vov)
inds = Vector{Tuple{Int, Int}}(undef, len)
ind = 1
for i in eachindex(vov)
for j ∈ eachindex(vov[i])
inds[ind] = (i, j)
ind += 1
end
end
new{T}(vov, inds, len)
end
end
Base.size(v::FlattenedVector) = (v.len,)
function Base.getindex(v::FlattenedVector, ind::Integer)
(i, j) = v.inds[ind]
v.vov[i][j]
end
function Base.setindex!(v::FlattenedVector, x, ind::Integer)
(i, j) = v.inds[ind]
v.vov[i][j] = x
v
end
I don't know if this sort of construction is really worth it for you use-case but I think it's at least a possibility.
The main challenge we face here is that we don't want to materialize the vov
field as a vector of vector. We want to just consume the underlying geometries, querying their vectors. So the input of the function is a vector of geometries, and the output is a vector of concatenated vertices.
Just a hipshot suggestion for the initial question without having digested the conversation after: https://github.com/SciML/RecursiveArrayTools.jl
DrChainsaw said:
without having digested the conversation
That's covered in the link given above by @Michael Abbott
FWIW CatViews.jl is slow, as noted in the README of LazyStack.jl. The type instability of the registered version 1.0 was fixed five years ago in 1.0.1. But it appears that 1.0.1 was never registered.
Is the inner vector type of static size, e.g., similar to a static array? If so, reshape(reinterpret(Float64, vector_of_vectors), length(first(vector_of_vectors)), :)
gives you a matrix without copying
Last updated: Nov 22 2024 at 04:41 UTC