Is there a way of finding out how many bytes remain in an IO stream? I want to be able to work out whether e.g. peek(io, Int) will error without calling it.
I think this can probably only be done by trying to read the file and seeing what happens, then resetting the seek point (which is how peek works), but I thought I'd ask anyway.
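For reference, the "try to read, then reset" idea from the question can be sketched as below. This is only a minimal sketch: `hasbytes` is a hypothetical name, not a Base function, and it assumes the stream supports `mark` and `reset` (true for `IOStream` and `IOBuffer`) and that no mark is already set, since `mark` would overwrite it.

```julia
# Hypothetical helper (not part of Base): check whether at least `n` bytes
# can still be read from `io`, restoring the read position afterwards.
function hasbytes(io::IO, n::Integer)
    mark(io)                          # remember the current position
    try
        # read(io, n) returns at most n bytes; fewer means the stream ended early
        return length(read(io, n)) >= n
    finally
        reset(io)                     # seek back to the marked position
    end
end

buf = IOBuffer("what up")
println(hasbytes(buf, sizeof(Int)))   # would peek(buf, Int) have enough bytes?
```

On a stream that can really block (a socket or pipe), the probing read may wait for data, so this is better suited to files and in-memory buffers.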
You can use bytesavailable maybe?
Returns zero for me for a nonzero file
ohhhh, so interesting that you should mention this... I think this may be a bug
are you on 1.6?
I saw this also, maybe we should open an issue
Ah, right! I ran into this the other day.
ok, we definitely need an issue
let me ask on slack first
It might be correct though, since the io stream needs to block before you can read?
no, I've seen it give 0 and then immediately readavailable returns something of nonzero length
Yea but that is because readavailable does block and fill the buffer I think: https://github.com/JuliaLang/julia/blob/2eeef2e231a55cac770543b6dd673e349adfd797/base/iostream.jl#L381
I'm confused, shouldn't readavailable always return the number of bytes indicated before calling it by bytesavailable?
that was my understanding of what they're supposed to do
bytesavailable seems to work on IOBuffer btw, though it may be worth noting that IOBuffer never blocks
Yea I don't know actually.
I believe this is something of a MWE
julia> open(io -> write(io, "what up"), "testfile.txt", "w+")
7
julia> f = open("testfile.txt")
IOStream(<file testfile.txt>)
julia> bytesavailable(f)
0
julia> readavailable(f)
7-element Vector{UInt8}:
0x77
0x68
0x61
0x74
0x20
0x75
0x70
I believe bytesavailable should return 7
# num bytes available without blocking
It is written in the source code.
Yup, that's what the docs say, and the above does not block, so this looks like a bug
So, it's not the number of bytes from here till the end of the stream.
Mmmm....
You need to block to call readavailable
I mean, you cannot assume that readavailable will give you bytesavailable bytes, I think
why not?
Because that could have changed between calling bytesavailable and readavailable.
well yeah, ok fair enough, but we are seeing it in examples where that is not the case
Hmmm.... I do not quite understand what the intended behaviour is, but the code itself is rather straightforward.
bytesavailable(s::IOStream) = @_lock_ios s ccall(:jl_nb_available, Int32, (Ptr{Cvoid},), s.ios)

function readavailable(s::IOStream)
    lock(s.lock)
    nb = ccall(:jl_nb_available, Int32, (Ptr{Cvoid},), s.ios)
    if nb == 0
        ccall(:ios_fillbuf, Cssize_t, (Ptr{Cvoid},), s.ios)
        nb = ccall(:jl_nb_available, Int32, (Ptr{Cvoid},), s.ios)
    end
    a = Vector{UInt8}(undef, nb)
    nr = ccall(:ios_readall, Csize_t, (Ptr{Cvoid}, Ptr{Cvoid}, Csize_t), s, a, nb)
    if nr != nb
        unlock(s.lock)
        throw(EOFError())
    end
    unlock(s.lock)
    return a
end
What example? In your example you get more bytes than bytesavailable says, so I think that is fine.
So readavailable checks the number of available bytes, and if it is zero it fills the buffer and updates this number.
So, I think, the correct meaning of bytesavailable is not the number of bytes available in the iostream, but the number of bytes available in the intermediate buffer.
Purely technical thing.
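To illustrate the buffer interpretation, here is a small sketch that records `bytesavailable` right after opening a file and again after something has forced the buffer to be filled. `buffered_before_after` is a hypothetical helper name, and using `eof` to trigger the fill is an assumption about how `IOStream` behaves; the exact numbers may differ between Julia versions, so none are claimed here.

```julia
# Hypothetical helper: report bytesavailable before and after a buffer fill.
function buffered_before_after(path::AbstractString)
    open(path) do f
        before = bytesavailable(f)  # possibly 0: nothing buffered yet
        eof(f)                      # assumption: eof must inspect the data, filling the buffer
        after = bytesavailable(f)   # now counts what sits in the intermediate buffer
        return (before, after)
    end
end

mktempdir() do dir
    path = joinpath(dir, "testfile.txt")
    write(path, "what up")          # same 7-byte content as the example above
    before, after = buffered_before_after(path)
    println("buffered before: ", before, ", after eof check: ", after)
end
```

If the first number is 0 and the second is not, that is consistent with `bytesavailable` describing the buffer rather than the file.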
Yeah, I noticed that it's a direct ccall, kind of makes me suspect it's an intended thing, but I don't know why
well, regardless, I find it all very confusing
"number of bytes available before blocking" sounds very straightforward, and it's clearly not giving me that
It depends on how you define blocking
For me, blocking means that you temporarily "own" the file handle to read the next chunk of data from the file.
well I did not do an @async or @spawn and it's not stopping the execution, so that seems to me like it's not blocking
Yea I am pretty sure that is what is meant.
What are you using it for? I thought I would have use for it the other day but in the end I only use read and write with async tasks.
https://github.com/JuliaLang/julia/issues/24526 has some review of this btw
Ok, I just had a conversation on slack, I think our problem is just that we are conflating different types of blocking
readavailable seems to normally only block on relatively fast stuff
I feel like bytesavailable isn't what I want. I want to know if there are at least N bytes remaining in the file or stream, but I don't care how many are currently buffered. When N = 1, this is equivalent to eof(io), but I want to be able to pick other Ns.
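For a regular file opened as an `IOStream`, one possible way to get "at least N bytes remaining" without touching the buffer is to compare the file size with the current position. The names `bytesremaining` and `hasatleast` below are hypothetical, not Base functions, and the sketch assumes a seekable file whose size does not change while it is being read; it does not apply to pipes or sockets.

```julia
# Hypothetical helpers for seekable file streams only (not part of Base).
# filesize(io) is the total size on disk; position(io) is the logical read offset.
bytesremaining(io::IOStream) = filesize(io) - position(io)
hasatleast(io::IOStream, n::Integer) = bytesremaining(io) >= n

mktempdir() do dir
    path = joinpath(dir, "testfile.txt")
    write(path, "what up")
    open(path) do f
        println(bytesremaining(f))           # bytes left from the start of the file
        println(hasatleast(f, sizeof(Int)))  # enough for an Int on this platform?
    end
end
```

With N = 1 this should agree with `!eof(io)` for such files, while leaving N free to be anything else.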
Last updated: Dec 28 2024 at 04:38 UTC