Stream: helpdesk (published)

Topic: ✔ When to use IOBuffer?


Júlio Hoffimann (Nov 22 2025 at 11:07):

Reading from io = open(fname) is much slower than reading from IOBuffer(read(io)). I understand that this is because all the bytes are copied to RAM.

If we can guarantee that all bytes in io fit in RAM, is it then always better to use IOBuffer for speed?

Sukera (Nov 22 2025 at 12:04):

How are you measuring? If you exclude the time it takes to read the data into RAM for the IOBuffer from your measurement, of course it's faster. After all, only the processing time is left. However, your overall application is likely not magically fast, because the data still has to be read into RAM.

Júlio Hoffimann (Nov 22 2025 at 12:05):

I am including the time it takes to create the IOBuffer and it is still faster.

Sukera (Nov 22 2025 at 12:41):

Interesting :thinking: do you have an example?

Júlio Hoffimann (Nov 22 2025 at 12:53):

I can try to produce one later when I have more time. Currently trying to debug another issue.

Júlio Hoffimann (Nov 22 2025 at 14:30):

The example could be as simple as reading a matrix of size 2500×25000 stored in the IO:

for j in 1:25000
  for i in 1:2500
    read(io, Float64)
  end
end
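A minimal sketch of how the two variants could be timed against each other, with the cost of creating the IOBuffer included in the measurement. The helper `readmatrix`, the wrappers `from_stream` and `from_buffer`, the temporary file, and the smaller matrix size are assumptions made for illustration and do not come from the thread. The timings depend on the machine and the Julia version.

```julia
# Hypothetical helper (not from the thread): read an m×n Float64 matrix value by value.
function readmatrix(io::IO, m::Integer, n::Integer)
    A = Matrix{Float64}(undef, m, n)
    for j in 1:n, i in 1:m
        A[i, j] = read(io, Float64)
    end
    return A
end

m, n = 250, 2500          # assumed smaller size so the sketch runs quickly
A = rand(m, n)
fname = tempname()
write(fname, A)           # raw column-major Float64 bytes, no header

# Variant 1 reads from the IOStream; variant 2 copies all bytes into an IOBuffer first.
from_stream(f) = open(io -> readmatrix(io, m, n), f)
from_buffer(f) = open(io -> readmatrix(IOBuffer(read(io)), m, n), f)

B1 = from_stream(fname)   # first calls also serve as compilation warm-up
B2 = from_buffer(fname)

t_stream = @elapsed from_stream(fname)
t_buffer = @elapsed from_buffer(fname)   # includes copying the bytes into memory
println("IOStream: ", t_stream, " s; IOBuffer: ", t_buffer, " s")
rm(fname)
```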

Jakob Nybo Andersen (Nov 22 2025 at 17:01):

That is almost certainly because the file object (IOStream) has some overhead, mostly from taking a lock associated with the file. You can match the performance of the IOBuffer by buffering the file object in Julia. That is, you make a small buffer, e.g. a 16 KiB Vector{UInt8}, then read into that.
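One way the small-buffer idea could look, as a sketch rather than a reference implementation: the file is consumed in 16 KiB chunks with `readbytes!` from Base, so the IOStream is called once per chunk instead of once per value. The function name `readmatrix_chunked`, its `bufsize` keyword, and the test file are hypothetical.

```julia
# Hypothetical helper: fill an m×n Float64 matrix by reading the IO in fixed-size chunks.
function readmatrix_chunked(io::IO, m::Integer, n::Integer; bufsize::Integer = 16 * 1024)
    bufsize % sizeof(Float64) == 0 || throw(ArgumentError("bufsize must be a multiple of 8"))
    A = Matrix{Float64}(undef, m, n)
    buf = Vector{UInt8}(undef, bufsize)      # the small reusable byte buffer
    k = 0                                    # number of Float64 values stored so far
    while k < length(A)
        want = min(bufsize, (length(A) - k) * sizeof(Float64))
        got = readbytes!(io, buf, want)      # one call into the IO per chunk
        got == want || throw(EOFError())
        vals = reinterpret(Float64, view(buf, 1:got))
        copyto!(A, k + 1, vals, 1, length(vals))
        k += length(vals)
    end
    return A
end

m, n = 100, 300                              # assumed small size for illustration
A = rand(m, n)
fname = tempname()
write(fname, A)                              # raw column-major Float64 bytes
B = open(io -> readmatrix_chunked(io, m, n), fname)
rm(fname)
```

The chunk size is a multiple of `sizeof(Float64)`, so no value is split across two chunks.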

Júlio Hoffimann (Nov 22 2025 at 17:02):

You mean there is a method of read that accepts a small buffer to mutate?

Gunnar Farnebäck (Nov 22 2025 at 17:34):

help?> read!
search: read! read real rpad readdir Threads isready prepend! readeach readline readlink replace!

  read!(stream::IO, array::AbstractArray)
  read!(filename::AbstractString, array::AbstractArray)

  Read binary data from an I/O stream or file, filling in array.
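A small illustration of `read!` mutating a preallocated array, assuming a temporary file with eight Float64 values (the file and the variable names are made up for this sketch):

```julia
fname = tempname()
write(fname, collect(1.0:8.0))        # eight Float64 values as raw bytes

buf = Vector{Float64}(undef, 4)       # reusable buffer holding four values
first_chunk = open(fname) do io
    read!(io, buf)                    # fills buf with the first four values
    chunk = copy(buf)                 # keep a copy before buf is overwritten
    read!(io, buf)                    # overwrites buf with the next four values
    chunk
end
rm(fname)
```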

Jakob Nybo Andersen (Nov 22 2025 at 17:49):

For what it's worth, I'm quite unhappy with the API that Base provides, which is why I made the package BufferIO.jl to improve this area of Julia

Jakob Nybo Andersen (Nov 22 2025 at 17:49):

For a more mature, though less efficient, alternative, look at BufferedStreams.jl

Júlio Hoffimann (Nov 22 2025 at 17:54):

Nice! One more thing to learn when I find some time :slight_smile:

Notification Bot (Nov 22 2025 at 19:09):

Júlio Hoffimann has marked this topic as resolved.

Nathan Zimmerberg (Nov 24 2025 at 20:14):

There is also InputBuffers.jl, which I made as an even faster, less buggy version of IOBuffer. I also use this with a FileArray type to avoid using IOStream or mmap, which I have had problems with. https://github.com/JuliaIO/ZipArchives.jl/blob/c48c367b6208c9733ff0e2f3431a6b8d13939126/test/test_file-array.jl#L22

Júlio Hoffimann (Nov 24 2025 at 20:18):

Still fighting with IO here, but seeing some progress.

Nathan Zimmerberg (Nov 24 2025 at 20:28):

If you have the memory, it will almost always be nicer to just read everything into memory first and do your processing on a big Vector{UInt8}. Beyond performance, doing this greatly simplifies the logic for error handling, because any IO errors can be handled up front. Also, unlike IO, the Vector interface is well documented.
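A sketch of the read-everything-first pattern, assuming the file holds a raw m×n Float64 matrix with no header. The helper `loadmatrix` and the size check are illustrative, not taken from any package mentioned in the thread.

```julia
# Hypothetical helper: load an m×n Float64 matrix from a raw binary file.
function loadmatrix(fname::AbstractString, m::Integer, n::Integer)
    bytes = read(fname)                       # all file IO happens (and can fail) here
    expected = m * n * sizeof(Float64)
    length(bytes) == expected ||
        throw(ArgumentError("expected $expected bytes, got $(length(bytes))"))
    # From here on only plain array operations on a Vector{UInt8} remain.
    return reshape(collect(reinterpret(Float64, bytes)), m, n)
end

m, n = 50, 40                                 # assumed small size for illustration
A = rand(m, n)
fname = tempname()
write(fname, A)                               # raw column-major Float64 bytes
B = loadmatrix(fname, m, n)
rm(fname)
```

Because the byte count is validated right after the single `read`, a truncated file is reported before any parsing starts.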


Last updated: Nov 27 2025 at 04:44 UTC