#2518 Truncation in Buf.eachLine() and Buf.readAllLines()

SlimerDude Wed 30 Mar 2016

I discovered recently, whilst trying to read and parse a SQL file, that Buf.eachLine() and Buf.readAllLines() both truncate lines at 4096 chars.

str := "".padl(5000, 'a') + "\n" + "".padl(5000, 'b') + "\n"

echo("\n eachLine:")
str.toBuf.eachLine {
    echo("Line is ${it.size} chars")
}

echo("\n readAllLines:")
str.toBuf.readAllLines.each {
    echo("Line is ${it.size} chars")
}

 eachLine:
Line is 4096 chars
Line is 904 chars
Line is 4096 chars
Line is 904 chars

 readAllLines:
Line is 4096 chars
Line is 904 chars
Line is 4096 chars
Line is 904 chars

There is no mention of this in the docs, and no means to increase the limit.

Only by looking at the Java source did I find they both made a call to in.readLine() with the default max length of 4096.

While I understand the need for a limit when dealing with streams, it seems unnecessary with Bufs, where everything is already in memory. That said, it would be nice to keep the Buf and Stream APIs the same.

A compromise may be to propagate the truncation limit down to eachLine() and readAllLines(), as in:

Str[] readAllLines(Int? max := 4096)

Void eachLine(|Str line| f, Int? max := 4096)

The change should also be backwards compatible.

Work Around

For now you can use readLine() instead. It's a lot more code and finicky to use, but you are able to specify a truncation limit.

echo("\n readLine:")
buf := str.toBuf
line := null as Str
while ((line = buf.in.readLine(Int.maxVal)) != null) {
    echo("Line is ${line.size} chars")
}

 readLine:
Line is 5000 chars
Line is 5000 chars
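
The same loop can stand in for readAllLines() too. This is only a rough sketch: the ReadLines class name is made up for the example, and it assumes nothing beyond the readLine() call used above.

```fantom
class ReadLines {
    Void main() {
        str := "".padl(5000, 'a') + "\n" + "".padl(5000, 'b') + "\n"

        // collect every line, passing an explicit max to readLine()
        lines := Str[,]
        in    := str.toBuf.in
        line  := null as Str
        while ((line = in.readLine(Int.maxVal)) != null) {
            lines.add(line)
        }

        lines.each {
            echo("Line is ${it.size} chars")
        }
    }
}
```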

Side Note

A lot of the docs for Buf simply say:

Convenience for InStream.XXX

Making you navigate to InStream to read about it! It'd be really nice if those few lines of documentation from InStream could be copied over to Buf!

SlimerDude Wed 30 Mar 2016

P.S. Here's a Top Tip when converting existing code from using eachLine() to readLine()...

Make sure you change any return statements to continue! Otherwise the return will now exit the method, not the closure!

str.toBuf.eachLine |line| {
    if (...) {
        return  // <--- from this
    }
}

buf := str.toBuf
line := null as Str
while ((line = buf.in.readLine(Int.maxVal)) != null) {
    if (...) {
        continue  // <--- to this
    }
}

Doh!

brian Wed 6 Apr 2016

Definitely not good to have that omitted from the docs - I pushed a fix for that.

Adding maxLine to eachLine won't work, though, because the closure param needs to be at the end. That is why I originally designed it the way it is.

SlimerDude Wed 6 Apr 2016

eachLine() won't work because the closure param needs to be at the end

Oh yeah, good point. Nothing stopping maxLine from being added to readAllLines() though! :)
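
In the meantime, anyone who wants the eachLine() style with a limit could probably wrap the readLine() loop in a helper. A minimal sketch, assuming made-up names (LineUtil and eachLineMax are not part of the Fantom API); max goes first so the closure can stay last:

```fantom
class LineUtil {
    ** Hypothetical helper, not part of sys. Calls f once per line,
    ** passing max straight through to InStream.readLine().
    static Void eachLineMax(InStream in, Int? max, |Str line| f) {
        line := null as Str
        while ((line = in.readLine(max)) != null) {
            f(line)
        }
    }

    Void main() {
        str := "".padl(5000, 'a') + "\n"
        // max comes first, so the closure can still trail the call
        eachLineMax(str.toBuf.in, Int.maxVal) |line| {
            echo("Line is ${line.size} chars")
        }
    }
}
```

Because the body is a closure again, a return inside it only exits the closure, just as with eachLine().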

brian Fri 16 Sep 2016

After a bit of internal discussion, as this is a lingering issue, we changed the default behavior to be a max of null for readLine and associated helper methods like eachLine. The limit always seems to cause odd bugs, which seem to outweigh the benefits of trying to be safe in memory consumption (we do have methods like readAllStr and readAllBuf anyhow).
