Hi,
I’ve tried for a while and can’t find a solution to my problem. I’m a beginner and a learner.
My file contains 6000 text lines.
I use FILE N$ READLINE X$.
I want READLINE to start reading from, say, line 5000 onwards, not from the beginning of the file.
FILE N$ READLINE X$ works, but it takes a very long time to process before it gets to line 5000. I think that if it starts reading from line 5000, processing time will drop significantly.
My question: how do I use FILE N$ READLINE X$ to start reading from line 5000 (or the nth line in the file)?
I would appreciate it if anyone could help with a sample routine.
Reading nth line in a file
Re: Reading nth line in a file
I recently tried to figure out how to do this but ended up settling for doing a bunch of READLINEs until I reached the line I wanted. However, your post got me thinking about the problem again, and I realized that if you know the byte position in the file where the line you want starts, you can jump straight to it with FILE N$ SETPOS X.
I briefly tested the function below on a Unicode data file of 955,710 bytes and it seems to work, but do your own testing. Memory may be an issue depending on the file size, as I'm not sure what the limits are for FILE N$ READDIM M, N on various devices.
n$ = "/data/unicode_table"
FILE_GOTO_LINE(n$, 5000)
FILE n$ READLINE line$
PRINT line$

DEF FILE_GOTO_LINE (name$, line)
  FILE name$ SETPOS 0
  IF line = 0 THEN RETURN
  FILE name$ READDIM bytes, nBytes
  lineCount = 0
  FOR i = 0 TO nBytes-1
    IF bytes(i) = 10 THEN ' newline character
      lineCount += 1
      IF lineCount = line THEN BREAK i
    END IF
  NEXT i
  FILE name$ SETPOS i+1
END DEF
Re: Reading nth line in a file
Follow-up thought: If memory is an issue or you just want a safer way to do this, modify the above function to use FILE N$ READDIM M, N, K in a loop, reading K bytes at a time. Then you can keep overwriting the bytes() variable as you count newline characters, keeping track of the total running file size until you reach the target line. I believe all READDIMs begin the read from the current file position, so this means you don't have to read a very large file in its entirety to an array.
Re: Reading nth line in a file
Hi Matt,
Thank you so much, works great.
Regards,
Re: Reading nth line in a file
You're welcome. However, after some testing I determined that using a bunch of READLINE calls is by far the fastest method.
Here is my test file that times three different FILE_SETLINE functions. The results after the code are for setting the file position to line 5 and then to line 5000.
DEF FILE_SETLINE_RD (file$, line)
  FILE file$ SETPOS 0
  IF line = 0 THEN RETURN
  FILE file$ READDIM bytes, nBytes
  lineCount = 0
  FOR b = 0 TO nBytes-1
    IF bytes(b) = 10 THEN ' newline character
      lineCount += 1
      IF lineCount = line THEN BREAK b
    END IF
  NEXT b
  FILE file$ SETPOS b+1
END DEF

FILE_SETLINE_RDK.bufSize = 10000
DEF FILE_SETLINE_RDK (file$, line)
  FILE file$ SETPOS 0
  IF line = 0 THEN RETURN
  byteCount = 0
  lineCount = 0
  DO
    FILE file$ READDIM bytes, nBytes, bufSize
    FOR b = 0 TO nBytes-1
      IF bytes(b) = 10 THEN ' newline character
        lineCount += 1
        IF lineCount = line THEN BREAK b
      END IF
    NEXT b
    byteCount += b
  UNTIL lineCount = line OR FILE_END(file$)
  FILE file$ SETPOS byteCount+1
END DEF

DEF FILE_SETLINE_RL (file$, line)
  FILE file$ SETPOS 0
  FOR i = 0 TO line-1
    FILE file$ READLINE line$
  NEXT i
END DEF

n$ = "/data/unicode_table"
x = 5000

t = TIME()
FILE_SETLINE_RD(n$,x)
t = TIME() - t
FILE n$ READLINE lineRD$
PRINT "READDIM ="; t

t = TIME()
FILE_SETLINE_RDK(n$,x)
t = TIME() - t
FILE n$ READLINE lineRDK$
PRINT "READDIM K ="; t

t = TIME()
FILE_SETLINE_RL(n$,x)
t = TIME() - t
FILE n$ READLINE lineRL$
PRINT "READLINE ="; t

PRINT ""
IF lineRD$ = lineRDK$ AND lineRD$ = lineRL$ THEN
  PRINT "Lines match ✅"
  PRINT lineRL$
ELSE
  PRINT "Lines don't match ⛔"
  PRINT lineRD$
  PRINT lineRDK$
  PRINT lineRL$
END IF
Times for line 5:
READDIM = 0.048583
READDIM K = 0.000683069
READLINE = 0.000148892
Times for line 5000:
READDIM = 0.191181
READDIM K = 0.153618
READLINE = 0.062402
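Since plain READLINE came out fastest in these timings, the original question can probably be handled without READDIM at all. Below is a minimal sketch that uses only commands already shown in this thread. The names skip$ and skipCount are made up for illustration, and it assumes the file has more lines than skipCount. It follows the same counting as the functions above: after skipping 5000 lines, the first line processed is the 5001st line of the file.

```basic
' Sketch only: skip a fixed number of lines, then process the rest.
n$ = "/data/unicode_table"   ' file name taken from the earlier posts
skipCount = 5000             ' hypothetical name: number of lines to skip

FILE n$ SETPOS 0
FOR i = 1 TO skipCount
  FILE n$ READLINE skip$     ' read and discard
NEXT i

' Assumes at least one line remains after the skipped ones
DO
  FILE n$ READLINE line$
  PRINT line$                ' replace with the real per-line processing
UNTIL FILE_END(n$)
```

If the file might be shorter than skipCount lines, check FILE_END(n$) before entering the DO loop, because DO ... UNTIL always runs its body at least once.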