Using the SPLIT command

Posted: Thu Jan 05, 2017 11:25 pm
by rbytes
TBA

Re: Using the SPLIT command

Posted: Sat Jan 07, 2017 5:48 am
by GeorgeMcGinn
I would think that TAB can be used to split a file, as next to the pipe and comma, TAB is one of the most common characters used as a delimiter.

Using CHR$(9) for the TAB character should work. If it does not, we should ask for that as an improvement, because otherwise a TAB-delimited file cannot be read in or split in SmartBASIC.

George.

rbytes wrote: For the Datamine project, I did a test to see if there were ASCII characters that could be used for splitting strings and that would be impossible or very unlikely for a user to enter. That is pretty important in a database, because if the user should happen to type the split character in the middle of a text entry, the data would get badly scrambled.

I found that ASCII characters from 1 to 31 can be used, with the exception of 9 through 13 (which are text formatting characters like tab, line feed, etc.). All the others work fine with SPLIT, but since they produce nothing on screen, you can't see where they are located in the string to be split. For certain applications that wouldn't matter, but some coders might find it a nuisance.

A better solution is CHR$(160), the non-breaking space. It looks the same on screen as the regular space (ASCII 32), so you can confirm where it is placed in your string. But regular spaces will not trigger a split, so you can allow users to enter data containing spaces. In fact, they can't type a CHR$(160) on an iOS on-screen keyboard: both Space and Shift-Space type CHR$(32).

***** However, they can type a CHR$(160) on a separate Bluetooth keyboard. I just tested my Logitech Ultrathin for Air keyboard, and as I expected, Alt-Space produces CHR$(160). I think the odds are pretty slim, though, that someone would enter it into their data.
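
To make the CHR$(160) idea concrete, here is a minimal sketch of one record whose fields contain ordinary spaces. The sample data and the variable names D$, R$ and F$ are made up for illustration. It assumes, as in the test program in this thread, that SPLIT creates the array starting at index 0 and returns the number of parts in N.

```basic
' Sketch only: CHR$(160) as a field separator (sample data is hypothetical)
D$=CHR$(160)   ' looks like a space, but is a different character from CHR$(32)
R$="Jane Doe"&D$&"12 Main Street"&D$&"New York"
SPLIT R$ TO F$,N WITH D$
' assumes N holds the number of parts and F$ starts at index 0
FOR i=0 TO N-1
  PRINT F$(i)
NEXT i
```

If SPLIT behaves as described above, each field should print on its own line with its regular spaces left intact.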

Other characters I have used are €, £, and ¥ (Euro, Pound and Yen), but conceivably a smart Basic user somewhere in the world might enter those into a data field. The same goes for any accented letter.

Here is the test program:

'Split Test by rbytes
'Some characters below 32 in the ASCII table are rarely if ever used. Most don't print anything.
'Yet they can be used with the SPLIT command to separate data elements.
'The only characters in this range that you shouldn't use are 9 through 13.
'They are text formatting characters, and will cause problems.
'160 is an even better choice, since it looks like a space.
A$="Let's test a rarely-used character for use in the split command"
S$=" "
SPLIT A$ TO M$,N WITH S$
S$=CHR$(160)  ' a little-used character for use in splitting strings. Try some others.
A$=""
FOR t=0 TO 10
  A$&=M$(t)&" * * * "&S$
NEXT T
SPLIT A$ TO M$,N WITH S$
PRINT "A$ = ";A$
PRINT
PRINT "M$ = "
FOR t=0 TO 10
  PRINT M$(t)
NEXT T
PRINT
PRINT "Spaces can be used within individual data elements (eg record fields)"

Re: Using the SPLIT command

Posted: Sat Jan 07, 2017 6:12 am
by rbytes
You are probably right. I thought it might be best to avoid ASCII 9-13 because they might cause problems in viewing delimited strings on the debug page or showing them in a data dump. If those issues aren't important to the coder, then I'm pretty sure any of the characters 9-13 can be used with SPLIT. Try entering them as the split character in my little program, and you can find out for sure. I'll try too. :)

Re: Using the SPLIT command

Posted: Sun Jan 08, 2017 10:09 pm
by rbytes
The results of my testing confirm that CHR$(10), CHR$(12) and CHR$(13) can be used as the split character with the SPLIT command. However, once they are inserted into a string, the string becomes unreadable on the debug screen. You will see a left quote mark and the part of the string up to the first split character, and nothing beyond that. CHR$(10) is a line feed, so that is to be expected. CHR$(12) (form feed) and CHR$(13) (carriage return) have the same effect as CHR$(10) when the string is displayed in debug. See the attached image of the debug screen.

CHR$(9) (tab) also works fine as the split character, and does not affect the debug screen display as severely. You will simply see extra spacing within the string where the tab characters are located. All the other characters between ASCII 1 and 32 work as split characters, but are invisible when you view the string in debug, so it is impossible to tell where they are.
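
For anyone who wants to try the tab case, here is a minimal sketch of splitting one tab-delimited record. The sample line and the variable names T$, R$ and F$ are hypothetical. It assumes, as in the test program earlier in the thread, that the array created by SPLIT starts at index 0 and that N receives the number of parts.

```basic
' Sketch only: split one tab-delimited record into fields
T$=CHR$(9)   ' the tab character, used here as the delimiter
R$="Smith"&T$&"John"&T$&"Vancouver"
SPLIT R$ TO F$,N WITH T$
' assumes N holds the number of parts and F$ starts at index 0
FOR i=0 TO N-1
  PRINT F$(i)
NEXT i
```

The same loop should work for a line read from a TAB-delimited file, once the line is in a string variable.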

CHR$(160) seems to be the best character for splitting, assuming you have the choice. ;)

Re: Using the SPLIT command

Posted: Wed Jan 11, 2017 4:44 am
by GeorgeMcGinn
The same thing happens with the CR character. I noticed that when I used it, it appeared as a ".