Using the SPLIT command

rbytes
Posts: 1338
Joined: Sun May 31, 2015 12:11 am
My devices: iPhone 11 Pro Max
iPad Pro 11
MacBook
Dell Inspiron laptop
CHUWI Plus 10 convertible Windows/Android tablet
Location: Calgary, Canada
Flag: Canada

Using the SPLIT command

Post by rbytes »

TBA
Last edited by rbytes on Mon Feb 06, 2017 12:47 am, edited 1 time in total.
The only thing that gets me down is gravity...

GeorgeMcGinn
Posts: 435
Joined: Sat Sep 10, 2016 6:37 am
My devices: IPad Pro 10.5in
IMac
Linux i386
Windows 7 & 10
Location: Venice, FL
Flag: United States of America

Re: Using the SPLIT command

Post by GeorgeMcGinn »

I would think that TAB can be used to split a file. Next to the pipe and the comma, TAB is one of the most common delimiter characters.

Using CHR$(9) for the TAB character should work. If it does not, we need to request that as an improvement, because otherwise no TAB-delimited file can be read in or split in smart BASIC.
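
For comparison, here is a minimal sketch of the same idea in Python rather than smart BASIC. It only illustrates splitting on character code 9. The sample record and the function name `split_tab_line` are made up for the example.

```python
# Illustration only: Python, not smart BASIC. The sample record is hypothetical.
TAB = chr(9)  # the same character code that CHR$(9) refers to

def split_tab_line(line):
    """Split one TAB-delimited line into its fields."""
    # Drop a trailing line ending first so it does not stick to the last field.
    return line.rstrip("\r\n").split(TAB)

record = "rbytes" + TAB + "Calgary, Canada" + TAB + "iPad Pro 11"
fields = split_tab_line(record)
print(fields)
```

Commas and spaces inside a field survive, because only the TAB character separates fields.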

George.

rbytes wrote: For the Datamine project, I ran a test to see whether there were ASCII characters that could be used for splitting strings and that would be impossible or very unlikely for a user to enter. That is important in a database, because if the user happens to type the split character in the middle of a text entry, the data gets badly scrambled.

I found that ASCII characters from 1 to 31 can be used, with the exception of 9 through 13 (which are text formatting characters like tab, line feed, etc.). All the others work fine with SPLIT, but since they produce nothing on screen, you can't see where they are located in the string to be split. For certain applications that wouldn't matter, but some coders might find it a nuisance.

A better solution is CHR$(160), the non-breaking space. It looks the same on screen as the regular space (ASCII 32), so you can confirm where it is placed in your string. But regular spaces will not trigger a split, so you can allow users to enter data containing spaces. In fact, users can't type a CHR$(160) on an iOS on-screen keyboard: both Space and Shift-Space type CHR$(32).

***** However, they can type a CHR$(160) on a separate Bluetooth keyboard. I just tested my Logitech Ultrathin for Air keyboard, and as I expected, Alt-Space produces CHR$(160). I think the odds are pretty slim, though, that someone would enter it into their data.
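
The approach can be sketched in Python (not smart BASIC) as follows. This is only an illustration under assumptions: the function names `join_fields` and `split_record` and the sample data are hypothetical, and replacing a stray character 160 with a regular space before storing is one possible safeguard, not something the test program does.

```python
# Illustration only: Python, not smart BASIC. Function names are hypothetical.
NBSP = chr(160)  # non-breaking space, the character CHR$(160) refers to

def join_fields(fields):
    """Join fields with NBSP, first replacing any stray NBSP typed by a user."""
    cleaned = [f.replace(NBSP, " ") for f in fields]
    return NBSP.join(cleaned)

def split_record(record):
    """Split on NBSP only. The delimiter is passed explicitly because a bare
    split() in Python would also break on regular spaces."""
    return record.split(NBSP)

# The second field contains a stray NBSP, as Alt-Space on a Bluetooth
# keyboard might produce.
record = join_fields(["John Smith", "Calgary," + NBSP + "Canada", "42"])
print(split_record(record))
```

Cleaning the fields at the point where they are joined keeps the record at exactly three elements even if a user managed to enter the split character.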

Other characters I have used are €, £ and ¥ (Euro, Pound and Yen), but conceivably a smart BASIC user somewhere in the world might enter those into a data field. The same goes for any accented letter.

Here is the test program:

'Split Test by rbytes
'Some characters below 32 in the ASCII table are rarely if ever used.
'Most don't print anything, yet they can be used with the SPLIT command
'to separate data elements. The only characters in this range that you
'shouldn't use are 9 through 13. They are text formatting characters,
'and will cause problems. 160 is an even better choice, since it looks
'like a space.
A$="Let's test a rarely-used character for use in the split command"
S$=" "
'Split the sentence into its 11 words, M$(0) to M$(10), on regular spaces
SPLIT A$ TO M$,N WITH S$
S$=CHR$(160)  ' a little-used character for use in splitting strings. Try some others.
'Rebuild A$ so that each element contains regular spaces and ends with S$
A$=""
FOR t=0 TO 10
  A$&=M$(t)&" * * * "&S$
NEXT t
'Split again, this time on S$ only
SPLIT A$ TO M$,N WITH S$
PRINT "A$ = ";A$
PRINT
PRINT "M$ = "
FOR t=0 TO 10
  PRINT M$(t)
NEXT t
PRINT
PRINT "Spaces can be used within individual data elements (eg record fields)"

George McGinn
Computer Scientist/Cosmologist/Writer/Photographer
Member: IEEE, IEEE Computer Society
IEEE Sensors Council & IoT Technical Community
American Association for the Advancement of Science (AAAS)


Re: Using the SPLIT command

Post by rbytes »

You are probably right. I thought it might be best to avoid ASCII 9-13 because they might cause problems in viewing delimited strings on the debug page or showing them in a data dump. If those issues aren't important to the coder, then I'm pretty sure any of the characters 9-13 can be used with SPLIT. Try entering them as the split character in my little program, and you can find out for sure. I'll try too. :)


Re: Using the SPLIT command

Post by rbytes »

The results of my testing confirm that CHR$(10), CHR$(12) and CHR$(13) can be used as the split character with the SPLIT command. However, when they are inserted into a string, the string becomes unreadable on the debug screen. You will see an opening quote mark and the part of the string up to the first split character, and nothing beyond that. CHR$(10) is a line feed, so that is to be expected. CHR$(12) (form feed) and CHR$(13) (carriage return) have the same effect as CHR$(10) when the string is displayed in debug. See the attached image of the debug screen.

CHR$(9) (tab) also works fine as the split character, and does not affect the debug screen display as severely. You will simply see extra spacing within the string where the tab characters are located. All the other characters between ASCII 1 and 32 work as split characters, but are invisible when you view the string in debug, so it is impossible to tell where they are.
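
One way around the invisibility problem is to print a copy of the string with each delimiter replaced by a visible tag. Here is a minimal sketch in Python (not smart BASIC). The helper name `show_controls` and the `<code>` tag format are assumptions made for illustration.

```python
# Illustration only: Python, not smart BASIC. show_controls is a hypothetical helper.
def show_controls(text):
    """Replace each character below 32, plus 160, with a visible <code> tag."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 32 or code == 160:
            out.append("<%d>" % code)  # e.g. a carriage return becomes <13>
        else:
            out.append(ch)
    return "".join(out)

sample = "alpha" + chr(1) + "beta" + chr(13) + "gamma" + chr(160) + "delta"
print(show_controls(sample))
```

The original string is left untouched, so the tagged copy can be used for inspection only while the real data keeps its delimiters.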

CHR$(160) seems to be the best character for splitting, assuming you have the choice. ;)
Attachment: IMG_9314.PNG (image of the debug screen)


Re: Using the SPLIT command

Post by GeorgeMcGinn »

The same thing happens with the CR character. I noticed that when I used it, the string appeared as just a " (a lone quote mark).
