Author Topic: Trimming spaces and tabs inside a string  (Read 1011 times)

Vortex

Trimming spaces and tabs inside a string
« on: June 08, 2024, 04:33:52 AM »
Hello,

Here is an example that removes all spaces and tabs from a string:

Code: [Select]
FUNCTION RemoveSpaces (MyStr AS STRING, buff AS STRING) AS INTEGER

    LOCAL buff2 AS LPBYTE
    LOCAL t as UCHAR

    SET lookupTbl [] AS UCHAR
        1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
        0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, _
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1

    END SET

    buff2 = (LPBYTE)buff
   
    DO
        t=*MyStr
        *buff = t
        buff = buff+lookupTbl[t]
        MyStr = MyStr+1
       
    LOOP WHILE t

    FUNCTION = (LPBYTE)buff-buff2-1
   
END FUNCTION


DIM l AS INTEGER
DIM b AS STRING

l = RemoveSpaces("   This   is a test. ", b)
PRINT "Trimmed string = ",b
PRINT "Length of the string = ",l

airr

« Reply #1 on: June 08, 2024, 11:48:04 AM »
Interesting code, Vortex, thanks.

Here is my take on it:

Code: [Select]
dim l,b as string

l = RemoveSpaces("   This   is a test. ",b)
print "Trimmed string = ",b
print "Length of the string = ", l

pause

FUNCTION RemoveSpaces (MyStr AS STRING, buff AS STRING) AS INTEGER
    replace spc$ with nul$ in MyStr
    memcpy(buff, MyStr, len(MyStr) + 1)  ' +1 also copies the terminating NUL
    return len(buff)
END FUNCTION

Not sure it's 100% safe, but it seems to work in this instance.
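
Two things decide whether a routine like this is safe: the terminating NUL has to be copied along with the text, and the destination has to be large enough to hold the result. Below is a minimal C sketch (BCX translates to C) of a copy-based variant that checks both. The name remove_spaces_checked and the dst_size parameter are hypothetical, introduced only for this illustration; this is not how the BCX runtime implements REPLACE.

```c
#include <stddef.h>

/* Copy src into dst without ' ' characters.
   Returns the length of the result, or -1 if dst_size is too small.
   src is never modified, so passing a string literal is fine. */
long remove_spaces_checked(const char *src, char *dst, size_t dst_size)
{
    size_t j = 0;

    for (size_t i = 0; src[i] != '\0'; i++) {
        if (src[i] == ' ')
            continue;                 /* drop the space */
        if (j + 1 >= dst_size)
            return -1;                /* keep one byte free for the NUL */
        dst[j++] = src[i];
    }

    if (dst_size == 0)
        return -1;                    /* no room even for the NUL */

    dst[j] = '\0';                    /* always terminate the result */
    return (long)j;
}
```

Because the source is only read, the question of writing into a string literal never comes up, and the size check turns an overflow into an error return.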

AIR.

MrBcx

« Reply #2 on: June 08, 2024, 11:51:38 AM »
Below is another way that uses built-in BCX features to remove all 6 common white space characters.

Vortex's lookup table needs only minor modifications to achieve the same results.
In fact, his lookup table could be modified to filter any of the 256 possible byte values.
His function also runs much faster, which matters in data-intensive operations.

Code: [Select]
DIM b AS STRING
b = "   This   is a test. "

RemoveAllWhiteSpace(b)

PRINT "Trimmed string = ", b
PRINT "Length of the string = ", LEN(b)
PAUSE

SUB RemoveAllWhiteSpace(MyStr AS STRING)
    REMOVE TAB$ from MyStr  '  9 ASCII character
    REMOVE LF$  from MyStr  ' 10 ASCII character
    REMOVE VT$  from MyStr  ' 11 ASCII character
    REMOVE FF$  from MyStr  ' 12 ASCII character
    REMOVE CR$  from MyStr  ' 13 ASCII character
    REMOVE SPC$ from MyStr  ' 32 ASCII character
END SUB


' Ref:  https://bcxbasiccoders.com/webhelp/html/bcxsystemvariables.htm

' Ref:  http://www.tutorialspoint.com/c_standard_library/c_function_isspace.htm
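
The lookup-table idea extends to all six whitespace characters by zeroing six table entries instead of two. Here is a rough C sketch of that variation. The function name remove_ws_table is made up for this example, and the table is filled once at run time rather than written out as a 256-entry literal; it is a sketch of the technique, not a drop-in replacement for the BCX code.

```c
#include <stddef.h>

/* Copy src into dst, dropping every byte whose table entry is 0.
   Returns the length of the result.
   dst must have room for strlen(src) + 1 bytes. */
size_t remove_ws_table(const char *src, char *dst)
{
    static unsigned char advance[256];  /* 1 = keep byte, 0 = drop byte */
    static int ready = 0;               /* one-time setup, not thread-safe */

    if (!ready) {
        for (int c = 0; c < 256; c++)
            advance[c] = 1;
        advance['\t'] = 0;  /*  9 */
        advance['\n'] = 0;  /* 10 */
        advance['\v'] = 0;  /* 11 */
        advance['\f'] = 0;  /* 12 */
        advance['\r'] = 0;  /* 13 */
        advance[' ']  = 0;  /* 32 */
        ready = 1;
    }

    char *out = dst;
    unsigned char t;

    do {
        t = (unsigned char)*src++;
        *out = (char)t;      /* always store; a dropped byte gets overwritten */
        out += advance[t];   /* move on only for kept bytes (NUL counts as kept) */
    } while (t != 0);

    return (size_t)(out - dst) - 1;  /* do not count the terminating NUL */
}
```

The loop body has no branch on the character value: every byte is stored, and the table decides whether the write position moves. That is the likely reason the table version is fast.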

airr

« Reply #3 on: June 08, 2024, 12:39:00 PM »
@MrB, your second cited reference (for the isspace() function) got me thinking, so I threw this together:

Code: [Select]
Function RemoveSpaces(MyStr as string, buff as string) as integer
    dim as int i, j = 0, length = len(MyStr)

    for integer i = 0 to length - 1
        if not isspace(MyStr[i]) then
            buff[j] = MyStr[i]
            j++
        end if
    next

    buff[j] = 0   ' null-terminate so buff is a valid string

    return j
End Function

AIR.

MrBcx

« Reply #4 on: June 08, 2024, 12:58:33 PM »
Hello Armando,

My earlier code is for users who prefer or need an easy-to-understand procedure to tinker with.

Below is a string function that derives from your code.  I like this a lot!

Code: [Select]
PRINT NoWhiteSpace ("   This     is    a    test    ")

PAUSE

FUNCTION NoWhiteSpace (MyStr AS STRING) AS STRING
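    DIM RAW AS INT nLen = LEN(MyStr) + 1   ' +1 leaves room for the terminating NUL
    DIM buff$ * nLen
    DIM AS INT i, j = 0, length = LEN(MyStr)
    FOR INTEGER i = 0 TO length -1
        IF NOT isspace(MyStr[i]) THEN
            buff[j] = MyStr[i]
            j++
        END IF
    NEXT
    buff[j] = 0   ' Null-terminate the result
    FUNCTION = buff$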
END FUNCTION


djsb

« Reply #5 on: June 08, 2024, 01:50:38 PM »
Maybe someday I'll understand how the code posted above works.

airr

« Reply #6 on: June 08, 2024, 01:55:58 PM »
Using MrB's function, I pasted the code into ChatGPT and asked it to explain the code.  Here is the result:

Quote
Purpose of the Function

The NoWhiteSpace function is designed to take a string as input and return a new string that is identical to the input string but with all whitespace characters removed.

Steps Involved

    Declaration: The function is named NoWhiteSpace and it expects a single string argument. It returns a string as well.

    Variable Initialization:
        A buffer string (buff$) is created to hold the characters from the input string that are not whitespace. This buffer string is initialized to have the same length as the input string, although it will not necessarily use all of this space.
        Several integer variables are declared:
            i will serve as a loop counter to iterate through each character in the input string.
            j is initialized to 0 and will be used to keep track of the current position in the buffer string where the next non-whitespace character should be placed.
            length holds the length of the input string to determine how many characters need to be checked.

    Loop Through Characters:
        A loop runs from the first character to the last character of the input string.
        Inside this loop, each character of the input string is checked to see if it is a whitespace character.

    Check and Append Non-Whitespace Characters:
        If the character is not a whitespace character, it is added to the buffer string at the current position indicated by j.
        The position counter j is then incremented to prepare for the next non-whitespace character.

    Return the Result:
        After the loop has processed all characters in the input string, the buffer string (which now contains only the non-whitespace characters) is returned as the result of the function.

Summary

In summary, the NoWhiteSpace function processes an input string character by character, removing any whitespace characters and constructing a new string with only the non-whitespace characters. This new string is then returned as the output of the function.
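
For readers more at home in C, the steps described in the quote can be sketched as follows. This is an illustration under assumptions, not the C that BCX actually emits: the name no_white_space and the use of malloc are choices made for this example, since BCX manages string storage itself. The cast to unsigned char is there because isspace() is only defined for unsigned char values (or EOF).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of s with all whitespace removed,
   or NULL if the allocation fails. The caller frees the result. */
char *no_white_space(const char *s)
{
    size_t length = strlen(s);          /* how many characters to check */
    char *buff = malloc(length + 1);    /* worst case: nothing removed, plus NUL */
    size_t j = 0;                       /* next write position in buff */

    if (buff == NULL)
        return NULL;

    for (size_t i = 0; i < length; i++) {
        if (!isspace((unsigned char)s[i]))
            buff[j++] = s[i];           /* keep non-whitespace characters */
    }

    buff[j] = '\0';                     /* terminate the shortened string */
    return buff;
}
```

Calling no_white_space("   This     is    a    test    ") yields a string holding "Thisisatest", which the caller then releases with free().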

AIR.

MrBcx

« Reply #7 on: June 08, 2024, 03:00:29 PM »
There's more than one way to skin a cat.

The function below lets us remove all individual characters in the second argument from the first argument.

Code: [Select]
DIM MyString AS STRING

MyString  = "   This   is a test. "
PRINT MyString

MyString  = RemoveAny(MyString, CHR$(9, 10, 11, 12, 13, 32))  ' <<-- These 6 chars collectively make up "white-space"

PRINT "Trimmed string = ", MyString
PRINT "Length of the string = ", LEN(MyString )
PAUSE

FUNCTION RemoveAny (Target AS STRING, CharsToRemove AS STRING) AS STRING
    DIM AS PCHAR WritePtr
    DIM AS INTEGER TargetLen
    DIM AS STRING Result
    TargetLen = LEN(Target)
    WritePtr = Result
    FOR INT i = 1 TO TargetLen
        IF INSTR(CharsToRemove, MID$(Target, i, 1)) = 0 THEN
            *WritePtr = Target[i-1]
            INCR WritePtr
        END IF
    NEXT
    *WritePtr = 0   ' Null-terminate the result
    FUNCTION = Result$
END FUNCTION
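
The same read-pointer/write-pointer idea can be sketched in C. Note the differences from the BCX function above: this version edits the string in place instead of returning a new one, and it uses strchr() for the membership test. The name remove_any is hypothetical and exists only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Remove from target every character that appears in chars_to_remove.
   Works in place and returns the new length.
   target must be writable (an array, not a string literal). */
size_t remove_any(char *target, const char *chars_to_remove)
{
    char *write = target;

    for (const char *read = target; *read != '\0'; read++) {
        /* strchr returns NULL when *read is not in the removal set */
        if (strchr(chars_to_remove, *read) == NULL)
            *write++ = *read;
    }

    *write = '\0';   /* null-terminate the shortened string */
    return (size_t)(write - target);
}
```

Working in place is safe here because the write pointer can never get ahead of the read pointer, so no separate buffer is needed. Passing "\t\n\v\f\r " as the second argument removes the same six whitespace characters as the CHR$ list above.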




« Last Edit: June 08, 2024, 04:43:33 PM by MrBcx »

MrBcx

« Reply #8 on: June 08, 2024, 03:13:47 PM »
I promise, this is the last one I'll upload today.  I just found it in my snippets library.

Code: [Select]

FUNCTION REMOVE_ANY$ (szMainStr AS LPCTSTR, szMatchStr AS LPCTSTR, Sensitivity = TRUE)
    DIM RAW AS INT nLen = LEN(szMainStr) + 1
    DIM RAW AS INT nMatch = LEN(szMatchStr)   ' number of characters to remove
    DIM szStr$ * nLen
    szStr$ = szMainStr$
    IF Sensitivity = TRUE THEN
        FOR LONG i = 1 TO nMatch   ' loop over the match string, not the main string
            szStr$ = REMOVE$(szStr$, MID$(szMatchStr$, i, 1))
        NEXT
    ELSE
        FOR LONG i = 1 TO nMatch
            szStr$ = IREMOVE$(szStr$, MID$(szMatchStr$, i, 1))
        NEXT
    END IF
    FUNCTION = szStr$
END FUNCTION

'*********************************************************************************************
'                                     E X A M P L E
'*********************************************************************************************
 DIM a$
 a$ = "abcABCdefghi1234567890"
 PRINT a$                          ' Before any removals      results:  abcABCdefghi1234567890
 PRINT
 a$ = REMOVE_ANY$(a$, "abc456")    ' Case-sensitive removal   results:  ABCdefghi1237890
 PRINT a$
 PRINT
 a$ = REMOVE_ANY$(a$, "abc456", 0) ' Case-insensitive removal results:  defghi1237890
 PRINT a$
 PAUSE


airr

« Reply #9 on: June 08, 2024, 03:39:49 PM »
Okay, I slightly refactored my isspace() version to use a while loop and added comments (courtesy of ChatGPT; I was too lazy to do that part myself!):

Code: [Select]
' Declare the function RemoveSpaces that takes a string (MyStr) and a buffer string (buff) as inputs and returns an integer
Function RemoveSpaces(MyStr as string, buff as string) as integer

    ' Declare integer variables i, j, and length. Initialize length to the length of MyStr.
    ' i and j will be used as counters, but are not explicitly initialized here, defaulting to 0.
    dim as int i, j, length = len(MyStr)
   
    ' Start a while loop that runs as long as i is less than length of MyStr
    while i < length

        ' Check if the current character in MyStr (at index i) is not a whitespace character.
        ' If it's not a whitespace, assign it to the buff string at index j and then increment j.
        if not isspace(MyStr[i]) then buff[j++] = MyStr[i]

        ' Increment the counter i by 1 to move to the next character in MyStr
        incr i

    ' End of the while loop
    wend

    ' Null-terminate buff so that it is a valid string
    buff[j] = 0

    ' Return the value of j, which represents the length of the new string in buff without whitespace
    return j

' End of the function
End Function

AIR.