More like a WARNING than a Tip or Trick

Started by MrBcx, June 01, 2025, 02:37:33 PM

Previous topic - Next topic

MrBcx

Quote from: MrBcx on October 04, 2025, 08:43:34 PMUpdate - I've fixed this bug.  What looked like correctly functioning C code was in fact flawed
as well but it compiled and ran but never released its heap memory correctly.  Wayne's code now
translates, compiles, and runs correctly regardless if compiling for c or c++


Fixing this bug led me to a happy discovery regarding properly releasing heap memory
allocated by LOCAL DYNAMIC ARRAYS when used in plain C and C++ modes.  This is the
entry that I made today inside the BCX 8.3.0 revisions log:


Kevin Diggins  : IMPROVED:  Memory that is allocated by LOCAL DYNAMIC ARRAYS is now properly
                released back to the heap, regardless if used in plain C or C++ mode.
                For example:   
          
                        PRINT TestFunc$("Everything is possible")
                        PRINT TestFunc2 (PI)
                        PAUSE

                        FUNCTION TestFunc$ (MyArg$)
                            DIM DYNAMIC DynArray1$ [10, 10]   
                            DIM DYNAMIC DynArray2$ [10, 10]   
                            DIM DYNAMIC DynArray3$ [10, 10]     
                            DynArray1$[5, 5] = MyArg$         
                            FUNCTION = DynArray1$[5, 5]
                        END FUNCTION


                        FUNCTION TestFunc2 (TheOne AS DOUBLE) AS DOUBLE
                            DIM DYNAMIC DynArray[10, 10] AS DOUBLE
                            DynArray[5, 5] = TheOne
                            FUNCTION = DynArray[5, 5] + 1 + 2 + 3
                        END FUNCTION


        ;D  ;D  ;D  ;D  ;D

MrBcx

Quote from: MrBcx on October 04, 2025, 09:25:35 AM
Quote from: Robert on October 04, 2025, 01:57:09 AMHi MrBcx:

Here's another nut to throw in your bowl.

One of Wayne Halsdorf's gems.

C is O.K.

C++ code, post BCX 8.2.7, is broken.


Hi Robert,

Thanks for the bug report.  I think I'll be able to crack this one.

BCX 8.2.7 and earlier emit code that allows Wayne's demo to compile and run with the
c++ option but what is not emitted is code to destroy the local dynamic array, thus
leaking heap memory each time it is called.

That needed code is emitted when using BCX 8.2.8 and newer but I see there is a
dimensional miscounting bug in the void* statement that needs investigation/repair.

Update - I've fixed this bug.  What looked like correctly functioning C code was in fact flawed
as well but it compiled and ran but never released its heap memory correctly.  Wayne's code now
translates, compiles, and runs correctly regardless if compiling for c or c++

This fix and more will be included in BCX 8.3.0 later this month.



MrBcx

Quote from: Robert on October 04, 2025, 01:57:09 AMHi MrBcx:

Here's another nut to throw in your bowl.

One of Wayne Halsdorf's gems.

C is O.K.

C++ code, post BCX 8.2.7, is broken.


Hi Robert,

Thanks for the bug report.  I think I'll be able to crack this one.

BCX 8.2.7 and earlier emit code that allows Wayne's demo to compile and run with the
c++ option but what is not emitted is code to destroy the local dynamic array, thus
leaking heap memory each time it is called.

That needed code is emitted when using BCX 8.2.8 and newer but I see there is a
dimensional miscounting bug in the void* statement that needs investigation/repair.

Robert

Quote from: MrBcx on October 01, 2025, 07:34:44 AMHi Robert -- I'm going to need to give this next idea some more thought. 

....

I generally like trying to crack tough nuts but I might need to leave this one in the bowl.



Hi MrBcx:

Here's another nut to throw in your bowl.

One of Wayne Halsdorf's gems.

C is O.K.

C++ code, post BCX 8.2.7, is broken.

DIM sA$

sA$ = "all of following is " & DQ$ & "inside double quotes, all text between will not be changed" & DQ$ & ", except this part"
PRINT sA$
sA$ = ExReplace$(sA$,"all","All", DQ$, DQ$)
PRINT sA$
sA$ = ExReplace$(sA$,", except",". Except", DQ$, DQ$)
PRINT sA$

sA$ = "all of following is <inside less than and greater than symbols, all text between will not be changed>, except this part"
PRINT sA$
sA$ = ExReplace$(sA$,"all","All", "<", ">")
PRINT sA$
sA$ = ExReplace$(sA$,", except",". Except", "<", ">")
PRINT sA$
'PAUSE

FUNCTION ExReplace$(sInputStr$, sFrom$, sTo$, sExcludeStart$, sExcludeEnd$)
  DIM sNewStr$
  DYNAMIC sTokenArr[20][1024] AS CHAR
  DIM iFlag AS INT
  DIM iCNT AS INT
  DIM j AS INT

  iCNT = FastLexer(sTokenArr,sInputStr$,sExcludeStart$+sExcludeEnd$)
  sNewStr$ = ""
  iFlag = 0

  FOR j = 0 TO iCNT
    IF (sTokenArr$[j] = sExcludeStart$) AND (iFlag = 0) THEN
      iFlag++
    ELSEIF sTokenArr$[j] = sExcludeEnd$ THEN
      iFlag--
    ELSE
      IF iFlag = 0 THEN
        sTokenArr$[j] = REPLACE$(sTokenArr$[j],sFrom$, sTo$)
      END IF
    END IF
    sNewStr$ = sNewStr$ + sTokenArr$[j]
  NEXT

  FUNCTION = sNewStr$
END FUNCTION

FUNCTION FastLexer(Stk AS CHAR PTR PTR, Arg$, delim2$) AS INT
  RAW Ndx AS INT
  DIM RAW cnt1=0, cnt2=0
  DIM RAW pd2 AS PCHAR
  Ndx = 1

  WHILE Arg[cnt1]
    pd2 = delim2

    WHILE *pd2
      IF *(pd2++) = Arg[cnt1] THEN
        IF cnt2 THEN Stk[Ndx++][cnt2]=0
        Stk[Ndx][0] = Arg[cnt1]
        Stk[Ndx++][1]=0 : cnt2 = 0
        GOTO again
      END IF
    WEND

    Stk[Ndx][cnt2++]=Arg[cnt1]

    again:
    INCR cnt1
  WEND

  Stk[Ndx][cnt2]=0
  IF cnt2 = 0 THEN DECR Ndx
  FUNCTION = Ndx
END SUB



MrBcx

Hi Robert -- I'm going to need to give this next idea some more thought. 

For those who mainly use C++ "as a better C", I can probably lift the C++ check on
local dynamic non-string arrays, with the (documented) understanding that the ultimate
responsibility would fall to the programmer.  The BCX translator does 90% of its magic
by employing heuristics.  Trying to account for the always increasing complexities of C++,
in this context, seems overwhelming to me at this moment.

BCX 8.2.8 introduced a system-wide local variable named "Bcx_RetVal", each one datatyped
to the function return value it resides within, EVEN WHEN LOCAL NON-STRING ARRAYS ARE NOT USED.
Those mods provided benefits beyond the freeing of memory in SUBS and FUNCTIONS but those benefits
came at the expense of even more internal complexity while processing the translation.

I generally like trying to crack tough nuts but I might need to leave this one in the bowl.


Robert

Hi MrBcx:

Ouch !

I should've R.T.F.M., where it was written by me;

👉 This improvement only works with plain C code. It won't work with C++ code that requires a C++ compiler. In that case, "FUNCTION =" should not include references to dynamic arrays because those will be destroyed before the FUNCTION can fetch the data.

R.T.F.M. means Remembered The F*****g Manual.

Thanks for straightening me out. This has been bothering me all day.

MrBcx

Hi Robert,

Seeing you translating for c++ was the clue I needed.

I refer you to the BCX Revisions.txt  (8.2.8 section) 


Kevin Diggins  : NEW! Added the freeing of LOCAL DYNAMIC NON-STRING ARRAYS in SUBS and FUNCTIONS:
                 When translated, the following example will now include the necessary C/C++
                 code that releases the dynamic (heap) memory back to the pool.
                 NOTE:  This improvement only works with plain C code.  It won't work with C++
                 code that -REQUIRES- a C++ compiler.  In that case, "FUNCTION =" should -NOT-
                 include references to dynamic arrays because those will be destroyed before the
                 FUNCTION can fetch the data.  C++ templates, references, and other features not
                 allowed in plain C limited my ability to include those in my new algorithm.


Robert

Quote from: MrBcx on September 30, 2025, 07:28:22 PMRobert,

I could have been more clear here:  https://bcxbasiccoders.com/smf/index.php?topic=1360.msg7728#msg7728

* BCX 8.2.8 contains extensive work that emits this function epilog, like I stated earlier:


  Bcx_RetVal=1.0-CDBL(LevMatrix[len1][len2])/maxLen;
  if (LevMatrix) { DestroyArr((void **)LevMatrix, 2, 1); LevMatrix=NULL; }
  return Bcx_RetVal;
}



BCX 8.2.9 introduced the FORMAT$() function. 

The updated code that I provided here: https://bcxbasiccoders.com/smf/index.php?topic=1360.msg7728#msg7728

requires BCX 8.2.9 because it uses the FORMAT$() function.


I don't intentionally try to mislead anyone but I am human and therefore subject
to human frailties.  I can usually be reached for comment on this forum.  ;)


What's puzzling to me is why you're reporting:

QuoteBCX 8.2.8 and BCX 8.2.9 return code is


  if (LevMatrix) { DestroyArr((void **)LevMatrix, 2, 1); LevMatrix=NULL; }
  return 1.0-CDBL(LevMatrix[len1][len2])/maxLen;



This seems like you are not using 828 or 829.



Hi MrBcx:

Here's my output from your Levenshtein demo

C:\T>bcx829 temp -c

BCX BASIC to C/C++ Translator (c) 1999-2025 by Kevin Diggins
Version 8.2.9 (09/25/2025) Compiled using MS Visual C++ (194435217) for 64-bit Windows
[Lines In: 60 ] [Lines Out: 819 ] [Statements: 56 ] [Time: 0.01 Sec's]
BCX translated [Temp.Bas] to [Temp.Cpp] for a C++ Compiler


C:\T>vs143c64 temp

C:\T>setlocal
**********************************************************************
** Visual Studio 2022 Developer Command Prompt v17.14.16
** Copyright (c) 2025 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'
Compiling "C:\T\temp.cpp" with Microsoft VS143 cl.exe
Microsoft (R) C/C++ Optimizing Compiler Version 19.44.35217 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

temp.cpp
Linking ................
Microsoft (R) Incremental Linker Version 14.44.35217.0
Copyright (C) Microsoft Corporation.  All rights reserved.

Finished!

C:\T>.\temp

C:\T>

No output when I run the demo.

STRING_SIMILARITY function .cpp translation

double STRING_SIMILARITY (char* s1,char* s2)
{
  int  len1={0};
  int  len2={0};
  int  i={0};
  int  j={0};
  int  cost={0};
  int  maxLen={0};
  char*  p1={0};
  char*  p2={0};
  p1=s1;
  p2=s2;
  len1=(int)strlen(s1);
  len2=(int)strlen(s2);
  maxLen=MAX(len1,len2);
  if(len1==0&&len2==0 ){
      return 1.0;
    }
  if(len1==0||len2==0 ){
      return 0.0;
    }
  if(strcmp(s1,s2)==0){
      return 1.0;
    }
  if(len1>BCXSTRSIZE||len2>BCXSTRSIZE ){
      return -1;
    }
  int  **LevMatrix=0;
  {
  size_t dimensions[2] = {(size_t)len1+1, (size_t)len2+1};
  LevMatrix= (int**)CreateArr (LevMatrix,sizeof(int),0,2, dimensions);
  }
  for(i=0; i<=len1; i++)
    {
      LevMatrix[i][0]=i;
    }
  for(j=0; j<=len2; j++)
    {
      LevMatrix[0][j]=j;
    }
  for(i=1; i<=len1; i++)
    {
      for(j=1; j<=len2; j++)
        {
          if(p1[i-1]==p2[j-1]){
              cost=0;
            }
          else
            {
              cost=1;
            }
          LevMatrix[i][j]=MIN(LevMatrix[i-1][j]+1,MIN(LevMatrix[i][j-1]+1,LevMatrix[i-1][j-1]+cost));
        }
    }
  if (LevMatrix) { DestroyArr((void **)LevMatrix, 2, 1); LevMatrix=NULL; }
  return 1.0-CDBL(LevMatrix[len1][len2])/maxLen;
}


No Bcx_RetVal.

???

MrBcx

Robert,

I could have been more clear here:  https://bcxbasiccoders.com/smf/index.php?topic=1360.msg7728#msg7728

* BCX 8.2.8 contains extensive work that emits this function epilog, like I stated earlier:


  Bcx_RetVal=1.0-CDBL(LevMatrix[len1][len2])/maxLen;
  if (LevMatrix) { DestroyArr((void **)LevMatrix, 2, 1); LevMatrix=NULL; }
  return Bcx_RetVal;
}



BCX 8.2.9 introduced the FORMAT$() function. 

The updated code that I provided here: https://bcxbasiccoders.com/smf/index.php?topic=1360.msg7728#msg7728

requires BCX 8.2.9 because it uses the FORMAT$() function.


I don't intentionally try to mislead anyone but I am human and therefore subject
to human frailties.  I can usually be reached for comment on this forum.   ;)


What's puzzling to me is why you're reporting:

QuoteBCX 8.2.8 and BCX 8.2.9 return code is


  if (LevMatrix) { DestroyArr((void **)LevMatrix, 2, 1); LevMatrix=NULL; }
  return 1.0-CDBL(LevMatrix[len1][len2])/maxLen;



This seems like you are not using 828 or 829.


Robert

#15
Hi MrBcx:

There's something you're not telling us.

BCX 8.2.8 and BCX 8.2.9 return code is

  if (LevMatrix) { DestroyArr((void **)LevMatrix, 2, 1); LevMatrix=NULL; }
  return 1.0-CDBL(LevMatrix[len1][len2])/maxLen;

and no value is returned, instead of


  Bcx_RetVal=1.0-CDBL(LevMatrix[len1][len2])/maxLen;
  if (LevMatrix) { DestroyArr((void **)LevMatrix, 2, 1); LevMatrix=NULL; }
  return Bcx_RetVal;
}

with return of

0.945945945945946


MrBcx

#14
I thought I would round out this thread with my updated example that demonstrates how BCX now
prevents the initial memory corruption problem, thanks to my BCX Version 8.2.8 improvements.

My original code required introducing a temporary variable of type DOUBLE to safeguard the
FUNCTION's return value.  That temporary variable is no longer needed, now that BCX handles
all the housekeeping automatically.  The example also uses the newly introduced FORMAT$().


DIM score AS DOUBLE
score = STRING_SIMILARITY("Calculate minimum of three operations", "Calculate minimum of the operations")
PRINT "Similarity Score: ", FORMAT$(score,"#.##%")
PAUSE


FUNCTION STRING_SIMILARITY (s1$, s2$) AS DOUBLE
    '===========================================================
    ' This function uses the Levenshtein distance algorithm but
    ' returns a value expressing the percentage of similarity of
    ' the string arguments.
    ' If either Arg LEN > BCXSTRSIZE, function returns -1
    ' 0 = no similarity, 1 = Args are equal, and everything
    ' else is a percentage of similarity between 0.0 to 1.0
    '===========================================================
    DIM AS INT len1, len2, i, j, cost, maxLen
    DIM AS CHAR PTR p1, p2

    p1 =  s1$
    p2 =  s2$

    len1 = LEN(s1$)
    len2 = LEN(s2$)

    maxLen = MAX(len1, len2)

    ' Handle edge cases
    IF len1 = 0 AND len2 = 0 THEN FUNCTION = 1.0
    IF len1 = 0 OR len2 = 0 THEN FUNCTION = 0.0
    IF s1$ = s2$ THEN FUNCTION = 1.0
    IF len1 > BCXSTRSIZE OR len2 > BCXSTRSIZE THEN FUNCTION = -1

    ' Create 2D Levenshtein matrix
    REDIM LevMatrix [len1+1, len2+1] AS INTEGER ' Make the matrix dynamic

    ' Initialize first row and column
    FOR i = 0 TO len1
        LevMatrix[i, 0] = i
    NEXT
    FOR j = 0 TO len2
        LevMatrix[0, j] = j
    NEXT

    ' Fill the matrix
    FOR i = 1 TO len1
        FOR j = 1 TO len2
            ' Compare characters using pointers
            IF p1[i-1] = p2[j-1] THEN
                cost = 0
            ELSE
                cost = 1
            END IF
            ' Calculate minimum of three operations: deletion, insertion, and substitution
            LevMatrix[i, j] = MIN(LevMatrix[i-1, j] + 1, MIN(LevMatrix[i, j-1] + 1, LevMatrix[i-1, j-1] + cost))
        NEXT
    NEXT

    FUNCTION = 1.0 - CDBL(LevMatrix[len1, len2]) / maxLen
END FUNCTION


BCX correctly frees the local array and properly returns the expression, error free.


  Bcx_RetVal=1.0-CDBL(LevMatrix[len1][len2])/maxLen;
  if (LevMatrix) { DestroyArr((void **)LevMatrix, 2, 1); LevMatrix=NULL; }
  return Bcx_RetVal;
}



This just goes to show what a little bit of mental "elbow grease" and tenacity can do.  ;D 


djsb

Some day, with a bit of effort I may understand this stuff. Thanks for the work you've done in improving BCX.

MrBcx

Quote from: MrBcx on June 08, 2025, 12:48:14 AMMany, many hours later, I now have this kind of function properly ...
 

More specifically, user functions have always relied on a local BCX string variable named BCX_RetStr
when returning results from our functions.  Non-string functions always returned their results directly.
I suppose that has worked well enough for most users but my recent discovery of local non-string dynamic
arrays -NOT- releasing their memory was too much for me to ignore.

I've added a new local variable named Bcx_RetVal that (generally) gets its datatype from the function
header, in essense, INT, ULONG, HWND, HDC, etc.  That enabled me to devise a plan for releasing local
dynamic array heap memory automatically.  Now we won't need to remember to use a temporary variable in
our function returns when those include local dynamic array data.  Less is more.

For reasons I won't get into, my changes required handling numerous edge cases that I discovered during
my regression testings.  Those served as a reminder (for me) to not be afraid of (temporarily) breaking
things and to stay focused on the goal of improving BCX. 

The cruel irony in all of this is that the more BCX improves, the less it (seemingly) gets used.  More
and more, I feel like I've become the old guy who spends his time polishing his 1956 Chevy Bel Air.

MrBcx

Many, many hours later, I now have this kind of function properly
emitting the heap clean up code automatically.  This only works
when creating a plain C file.
Code that requires a c++ compiler
has not changed and will still require diligence when using
functions like this example.



FUNCTION TestFunc (TheOne AS DOUBLE) AS DOUBLE
    DIM DYNAMIC DynArray[10, 10] AS DOUBLE
    DynArray[5, 5] = TheOne
    FUNCTION = DynArray[5, 5] + 1 + 2 + 3
END FUNCTION




double TestFunc (double TheOne)
{
  double   Bcx_RetVal= {0};
  double  **DynArray=0;
  {
  size_t dimensions[2] = {(size_t)10, (size_t)10};
  DynArray= (double**)CreateArr (DynArray,sizeof(double),0,2, dimensions);
  }
  DynArray[5][5]=TheOne;
  Bcx_RetVal = DynArray[5][5]+ 1 + 2 + 3;
  if (DynArray) { DestroyArr((void **)DynArray, 2, 1); DynArray=NULL; }
  return Bcx_RetVal;
}


MrBcx

#10
BCX 8.2.8 will eliminate the nasty heap memory leak by translating this:


FUNCTION TestFunc$ (TheOne$)
    DIM AS INTEGER size = 10
    DIM DYNAMIC DynArray1$ [size, size]    ' 10 x 10 x 2048 = 204,800 bytes
    DIM DYNAMIC DynArray2$ [size, size]    ' 10 x 10 x 2048 = 204,800 bytes
    DIM DYNAMIC DynArray3$ [size, size]    ' 10 x 10 x 2048 = 204,800 bytes

    DynArray1$[size - 5, size - 5] = TheOne$
    FUNCTION = DynArray1$[size - 5, size - 5]
END FUNCTION


to this:


char * TestFunc (char* TheOne)
{
  char  *BCX_RetStr= {0};
  int      size=10;
  char  ***DynArray1=0;
  {
  size_t dimensions[3] = {(size_t)size, (size_t)size, (size_t)BCXSTRSIZE};
  DynArray1= (char***)CreateArr (DynArray1,sizeof(char),0,3, dimensions);
  }
  char  ***DynArray2=0;
  {
  size_t dimensions[3] = {(size_t)size, (size_t)size, (size_t)BCXSTRSIZE};
  DynArray2= (char***)CreateArr (DynArray2,sizeof(char),0,3, dimensions);
  }
  char  ***DynArray3=0;
  {
  size_t dimensions[3] = {(size_t)size, (size_t)size, (size_t)BCXSTRSIZE};
  DynArray3= (char***)CreateArr (DynArray3,sizeof(char),0,3, dimensions);
  }
  strcpy(DynArray1[size-5][size-5],TheOne);
  BCX_RetStr = BCX_TempStr(strlen(DynArray1[size-5][size-5]));
  strcpy(BCX_RetStr,DynArray1[size-5][size-5]);
 
  if (DynArray3) { DestroyArr((void **)DynArray3, 3, 1); DynArray3=NULL; }
  if (DynArray2) { DestroyArr((void **)DynArray2, 3, 1); DynArray2=NULL; }
  if (DynArray1) { DestroyArr((void **)DynArray1, 3, 1); DynArray1=NULL; }
 
  return BCX_RetStr;
}



I'm certain I have more discoveries and repairs to make in my future.