Author Topic: Line Input (Chapter 2)

MrBcx

  • Administrator
  • Sr. Member
  • *****
  • Posts: 264
    • View Profile
Line Input (Chapter 2)
« on: February 21, 2020, 09:31:22 AM »
I want to say a few things about LINE INPUT.

As BcxHelp.chm explains fairly well, BCX provides two versions of the LINE INPUT statement:

 --- one for reading text from a file
 --- one for reading text from the keyboard

Neither implementation allocates memory -- the memory variable is supplied by the user.

It is the BASIC programmer's responsibility to help guard against the potential dangers that a
memory overrun can cause.  LINE INPUT from the keyboard may be a perfectly suitable design option,
if you do not share your program with others.  Otherwise, LINE INPUT could easily be made to overrun
memory and potentially allow a computer to be hacked.  LINE INPUT was created years before the
internet came into being and before the widespread attacks that are unfortunately commonplace today.
 
BcxHelp.chm needs a minor but important update to its section explaining LINE INPUT with the keyboard.

The receiving string variable must be a statically dimensioned string variable.  It cannot be a dynamically
dimensioned string because BCX's current translation emits the compile-time C operator "sizeof", which
cannot determine the number of bytes that a dynamically dimensioned string can hold.
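
The sizeof limitation is easy to see in plain C.  What follows is only a minimal sketch of the general C behavior, not the code BCX actually emits; the helper names static_capacity and dynamic_capacity are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: capacity of a statically dimensioned buffer.
   sizeof sees the whole array, so it reports the real byte count. */
size_t static_capacity(void)
{
    char buffer[2048];
    buffer[0] = '\0';
    return sizeof(buffer);              /* 2048 */
}

/* Hypothetical helper: the same question asked of a heap allocation.
   sizeof sees only the pointer variable, not the block behind it. */
size_t dynamic_capacity(void)
{
    char *buffer = malloc(2048);
    size_t reported;

    assert(buffer != NULL);
    reported = sizeof(buffer);          /* size of a pointer, typically 4 or 8 */
    free(buffer);
    return reported;
}
```

A bounded read that relies on sizeof would therefore get the right limit for the first buffer and a limit of only a few bytes for the second.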

Next up:

LINE INPUT with reading data from files.

In the previous LINE INPUT thread, I provided two functions that can be easily used to help guard against
memory overrun using the LINE INPUT statement for reading from files.  Neither function allocates any
memory, so the risk of memory overrun is practically zero.  More importantly, the two functions give
the BASIC programmer ways to prevent memory overruns by allowing you to measure the length of
each line of text BEFORE it is stored in your string variable.

Here is the link to those two functions, along with an example of how you can use them.

https://bcxbasiccoders.com/smf/index.php?topic=138.msg573#msg573
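
For readers who think in C, the idea of measuring a line before storing it can be sketched roughly as follows.  This is not the code from the linked post; peek_line_length is a hypothetical name, and the sketch assumes a seekable file.  If the file is opened in binary mode, a CR before the LF is counted as part of the line.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: count the bytes up to (not including) the next
   '\n' or end of file, then restore the original file position.
   Returns -1 if the position cannot be saved or restored. */
long peek_line_length(FILE *fp)
{
    long start, count = 0;
    int ch;

    assert(fp != NULL);
    start = ftell(fp);
    if (start < 0) return -1;
    while ((ch = getc(fp)) != EOF && ch != '\n')
        count++;
    if (fseek(fp, start, SEEK_SET) != 0) return -1;
    return count;
}
```

The caller can compare the result, plus room for the line terminator and the trailing NUL, against the capacity of its buffer before reading the line for real.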
« Last Edit: February 21, 2020, 02:42:29 PM by MrBcx »

Robert

  • Full Member
  • ***
  • Posts: 212
    • View Profile
Re: Line Input (Chapter 2)
« Reply #1 on: February 22, 2020, 06:08:49 AM »
I have altered the LINE INPUT from file example in the BCX Help file from


Code: [Select]

Example: LINE INPUT from file


 DIM a$
 DIM z%

 a$ = "Test"

 OPEN a$ FOR OUTPUT AS fp1
 ? "Creating test file ..."
 FOR z% = 1 TO 1000
  FPRINT fp1, "This is line no.", STR$(z%)
 NEXT
 CLOSE fp1

 OPEN a$ FOR INPUT AS fp1
 WHILE NOT EOF(fp1)
  LINE INPUT fp1, a$
  ? a$
 WEND
 CLOSE fp1

 CLS

 ? "Removing Test file"

 KILL "test"


to

Code: [Select]

Example: LINE INPUT from file

 DIM Buffer$[1048832] AS CHAR
 DIM z%
 
 Buffer$ = "Test"
 
 OPEN Buffer$ FOR OUTPUT AS fp1
 ? "Creating test file ..."
 FOR z% = 1 TO 1000
   FPRINT fp1, "This is line no.", STR$(z%)
 NEXT
 CLOSE fp1
 
 OPEN Buffer$ FOR INPUT AS fp1
 WHILE NOT EOF(fp1)
   LINE INPUT fp1, Buffer$
   ? Buffer$
 WEND
 CLOSE fp1
 
 CLS
 
 ? "Removing Test file"
 
 KILL "test"


Remarks:
In the example above, the size of the string being dimensioned

 DIM Buffer$[1048832] AS CHAR
 
may be altered, but Buffer$ must be dimensioned large enough to accommodate the longest line to be input from the file.
The 1048832 value, large enough to ensure that there will not be a buffer overflow, is 256 bytes larger than
the maximum number of characters (1048576, or 1 MB) that can be copied from the file into Buffer$.
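
When the buffer cannot be made that large, a bounded read can at least report that a line did not fit instead of overrunning memory.  Here is a minimal C sketch of that idea, assuming nothing about how BCX translates LINE INPUT; read_line_checked is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (capacity cap bytes).
   Returns 1 for a complete line, 0 if the line was longer than the
   buffer (the remainder stays unread in the file), -1 at end of file. */
int read_line_checked(FILE *fp, char *buf, size_t cap)
{
    size_t len;
    int next;

    assert(fp != NULL && buf != NULL && cap >= 2);
    if (fgets(buf, (int)cap, fp) == NULL)
        return -1;
    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';            /* strip the newline */
        return 1;
    }
    /* No newline stored: either the file ended or the line was cut short.
       Peek one character to tell the two cases apart. */
    next = getc(fp);
    if (next == EOF)
        return 1;
    ungetc(next, fp);
    return 0;
}
```

fgets never stores more than cap - 1 characters plus the terminating NUL, so the worst outcome of an undersized buffer is a 0 return rather than a memory overrun.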

Robert

  • Full Member
  • ***
  • Posts: 212
    • View Profile
Re: Line Input (Chapter 2)
« Reply #2 on: February 22, 2020, 06:40:27 AM »
Here's a version of the LINE INPUT file example incorporating MrBcx's GetLongestLineInFile routine, the result of which is used to dimension the input buffer of the LINE INPUT procedure. C'mon you Munchkins, streamline this bloat. You may even reinvent getline.

Code: [Select]

DIM FileName$
DIM BufSize%
DIM z%

FileName$ = "Test"

OPEN FileName$ FOR OUTPUT AS fp1
? "Creating test file ..."
FOR z% = 1 TO 1000
  FPRINT fp1, "This is line no.", STR$(z%)
NEXT
CLOSE fp1

BufSize% = GetLongestLineInFile(FileName$) + 2  ' +2: room for the line terminator and the string's trailing NUL
DIM DYNAMIC Buffer$[BufSize] AS CHAR

OPEN FileName$ FOR INPUT AS fp1
WHILE NOT EOF(fp1)
  LINE INPUT fp1, Buffer$
  ? Buffer$
WEND
CLOSE fp1

CLS

? "Removing Test file"

KILL "test"

FUNCTION GetLongestLineInFile (FileName$) AS UINT
   DIM Longest AS UINT
   DIM LinLen AS UINT

   FP = FREEFILE

   OPEN FileName$ FOR INPUT AS FP

   WHILE NOT EOF (FP)
     LinLen = GetLineLen (FP)
     IF LinLen > Longest THEN Longest = LinLen
     fseek(FP, (LinLen+2), SEEK_CUR)
   WEND

   CLOSE FP
   FUNCTION = Longest
END FUNCTION

FUNCTION GetLineLen (FP as FILE) AS UINT
   DIM Counter AS UINT
   DIM OneByte [1] AS BYTE     ' Allocate just 1 byte
   WHILE *OneByte <> 10        ' While that byte <> LF
      GET$ FP, OneByte$, 1     ' get next byte
      Counter++                ' keep track of bytes read
   WEND
   Counter--
   '********************************************************************
   ' Now return the file pointer back to the start of the current line
   '********************************************************************
   fseek(FP, -(Counter+2), SEEK_CUR) ' +2 accounts for CRLF on Windows
   FUNCTION = Counter  ' this is the length of our line
END FUNCTION
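
The same two-pass approach (scan for the longest line, then size the buffer from the result) can be sketched in C as follows.  This is an illustration under my own assumptions rather than a translation of the BCX code above; longest_line and alloc_line_buffer are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: one pass over the file, returning the length in
   bytes of the longest line, not counting '\n'.  A '\r' before the '\n'
   is counted, which errs on the safe side.  The file position is left
   at the start of the file. */
size_t longest_line(FILE *fp)
{
    size_t longest = 0, current = 0;
    int ch;

    assert(fp != NULL);
    rewind(fp);
    while ((ch = getc(fp)) != EOF) {
        if (ch == '\n') {
            if (current > longest) longest = current;
            current = 0;
        } else {
            current++;
        }
    }
    if (current > longest) longest = current;   /* last line without '\n' */
    rewind(fp);
    return longest;
}

/* Hypothetical helper: allocate a buffer that holds the longest line
   plus its '\n' and the terminating NUL.  The caller frees it. */
char *alloc_line_buffer(FILE *fp, size_t *cap)
{
    assert(cap != NULL);
    *cap = longest_line(fp) + 2;
    return calloc(*cap, 1);
}
```

The extra two bytes are the point of the exercise: a buffer sized to exactly the longest line has no room for the newline that fgets stores or for the NUL that ends the string.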


MrBcx

  • Administrator
  • Sr. Member
  • *****
  • Posts: 264
    • View Profile
Re: Line Input (Chapter 2)
« Reply #3 on: February 22, 2020, 08:29:49 AM »
I have altered the LINE INPUT from file example in the BCX Help file ...

Robert ...

The original and your reworked version both use a common variable, A$ in the original and Buffer$ in your reworked version.
In both cases, the variable stores a filename and later stores the data.  However legal or safe that may be, it looks confusing
and poorly designed.  Since it's intended to be a simple example, I'd recommend using the variable only for the data and
switching to a quoted literal string for the filename.

I probably wrote the original example when quick and dirty surged through my veins.  You likely wanted to respect the
original intent with your reworked version.  It's never too late to start all over again.  (Steppenwolf)

P.S.  I noticed your expanded example with GetLongestLineInFile uses unique variables for the filename and data.  Muy Bueno!
« Last Edit: February 22, 2020, 08:36:20 AM by MrBcx »

Robert

  • Full Member
  • ***
  • Posts: 212
    • View Profile
Re: Line Input (Chapter 2)
« Reply #4 on: February 22, 2020, 02:10:32 PM »
Oh ! Yes, respect, there was that, but I thought you were being clever.

Yes, I agree. I shall fix that.

Steppenwolf, ah yeah, back in the day, you shoulda been there.

That line, "It's never too late to start all over again", captures the essence of Hermann Hesse's novel Steppenwolf, which Hesse complained was "more often and more violently misunderstood" than any of his other writings. Hesse felt that his readers focused only on the suffering and despair depicted in the life of the novel's protagonist, Harry Haller, thereby missing the possibility of transcendence and healing. Steppenwolf's (the band) John Kay and Nick St. Nicholas got it when they wrote

"Oh, no, not too late
It's never too late to start all over again."

Wayne Halsdorf

  • Newbie
  • *
  • Posts: 6
    • View Profile
Re: Line Input (Chapter 2)
« Reply #5 on: February 22, 2020, 06:06:48 PM »
Here's something that I was working on a few weeks ago but never finished testing. There are places that still need error checking. If someone wants to convert it, feel free. It was written to be compliant with the newer standards.
 
Code: [Select]
// TESTIINGCPP.cpp : This file contains the 'main' function. Program execution begins and ends there.
//

#include <stdio.h>
#include <stdlib.h>  // calloc, free, exit (replaces the non-standard <malloc.h>)
#include <string.h>  // strlen, strncpy_s

#define Blocksize 1024

typedef struct tagReadBlock
{
  size_t Used;
  char sStr[Blocksize];
  tagReadBlock* pNextReadBlock;
} tReadBlock;

tagReadBlock tInput;
size_t ziBlockCount, ziByresused, ziByreAloocated;

void MakeFile(void);
void ReadData(FILE* , char**);

int main()
{
  char* psBuffer;
  FILE* fp; 
  MakeFile();
  if (fopen_s(&fp, "test.txt", "r+")) {
    printf_s("Could not open file for reading\n");
    exit(-1);
  }

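  ReadData(fp,&psBuffer);
  // Check the pointer itself (not the first character) before using the buffer
  if (NULL != psBuffer) {
    printf_s("%s\n", psBuffer);
    printf_s("Length of sBuffer %zu\n", strlen(psBuffer));
  }
  printf("%zu BlocksUsed to get %zu bytes\n", ziBlockCount + 1, ziByresused);
  free(psBuffer);
  fclose(fp);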

}

void   ReadData(FILE *fp,char** psBuffer)
{
  tagReadBlock* ptCurrentInputBlock;
  int iChar = 0;
  ptCurrentInputBlock = &tInput;
  ptCurrentInputBlock->Used = 0;
  ziByresused = 0;

  ziBlockCount = 0;
  while (NULL != fp && ((iChar = getc(fp)) != EOF && iChar != '\n'))
  {
    if (Blocksize == ptCurrentInputBlock->Used) {
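      if (NULL == ptCurrentInputBlock->pNextReadBlock) {
        ptCurrentInputBlock->pNextReadBlock = (tagReadBlock*)calloc(1, sizeof(tagReadBlock));
        // Stop here rather than dereference a failed allocation below
        if (NULL == ptCurrentInputBlock->pNextReadBlock) { printf_s("Out of memory\n"); exit(-1); }
      }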
      ptCurrentInputBlock = ptCurrentInputBlock->pNextReadBlock;
      ptCurrentInputBlock->Used = 0;
      ziBlockCount++;
    }
    ziByresused++;
    ptCurrentInputBlock->sStr[ptCurrentInputBlock->Used++] = (char)iChar;
  }

  ptCurrentInputBlock = &tInput;
  ziByreAloocated = ziByresused + 2;
  *psBuffer = (char*)calloc(ziByreAloocated, sizeof(char));
  for (size_t i = 0;i < ziBlockCount;i++) {
    if (NULL != *psBuffer) { strncpy_s(* psBuffer + i * Blocksize, ziByreAloocated - i * Blocksize, ptCurrentInputBlock->sStr, Blocksize); }
    ptCurrentInputBlock = ptCurrentInputBlock->pNextReadBlock;
  }

  if (NULL != *psBuffer) { strncpy_s(*psBuffer + ziBlockCount * Blocksize, ziByreAloocated - ziBlockCount * Blocksize, ptCurrentInputBlock->sStr, ptCurrentInputBlock->Used);  ptCurrentInputBlock = ptCurrentInputBlock->pNextReadBlock; }

}

void MakeFile(void)
{
  int iChar = 0;
  FILE* fp=NULL;
  if (fopen_s(&fp, "test.txt", "r+")) {
    printf_s("making file\n");
    if (fopen_s(&fp, "test.txt", "w+")) {
      printf_s("Could not open file for writing\n");
      exit(-1);
    }
    for (int i = 0;i < 4000000;i++) {
      iChar = 32 + (i % 96);
      if (NULL != fp) { fprintf_s(fp, "%c", (char)iChar); }
    }
    fflush(fp);
  }
  if (NULL != fp) { fclose(fp); };
}


Robert

  • Full Member
  • ***
  • Posts: 212
    • View Profile
Re: Line Input (Chapter 2)
« Reply #6 on: February 24, 2020, 05:31:01 PM »
Hi Wayne:

I just ran your code with DeRangeD.txt as input and all screen output suppressed, and it took 29 seconds to work through the 1-million-line file.

I was surprised. With all that complexity, I thought it would take much longer.

Wayne Halsdorf

  • Newbie
  • *
  • Posts: 6
    • View Profile
Re: Line Input (Chapter 2)
« Reply #7 on: February 24, 2020, 08:36:16 PM »
The only reason it is so slow is that it reads the file byte by byte. I haven't tried any method that can read a block at a time. This was more of a test for my own understanding of how various functions work.
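
One possible way to read in larger pieces is to fetch each line in chunks with fgets and grow a single buffer as needed.  The sketch below is not a port of the program above: it swaps the linked list of blocks for one realloc-grown buffer, read_line_alloc is a hypothetical name, and I have not measured whether it is actually faster on a large file.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length, fetching it in
   chunks with fgets and doubling the buffer when a chunk fills it.
   Returns a malloc'd string without the trailing '\n' (caller frees),
   or NULL at end of file or on allocation failure. */
char *read_line_alloc(FILE *fp)
{
    size_t cap = 1024, len = 0;
    char *buf;

    assert(fp != NULL);
    buf = malloc(cap);
    if (buf == NULL) return NULL;

    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';             /* complete line */
            return buf;
        }
        if (cap - len < 2) {                 /* chunk filled the buffer */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) { free(buf); return NULL; }
            buf = bigger;
            cap *= 2;
        }
    }
    if (len > 0) return buf;                 /* last line without '\n' */
    free(buf);
    return NULL;
}
```

Because the buffer grows with the line, the caller never has to guess a maximum line length in advance; the price is one allocation per line, which a real program might avoid by reusing the buffer between calls.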