Problems with the letters of my language: Ñ, á, é...

Started by Luis Candelas, October 19, 2023, 11:41:59 AM


Luis Candelas

Hello.
For some reason, the compiled program does not handle the special Spanish characters correctly. When reading from a file, the word PEQUEÑO is turned into PEQUE├æO, with the letter Ñ replaced. The same thing happens with the letters á, é... and all the other "special" characters.
With version 6 this did not happen to me. Do I have to pass some parameter when compiling so that this does not happen?
I wonder whether the compiler is at fault (I use Pelles C), because the generated C program looks correct to me.
Thanks

MrBcx

Hi Luis,

I don't know what "version 6" is.

Anyway,

The following BCX demonstration works correctly with Pelles C, MSVC, MINGW, and Lcc-Win32

Save the attached data file, "data.txt" as UTF-8

Save, translate, and compile "Test.bas".  Make sure Test.exe and data.txt are in the same folder.


Robert

Quote from: MrBcx on October 19, 2023, 12:23:55 PM

Hi MrBCX:

I don't know what to say other than to parrot intelligence, "Close but no cigar!"

Windows 10 and 11.

Tried setting CHCP 65001 and 1252 in the code.

It prints as expected to a console running CHCP 65001, but the MSGBOX text is scrambled.

Am I the only one with a problem?

Come on, you lazy lurkers, give us some feedback!


MrBcx

Quote from: Robert on October 19, 2023, 04:34:16 PM

Hi Robert,

* I changed the MSGBOX to PRINT txt
* I typed:  chcp 65001 at a command prompt
* I executed Test.exe
* Output:

C:\Temp>test

PEQUEÑO 1
PEQUEÑO 2
PEQUEÑO 3


Press any key to continue . . .

* Next ...

C:\Temp>chcp 1252
Active code page: 1252

C:\Temp>test
PEQUEÃ'O 1
PEQUEÃ'O 2
PEQUEÃ'O 3


Press any key to continue . . .


That's all I know.
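
A minimal sketch of what the two runs above suggest, written in Python rather than BCX: the bytes in the file never change, only the code page used to interpret them does. The cp437 line is an assumption about where the PEQUE├æO in the first post came from (an OEM console code page); the thread does not confirm which code page was active there.

```python
# The UTF-8 encoding of the test word: Ñ becomes the two bytes C3 91.
raw = "PEQUEÑO".encode("utf-8")

as_utf8 = raw.decode("utf-8")    # interpretation under chcp 65001
as_1252 = raw.decode("cp1252")   # interpretation under chcp 1252
as_437 = raw.decode("cp437")     # assumption: an OEM console code page

print(raw.hex(" "))
print(as_utf8)
print(as_1252)
print(as_437)
```

Under cp1252 the byte 91 is a curly opening quote, which is probably what shows up as the apostrophe in the output pasted above.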

Vortex

Hello,

This is what the message box displays on my Windows 10 system at work:

PEQUEÃ'O 1
PEQUEÃ'O 2
PEQUEÃ'O 3


The text file data.txt is ANSI, but Windows 10's Notepad utility switches to UTF-8 by default:

https://techcommunity.microsoft.com/t5/windows-10/notepad-default-encoding-utf8-windows-10-version-1903/m-p/814513

MrBcx

Quote from: Vortex on October 20, 2023, 05:04:24 AM

The text file data.txt is ANSI  ...


Nope ... that's not correct.

The file that I uploaded was created in BED, in UTF-8 format without BOM.

See the attached snapshot.



MrBcx

Clear as mud ... just like this footnote on the MS link:

CP_ACP equates to CP_UTF8 only if running on Windows Version 1903 (May 2019 Update) or above and the ActiveCodePage property described above is set to UTF-8. Otherwise, it honors the legacy system code page. We recommend using CP_UTF8 explicitly.

So it seems I owe Vortex an apology.

Microsoft's motto should be:  Microsoft - where black is white and wrong is right!
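
The "use CP_UTF8 explicitly" advice in that footnote can be illustrated by analogy in Python (not BCX or the Win32 API): name the encoding when reading the file instead of relying on whatever default is active. This is only a sketch. The three-line file content is assumed from the output shown earlier, and cp1252 stands in for a legacy system code page.

```python
import os
import tempfile


def read_lines(path, encoding):
    """Read a text file, decoding its bytes with the named encoding."""
    with open(path, encoding=encoding) as f:
        return f.read().splitlines()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "data.txt")
    # Assumed content of data.txt, saved as UTF-8 without BOM.
    with open(path, "wb") as f:
        f.write("PEQUEÑO 1\nPEQUEÑO 2\nPEQUEÑO 3\n".encode("utf-8"))

    explicit = read_lines(path, "utf-8")   # encoding stated explicitly
    legacy = read_lines(path, "cp1252")    # stand-in for a legacy code page

print(explicit)
print(legacy)
```

The explicit read gives the same result on every machine; the legacy read depends on which code page happens to be in force.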




Robert

Quote from: Vortex on October 20, 2023, 05:04:24 AM

Hi Vortex:

If you are running Windows Version 1903 (May 2019 Update) or newer then

go to

Windows Settings > Time & language > Language & region > Administrative language settings > Change system locale

and check Beta: Use Unicode UTF-8

Otherwise, the UTF-8 text has to be converted to UTF-16LE, and the code has to be modified to run as UTF-16LE Unicode. The BCX expert on UTF-16LE Unicode is Ian Casey. There is a link to Ian Casey's work on the BCX home page at

https://bcxbasiccoders.com/
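
As a rough illustration of the UTF-8 to UTF-16LE conversion mentioned above, here is a Python sketch (not BCX, and not Ian Casey's code). In a Win32 C program the corresponding step would normally be a call to MultiByteToWideChar with CP_UTF8 before passing the text to a wide-character API such as MessageBoxW; the sketch only shows what happens to the bytes.

```python
# UTF-8 bytes as they would be read from data.txt.
utf8_bytes = "PEQUEÑO".encode("utf-8")

# Decode to text, then re-encode as UTF-16LE (two bytes per character here).
text = utf8_bytes.decode("utf-8")
utf16le = text.encode("utf-16-le")

print(len(utf8_bytes), utf8_bytes.hex(" "))
print(len(utf16le), utf16le.hex(" "))
```

In the UTF-16LE output, Ñ appears as the byte pair d1 00, that is, code point U+00D1 stored low byte first.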

Robert

Quote from: MrBcx on October 20, 2023, 10:25:47 PM

Hi MrBCX:

On your machine, did you

go to

Windows Settings > Time & language > Language & region > Administrative language settings > Change system locale

and check Beta: Use Unicode UTF-8

???

If the Beta: Use Unicode UTF-8 is not checked on your machine, I can not imagine how the correct glyphs were displayed in the MessageBox.

Robert

Quote from: MrBcx on October 20, 2023, 08:37:06 AM

Hi MrBCX:

Your file is in UTF-8 format without BOM.

FWIW, N with tilde (Ñ) is at code point hex A5 in code page 437, which is an OEM code page rather than an ANSI one.

It is at a different code point, hex D1, in these code pages:

ISO-8859-1
ISO-8859-3
ISO-8859-9
Windows-1252
Windows-1254
Windows-1258

The Unicode code point is also hex D1 (U+00D1).

It gets abysmally confusing when all the hundreds of code pages are considered.
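
The positions listed above can be checked with a short Python sketch (Python is used only for illustration; the codec names are Python's names for these code pages, and the UTF-8 row is added for comparison):

```python
# Code pages named in the post, plus UTF-8 for comparison.
code_pages = [
    "cp437",
    "iso-8859-1",
    "iso-8859-3",
    "iso-8859-9",
    "windows-1252",
    "windows-1254",
    "windows-1258",
    "utf-8",
]

# Map each code page to the hex bytes it uses for the letter Ñ.
positions = {name: "Ñ".encode(name).hex() for name in code_pages}

for name, hex_bytes in positions.items():
    print(f"{name:14} {hex_bytes}")

# The Unicode code point itself, independent of any encoding.
print("Unicode code point:", hex(ord("Ñ")))
```

Only UTF-8 needs two bytes (c3 91) for the letter; every single-byte code page in the list stores it in one byte, at a5 or d1.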



 



MrBcx

Quote from: Robert on October 20, 2023, 11:39:02 PM

Hi MrBCX:

On your machine, did you go to

Windows Settings > Time & language > Language & region > Administrative language settings > Change system locale  and check Beta: Use Unicode UTF-8

NO ... This was the first time I ever navigated to that dialog.  See attached screenshot


If the Beta: Use Unicode UTF-8 is not checked on your machine, I can not imagine how the correct glyphs were displayed in the MessageBox.

We're Microsoft! Where black is white and wrong is right!

Robert

Quote from: MrBcx on October 21, 2023, 12:16:15 AM
We're Microsoft! Where black is white and wrong is right!

And it's hard to live that way

How I look and how I feel
What I'm told and what is real
That's two different things

What they ask and what I pay
What I think and what I say
That's two different things

It isn't kind, it isn't fair
I have to draw the line somewhere

Vortex

Hi Kevin,

No worries. This time I checked the text file with Notepad++; it's UTF-8 without a byte order mark.

Hi Robert,

Thanks, I will test the UNICODE setting on my computer at work.

Luis Candelas

When I said version 6, I was referring to BCX 6.82.
I attach a screenshot. I write the code in BED, including the words with accents. The file name contains the letter Ñ.


At some point the program stops recognizing that letter and changes it, as seen in the lower part of the screenshot.
I opened the saved file with Jen's file, and the word 'párrafo' from the code has been altered in it.
I'm being very clumsy, because I can't get the code to compile to show you the example.
I attach the code, the file it reads, and the image.
I have checked the system configuration (Windows 10) and I have everything set to Spanish. I don't see where to make the local language changes, which I think shouldn't be necessary.
Thank you very much to all.