Author Topic: Coding problems (Read 67988 times)

Kiemc · « **on:** November 29, 2010, 07:56:42 PM »

When I try to start MDSC.jar, the following error message pops up:

'An error occured while loading the file d:ProgramsMDSCv0.2.0langEnglish.lang: Unknown or misplaced element at line 1, position 1:'

You've forgotten to write the backslashes into the path as well.

I click OK, and then the same with the German language file:

'An error occured while loading the file d:ProgramsMDSCv0.2.0langGerman.lang: Unknown or misplaced element at line 1, position 1: □'

I click OK again, and the program starts, but shows somewhere labels like 'Ask-Title' or 'Yes-Title'.

I open the problematic language files with Notepad, and I see that English is saved in UTF-8, German in Unicode.

I save them in ANSI (on my machine it's Windows Central European, windows-1250), and no problem occurs; the language-selection menu appears correctly. As far as I know, Unicode and UTF-8 were created specifically to cope with these international coding problems. German Unicode and Hungarian Unicode should be the same. But what if some coding conversion is done by Java automatically? I don't know Java at all; it's your profession. But at the moment it's the only reason I can think of. Maybe the problem is that Java wants to convert characters on my machine into Central European Windows (windows-1250) or Central European ISO (iso-8859-2), while on your machine they are converted into Western European Windows (windows-1252) or Western European ISO (iso-8859-1). Of course, I'm just guessing. The problem may be something completely different; it's just an idea.

Meanwhile, another problem has occurred: sometimes when I start the program, it says: 'No languages found in folder c:lang' (and sometimes it tries other invalid paths, too). Why does MDSC want to open the language file from C:\, when the program's root directory is D:\Programs\MDSCv0.2.0\? And, at the same time, the pictures on the buttons also fail to load. Strange.

But let's go on to exporting and importing, where the problem is the contrary. Importing MyDefrag's own script files works without problems.

I've created a simple script with MDSC and then exported it. Then I tried to import it, but the following error message appeared:

'Failed to parse the script:
Importing the script as version 4.3.1 failed: Unknown or misplaced element at line 1, position 1: □'

And it writes more lines in the dialogue, telling exactly the same with other MyDefrag versions down to 4.2.7.

The script is exported in Unicode. But if I save it in ANSI, it is successfully imported. Or, almost successfully. Special language characters, such as á and é, don't appear correctly, only as white squares. However, if I open the script file itself (with Notepad), they are all OK. And there's the same problem with the translated language files.

So, it's a good idea to keep text files, such as scripts and language files, coded in Unicode - then they can be used worldwide, and there won't be any problems with the translations. The only thing to do is to make the program able to handle Unicode files. You thought that it handles them well, but I must warn you: it doesn't, at least in other countries.

Chris · « **Reply #1 on:** November 29, 2010, 09:06:21 PM »

Hi Kiemc, thanks for your extensive bug report, I appreciate it.

Quote from: Kiemc on November 29, 2010, 07:56:42 PM

You've forgotten to write the backslashes into the path as well.

No, I haven't forgotten them. They are in the original string, but they are removed when the "%s" in the original string is replaced with the error message. I'm aware of this but haven't paid much attention to it, since it's just a cosmetic bug.
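A possible explanation for the vanishing backslashes, assuming the "%s" placeholder is filled in with `String.replaceFirst` or `String.replaceAll` (the thread does not say which call MDSC actually uses): both methods treat a backslash in the replacement text as an escape character and drop it. The sketch below reproduces the symptom and shows two ways around it. The class name, method names, and message template are made up for illustration.

```java
import java.util.regex.Matcher;

final class PlaceholderDemo {
    // Hypothetical message template; the real MDSC string is not shown in the thread.
    static final String TEMPLATE = "Could not load the file %s";

    // Regex-based replacement: a '\' in the replacement text escapes the next
    // character, so every backslash of a Windows path silently disappears.
    static String lossy(String path) {
        return TEMPLATE.replaceFirst("%s", path);
    }

    // Literal replacement: no regex rules apply, so the backslashes survive.
    static String literal(String path) {
        return TEMPLATE.replace("%s", path);
    }

    // Regex-based replacement with the replacement text quoted first.
    static String quoted(String path) {
        return TEMPLATE.replaceFirst("%s", Matcher.quoteReplacement(path));
    }

    public static void main(String[] args) {
        String path = "d:\\Programs\\MDSCv0.2.0\\lang\\English.lang";
        System.out.println(lossy(path));   // path without backslashes
        System.out.println(literal(path)); // path intact
        System.out.println(quoted(path));  // path intact
    }
}
```

If the cause is the same in MDSC, switching to a literal replacement would be enough to restore the path in the dialog.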

Quote from: Kiemc on November 29, 2010, 07:56:42 PM

I click OK again, and the program starts, but shows somewhere labels like 'Ask-Title' or 'Yes-Title'.

Without a language file, the names and descriptions of the script elements are also missing.

Quote from: Kiemc on November 29, 2010, 07:56:42 PM

I save them in ANSI (on my machine it's Windows Central European, windows-1250), and no problem occurs; the language-selection menu appears correctly. As far as I know, Unicode and UTF-8 were created specifically to cope with these international coding problems. German Unicode and Hungarian Unicode should be the same. But what if some coding conversion is done by Java automatically? I don't know Java at all; it's your profession. But at the moment it's the only reason I can think of. Maybe the problem is that Java wants to convert characters on my machine into Central European Windows (windows-1250) or Central European ISO (iso-8859-2), while on your machine they are converted into Western European Windows (windows-1252) or Western European ISO (iso-8859-1). Of course, I'm just guessing. The problem may be something completely different; it's just an idea.

The character reading is done automatically by a Java class. I just read the first three bytes, try to match them against a known format header, and then tell the Java InputStream which format the file has. I use the UTF-8 reader both for files with the appropriate BOM and for files without one (ANSI or UTF-8). I suspect that there is a problem with the comparison of the known file headers, and since that fails, an ANSI/UTF-8 reader is used, which tries to read the BOM as a normal ANSI/UTF-8 character. Of course, this is not a known MyDefrag script element. I'll try using a different method to determine the file format.
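One way such a header comparison can fail on some machines only, assuming the header is read as characters through the platform default charset (an assumption; the MDSC source is not shown here): the same three BOM bytes decode to different characters under windows-1252 and windows-1250. The class and method names below are illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BomAsCharsDemo {
    // The UTF-8 byte order mark as it is stored on disk.
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Decodes the three BOM bytes the way a Reader using the given charset would.
    static String bomAsText(Charset charset) {
        return new String(UTF8_BOM, charset);
    }

    public static void main(String[] args) {
        String western = bomAsText(Charset.forName("windows-1252"));
        String central = bomAsText(Charset.forName("windows-1250"));
        // A comparison written against the Western European characters
        // cannot match on a Central European system.
        System.out.println("same characters: " + western.equals(central));
        // Decoded as UTF-8, the three bytes become the single character U+FEFF,
        // which a parser then sees at line 1, position 1.
        System.out.println("one char as UTF-8: " + (bomAsText(StandardCharsets.UTF_8).length() == 1));
    }
}
```

This would fit both the machine-dependent behaviour and the unprintable character shown at the end of the error message, but it remains a guess until the header check itself is inspected.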

Quote from: Kiemc on November 29, 2010, 07:56:42 PM

Meanwhile, another problem has occurred: sometimes when I start the program, it says: 'No languages found in folder c:lang' (and sometimes it tries other invalid paths, too). Why does MDSC want to open the language file from C:\, when the program's root directory is D:\Programs\MDSCv0.2.0\? And, at the same time, the pictures on the buttons also fail to load. Strange.

Yes, strange indeed. I only use relative paths when trying to read languages, MDSC scripts and icons. I just create a new reader with the path "lang\english.lang", which should be automatically completed with the root path like "D:\Programs\MDSCv0.2.0\lang\english.lang". I don't know why other locations are used as root directory.
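A likely cause, offered as an assumption: Java resolves relative paths against the working directory of the process (the `user.dir` system property), which depends on how the jar was started, not on where it is stored. The sketch below shows that default and one way to anchor paths to the jar's own folder instead. `AppPaths` and its methods are hypothetical names; only the "lang" folder and the file name come from the thread.

```java
import java.net.URISyntaxException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.CodeSource;

final class AppPaths {
    // Where a bare relative path ends up: under the JVM's working directory.
    static Path resolvedAgainstWorkingDir(String relative) {
        return Paths.get(relative).toAbsolutePath();
    }

    // Folder containing the jar (or class folder) the given class was loaded from.
    // Falls back to the working directory when no usable location is available.
    static Path installDir(Class<?> anchor) {
        try {
            CodeSource source = anchor.getProtectionDomain().getCodeSource();
            if (source != null && source.getLocation() != null) {
                Path location = Paths.get(source.getLocation().toURI());
                if (Files.isRegularFile(location) && location.getParent() != null) {
                    return location.getParent(); // location is the jar file itself
                }
                return location; // location is a class folder
            }
        } catch (URISyntaxException | RuntimeException e) {
            // Unusual class loaders may expose no file-based location.
        }
        return Paths.get("").toAbsolutePath();
    }

    // Language file path that no longer depends on the working directory.
    static Path languageFile(Class<?> anchor, String fileName) {
        return installDir(anchor).resolve("lang").resolve(fileName);
    }

    public static void main(String[] args) {
        System.out.println(resolvedAgainstWorkingDir("lang"));
        System.out.println(languageFile(AppPaths.class, "english.lang"));
    }
}
```

Starting the jar from a shortcut or another folder changes the first line of output but not the second.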

Quote from: Kiemc on November 29, 2010, 07:56:42 PM

But let's go on to exporting and importing, where the problem is the contrary.

It's exactly the same problem as above. The files are valid Unicode files. My program knows how to write Unicode, but not how to read it.

Quote from: Kiemc on November 29, 2010, 07:56:42 PM

And it writes more lines in the dialogue, telling exactly the same with other MyDefrag versions down to 4.2.7.

Yes, if the parsing fails, the file is again parsed with different versions until (hopefully) the right version of this script is found.
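The retry behaviour described here can be sketched as a loop over version-specific parsers, newest first. This is only an illustration: `VersionFallback`, `ParseFailure`, and the parser functions are hypothetical and not taken from MDSC.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class VersionFallback {
    // Thrown by a (hypothetical) version-specific parser that rejects the text.
    static final class ParseFailure extends RuntimeException {
        ParseFailure(String message) {
            super(message);
        }
    }

    // Tries each parser in map order and returns the first successful result.
    // If every parser fails, the collected messages are reported, one line per version.
    static <T> T parseWithFallback(String text, Map<String, Function<String, T>> parsersNewestFirst) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, Function<String, T>> entry : parsersNewestFirst.entrySet()) {
            try {
                return entry.getValue().apply(text);
            } catch (ParseFailure e) {
                failures.add("Importing the script as version " + entry.getKey()
                        + " failed: " + e.getMessage());
            }
        }
        throw new ParseFailure(String.join("\n", failures));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the newest version is tried first.
        Map<String, Function<String, String>> parsers = new LinkedHashMap<>();
        parsers.put("4.3.1", text -> {
            throw new ParseFailure("Unknown or misplaced element");
        });
        parsers.put("4.2.7", text -> "parsed: " + text);
        System.out.println(parseWithFallback("example script", parsers));
    }
}
```

When the file cannot be decoded at all, as with the unread BOM, every version fails for the same reason, which is why the dialog repeats the same message for each one.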

Quote from: Kiemc on November 29, 2010, 07:56:42 PM

The script is exported in Unicode. But if I save it in ANSI, it is successfully imported. Or, almost successfully. Special language characters, such as á and é, don't appear correctly, only as white squares. However, if I open the script file itself (with Notepad), they are all OK. And there's the same problem with the translated language files.

I suspect that this is a different problem from the one above. Could you upload the converted file with just a few of these special characters, so that I can check whether I have the same problem?

Quote from: Kiemc on November 29, 2010, 07:56:42 PM

So, it's a good idea to keep text files, such as scripts and language files, coded in Unicode - then they can be used worldwide, and there won't be any problems with the translations. The only thing to do is to make the program able to handle Unicode files. You thought that it handles them well, but I must warn you: it doesn't, at least in other countries.

You're right (unfortunately). I hope we can figure out where the problem resides.

Kiemc · « **Reply #2 on:** November 30, 2010, 03:18:07 PM »

Quote from: Chris on November 29, 2010, 09:06:21 PM

Could you upload the converted file with just a few of these special characters, so that I can check whether I have the same problem?

I've attached the modified language file in three formats: ANSI, UNICODE and UTF-8. I've changed the "Open a MyDeSC script" message to a simple Hungarian phrase containing all the special Hungarian language characters: Árvíztűrő tükörfúrógép (I don't know if it appears well here in the Forum, on your screen).

Chris · « **Reply #3 on:** November 30, 2010, 04:28:51 PM »

Thanks for the files, I appreciate it.
I get the same problem when using the ANSI file (as well as with "my" special letters ö, ä, ü). UTF-8 and UNICODE, on the other hand, are displayed fine, as they should be.

I've changed the way the BOMs are detected. Before, I read these as single characters, which worked fine on my system, but now I'm reading them as single bytes. I'm confident that this will work on your system, too. I'll upload the new version soon.
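Byte-based detection of this kind can be sketched as follows. This is not the MDSC code: `BomSniffer` is a hypothetical name, and the UTF-8 default for files without a BOM is taken from Chris's earlier description. Notepad's "Unicode" format is assumed to be UTF-16 little-endian with a BOM.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BomSniffer {
    // Opens a Reader whose charset is chosen from the raw leading bytes.
    // A recognised BOM is consumed; all other bytes are pushed back unread.
    static Reader open(InputStream raw) throws IOException {
        PushbackInputStream in = new PushbackInputStream(raw, 3);
        byte[] head = new byte[3];
        int n = readHead(in, head);
        Charset charset = StandardCharsets.UTF_8; // assumed default without a BOM
        int bomLength = 0;
        if (n >= 3 && (head[0] & 0xFF) == 0xEF && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            bomLength = 3; // UTF-8 BOM
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            charset = StandardCharsets.UTF_16LE;
            bomLength = 2;
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            charset = StandardCharsets.UTF_16BE;
            bomLength = 2;
        }
        if (n > bomLength) {
            in.unread(head, bomLength, n - bomLength);
        }
        return new InputStreamReader(in, charset);
    }

    // Reads up to head.length bytes, tolerating short reads; returns the count read.
    private static int readHead(InputStream in, byte[] head) throws IOException {
        int total = 0;
        while (total < head.length) {
            int r = in.read(head, total, head.length - total);
            if (r < 0) {
                break;
            }
            total += r;
        }
        return total;
    }

    // Convenience helper: decodes a whole file image held in memory.
    static String decode(byte[] fileContent) {
        try (Reader reader = open(new ByteArrayInputStream(fileContent))) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the comparison happens on raw bytes, the result no longer depends on the default code page of the machine.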

Chris · « **Reply #4 on:** January 12, 2011, 10:04:44 PM »

I've just released a new version with a different approach to detecting the file formats. Could you please try this version and see if the files can be read correctly?

Kiemc · « **Reply #5 on:** January 25, 2011, 09:17:25 PM »

I'm very glad to report that now, with Unicode coding, both applying translated language files and importing existing scripts, everything works fine and properly! No more error messages appear, and special national characters also appear correctly.

But only if the files are coded in Unicode. If they are in ANSI, special characters don't appear, just white squares. (But even in this case, no error message appears.)

So: everything is OK, but only in Unicode.

Apart from this coding problem, I haven't had time to test your program thoroughly - maybe later, but I expect to be quite busy in the coming months. So I don't know if there are other bugs, for example in the graphics. And - at least for some time - I can't continue the translation either. I'm sorry - but my final exams are coming...

Chris · « **Reply #6 on:** January 25, 2011, 09:27:46 PM »

Thanks for the report, good to see that it's working now.

Quote from: Kiemc on January 25, 2011, 09:17:25 PM

But only if the files are coded in Unicode. If they are in ANSI, special characters don't appear, just white squares. (But even in this case, no error message appears.)

That is normal: Java uses a different built-in code page for these symbols. Just keep using UTF-8 or Unicode and you'll be fine.
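The white squares can be reproduced outside MDSC. Assuming, as described earlier in the thread, that files without a BOM are read as UTF-8, the accented bytes of a windows-1250 file are not valid UTF-8, and Java substitutes the replacement character U+FFFD for them. The class name below is illustrative; the phrase is Kiemc's test sentence.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class AnsiAsUtf8Demo {
    // Kiemc's test phrase, written with Unicode escapes so that the encoding
    // of this source file does not matter.
    static final String PHRASE =
            "\u00C1rv\u00EDzt\u0171r\u0151 t\u00FCk\u00F6rf\u00FAr\u00F3g\u00E9p";

    // Encodes the text as if saved with one charset, then decodes it with another.
    static String saveAndReload(String text, Charset savedAs, Charset readAs) {
        return new String(text.getBytes(savedAs), readAs);
    }

    public static void main(String[] args) {
        Charset ansi = Charset.forName("windows-1250");
        String garbled = saveAndReload(PHRASE, ansi, StandardCharsets.UTF_8);
        String intact = saveAndReload(PHRASE, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("ANSI read as UTF-8 contains U+FFFD: " + (garbled.indexOf('\uFFFD') >= 0));
        System.out.println("UTF-8 read as UTF-8 is unchanged: " + intact.equals(PHRASE));
    }
}
```

Plain ASCII characters are encoded identically in both charsets, which is why the rest of the text still reads correctly and no parse error appears.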
