The online racing simulator
LFS encoding and codepages
(7 posts, started )
I know codepages have been covered a fair bit before (yes, I have a search button), but I've not found any info detailing the additions of Korean and Simplified/Traditional Chinese. I also have a couple of questions about the actual codepages that are used, as I'm not sure and, let's face it, I'm a noob. In the past I've really only worked with Unicode, so I'm in a brave and chaotic new world here.

So far I think I'm right in saying I have...


Code  Name                 Codepage
----  ----                 --------
^L    Latin1               iso8859_1
^G    Greek                iso8859_7
^C    Cyrillic             iso8859_5
^J    Japanese             ???
^E    Eastern Europe       iso8859_2
^T    Turkish              iso8859_9
^B    Baltic               iso8859_4
^H    Traditional Chinese  ???
^S    Simplified Chinese   ???
^K    Korean               ???

Note: I'm using codepage names pulled out of my code, and they tend to use lots of aliases. I'm working with Python at the moment and you can see a list of supported codecs here.

I'm not sure of the actual codecs used for Japanese, Traditional and Simplified Chinese, and Korean. I could maybe take a guess, but I don't like doing that at all, as I'm having enough trouble converting all this stuff into Unicode as it is, without encouraging extra weirdness.

Alms for the poor m'lud.
edit: irrelevant.
Taken from here: http://www.lfsforum.net/showthread.php?t=36628


<?php
    // LFS codepage switch letter => encoding name
    $sets = array(
        'L' => 'CP1252',
        'G' => 'ISO-8859-7',
        'C' => 'CP1251',
        'E' => 'ISO-8859-2',
        'T' => 'ISO-8859-9',
        'B' => 'ISO-8859-13',
        'J' => 'SJIS-win',
        'S' => 'CP936',
        'K' => 'CP949',
        'H' => 'CP950',
    );
?>
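
Since the original question was about Python, here is a rough sketch of the same mapping using Python codec names, plus a small decoder that switches codec at each ^X marker. A few things here are my assumptions and not confirmed by LFS: that `cp932` is the Python equivalent of `SJIS-win`, that text before the first marker is Latin (`L`), and the names `LFS_CODECS` and `lfs_to_unicode` are made up for illustration. Any other `^` sequences are left in the output untouched.

```python
import re

# Switch letter -> Python codec name, following the PHP array above.
# Assumption: PHP's 'SJIS-win' corresponds to Python's 'cp932'.
LFS_CODECS = {
    "L": "cp1252",
    "G": "iso8859_7",
    "C": "cp1251",
    "E": "iso8859_2",
    "T": "iso8859_9",
    "B": "iso8859_13",
    "J": "cp932",
    "S": "cp936",
    "K": "cp949",
    "H": "cp950",
}

# Matches a caret followed by one of the known switch letters.
_SWITCH = re.compile(rb"\^([LGCETBJSKH])")


def lfs_to_unicode(raw, default="L"):
    """Decode an LFS byte string, changing codec at each ^X marker.

    Assumption: text before the first marker uses the `default` codepage.
    Undecodable bytes become U+FFFD instead of raising.
    """
    # With one capture group, split() returns [text, code, text, code, text, ...].
    parts = _SWITCH.split(raw)
    out = [parts[0].decode(LFS_CODECS[default], errors="replace")]
    for code, chunk in zip(parts[1::2], parts[2::2]):
        codec = LFS_CODECS[code.decode("ascii")]
        out.append(chunk.decode(codec, errors="replace"))
    return "".join(out)


if __name__ == "__main__":
    sample = b"Hi ^C" + "Привет".encode("cp1251") + b"^J" + "日本".encode("cp932")
    print(lfs_to_unicode(sample))
```

One caveat with this naive split: in the multi-byte codepages (Japanese and the two Chinese ones) the byte 0x5E can turn up as the second byte of a character, so a character followed by a plain letter could be mistaken for a switch marker. A more careful version would need to walk the bytes with the current codec in mind.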

Note that LFS uses the Windows codepages rather than ISO. Though the Windows ones are based on ISO, they have some additional characters.
However, as you can see from my experiments, I still had to use ISO for some codepages. The Windows CPs didn't appear to work (properly) for all of them.

PS, these are just suggestions. Like you said there are many aliases, so some may work better for you than me.
edit: irrelevant.
Quote from DarkTimes: I took the names from the Python codecs, and using ISO just made it look nicer printed in a list.

It may look nicer in a list, but not all of the values will be correct.
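
To make that concrete, here is a small Python sketch that lists which high bytes two codecs decode differently. The helper name `differing_bytes` is just something I made up for this post; the codec names are standard Python ones.

```python
def differing_bytes(codec_a, codec_b):
    """Return the byte values in 0x80-0xFF that the two codecs decode differently.

    Bytes that a codec does not define are decoded as U+FFFD so they still
    compare as "different" from a codec that does define them.
    """
    diffs = []
    for value in range(0x80, 0x100):
        raw = bytes([value])
        a = raw.decode(codec_a, errors="replace")
        b = raw.decode(codec_b, errors="replace")
        if a != b:
            diffs.append(value)
    return diffs


if __name__ == "__main__":
    # Latin: only the 0x80-0x9F block differs (e.g. 0x80 is the euro sign in cp1252).
    print([hex(v) for v in differing_bytes("cp1252", "iso8859_1")])
    # Cyrillic: the letters sit at different positions, so far more bytes differ.
    print(len(differing_bytes("cp1251", "iso8859_5")))
```

For Latin the damage is limited to 0x80-0x9F, but for Cyrillic the two layouts disagree on the letters themselves, so picking the ISO name there would give you the wrong text rather than just a few missing symbols.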
edit: irrelevant.
LOL, my bad. Quick reading, I thought you meant that you were going to stick with the ISO-XXXX code pages because it looks better in your code when you print it out. LOL, that's quite something!
