The current version of AS supports the concept of loadable language
modules, i.e. the language AS speaks to you is not set at compile
time. Instead, AS tries to detect the language environment at startup
and then loads the appropriate set of messages dynamically. The
detection process differs depending on the platform: On MS-DOS and
OS/2 systems, AS queries the COUNTRY setting made in CONFIG.SYS. On
Unix systems, AS looks for the environment variables

    LC_MESSAGES
    LC_ALL
    LANG

and takes the first two letters from the variable that is found first.
These two letters are interpreted as the code of the language you use.

Currently, AS knows the languages German (code 049 resp. DE) and
English (code 001 resp. EN). Any other setting leads to the default,
English. Sorry, but I do not know any other languages well enough to
do further translations. You may now ask if you could add more
languages to AS, and this is just what I hoped for when I wrote these
lines ;-)

Messages are stored in text files with the extension '.res'. Since
parsing text files at every startup of the assembler would be quite
inefficient, the '.res' files are transformed into a binary, indexed
format that can be read with a few block read statements. The
translation is done during the build process with a special tool
called 'rescomp' (you might have seen rescomp being run while you
built the C version of AS). rescomp parses the input file(s), assigns
a number to each message, packs the messages into a single array of
chars with an index table, and creates an additional header file that
contains the numbers assigned to each message. A run-time library
then allows messages to be looked up via their numbers.

A message source file consists of a couple of control statements.
Empty lines are ignored; lines that start with a semicolon are
treated as comments (i.e. they are also ignored).
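
For illustration, the Unix-side language detection described above
could be sketched in C roughly as follows. This is not the code AS
actually uses: the function names code_from_locale(), pick_language()
and detect_language() are made up for this example, and treating a
value without two leading letters (such as the "C" locale) as "no code
available" is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: take the first two letters of a locale value
   such as "de_DE.UTF-8" and store them in upper case in `code`.
   Returns 1 on success, 0 if the value has no two leading letters. */
int code_from_locale(const char *value, char code[3])
{
    assert(code != NULL);
    if (value == NULL
        || !isalpha((unsigned char)value[0])
        || !isalpha((unsigned char)value[1]))
        return 0;
    code[0] = (char)toupper((unsigned char)value[0]);
    code[1] = (char)toupper((unsigned char)value[1]);
    code[2] = '\0';
    return 1;
}

/* Map a two-letter code to one of the two known languages.
   Anything unknown (or NULL) falls back to English. */
const char *pick_language(const char *code)
{
    if (code != NULL && strcmp(code, "DE") == 0)
        return "DE";
    return "EN";
}

/* Check the variables in the order given in the text and use the
   first one that is set to a non-empty value. */
const char *detect_language(void)
{
    static const char *const vars[] = { "LC_MESSAGES", "LC_ALL", "LANG" };
    char code[3];
    size_t i;

    for (i = 0; i < sizeof vars / sizeof vars[0]; i++) {
        const char *value = getenv(vars[i]);
        if (value != NULL && value[0] != '\0')
            return pick_language(code_from_locale(value, code) ? code : NULL);
    }
    return pick_language(NULL);
}
```

With LANG=de_DE.UTF-8 and the other two variables unset,
detect_language() in this sketch yields "DE"; with no usable setting
it yields "EN".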
The first control statement a message file contains is the 'Langs'
statement, which indicates the languages the messages in this file
will support. This is a *GLOBAL* setting, i.e. you cannot omit
languages for single messages! The command has the following form:

    Langs <Code>(<Country-Code(s),...>) ....

'Code' is the two-letter abbreviation for a language, e.g. 'DE' for
German. Please use only UPPERcase! The code is followed by a
comma-separated list of DOS-style country codes for DOS and OS/2
environments. As you see, several country codes may point to a
single language this way. For example, if you want to assign the
English language to both Americans and British people, write

    Langs EN(001,061) <further languages>

In case AS finds a language environment that was not explicitly
handled in the message file, the first language given to the 'Langs'
command is used. You may override this via the 'Default' statement,
e.g.

    Default DE

Once the languages are specified, the 'Message' command is the only
one left to be explained. This command starts the definition of a
message. The message file compiler reads the next 'n' lines, with
'n' being the number of languages defined by the 'Langs' command. A
sample message definition would look like

    Message TestMessage
     "Dies ist ein Test"
     "This is a test"

given that you specified German and English with the 'Langs' command.

In case the messages become longer than a single line (messages may
contain newline characters, more about this later), the use of a
backslash (\) as a line continuation character is allowed:

    Message TestMessage2
     "Dies ist eine" \
     "zweizeilige Nachricht"
     "This is a" \
     "two-line message"

Since we deal with non-English languages, we also have to deal with
characters that are not part of the standard ASCII character set - a
point where UNIX systems are traditionally weak.
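
A short aside on the 'Langs' statement before continuing with
character sets: the lookup from a DOS country code to a language,
including the fallback to the first language listed, might be sketched
like this. The function lang_for_country() is hypothetical and only
handles the argument part of the statement; it does not reflect how
rescomp or the run-time library really store this table, and the
'Default' override is left out.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Look up `country` in the argument part of a Langs statement, e.g.
   "EN(001,061) DE(049)". The two-letter code of the matching language
   is written to `out`; if no country code matches, `out` receives the
   first language listed.
   Returns 1 on an explicit match, 0 on fallback, -1 on a syntax error. */
int lang_for_country(const char *args, long country, char out[3])
{
    const char *p = args;
    int have_first = 0;

    assert(args != NULL && out != NULL);
    out[0] = '\0';
    for (;;) {
        char code[3];
        char *end;

        while (isspace((unsigned char)*p))
            p++;
        if (*p == '\0')
            break;

        /* Expect two uppercase letters followed by '('. */
        if (!isupper((unsigned char)p[0])
            || !isupper((unsigned char)p[1])
            || p[2] != '(')
            return -1;
        code[0] = p[0];
        code[1] = p[1];
        code[2] = '\0';
        p += 3;

        /* Remember the first language as the fallback. */
        if (!have_first) {
            strcpy(out, code);
            have_first = 1;
        }

        /* Walk the comma-separated country codes up to ')'. */
        for (;;) {
            long value = strtol(p, &end, 10);
            if (end == p)
                return -1;
            if (value == country) {
                strcpy(out, code);
                return 1;
            }
            p = end;
            if (*p == ',') {
                p++;
                continue;
            }
            if (*p == ')') {
                p++;
                break;
            }
            return -1;
        }
    }
    return have_first ? 0 : -1;
}
```

Called with "EN(001,061) DE(049)" and country code 49, the sketch
reports 'DE'; an unlisted code such as 33 falls back to 'EN', the
first language in the list.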
Since we cannot assume that all terminals have the capability to
enter all language-specific characters directly, there must be an
'escape mechanism' to write them as a sequence of standard ASCII
characters. The message file compiler uses a subset of the sequences
used in SGML and HTML:

    &auml; &euml; &iuml; &ouml; &uuml;
        --> lowercase umlauted characters
    &Auml; &Euml; &Iuml; &Ouml; &Uuml;
        --> uppercase umlauted characters
    &szlig;
        --> German sharp s
    &sup2;
        --> exponential 2
    &micro;
        --> micro character
    &agrave; &egrave; &igrave; &ograve; &ugrave;
        --> lowercase accent grave characters
    &Agrave; &Egrave; &Igrave; &Ograve; &Ugrave;
        --> uppercase accent grave characters
    &aacute; &eacute; &iacute; &oacute; &uacute;
        --> lowercase accent acute characters
    &Aacute; &Eacute; &Iacute; &Oacute; &Uacute;
        --> uppercase accent acute characters
    &acirc; &ecirc; &icirc; &ocirc; &ucirc;
        --> lowercase accent circumflex characters
    &Acirc; &Ecirc; &Icirc; &Ocirc; &Ucirc;
        --> uppercase accent circumflex characters
    &ccedil; &Ccedil;
        --> lowercase / uppercase cedilla
    &ntilde; &Ntilde;
        --> lowercase / uppercase tilded n
    &aring; &Aring;
        --> lowercase / uppercase ringed a
    &aelig; &Aelig;
        --> lowercase / uppercase ae diphthong
    &iquest; &iexcl;
        --> inverted question / exclamation mark
    \n
        --> newline character

Upon translation of a message file, the message file compiler will
replace these sequences with the correct character encodings for the
target platform. In the extreme case of a bare 7-bit-ASCII system,
this may imply the translation to a sequence of ASCII characters that
'emulate' the non-ASCII character. *NEVER* use the special characters
directly in the message source files, as this would destroy their
portability!!!

The set of supported language-specific characters used to be
strongly biased towards German. The reason for this is simple:
German is the only non-English language AS currently supports...
sorry, but English and German are the only languages I am
sufficiently fluent in to make a translation... help from others to
extend the range is most welcome, and this is the primary reason why
I explained the whole stuff ;-)

So, if you feel brave enough to add a language (don't forget that
there's also an almost-300-page user's manual that waits for
translation ;-), the following steps have to be taken:

1. Find out which non-ASCII characters you additionally need.
   I can then extend the message file compiler appropriately.
2. Add your language to the 'Langs' statement in 'header.res'.
   This file is included into all other message files, so you
   only have to do this once :-)
3. Go through all other '.res' files and add the line for your
   language to all messages........
4. Recompile AS.
5. You're done!

That's about everything to be said about the technical side.
Let's go to the political side. I'm prepared to get confronted
with two opinions after you read this:

   "Gee, that's far too much effort for such a tool. And anyway, who
    needs anything else than English on a Unix system? Unix is
    something that was born to be English, and you better accept that!"

   "Hey, why did you reinvent the wheel? There's catgets(), there's
    GNU-gettext, and..."

Well, I'll try to stay polite ;-)

First, the fact that Unix is so biased towards the English language is
in no way god-given, it's just the way it evolved. Unix was developed
in the USA, and the typical Unix users were up to now people who had
no problems with English - university students, developers etc. But
the times have changed: Linux and *BSD have made Unix cheap, and we are
facing more and more Unix users from other circles - people who
previously only knew MS-LOSS and MS-Windog, and who were told by their
nearest freak that Unix is a great thing. Such users typically will not
accept a system that only speaks English, given that every 500-Dollar
Windows PC speaks to them in their native language, so why not this
Unix system that claims to be sooo great ?!

Furthermore, do not forget that AS is not a Unix-only tool: It runs
on MS-DOS and OS/2 too, and some people are trying to make it go on
Macs (though this seems to be a much harder piece of work...). On
these systems, localization is the standard!

The portability to non-Unix platforms is the reason why I did not choose
an existing package to manage message catalogs.
catgets() seems to be Unix-specific (and it is not even available on
all Unix systems!), and about gettext... well, I just did not look
into it... it might have worked, but most of the GNU tools ported to
DOS I have seen so far needed 32-bit extenders, which I wanted to
avoid. So I quickly hacked up my own library, but I promise that I
will at least reuse it for my own projects!
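
One last technical footnote, going back to the escape sequences: the
replacement step for a bare 7-bit-ASCII target could look roughly like
the sketch below. It is only an illustration. The function
expand_entities(), the table layout and the chosen emulations ("ae"
for a-umlaut and so on) are assumptions made for this example and are
not taken from rescomp; only a handful of the sequences are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct entity {
    const char *name;   /* escape sequence as written in the .res file */
    const char *ascii;  /* assumed 7-bit emulation */
};

/* Hypothetical fallback table; the real tables may well differ. */
static const struct entity entities[] = {
    { "&auml;", "ae" }, { "&ouml;", "oe" }, { "&uuml;", "ue" },
    { "&Auml;", "Ae" }, { "&Ouml;", "Oe" }, { "&Uuml;", "Ue" },
    { "&szlig;", "ss" }, { "&sup2;", "^2" }, { "&micro;", "u" },
};

/* Copy `src` to `dst` (capacity `cap`, including the terminating NUL),
   replacing known escape sequences and the two-character sequence \n.
   Unknown text is copied unchanged.
   Returns 0 on success, -1 if `dst` is too small. */
int expand_entities(const char *src, char *dst, size_t cap)
{
    size_t used = 0;

    assert(src != NULL && dst != NULL);
    while (*src != '\0') {
        const char *repl = NULL;
        size_t skip = 1, len, i;
        char single[2];

        for (i = 0; i < sizeof entities / sizeof entities[0]; i++) {
            size_t n = strlen(entities[i].name);
            if (strncmp(src, entities[i].name, n) == 0) {
                repl = entities[i].ascii;
                skip = n;
                break;
            }
        }
        if (repl == NULL && src[0] == '\\' && src[1] == 'n') {
            repl = "\n";
            skip = 2;
        }
        if (repl == NULL) {
            /* Ordinary character: copy it as is. */
            single[0] = *src;
            single[1] = '\0';
            repl = single;
        }

        len = strlen(repl);
        if (used + len + 1 > cap)
            return -1;
        memcpy(dst + used, repl, len);
        used += len;
        src += skip;
    }
    if (cap == 0)
        return -1;
    dst[used] = '\0';
    return 0;
}
```

On a platform with an 8-bit character set, the same table would
simply hold the platform's character codes instead of the ASCII
emulations.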