Rev | Author | Line No. | Line |
---|---|---|---|
1186 | savelij | 1 | The current version of AS supports the concept of loadable language modules, |
2 | i.e. the language AS speaks to you is not set at compile time. Instead, |
||
3 | AS tries to detect the language environment at startup and then to load |
||
4 | the appropriate set of messages dynamically. The process of detection |
||
5 | differs depending on the platform: On MS-DOS and OS/2 systems, AS queries |
||
6 | the COUNTRY setting made in CONFIG.SYS. On Unix systems, AS looks for |
||
7 | the environment variables |
||
8 | |||
9 | LC_MESSAGES |
||
10 | LC_ALL |
||
11 | LANG |
||
12 | |||
13 | and takes the first two letters from the variable that is found first. |
||
14 | These two letters are interpreted as a code for the language you |
||
15 | speak. |
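The lookup described above can be sketched in a few lines. This is only an illustration, not code from AS: the function name `detect_language_code`, the upper-casing of the result, and the fallback value "EN" are assumptions made for the example.

```python
import os

# Lookup order as listed in the text above (an assumption of this sketch;
# it is not taken from the AS sources).
LOCALE_VARS = ("LC_MESSAGES", "LC_ALL", "LANG")


def detect_language_code(environ=None, default="EN"):
    """Return the first two letters of the first variable that is set."""
    env = os.environ if environ is None else environ
    for name in LOCALE_VARS:
        value = env.get(name, "")
        if len(value) >= 2:
            # Upper-cased so that it can be compared with codes like 'DE'.
            return value[:2].upper()
    return default  # hypothetical fallback when nothing usable is set
```

With LANG set to `de_DE.UTF-8` and the other two variables unset, the sketch yields `DE`. Note that POSIX itself gives LC_ALL precedence over LC_MESSAGES; the sketch simply follows the order listed above.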
||
16 | |||
17 | Currently, AS knows the languages German (code 049 or DE) and |
||
18 | English (code 001 or EN). Any other setting leads to the default, |
||
19 | English. Sorry, but I do not know any other languages well enough |
||
20 | to do other translations. You may now ask if you could add more |
||
21 | languages to AS, and this is just what I hoped for when I wrote these |
||
22 | lines ;-) |
||
23 | |||
24 | Messages are stored in text files with the extension '.res'. Since |
||
25 | parsing text files at every startup of the assembler would be quite |
||
26 | inefficient, the '.res' files are transformed into a binary, indexed |
||
27 | format that can be read with a few block read statements. The |
||
28 | translation is done during the build process with a special tool |
||
29 | called 'rescomp' (you might have seen the execution of rescomp while |
||
30 | you built the C version of AS). rescomp parses the input file(s), |
||
31 | assigns a number to each message, packs the messages to a single array |
||
32 | of chars with an index table, and creates an additional header file |
||
33 | that contains the numbers assigned to each message. A run-time |
||
34 | library then lets the program look up the messages by their numbers. |
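To make the idea of "a single array of chars with an index table" concrete, here is a minimal sketch. The layout used (a 32-bit message count, a table of 32-bit offsets, then NUL-terminated strings) is purely hypothetical and is not the actual format written by rescomp; the function names are likewise invented for the example.

```python
import struct


def pack_messages(messages):
    """Pack strings into one blob: count, offset table, then the text."""
    encoded = [m.encode("latin-1") + b"\0" for m in messages]
    offsets, pos = [], 0
    for chunk in encoded:
        offsets.append(pos)  # offset of each message within the text area
        pos += len(chunk)
    header = struct.pack("<I", len(encoded))
    index = struct.pack("<%dI" % len(offsets), *offsets)
    return header + index + b"".join(encoded)


def lookup(blob, number):
    """Fetch message `number` via the index table, without scanning the text."""
    (count,) = struct.unpack_from("<I", blob, 0)
    if not 0 <= number < count:
        raise IndexError(number)
    (offset,) = struct.unpack_from("<I", blob, 4 + 4 * number)
    start = 4 + 4 * count + offset
    end = blob.index(b"\0", start)
    return blob[start:end].decode("latin-1")
```

The point of such a layout is that the whole blob can be read in one go and a message is found by its number with a single table access, instead of parsing text at every start.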
||
35 | |||
36 | A message source file consists of a couple of control statements. |
||
37 | Empty lines are ignored; lines that start with a semicolon are |
||
38 | treated as comments (i.e. they are also ignored). The first |
||
39 | control statement a message file contains is the 'Langs' statement, |
||
40 | which indicates the languages the messages in this file will support. |
||
41 | This is a *GLOBAL* setting, i.e. you cannot omit languages for single |
||
42 | messages! The command has the following form: |
||
43 | |||
44 | Langs <Code>(<Country-Code(s),...>) .... |
||
45 | |||
46 | 'Code' is the two-letter abbreviation for a language, e.g. 'DE' for |
||
47 | German. Please use only UPPERcase! The code is followed by a |
||
48 | comma-separated list of DOS-style country codes for DOS and OS/2 |
||
49 | environments. As you see, several country codes may point to a |
||
50 | single language this way. For example, if you want to assign the |
||
51 | English language to both Americans and British people, write |
||
52 | |||
53 | Langs EN(001,044) <further languages> |
||
54 | |||
55 | In case AS finds a language environment that was not explicitly |
||
56 | handled in the message file, the first language given to the 'Langs' |
||
57 | command is used. You may override this via the 'Default' statement, |
||
58 | e.g. |
||
59 | |||
60 | Default DE |
||
61 | |||
62 | Once the language is specified, the 'Message' command is the |
||
63 | only one left to be explained. This command starts the definition of |
||
64 | a message. The message file compiler reads the next 'n' lines, with |
||
65 | 'n' being the number of languages defined by the 'Langs' command. A |
||
66 | sample message definition would look like |
||
67 | |||
68 | Message TestMessage |
||
69 | "Dies ist ein Test" |
||
70 | "This is a test" |
||
71 | |||
72 | given that you specified German and English with the 'Langs' |
||
73 | command. |
||
74 | |||
75 | In case the messages become longer than a single line (messages may |
||
76 | contain newline characters, more about this later), the use of a |
||
77 | backslash (\) as a line continuation character is allowed: |
||
78 | |||
79 | Message TestMessage2 |
||
80 | "Dies ist eine" \ |
||
81 | "zweizeilige Nachricht" |
||
82 | "This is a" \ |
||
83 | "two-line message" |
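The statements described so far ('Langs', 'Default', 'Message', comments and line continuation) can be read with a small parser. The following is a sketch under assumptions, not the logic of rescomp: the function name `parse_messages` is invented, and continued strings are simply concatenated without any separator.

```python
import re


def parse_messages(text):
    """Parse 'Langs', 'Default' and 'Message' statements."""
    lines = iter(text.splitlines())
    langs, default, messages = [], None, {}

    def read_string(first):
        # Join quoted pieces while a line ends in a continuation backslash.
        parts, line = [], first.strip()
        while True:
            more = line.endswith("\\")
            if more:
                line = line[:-1].rstrip()
            parts.append(line.strip('"'))
            if not more:
                return "".join(parts)  # assumption: plain concatenation
            line = next(lines).strip()

    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # empty lines and comments are ignored
        keyword, _, rest = line.partition(" ")
        if keyword == "Langs":
            found = re.findall(r"([A-Z]{2})\(([\d,]*)\)", rest)
            langs = [(code, codes.split(",")) for code, codes in found]
        elif keyword == "Default":
            default = rest.strip()
        elif keyword == "Message":
            # One (possibly continued) string per language, in 'Langs' order.
            texts = [read_string(next(lines)) for _ in langs]
            messages[rest.strip()] = dict(zip((c for c, _ in langs), texts))
    if default is None and langs:
        default = langs[0][0]  # first language is the default
    return langs, default, messages
```

The sketch returns the language list, the default language, and a dictionary that maps each message name to its texts per language code.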
||
84 | |||
85 | Since we deal with non-English languages, we also have to deal with |
||
86 | characters that are not part of the standard ASCII character set - a |
||
87 | point where UNIX systems are traditionally weak. Since we cannot |
||
88 | assume that all terminals have the capability to enter all |
||
89 | language-specific characters directly, there must be an 'escape |
||
90 | mechanism' to write them as a sequence of standard ASCII characters. |
||
91 | The message file compiler uses a subset of the sequences used in SGML |
||
92 | and HTML: |
||
93 | |||
94 | &auml; &euml; &iuml; &ouml; &uuml; |
||
95 | --> lowercase umlauted characters |
||
96 | &Auml; &Euml; &Iuml; &Ouml; &Uuml; |
||
97 | --> uppercase umlauted characters |
||
98 | &szlig; |
||
99 | --> german sharp s |
||
100 | &sup2; |
||
101 | --> exponential 2 |
||
102 | &micro; |
||
103 | --> micron character |
||
104 | &agrave; &egrave; &igrave; &ograve; &ugrave; |
||
105 | --> lowercase accent grave characters |
||
106 | &Agrave; &Egrave; &Igrave; &Ograve; &Ugrave; |
||
107 | --> uppercase accent grave characters |
||
108 | &aacute; &eacute; &iacute; &oacute; &uacute; |
||
109 | --> lowercase accent acute characters |
||
110 | &Aacute; &Eacute; &Iacute; &Oacute; &Uacute; |
||
111 | --> uppercase accent acute characters |
||
112 | &acirc; &ecirc; &icirc; &ocirc; &ucirc; |
||
113 | --> lowercase accent circumflex characters |
||
114 | &Acirc; &Ecirc; &Icirc; &Ocirc; &Ucirc; |
||
115 | --> uppercase accent circumflex characters |
||
116 | &ccedil; &Ccedil; |
||
117 | --> lowercase / uppercase cedilla |
||
118 | &ntilde; &Ntilde; |
||
119 | --> lowercase / uppercase tilded n |
||
120 | &aring; &Aring; |
||
121 | --> lowercase / uppercase ringed a |
||
122 | &aelig; &Aelig; |
||
123 | --> lowercase / uppercase ae diphthong |
||
124 | &iquest; &iexcl; |
||
125 | --> inverted question / exclamation mark |
||
126 | \n |
||
127 | --> newline character |
||
128 | |||
129 | Upon translation of a message file, the message file compiler will |
||
130 | replace these sequences with the correct character encodings for the |
||
131 | target platform. In the extreme case of a bare 7-bit-ASCII system, |
||
132 | this may imply the translation to a sequence of ASCII characters that |
||
133 | 'emulate' the non-ASCII character. *NEVER* use the special characters |
||
134 | directly in the message source files, as this would destroy their |
||
135 | portability!!! |
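The replacement step can be pictured as a table lookup per escape sequence. The sketch below is an illustration only: the table is a small hypothetical subset, the ASCII emulations ("ae", "ss", ...) are assumptions about what a 7-bit fallback might look like, and `expand` is not a function of the message file compiler.

```python
import re

# Hypothetical subset: name -> (native character, 7-bit ASCII emulation).
ENTITIES = {
    "auml": ("\u00e4", "ae"),
    "ouml": ("\u00f6", "oe"),
    "uuml": ("\u00fc", "ue"),
    "Auml": ("\u00c4", "Ae"),
    "Ouml": ("\u00d6", "Oe"),
    "Uuml": ("\u00dc", "Ue"),
    "szlig": ("\u00df", "ss"),
}


def expand(text, ascii_only=False):
    """Replace &name; sequences and \\n with target-platform characters."""
    def repl(match):
        entry = ENTITIES.get(match.group(1))
        if entry is None:
            return match.group(0)  # leave unknown sequences untouched
        native, fallback = entry
        return fallback if ascii_only else native

    text = re.sub(r"&(\w+);", repl, text)
    return text.replace("\\n", "\n")
```

For instance, `Gr&ouml;&szlig;e` would become the word with a real o-umlaut and sharp s on a platform that has them, and `Groesse` when only 7-bit ASCII is available.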
||
136 | |||
137 | The number of supported language-specific characters used to be |
||
138 | strongly biased towards the German language. The reason for this is |
||
139 | simple: German is the only non-English language AS currently |
||
140 | supports...sorry, but English and German are the only languages |
||
141 | I am sufficiently fluent in to make a translation...help from others to |
||
142 | extend the range is most welcome, and this is the primary reason |
||
143 | why I explained the whole stuff ;-) |
||
144 | |||
145 | So, if you feel brave enough to add a language (don't forget that |
||
146 | there's also an almost-300-page user's manual that waits for |
||
147 | translation ;-), the following steps have to be taken: |
||
148 | |||
149 | 1. Find out which non-ASCII characters you additionally need. |
||
150 | I can then extend the message file compiler appropriately. |
||
151 | 2. Add your language to the 'Langs' statement in 'header.res'. |
||
152 | This file is included into all other message files, so you |
||
153 | only have to do this once :-) |
||
154 | 3. Go through all other '.res' files and add a line in your language to all |
||
155 | messages........ |
||
156 | 4. Recompile AS. |
||
157 | 5. You're done! |
||
158 | |||
159 | That's about everything to be said about the technical side. |
||
160 | Let's go to the political side. I'm prepared to get confronted |
||
161 | with two opinions after you read this: |
||
162 | |||
163 | "Gee, that's far too much effort for such a tool. And anyway, who |
||
164 | needs anything other than English on a Unix system? Unix is |
||
165 | something that was born to be English, and you better accept that!" |
||
166 | |||
167 | "Hey, why did you reinvent the wheel? There's catgets(), there's |
||
168 | GNU-gettext, and..." |
||
169 | |||
170 | Well, I'll try to stay polite ;-) |
||
171 | |||
172 | First, the fact that Unix is so biased towards the English language is |
||
173 | in no way god-given, it's just the way it evolved. Unix was developed |
||
174 | in the USA, and the typical Unix users were up to now people who had |
||
175 | no problems with English - university students, developers etc. But |
||
176 | the times have changed: Linux and *BSD have made Unix cheap, and we are |
||
177 | facing more and more Unix users from other circles - people who |
||
178 | previously only knew MS-LOSS and MS-Windog, and who were told by their |
||
179 | nearest freak that Unix is a great thing. Such users typically will not |
||
180 | accept a system that only speaks English, given that every |
||
181 | 500-dollar Windows PC speaks to them in their native language, so why not this |
||
182 | Unix system that claims to be sooo great ?! |
||
183 | |||
184 | Furthermore, do not forget that AS is not a Unix-only tool: It runs |
||
185 | on MS-DOS and OS/2 too, and some people are trying to make it run on Macs |
||
186 | (though this seems to be a much harder piece of work...). On these |
||
187 | systems, localization is the standard! |
||
188 | |||
189 | The portability to non-Unix platforms is the reason why I did not choose |
||
190 | an existing package to manage message catalogs. catgets() seems to be |
||
191 | Unix-specific (and it is not even available on all Unix systems!), and |
||
192 | about gettext...well, I just did not look into it...it might have worked, |
||
193 | but most of the GNU tools ported to DOS I have seen so far needed 32-bit- |
||
194 | extenders, which I wanted to avoid. So I quickly hacked up my own |
||
195 | library, but I promise that I will at least reuse it for my own projects! |
||
196 | |||
197 | chardefs.h |