The current version of AS supports the concept of loadable language modules,
i.e. the language AS speaks to you is not set at compile time.  Instead,
AS tries to detect the language environment at startup and then to load
the appropriate set of messages dynamically.  The process of detection
differs depending on the platform: On MS-DOS and OS/2 systems, AS queries
the COUNTRY setting made in CONFIG.SYS.  On Unix systems, AS looks for
the environment variables

LC_MESSAGES
LC_ALL
LANG

and takes the first two letters from the variable that is found first.
These two letters are interpreted as a code for the country you live
in.
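
The Unix lookup described above can be sketched as follows.  This is
only an illustration in Python, not the code AS actually uses: the
function name detect_language and its parameters are made up, and it
assumes that an empty variable counts as "not found" and that the two
letters are compared in uppercase.  Unknown codes fall back to English,
which is what AS does for unsupported settings.

```python
import os


def detect_language(env, known=("DE", "EN"), default="EN"):
    """Return a two-letter language code derived from the environment.

    `env` is any mapping of variable names to values (hypothetical
    helper; the variables are checked in the order given in the text).
    """
    for name in ("LC_MESSAGES", "LC_ALL", "LANG"):
        value = env.get(name)
        if value:  # assumption: an empty variable counts as not set
            code = value[:2].upper()  # first two letters, e.g. "de" -> "DE"
            return code if code in known else default
    return default


if __name__ == "__main__":
    print(detect_language(os.environ))
```

With LANG=de_DE this yields DE; with no variable set at all, or with
a language AS does not know, it yields EN.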

Currently, AS knows the languages German (country code 049, language
code DE) and English (country code 001, language code EN).  Any other
setting leads to the default English language.  Sorry, but I do not
know more languages well enough to do other translations.  You may now
ask if you could add more languages to AS, and this is just what I
hoped for when I wrote these lines ;-)

Messages are stored in text files with the extension '.res'.  Since
parsing text files at every startup of the assembler would be quite
inefficient, the '.res' files are transformed into a binary, indexed
format that can be read with a few block read statements.  The
translation is done during the build process with a special tool
called 'rescomp' (you might have seen the execution of rescomp while
you built the C version of AS).  rescomp parses the input file(s),
assigns a number to each message, packs the messages into a single array
of chars with an index table, and creates an additional header file
that contains the numbers assigned to each message.  A run-time
library then allows looking up the messages via their numbers.
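
To make the idea of "one array of chars plus an index table" more
concrete, here is a small Python sketch.  It is NOT the file format
rescomp really writes: the layout (a 32-bit message count, 32-bit
offsets, NUL-terminated Latin-1 strings), the single language, and the
names pack_messages and lookup are all assumptions made for this
example.

```python
import struct


def pack_messages(messages):
    """Pack (name, text) pairs into one byte string.

    Assumed layout: message count, one offset per message, then all
    texts back to back, each terminated by a NUL byte.
    Returns the packed bytes and the name -> number table that would
    go into the generated header file.
    """
    blob = bytearray()
    offsets = []
    numbers = {}
    for number, (name, text) in enumerate(messages):
        numbers[name] = number
        offsets.append(len(blob))  # where this message starts in the array
        blob += text.encode("latin-1") + b"\0"
    header = struct.pack("<I", len(offsets))
    index = struct.pack("<%dI" % len(offsets), *offsets)
    return header + index + bytes(blob), numbers


def lookup(packed, number):
    """Fetch a message by its number using only the index table."""
    (count,) = struct.unpack_from("<I", packed, 0)
    if not 0 <= number < count:
        raise IndexError("no such message number")
    (offset,) = struct.unpack_from("<I", packed, 4 + 4 * number)
    start = 4 + 4 * count + offset  # skip the count and the index table
    end = packed.index(b"\0", start)
    return packed[start:end].decode("latin-1")


if __name__ == "__main__":
    packed, numbers = pack_messages([("TestMessage", "This is a test")])
    print(lookup(packed, numbers["TestMessage"]))
```

The point of such a layout is that the lookup never has to parse text:
it reads the index entry and jumps straight to the message.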

A message source file consists of a sequence of control statements.
Empty lines are ignored; lines that start with a semicolon are
treated as comments (i.e. they are also ignored).  The first
control statement a message file contains is the 'Langs' statement,
which indicates the languages the messages in this file will support.
This is a *GLOBAL* setting, i.e. you cannot omit languages for single
messages!  The command has the following form:

Langs <Code>(<Country-Code(s),...>) ....

'Code' is the two-letter abbreviation for a language, e.g. 'DE' for
German.  Please use only UPPERcase!  The code is followed by a
comma-separated list of DOS-style country codes for DOS and OS/2
environments.  As you see, several country codes may point to a
single language this way.  For example, if you want to assign the
English language to both American and British people, write

Langs EN(001,044) <further languages>
54
 
55
In case AS finds a language environment that was not explicitly
56
handled in the message file, the first language given to the 'Langs'
57
command is used.  You may override this via the 'Default' statement.
58
e.g.
59
 
60
Default DE
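
How a 'Langs' line maps country codes to languages, and how the
fallback works, can be sketched in Python as below.  This is only an
illustration under the rules stated above; parse_langs and
select_language are invented names, not functions of rescomp.

```python
import re


def parse_langs(line):
    """Parse a line like 'Langs DE(049) EN(001)'.

    Returns the language codes in file order and a dictionary that
    maps each DOS-style country code to its language code.
    """
    parts = line.split(None, 1)
    if not parts or parts[0] != "Langs":
        raise ValueError("not a Langs statement")
    rest = parts[1] if len(parts) > 1 else ""
    languages = []
    by_country = {}
    for code, countries in re.findall(r"([A-Z]{2})\(([^)]*)\)", rest):
        languages.append(code)
        for country in countries.split(","):
            country = country.strip()
            if country:
                by_country[country] = code
    return languages, by_country


def select_language(languages, by_country, country, default=None):
    """Pick the language for a country code.

    Unknown codes get the 'Default' language if one was given,
    otherwise the first language of the 'Langs' statement.
    """
    fallback = default if default is not None else languages[0]
    return by_country.get(country, fallback)


if __name__ == "__main__":
    langs, table = parse_langs("Langs DE(049) EN(001)")
    print(select_language(langs, table, "001"))
```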

Once the languages are specified, the 'Message' command is the
only one left to be explained.  This command starts the definition of
a message.  The message file compiler reads the next 'n' lines, with
'n' being the number of languages defined by the 'Langs' command, one
line per language in the order given there.  A sample message
definition would look like

Message TestMessage
 "Dies ist ein Test"
 "This is a test"

given that you specified the German and English languages, in this
order, with the 'Langs' command.

In case the messages become longer than a single line (messages may
contain newline characters, more about this later), the use of a
backslash (\) as a line continuation character is allowed:

Message TestMessage2
 "Dies ist eine" \
 "zweizeilige Nachricht"
 "This is a" \
 "two-line message"
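
The reading of 'Message' blocks, including continuation lines, might
look like the following Python sketch.  It is an illustration, not the
rescomp source.  Two things are assumptions: the quoted pieces of a
continued line are simply concatenated without any separator, and
comments or empty lines do not occur inside a message definition.  The
name parse_messages is made up as well.

```python
def parse_messages(lines, language_count):
    """Return {message name: [one text per language]}.

    `language_count` is the number of languages from 'Langs'.
    Statements other than 'Message' are skipped by this sketch.
    """
    messages = {}
    it = iter(lines)
    for raw in it:
        fields = raw.split()
        if not fields or fields[0].startswith(";"):
            continue  # empty line or comment
        if fields[0] != "Message" or len(fields) < 2:
            continue  # e.g. Langs or Default, handled elsewhere
        texts = []
        while len(texts) < language_count:
            pieces = []
            while True:
                part = next(it, None)
                if part is None:
                    raise ValueError("unexpected end of message file")
                part = part.strip()
                continued = part.endswith("\\")
                if continued:
                    part = part[:-1].rstrip()  # drop the backslash
                if len(part) >= 2 and part[0] == '"' and part[-1] == '"':
                    part = part[1:-1]  # drop the surrounding quotes
                pieces.append(part)
                if not continued:
                    break
            texts.append("".join(pieces))  # assumption: plain concatenation
        messages[fields[1]] = texts
    return messages


if __name__ == "__main__":
    source = [
        "Message TestMessage",
        ' "Dies ist ein Test"',
        ' "This is a test"',
    ]
    print(parse_messages(source, 2))
```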

Since we deal with non-English languages, we also have to deal with
characters that are not part of the standard ASCII character set - a
point where UNIX systems are traditionally weak.  Since we cannot
assume that all terminals have the capability to enter all
language-specific characters directly, there must be an 'escape
mechanism' to write them as a sequence of standard ASCII characters.
The message file compiler uses a subset of the sequences used in SGML
and HTML:

 &auml; &euml; &iuml; &ouml; &uuml;
   --> lowercase umlauted characters
 &Auml; &Euml; &Iuml; &Ouml; &Uuml;
   --> uppercase umlauted characters
 &szlig;
   --> German sharp s
 &sup2;
   --> superscript 2
 &micro;
   --> micro sign
 &agrave; &egrave; &igrave; &ograve; &ugrave;
   --> lowercase accent grave characters
 &Agrave; &Egrave; &Igrave; &Ograve; &Ugrave;
   --> uppercase accent grave characters
 &aacute; &eacute; &iacute; &oacute; &uacute;
   --> lowercase accent acute characters
 &Aacute; &Eacute; &Iacute; &Oacute; &Uacute;
   --> uppercase accent acute characters
 &acirc; &ecirc; &icirc; &ocirc; &ucirc;
   --> lowercase accent circumflex characters
 &Acirc; &Ecirc; &Icirc; &Ocirc; &Ucirc;
   --> uppercase accent circumflex characters
 &ccedil; &Ccedil;
   --> lowercase / uppercase c with cedilla
 &ntilde; &Ntilde;
   --> lowercase / uppercase tilded n
 &aring; &Aring;
   --> lowercase / uppercase ringed a
 &aelig; &Aelig;
   --> lowercase / uppercase ae ligature
 &iquest; &iexcl;
   --> inverted question / exclamation mark
 \n
   --> newline character

Upon translation of a message file, the message file compiler will
replace these sequences with the correct character encodings for the
target platform.  In the extreme case of a bare 7-bit-ASCII system,
this may imply the translation to a sequence of ASCII characters that
'emulate' the non-ASCII character.  *NEVER* use the special characters
directly in the message source files, as this would destroy their
portability!!!
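
As an illustration of this replacement step, here is a Python sketch
that expands a few of the sequences listed above, either to the real
character or to a 7-bit emulation.  The table is deliberately
incomplete, and the ASCII emulations ('ae' for a-umlaut, 'ss' for sharp
s and so on) are my guesses for this example, not necessarily what
rescomp emits.  expand_escapes is an invented name.

```python
import re

# name -> (real character, assumed 7-bit ASCII emulation)
ENTITIES = {
    "auml": ("\u00e4", "ae"),
    "ouml": ("\u00f6", "oe"),
    "uuml": ("\u00fc", "ue"),
    "Auml": ("\u00c4", "Ae"),
    "Ouml": ("\u00d6", "Oe"),
    "Uuml": ("\u00dc", "Ue"),
    "szlig": ("\u00df", "ss"),
    "eacute": ("\u00e9", "e"),
    "ccedil": ("\u00e7", "c"),
    "ntilde": ("\u00f1", "n"),
}


def expand_escapes(text, ascii_only=False):
    """Replace &name; sequences and the two-character sequence \\n."""

    def replace(match):
        name = match.group(1)
        if name not in ENTITIES:
            return match.group(0)  # leave unknown sequences untouched
        return ENTITIES[name][1 if ascii_only else 0]

    text = re.sub(r"&([A-Za-z0-9]+);", replace, text)
    return text.replace("\\n", "\n")


if __name__ == "__main__":
    print(expand_escapes("Gr&ouml;&szlig;e", ascii_only=True))
```

Note that the sketch itself sticks to the rule above: the special
characters appear only as \u escapes, never directly in the source.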

The set of supported language-specific characters used to be
strongly biased towards the German language.  The reason for this is
simple: German is the only non-English language AS currently
supports...sorry, but English and German are the only languages
I am sufficiently fluent in to make a translation...help from others to
extend the range is most welcome, and this is the primary reason
why I explained the whole stuff ;-)

So, if you feel brave enough to add a language (don't forget that
there's also an almost-300-page user's manual that waits for
translation ;-), the following steps have to be taken:

  1. Find out which non-ASCII characters you additionally need.
     I can then extend the message file compiler appropriately.
  2. Add your language to the 'Langs' statement in 'header.res'.
     This file is included into all other message files, so you
     only have to do this once :-)
  3. Go through all other '.res' files and add the line for your
     language to all messages........
  4. Recompile AS.
  5. You're done!

That's about everything to be said about the technical side.
Let's go to the political side.  I'm prepared to be confronted
with two opinions after you have read this:

  "Gee, that's far too much effort for such a tool.  And anyway, who
   needs anything else than English on a Unix system?  Unix is
   something that was born to be English, and you better accept that!"

  "Hey, why did you reinvent the wheel?  There's catgets(), there's
   GNU-gettext, and..."

Well, I'll try to stay polite ;-)

First, the fact that Unix is so biased towards the English language is
in no way god-given, it's just the way it evolved.  Unix was developed
in the USA, and the typical Unix users were up to now people who had
no problems with English - university students, developers etc.  But
the times have changed: Linux and *BSD have made Unix cheap, and we are
facing more and more Unix users from other circles - people who
previously only knew MS-LOSS and MS-Windog, and who were told by their
nearest freak that Unix is a great thing.  Such users typically will not
accept a system that only speaks English, given that every
500-Dollar-Windows PC speaks to them in their native language, so why
not this Unix system that claims to be sooo great?!

Furthermore, do not forget that AS is not a Unix-only tool: It runs
on MS-DOS and OS/2 too, and some people are trying to make it run on Macs
(though this seems to be a much harder piece of work...).  On these
systems, localization is the standard!

The portability to non-Unix platforms is the reason why I did not choose
an existing package to manage message catalogs.  catgets() seems to be
Unix-specific (and it is not even available on all Unix systems!), and
about gettext...well, I just did not look into it...it might have worked,
but most of the GNU tools ported to DOS I have seen so far needed
32-bit-extenders, which I wanted to avoid.  So I quickly hacked up my own
library, but I promise that I will at least reuse it for my own projects!