[erlang-questions] Read out the operating system language

Richard A. O'Keefe ok@REDACTED
Tue Mar 22 00:37:49 CET 2016


On 22/03/16 2:23 am, Michael S wrote:

> Not the type ;)
> I mean the language english, german, spanish, french etc.

The operating system doesn't *have* a language.
In the C / UNIX world, "language" is broken up into facets:
     $LANG             - ultimate default
       $LC_ALL         - override for every facet below, when set
         $LC_COLLATE   - how to compare strings
         $LC_CTYPE     - how to classify characters
         $LC_MESSAGES  - how to display error/warning/info messages
         $LC_MONETARY  - how to display amounts of money
         $LC_NUMERIC   - how to display numbers
         $LC_TIME      - how to display dates and times

When you ask about a particular facet, LC_ALL is tried first; if that
is not set (or is empty), the facet's own variable is tried, and if
that is not set either, LANG is tried.
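
A minimal sketch of that lookup for one facet, assuming the POSIX
precedence (LC_ALL first, then the facet's own variable, then LANG).
The module name locale_env, the function names, and the final
fallback to "C" are assumptions made for illustration; POSIX leaves
the last-resort default to the implementation.

```erlang
-module(locale_env).
-export([category_locale/1, pick/3]).

%% Effective locale name for one category such as "LC_MESSAGES".
%% os:getenv/1 returns a string, or false when the variable is unset.
category_locale(Category) ->
    pick(os:getenv("LC_ALL"), os:getenv(Category), os:getenv("LANG")).

%% Pure helper, so the rule can be tried without touching the
%% environment.  Unset (false) and empty ("") values are both skipped.
pick(All, _Cat, _Lang) when is_list(All), All =/= [] -> All;
pick(_All, Cat, _Lang) when is_list(Cat), Cat =/= [] -> Cat;
pick(_All, _Cat, Lang) when is_list(Lang), Lang =/= [] -> Lang;
pick(_All, _Cat, _Lang) -> "C".   % assumed last-resort default
```

For example, locale_env:category_locale("LC_MESSAGES") gives the
name that message output would be expected to follow.
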
Locales include
     C                 - the ASCII-ish default from the C standard
     POSIX             - the ASCII-ish default from the POSIX standard
     <language>[_<territory>][.<encoding>][@<variant>]
The locale names are less standardised than one would like.
The format is *typically* as shown, where the <language> is a 2-letter
ISO language code if the language has one, a 3-letter ISO code if it
doesn't, and dear knows what if it hasn't even one of those.
That's usually in lower case.
The <territory> is a 2-letter ISO country code if it has one, a
3-letter ISO code if it doesn't, and whatever otherwise.
That's usually in upper case.
The <encoding> could be UTF-8, GBK, KOI8-U, ISO8859-n, &c.
You might think it's upper case, but you might get BIG5 or Big5,
EUC-JP or eucJP, and so on.
The <variant> is used for things like @euro, @bokmal, @nynorsk.
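
As a rough illustration of that format, here is a sketch that splits
a locale name into its four parts.  The module name locale_name and
the function names are made up for this example, and it assumes the
*typical* layout shown above; names that do not follow it will not
split sensibly.  Case is left exactly as found.

```erlang
-module(locale_name).
-export([parse/1]).

%% Split "<language>[_<territory>][.<encoding>][@<variant>]" into
%% {Language, Territory, Encoding, Variant}.  Absent parts are "".
parse(Name) ->
    {Rest1, Variant}      = split_at($@, Name),
    {Rest2, Encoding}     = split_at($., Rest1),
    {Language, Territory} = split_at($_, Rest2),
    {Language, Territory, Encoding, Variant}.

%% Split at the first occurrence of Sep, dropping Sep itself.
split_at(Sep, Str) ->
    case lists:splitwith(fun(C) -> C =/= Sep end, Str) of
        {Before, [Sep | After]} -> {Before, After};
        {Before, []}            -> {Before, ""}
    end.
```

The variant is peeled off first and the encoding second, so that a
"." or "_" inside a later part cannot be mistaken for an earlier
separator.
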

As a rough suggestion, try
    $LC_ALL,
    $LC_MESSAGES,
    $LANG
in that order (LC_ALL overrides the others when it is set) until one
of them starts with a bunch of letters and is not C or POSIX.

language() ->
     case check_language("LC_ALL")
       of {yes,Lang} -> Lang
        ; no ->
          case check_language("LC_MESSAGES")
            of {yes,Lang} -> Lang
             ; no ->
               case check_language("LANG")
                 of {yes,Lang} -> Lang
                  ; no         -> undefined  % nothing usable is set
               end
          end
     end.

check_language(Source) ->
     case os:getenv(Source)
       of false                     -> no
        ; Chars when is_list(Chars) -> check_language(Chars, [])
     end.

% Collect the leading letters, lower-cased and reversed, in Acc.
check_language([Char|Chars], Acc)
   when Char >= $a, Char =< $z ->
     check_language(Chars, [Char|Acc]);
check_language([Char|Chars], Acc)
   when Char >= $A, Char =< $Z ->
     check_language(Chars, [Char+32|Acc]);
check_language(_, []) ->        % empty, or no leading letters
     no;
check_language(_, "c") ->       % C, and also C.UTF-8 and the like
     no;
check_language(_, "xisop") ->   % POSIX (Acc is reversed)
     no;
check_language(_, Acc) ->
     {yes, lists:reverse(Acc)}.

Much depends on what you want this for.  If you are going to
use it for reporting messages, you really want to pay attention
to the *whole* of $LC_MESSAGES, lest you display semi-
comprehensible text with risible spelling and grammar errors,
as will happen if you generate en_US text for an en_NZ user.
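
One way to act on that, sketched under assumptions: build a list of
names to look for, most specific first, and only fall back to the
bare language when nothing closer exists.  The module name
locale_fallback, the function candidates/1, and the idea that your
message texts are keyed by such names are all hypothetical; the
encoding is dropped because it says how to encode the text, not
which text to pick.

```erlang
-module(locale_fallback).
-export([candidates/1]).

%% Names to try for a locale name, most specific first.
candidates(Name) ->
    {Rest, Variant}        = split_at($@, Name),
    {Base, _Encoding}      = split_at($., Rest),
    {Language, _Territory} = split_at($_, Base),
    WithVariant = case Variant of
                      "" -> [];
                      _  -> [Base ++ "@" ++ Variant]
                  end,
    dedup(WithVariant ++ [Base, Language]).

%% Split at the first occurrence of Sep, dropping Sep itself.
split_at(Sep, Str) ->
    case lists:splitwith(fun(C) -> C =/= Sep end, Str) of
        {Before, [Sep | After]} -> {Before, After};
        {Before, []}            -> {Before, ""}
    end.

%% Remove repeats while keeping the first occurrence in place.
dedup([]) -> [];
dedup([H | T]) -> [H | dedup([X || X <- T, X =/= H])].
```

With this, an en_NZ user is offered en_NZ text first and generic en
text only as a last resort, never en_US.
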

One reason for @ variants is that countries sometimes
go through spelling reforms, although from what a German
student a couple of years ago told me, not always successfully.
Point is, you might need a date or revision name as well as
language and territory.



