[erlang-questions] filelib:fold_files (file:list_dir) and unicode filesystems

Dmitri Girenko Dmitri.Girenko@REDACTED
Tue Feb 26 12:03:12 CET 2008


Hello,

> Greetings,
> 
> How do you mean "invisible to erlang"?
> 
> I have a file called åäö. It was created in a plan9 tool
> (http://plan9.bell-labs.com/sys/doc/acme.html) and plan9 uses unicode
> throughout (http://plan9.bell-labs.com/sys/doc/utf.html). Using default
> settings on unix the file is displayed as åäö. Using erlang the file
> is also åäö .

I'm sorry, I should have been more specific.

You are right, this is a UTF-8 encoded file name, and that is a possible workaround for *nix systems. (The name then has to be kept in UTF-8, or translated to UTF-16 if needed.) But on Windows I get the following:
________________
Erlang (BEAM) emulator version 5.6 [smp:2] [async-threads:0]

Eshell V5.6  (abort with ^G)
1> ls().
1.txt                  128.css                ????.txt
2> filelib:fold_files(".",".*txt",false,fun(F,_)->io:format("~s~n",[F]) end, []).
./1.txt
ok
4> file:list_dir(".").
{ok,["????.txt","1.txt"]}

So list_dir sees the file as "????.txt" and fold_files does not show it at all. Is that a regexp matching problem or a file listing problem?

Also, it is impossible to open that file for IO.
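
One way to tell the two cases apart is to check each name returned by file:list_dir/1 separately. The sketch below is only an illustration: the module name diag_listing and the function check/2 are made up for this example, and it assumes a later OTP release that ships the re module (the 5.6 emulator shown above used the older regexp module). For every entry it reports whether the name matches the pattern and whether the entry can be accessed as a regular file.

```erlang
-module(diag_listing).
-export([check/2]).

%% Hypothetical helper, not part of OTP.
%% Returns [{Name, MatchesRegExp, IsRegularFile}] for every entry in Dir.
check(Dir, RegExp) ->
    {ok, Names} = file:list_dir(Dir),
    [{Name,
      matches(Name, RegExp),
      filelib:is_regular(filename:join(Dir, Name))}
     || Name <- Names].

%% true if the regular expression matches anywhere in Name.
%% The unicode option lets names with code points above 255 through.
matches(Name, RegExp) ->
    re:run(Name, RegExp, [unicode, {capture, none}]) =:= match.
```

Calling diag_listing:check(".", ".*txt") in the directory above would show, per entry, which of the two checks fails. If "????.txt" matches the pattern but is not reported as a regular file, the name handed back by the listing is the likely culprit rather than the regexp.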

Using UTF-8 on *nix is probably a sufficient workaround, and win32 compatibility... has to be ... well, you know (:
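
For reference, the UTF-8 translation on *nix can be sketched as below. This is a minimal example under assumptions: the module name name_conv and both function names are invented here, and it relies on the unicode module, which only exists in OTP releases later than the 5.6 emulator used above. It decodes a file name delivered as a list of raw UTF-8 bytes into code points, and encodes code points as little-endian UTF-16.

```erlang
-module(name_conv).
-export([utf8_bytes_to_chars/1, chars_to_utf16/1]).

%% Hypothetical helpers, not part of OTP.

%% Decode a file name given as a list of raw UTF-8 bytes (0..255)
%% into a list of Unicode code points.
utf8_bytes_to_chars(Bytes) when is_list(Bytes) ->
    case unicode:characters_to_list(list_to_binary(Bytes), utf8) of
        Chars when is_list(Chars) -> {ok, Chars};
        {error, _, _}             -> {error, invalid_utf8};
        {incomplete, _, _}        -> {error, invalid_utf8}
    end.

%% Encode a list of code points as a little-endian UTF-16 binary.
chars_to_utf16(Chars) ->
    unicode:characters_to_binary(Chars, unicode, {utf16, little}).
```

For the name åäö, the UTF-8 byte list [195,165,195,164,195,182] decodes to the code points [229,228,246].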

BR
Dmitri

> ; erl
> Eshell V5.5.5.5  (abort with ^G)
> 2> ls().
> åäö
> ok
> 
> 
> bengt
> 
>  On Tue, 2008-02-26 at 09:51 +0200, Dmitri Girenko wrote:
> > Hi all,
> >
> > I have a process that monitors specific folders for changed files. The
> > problem is that when a filename contains Unicode characters which are
> > outside of the system default codepage (Latin-1), those files are just
> > invisible to erlang.
> >
> > There's a workaround for linux - using UTF-8 translation seems to work,
> > but apparently it doesn't work on windows.
> >
> > I think that once the "string as a list" problem is settled down, then
> > the efile driver should be updated to support Unicode filenames, maybe
> > using UTF-16 string encoding, rather than UTF-8.
> >
> > Are there any plans to do this?
> >
> > Dmitri Girenko
> >
> > Porkkalankatu 13C 00180
> > Helsinki, Finland
> > Tel:    +358-201-500-574
> > Mobile: +358-50-40-333-21
> > Fax:    +358-201-500-501
> >
> > _______________________________________________
> > erlang-questions mailing list
> > erlang-questions@REDACTED
> > http://www.erlang.org/mailman/listinfo/erlang-questions