<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
</head>
<body>
<div class="moz-cite-prefix">On 10/24/21 1:55 AM, Dan Gudmundsson
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CANX4uuOb7kZGZeK0zXUTi8SZbqt-yvVYY-w6QWzDcxjuF_ygrg@mail.gmail.com">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<div>In my opinion, this should not be done. Strings, and in
particular Unicode strings, seem</div>
<div>to be very confusing as it is, with two representations in
OTP APIs.<br>
</div>
<div><br>
</div>
<div>UTF-8 (and friends) is an encoding of Unicode codepoints;
you should never</div>
<div>operate on the encoding <br>
</div>
</div>
</blockquote>
The way I was thinking about io:format("~t8s~n",[[16#C2,16#A2]])
was that adding the "8" acts as an assertion: the list data must
contain UTF-8 encoded bytes. That helps to avoid ambiguity and catch
any problems, while also allowing a list of bytes to be used in the
same way as a binary (by adding a single character to the io format
string). So I wasn't thinking of it as operating on the encoding,
but rather as being more specific about the string translation
(t == translation, with 8|16|32 options to ensure that the specific
translation occurs, or that an error exception shows what the
translation problem is).<br>
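<br>
As a rough sketch of the assertion I have in mind, something similar
can be approximated today with unicode:characters_to_list/2. The
module and function names below are hypothetical and "~t8s" itself
does not exist in OTP, so this is only an illustration of the
intended behavior, not a proposed implementation:<br>
<pre>
-module(t8s_sketch).
-export([utf8_bytes_to_chars/1]).

%% Hypothetical helper, not part of OTP: assert that a list of bytes
%% is well-formed UTF-8 and return the decoded Unicode codepoints.
%% list_to_binary/1 raises badarg if an element is not a byte.
utf8_bytes_to_chars(Bytes) ->
    case unicode:characters_to_list(list_to_binary(Bytes), utf8) of
        Chars when is_list(Chars) ->
            Chars;
        {error, _Decoded, Rest} ->
            erlang:error({invalid_utf8, Rest});
        {incomplete, _Decoded, Rest} ->
            erlang:error({incomplete_utf8, Rest})
    end.
</pre>
With that,
io:format("~ts~n",[t8s_sketch:utf8_bytes_to_chars([16#C2,16#A2])])
should print the same ¢ as the binary case, while a truncated list
such as [16#C2] raises an error instead of being printed as
something unexpected.<br>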
<br>
Best Regards,<br>
Michael<br>
<br>
<br>
<blockquote type="cite"
cite="mid:CANX4uuOb7kZGZeK0zXUTi8SZbqt-yvVYY-w6QWzDcxjuF_ygrg@mail.gmail.com">
<div dir="ltr">
<div dir="ltr"><br>
</div>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Sun, Oct 24, 2021 at
10:35 AM Michael Truog &lt;<a
href="mailto:mjtruog@gmail.com" moz-do-not-send="true">mjtruog@gmail.com</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">I was wondering if there
was interest in modifying the io interpretation <br>
of "~ts" to allow an integer between the t and s for forcing
a <br>
particular unicode interpretation. That would allow a list
of bytes to <br>
be interpreted as UTF8, to provide the same output as a
binary:<br>
1> io:format("~ts~n",[&lt;&lt;16#C2,16#A2&gt;&gt;]).<br>
¢<br>
ok<br>
2> io:format("~t8s~n",[[16#C2,16#A2]]).<br>
¢<br>
ok<br>
<br>
I was also wondering if bytestring types would be added to
Erlang/OTP, like:<br>
-type nonempty_bytestring() :: nonempty_list(byte()).<br>
-type bytestring() :: list(byte()).<br>
<br>
They are useful in iolists to ensure only bytes (not other
integers) are <br>
in nested lists.<br>
<br>
Best Regards,<br>
Michael<br>
</blockquote>
</div>
</div>
</blockquote>
<br>
</body>
</html>