[erlang-questions] Binary string literal syntax

Thu Jun 7 00:56:29 CEST 2018

> On 7 Jun 2018, at 00:21, zxq9@REDACTED wrote:
> 
> On Wednesday, 6 June 2018, 11:41:01 JST, Sean Hinde wrote:
> 
>> As a protocol-wrangling language I would argue Erlang has no peers, but many more protocols are string-based now than when the bit syntax was invented.
> 
> By count this is patently false. Most protocols are binary-based, as the ad hoc binary protocols created for IoT vastly outnumber the handful of prolific string-based ones. Can you think of a better language for IoT protocol wrangling than Erlang?

No arguments from me on the suitability of Erlang for protocol wrangling. And these string-based ones are definitely prolific. I spent today dealing with JSON in Erlang for some banking protocol.

> 
> Sure, most people have no clue how to program sockets these days so they use HTTP for everything -- but that isn't *most* protocols, that's a relatively small set of overwhelmingly *prolific* protocols. My prediction is that binary protocols will become more prolific as the extremely limited shared resource of wireless bandwidth becomes more and more saturated (and I don't think compression is a fix-all here, though it certainly helps).

I don’t think it really matters how we count. Text-based protocols are here, and Erlang ought to provide a great programming environment for them too.

> Better handling of UTF-8 (or Unicode more generally; remember that Windows is natively UTF-16...) would be nice as a single case to latch on to and really focus on supporting from every angle -- but it is VERY FAR from being The One Grand Unifying Case.
> 
> I've commented on quite a few threads about encodings, strings, why graphemes and lexemes matter, and the myopia that comes with dealing mostly with languages of European origin. I live in Japan and deal with Shift-JIS, JIS, JIS7, ISO-2022, EUC, etc. variants all the time. Is that a common case? Not in the West, but it is totally normal here -- especially dealing with web data.
> 
> * Binary protocols are alive and well.
> * The old encodings are far from dead.
> * You have a good point about improvements being possible and desirable.
> * The best way to proceed is not clear.
> * The unicode-correctish improvements to the string and unicode modules are very encouraging.

Nice summary. You have obviously thought about this a lot. Any thoughts on a better solution? What would you do?

Maybe a hypothetical new string literal type, treated as Unicode internally but with transparent conversion to UTF-8 by default when sent to I/O (with the option to override)? I get the Japan case, but UTF-8 is a sane default.

Or maybe some new slick syntax to create a string literal in any encoding.

The bit syntax was designed for picking apart bit-twiddling telecom protocols. It was clearly not designed with the primary goal of representing alternative forms of string literals. It’s just not what you would choose for that application.
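
To make the contrast concrete, here is a minimal sketch of both uses side by side. The module name, the function names, and the header layout (4-bit version, 4-bit flags, 16-bit length) are made up for illustration and do not come from any real protocol; the only library call is unicode:characters_to_binary/3 from OTP.

```erlang
-module(literal_sketch).
-export([parse_header/1, greeting/0, to_utf8/1]).

%% Bit syntax doing what it was designed for: splitting a fixed-layout
%% binary header. The layout here is hypothetical.
parse_header(<<Version:4, Flags:4, Len:16/big,
               Payload:Len/binary, Rest/binary>>) ->
    {ok, #{version => Version, flags => Flags, payload => Payload}, Rest};
parse_header(_) ->
    {error, bad_header}.

%% The same syntax pressed into service as a text literal. Without the
%% /utf8 type specifier each code point would be truncated to one byte.
greeting() ->
    <<"こんにちは"/utf8>>.

%% Explicit conversion of a list of code points to a UTF-8 binary.
to_utf8(Chars) ->
    unicode:characters_to_binary(Chars, unicode, utf8).
```

The header clause reads naturally, while the text literal needs an encoding annotation on every occurrence, which is roughly the friction I am complaining about.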

Sean

> 
> -Craig