Complexity Shock Horror II: the Sequel (was Re: MD5 in erlang.)

Fri Apr 11 10:26:11 CEST 2003

Finally, some feedback! Too bad it was mostly negative. Discussion 
inserted below.

/ Raimo

Robert Virding wrote:
> I will reply to this mail instead of Raimo's reply to my reply.
> 
> ----- Original Message -----
> From: "Raimo Niskanen" <raimo.niskanen@REDACTED>
> To: <erlang-questions@REDACTED>
> Sent: Wednesday, April 02, 2003 10:00 AM
> Subject: Re: Complexity Shock Horror II: the Sequel (was Re: MD5 in erlang.)
> 
> 
> 
>>The Erlang style Base#Integer notation is such an odd freak that it also
>>should not have a letter of its own. I will use b/B for signed
>>unprefixed instead so you won't have to use x/X with empty prefix.
> 
> It may be odd and slightly freakish but it is part of the language so it
> should be supported if we are going to output based integers. After all we
> are adding some thing to the *ERLANG* libraries! That is why there should be
> a ~#!

I thought a lot about using ~#, but could not decide if it should output 
uppercase or lowercase letters, and I could not find another similar 
character for the other case (upper/lower).

Therefore I selected x/X.

> 
> 
>>This gives the general x/X:
>>io_lib:format("~.16x", [-31,"0x"]) -> "-0x1f"
>>io_lib:format("~.16X", [-31,"0x"]) -> "-0x1F"
>>b/B is just x/X with empty prefix:
>>io_lib:format("~.16b", [-31]) -> "-1f"
>>io_lib:format("~.16B", [-31]) -> "-1F"
> 
> 
> No I definitely don't like it! Two major objections:
> 1. You are adding a completely new concept with an option to have an
> included string in the middle of an output, this is not consistent with

The problem is that to be able to output "-16#1f" for -31 and "16#1f" for 
+31, something ("16#") has to be inserted between the sign and the number.
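
A minimal sketch of what I mean, assuming a hypothetical helper prefixed/3 (it is not an existing io_lib function) and using erlang:integer_to_list/2 for the digit conversion. The sign is split off first, so the prefix can land between the sign and the digits:

```erlang
-module(prefix_sketch).
-export([prefixed/3]).

%% Hypothetical helper, for illustration only.
%% Writes Int in Base (2..36) with Prefix placed between
%% the sign and the lowercase digits.
prefixed(Int, Base, Prefix) when Int < 0 ->
    "-" ++ Prefix ++ string:lowercase(integer_to_list(-Int, Base));
prefixed(Int, Base, Prefix) ->
    Prefix ++ string:lowercase(integer_to_list(Int, Base)).
```

With this, prefixed(-31, 16, "16#") gives "-16#1f" and prefixed(31, 16, "16#") gives "16#1f". An empty prefix string gives the unprefixed case.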

> anything. I am strongly for consistency and simplicity. Otherwise why not go

So am I, but only as long as it also is useful.

> for the complete set of Common Lisp options and we can scrap half of the
> libraries in one fell swoop.
> 2. If you are using x/X for hex then b/B definitely suggests _B_inary!

I am not using x/X for hex, it is for prefiXed output, and b/B is for 
any-Base integer (x/X is also any-base, by the way). The only one 
necessary is x/X, since b/B is just x/X with empty prefix string, but I 
think it would be awkward to have to give an empty prefix string in the 
fairly common cases when you do not want any prefix.

> 3. I don't really understand why we need to directly support other language
> formats. This is just a grumble!
> 
Well, there are lots of applications that need to write files that are 
to be read by alien applications, so it should be possible, and fairly 
simple.

> Also if u/U and b/B mean hex why the .16? Or do you mean that they mean bit
> output and you give the base in the precision? So can you write "~.8b" to
> get octal? Otherwise it is completely redundant and very confusing.
> 

Yes, .16 is the base. Allowed bases are 2 through 36, with default 10.
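
To illustrate the base range, here is a small sketch. The helper digits/1,2 is hypothetical; its Base argument plays the role of the precision field, and it leans on erlang:integer_to_list/2, which writes uppercase digits:

```erlang
-module(base_sketch).
-export([digits/1, digits/2]).

%% Hypothetical helper, for illustration only.
%% Base defaults to 10, like an omitted precision field.
digits(Int) ->
    digits(Int, 10).

%% Only bases 2..36 are accepted; anything else fails to match.
digits(Int, Base) when is_integer(Int), is_integer(Base),
                       Base >= 2, Base =< 36 ->
    integer_to_list(Int, Base).
```

So digits(31, 8) gives "37" (octal), digits(31, 2) gives "11111" and digits(31) gives "31".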

> 
>>and u/U is the same as b/B with the argument integer 'band'ed with
>>((1<<WordSize)-1):
>>io_lib:format("~.16U", [31,32]) -> "1F"
>>io_lib:format("~.16u", [-31,32]) -> "ffffffe1"
>>io_lib:format("~8.16.0u", [31,32]) -> "0000001f"
>>io_lib:format("~8.16.0U", [-31,32]) -> "FFFFFFE1"
> 
> 
> Some objections here:
> 
> 1. I assume the second argument is the "word size". Why add an extra option
> for something which already exists in the language? Why can't the user just
> write 31 band 255 and be done with. As people have rightly pointed out there
> is already too much junk in the libraries without adding more unnecessary
> stuff!

Convenience! I think it might be fairly common to output an unsigned 
word. The suggestion from Happi had a binary bit field of this sort, but 
it was only base 2.

> 2. You are using the field width inconsistently! It does not mean the width
> of the outputted data but the total width of allowed field to output the
> data. If there is not enough data to fill the field padding is inserted.
> Using the precision is consistent with intent. Like I did! :-)

Sorry, I am too stupid to understand your second sentence. And I have no 
access to your suggestion so I do not know how you did it. It was 
Happi's suggestion that got me started.

I do not think I use the field width inconsistently. It is the only 
parameter I have not tampered with - I use the precision field 
inconsistently! The third line in the example above outputs a 32-bit wide 
unsigned word in base 16 in an 8-character wide field, padded with zero 
characters. It is a trick to get the same width for negative and 
positive numbers since I did not want to use yet another character for 
zero padding to word size, which of course is a possibility.
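
The same effect can be sketched by hand. The helper unsigned_hex/2 below is hypothetical and only covers base 16: it masks the integer to WordSize bits with 'band' and zero pads to the number of hex digits the word needs:

```erlang
-module(unsigned_sketch).
-export([unsigned_hex/2]).

%% Hypothetical helper, for illustration only.
%% Interprets Int as an unsigned word of WordSize bits and
%% returns its lowercase hex digits, zero padded to full width.
unsigned_hex(Int, WordSize) when is_integer(Int), WordSize > 0 ->
    Mask = (1 bsl WordSize) - 1,
    Digits = string:lowercase(integer_to_list(Int band Mask, 16)),
    Width = (WordSize + 3) div 4,          % hex digits per word
    Pad = lists:duplicate(max(0, Width - length(Digits)), $0),
    Pad ++ Digits.
```

Here unsigned_hex(31, 32) gives "0000001f" and unsigned_hex(-31, 32) gives "ffffffe1", so negative and positive numbers get the same width.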

And I am not certain I am using the precision field inconsistently, 
either. It already means different things for different format characters.

> 3. (Sarcastically) If I do io:format("~8.16u", [31,"0x"]) will that insert
> "0x" somewhere in the output?

Unfortunately impossible since ~u must know if it has extra arguments in 
the argument list. They are mandatory.

The only arguments that can be optional are field width, precision and 
padding character. That is why I put number base in the precision field, 
since there is a natural default (10). Mandatory arguments can be put in 
the argument list, and do not have to be numeric. That is why I put the 
prefix string there.

> 
> 
>>Objections, anyone?
> 
> 
> Yes, definitely lots of objections! Too many inconsistencies and wrong usage
> of existing features to be allowed in! Cover the language needs first and
> let the users handle special cases. Adding this would be Big Lose!

It should be easy to handle common "special cases".

> 
> I know the io:format options aren't the best but at least they are
> consistent.
> 
> You mentioned somewhere else a ~- option, this could be useful for
> io:fwrite. The io:fread option would be useful but I think the option
> shouldn't use a generic u.

Why not ~u, and what do you mean by generic? It was supposed to be consistent 
with my suggestion of u/U for unsigned output. I have not suggested an 
option for scanning any-base integers, i.e. a ~# that could scan both "16#1f" and 
"8#37". Is this what you are looking for? My ~16u would scan "1f", and 
to scan "-16#1f" you would have to know which base you are expecting and 
use ~-16#~u to scan it. Not quite satisfactory.

~- for fwrite, that is a thought. Then you could (if ~u now output the 
absolute value and did not take a wordsize argument) do:
	U = -31,
	io:fwrite("~-16#~.16u", [U,U]) -> "-16#1f".
This makes it hard to make use of the field width, though, especially 
for right-adjusted and padded fields. How would you e.g. output a 
"......-16#123abc" field? This also speaks for having a prefix string as 
an argument to the format character.
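
A sketch of why this matters, assuming a hypothetical helper right_adjust/3: the padding can only be computed once sign, prefix and digits are available as one string, which is not the case when they come from separate format directives:

```erlang
-module(field_sketch).
-export([right_adjust/3]).

%% Hypothetical helper, for illustration only.
%% Right adjusts Str in a field of Width characters, filling
%% with PadChar. A Str longer than Width is returned unchanged
%% (an assumption of this sketch, not io_lib behaviour).
right_adjust(Str, Width, PadChar) ->
    lists:duplicate(max(0, Width - length(Str)), PadChar) ++ Str.
```

For example, right_adjust("-16#123abc", 16, $.) gives "......-16#123abc".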

Or, did you think of another use of ~-?

> 
> Robert
> 
> 

The demands I have are:
1) It must be possible to output signed Erlang style base prefixed 
integers "-16#1abc" in at least bases 2..16, but why not 2..36, 
preferably in both lowercase and uppercase.
2) It should be possible to output 2..36 base integers with other 
prefixes (also no prefix), even if the numbers are negative, in both 
lowercase and uppercase.
3) It should be fairly simple to interpret a number as an unsigned word 
of any width and print it like "0000".."FFFF" or also "0".."FFFF". If 
you can output unprefixed it is of course easy to do with the band 
operator and zero pad character.

Fortunately, there is still some time left to R9C checkin stop, to 
decide on an implementation reasonable for everyone.