[erlang-patches] Improve pad character handling in base64 MIME decoding functions

Thomas O'Dowd tpodowd@REDACTED
Thu Dec 9 02:51:07 CET 2010


On Wed, 2010-12-08 at 10:06 +0100, Niclas Axelsson wrote:
> Thank you, Tom. Will include your patch into 'pu'. But since it breaks the
> behaviour (and might not be backwards compatible), I can not guarantee it
> will be merged into 'dev'.
> 
> Regards,
> Niclas Axelsson, Erlang/OTP

Hi Niclas,

Thanks for your time! By the way, I also noticed some inconsistencies in
the older implementation between mime_decode_to_string/1 and mime_decode/1
when I was reading the code and deciding how to fix the padding
character handling.

The inconsistency happens when 2 padding characters would be required to
complete the data but more data is found between the 2 padding
characters.

1> base64:mime_decode(<<"ab=cd=">>).
<<"i">>
2> base64:mime_decode_to_string(<<"ab=cd=">>).
** exception error: no function clause matching base64:b64d_ok(eq)
     in function  base64:decode/2

With the new patch the result is consistent between the two functions
(although it differs from the old mime_decode/1 result, because more of
the input is decoded).

1> base64:mime_decode(<<"ab=cd=">>).
<<105,183,29>>
2> base64:mime_decode_to_string(<<"ab=cd=">>).
[105,183,29]

Also, in the old code the two-pad case skips extra trailing padding, but
no other case does. Here is an example. Note that command 3 works, but
commands 4 and 5 fail.

3> base64:mime_decode(<<"ab======">>).            
<<"i">>
4> base64:mime_decode(<<"abc=====">>).
** exception error: no match of right hand side value <<105,183,0:2>>
     in function  base64:mime_decode_binary/2
5> base64:mime_decode(<<"abcd====">>).
** exception error: no match of right hand side value <<105,183,29>>
     in function  base64:mime_decode_binary/2

The patch fixes this inconsistency also.

3> base64:mime_decode(<<"ab======">>).
<<"i">>
4> base64:mime_decode(<<"abc=====">>).
<<"i·">>
5> base64:mime_decode(<<"abcd====">>).
<<105,183,29>>
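
The patched results quoted above are consistent with a very simple model:
ignore every byte that is not in the base64 alphabet (including '='
wherever it appears) and emit a byte each time 8 bits have accumulated.
The Python sketch below implements only that model. It is an assumption
made for illustration, not the logic of the Erlang patch itself, and the
name lenient_decode is hypothetical.

```python
import string

# Standard base64 alphabet: A-Z, a-z, 0-9, '+', '/'.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
VALUE = {ch: i for i, ch in enumerate(ALPHABET)}


def lenient_decode(data: bytes) -> bytes:
    """Decode base64, skipping '=' and any other non-alphabet byte.

    Simplified model only: it reproduces the four patched results shown
    in this mail, but it is not a port of the Erlang patch.
    """
    out = bytearray()
    acc = 0    # bit accumulator
    nbits = 0  # number of valid bits currently held in acc
    for byte in data:
        value = VALUE.get(chr(byte))
        if value is None:
            continue  # pad character or other non-alphabet byte: skip it
        acc = (acc << 6) | value
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1  # keep only the leftover bits
    return bytes(out)


if __name__ == "__main__":
    for sample in (b"ab=cd=", b"ab======", b"abc=====", b"abcd===="):
        print(sample, list(lenient_decode(sample)))
```

Under this model, leftover bits that never fill a whole byte are simply
dropped, which is why "ab" yields one byte and "abc" yields two.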

Regarding backwards compatibility, extra pad characters, whether embedded
in the MIME base64 encoded stream or at the end of it, are quite rare,
as a good base64 encoder should never generate such data. So I don't
think backwards compatibility should be a major concern in this respect.

FYI, my other favourite language Python can decode each of the examples
I listed above with the same results as the new patch. So that is nice.
Haven't tried other languages.

>>> base64.decodestring("abcd====")
'i\xb7\x1d'
>>> base64.decodestring("abc====")
'i\xb7'
>>> base64.decodestring("ab====")
'i'
>>> base64.decodestring("ab=cd=")
'i\xb7\x1d'
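
The session above uses Python 2. base64.decodestring was removed in
Python 3.9, so a rough Python 3 equivalent would go through
base64.b64decode on bytes, with its default non-validating mode. The
sketch below reruns the same four inputs and compares each result with
the value reported in the session, rather than assuming the outcome. The
CASES table and the check_cases helper are names made up for this
example.

```python
import base64

# Inputs from the Python session above, paired with the results it reports.
CASES = {
    b"abcd====": b"i\xb7\x1d",
    b"abc====": b"i\xb7",
    b"ab====": b"i",
    b"ab=cd=": b"i\xb7\x1d",
}


def check_cases():
    """Decode each input and record whether it matches the reported result."""
    results = {}
    for encoded, reported in CASES.items():
        decoded = base64.b64decode(encoded)  # validate=False is the default
        results[encoded] = (decoded, decoded == reported)
    return results


if __name__ == "__main__":
    for encoded, (decoded, same) in check_cases().items():
        print(encoded, decoded, "matches" if same else "differs")
```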

Sorry for the long mail but just thought the extra information might
help.

Tom.


