[erlang-questions] binary_to_term and leaking atoms
Jayson Vantuyl
kagato@REDACTED
Mon Jan 4 12:01:57 CET 2010
Yeah, I thought of that after sending it (and followed up cryptically).
When data gets passed around, I generally don't expect it to be properly validated, so I was being a bit paranoid.
On Jan 4, 2010, at 2:35 AM, Zoltan Lajos Kis wrote:
> Just to see this clearly:
> How can a pid or fun be dangerous? If you don't trust them, don't call
> them or send messages to them.
>
>> This was exactly what I was thinking. As for naming the option, perhaps
>> you should use "only_existing_atoms" just to make it look similar to
>> list_to_existing_atom/1. Also, their docs should probably refer to each
>> other, just for context. I would also want options to prohibit_funs and
>> prohibit_pids. Both of these would be necessary to safely use
>> binary_to_term with unsafe binaries. Anything else anyone can think of?
>>
>> As an alternative to adding options, you could also just have
>> safe_binary_to_term/1 which prohibited funs, pids, and new atoms. Unless
>> there are other compelling options, an option list might be overkill. I'd
>> imagine that this will get used in a tight loop, so handling the lists for
>> each call might also be a bit of an unnecessary performance hit (although,
>> I find this unlikely).
>>
>> So, specifically, either:
>>
>> %% @spec safe_binary_to_term(binary()) -> term()
>> %% @doc Limited form of binary_to_term which won't create funs, pids, or
>> %% new atoms.
>> %% To be used to limit danger of decoding untrusted external binaries.
>>
>> Or:
>>
>> %% @spec binary_to_term(binary(), [ only_existing_atoms | prohibit_pid |
>> %%                                  prohibit_fun | safe ]) -> term()
>> %% @doc Same as binary_to_term/1, but with special decoding options.
>> %% only_existing_atoms: prohibit creation of new atoms
>> %% prohibit_pid: prohibits creation of pids
>> %% prohibit_fun: prohibits creation of funs
>> %% safe: same as above three options, useful when decoding binaries from
>> %% untrusted sources
>>
>> Take your pick, but either one would make my world much easier.
>>
>> On Jan 4, 2010, at 1:54 AM, Kenneth Lundin wrote:
>>
>>> I think we will implement atom GC one day. We have discussed this
>>> several times, and there are solutions with only a small performance
>>> decrease (10% or maybe less). It is a major thing to implement, and we
>>> cannot prioritize it for now; it is unknown when we can.
>>>
>>> Extended functionality in binary_to_term can be a good compromise in
>>> the meantime.
>>> The question is how it should work.
>>>
>>> Assume we introduce:
>>>
>>> binary_to_term(Bin,Options)
>>>
>>> What options should we have:
>>>
>>> 'no_new_atoms' could make the function crash with reason badarg if the
>>> binary contains encoded "new" (non-existing) atoms
>>>
>>> or
>>>
>>> 'new_atoms_as_binaries' could translate "new" (non-existing) atoms to
>>> binaries
>>>
>>> etc.
>>>
>>> I think the first step is to define how the new function should work.
>>> What suggestions do you have here?
>>>
>>> /Kenneth Erlang/OTP Ericsson
>>>
>>> On Mon, Jan 4, 2010 at 9:28 AM, Jayson Vantuyl <kagato@REDACTED> wrote:
>>>> On the up side, short atoms sound like a fantastic idea. On the down
>>>> side, atom GC sounds like it could easily kill performance. I'd
>>>> imagine that everything would have to stop when atoms were GC'd. Even
>>>> if not, I'd also imagine it would be a large and invasive change.
>>>>
>>>> I don't suppose I could get a safe binary_to_term in the meantime?
>>>> What if I submitted a patch?
>>>>
>>>> On Jan 4, 2010, at 12:17 AM, Joe Armstrong wrote:
>>>>
>>>>> The real problem is that we don't have garbage collection of atoms.
>>>>> list_to_existing_atom is a hack to try and get around this.
>>>>>
>>>>> Erlang was designed for an environment of trusted nodes so we never
>>>>> worried about atom garbage collection.
>>>>> The case for adding atom GC has never been compelling enough to do it.
>>>>>
>>>>> On a 64-bit machine the case for atoms seems weak: you could make a
>>>>> new data type, "short atoms", and store them in 64 bits. Long atoms
>>>>> could be on the local process heap and not in the atom table. You
>>>>> could use some smart pointer scheme to avoid unnecessary string
>>>>> comparisons when comparing atoms ...
>>>>>
>>>>> /Joe
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jan 4, 2010 at 3:58 AM, Jayson Vantuyl <kagato@REDACTED>
>>>>> wrote:
>>>>>> I've been writing a lot of Erlang lately, and I feel like I'm missing
>>>>>> something.
>>>>>>
>>>>>> Specifically, list_to_existing_atom is awesome for preventing atom
>>>>>> leak; binary_to_term is great for easily building flexible network
>>>>>> protocols; and {packet,N} makes framing the protocol a breeze.
>>>>>>
>>>>>> That said, I can't get the safety of list_to_existing_atom with
>>>>>> binary_to_term. binary_to_term will automatically create any atoms
>>>>>> (as well as funs) that a remote sender wants. This has
>>>>>> necessitated writing custom protocol encoders / decoders, and makes
>>>>>> Erlang's external binary term format incredibly useless. It would be
>>>>>> very nice to add a version of binary_to_term that has an extra
>>>>>> argument which contains options. This would generally be useful to
>>>>>> allow prohibiting creation of new atoms, prohibiting creation of funs
>>>>>> / pids, and maybe even to specify backwards-compatible binary formats
>>>>>> (making it easier to interoperate with older versions of Erlang).
>>>>>>
>>>>>> --
>>>>>> Jayson Vantuyl
>>>>>> kagato@REDACTED
>>>>>>
>>>>>> ________________________________________________________________
>>>>>> erlang-questions mailing list. See http://www.erlang.org/faq.html
>>>>>> erlang-questions (at) erlang.org
>>>>
>>>>
>>>>
>>>> --
>>>> Jayson Vantuyl
>>>> kagato@REDACTED
>>
>
>
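For anyone reading this thread in the archive, here is a rough sketch of the "safe" decoding idea discussed above. It assumes an OTP release in which binary_to_term/2 accepts the `safe` option (later releases do; it did not exist when this thread started). That option fails with badarg rather than creating new atoms, but it does not reject pids or funs as such, so the sketch adds a separate walk over the decoded term. The module name safe_decode and the helpers decode/1, is_plain/1 and is_plain_list/1 are hypothetical names made up for illustration, not OTP functions.

```erlang
-module(safe_decode).
-export([decode/1]).

%% Hypothetical helper, not part of OTP.
%% Decodes an untrusted binary and rejects it if decoding would create
%% new atoms, or if the result contains pids, ports, references, or funs.
decode(Bin) when is_binary(Bin) ->
    try binary_to_term(Bin, [safe]) of
        Term ->
            case is_plain(Term) of
                true  -> {ok, Term};
                false -> {error, forbidden_term}
            end
    catch
        %% badarg covers both malformed input and input that the
        %% safe option refuses to decode (for example, unknown atoms).
        error:badarg -> {error, unsafe_binary}
    end.

%% True when the term holds no pids, ports, references, or funs.
is_plain(T) when is_pid(T); is_port(T); is_reference(T); is_function(T) ->
    false;
is_plain(T) when is_list(T) ->
    is_plain_list(T);
is_plain(T) when is_tuple(T) ->
    is_plain_list(tuple_to_list(T));
is_plain(T) when is_map(T) ->
    is_plain_list(maps:to_list(T));
is_plain(_) ->
    true.

%% Walks proper and improper lists alike.
is_plain_list([H | T]) -> is_plain(H) andalso is_plain_list(T);
is_plain_list([]) -> true;
is_plain_list(Tail) -> is_plain(Tail).
```

With this module compiled, safe_decode:decode(term_to_binary({ok, [1,2,3]})) returns {ok, {ok, [1,2,3]}}, while safe_decode:decode(term_to_binary(self())) returns {error, forbidden_term}. Checking the term after decoding keeps the sketch short, but it means a pid or fun is still constructed briefly before being rejected; prohibiting them inside the decoder itself, as proposed in the thread, would avoid that.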