[erlang-questions] Two beautiful programs - or web programming made easy
Edmond Begumisa
ebegumisa@REDACTED
Wed Feb 16 13:48:35 CET 2011
On Security:
I think if you take a closer look at _how_ Joe is actually using eval,
you'll find you need not be so alarmed by it. You'll find the *same*
security issues all web-programmers are used to. You'll find you can deal
with them in the *same* ways.
See comments inline...
On Tue, 15 Feb 2011 07:23:59 +1100, Frédéric Trottier-Hébert
<fred.hebert@REDACTED> wrote:
>
>
> On 2011-02-14, at 15:17 PM, Edmond Begumisa wrote:
>
>> Here's where you lose me: Say I have a js function foo that I want to
>> serve from my Erlang webserver and call from my browser. Note that
>> unlike JSON, this is an actual function and should therefore contain
>> code (so parsing out the code wouldn't make any sense)...
>>
>> a) Static version: Stick foo in a file foo.js on the server. Erlang
>> side streams the file to the browser, which reads it, interprets it and
>> runs it.
>>
>> b) Eval version: Stick foo in a message. Erlang side streams the
>> function to the browser, which using eval, reads it, interprets it and
>> runs it.
>>
>> Why is b) suddenly crossing some threshold of risk that is not
>> *equally* inherent to a)? What unacceptable extra risk am I introducing
>> by doing b) instead of a)? Why does doing b) require N times more
>> competence and security consciousness that you suddenly don't trust
>> yourself and would much prefer a) instead?
>>
>
> Not a whole lot of chances for things to go wrong at first. The only way
> to really have a problem would be to dynamically take the filename -- some
> user-submitted variable, maybe not properly whitelisted. For the future,
> though? If at some point you get to customise the script a bit with
> run-time data. If it's simple integers and again whitelisted data, not
> much of a problem. The time you start taking in strings with random
> input, it gets more risky.
>
But it gets more risky in *both* cases, in equal magnitude. Doing a)
instead of b) won't protect if you start "taking in strings with random
input". Let's analyse how it "gets more risky" and then compare it to
Joe's code...
Say programmer Helga stores a value into a Key-Value DB carelessly from
the browser like so...
/* NB: We cannot trust ::textContent here coz we're saving it for later
use */
var f = document.getElementById('f').textContent; // Say f = 'x + y'
kvSaveToServer("f", f); // Server stores f verbatim
Then later, on another page, she tries a little calc UI...
/* NB: We can reasonably trust ::textContent here coz it's only used on
the same page */
var x = document.getElementById('x').textContent; // Trusted
var y = document.getElementById('y').textContent; // Trusted
// Untrusted (f = 'x + y', but our saving was dubious)
var f = kvReadFromServer('f');
var z = eval(f);
The warning from Mozilla and others concerning eval here is clear and
every web-developer is aware of it: Helga can't really be sure what f
contains because it's code that comes from someone other than her (e.g. a
malicious user when it was saved.) Hence the general advice to avoid
eval(). But if she *is* sure what f contains because she damn-well
wrote it like so...
var x = document.getElementById('x').textContent; // Trusted
var y = document.getElementById('y').textContent; // Trusted
var z = eval('x + y'); // Trusted (Evaling code we've written)
Then this is no different from...
var x = document.getElementById('x').textContent; // Trusted
var y = document.getElementById('y').textContent; // Trusted
var z = x + y; // Trusted (code we've written)
It makes no difference whether the *static* 'x + y' comes streaming in
from the server and is run as eval('x + y') in the former case, or if it
comes locally from a js file that also comes from the server in the latter
case (there might be a scope issue in the former but we'll get to that.)
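To make the comparison concrete, here is a minimal sketch that can be run outside a browser. The function names calcDirect and calcEval are made up for illustration, and the string arguments stand in for the textContent values read from the page.

```javascript
// Hypothetical stand-ins: x and y play the role of the textContent values.
function calcDirect(x, y) {
  // The expression as it would appear in a static .js file.
  return x + y;
}

function calcEval(x, y) {
  // The same expression, delivered as a string and evaluated.
  // A direct eval call sees the local x and y of this function.
  return eval('x + y');
}

const direct = calcDirect('2', '3');
const evaled = calcEval('2', '3');
console.log(direct, evaled, direct === evaled); // 23 23 true
```

Both calls return the string '23' (textContent values are strings, so + concatenates), which is the point: a fixed string handed to eval behaves like the same code written inline.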
Now let's take it further and load the arbitrary strings you mention
within the eval'ed code (closer to what Joe's doing but not quite). I
believe this is the code that alarms you...
/* Assume keys x and y were saved earlier using careless
getElementById('x').textContent verbatim */
var x = kvReadFromServer('x'); // Untrusted
var y = kvReadFromServer('y'); // Untrusted
var z = eval('x + y'); // **Looks trusted but is not**
However, and here's half the point I was making, this is the same as...
var x = kvReadFromServer('x'); // Untrusted
var y = kvReadFromServer('y'); // Untrusted
var z = x + y; // **Looks trusted but is not**
So the non-eval'ed code here is just as risky and deceptive as the eval'ed
code. Helga avoiding eval on *code that she wrote* (former) and using a
non-eval'ed version *that she wrote* (latter) will not protect her. Using
the former is not _increasing_ her susceptibility to code injections from
the "random strings" coming from others in the database. She's equally
screwed either way by her carelessness with the DB.
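A small sketch of that equivalence, under some assumptions: the store object, the kvReadFromServer stub and the stealCookies function below are all hypothetical stand-ins, and the last step simply treats z as code to show where the damage would actually happen.

```javascript
// Hypothetical key-value store holding values saved carelessly earlier.
const store = { x: '12', y: '; stealCookies()' };
function kvReadFromServer(key) { return store[key]; }

// Hypothetical payload; it only records that it was called.
let stolen = false;
function stealCookies() { stolen = true; }

function viaEval() {
  const x = kvReadFromServer('x');
  const y = kvReadFromServer('y');
  return eval('x + y');   // eval'ed code that Helga wrote
}

function viaStatic() {
  const x = kvReadFromServer('x');
  const y = kvReadFromServer('y');
  return x + y;           // static code that Helga wrote
}

const a = viaEval();
const b = viaStatic();
const stolenBefore = stolen;  // neither version ran the payload by itself
console.log(a === b, stolenBefore);

// The payload runs only once the untrusted result is treated as code,
// and it makes no difference which version produced it.
eval(b);
console.log(stolen);
```

In both versions the hostile value is carried along as plain data. The exposure comes from whatever later treats z as code or markup, so swapping eval for static code does not remove it.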
**BUT**, and this is key, that's not even what Joe's doing! Basically, Joe
is taking this supposedly safer latter static version, and eval'ing it
like so...
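/* Both reads are untrusted, so z **looks trusted but is not**.
   Each statement ends with a newline; a // comment inside the string
   would otherwise swallow everything that follows it. */
var c = "var x = kvReadFromServer('x');\n" +
"var y = kvReadFromServer('y');\n" +
"var z = x + y;";
eval(c);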
Joe is taking the code that you'd prefer to see in a static file, and just
eval'ing it. Yet, and this is the second half of my point, any
carelessness in either is equal. I find it hard to be petrified by the
lower version while being more comfortable with the one above because it's
somehow "less risky." The real problem is the untrusted values.
AFAIK (at least with the Mozilla code-base, with which I'm fairly
familiar), using eval like this *introduces* only one *new* security concern:
3rd party js can see the scope at which eval was invoked. And if you're
using 3rd party js that you can't trust, you're probably screwed anyway :)
All other security concerns are the *same* in both versions.
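The scope point can be illustrated with a short sketch. The variable name secret and the two wrapper functions are made up for the example, and the indirect call form (0, eval) is shown only as one known way to keep evaluated code out of the local scope, not as something Joe's code does.

```javascript
function directEval(code) {
  const secret = 'session-token';  // hypothetical local value
  return eval(code);               // direct eval: code runs in this function's scope
}

function indirectEval(code) {
  const secret = 'session-token';  // same local value, not visible below
  return (0, eval)(code);          // indirect eval: code runs in the global scope
}

const probe = 'typeof secret';
console.log(directEval(probe));    // string
console.log(indirectEval(probe));  // undefined
```

Code passed to a direct eval can read the locals of the function that called it, which is the extra exposure described above. The indirect form evaluates in the global scope and sees only globals.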
Now, getting more realistic, he'll obviously want to do the reading server
side, thus replacing kvReadFromServer('x') with <<"'", Val, "'">> cat'ed
into the eval'ed string before it's delivered. So he now gets something
like...
/* Both values are untrusted, so z **looks trusted but is not** */
var c = "var x = 'valfromuser';\n" +
"var y = 'valfromuser';\n" +
"var z = x + y;";
eval(c);
AHA! You raised a valid concern about how doing the proper escaping would
be tricky to get correct, esp from Erlang, and jquery does this sort of
thing better. But nothing stops Joe from making calls to JQuery with the
code he pushes (actually, he does use JQuery), or from using the native
JSON parser...
/* valfromuser must be valid JSON text (e.g. a quoted string);
   once parsed, x, y and z are trusted */
var c = "var x = JSON.parse('valfromuser');\n" +
"var y = JSON.parse('valfromuser');\n" +
"var z = x + y;";
eval(c);
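Here is a rough sketch of the difference between naive quoting and JSON encoding when the server builds the string. It is written in JavaScript rather than Erlang purely for illustration; buildNaive, buildEncoded, run and the pwned flag are hypothetical names, not anything from Joe's code.

```javascript
// Hypothetical server-side helpers that build the code sent to the browser.
function buildNaive(val) {
  // Equivalent of wrapping the raw value in single quotes.
  return "var x = '" + val + "';\nx;";
}

function buildEncoded(val) {
  // Inner stringify: the value as JSON text.
  // Outer stringify: that JSON text as a safely escaped JS string literal.
  return 'var x = JSON.parse(' + JSON.stringify(JSON.stringify(val)) + ');\nx;';
}

// Hypothetical browser side: evaluate whatever arrives.
let pwned = false;
function run(code) { return eval(code); }

const hostile = "'; pwned = true; '";

run(buildNaive(hostile));
const naiveInjected = pwned;        // the value escaped its quotes and ran as code

pwned = false;
const roundTripped = run(buildEncoded(hostile));
const encodedInjected = pwned;      // the value stayed data

console.log(naiveInjected, encodedInjected, roundTripped === hostile); // true false true
```

With the naive version the hostile value closes the quote and its remainder runs as code. With the encoded version the same value arrives intact as a string, so the escaping burden moves to a standard encoder instead of hand-written quoting.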
- Edmond -
>> From what I've read, it appears that you'd never use js at all in your
>> websites, let alone js that generates content. It would always be "too
>> risky."
>
> No, there are cases that are good for it, but you have to be very
> careful. In the context of Joe's mail, it was "hey I could replace my
> whole site by this kind of pattern" which is not the same as say,
> "inject user-related data in the JS to generate better advertisement" or
> "have a dynamic way of showing profile pictures" (although there are
> cleaner ways than streaming JS to an eval function for this).
>
> As a general (and generic) pattern, the eval() in Joe's code worries me.
> Individual cases can be tested and proven safe on an individual basis
> without too much trouble.
>>
>> - Edmond -
>>
>> On Mon, 14 Feb 2011 12:39:06 +1100, Frédéric Trottier-Hébert
>> <fred.hebert@REDACTED> wrote:
>>
>>> Replies are still in between bits of text.
>>> On 2011-02-13, at 15:43 PM, Edmond Begumisa wrote:
>>>
>>>> On Mon, 14 Feb 2011 05:59:19 +1100, Frédéric Trottier-Hébert
>>>> <fred.hebert@REDACTED> wrote:
>>>>
>>>>>
>>>>> On 2011-02-12, at 06:33 AM, Joe Armstrong wrote:
>>>>>
>>>>>>
>>>>>> The Javascript equivalent is:
>>>>>>
>>>>>> function onMessage(evt) {
>>>>>> eval(evt.data);
>>>>>> }
>>>>>>
>>>>>> Where the data comes from a websocket.
>>>>>>
>>>>> This is rather risky. Eval will take any code whatsoever and run it
>>>>> for you.
>>>>
>>>> Likewise the browser will take any static js (<script> tags)
>>>> whatsoever from your server and run it for you.
>>>
>>> Right. This is why ideally you want to pass in very precise functions
>>> and do something RPC-like (despite Joe not liking it) or have your own
>>> parser (as it is the case with JSON). It's not that it's impossible to
>>> make the other ways safe, but I wouldn't trust most people (including
>>> myself) to get it right most of the time.
>>>>
>>>>> If you have dynamic content, without proper escaping and being very
>>>>> careful, users could run arbitrary code in your page, including
>>>>> stuff to steal session data and send it over to either some other
>>>>> site, or perform actions for the user which they do not necessarily
>>>>> approve of (making their profile public, closing their account,
>>>>> worms, etc.)
>>>>>
>>>>
>>>> Likewise if you have any dynamic content in js code on your server
>>>> without proper escaping and not being very careful, users could...
>>>
>>> Exactly. You're always as safe as your weakest link. Some frameworks
>>> (like JQuery on some methods) handle it for you, but usually there is
>>> no such thing for 'eval'.
>>>
>>>> ... Doesn't the "same source" XSS rule for non-evaled code apply to
>>>> evaled code? Doesn't the same duty of care to end-users for
>>>> protecting privacy, properly escaping data, etc, apply in both cases?
>>>> Don't you have to be careful either way?
>>>>
>>>
>>> Yes. But some ways to do things are safer than others by default. The
>>> problem with 'doing things right' is how much trust you put in
>>> yourself and your team of developers. I'm of the opinion that most
>>> people who feel good enough to handle security actually overlook a lot
>>> of it. Have you always checked everything for XSS in all encodings?
>>> CSRF? Ever used something like MD5 or SHA to hash passwords? Sent such
>>> passwords over e-mail, etc? Those are very basic options and I can
>>> tell you that most developers who have worked on the web had a problem
>>> with at least one of these at some point or another. Hell, even
>>> gmail had severe CSRF holes at some point that let people randomly
>>> inject themselves into your forwarded email addresses.
>>>
>>> Security is hard, and stepping clear of the risky line is often a good
>>> option if you're not 100% sure of what you're doing. A cook skilled
>>> enough can likely prepare a meal while juggling with knives safely,
>>> but it's often not necessary to do so, and often not appropriate for
>>> everyone to follow that line either.
>>>
>>>>> In fact, this is a reason why people like Douglas Crockford preferred
>>>>> to write JSON parsers rather than just evaluating them. It's just
>>>>> not safe enough.
>>>>>
>>>>
>>>> Indeed you are correct, but...
>>>>
>>>> From http://www.json.org/js.html ...
>>>>
>>>> "...The use of eval is indicated when the source is *trusted* and
>>>> *competent*..."
>>>
>>> The *competent* part is the one that worries me. I think most
>>> developers (myself included) tend to overestimate their competence
>>> when it comes to security.
>>>>
>>>> "...In web applications over XMLHttpRequest, communication is
>>>> permitted only to the same origin that provide that page, so it is
>>>> *trusted*. But it *might not be competent*. If the server is not
>>>> rigorous in its JSON encoding, or if it does not scrupulously
>>>> validate all of its inputs, then it could deliver invalid JSON text
>>>> that could be carrying dangerous script..."
>>>>
>>>> So it boils down to the competence of the code on the server. You
>>>> have to be careful how you construct your pages and javascript. But
>>>> then, this should *always* be the case.
>>>
>>> Yes, agreed. Again, I'm supporting the position of 'why risk it?' not
>>> the line of 'it's impossible to be safe!'
>>>
>>>>> Plus you have to call the javascript parser and whatnot, which is
>>>>> usually rather slow.
>>>>
>>>> One could send core of the app logic in a static js file then have
>>>> the eval only making simple calls like "appui.getInvoices()". That
>>>> will perform fairly well.
>>>
>>> Yes, if the invoices do contain fairly limited and well-defined data
>>> that you know can *never* cause a problem.
>>>>
>>>>> The whole idea is pretty bad on the web, where you have to assume
>>>>> that people will actively try to break your stuff and steal data
>>>>> from other users (or you).
>>>>>
>>>>
>>>> That assumption is a bit dramatic. Questions on security cannot be
>>>> viewed in isolation of application. One of my favorite quotes from
>>>> Bruce Schneier is applicable here. He was once asked about the
>>>> possibility of chaos ensuing due to internet security breaches...
>>>>
>>>> "No. Chaos is hard to create, even on the Internet. Here's an
>>>> example. Go to Amazon.com. Buy a book without using SSL. Watch the
>>>> total lack of chaos."
>>>
>>> The idea is fairly dramatic, but the concept is basically that once
>>> someone's got an axe to grind against you or your applications, then
>>> someone actively trying to break your stuff is *actually* going to
>>> happen. A lax attitude is what made one of our products (at some
>>> previous job) vulnerable to Russian hackers who ended up emailing
>>> customers with addresses stolen straight out of our databases. When it
>>> happens, it's already too late to react.
>>>>
>>>> I don't see how you can canvas the "whole idea" as being bad. It may
>>>> require adjustments here and there, e.g. for particular pages where
>>>> paranoid security is needed, nothing stops you from doing it
>>>> differently there. You could crypto what's sent. You could even serve
>>>> those pages the standard way with static files and SSL if it makes
>>>> you feel safer.
>>>
>>> SSL is protecting you against things like man-in-the-middle attacks.
>>> Encryption helps you on other points. There is nothing there regarding
>>> problems with application-level security. The whole idea is not bad,
>>> but I would certainly want a serious specialist to look over my
>>> application if I were to use that trick in many places.
>>>
>>>
>>>>
>>>>>>
>>>>>> This technique is amazingly powerful.
>>>>>>
>>>>>> So now I only need one generic web page. Think of that.
>>>>>>
>>>>>> Only one page is needed - forever.
>>>>>>
>>>>> This is a problem when it comes to bookmarks, sharing the link with
>>>>> a friend, searchability, browser history, etc. The web wasn't
>>>>> exactly intended to be a stateful thing and you'll have to resort to
>>>>> hacks such as hash-bangs to get around it. I suggest reading Tim
>>>>> Bray's Broken Links to see why that isn't a good solution anyway.
>>>>>
>>>>
>>>> True. But this problem is an age-old general AJAX/dynamic-markup
>>>> problem. I agree it might be very visible in this case.
>>>>
>>>> However, I've written XULRunner apps with no back buttons -- no need
>>>> for them with easy-to-navigate UIs. Most Adobe AIR apps I've seen
>>>> have no browser history. It's made me question: How badly do
>>>> end-users really need those things? If they do, couldn't we give them
>>>> better application-specific versions inside our web-app UI?
>>>
>>> If I'm using a browser, I'd enjoy being able to use the web. What
>>> constitutes a 'very easy to use' application to you might not be the
>>> same for everyone. I do remember many flash pages falling prey to the
>>> same problem. I think this is mostly a deeply rooted problem in the
>>> web where you're piggy-back riding sessions on a protocol that was
>>> absolutely not made for that. It sometimes works well enough (I'm
>>> thinking of chat applications or even grooveshark here), so it's
>>> certainly not black and white, but I figure you know what I mean.
>>>>
>>>>> Plus I'd argue that javascript and Erlang should be kept separate
>>>>> and you shouldn't try to generate one with the other,
>>>>
>>>> Good point. I thought about sending the js in static files and
>>>> reducing the calls from Erlang to simple one-liners. But also note
>>>> that the more powerful aspect of this (IMO) is not just sending js,
>>>> but sending UI elements. Sending blocks of UI to an empty page! How
>>>> can anyone not like that?
>>>>
>>> Separation of concerns. JS is about behaviours on the page, dynamic
>>> content. UI is both HTML (structure) and CSS (presentation). One very
>>> simple question I like to ask to sort this out is "would I be able to
>>> hire a designer to work on my site without guiding them around too
>>> much?" "Could I hire someone to just work on my javascript and HTML
>>> without them needing to know anything else?"
>>>
>>> If you say no to these, you might have some overlapping domains in
>>> what you're doing.
>>>
>>> Then again, I'm a fan of really well-separated components in my
>>> applications, which is why I like Erlang's processes and OTP
>>> applications in the first place :)
>>>
>>> Another advantage of keeping things separate is caching -- this is
>>> however pretty application and audience specific in terms of needs and
>>> requirements.
>>>
>>>> - Edmond -
>>>>
>>>>> but at this point, I figure it's more of a matter of who wants to
>>>>> give himself the trouble than anything.
>>>>>
>>>>>
>>>>> --
>>>>> Fred Hébert
>>>>> http://www.erlang-solutions.com
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
>>>
>>>
>>> --
>>> Fred Hébert
>>> http://www.erlang-solutions.com
>>>
>>>
>>> ________________________________________________________________
>>> erlang-questions (at) erlang.org mailing list.
>>> See http://www.erlang.org/faq.html
>>> To unsubscribe; mailto:erlang-questions-unsubscribe@REDACTED
>>>
>>
>>
>> --
>> Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/