[erlang-questions] Two beautiful programs - or web programming made easy

Wed Feb 16 14:37:05 CET 2011

Comments inline and at the end.
On 2011-02-16, at 07:48 AM, Edmond Begumisa wrote:

> On Security:
> 
> I think if you take a closer look at _how_ Joe is actually using eval, you'll find you need not be so alarmed by it. You'll find the *same* security issues all web-programmers are used to. You'll find you can deal with them in the *same* ways.
> 
> See comments inline...
> 
> But it gets more risky in *both* cases, in equal magnitude. Doing a) instead of b) won't protect if you start "taking in strings with random input". Let's analyse how it "gets more risky" and then compare it to Joe's code...
> 
> Say programmer Helga stores a value into a Key-Value DB carelessly from the browser like so...
> 
>  /* NB: We cannot trust ::textContent here coz we're saving it for later use */
> 
>  var f = getElementById('f').textContent;  // Say f = 'x + y'
>  kvSaveToServer("f", f); // Server stores f verbatim
> 
> Then later, on another page, she tries a little calc UI...
> 
>  /* NB: We can reasonably trust ::textContent here coz it's only used on the same page */
> 
>  var x = getElementById('x').textContent;  // Trusted
>  var y = getElementById('y').textContent;  // Trusted
>  var f = kvReadFromServer('f');  // Untrusted (f = 'x + y', but our saving was dubious)
>  var z = eval(f);
> 
> The warning from Mozilla and others concerning eval here is clear and every web-developer is aware of it: Helga can't really be sure what f contains because it's code that comes from someone other than her (e.g. a malicious user when it was saved.) Hence the general advice to avoid eval(). But if she *is* sure what f contains because she damn well wrote it like so...
> 
>  var x = getElementById('x').textContent; // Trusted
>  var y = getElementById('y').textContent; // Trusted
>  var z = eval('x + y');  // Trusted (Evaling code we've written)
> 
> Then this is no different from...
> 
>  var x = getElementById('x').textContent; // Trusted
>  var y = getElementById('y').textContent; // Trusted
>  var z = x + y; // Trusted (code we've written)
> 
> It makes no difference whether the *static* 'x + y' comes streaming in from the server and is run as eval('x + y') in the former case, or if it comes locally from a js file that also comes from the server in the latter case (there might be a scope issue in the former but we'll get to that.)
> 
> Now let's take it further and load the arbitrary strings you mention within the eval'ed code (closer to what Joe's doing but not quite). I believe this is the code that alarms you...
> 
>  /* Assume keys x and y were saved earlier using careless getElementById('x').textContent verbatim */
> 
>  var x = kvReadFromServer('x'); // Untrusted
>  var y = kvReadFromServer('y'); // Untrusted
>  var z = eval('x + y');  // **Looks trusted but is not**
> 
> However, and here's half the point I was making, this is the same as...
> 
>  var x = kvReadFromServer('x'); // Untrusted
>  var y = kvReadFromServer('y'); // Untrusted
>  var z = x + y;  // **Looks trusted but is not**
> 
> So the non-eval'ed code here is just as risky and deceptive as the eval'ed code. Helga avoiding eval on *code that she wrote* (former) and using a non-eval'ed version *that she wrote* (latter) will not protect her. Using the former is not _increasing_ her susceptibility to code injections from the "random strings" coming from others in the database. She's equally screwed either way by her carelessness with the DB.
> 
> **BUT**, and this is key, that's not even what Joe's doing! Basically, Joe is taking this supposedly safer latter static version, and eval'ing it like so...
> 
>  var c = "var x = kvReadFromServer('x'); // Untrusted\n" +
>          "var y = kvReadFromServer('y'); // Untrusted\n" +
>          "var z = x + y;  // **Looks trusted but is not**";
>  eval(c);
> 
> Joe is taking the code that you'd prefer to see in a static file, and just eval'ing it. Yet, and this is the second half of my point, any carelessness in either is equal. I find it hard to be petrified by the lower version while being more comfortable with the one above because it's somehow "less risky." The real problem is the untrusted values.
> 
> AFAIK (at least with the Mozilla code-base, with which I'm fairly familiar), using eval like this *introduces* only one *new* security concern: 3rd party js can see the scope in which eval was invoked. And if you're using 3rd party js that you can't trust, you're probably screwed anyway :) All other security concerns are the *same* in both versions.

Nothing to argue there. Again, specific use cases can be safe. My biggest worry has to do with escaping and how to do it properly. There's an inherent risk in using dynamic data in all cases. The last way (with 'eval(c)') is still safe when you know exactly what data you get in.

To quote myself once more: as a general (and generic) pattern, the eval() in Joe's code worries me. Individual cases can be tested and proven safe one by one without too much trouble.

> 
> Now, getting more realistic, he'll obviously want to do the reading server side, thus replacing kvReadFromServer('x') with <<"'", Val, "'">> cat'ed into the eval'ed string before it's delivered. So he now gets something like...
> 
>  var c = "var x = 'valfromuser'; // Untrusted\n" +
>          "var y = 'valfromuser'; // Untrusted\n" +
>          "var z = x + y;  // **Looks trusted but is not**";
>  eval(c);
> 
> AHA! You raised a valid concern about how doing the proper escaping would be tricky to get correct, esp. from Erlang, and that jQuery does this sort of thing better. But nothing stops Joe from making calls to jQuery with the code he pushes (actually, he does use jQuery), or from using the native JSON parser...
> 
>  var c = "var x = JSON.parse('valfromuser'); // Trusted\n" +
>          "var y = JSON.parse('valfromuser'); // Trusted\n" +
>          "var z = x + y;  // Trusted";
>  eval(c);
> 
> - Edmond -

Yes, hard-coding all the calls to jQuery could help. But only slightly so. You now need to escape on many levels:

- Is the user submitting data that could break HTML? If so, is the user breaking it with:
   - invalid tags?
   - invalid tag attributes?
   - invalid tag content?
   - are URLs valid?

The rules for an HTML element's attributes are not the same as those for the tag's content. How deep does the nesting go? Do you handle all cases? That's where you want your framework to act, and yes, you could call it from code generated in Erlang.
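
A minimal sketch of why the context matters, in JavaScript. The helper names escapeText and escapeAttr are made up for this example and do not come from any framework; it only covers tag content and double-quoted attributes, and a real application should lean on a maintained library instead.

```javascript
// Hypothetical helpers, for illustration only.

// Escape a value placed between tags: <p>HERE</p>
function escapeText(s) {
  return String(s)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Escape a value placed inside a quoted attribute: <a title="HERE">
// Quotes must be escaped too, or the value can close the attribute.
function escapeAttr(s) {
  return escapeText(s)
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

var input = '" onmouseover="alert(1)';

// Content-style escaping used in attribute context: the quote survives
// and the input adds an event handler attribute of its own.
var unsafe = '<a title="' + escapeText(input) + '">link</a>';

// Attribute-style escaping keeps the whole input inside title="...".
var safer = '<a title="' + escapeAttr(input) + '">link</a>';
```

Here unsafe ends up containing a live onmouseover attribute, while safer keeps the input as an inert title value. URLs and nested contexts (say, JS inside an attribute) would each need rules of their own.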

- Is the user submitting data that could break Javascript itself?

Right now this might be the biggest risk (if not one of the only ones) of using Joe's method, provided you do take care that everything goes through a framework. Your 'valfromuser' here could be the culprit, and you need a pass over your data in Erlang to make sure it can't break the code when it is spliced into the string to eval.
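
To make the 'valfromuser' worry concrete, here is a small sketch in JavaScript. The function naiveQuote is a hypothetical stand-in for wrapping the value in single quotes on the server (the <<"'", Val, "'">> idea), and the hostile value is invented for the example.

```javascript
var injected = false;

// Hypothetical stand-in for quoting done on the backend: wrap in
// single quotes and do nothing else.
function naiveQuote(val) {
  return "'" + val + "'";
}

// A value a user might submit on purpose.
var hostile = "'; injected = true; var ignored = '";

// The generated code reads:
//   var x = ''; injected = true; var ignored = '';
var naiveCode = "var x = " + naiveQuote(hostile) + ";";
eval(naiveCode);
var naiveWasInjected = injected; // true: the value ran as code

// Encoding the value as a JSON string literal keeps it as data.
injected = false;
var roundTripped = eval(JSON.stringify(hostile));
var encodedWasInjected = injected; // false: the value stayed a string
```

JSON.stringify is used here only as one well-known encoder. It is not a complete answer on its own: for instance, it does not escape a "</script>" sequence, which matters if the generated code is placed inline in an HTML page.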

Escaping JS is conceptually fairly simple, but unless you know all the tricks, nothing guarantees you won't be caught by a thing like Google's UTF-7 hack, or some other weird escaping technique.

The point here is that you might be trying to escape and handle data on the backend, while it will run on the frontend. It is generally just saner to send it over to JS in these cases. Again, JS knows JS better and you'll generally be safer. 
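
One way to read "send it over to JS" is sketched below: the code is static and written once, and the user's values travel separately as JSON text. The function name computeZ and the payload are assumptions made up for this example, not anything from Joe's code.

```javascript
// Static code, written by the developer and never assembled from
// user input.
function computeZ(payloadText) {
  // JSON.parse only builds data; it throws a SyntaxError on anything
  // that is not valid JSON instead of running it.
  var data = JSON.parse(payloadText);
  return String(data.x) + String(data.y);
}

// Hypothetical response body, as a server might send it. The y value
// is the kind of string that breaks naive quoting.
var payloadText = '{"x": "1", "y": "\'; alert(1); \'"}';

var z = computeZ(payloadText); // plain string concatenation

// Malformed or code-like input is rejected rather than executed.
var rejected = false;
try {
  computeZ("alert(1)");
} catch (e) {
  rejected = e instanceof SyntaxError;
}
```

The value still has to be escaped for HTML before it is shown on the page, but at this stage it is only ever treated as a string.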

You have to know: do all browsers handle JS the same way? Can something break things in one browser but not the others? You might already know that older versions of IE accept things like backticks (`) as a valid quotation character, or that certain HTML entities can do the job fine while your browser of choice might not accept them the same way (I assume your browser of choice is not some old IE). Are you sure you're going to handle these cases correctly all of the time when inputting user data as variables or as strings in your code?

I know I wouldn't be so sure myself. If different browsers handle things differently (and that might also depend on the doctype of your page, so there's no foolproof solution from the backend's point of view without additional user knowledge), can you again give me a guarantee that your code is safe? And if it's safe today, will it still be safe tomorrow? Will it be safe on all of my pages, even if my frontend guy ends up changing things?

I'm pretty sure that it's less effort to just keep your JS framework of choice up to date and focus on making sure the data won't break HTML once it's been used with JS, but that might just be me.

- Are you going to mess up the content your users provided by trying to escape it?

This isn't a security concern, but an application-level one. Erlang's modules are in latin-1 by default. Erlang can handle UTF-8. What's your webpage in? Are the input and the output in the same encoding? Will manipulating and escaping your data in Erlang risk messing it up? This is somewhat easy to solve, but yeah. Just another concern.
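
As a small illustration of the kind of damage an encoding mismatch does, here is a sketch that assumes Node.js (Buffer is a Node API and is not available in browsers). It is not Erlang, but the same thing happens whenever UTF-8 bytes are read as latin-1 somewhere along the way.

```javascript
// 'H\u00e9bert' is "Hebert" with an e-acute as its second letter.
var original = 'H\u00e9bert';

// The e-acute becomes two bytes in UTF-8 (0xC3 0xA9).
var utf8Bytes = Buffer.from(original, 'utf8');

// Reading those bytes as latin-1 turns each byte into its own
// character, so the one accented letter shows up as two characters.
var misread = utf8Bytes.toString('latin1');

// Reading them with the encoding they were written in is lossless.
var reread = utf8Bytes.toString('utf8');
```

Here misread has seven characters instead of six, while reread is identical to original.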

Ultimately, it's your application and I won't be trying to break into your office to scold you for using things I don't like. You know the kind of payload your site could represent. You might or might not fully know how safe your page is. You make a judgement call, but I felt like voicing my worries over this thread because a lot of people here didn't seem to worry at all (which worried me more!)

--
Fred Hébert
http://www.erlang-solutions.com