[erlang-questions] Two beautiful programs - or web programming made easy

Mon Feb 14 21:37:44 CET 2011

On 2011-02-14, at 15:17 PM, Edmond Begumisa wrote:

> You've outlined a nice list of the top security concerns of *every* web-developer generating dynamic content from the client-side (probably a good chunk of websites uploaded since 2005.)
> 
> What I still don't get is why you find the generation of dynamic content from static js files acceptable while using eval for the same you find unacceptable. I don't see how XSS, CSRF, SQL Injection, and all the things you list are more unmanageable from static js that generates content vs streamed js that generates content.
> 

The point of my security list wasn't so much about eval itself as about contradicting the precise claim that 'all you need to do is encrypt javascript'. That's a reductionist and erroneous view.

Generating dynamic content from static JS files has a few advantages: caching on the browser side, distributing the code via CDNs rather than through your app server, benefiting from JIT if available (rather than calling the compiler/interpreter each time you update content), potential static analysis of code (even through things like JSLint). There's also a smaller payload on the network and less bandwidth used -- if you only send the functions that run the code once rather than on every call, you'll save a lot. In most cases, rendering the page (CSS included), running the JS and transferring the data accounts for 90% of the time a user will wait when querying a page. Streaming JS to then evaluate it is going to be terrible for performance in larger scale applications. There's probably more to add to the list, but that's what I can think of in 15-30 seconds.

And things like what I mentioned are not more or less unmanageable from static files (I agree with you there), except for XSS:

XSS is better handled in many cases by things like JS frameworks. If I'm getting the result from some web query into JS, I will receive a neat string, without a chance of it being wrong. If I then push this string through my framework (say jQuery), it'll take care of the specific escaping needed for things like element attributes, element content, etc. If I do it dynamically through my application, chances are much higher I'll get the escaping wrong in Erlang (where you need to escape on more levels) than in JS, where it can be done on a per-element basis when building the DOM: "create a tag, add the attribute, add another one, add the tag's content, push it" vs. "mix and match all these strings into hopefully valid JS". This is even truer when you consider hacks such as Google's UTF-7 encounter back in the day. This follows the idea that JS knows JS better.
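
To make the "escape on more levels" point concrete, here is a minimal sketch of what server-side generation has to get right. It's in Python rather than Erlang or JS purely for illustration, and the three function names are hypothetical helpers, not part of any framework. The same value needs a different treatment depending on whether it lands in element content, in an attribute, or inside a script:

```python
import html
import json


def escape_html_text(value: str) -> str:
    """Escape a value destined for element content (hypothetical helper)."""
    # Only &, < and > matter here; quotes are harmless in text nodes.
    return html.escape(value, quote=False)


def escape_html_attr(value: str) -> str:
    """Escape a value destined for a quoted attribute (hypothetical helper)."""
    # quote=True also escapes " and ', so the value can't break out of the attribute.
    return html.escape(value, quote=True)


def escape_js_string(value: str) -> str:
    """Encode a value as a JS string literal embedded in a <script> block."""
    # json.dumps produces a valid JS string literal; "<" is escaped on top of
    # that so the value can't contain a literal "</script>".
    return json.dumps(value).replace("<", "\\u003c")


payload = '"><script>alert(1)</script>'

as_text = escape_html_text(payload)
as_attr = escape_html_attr(payload)
as_js = escape_js_string(payload)
```

Using the text escaper for an attribute leaves the double quote intact, which is exactly the kind of mix-up that is easy to make when gluing strings together and that building the DOM element by element avoids.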

That's the same reason why you might want things like Erlang handling Erlang parsing, SQL handling its own escaping, etc. If you generate and send JS in one piece over the wire, you will have to double-check it server-side. Unless you're running Node.js, that's going to be annoying for no good reason.

For CSRF, there is likely no impact at all. It's a question of shared data between the server and HTML forms. How that data gets there is not really important at first. I could be wrong on that one though and it might be worse than what I expect. For SQL injection, it's all about the last line of defence before sending the data to your DB engine. If you treat it in JS, God have mercy on your application.
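
As a rough sketch of that "last line of defence", assuming Python's standard sqlite3 module and a made-up users table (both function names are hypothetical), the difference between letting the DB layer bind a parameter and splicing it into the query text looks like this:

```python
import sqlite3


def find_user(conn: sqlite3.Connection, name: str) -> list:
    """Look a user up with a bound parameter."""
    # The "?" placeholder is bound by the driver; the value never becomes SQL text.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    """Anti-pattern, shown only for contrast: the value is spliced into the SQL."""
    cur = conn.execute("SELECT id, name FROM users WHERE name = '" + name + "'")
    return cur.fetchall()


# Throwaway in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

malicious = "nobody' OR '1'='1"

safe_rows = find_user(conn, malicious)           # treated as an odd-looking name
leaked_rows = find_user_unsafe(conn, malicious)  # the OR clause matches every row
```

The bound version treats the whole input as a name and finds nothing, while the concatenated version hands the attacker the entire table. Whatever the client-side JS did or didn't filter makes no difference to either outcome.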

But yeah, this little security roundup was again a response to the 'encrypting your JS is all you need' comments. There's a safety element to using eval, and also performance, clarity and semantic concerns to be had.

> - Edmond -
> 
> 
> On Mon, 14 Feb 2011 23:43:57 +1100, Frédéric Trottier-Hébert <fred.hebert@REDACTED> wrote:
> 
>> On 2011-02-14, at 03:35 AM, Joe Armstrong wrote:
>>> 
>>> Ok so "separation of concerns" is good but having different notations for expressing the concerns
>>> is crazy- to make a web thing that interacts with a server you need to learn something like
>>> 
>>>        HTML
>>>        Javascript
>>>        CSS
>>>        PHP
>>>        MySQL
>>> 
>>> And to be able to configure Apache and MySQL - other combinations are possible.
>> 
>> I can agree with that. To have a functional website, you do need to know a lot of different technologies. The web evolved organically and each part of the problem space had its own solution developed over time.
>> 
>>> 
>>> Then you have to split the flow of control to many places.
>>> 
>>> All of this is crazy madness. There should be *one* notation that is powerful enough to express all
>>> these things. In the browser it seems sensible to forget about css and html only use Javascript
>>> The only communication with the browser should be by sending it javascript.
>> 
>> There should, but there isn't. The truth here is that most programmers are awful at design. In any somewhat large setup, your backend programmers, your designers and your integrators (the guys just handling HTML, and CSS, maybe some Javascript) are not necessarily the same person.
>> 
>> Right now the ring of web technologies is divided in a way that makes it somewhat simple to have different people with different backgrounds and knowledge work on different parts of your software. It makes sense for the designer or integrator to be able to change the look and feel of a website without having to play in your code and maybe mess up database queries. Modern template engines in fact try to forbid all kinds of seriously side-effecting code (like DB queries) from happening in the templates.
>> 
>> Your guy working in Javascript shouldn't have to worry about suddenly needing to learn Erlang to be able to debug your application.
>> Then again, this separation of concerns allows specialists to work on their speciality with more ease. It makes things somewhat simpler in larger organisations, but quite painful for one-man operations. I'll tell you that it makes a lot of sense when you know all of the tools in the toolkit though :)
>> 
>>> How you generate the javascript is irrelevant - by hand or by program - who cares. If you make it by
>>> program the chances are that it's right.
>>> 
>> Yes and no. Generated javascript is nearly as old as the language -- many, many .NET apps had that kind of thing. Some editors like Dreamweaver could generate JS for you. One of the problems with this is that it was often pure garbage, or it wouldn't work in all browsers uniformly. If you can manage to generate and capture complex behaviours in a compliant manner, all the better. I have myself lost much hope with regards to that though.
>> 
>>> Security is orthogonal to this - send encrypted js over the wire and make sure your key-rings are secure
>>> this is a completely different problem.
>> 
>> This is only transmission security. Encryption has nothing to do with Cross-Site Scripting (XSS, where some user is able to run arbitrary JS in your page for you and ends up stealing information), Cross Site Request Forgery (CSRF, where the attacker uses the fact your application is forgetting about things like the origin of the queries to hijack the client's session in their place. This is related to Same Origin Policy issues and not easy to handle), SQL injection, overwriting some parameters because you don't fetch them in the right order server-side (see problems with the $_REQUEST variable in PHP), etc.
>> 
>> 1. XSS
>> XSS is, as mentioned above, the ability to run arbitrary JS on a page. This is the risky thing with your eval. http://en.wikipedia.org/wiki/Cross-site_scripting contains many details on understanding the related issues. It's not always a simple matter of escaping. Some more advanced attacks even rely on string encoding to make sure your escaping fails. See http://www.governmentsecurity.org/forum/index.php?showtopic=18105.
>> 
>> 2. CSRF
>> CSRF is a tricky thing. Because HTTP doesn't support sessions, over the years, the guys from Netscape (back then) or Opera (or whoever) ended up using Cookies to share data on every query. What happens there is that on every query the browser sends to a server, it also packages the cookies neatly in the headers -- no matter what page you were on when they were sent. The issue here is that the server might not check what page the call is coming from.
>> 
>> Basically, if twitter had a URL call such as http://twitter.com/tweet/add?message=SomeMessageHere that would automatically add a tweet from your account and I put that link in an image tag on some site, every time you loaded that image, you would automatically make a call to the server, your browser sending in your cookies and making it look like YOU actually made that call, even if you didn't know. In this case, the request is especially easy to do because twitter would be using GET parameters to have side-effects on the server. By forcing people into using POST, you can make things harder, but not impossible.
>> 
>> One way to work around POST is using a fake website -- let's say I use learnyousomeerlang.com. On my own site, I'll be putting a fake javascript form inside an iframe (so that the page doesn't refresh when submitted) and have the script automatically send in the POST form. Now I send the link to my trick page over twitter and everyone who clicks on it from there will be guaranteed to have their session open and sending in data. I've in fact used this trick to get the site owner at my old job to close his own admin account on his own website so he could realise the importance of the threat.
>> 
>> How can you solve this one? Well there are a few ways -- for one you could check the HTTP referrer, but that won't work everywhere -- if you expect calls from flash, it doesn't always send these elements of the HTTP header. In the case of HTTPS, depending on how you handle things, the header might not always be sent either so you can't know for sure. Better than that, if I'm using the <img> trick on your own website (on twitter, for twitter users), the domain will be the same, without you being able to check for anything.
>> 
>> The only foolproof way to do this is to use what they call 'tokens': each call you make to the server has to have a unique piece of data that the server knows about that can prove that the call you just made comes from you, but also from your own forms on your own websites. These tokens should have an expiration time and be hidden from plain view, submitted automatically with any form. If you don't have this, your application might not be safe.
>> 
>> This has *nothing* to do with encryption, and everything to do with not understanding the potential threats of the web correctly. It is an application-level issue, much like XSS is. And it's pretty damn important.
>> 
>> 3. SQL injection
>> SQL injection is a different beast, where you do not properly escape the parameters of a request going to the database, letting an attacker run arbitrary DB calls. Erlang with Mnesia doesn't have to worry about that, but Erlang with any SQL has to, even if you end up using QLC (it depends on the library at the back in this case though, and is generally safe enough). http://en.wikipedia.org/wiki/Sql_injection has sufficient details.
>> 
>> 4. You have to consider that sometimes these attacks are combined together to be able to really do damage.
>> 
>> I haven't even covered using weak hashing for passwords, bad security policies on cookies, opening files on dynamic paths without filtering the input, etc.
>> 
>> Web application security is not a joke and it's certainly not easy. It's a very serious thing and most developers get it wrong at one point or another. WordPress got it wrong, Twitter got it wrong, Facebook got it wrong, Google got it wrong, and so on, even though they're supposed to be leaders in the field. Most of them got it wrong more than once too. This is why I kind of support a 'paranoid' line of thought.
>> 
>> --
>> Fred Hébert
>> http://www.erlang-solutions.com
>> 
> 
> 
> -- 
> Using Opera's revolutionary e-mail client: http://www.opera.com/mail/