[erlang-questions] Two beautiful programs - or web programming made easy

Wed Feb 16 15:25:55 CET 2011

I'm a glass-half-full kinda guy :)

I'd like to think this is the kind of community where someone can say ..

"Hey, I've got this whacky idea. It's early stages but I think I'm onto  
something."

Then in *addition* to the community saying "Well, there are problems x, y
and z you may have overlooked", the community *also* says "Possibly p, q,  
and r might help with these."

I think if you look at what Joe's doing, it can be expanded on to get  
something very useful. Let's take some of everyone's concerns and actually  
*try* to give Joe some advice on how they might be addressed.

I've started inline, hopefully others can add some input...

On Tue, 15 Feb 2011 07:37:44 +1100, Frédéric Trottier-Hébert  
<fred.hebert@REDACTED> wrote:

>
>
> On 2011-02-14, at 15:17 PM, Edmond Begumisa wrote:
>
>> You've outlined a nice list of the top security concerns of *every*  
>> web-developer generating dynamic content from the client-side (probably
>> a good chunk of websites uploaded since 2005.)
>>
>> What I still don't get is why you find the generation of dynamic  
>> content from static js files acceptable while using eval for the same  
>> you find unacceptable. I don't see how XSS, CSRF, SQL Injection, and  
>> all the things you list are more unmanageable from static js that  
>> generates content vs streamed js that generates content.
>>
>
> The point of my security list wasn't that much about eval itself rather  
> than contradicting the precise point that 'all you need to do is encrypt  
> javascript'. That's a reductionist and erroneous view.
>
> Generating dynamic content from static JS files has a few advantages:
> caching on the browser side, distributing the code via CDNs rather than  
> through your app server,

How do AJAX sites solve this? One way is to break the "one page app" into  
a few pages. Maybe he could introduce a window concept that maps to a  
page...

Pid ! {new, window(...)}
.. work .. work...
Pid ! {new, window(...)}

There's a start.

> benefiting from JIT if available (rather than calling the  
> compiler/interpreter each time you update content), potential static  
> analysis of code (even through things like JS-Lint). Also a smaller  
> payload on the network and bandwidth -- if you only send in the  
> functions to run the code once rather than on every call, you'll save a  
> lot.

I suggested before sending parts of your app in ordinary static js files,  
then call the functions as libraries. The benefits above will start to be  
felt.
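
A rough sketch of what I mean, in JavaScript. The names here (lib,
setTitle, appendRow, run) are made up for illustration and are not from
Joe's code; "new Function" stands in for the eval step on the client. The
bulk of the code sits in a static file the browser can cache, and what
gets streamed is only a short call into it:

```javascript
// Hypothetical static library: would be served once as an ordinary,
// cacheable .js file (all names here are illustrative assumptions).
const lib = {
  setTitle(state, text) {
    state.title = String(text);
  },
  appendRow(state, cells) {
    state.rows.push(cells.map(String));
  },
};

// Stand-in for the client-side eval of streamed code. The streamed
// snippet only sees the two names passed in: lib and state.
function run(state, streamedJs) {
  new Function("lib", "state", streamedJs)(lib, state);
  return state;
}

// A plain object standing in for the page being built.
const page = { title: "", rows: [] };

// What goes over the wire is now one short line per update.
run(page, 'lib.setTitle(state, "Inbox");');
run(page, 'lib.appendRow(state, ["joe", "hello"]);');
```

The streamed strings stay tiny no matter how big the library functions
get, which is where the bandwidth and caching benefits would come from.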

Hmmm.. I wonder what Nitrogen does, they might have some tricks.

> In most cases, rendering the page (CSS included), running the JS and  
> transferring the data counts for 90% of the time a user will wait when  
> querying a page. Streaming JS to then evaluate it is going to be  
> terrible for performance on larger-scale applications.

I dunno about this. Might be a bit of a blanket statement. Wasn't the very  
reason AJAX came around to INCREASE performance of larger scale  
applications *precisely* by streaming markup and JS for evaluation and  
rendering on-demand rather than all-at-once because in the latter case you  
normally send more than is actually needed?

>  There's probably more to add to the list, but that's what I can think  
> of in 15-30 seconds.

Likewise, I could probably add more but these are the ideas for improving  
Joe's concept that I can come up with in 15-30 seconds ;) Others can pitch  
in.

>
> And things like what I mentioned are not more or less unmanageable from  
> static files (I agree with you there), except for XSS:
>
> XSS is better treated in many cases by things like JS frameworks. If I'm  
> getting the result from some web query into JS, I will receive a neat  
> string, without a chance of it being wrong. If I then push this string  
> through my framework (say JQuery), it'll take care of doing specific  
> escaping of things like element attributes, element content, etc.

What's stopping you?

As I illustrated in the previous mail on security, you can call JQuery  
from code that's being run in eval too! Joe's calling everything from
JQuery to SVG libraries!

> If I do it dynamically through my applications, chances are much better  
> I'll get the escaping wrong in Erlang (and you need to escape on more  
> levels) than JS, where it can be made on a per-element basis when  
> building the DOM: "create a tag, add the attribute, add another one, add  
> the tag's content, push it" vs. "mix and match all these strings into
> hopefully valid JS". This is even truer when you consider hacks such as  
> Google's UTF-7 encounter back in the day. This follows the idea that JS  
> knows JS better.
>

Right, let's convert that statement from a criticism into a really good
piece of advice:

Joe: That code where you're manipulating the DOM, where you do  
".insertElement" and such, know what? Better do that via JQuery instead.
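
To make the escaping point concrete, here is a minimal sketch in plain
JavaScript (no jQuery and no DOM, so it runs anywhere). escapeHtml and tag
are hypothetical helpers, a much simplified stand-in for what a library
does when you build an element piece by piece instead of gluing strings
together:

```javascript
// Escape the characters that are significant in HTML text and in
// quoted attribute values. "&" goes first so entities produced by the
// later replacements are not escaped twice.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Build one element. Attribute values and text content are escaped
// individually; tag and attribute names are assumed to be trusted.
function tag(name, attrs, text) {
  const attrText = Object.keys(attrs)
    .map((key) => " " + key + '="' + escapeHtml(attrs[key]) + '"')
    .join("");
  return "<" + name + attrText + ">" + escapeHtml(text) + "</" + name + ">";
}

const userInput = '<script>alert("x")</script>';

// "Mix and match strings": the input lands in the markup untouched.
const naive = "<p title='" + userInput + "'>" + userInput + "</p>";

// Per-element construction: every piece is escaped where it is used.
const safe = tag("p", { title: userInput }, userInput);
```

This only covers HTML text and quoted attributes. Other contexts (URLs,
inline script, CSS) need their own rules, which is one more reason to
lean on a library for it.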

> That's the same reason why you might want things like Erlang handling  
> Erlang parsing, SQL handling its own escaping, etc.  If you generate and  
> send JS as one over the wire, you will have to double-check it  
> server-side.

How about adding templating? (I suggested this to Joe off-list)...

One could possibly use leex/yecc to compile say "std.tpl" file and access  
it from Erlang code like so...

   Pid ! {insert, std.grid(List)}

which might use the content of std.tpl to produce...

   Pid ! {insert, <<"<table>blah blah</table>">>}  % Or <<"createElement(blah)">>

which might then stream to the client...

   document.body.innerHTML = "<table>blah blah</table>"; /* Or a createElement or jQuery insert call */

The nice thing with this is, std.tpl could have versions.

Joe likes SVG: so his std.grid(List) might produce some fancy SVG code.
I like XUL: so my std.grid(List) might produce "<grid>blah</grid>"

Templating might be extendable, so you might have an app specific my.tpl  
which extends on std.tpl...

so when you: Pid ! {new, my.wnd(..)}

the client gets a new page with stylesheet references, script tags, etc,  
specified in the my.tpl
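
Here's a rough sketch of the shape I have in mind, written in JavaScript
only to keep it short; the real thing would be compiled from .tpl files on
the Erlang side. Everything named here (esc, stdHtml, stdXul, my, my.css,
my.js) is an assumption for illustration, and the XUL output is heavily
simplified:

```javascript
// Minimal escaping helper for text and quoted attribute values.
function esc(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// One version of the hypothetical "std" template set: emits HTML.
const stdHtml = {
  grid(rows) {
    const body = rows
      .map((row) => "<tr>" + row.map((c) => "<td>" + esc(c) + "</td>").join("") + "</tr>")
      .join("");
    return "<table>" + body + "</table>";
  },
};

// Another version with the same interface: emits simplified XUL-like markup.
const stdXul = {
  grid(rows) {
    const body = rows
      .map((row) => "<row>" + row.map((c) => '<label value="' + esc(c) + '"/>').join("") + "</row>")
      .join("");
    return "<grid><rows>" + body + "</rows></grid>";
  },
};

// App-specific set extending std: inherits grid and adds wnd, which
// wraps content in a page with its own stylesheet and script references.
const my = Object.assign(Object.create(stdHtml), {
  wnd(title, rows) {
    return "<html><head><title>" + esc(title) + "</title>" +
      '<link rel="stylesheet" href="my.css">' +
      '<script src="my.js"></script></head>' +
      "<body>" + this.grid(rows) + "</body></html>";
  },
});

const htmlGrid = stdHtml.grid([["a", "<b>"]]);
const xulGrid = stdXul.grid([["a"]]);
const window1 = my.wnd("Inbox", [["x"]]);
```

The calling code stays the same whichever version of std is plugged in,
and escaping lives in one place instead of in every caller.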

With all the great minds on this list, surely suggestions could be made to  
turn this early one-paged code into something more and more useful??

- Edmond -

> Unless you're running with node.js, that's going to be annoying for no  
> good reason.

>
> For CSRF, there is likely no impact at all. It's a question of shared
> data between the server and HTML forms. How that data gets there is not  
> really important at first. I could be wrong on that one though and it  
> might be worse than what I expect. For SQL injection, it's all about the  
> last line of defence before sending the data to your DB engine. If you  
> treat it in JS, God have mercy on your application.
>
> But yeah, this little security roundup was again to comment on the
> 'encrypting your JS is what you need' comments. There's a safety element
> to using eval, and also performance, clarity and semantic concerns to be  
> had.
>
>> - Edmond -
>>
>>
>> On Mon, 14 Feb 2011 23:43:57 +1100, Frédéric Trottier-Hébert  
>> <fred.hebert@REDACTED> wrote:
>>
>>> On 2011-02-14, at 03:35 AM, Joe Armstrong wrote:
>>>>
>>>> Ok so "separation of concerns" is good but having different notations  
>>>> for expressing the concerns
>>>> is crazy- to make a web thing that interacts with a server you need  
>>>> to learn something like
>>>>
>>>>        HTML
>>>>        Javascript
>>>>        CSS
>>>>        PHP
>>>>        MySQL
>>>>
>>>> And to be able to configure Apache and MySQL - other combinations are  
>>>> possible.
>>>
>>> I can agree with that. To have a functional website, you do need to  
>>> know a lot of different technologies. The web evolved organically and  
>>> each part of the problem space had its own solution developed over  
>>> time.
>>>
>>>>
>>>> Then you have to split the flow of control to many places.
>>>>
>>>> All of this is crazy madness. There should be *one* notation that is  
>>>> powerful enough to express all
>>>> these things. In the browser it seems sensible to forget about css
>>>> and html and only use Javascript.
>>>> The only communication with the browser should be by sending it  
>>>> javascript.
>>>
>>> There should, but there isn't. The truth here is that most programmers  
>>> are awful at design. In any somewhat large setup, your backend  
>>> programmers, your designers and your integrators (the guys just  
>>> handling HTML, and CSS, maybe some Javascript) are not necessarily the  
>>> same person.
>>>
>>> Right now the ring of web technologies is divided in a way that makes  
>>> it somewhat simple to have different people with different backgrounds
>>> and knowledge work on different parts of your software. It makes
>>> sense to have the designer or integrator to be able to change the look  
>>> and feel of a website without having to play in your code and maybe  
>>> mess up database queries. Modern template engines in fact try to  
>>> forbid all kinds of seriously side-effecting code (like DB queries)  
>>> from happening in the templates.
>>>
>>> There should be no worry for your guy working in Javascript that he'd
>>> suddenly need to learn Erlang to be able to debug your application.
>>> Then again, this separation of concerns allows specialists to work on  
>>> their speciality with more ease. It makes things somewhat simpler in  
>>> larger organisations, but quite painful for one-man operations. I'll  
>>> tell you that it makes a lot of sense when you know all of the tools  
>>> in the toolkit though :)
>>>
>>>> How you generate the javascript is irrelevant - by hand or by program  
>>>> - who cares. If you make it by
>>>> program the chances are that it's right.
>>>>
>>> Yes and no. Generated javascript is nearly as old as the language --  
>>> many, many .NET apps had that kind of thing. Some editors like
>>> Dreamweaver could generate JS for you. One of the problems with this is
>>> that it was often pure garbage, or it wouldn't work in all browsers  
>>> uniformly. If you can manage to generate and capture complex  
>>> behaviours in a compliant manner, all the better. I have myself lost  
>>> much hope with regards to that though.
>>>
>>>> Security is orthogonal to this - send encrypted js over the wire and  
>>>> make sure your key-rings are secure
>>>> this is a completely different problem.
>>>
>>> This is only transmission security. Encryption has nothing to do with  
>>> Cross-Site Scripting (XSS, where some user is able to run arbitrary JS  
>>> in your page for you and ends up stealing information), Cross Site  
>>> Request Forgery (CSRF, where the attacker uses the fact your  
>>> application is forgetting about things like the origin of the queries  
>>> to hijack the client's session in their place. This is related to Same  
>>> Origin Policy issues and not easy to handle), SQL injection,  
>>> overwriting some parameters because you don't fetch them in the right  
>>> order server-side (see problems with the $_REQUEST variable in PHP),  
>>> etc.
>>>
>>> 1. XSS
>>> XSS is, as mentioned above, the ability to run arbitrary JS on a page.
>>> This is the risky thing with your eval.  
>>> http://en.wikipedia.org/wiki/Cross-site_scripting contains many  
>>> details on understanding the related issues. It's not always a simple  
>>> matter of escaping. Some more advanced attacks even rely on string  
>>> encoding to make sure your escaping fails. See  
>>> http://www.governmentsecurity.org/forum/index.php?showtopic=18105.
>>>
>>> 2. CSRF
>>> CSRF is a tricky thing. Because HTTP doesn't support sessions, over  
>>> the years, the guys from Netscape (back then) or Opera (or whoever)  
>>> ended up using Cookies to share data on every query. What happens  
>>> there is that on every query the browser sends to a server, it also  
>>> packages the cookies neatly in the headers -- no matter what page you  
>>> were on when they were sent. The issue here is that the server might  
>>> not check from what page the call is coming from.
>>>
>>> Basically, if twitter had a URL call such as
>>> http://twitter.com/tweet/add?message=SomeMessageHere that would  
>>> automatically add a tweet from your account and I put that link in an  
>>> image tag on some site, every time you would load that image, you  
>>> would automatically make a call to the server, your browser sending in  
>>> your cookies and making it look like YOU actually made that call, even  
>>> if you didn't know. In this case, the request is especially easy to do  
>>> because twitter would be using GET parameters to have side-effects on  
>>> the server. By forcing people into using POST, you can make things  
>>> harder, but not impossible.
>>>
>>> One way to work against POST is using a fake website -- let's say I use
>>> learnyousomeerlang.com. On my own site, I'll be putting a fake  
>>> javascript form inside an iframe (so that the page doesn't refresh  
>>> when submitted) and have the script automatically send in the POST  
>>> form. Now I send the link to my trick page over twitter and everyone  
>>> who clicks on it from there will be guaranteed to have their session  
>>> open and sending in data. I've in fact used this trick to get the
>>> site owner at my old job to close his own admin account on his own
>>> website so he could realise the importance of the threat.
>>>
>>> How can you solve this one? Well there are a few ways -- for one you  
>>> could check the HTTP referrer, but that won't work everywhere -- if  
>>> you expect calls from flash, it doesn't always send these elements of  
>>> the HTTP header. In the case of HTTPS, depending on how you handle  
>>> things, the header might not always be sent either so you can't know  
>>> for sure. Better than that, if I'm using the <img> trick on your own  
>>> website (on twitter, for twitter users), the domain will be the same,  
>>> without you being able to check for anything.
>>>
>>> The only foolproof way to do this is to use what they call 'tokens':  
>>> each call you make to the server has to have a unique piece of data  
>>> that the server knows about that can prove that the call you just made  
>>> comes from you, but also from your own forms on your own websites.  
>>> These tokens should have an expiration time and be hidden from plain  
>>> view, submitted automatically with any form. If you don't have this,  
>>> your application might not be safe.
>>>
>>> This has *nothing* to do with encryption, and everything to do with  
>>> not understanding the potential threats of the web correctly. It is an  
>>> application-level issue, much like XSS is. And it's pretty damn  
>>> important.
>>>
>>> 3. SQL injection is a different beast, where you do not properly  
>>> escape the parameters of a request going to the database, letting an
>>> attacker run arbitrary DB calls. Erlang with Mnesia doesn't have to worry about
>>> that, but Erlang with any SQL has to, even if you end up using QLC (it  
>>> depends on the library at the back in this case though, and is  
>>> generally safe enough). http://en.wikipedia.org/wiki/Sql_injection has  
>>> sufficient details.
>>>
>>> 4. You have to consider that sometimes these attacks are combined
>>> together to be able to really do damage.
>>>
>>> I haven't even covered using weak hashing for passwords, bad security  
>>> policies on cookies, opening files on dynamic paths without filtering  
>>> the input, etc.
>>>
>>> Web application security is not a joke and it's certainly not easy.  
>>> It's a very serious thing and most developers get it wrong at one  
>>> point or another. Wordpress got it wrong, Twitter got it wrong,  
>>> facebook got it wrong, Google got it wrong, and so on, even though  
>>> they're supposed to be leaders in the field. Most of them got it wrong  
>>> more than once too. This is why I kind of support a 'paranoid' line of  
>>> thought.
>>>
>>> --
>>> Fred Hébert
>>> http://www.erlang-solutions.com
>>>
>>
>>
>> --
>> Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
>
