[erlang-questions] Two beautiful programs - or web programming made easy
Edmond Begumisa
ebegumisa@REDACTED
Wed Feb 16 15:25:55 CET 2011
I'm a glass-half-full kinda guy :)
I'd like to think this is the kind of community where someone can say ..
"Hey, I've got this whacky idea. It's early stages but I think I'm onto
something."
Then in *addition* to the community saying "Well, there are problems x, y
and z you may have overlooked", the community *also* says "Possibly p, q,
and r might help with these."
I think if you look at what Joe's doing, it can be expanded on to get
something very useful. Let's take some of everyone's concerns and actually
*try* to give Joe some advice on how they might be addressed.
I've started inline, hopefully others can add some input...
On Tue, 15 Feb 2011 07:37:44 +1100, Frédéric Trottier-Hébert
<fred.hebert@REDACTED> wrote:
>
>
> On 2011-02-14, at 15:17 PM, Edmond Begumisa wrote:
>
>> You've outlined a nice list of the top security concerns of *every*
>> web-developer generating dynamic content from the client-side (probably
>> a good chunk of websites uploaded since 2005.)
>>
>> What I still don't get is why you find the generation of dynamic
>> content from static js files acceptable while using eval for the same
>> you find unacceptable. I don't see how XSS, CSRF, SQL Injection, and
>> all the things you list are more unmanageable from static js that
>> generates content vs streamed js that generates content.
>>
>
> The point of my security list wasn't so much about eval itself as about
> contradicting the precise point that 'all you need to do is encrypt
> javascript'. That's a reductionist and erroneous view.
>
> Generating dynamic content from static JS files has a few advantages:
> caching on the browser side, distributing the code via CDNs rather than
> through your app server,
How do AJAX sites solve this? One way is to break the "one page app" into
a few pages. Maybe he could introduce a window concept that maps to a
page...
Pid ! {new, window(...)}
.. work .. work...
Pid ! {new, window(...)}
There's a start.
> benefiting from JIT if available (rather than calling the
> compiler/interpreter each time you update content), potential static
> analysis of code (even through things like JS-Lint). Also a smaller
> payload on the network and bandwidth -- if you only send in the
> functions to run the code once rather than on every call, you'll save a
> lot.
I suggested before sending parts of your app in ordinary static js files,
then call the functions as libraries. The benefits above will start to be
felt.
Hmmm.. I wonder what Nitrogen does, they might have some tricks.
> In most cases, rendering the page (CSS included), running the JS and
> transferring the data counts for 90% of the time a user will wait when
> querying a page. Streaming JS to then evaluate it is going to be
> terrible for performance on larger scale applications.
I dunno about this. Might be a bit of a blanket statement. Wasn't the very
reason AJAX came around to INCREASE performance of larger scale
applications *precisely* by streaming markup and JS for evaluation and
rendering on-demand rather than all-at-once because in the latter case you
normally send more than is actually needed?
> There's probably more to add to the list, but that's what I can think
> of in 15-30 seconds.
Likewise, I could probably add more but these are the ideas for improving
Joe's concept that I can come up with in 15-30 seconds ;) Others can pitch
in.
>
> And things like what I mentioned are not more or less unmanageable from
> static files (I agree with you there), except for XSS:
>
> XSS is better treated in many cases by things like JS frameworks. If I'm
> getting the result from some web query into JS, I will receive a neat
> string, without a chance of it being wrong. If I then push this string
> through my framework (say JQuery), it'll take care of doing specific
> escaping of things like element attributes, element content, etc.
What's stopping you?
As I illustrated in the previous mail on security, you can call JQuery
from code that's being run in eval too! Joe's calling everything from
JQuery to SVG libraries!
> If I do it dynamically through my applications, chances are much better
> I'll get the escaping wrong in Erlang (and you need to escape on more
> levels) than JS, where it can be made on a per-element basis when
> building the DOM: "create a tag, add the attribute, add another one, add
> the tag's content, push it" vs. "mix and matches all these strings into
> hopefully valid JS". This is even truer when you consider hacks such as
> Google's UTF-7 encounter back in the day. This follows the idea that JS
> knows JS better.
>
Right, let's convert that statement from a criticism into a really good
piece of advice:
Joe: That code where you're manipulating the DOM, where you do
".insertElement" and such, know what? Better do that via JQuery instead.
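To make the "escape per element" idea concrete, here is a minimal sketch in plain JavaScript. It assumes nothing about Joe's actual code: `escapeHtml` and `el` are hypothetical helper names made up for illustration, standing in for what a framework does when you set attributes and text one piece at a time.

```javascript
// Hypothetical helpers for illustration only; not part of jQuery or Joe's code.
const ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };

// Replace the characters that can break out of text or attribute context.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => ESCAPES[ch]);
}

// Build one element at a time: the tag, then each attribute, then the content.
// Every untrusted piece is escaped where it is inserted.
function el(tag, attrs, text) {
  const attrString = Object.entries(attrs)
    .map(([name, value]) => ` ${name}="${escapeHtml(value)}"`)
    .join("");
  return `<${tag}${attrString}>${escapeHtml(text)}</${tag}>`;
}

const untrusted = '<script>alert("xss")</script>';

// Mixing strings by hand: the untrusted markup ends up live in the page.
const unsafe = "<p>" + untrusted + "</p>";

// Building piece by piece: the same input ends up as inert text.
const safe = el("p", { title: untrusted }, untrusted);
```

In a browser with jQuery loaded, the rough equivalent would be `$("<p>").attr("title", untrusted).text(untrusted)`, which leaves the escaping to the library.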
> That's the same reason why you might want things like Erlang handling
> Erlang parsing, SQL handling its own escaping, etc. If you generate and
> send JS as one over the wire, you will have to double-check it
> server-side.
How about adding templating? (I suggested this to Joe off-list)...
One could possibly use leex/yecc to compile say "std.tpl" file and access
it from Erlang code like so...
Pid ! {insert, std.grid(List)}
which might use the content of std.tpl to produce...
Pid ! {insert, <<"<table>blah blah</table>">>} % Or <<"createElement(blah)">>
which might then stream to the client...
document.body.innerHTML = "<table>blah blah</table>"; /* Or a
createElement or jQuery insert call */
The nice thing with this is, std.tpl could have versions.
Joe likes SVG: so his std.grid(List) might produce some fancy SVG code.
I like XUL: so my std.grid(List) might produce "<grid>blah</grid>"
Templating might be extendable, so you might have an app specific my.tpl
which extends on std.tpl...
so when you: Pid ! {new, my.wnd(..)}
the client gets a new page with stylesheet references, script tags, etc,
specified in the my.tpl
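A rough sketch of the versioned/extendable template idea, written in JavaScript rather than Erlang purely so it can be run as-is. Everything here is an assumption for illustration: `makeStd`, `makeMy`, `grid`, `wnd` and the tag-name options are hypothetical names, not an existing template API, and a real std.tpl compiled via leex/yecc would look different.

```javascript
// Hypothetical sketch; none of these names come from an existing library.
function escapeMarkup(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// A "version" of std is just a choice of tags, so the same grid(List) call
// can emit different markup depending on which version is loaded.
function makeStd(tags) {
  return {
    grid(rows) {
      const body = rows
        .map((cells) =>
          `<${tags.row}>` +
          cells.map((c) => `<${tags.cell}>${escapeMarkup(c)}</${tags.cell}>`).join("") +
          `</${tags.row}>`)
        .join("");
      return `<${tags.outer}>${body}</${tags.outer}>`;
    },
  };
}

// An app-specific template set that extends std: it inherits grid and adds
// a window template carrying a title and a stylesheet reference.
function makeMy(std) {
  return Object.assign(Object.create(std), {
    wnd(title, stylesheet) {
      return `<html><head><title>${escapeMarkup(title)}</title>` +
        `<link rel="stylesheet" href="${escapeMarkup(stylesheet)}">` +
        `</head><body></body></html>`;
    },
  });
}

const std = makeStd({ outer: "table", row: "tr", cell: "td" });
const my = makeMy(std);
const html = my.grid([["a", "<b>"]]);
```

The point of the design is that the calling code only ever says "grid" or "wnd"; swapping in a different version of the template set changes the markup without touching the caller.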
With all the great minds on this list, surely suggestions could be made to
turn this early one-paged code into something more and more useful??
- Edmond -
> Unless you're running with node.js, that's going to be annoying for no
> good reason.
>
> For CSRF, there is likely no impact at all. It's a question of shared
> data between the server and HTML forms. How that data gets there is not
> really important at first. I could be wrong on that one though and it
> might be worse than what I expect. For SQL injection, it's all about the
> last line of defence before sending the data to your DB engine. If you
> treat it in JS, God have mercy on your application.
>
> But yeah, this little security roundup was again to comment on the
> 'encrypting your JS is what you need' comments. There's a safety element
> to using eval, and also performance, clarity and semantic concerns to be
> had.
>
>> - Edmond -
>>
>>
>> On Mon, 14 Feb 2011 23:43:57 +1100, Frédéric Trottier-Hébert
>> <fred.hebert@REDACTED> wrote:
>>
>>> On 2011-02-14, at 03:35 AM, Joe Armstrong wrote:
>>>>
>>>> Ok so "separation of concerns" is good but having different notations
>>>> for expressing the concerns
>>>> is crazy- to make a web thing that interacts with a server you need
>>>> to learn something like
>>>>
>>>> HTML
>>>> Javascript
>>>> CSS
>>>> PHP
>>>> MySQL
>>>>
>>>> And to be able to configure Apache and MySQL - other combinations are
>>>> possible.
>>>
>>> I can agree with that. To have a functional website, you do need to
>>> know a lot of different technologies. The web evolved organically and
>>> each part of the problem space had its own solution developed over
>>> time.
>>>
>>>>
>>>> Then you have to split the flow of control to many places.
>>>>
>>>> All of this is crazy madness. There should be *one* notation that is
>>>> powerful enough to express all
>>>> these things. In the browser it seems sensible to forget about css
>>>> and html and only use Javascript.
>>>> The only communication with the browser should be by sending it
>>>> javascript.
>>>
>>> There should, but there isn't. The truth here is that most programmers
>>> are awful at design. In any somewhat large setup, your backend
>>> programmers, your designers and your integrators (the guys just
>>> handling HTML, and CSS, maybe some Javascript) are not necessarily the
>>> same person.
>>>
>>> Right now the ring of web technologies is divided in a way that makes
>>> it somewhat simple to have different people with different backgrounds
>>> and knowledge work on different parts of your software. It makes
>>> sense to have the designer or integrator to be able to change the look
>>> and feel of a website without having to play in your code and maybe
>>> mess up database queries. Modern template engines in fact try to
>>> forbid all kinds of seriously side-effecting code (like DB queries)
>>> from happening in the templates.
>>>
>>> Your guy working in Javascript should not have to worry about suddenly
>>> needing to learn Erlang to be able to debug your application.
>>> Then again, this separation of concerns allows specialists to work on
>>> their speciality with more ease. It makes things somewhat simpler in
>>> larger organisations, but quite painful for one-man operations. I'll
>>> tell you that it makes a lot of sense when you know all of the tools
>>> in the toolkit though :)
>>>
>>>> How you generate the javascript is irrelevant - by hand or by program
>>>> - who cares. If you make it by
>>>> program the chances are that it's right.
>>>>
>>> Yes and no. Generated javascript is nearly as old as the language --
>>> many, many .NET apps had that kind of thing. Some editors like
>>> Dreamweaver could generate JS for you. One of the problems with this is
>>> that it was often pure garbage, or it wouldn't work in all browsers
>>> uniformly. If you can manage to generate and capture complex
>>> behaviours in a compliant manner, all the better. I have myself lost
>>> much hope with regards to that though.
>>>
>>>> Security is orthogonal to this - send encrypted js over the wire and
>>>> make sure your key-rings are secure
>>>> this is a completely different problem.
>>>
>>> This is only transmission security. Encryption has nothing to do with
>>> Cross-Site Scripting (XSS, where some user is able to run arbitrary JS
>>> in your page for you and ends up stealing information), Cross Site
>>> Request Forgery (CSRF, where the attacker uses the fact your
>>> application is forgetting about things like the origin of the queries
>>> to hijack the client's session in their place. This is related to Same
>>> Origin Policy issues and not easy to handle), SQL injection,
>>> overwriting some parameters because you don't fetch them in the right
>>> order server-side (see problems with the $_REQUEST variable in PHP),
>>> etc.
>>>
>>> 1. XSS
>>> XSS is, as mentioned above, the ability to run arbitrary JS on a page.
>>> This is the risky thing with your eval.
>>> http://en.wikipedia.org/wiki/Cross-site_scripting contains many
>>> details on understanding the related issues. It's not always a simple
>>> matter of escaping. Some more advanced attacks even rely on string
>>> encoding to make sure your escaping fails. See
>>> http://www.governmentsecurity.org/forum/index.php?showtopic=18105.
>>>
>>> 2. CSRF
>>> CSRF is a tricky thing. Because HTTP doesn't support sessions, over
>>> the years, the guys from Netscape (back then) or Opera (or whoever)
>>> ended up using Cookies to share data on every query. What happens
>>> there is that on every query the browser sends to a server, it also
>>> packages the cookies neatly in the headers -- no matter what page you
>>> were on when they were sent. The issue here is that the server might
>>> not check what page the call is coming from.
>>>
>>> Basically, if twitter had a URL call such as
>>> http://twitter.com/tweet/add?message=SomeMessageHere that would
>>> automatically add a tweet from your account and I put that link in an
>>> image tag on some site, every time you would load that image, you
>>> would automatically make a call to the server, your browser sending in
>>> your cookies and making it look like YOU actually made that call, even
>>> if you didn't know. In this case, the request is especially easy to do
>>> because twitter would be using GET parameters to have side-effects on
>>> the server. By forcing people into using POST, you can make things
>>> harder, but not impossible.
>>>
>>> One way to work against POST is using a fake website -- let's say I use
>>> learnyousomeerlang.com. On my own site, I'll be putting a fake
>>> javascript form inside an iframe (so that the page doesn't refresh
>>> when submitted) and have the script automatically send in the POST
>>> form. Now I send the link to my trick page over twitter and everyone
>>> who clicks on it from there will be guaranteed to have their session
>>> open and sending in data. I've in fact used this trick to get the
>>> site owner at my old job to close his own admin account on his own
>>> website so he could realise the importance of the threat.
>>>
>>> How can you solve this one? Well there are a few ways -- for one you
>>> could check the HTTP referrer, but that won't work everywhere -- if
>>> you expect calls from flash, it doesn't always send these elements of
>>> the HTTP header. In the case of HTTPS, depending on how you handle
>>> things, the header might not always be sent either so you can't know
>>> for sure. Better than that, if I'm using the <img> trick on your own
>>> website (on twitter, for twitter users), the domain will be the same,
>>> without you being able to check for anything.
>>>
>>> The only foolproof way to do this is to use what they call 'tokens':
>>> each call you make to the server has to have a unique piece of data
>>> that the server knows about that can prove that the call you just made
>>> comes from you, but also from your own forms on your own websites.
>>> These tokens should have an expiration time and be hidden from plain
>>> view, submitted automatically with any form. If you don't have this,
>>> your application might not be safe.
>>>
>>> This has *nothing* to do with encryption, and everything to do with
>>> not understanding the potential threats of the web correctly. It is an
>>> application-level issue, much like XSS is. And it's pretty damn
>>> important.
>>>
>>> 3. SQL injection is a different beast, where you do not properly
>>> escape the parameters of a request going to the database, letting an
>>> attacker run arbitrary DB calls. Erlang with Mnesia doesn't have to worry about
>>> that, but Erlang with any SQL has to, even if you end up using QLC (it
>>> depends on the library at the back in this case though, and is
>>> generally safe enough). http://en.wikipedia.org/wiki/Sql_injection has
>>> sufficient details.
>>>
>>> 4. You have to consider that sometimes these attacks are combined
>>> together to be able to really do damage.
>>>
>>> I haven't even covered using weak hashing for passwords, bad security
>>> policies on cookies, opening files on dynamic paths without filtering
>>> the input, etc.
>>>
>>> Web application security is not a joke and it's certainly not easy.
>>> It's a very serious thing and most developers get it wrong at one
>>> point or another. Wordpress got it wrong, Twitter got it wrong,
>>> facebook got it wrong, Google got it wrong, and so on, even though
>>> they're supposed to be leaders in the field. Most of them got it wrong
>>> more than once too. This is why I kind of support a 'paranoid' line of
>>> thought.
>>>
>>> --
>>> Fred Hébert
>>> http://www.erlang-solutions.com
>>>
>>
>>
>> --
>> Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
>
--
Using Opera's revolutionary e-mail client: http://www.opera.com/mail/