[erlang-questions] Two beautiful programs - or web programming made easy

Joe Armstrong erlang@REDACTED
Wed Feb 16 16:34:00 CET 2011


2011/2/16 Edmond Begumisa <ebegumisa@REDACTED>

> I'm a glass-half-full kinda guy :)
>
> I'd like to think this is the kind of community where someone can say ..
>
> "Hey, I've got this whacky idea. It's early stages but I think I'm onto
> something."
>
> Then in *addition* to the community saying "Well, there are problems x, y
> and z you may have overlooked", the community *also* says "Possibly p, q,
> and r might help with these."
>
> I think if you look at what Joe's doing, it can be expanded on to get
> something very useful. Let's take some of everyone's concerns and actually
> *try* to give Joe some advice on how they might be addressed.
>
> I've started inline, hopefully others can add some input...
>
>
>
> On Tue, 15 Feb 2011 07:37:44 +1100, Frédéric Trottier-Hébert <
> fred.hebert@REDACTED> wrote:
>
>
>>
>> On 2011-02-14, at 15:17 PM, Edmond Begumisa wrote:
>>
>>  You've outlined a nice list of the top security concerns of *every*
>>> web-developer generating dynamic content from the client-side (probably a
>>> good chunk of websites uploaded since 2005.)
>>>
>>> What I still don't get is why you find the generation of dynamic content
>>> from static js files acceptable while using eval for the same you find
>>> unacceptable. I don't see how XSS, CSRF, SQL Injection, and all the things
>>> you list are more unmanageable from static js that generates content vs
>>> streamed js that generates content.
>>>
>>>
>> The point of my security list wasn't so much about eval itself as about
>> contradicting the precise point that 'all you need to do is encrypt
>> javascript'. That's a reductionist and erroneous view.
>>
>> Generating dynamic content from static JS files has a few advantages:
>> caching on the browser side, distributing the code via CDNs rather than
>> through your app server,
>>
>
> How do AJAX sites solve this? One way is to break the "one page app" into a
> few pages. Maybe he could introduce a window concept that maps to a page...
>
> Pid ! {new, window(...)}
> .. work .. work...
> Pid ! {new, window(...)}
>
> There's a start.
>
>
>  benefiting from JIT if available (rather than calling the
>> compiler/interpreter each time you update content), potential static
>> analysis of code (even through things like JS-Lint). Also a smaller payload
>> on the network and bandwidth -- if you only send in the functions to run the
>> code once rather than on every call, you'll save a lot.
>>
>
> I suggested before sending parts of your app in ordinary static js files,
> then call the functions as libraries. The benefits above will start to be
> felt.
>
> Hmmm.. I wonder what Nitrogen does, they might have some tricks.
>
>
>  In most cases, rendering the page (CSS included), running the JS and
>> transferring the data accounts for 90% of the time a user will wait when
>> querying a page. Streaming JS to then evaluate it is going to be terrible
>> for performance on larger scale applications.
>>
>
> I dunno about this. Might be a bit of a blanket statement. Wasn't the very
> reason AJAX came around to INCREASE performance of larger scale applications
> *precisely* by streaming markup and JS for evaluation and rendering
> on-demand rather than all-at-once because in the latter case you normally
> send more than is actually needed?
>
>
>   There's probably more to add to the list, but that's what I can think of
>> in 15-30 seconds.
>>
>
> Likewise, I could probably add more but these are the ideas for improving
> Joe's concept that I can come up with in 15-30 seconds ;) Others can pitch
> in.
>
>
>
>> And things like what I mentioned are not more or less unmanageable from
>> static files (I agree with you there), except for XSS:
>>
>> XSS is better treated in many cases by things like JS frameworks. If I'm
>> getting the result from some web query into JS, I will receive a neat
>> string, without a chance of it being wrong. If I then push this string
>> through my framework (say JQuery), it'll take care of doing specific
>> escaping of things like element attributes, element content, etc.
>>
>
> What's stopping you?
>
> As I illustrated in the previous mail on security, you can call JQuery from
> code that's being run in eval too! Joe's calling everything from JQuery to
> SVG libraries!
>
>
>  If I do it dynamically through my applications, chances are much better
>> I'll get the escaping wrong in Erlang (and you need to escape on more
>> levels) than JS, where it can be made on a per-element basis when building
>> the DOM: "create a tag, add the attribute, add another one, add the tag's
>> content, push it" vs. "mix and matches all these strings into hopefully
>> valid JS". This is even truer when you consider hacks such as Google's UTF-7
>> encounter back in the day. This follows the idea that JS knows JS better.
>>
>>
> Right, let's convert that statement from a criticism into a really good
> piece of advice:
>
> Joe: That code where you're manipulating the DOM, where you do
> ".insertElement" and such, know what? Better do that via JQuery instead.


Why? - the only reason I can think of is cross-browser compatibility. At the
moment I don't really like libraries since I want to see what's going on as
near to the bottom level as I can get - libraries obscure what's going on.
My goal is understanding, and minimal lines of code (to aid understanding).

Right now I have several competing ideas - I could do with some informed
advice here.

I'll fire off some questions:

For Graphics:

1) SVG or
2) HTML5 canvas

Canvas is faster but has no support for objects; making onclick, ondrag and
onmouseenter work in a canvas is a pain and probably eats either CPU or
memory. Any good libraries? I've looked at all the well-known ones. I only
want object support for a canvas (i.e. object grouping and adding click and
move events to object groups) - not lots of other stuff. This is why I'm
currently using SVG.
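
A minimal sketch of the "object support" asked for above: shapes are
grouped, handlers are attached to groups, and a pointer position is mapped to
the topmost group by hit testing. Everything here (makeGroup, addRect, on,
hitTest, dispatch, and the rectangle-only shapes) is hypothetical and made up
for illustration; it is not taken from any existing canvas library.

```javascript
// Hypothetical scene model: a group owns rectangular shapes and event handlers.
function makeGroup(name) {
  return { name: name, shapes: [], handlers: {} };
}

function addRect(group, x, y, w, h) {
  group.shapes.push({ x: x, y: y, w: w, h: h });
}

function on(group, eventName, fn) {
  group.handlers[eventName] = fn;
}

// Return the topmost group containing (px, py), or null if nothing is hit.
// Groups later in the array are assumed to be drawn later, so they are
// searched first.
function hitTest(groups, px, py) {
  for (var i = groups.length - 1; i >= 0; i--) {
    var shapes = groups[i].shapes;
    for (var j = 0; j < shapes.length; j++) {
      var s = shapes[j];
      if (px >= s.x && px < s.x + s.w && py >= s.y && py < s.y + s.h) {
        return groups[i];
      }
    }
  }
  return null;
}

// Route an event at canvas coordinates to the group under the pointer.
// Returns the name of the group that handled it, or null.
function dispatch(groups, eventName, px, py) {
  var g = hitTest(groups, px, py);
  if (g && g.handlers[eventName]) {
    g.handlers[eventName](g, px, py);
    return g.name;
  }
  return null;
}
```

In a browser this would presumably sit behind a single listener on the canvas
element that converts the event's page coordinates to canvas coordinates and
then calls dispatch. The cost is a linear scan per event, which is the CPU
trade-off mentioned above.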

For the keyboard:

How can I get keystrokes into my program from javascript - virtually
everything I try is buggy. Do I really have to sniff the browser type and
fix the bugs of every single browser?
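
One alternative to sniffing the browser type is to look at which properties
the event object actually carries. The sketch below is only that: the
function normalizeKey and the NAMED_KEYS table are hypothetical, and the
fallback rules for older browsers are an assumption about how they fill in
keyCode and charCode, not a tested compatibility layer.

```javascript
// Small table for non-printing keys, indexed by the legacy keyCode numbers.
var NAMED_KEYS = {
  8: 'Backspace', 9: 'Tab', 13: 'Enter', 27: 'Escape',
  37: 'ArrowLeft', 38: 'ArrowUp', 39: 'ArrowRight', 40: 'ArrowDown'
};

// Takes an event-like object and returns a key name, a character, or null.
function normalizeKey(e) {
  // Prefer the `key` property when the browser supplies it.
  if (typeof e.key === 'string' && e.key.length > 0) {
    return e.key;
  }
  if (e.type === 'keypress') {
    // Assumption: browsers that define charCode put the character there;
    // those that do not put the character code in keyCode instead.
    var code = (typeof e.charCode === 'number') ? e.charCode : e.keyCode;
    return code >= 32 ? String.fromCharCode(code) : null;
  }
  // keydown / keyup: only the named, non-printing keys are recognised.
  return NAMED_KEYS[e.keyCode] || null;
}
```

The idea is to take characters from keypress and named keys from keydown, so
that a code such as 37 is never ambiguous between "%" and the left arrow.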

Rich text. I want to do pixel exact typography

I can make rich text by adding spans and css and stuff in the dom, but I
want pixel accurate sizing of spans - I want to do the following:

Define several on-screen divs with absolute size and position. Link them in
some order, for example:

<div id="a" style="absolute:...." next="b">
<div id="b" ... next="c">

Then given rich text <p><span class="c1">...</span>  I want to flow this
into div a, so that it overflows into div b - I need to calculate, pixel
exactly, where to split the text in order to do this.
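
The splitting step can be sketched as a greedy line filler that is handed a
measuring function. Everything here is an assumption for illustration: the
name flowText, the box records ({id, width, height}), a single fixed line
height, and plain words rather than styled spans. The measure callback would
have to be backed by whatever the browser offers for getting a rendered text
width in pixels.

```javascript
// boxes: [{ id, width, height }] in link order; lineHeight in pixels.
// measure(text) must return the rendered width of `text` in pixels.
function flowText(words, boxes, lineHeight, measure) {
  var result = [];
  var w = 0; // index of the next word to place
  for (var b = 0; b < boxes.length; b++) {
    var box = boxes[b];
    var maxLines = Math.floor(box.height / lineHeight);
    var lines = [];
    while (w < words.length && lines.length < maxLines) {
      // A word always starts a line, even if it is wider than the box.
      var line = words[w++];
      // Keep appending words while the line still fits the box width.
      while (w < words.length && measure(line + ' ' + words[w]) <= box.width) {
        line += ' ' + words[w++];
      }
      lines.push(line);
    }
    result.push({ id: box.id, lines: lines });
  }
  // Words that did not fit into any box are returned as leftover.
  return { boxes: result, leftover: words.slice(w) };
}
```

Rich text would need measure to take the class of each span into account, and
the quality of the result depends entirely on how exact that measurement is.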

If I could do this I could easily port erlguten to run in a browser

All these seem like pretty basic things - but I can't seem to find any code

/Joe





>
>
>  That's the same reason why you might want things like Erlang handling
>> Erlang parsing, SQL handling its own escaping, etc.  If you generate and
>> send JS as one over the wire, you will have to double-check it server-side.
>>
>
> How about adding templating? (I suggested this to Joe off-list)...
>
> One could possibly use leex/yecc to compile say "std.tpl" file and access
> it from Erlang code like so...
>
>  Pid ! {insert, std.grid(List)}
>
> which might use the content of std.tpl to produce...
>
>  Pid ! {insert, <<"<table>blah blah</table>">>}  % Or <<"createElement(blah)">>
>
> which might then stream to the client...
>
>  document.body.innerHTML = "<table>blah blah</table>"; /* Or
> createElement or jQuery insert call */
>
> The nice thing with this is, std.tpl could have versions.
>
> Joe likes SVG: so his std.grid(List) might produce some fancy SVG code.
> I like XUL: so my std.grid(List) might produce "<grid>blah</grid>"
>
> Templating might be extendable, so you might have an app specific my.tpl
> which extends on std.tpl...
>
> so when you: Pid ! {new, my.wnd(..)}
>
> the client gets a new page with stylesheet references, script tags, etc,
> specified in the my.tpl
>
> With all the great minds on this list, surely suggestions could be made to
> turn this early one-paged code into something more and more useful??
>
> - Edmond -
>
>
>
>  Unless you're running with node.js, that's going to be annoying for no
>> good reason.
>>
>
>
>> For CSRF, there is likely no impact at all. It's a question of shared
>> data between the server and HTML forms. How that data gets there is not
>> really important at first. I could be wrong on that one though and it might
>> be worse than what I expect. For SQL injection, it's all about the last line
>> of defence before sending the data to your DB engine. If you treat it in JS,
>> God have mercy on your application.
>>
>> But yeah, this little security roundup was again to comment on the
>> "encrypting your JS is what you need" comments. There's a safety element to
>> using eval, and also performance, clarity and semantic concerns to be had.
>>
>>  - Edmond -
>>>
>>>
>>> On Mon, 14 Feb 2011 23:43:57 +1100, Frédéric Trottier-Hébert <
>>> fred.hebert@REDACTED> wrote:
>>>
>>>  On 2011-02-14, at 03:35 AM, Joe Armstrong wrote:
>>>>
>>>>>
>>>>> Ok so "separation of concerns" is good but having different notations
>>>>> for expressing the concerns is crazy - to make a web thing that
>>>>> interacts with a server you need to learn something like
>>>>>       HTML
>>>>>       Javascript
>>>>>       CSS
>>>>>       PHP
>>>>>       MySQL
>>>>>
>>>>> And to be able to configure Apache and MySQL - other combinations are
>>>>> possible.
>>>>>
>>>>
>>>> I can agree with that. To have a functional website, you do need to know
>>>> a lot of different technologies. The web evolved organically and each part
>>>> of the problem space had its own solution developed over time.
>>>>
>>>>
>>>>> Then you have to split the flow of control to many places.
>>>>>
>>>>> All of this is crazy madness. There should be *one* notation that is
>>>>> powerful enough to express all these things. In the browser it seems
>>>>> sensible to forget about css and html and only use Javascript.
>>>>> The only communication with the browser should be by sending it
>>>>> javascript.
>>>>>
>>>>
>>>> There should, but there isn't. The truth here is that most programmers
>>>> are awful at design. In any somewhat large setup, your backend programmers,
>>>> your designers and your integrators (the guys just handling HTML, and CSS,
>>>> maybe some Javascript) are not necessarily the same person.
>>>>
>>>> Right now the ring of web technologies is divided in a way that makes it
>>>> somewhat simple to have different people with different backgrounds and
>>>> knowledge work on different parts of your software. It makes sense to
>>>> have the designer or integrator be able to change the look and feel of a
>>>> website without having to play in your code and maybe mess up database
>>>> queries. Modern template engines in fact try to forbid all kinds of
>>>> seriously side-effecting code (like DB queries) from happening in the
>>>> templates.
>>>>
>>>> There should be no worry for your guy working in Javascript that he'd
>>>> suddenly need to learn Erlang to be able to debug your application.
>>>> Then again, this separation of concerns allows specialists to work on
>>>> their speciality with more ease. It makes things somewhat simpler in larger
>>>> organisations, but quite painful for one-man operations. I'll tell you that
>>>> it makes a lot of sense when you know all of the tools in the toolkit though
>>>> :)
>>>>
>>>>  How you generate the javascript is irrelevant - by hand or by program -
>>>>> who cares. If you make it by
>>>>> program the chances are that it's right.
>>>>>
>>>>>  Yes and no. Generated javascript is nearly as old as the language --
>>>> many, many .NET apps had that kind of thing. Some editors like Dreamweaver
>>>> could generate JS for you. One of the problems with this is that it was often
>>>> pure garbage, or it wouldn't work in all browsers uniformly. If you can
>>>> manage to generate and capture complex behaviours in a compliant manner, all
>>>> the better. I have myself lost much hope with regards to that though.
>>>>
>>>>  Security is orthogonal to this - send encrypted js over the wire and
>>>>> make sure your key-rings are secure
>>>>> this is a completely different problem.
>>>>>
>>>>
>>>> This is only transmission security. Encryption has nothing to do with
>>>> Cross-Site Scripting (XSS, where some user is able to run arbitrary JS in
>>>> your page for you and ends up stealing information), Cross Site Request
>>>> Forgery (CSRF, where the attacker uses the fact your application is
>>>> forgetting about things like the origin of the queries to hijack the
>>>> client's session in their place. This is related to Same Origin Policy
>>>> issues and not easy to handle), SQL injection, overwriting some parameters
>>>> because you don't fetch them in the right order server-side (see problems
>>>> with the $_REQUEST variable in PHP), etc.
>>>>
>>>> 1. XSS
>>>> XSS is, as mentioned above, the ability to run arbitrary JS on a page.
>>>> This is the risky thing with your eval.
>>>> http://en.wikipedia.org/wiki/Cross-site_scripting contains many details
>>>> on understanding the related issues. It's not always a simple matter of
>>>> escaping. Some more advanced attacks even rely on string encoding to make
>>>> sure your escaping fails. See
>>>> http://www.governmentsecurity.org/forum/index.php?showtopic=18105.
>>>>
>>>> 2. CSRF
>>>> CSRF is a tricky thing. Because HTTP doesn't support sessions, over the
>>>> years, the guys from Netscape (back then) or Opera (or whoever) ended up
>>>> using Cookies to share data on every query. What happens there is that on
>>>> every query the browser sends to a server, it also packages the cookies
>>>> neatly in the headers -- no matter what page you were on when they were
>>>> sent. The issue here is that the server might not check from what page the
>>>> call is coming from.
>>>>
>>>> Basically, if twitter had an URL call such as
>>>> http://twitter.com/tweet/add?message=SomeMessageHere that would
>>>> automatically add a tweet from your account and I put that link in an image
>>>> tag on some site, every time you would load that image, you would
>>>> automatically make a call to the server, your browser sending in your
>>>> cookies and making it look like YOU actually made that call, even if you
>>>> didn't know. In this case, the request is especially easy to do because
>>>> twitter would be using GET parameters to have side-effects on the server. By
>>>> forcing people into using POST, you can make things harder, but not
>>>> impossible.
>>>>
>>>> One way to work against POST is using a fake website -- let's say I use
>>>> learnyousomeerlang.com. On my own site, I'll be putting a fake
>>>> javascript form inside an iframe (so that the page doesn't refresh when
>>>> submitted) and have the script automatically send in the POST form. Now I
>>>> send the link to my trick page over twitter and everyone who clicks on it
>>>> from there will be guaranteed to have their session open and sending in
>>>> data. I've in fact used this trick to have the site owner at my old job to
>>>> close his own admin account on his own website so he could realise the
>>>> importance of the threat.
>>>>
>>>> How can you solve this one? Well there are a few ways -- for one you
>>>> could check the HTTP referrer, but that won't work everywhere -- if you
>>>> expect calls from flash, it doesn't always send these elements of the HTTP
>>>> header. In the case of HTTPS, depending on how you handle things, the header
>>>> might not always be sent either so you can't know for sure. Better than
>>>> that, if I'm using the <img> trick on your own website (on twitter, for
>>>> twitter users), the domain will be the same, without you being able to check
>>>> for anything.
>>>>
>>>> The only foolproof way to do this is to use what they call 'tokens':
>>>> each call you make to the server has to have a unique piece of data that the
>>>> server knows about that can prove that the call you just made comes from
>>>> you, but also from your own forms on your own websites. These tokens should
>>>> have an expiration time and be hidden from plain view, submitted
>>>> automatically with any form. If you don't have this, your application might
>>>> not be safe.
>>>>
>>>> This has *nothing* to do with encryption, and everything to do with not
>>>> understanding the potential threats of the web correctly. It is an
>>>> application-level issue, much like XSS is. And it's pretty damn important.
>>>>
>>>> 3. SQL injection is a different beast, where you do not properly escape
>>>> the parameters of a request going to the database, letting an attacker run
>>>> arbitrary DB calls. Erlang with Mnesia doesn't have to worry about that, but
>>>> Erlang with any SQL has to, even if you end up using QLC (it depends on the
>>>> library at the back in this case though, and is generally safe enough).
>>>> http://en.wikipedia.org/wiki/Sql_injection has sufficient details.
>>>>
>>>> 4. You have to consider that sometimes these attacks are combined
>>>> together to be able to really do damage.
>>>>
>>>> I haven't even covered using weak hashing for passwords, bad security
>>>> policies on cookies, opening files on dynamic paths without filtering the
>>>> input, etc.
>>>>
>>>> Web application security is not a joke and it's certainly not easy. It's
>>>> a very serious thing and most developers get it wrong at one point or
>>>> another. Wordpress got it wrong, Twitter got it wrong, facebook got it
>>>> wrong, Google got it wrong, and so on, even though they're supposed to be
>>>> leaders in the field. Most of them got it wrong more than once too. This is
>>>> why I kind of support a 'paranoid' line of thought.
>>>>
>>>> --
>>>> Fred Hébert
>>>> http://www.erlang-solutions.com
>>>>
>>>>
>>>
>>> --
>>> Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
>>>
>>
>>
>
> --
> Using Opera's revolutionary e-mail client: http://www.opera.com/mail/
>

