[erlang-questions] Stuff that breaks when you move it

Fred Hebert (MononcQc) mononcqc@REDACTED
Tue Aug 4 16:53:30 CEST 2009


On Tue, Aug 4, 2009 at 3:53 AM, Joe Armstrong <erlang@REDACTED> wrote:
> On Mon, Aug 3, 2009 at 8:20 PM, Dave Pawson <dave.pawson@REDACTED> wrote:
>> 2009/8/3 Joe Armstrong <erlang@REDACTED>:
>>
>
> M'lord I object, the witness is not a historian, the history of HTML
> and why it was a bad idea
> has not yet been written.
>
> Judge: Do you have any other comments? Try to keep them short, we will
> recess in twenty minutes.
>
> I will try, M'lord, if the court will permit ...
>
> Judge: get on with it ...
>
> Ladies and Gentlemen of the Jury,
>
> Imagine I have a single web page with some js. I *never* intend to
> reuse the js in
> a second page. This is a one-off application.

- That's a valid case for some JS. However, lots of web pages end up reusing
JavaScript and CSS across more than one page: menu highlights, header and
footer styles, etc.

> You are saying that I should store this in *two* pages - not one. This
> will have the
> following consequences:
>
>    - One day I will move one of the files but not the other (I'll
> lose the js, for example)
>    - the two files will live lives of their own and I'll get into
> version nightmares

Version nightmares are also a possibility when you have to maintain the same
code duplicated across many pages; that's partly why files are split the way
they are right now. Of course, if it's a one-time use, nothing would keep you
from embedding it in the page and fixing that problem, but in every other
circumstance where the code is ever reused, splitting it over separate files
is the way to go.
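For the one-off case, the embedding could even be automated. A minimal sketch
in Python (the tag pattern, file names and contents here are invented for
illustration, not a real tool):

```python
import re

def inline_scripts(html, files):
    """Replace each <script src="..."></script> tag with the referenced
    file's contents, producing a single self-contained page."""
    def embed(match):
        src = match.group(1)
        return "<script>" + files[src] + "</script>"
    return re.sub(r'<script src="([^"]+)"></script>', embed, html)

# Hypothetical one-off page with one external script:
page = '<p>Hello</p><script src="app.js"></script>'
bundled = inline_scripts(page, {"app.js": "console.log('hi');"})
```

Once the script is embedded, the page and its code can no longer drift apart,
which is exactly the property being asked for.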


>    - fetching the file has not got transaction semantics. I might get
> the HTML and then
>      get a link failure before fetching the js. I'd like to "either
> fetch both and it works"
>       or "fetch nothing" - the thing I use (firefox) doesn't have
> transaction semantics.
>
> << suppose the HTML always assumes the js will be downloaded -
>       the HTML prints some static content, then calls the JS - but
> the JS is not loaded
>       the net consequence of this is that some text is displayed and
> this has disastrous
>       consequences, somebody dies or something. This is because the JS
> was not executed.
>
>       Learned counsel can check if this has actually happened >>

JavaScript is often (as it should be) not necessary to the functioning of a
page, and the same goes for CSS: the page should degrade gracefully without
them. When it doesn't, it's usually because the functionality itself requires
scripting (see web apps like last.fm or Facebook chat). In those cases, no
archiving is possible anyway, given that a lot of interaction with the server
is needed; same-origin policies and the question of 'is the server still
there?' will ruin that.

I can agree that this is problematic, though, as it is for the degradation of
images.


>    To avoid these things I have to set up a revision control system
> to make sure the parts
> cannot be separated - I have to change web browsers for transaction
> semantics.
> Pending this we should ask for a court injunction to stop all browsers.
>


Other advantages of separate files have to do with distribution: in dynamic
applications (serving tens of thousands of people), you want the application
servers to spend as much of their resources as possible on actual
computation, not on reading static files from disk and sending them to
people.

Since easily over 90% of a page's size (and load time) is in its static
files, what you end up doing is using Content Delivery Networks (CDNs). CDNs
are an infrastructure of servers located all around the world with the sole
objective of delivering static content really fast. Because they sit on a
different domain, they let the browser download more files simultaneously
(more than if everything were embedded in the page), and faster, thanks to
the proximity of the servers and the fine-tuning of everything involved.
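The parallelism point can be illustrated with a toy model. Browsers of that
era limited concurrent connections per host (commonly around two); the limit,
host names and uniform fetch time below are assumed round numbers, not
measurements:

```python
import math
from collections import Counter

def load_time(assets, per_host_limit=2, fetch_seconds=1.0):
    """Toy model: each host serves at most `per_host_limit` assets at a
    time, every fetch takes the same time, and hosts work in parallel.
    Returns the time until the slowest host finishes."""
    per_host = Counter(host for host, _ in assets)
    return max(math.ceil(n / per_host_limit) * fetch_seconds
               for n in per_host.values())

# Six static files, all on the page's own domain:
one_host = [("www.example.com", "file%d" % i) for i in range(6)]
# The same six files, half moved to a CDN domain:
split = ([("www.example.com", "file%d" % i) for i in range(3)] +
         [("cdn.example.net", "file%d" % i) for i in range(3)])
```

Under the assumed limit of two connections per host, the six files take three
rounds from a single domain but only two when split across two domains, on
top of whatever proximity gain the CDN adds.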

The next time the file is encountered, it isn't even downloaded, because it's
kept in the browser's cache. A diff, at least a diff like the ones we have
right now, downloads the changing line not once but twice: once for the old
line and once for the new line. It also denies the use of CDNs, making
bandwidth costs for many corporations much higher and load times for users
much longer.
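The 'twice' claim is easy to check with an ordinary unified diff, the format
tools like `diff -u` produce; a sketch using Python's standard `difflib`
(the HTML lines are invented for the example):

```python
import difflib

old = ["<h1>News</h1>\n", "<p>Posted Monday</p>\n", "<p>Story text</p>\n"]
new = ["<h1>News</h1>\n", "<p>Posted Tuesday</p>\n", "<p>Story text</p>\n"]

patch = list(difflib.unified_diff(old, new, fromfile="old.html",
                                  tofile="new.html"))
# The changed line appears twice in the patch: once removed ("-"), once
# added ("+"), plus the unchanged context lines around it.
```

For a one-line edit, the patch carries both versions of the line plus
context, so it can genuinely be larger than just re-sending the changed
fragment.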

You also want to send data as fast as possible, possibly flushing every bit
of text you've got before the page is even done being generated. This is done
by outputting parts of some pages as soon as they're ready. With a diff, you
still need to generate the whole page, and then have the server compare it
against the old one to produce a diff file to send.

Computing the diff can take longer than just sending a new page over in
these circumstances.
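The flushing argument can be sketched as the difference between a generator
that yields each page section the moment it's rendered and an approach that
must assemble the whole page first before any comparison can begin (the
section renderers below are invented for illustration):

```python
def render_streamed(sections):
    """Flush each section to the client as soon as it is ready."""
    for render in sections:
        yield render()

def render_for_diff(sections):
    """A diff can only be computed once the entire page exists."""
    return "".join(render() for render in sections)

# Track which sections have actually been rendered:
calls = []
sections = [lambda: calls.append("header") or "<header/>",
            lambda: calls.append("body") or "<body/>"]

stream = render_streamed(sections)
first_chunk = next(stream)  # only the header has been rendered so far
```

With streaming, the client already has the header while the body is still
being computed; with the diff approach nothing can be sent until the final
join, and the comparison work comes on top of that.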

It's also unclear how a diff would interact with things like chunked transfer
of a file, or how it would react to JavaScript-issued requests.


>     My solution is to put the js in the file. With one file all the
> above problems disappear.
>
>      It's a fundamental law of physics - if you have two things in
> two different places
> it's impossible to agree if they are in a consistent state - it's
> mathematically impossible
> (see http://en.wikipedia.org/wiki/Two_Generals'_Problem).
>
>    There are *pragmatic* benefits of including css, js files - but
> this is a premature optimization
> and the net effect of include files can be achieved by a different
> (and sound) mechanism

Given all the above, I would say it is NOT premature optimization when you
know from the beginning you will have tens of thousands of simultaneous
viewers at all times (millions per month).

It is, however, useful when you want to archive documents because changes and
updates no longer interest you. This is -- I believe -- the biggest
difference.
The step needed to change a page from dynamic to static for archival and
portability purposes requires you to separate it from its origins: you go
through it with wget, which will rewrite URLs, download dependent files and
store them on your computer. In some cases this would be the equivalent of a
compiler linking the files into a single executable along with the .dlls
around it.

Or, to take it back to Erlang, I see multiple files as the difference between
defining all your functions in a single Erlang application file
(reimplementing the standard library functions instead of -import()ing them),
rather than counting on the system to do what is necessary to load files.

Of course I may not be getting exactly what you mean, but there are extremely
good reasons to keep content separate in the case of web files and
applications. You'd have to consider them for Erlang too.

