Strings (was: Re: are Mnesia tables immutable?)
Thu Jun 29 19:01:25 CEST 2006
Thanks for the reply...
On Jun 29, 2006, at 3:38 PM, Christian S wrote:
> On 6/28/06, ke han <ke.han@REDACTED> wrote:
>> In the example I gave, my countryManager process is a singleton (pardon
>> the OO pattern reference, but that's what it is) that serves the entire VM
>> to answer a list of countries. This is a lengthy list of short UTF-8
>> binaries. So wouldn't the list get copied? And won't each short binary in
>> the list get copied as well? There must be a better way.
> When benchmarking, how fast could you serve requests to your
> countryManager? (Btw, registered process would be more erlangy than
Thanks. Calling it a registered process is best...although if I can
really get my apps designed right, I wouldn't register processes by
name. I would inject the processes into the controllers that need to
know about them and not name them at all.
> What job does it do?
I was trying to write a simple example so as not to let the app design
get in the way of my point. In the apps I'm building there are
_many_ lists of strings. Some of these are as simple as names of
countries, states, technology interests, etc... Some lists are keyed
by other lists...e.g. states grouped by country choice. Since my
app is AJAX oriented, sometimes these lists get encoded into the
original page that gets sent to the browser and sometimes they get
sent later as json data to update dependent lists.
These lists come from mnesia tables and are managed by appropriate
processes which encapsulate access to the lists. Mnesia table size
is another concern..but I think I can deal with this easier than my
main memory concerns.
In addition to these basic look-up-table type lists, lots of other
lists of strings or complex terms (which mostly contain strings)
occur in my app (mostly to create html tables).
The bottom line is that to get at any list, a message is sent from
the yaws page to a controller (a separate process), which then sends a
message to a model (sometimes another process, sometimes a record).
Each of these sends is synchronous: the caller waits for a copy of the
list to come back. So not only is this data stored as lists of integers
(which gets really bad on 64-bit), but it is also copied with each
message send.
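
One way to put numbers on the cost described above is to ask the VM how
many heap words a term occupies. The sketch below assumes
`erts_debug:flat_size/1` (an undocumented debugging helper) is available;
the module name `string_mem` and its functions are made up for
illustration. A list cell takes two words, so a string held as a list
costs two words per character (8 bytes on a 32-bit VM, 16 on a 64-bit
one), while a binary stores the UTF-8 bytes packed behind a small header.

```erlang
-module(string_mem).
-export([words/1, bytes/1, compare/1]).

%% Heap words occupied by a term (not counting the word that points to it).
words(Term) ->
    erts_debug:flat_size(Term).

%% The same figure in bytes for the running VM
%% (words are 4 bytes on a 32-bit VM, 8 bytes on a 64-bit VM).
bytes(Term) ->
    words(Term) * erlang:system_info(wordsize).

%% Compare a string held as a list of integers with the same text
%% encoded as a UTF-8 binary.
compare(String) when is_list(String) ->
    Bin = unicode:characters_to_binary(String),
    [{list_bytes, bytes(String)},
     {binary_bytes, bytes(Bin)},
     {payload_bytes, byte_size(Bin)}].
```

Calling `string_mem:compare("Sweden")` in the shell shows how far the list
representation is from the 6 bytes of actual text on a given machine.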
>> In order to get around this problem, I would have to destroy MVC
>> and have my model object (countryManager) return an already
>> binary of binaries (or if I'm going to do that I may as well have the
>> countryManager go ahead and serialize it to json form as well).
>> This violates lots of sound application design. Basic principles of
>> encapsulation and separation of presentation and app logic are well
>> grounded in OO design. These principles apply to non-OO languages as
>> well. I understand that not having object references and copying terms
>> between calls
>> to erlang processes is a key element of erlang. But for non-mutable
>> strings??? Not having a solution for this makes mainstream web
>> apps very
> Since we have first class functions in erlang you can pass your
> countryManager process a function that processes the data it has, and
> have it send you back only the result of that call. No violation of
> sound application design. This is a trick languages without first class
> functions have a hard time taking advantage of, luckily Erlang is not
> that crappy.
Yes, I am looking into solutions like this. I will post to the yaws
mailing list asking about how to accomplish some of the ideas I have.
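
The suggestion quoted above (send the function to the data, get only the
result back) could be sketched roughly as follows. This is a minimal
illustration and not code from the thread: the module name
`country_manager`, the message shapes, and the `query/2` and `stop/1`
functions are all assumptions. The fun runs inside the manager process, so
only its closure and its result cross the process boundary, not the whole
list.

```erlang
-module(country_manager).
-export([start/1, query/2, stop/1]).

%% Start a process that owns the list of countries.
start(Countries) when is_list(Countries) ->
    spawn(fun() -> loop(Countries) end).

%% Run Fun against the list inside the owning process and return
%% {ok, Result} or {error, {Class, Reason}}. Only Result is copied back.
query(Pid, Fun) when is_function(Fun, 1) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {query, self(), Ref, Fun},
    receive
        {Ref, Reply} ->
            erlang:demonitor(Ref, [flush]),
            Reply;
        {'DOWN', Ref, process, _Pid, Reason} ->
            exit({country_manager_down, Reason})
    end.

stop(Pid) ->
    Pid ! stop,
    ok.

loop(Countries) ->
    receive
        {query, From, Ref, Fun} ->
            %% Catch failures so a bad fun does not take the owner down.
            Reply = try
                        {ok, Fun(Countries)}
                    catch
                        Class:Reason -> {error, {Class, Reason}}
                    end,
            From ! {Ref, Reply},
            loop(Countries);
        stop ->
            ok
    end.
```

For example, `country_manager:query(Pid, fun erlang:length/1)` returns just
a count. A long-running fun blocks every other caller while it executes,
which is the trade-off of this design.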
> You keep mentioning non-mutable strings as if we had mutable strings. We
> have ways to modify bindings (process dictionary or ets) but not to
> manipulate the string value itself (hipe extensions ignored). The
> latter is a good thing nobody wants to give up.
I was stressing that the strings were non-mutable (and, I should have
added, don't require character-level access) because it seemed the
discussion going on in this thread was talking about many other
unicode issues and I wanted to stress the difference.
> Where are your benchmarks that show how mainstream web apps in erlang
> are very inefficient? Maybe you are just doing the wrong thing?
The apps I develop are mostly data in / data out with some nice
presentation and validation on what goes in. This means that the
majority of memory is taken up by strings as most of my data is text
of some form or another. I don't need benchmarks to know that 4
bytes per character is _too_ much. In most cases it's 4x too much and
going to 64-bits is off limits with this type of memory allocation.
Add to that the intermittent copying of these lists of integers (one
page request could trigger 20 copies of lengthy lists of lists of
integers in memory...just to stream out a page containing drop down
lists that don't change very often)... and you will get spikes of
memory allocation as the number of page requests grows. It turns out
that processor performance, io, concurrency issues won't be my first
bottleneck...it will be memory taken up by strings!!!
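
For lists that rarely change, one possible way to avoid the repeated
copies described above is to render the list once into a single binary
and hand that out on each request. This is only a sketch under stated
assumptions: the module name `options_cache`, the `<option>` markup, and
the 5 second timeout are invented for illustration, and no HTML escaping
is done. It relies on the VM keeping binaries larger than 64 bytes off
the process heap and reference-counting them, so sending one to another
process passes a reference and not a copy; smaller binaries are still
copied.

```erlang
-module(options_cache).
-export([start/1, fetch/1, render/1]).

%% Render a list of binaries as HTML <option> elements in one flat binary.
render(Names) ->
    iolist_to_binary([[<<"<option>">>, N, <<"</option>">>] || N <- Names]).

%% Render once at start-up and keep the result in the process state.
start(Names) when is_list(Names) ->
    Html = render(Names),
    spawn(fun() -> loop(Html) end).

%% Ask the cache for the pre-rendered binary.
fetch(Pid) ->
    Ref = make_ref(),
    Pid ! {fetch, self(), Ref},
    receive
        {Ref, Html} -> Html
    after 5000 ->
        exit(timeout)
    end.

loop(Html) ->
    receive
        {fetch, From, Ref} ->
            From ! {Ref, Html},
            loop(Html)
    end.
```

Whether this belongs in the model or in a separate view-side process is a
design choice; keeping it in its own process leaves the model returning
plain data.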
I am actually less concerned about the copy time...but the mem
required by the strings (in the model objects and in mnesia) and the
mem required for a web server to constantly be copying these lists to
sorry..I know this already turned into a rant...I do like erlang very
much...which is why I'm crying out for help on this issue.
I already have one erlang+yaws+mnesia app in production. It's an
internal corporate app and the uptake on usage is slow...but I can
already tell the memory it's taking for all the data is too much...I
should be able to get at least twice as much data in RAM as I have.
The next app I'm writing over this summer should get released this
September. This will be a world-wide highly public app and will
hopefully get lots of page requests. The last time I launched a
large system on the web was a few years ago...it was a Java based web
app. It actually scaled pretty well but I can vividly recall that
what kept me from sleeping at night was worrying about my
servers...will some mem leak crash things...will some concurrency
deadlock crash the system..etc...
I chose erlang for this new app because I want to sleep at night when
I launch this next product. I have to launch this new app on one or
two low end servers and pray for success...My biggest fear is memory
support for all my character strings. Performance is secondary.
thanks for allowing the rant...