[erlang-questions] MySQL Cluster
Sat Oct 21 19:21:40 CEST 2006
Pat e wrote:
> We are preparing a rather big web project that
> will end up with tens of TB of data.
> These sizing questions are popping up on a regular basis now, getting
> to be FAQ frequency.
I'll second that.
> No I haven't tried MySQL Cluster or TimesTen, but I'm hoping for
> some free time to try it out.
I work for a Wall Street firm. We used to have our trading client
running on Solaris, C++ and X-Windows using Sybase. Competitors started
rolling out PC-based solutions and the traders said they didn't want 2
boxes on their desk. The whole industry has shifted to PC-based trading.
We went with the whole Visual Studio C++, Times Ten database, Tibco
messaging, blah, blah.
When the individual trader databases started approaching 1GB we could no
longer use Times Ten because it maps the database to a virtual file in
RAM. So we started looking at compressing the data. That bought a few
months.
Then we started looking at Sybase RAM versions which turned out to be
Meanwhile the backend backoffice stuff ran on Solaris with Sybase. The
log files started getting too big to deal with. Then the databases
started getting too big to deal with. Now the pressure is on to send an
order to the floor in 5 ms, which leaves not enough time to touch a DB,
let alone the disk.
We are in the process of eliminating databases. Why did we use them in
the first place? You just do. If you are an engineer and I say I need
to store a lot of data and your answer is not "A database!", you fail
CS101. No one ever asks, "How do you want to access it? Why do you
think you need a database?"
"My website needs 20TB of data in the database..."
Why? Are they pictures or movies? => Do not store them in a database.
Do you have 20 billion users, each with 1000 bytes of data? => Double
check your numbers.
Do you have 1M users with backup data of 20MB each? => Store the 1M
users in a database, store the rest some other way.
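To make the arithmetic behind those questions concrete, here is a rough back-of-envelope sketch in Python. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), and the 1000-byte per-user index row in the last case is an illustrative figure, not one from the original question.

```python
# Back-of-envelope sizing check for the "20TB in the database" claim.
# Assumption: decimal units, 1 TB = 10**12 bytes and 1 MB = 10**6 bytes.
TB = 10**12
MB = 10**6


def total_bytes(items, bytes_per_item):
    """Total storage for `items` records of `bytes_per_item` each."""
    return items * bytes_per_item


def users_needed(target_bytes, bytes_per_user):
    """How many users it takes to reach `target_bytes` at a given record size."""
    return target_bytes // bytes_per_user


target = 20 * TB

# Small records: at 1000 bytes per user, how many users make 20 TB?
small_record_users = users_needed(target, 1000)

# Bulk data: 1M users with 20 MB of backup data each.
bulk = total_bytes(1_000_000, 20 * MB)

# What actually belongs in the database in that case: one small row per user.
# The 1000-byte row size is a hypothetical figure for illustration.
index = total_bytes(1_000_000, 1000)

print(f"users needed at 1000 bytes each: {small_record_users:,}")
print(f"bulk data: {bulk / TB:g} TB")
print(f"per-user index rows: {index / 10**9:g} GB")
```

Twenty billion users is more than the population of the planet, which is why the answer is "double check your numbers". In the 1M-user case the bulk data is the full 20 TB, while the part that needs a database comes to about 1 GB under the row size assumed here.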
Pat e writes:
> Will it be necessary to completely rewrite mnesia for multi-terabyte
That depends. Is it possible for you to look at your problem differently
so that it doesn't require multi-terabyte disk-only tables?
What is the real problem you want to solve?
Why do you think you need a database?
Why is it so big?
How often will it be queried?
How often will it be modified?
Is it really an archive or a transaction log?
Now, if you are monitoring the sales of canned beans across all shopper
club members in real time to tune your factory production line -- then
you've got a real problem.
Generally, when you push the limits of technology you look at tweaking
it. When you move several orders of magnitude beyond the current state
of the art, you need to look at the basics of the problem, the
underpinnings of the technology and think about coming up with a new
alternative designed for the new problem space.