[erlang-questions] MySQL Cluster

Yerl <>
Sat Oct 21 19:38:16 CEST 2006

Hi!
I agree with Jay. It depends on what the data is.
At work, we manage, access, and operate daily on a ~200 TB database that
grows by at least 100 TB each year.
This isn't really a DB, but a clever data management system running on
86 ultra-cheap nodes.
The DB contains 4000 million web documents. Every document access is
time-bounded (less than 40 ms). The algorithm behind the scenes runs in
O(log(N)) time in the worst case. Thus, I can sleep happily each night.
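
To give a feel for why O(log(N)) is comforting at this size, here is a
rough sketch in Python. It assumes a plain binary search over sorted keys,
which is only an illustration and not a description of the actual system.
The function name and the idea of splitting the 40 ms bound evenly across
probes are made up for the example.

```python
def worst_case_probes(n: int) -> int:
    """Worst-case comparisons for a binary search over n sorted keys.

    This equals floor(log2(n)) + 1, which is n.bit_length() for n > 0.
    """
    return n.bit_length() if n > 0 else 0


documents = 4_000_000_000   # "4000 million web documents"
budget_ms = 40.0            # the stated per-access time bound

probes = worst_case_probes(documents)
print(probes)               # worst-case probes at today's size
print(budget_ms / probes)   # ms available per probe if the budget is split evenly

# Doubling the collection adds a single extra probe.
print(worst_case_probes(2 * documents) - probes)
```

The point is only that the worst case grows by one step each time the
data doubles, so yearly growth barely moves the access time.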

Jay Nelson wrote:
> Pat e wrote:
>  > We are preparing a rather big web project that
>  > will end up with tens of TB of data.
> Scott wrote:
>  > These sizing questions are popping up on a regular basis now, getting
>  > to be FAQ frequency.
> I'll second that.
>  > No I haven't tried MySQL Cluster or TimesTen, but I'm hoping for
>  > some free time to try it out.
> I work for a Wall Street firm.  We used to have our trading client 
> running on Solaris, C++ and X-Windows using Sybase.  Competitors started 
> rolling out PC-based solutions and the traders said they didn't want 2 
> boxes on their desk.  The whole industry has shifted to PC-based trading 
> applications.
> We went with the whole Visual Studio C++, TimesTen database, Tibco 
> messaging, blah, blah.
> When the individual trader databases started approaching 1 GB we could no 
> longer use TimesTen because they map to a virtual file in RAM.  So we 
> started looking at compressing the data.  That bought a few months.  
> Then we started looking at Sybase RAM versions which turned out to be 
> much faster.
> Meanwhile the backend backoffice stuff ran on Solaris with Sybase.  The 
> log files started getting too big to deal with.  Then the databases 
> started getting too big to deal with.  Now the pressure is on to send an 
> order to the floor in 5 ms which leaves not only not enough time to 
> touch a DB, but not enough time to touch the disk.
> We are in the process of eliminating databases.  Why did we use them in 
> the first place?   You just do.  If you are an engineer and I say I need 
> to store a lot of data and your answer is not "A database!", you fail 
> CS101.  No one ever asks, "How do you want to access it?  Why do you 
> think you need a database?"
> ===============================
> "My website needs 20TB of data in the database..."
> Why?  Are they pictures or movies?  => Do not store them in a database.
> Do you have 20 Billion users, each with 1000 bytes of data?  =>  Double 
> check your numbers.
> Do you have 1M users with backup data of 20MB each?  => Store the 1M 
> users in a database, store the rest some other way.
> ===============================
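
The two cases above can be checked with a few lines of arithmetic. This
is a minimal sketch in Python; it assumes decimal units (1 TB = 10^12
bytes, 1 MB = 10^6 bytes), which the message does not specify, and the
helper name is invented for the example.

```python
TB = 10**12  # decimal terabyte (assumption: the message does not say TB vs TiB)
MB = 10**6   # decimal megabyte (same assumption)


def total_bytes(items: int, bytes_per_item: int) -> int:
    """Total storage for a number of items of uniform size."""
    return items * bytes_per_item


# 20 billion users with 1000 bytes each
many_small = total_bytes(20 * 10**9, 1000)

# 1 million users with 20 MB of backup data each
few_large = total_bytes(10**6, 20 * MB)

print(many_small / TB)  # size in TB
print(few_large / TB)   # size in TB
```

Both cases come to the same total, which is the point of the questions:
the same 20 TB can be billions of tiny records or a small user table plus
bulk data that does not need to live in the database.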
> Pat e writes:
>  > Will it be necessary to completely rewrite mnesia for multi-terabyte 
>  > disk-only tables?
> That depends.  Is it possible for you to look at your problem 
> differently so that it doesn't require multi-terabyte disk-only tables?
> ===============================
> What is the real problem you want to solve?
> Why do you think you need a database?
> Why is it so big?
> How often will it be queried?
> How often will it be modified?
> Is it really an archive or a transaction log?
> Now, if you are monitoring the sales of canned beans across all shopper 
> club members in real time to tune your factory production line -- then 
> you've got a real problem.
> Generally, when you push the limits of technology you look at tweaking 
> it.  When you move several orders of magnitude beyond the current state 
> of the art, you need to look at the basics of the problem, the 
> underpinnings of the technology and think about coming up with a new 
> alternative designed for the new problem space.
> jay
> _______________________________________________
> erlang-questions mailing list
> http://www.erlang.org/mailman/listinfo/erlang-questions
