[erlang-questions] Erlang multicore on AMD: 24 cores (48 pseudocores)

Dmitry Kolesnikov <>
Wed Apr 18 21:17:11 CEST 2012


In some respects, Robert, you are right about the choice of HW platform. 
In the end, the platform has to be optimized for your tasks, use cases and hosting requirements.
The final metric is cost per TX, cost per user, and so on.
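
To make that metric concrete, here is a rough sketch of how it could be computed. The function names are mine, and the lifetime, hosting cost and traffic figures are purely hypothetical; only the $700 and $7100 prices come from the message quoted below (4 cores for the Core i7 is inferred from $700 / $175).

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # simplification: every month has 30 days


def total_cost(hw_usd, opex_usd_per_month, lifetime_months):
    """Purchase price plus running costs over the solution lifetime."""
    return hw_usd + opex_usd_per_month * lifetime_months


def cost_per_tx(hw_usd, opex_usd_per_month, lifetime_months, avg_tx_per_sec):
    """Lifetime cost divided by the number of transactions served."""
    served = avg_tx_per_sec * lifetime_months * SECONDS_PER_MONTH
    return total_cost(hw_usd, opex_usd_per_month, lifetime_months) / served


def cost_per_core(box_usd, cores):
    """Per-core price, the comparison made in the quoted message."""
    return box_usd / cores


if __name__ == "__main__":
    # Prices from this thread; the core count of the i7 box is an inference.
    print(cost_per_core(700, 4))     # Core i7 box
    print(cost_per_core(7100, 64))   # 64-core Opteron
    # Hypothetical: 36-month lifetime, $50/month hosting, 200 TX/s average.
    print(cost_per_tx(700, 50, 36, 200))
```

Per-core price is only a proxy. Cost per TX also depends on how much real traffic each box carries over its lifetime, which is why the questions further down matter.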

I am not sure what system you are developing, but I am speaking from the perspective of Consumer Internet Services, where horizontal scalability is the only sensible approach to carrying the traffic. 

Indeed, the fewer nodes you have, the easier they are to manage, and they often perform better thanks to local IPC, buses, caches, etc. On the other hand, you have to invest in high-end HW to mitigate various bottlenecks. The result is a higher cost per user, and redundancy and scalability become more expensive to maintain.

A bunch of smaller COTS nodes might show worse performance, but the system is much more robust against HW failures and bursty traffic. I would not worry about management: once you have 6+ nodes, management practices have to be automated anyway. You start to follow the Erlang mantra "Let it fail". When I was tasked with building a 60+ node Erlang cluster, the first item on the task list was a collection of management tools to destroy and recover nodes from scratch.
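
A back-of-the-envelope sketch of the redundancy point, assuming simple N+1 sizing: enough nodes to carry peak load, plus spares. The function names and all load figures are hypothetical, and it ignores that big and small nodes are priced differently.

```python
import math


def nodes_needed(peak_load, node_capacity, spares=1):
    """Nodes required to carry peak_load while `spares` nodes are down."""
    return math.ceil(peak_load / node_capacity) + spares


def redundancy_overhead(peak_load, node_capacity, spares=1):
    """Fraction of the fleet that exists only to cover failures."""
    return spares / nodes_needed(peak_load, node_capacity, spares)


if __name__ == "__main__":
    peak = 10_000  # hypothetical peak load, requests per second
    # Two big nodes of 5000 req/s each vs twenty small ones of 500 req/s each.
    for capacity in (5_000, 500):
        print(capacity, nodes_needed(peak, capacity),
              redundancy_overhead(peak, capacity))
```

With few large nodes, one spare is a large share of the fleet; with many small nodes, the same protection is a small share.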

Max, you have to answer a couple of questions for yourself.
1. Where are the servers hosted?
2. Does the DC have enough facilities for a multi-node configuration?
3. What is the role of the solution: OLTP or batch jobs?
4. What are the redundancy/availability requirements?
5. What are the traffic profiles?
6. What is your budget and the solution's lifetime?

Based on the answers you can make a proper decision. If you are just curious about Erlang performance, vertical vs. horizontal, then EC2 would be a good tool. I have run Erlang evaluations on bare HW and on EC2 (a long time ago, on very old HW). For CPU-bound load there is no big difference as long as CPU utilization stays below 60%. I think 4xLarge vs 1xHigh-CPU XL would be a good setup to validate the scalability of your approach. 

- Dmitry

On Apr 18, 2012, at 9:35 PM, Robert Melton wrote:

> Might also be relevant -> http://www.cpubenchmark.net/cpu_value_available.html#multicpu 
> On Wed, Apr 18, 2012 at 2:27 PM, Robert Melton <> wrote:
> On Wed, Apr 18, 2012 at 2:03 PM, Max Lapshin <> wrote:
> > (how many rows can we process per hour for $25,000...) in our case, bigger
> > more specialized machines crushed a far greater number of cheap machines for
> > the same dollar amount).
> >
> OK, let's discuss this interesting point.
> It is possible to buy a cheap Core i7 box in Moscow for $700.  An AMD
> Opteron with 64 cores costs $7100 in Moscow. So it is $110 per core
> for the 2.2GHz AMD and $175 per core for the 3.6GHz Intel.
> Even leaving aside power consumption and clock
> frequencies, I'm afraid that 64 cores will fight for the same memory
> bus and other PC resources.
> So perhaps there is some limit in multicoring? It is very interesting
> that your results show that more cores in single box are always
> better. Maybe there is some communication between tasks in your
> processing code?
> First of all, which Opteron, and which 64 core configuration... I have been out of the shopping industry for a bit, but it was my understanding the Opteron still peaked at 16 core (6200). 
> Secondly, there are more options than Core i7 versus Opteron... those two chips hardly represent the industry.  You have to choose hardware that represents your problem well -- some problems can take advantage of the E7 set of Intel chips for example (10 core, 20 pseudocore) and good on chip cache. 
> In our testing, we were lucky enough to have a quarter million in funding for R&D hardware, so we actually built 9 different setups, each very near 25k.  But, I would broaden your considerations... maybe a greater number of i3 chips would be best, maybe i7 EEs, maybe the best value for you is monster E7-8870's.  The i7 EE has 12 pseudocores and 15MB on chip cache on the high end versions now. 
> You can use a site like: http://www.cpubenchmark.net/high_end_cpus.html#cpuvalue to find the best value in CPU power to dollars -- from scanning that, it looks like the best bet at this time is the AMD Athlon II X4 631 Quad-Core -- but consider that you have to build boxes around that CPU, and the value starts to fall away a bit, more so if you have any I/O considerations.
> In our case, we had lots of different fairly close bottlenecks, including disk I/O, network I/O and CPU.  Building fewer bigger machines let us spend a good bit on high end network cards and SSD and remain in budget... which at the end of the day is what matters.  In our experience, the AMD "wrappers" (motherboard, etc) tended to run about 15% cheaper, so you might be able to use that as a sort of normalization for costing. 
> If I were building the system today (we built ours some time ago) -- I would probably start by looking at the i5-2400 and maybe the AMD Athlon II X4 (and see if you can build decent systems around it) ... your goal is to buy CPU-bound work units as cheaply as possible, without inducing other bottlenecks (buy a system with terrible I/O and maybe I/O is your new bottleneck). 
> -- 
> --Robert Melton
> -- 
> --Robert Melton
