<div dir="ltr"><br><br><div class="gmail_quote">On Thu, Oct 2, 2008 at 7:51 PM, Kostis Sagonas <span dir="ltr">&lt;<a href="mailto:kostis@cs.ntua.gr">kostis@cs.ntua.gr</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<div class="Ih2E3d">Hynek Vychodil wrote:<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

I would like to note that HiPE can cause performance loss for some special cases. There is no simple equation where you switch on HiPE and gain some amount of performance. Especially when you have a tight loop over not so much complex data structures you can gain big amount of performance.<br>


</blockquote>

<br></div>

It would have been much more helpful for the discussion if you presented some quantitative results (and preferably code) for your claims.  For example, what qualifies as "not so much complex data structures" (by the way, have you measured what happens in "complex" ones?), "big amount", "performance loss for some special cases" (what's "special" about these special cases), etc.</blockquote>

<div><br>My experience comes from the first Wide Finder Project, Cedric Beust's challenge and some other work. I have been too busy to write worthwhile synthetic tests, which I think are the only thing of value in this area. My observation is that the gain from HiPE depends strongly on how the task is solved. The fastest BEAM solution to a nontrivial task is often different from the fastest HiPE solution. From another point of view, the same code compiled with HiPE can range from 30 times faster to 20% slower. This means that switching on HiPE can be worthwhile for some OTP modules (in non-debug environments only), but only for some modules, not for all.<br>

</div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d"><br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

When you make a large number of inter-module calls you gain less <br>

</blockquote>

<br></div>

This is slightly misleading: there is no "bad" treatment of inter module calls (other than that dictated by the semantics of hot code loading). What does have a non-negligible cost is so called mode-switching calls: calling interpreted code from native or vice versa. The simplest way to avoid this is to native compile everything. Alas, currently this is not so easy to do for standard libraries.  But note that even without compiling everything, the speed up from native code is not something to disregard so quickly.<br>


<br>

For example, if I read Joel's mails correctly he got:<br>

<br>

3000 games finished<br>

Elapsed: 181.960007s, Average run time: 0.060653335 seconds (for BEAM)<br>

Elapsed:  62.517129s, Average run time: 0.020839043 seconds (for HiPE)<br>

<br>

without compiling any of the OTP libraries.  Joel can correct me if I am wrong here.<div class="Ih2E3d"><br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

and when you do a large amount of message passing you can lose some performance when you switch it on.<br>

</blockquote>

<br></div>

It would help more if, instead of rather unquantified statements, you or somebody else posted an actual application (*) showing a non-trivial performance loss when native compiled.  It would at least give us something to work and improve upon.<br>


<br>

Kostis<br>

<br>

(*) not just a synthetic benchmark<br>

</blockquote></div><br>Why not a synthetic benchmark? In my experience a good synthetic benchmark is worth far more than a "real" application, because it pinpoints exactly where the problem is and which approach is faster. I know, it is only worth something to someone who understands it. Application benchmarks are useful only for marketing and look good in PR materials. They show only how a thing works for one particular task and its solution; they tell nothing about how it will work for a slightly different solution or task. Synthetic benchmarks, by contrast, tell how each component works, so you can predict how it will behave in different tasks and solutions. It is harder, and maybe not so fashionable, but it is good engineering.<br clear="all">

<br>-- <br>--Hynek (Pichi) Vychodil<br>

</div>