Mac Intel

Tue Aug 15 08:50:41 CEST 2006

On Mon, 14 Aug 2006 03:17:01 +0100, Joel Reymont wrote:
>One last thing...
>
>The 0xdd opcode corresponds to fstpl (0xddd8). The FPU IP seems to be  
>pointing to fstpl below, in the dump of do_fmul. I don't understand,  
>though, why mc->fs.fpu_mxcsr & 0x000F is true since fstpl is not a  
>SSE2 instruction.
>
>I'm gonna revisit this tomorrow but so far I see that 1) the test  
>program is looping, repeatedly firing exceptions, 2) MXCSR has bits  
>in it set, 3) the IP is not a SSE2 instruction and 4) I clear the FPU  
>state with *((unsigned short *)&mc->fs.fpu_fsw) &= ~0xFF; before  
>exiting the SIGFPE handler.
>
>--
>
>void do_fmul(void)
>{
>     res = atof(a) * atof(b);
>}
>
>Dump of assembler code for function do_fmul:
>0x00001d6d <do_fmul+0>: push   %ebp
>0x00001d6e <do_fmul+1>: mov    %esp,%ebp
>0x00001d70 <do_fmul+3>: push   %ebx
>0x00001d71 <do_fmul+4>: sub    $0x24,%esp
>0x00001d74 <do_fmul+7>: call   0x1ffc <__i686.get_pc_thunk.bx>
>0x00001d79 <do_fmul+12>:        lea    663(%ebx),%eax
>0x00001d7f <do_fmul+18>:        mov    (%eax),%eax
>0x00001d81 <do_fmul+20>:        mov    %eax,(%esp)
>0x00001d84 <do_fmul+23>:        call   0x302c <dyld_stub_atof>
>0x00001d89 <do_fmul+28>:        fstpl  -24(%ebp)
>0x00001d8c <do_fmul+31>:        lea    667(%ebx),%eax
>0x00001d92 <do_fmul+37>:        mov    (%eax),%eax
>0x00001d94 <do_fmul+39>:        mov    %eax,(%esp)
>0x00001d97 <do_fmul+42>:        call   0x302c <dyld_stub_atof>
>0x00001d9c <do_fmul+47>:        fstpl  -16(%ebp)
>0x00001d9f <do_fmul+50>:        movsd  -24(%ebp),%xmm0
>0x00001da4 <do_fmul+55>:        mulsd  -16(%ebp),%xmm0
>0x00001da9 <do_fmul+60>:        lea    4755(%ebx),%eax
>0x00001daf <do_fmul+66>:        mov    (%eax),%eax
>0x00001db1 <do_fmul+68>:        movsd  %xmm0,(%eax)
>0x00001db5 <do_fmul+72>:        add    $0x24,%esp
>0x00001db8 <do_fmul+75>:        pop    %ebx
>0x00001db9 <do_fmul+76>:        pop    %ebp
>0x00001dba <do_fmul+77>:        ret

I strongly suspect that you're looking at the wrong part of the
ucontext/mcontext structure that gets passed to the SIGFPE handler.
What you want to look at is the application's current EIP, not the
FP IP embedded in the FP state. (The FP IP is x87-only and unrelated
to SSE2: it records the address of the last x87 instruction executed,
which would explain why it points at an fstpl.)
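
To make the x87/SSE2 split concrete: the x87 status word and MXCSR are
separate registers, each with its own sticky exception flags in bits 0-5.
The following is only a sketch; the helper names are made up for
illustration and are not part of any system API. It works on plain
integers, so the fpu_mxcsr and fpu_fsw values from the context can be fed
to it directly. Note that MXCSR has six flag bits, so a 0x000F test would
miss underflow and precision.

```c
#include <stdint.h>

/* Bit layout as documented in the Intel SDM; helper names are hypothetical. */
#define MXCSR_FLAGS      0x003Fu /* IE DE ZE OE UE PE: sticky flags, bits 0-5 */
#define MXCSR_MASK_SHIFT 7       /* IM..PM: exception mask bits, bits 7-12 */
#define X87_FSW_FLAGS    0x00FFu /* six flags plus SF (bit 6) and ES (bit 7) */

/* SSE flags that are set while their mask bit is clear, i.e. the
   conditions that could have raised an unmasked SIMD FP exception. */
uint32_t mxcsr_unmasked_flags(uint32_t mxcsr)
{
    uint32_t flags = mxcsr & MXCSR_FLAGS;
    uint32_t masks = (mxcsr >> MXCSR_MASK_SHIFT) & MXCSR_FLAGS;
    return flags & ~masks;
}

/* Clear the sticky SSE exception flags, leaving masks and rounding alone. */
uint32_t mxcsr_clear_flags(uint32_t mxcsr)
{
    return mxcsr & ~MXCSR_FLAGS;
}

/* Clear the x87 flags, stack-fault and error-summary bits (same as &= ~0xFF). */
uint16_t fsw_clear_flags(uint16_t fsw)
{
    return (uint16_t)(fsw & ~X87_FSW_FLAGS);
}
```

Whether leftover MXCSR flags have anything to do with the loop I can't
tell from here. As far as I know, SSE exceptions are raised by the
instruction that produces the condition, not by stale flag bits.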

There should be a group of 8 or 16 general-purpose (integer)
registers somewhere in the context structure, together with
some flags and the EIP.
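
A minimal sketch of reading that EIP in a three-argument (SA_SIGINFO)
handler follows. The field names are assumptions that depend on the OS and
SDK version: newer Darwin headers spell them __ss.__eip (older ones use
ss.eip), and glibc uses gregs[REG_EIP]. The names context_pc, on_signal
and install_handler are made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* glibc: expose REG_EIP / REG_RIP */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#if defined(__linux__)
#include <ucontext.h>
#endif

volatile sig_atomic_t handler_ran;  /* set once the handler has run */
volatile uintptr_t faulting_pc;     /* PC taken from the integer registers */

/* Interrupted program counter from the general-purpose register group,
   or 0 where this sketch does not know the layout. */
uintptr_t context_pc(const ucontext_t *uc)
{
#if defined(__APPLE__) && defined(__i386__)
    return (uintptr_t)uc->uc_mcontext->__ss.__eip;
#elif defined(__APPLE__) && defined(__x86_64__)
    return (uintptr_t)uc->uc_mcontext->__ss.__rip;
#elif defined(__linux__) && defined(__i386__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_EIP];
#elif defined(__linux__) && defined(__x86_64__)
    return (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#else
    (void)uc;
    return 0;
#endif
}

/* SA_SIGINFO handler: the third argument is the ucontext_t. */
void on_signal(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)info;
    faulting_pc = context_pc((const ucontext_t *)ctx);
    handler_ran = 1;
}

/* Install on_signal for signo; returns 0 on success, like sigaction(). */
int install_handler(int signo)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```

With install_handler(SIGFPE) in place, faulting_pc can be compared against
the do_fmul disassembly to see whether the trap comes from the mulsd
rather than from an fstpl.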

Note: the code above shows that Apple hasn't converted completely
to SSE2. Clearly atof() returns its value on top of the x87 stack,
then the code moves that value via memory to the SSE2 registers
before performing an SSE2 multiply. I suspect that their calling
conventions stipulate x87 usage for binary compatibility, while the
compiler has been configured to generate code for a modern SSE2-capable
CPU by default.
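
To reproduce the mixed sequence elsewhere, something like the following
should do. The input strings are placeholders of mine, not the values from
the original test program. With GCC on a 32-bit x86 target, compiling with
-m32 -msse2 -mfpmath=sse -S should show the same pattern: atof() returns in
st(0), and the value is spilled with fstpl before the movsd/mulsd pair.

```c
#include <stdlib.h>

double res;
const char *a = "1.5";  /* placeholder input */
const char *b = "2.0";  /* placeholder input */

/* Same body as the quoted do_fmul: two calls returning double, then a multiply. */
void do_fmul(void)
{
    res = atof(a) * atof(b);
}
```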