
x86 ABI Analysis (How Functions Are Implemented) -- part 2


     As we all know, the function is an important concept in programming. At the moment, I do not
 know of any programming language that works without functions (maybe I am just new to this, and
 some special-purpose language can). Either way, the concept is clearly necessary. So the question
 is: how can we implement it in assembly language / machine code?


  1. Overview


     Three issues will be explained through the example below:
             a). How do we call a function?
             b). How do we build a stack frame for local variables?
             c). How do we pass parameters between caller and callee?

        #include <stdio.h>

        /*
         * This is an empty function; it does nothing.
         * It only exists to show how a function is called.
         */
        void call()
        {
        }

        /*
         * Shows how a stack frame is built.
         */
        void frame()
        {
                int b;
        }

        /*
         * Shows how parameters are passed and how a result is returned.
         */
        int parameters(int a, int b, int c)
        {
                int sum;
                sum = a + b + c;
                return sum;
        }

        int main()
        {
                int ret;
                call();
                frame();
                ret = parameters(1, 2, 3);
                return 0;
        }

        To discuss these issues, we need to translate the program into a lower-level language: assembly.
    In this example I use gcc on Linux to generate the assembly code. (There is a caveat here: different
    compilers may follow different conventions, or even a different Application Binary Interface, but
    they still share some common features.)
        The corresponding assembly code is:

        ......
        call:
                pushl  %ebp
                movl   %esp, %ebp
                popl   %ebp
                ret
        ......
        frame:
                pushl  %ebp
                movl   %esp, %ebp
                subl   $16, %esp
                leave
                ret
        ......
        parameters:
                pushl  %ebp
                movl   %esp, %ebp
                subl   $16, %esp
                movl   12(%ebp), %eax
                addl   8(%ebp), %eax
                addl   16(%ebp), %eax
                movl   %eax, -4(%ebp)
                movl   -4(%ebp), %eax
                leave
                ret
        ......
        main:
                leal   4(%esp), %ecx
                andl   $-16, %esp
                pushl  -4(%ecx)
                pushl  %ebp
                movl   %esp, %ebp
                pushl  %ecx
                subl   $28, %esp
                call   call
                call   frame
                movl   $3, 8(%esp)
                movl   $2, 4(%esp)
                movl   $1, (%esp)
                call   parameters
                movl   %eax, -8(%ebp)
                movl   $0, %eax
                addl   $28, %esp
                popl   %ecx
                popl   %ebp
                leal   -4(%ecx), %esp
                ret
        ......

        (Note: this is AT&T syntax, so the source operand comes first and the destination last.)

 2. How can we call a function?

        From the machine's point of view, calling a function amounts to redirecting the instruction
    stream. That sounds simple, but there is another problem: how do we return to the caller's
    instruction stream afterwards? A workable approach is to save the instruction pointer before
    jumping to the callee.
        Now let us look at this example:

        int main()
        {
                ...
                call();
                ...
        }

        void call()
        {
        }

        This is a simple function call. How can we implement it in assembly language? Examine the
    corresponding code.

        main:
                ......
                call  call
                ......

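    In @main, the function @call is invoked with the assembly instruction call. This instruction
    does the two things that are needed. First, it saves the return address (the value of @IP, which
    already points at the instruction following the call) on the stack. Second, it sets @IP to the
    address of the callee. Conceptually (this is pseudo-code; @IP cannot be pushed or written directly):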

            pushl %IP;
            movl  call, %IP;

        call:
                ....
                ret

        When the subroutine completes, the next step is to return to the previous instruction
    stream. The current state of the stack is:

            ....              <-- %EBP for caller
            ....
            0xeeee0000 <-- return address
                              <-- %ESP for caller
    So we just need to pop that value from the stack back into @IP, which is exactly what ret does:
           pop %IP;
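
    The call/ret pair described above can be sketched as a toy simulation in C. This is only an
    illustration under stated assumptions: the struct machine type, the helper names and the 16-slot
    stack are hypothetical, and the stack is modelled as an array of 32-bit words rather than real memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy machine; none of these names come from a real API. */
enum { STACK_WORDS = 16 };

struct machine {
    uint32_t stack[STACK_WORDS]; /* simulated stack, one slot per 32-bit word */
    uint32_t esp;                /* index of the top slot; the stack grows toward index 0 */
    uint32_t eip;                /* address of the next instruction to execute */
};

static void machine_init(struct machine *m, uint32_t start)
{
    m->esp = STACK_WORDS;        /* empty stack */
    m->eip = start;
}

static void push(struct machine *m, uint32_t value)
{
    assert(m->esp > 0);          /* no room left would be a stack overflow */
    m->stack[--m->esp] = value;
}

static uint32_t pop(struct machine *m)
{
    assert(m->esp < STACK_WORDS);
    return m->stack[m->esp++];
}

/* call: save the address of the instruction after the call, then jump. */
static void do_call(struct machine *m, uint32_t target, uint32_t call_length)
{
    push(m, m->eip + call_length); /* return address */
    m->eip = target;               /* continue in the callee */
}

/* ret: pop the saved return address back into the instruction pointer. */
static void do_ret(struct machine *m)
{
    m->eip = pop(m);
}
```

    For instance, starting with eip at 0x1000, do_call(&m, 0x2000, 5) leaves 0x1005 on the simulated
    stack and sets eip to 0x2000 (5 is the length of a near call with a 32-bit displacement); do_ret(&m)
    then sets eip back to 0x1005 and leaves the stack empty again.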

 3. How can we build a stack frame for local variable ?

        Local variables need one important property: reentrancy. We want the local variables of each
    function call to be independent, even when the function is called recursively. So we dynamically
    create a separate region of memory for every function call, which is called a stack frame. Now,
    examine the code.

        int main()
        {
            ...
            frame();
            ...
        }

        void frame()
        {
            int b;
        }

    Before we call @frame, everything is the same as in the example above. The current stack is:

        base address for main         <-- %EBP
        ....
        top address of stack of main  <-- %ESP

    When we call @frame, the new stack is:

        base address for main         <-- %EBP
        ....
        return address of caller
        top address of stack of main  <-- %ESP

    Then let us examine what the callee does. The assembly is:

        frame:
               pushl  %ebp
               movl   %esp, %ebp
               subl   $16, %esp
               leave
               ret

    As we can see, it saves the caller's frame information and then creates a new frame. When we execute the first instruction:

               pushl  %ebp

    the stack frame is

        bottom of stack of main          <-- %EBP
        ....
        return address of caller
        base address of frame of caller
        top of stack of main             <-- %ESP

     When we execute the second instruction:

              movl %ESP, %EBP;

     the new stack frame is:

        stack bottom of main
        ....
        return address of caller
        base address of frame of caller
        stack top of main                <-- %EBP
        stack top of callee              <-- %ESP

     In fact, the stack top of the caller is the stack bottom of the callee. So far we haven't allocated any memory for the local variables of this function, so the callee's stack bottom and stack top are temporarily the same. The next instruction allocates the space:

               subl  $16, %esp

     As we can see, 16 bytes are allocated, which is more than the variable needs, presumably for reasons such as stack alignment. In fact, only the first 4 bytes are used. The new stack is:

        stack bottom of main
        ....
        return address of caller
        base address of frame of caller
        stack top of main                <-- %EBP
        local variable b
        ....
        stack top of callee              <-- %ESP

    So far, we have built a valid stack frame for this new function call.
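
    The reentrancy property mentioned at the start of this section can also be observed from C itself.
    The sketch below is a hypothetical helper (record_frames is not part of the example program) that
    records the address of its local variable at every recursion depth. Because all of those calls are
    active at the same time, each one has its own frame and therefore its own copy of the variable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: stores the address of this call's local variable
 * in addrs[depth], then recurses until max_depth calls are active.
 */
static void record_frames(size_t depth, size_t max_depth, uintptr_t *addrs)
{
    volatile int local = (int)depth; /* volatile forces the variable to live in memory */

    assert(depth < max_depth);
    addrs[depth] = (uintptr_t)(volatile void *)&local;

    if (depth + 1 < max_depth)
        record_frames(depth + 1, max_depth, addrs);

    /* Read local after the recursive call, so it stays alive across that call. */
    (void)local;
}
```

    After record_frames(0, 4, addrs), the four recorded addresses are all different: one local
    variable per active call.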

    The next question is: how do we restore the caller's frame when the subroutine completes?
    That is easy. Recalling the frame above, we just need:

            movl %EBP, %ESP;
            pop %EBP;

        In fact, there is a simpler instruction, leave, which performs both of those steps.
    Now the stack is:

        stack bottom of main             <-- %EBP
        ....
        return address of caller
        stack top of main                <-- %ESP

    It looks like everything is back in order: the caller's frame has been restored.
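
    The prologue and the leave instruction from this section can be sketched as a small simulation in C.
    It is a minimal sketch under stated assumptions: struct cpu, push32, pop32, prologue and leave are
    hypothetical names, and the stack is a 256-byte array addressed in bytes, not real memory.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 256 };

/* Hypothetical toy CPU state: just a stack and the two frame registers. */
struct cpu {
    uint32_t mem[STACK_BYTES / 4]; /* simulated stack, addressed in bytes */
    uint32_t esp;                  /* stack pointer (byte address) */
    uint32_t ebp;                  /* frame base pointer (byte address) */
};

static void push32(struct cpu *c, uint32_t value)
{
    assert(c->esp >= 4);
    c->esp -= 4;                   /* the stack grows toward lower addresses */
    c->mem[c->esp / 4] = value;
}

static uint32_t pop32(struct cpu *c)
{
    assert(c->esp + 4 <= STACK_BYTES);
    uint32_t value = c->mem[c->esp / 4];
    c->esp += 4;
    return value;
}

/* pushl %ebp ; movl %esp, %ebp ; subl $locals, %esp */
static void prologue(struct cpu *c, uint32_t locals)
{
    push32(c, c->ebp);             /* save the caller's frame base */
    c->ebp = c->esp;               /* the new frame starts at the current top */
    assert(c->esp >= locals);
    c->esp -= locals;              /* reserve space for local variables */
}

/* leave  ==  movl %ebp, %esp ; popl %ebp */
static void leave(struct cpu *c)
{
    c->esp = c->ebp;               /* discard the local variables */
    c->ebp = pop32(c);             /* restore the caller's frame base */
}
```

    Starting from esp = 128 and ebp = 160 with a return address pushed, prologue(&c, 16) gives
    ebp = 120 and esp = 104; leave(&c) then brings back ebp = 160 and leaves esp = 124, pointing at
    the return address, ready for ret.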

 4. How can we pass parameters between caller and callee ?

        Passing parameters is usually necessary when we call a function. Where should this data
    be placed? Let us look at the example below, which calls a function with several parameters.

        int main()
        {
            int ret;
            ...
            ret = parameters(1, 2, 3);
            ...
        }

        int parameters(int a, int b, int c)
        {
            int sum;
            sum = a + b + c;
            return sum;
        }

    The corresponding assembly code is:

        main:
            ...
            movl  $3, 8(%esp)
            movl  $2, 4(%esp)
            movl  $1, (%esp)
            call  parameters
            movl  %eax, -8(%ebp)
            ...
        parameters:
            pushl %ebp
            movl  %esp, %ebp
            subl  $16, %esp
            movl  12(%ebp), %eax
            addl  8(%ebp), %eax
            addl  16(%ebp), %eax
            movl  %eax, -4(%ebp)
            movl  -4(%ebp), %eax
            leave
            ret

     Now, examine these instructions. Before we call this function, the stack is:

            stack bottom of main  <-- %EBP
            ...
            stack top of main     <-- %ESP

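    Then we place the three parameters on the stack in reverse order (gcc stores them with movl into space that main has already reserved, rather than using push):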

            stack bottom of main  <-- %EBP
            ...
            3                     <-- 3rd parameter
            2
            1
            stack top of main     <-- %ESP

    Then, 

         call parameters            

    This is the same as what we analyzed above: we jump to the new instruction stream and build a new stack frame.

            stack bottom of main
            ...
            3                        <-- 3rd parameter
            2
            1
            return address of caller
            base address of frame of caller
            stack top of main        <-- %EBP
            ...
            stack top of callee      <-- %ESP

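    In the subroutine, whenever we need a parameter, we can simply read it at a fixed offset from %EBP: 8(%ebp) is a, 12(%ebp) is b and 16(%ebp) is c. The result is returned to the caller in %eax.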
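
    To make the offsets concrete, the sketch below replays the instructions of @parameters and of
    its call site on a simulated stack in C. It is an illustration only: struct cpu, load, store,
    callee_parameters and call_parameters are hypothetical names, the stack is a 256-byte array, and
    0xeeee0000 is just the placeholder return address used earlier in this article.

```c
#include <assert.h>
#include <stdint.h>

enum { STACK_BYTES = 256 };

/* Hypothetical toy CPU state for this sketch. */
struct cpu {
    uint32_t mem[STACK_BYTES / 4]; /* simulated stack, addressed in bytes */
    uint32_t esp, ebp, eax;
};

static uint32_t load(const struct cpu *c, uint32_t addr)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    return c->mem[addr / 4];
}

static void store(struct cpu *c, uint32_t addr, uint32_t value)
{
    assert(addr % 4 == 0 && addr < STACK_BYTES);
    c->mem[addr / 4] = value;
}

/* Body of parameters(); the return address is already on the stack. */
static void callee_parameters(struct cpu *c)
{
    c->esp -= 4; store(c, c->esp, c->ebp);  /* pushl %ebp */
    c->ebp = c->esp;                        /* movl  %esp, %ebp */
    c->esp -= 16;                           /* subl  $16, %esp */
    c->eax  = load(c, c->ebp + 12);         /* movl  12(%ebp), %eax   (b) */
    c->eax += load(c, c->ebp + 8);          /* addl  8(%ebp), %eax    (a) */
    c->eax += load(c, c->ebp + 16);         /* addl  16(%ebp), %eax   (c) */
    store(c, c->ebp - 4, c->eax);           /* movl  %eax, -4(%ebp)   (sum) */
    c->eax = load(c, c->ebp - 4);           /* movl  -4(%ebp), %eax */
    c->esp = c->ebp;                        /* leave: movl %ebp, %esp */
    c->ebp = load(c, c->esp); c->esp += 4;  /*        popl %ebp */
}

/* Call site in main(): store the arguments, call, and read the result. */
static uint32_t call_parameters(struct cpu *c, uint32_t a, uint32_t b, uint32_t d)
{
    store(c, c->esp + 8, d);                /* movl $3, 8(%esp) */
    store(c, c->esp + 4, b);                /* movl $2, 4(%esp) */
    store(c, c->esp, a);                    /* movl $1, (%esp) */
    c->esp -= 4;
    store(c, c->esp, 0xeeee0000u);          /* call: push a placeholder return address */
    callee_parameters(c);
    c->esp += 4;                            /* ret: pop the return address */
    return c->eax;                          /* the result comes back in %eax */
}
```

    With esp = 128 and ebp = 160 initially, call_parameters(&c, 1, 2, 3) returns 6 and leaves both
    registers at their original values, which mirrors what main observes around the real call.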