VEX IR Language Syntax

/*---------------------------------------------------------------*/
/*--- High-level IR description ---*/
/*---------------------------------------------------------------*/

/* Vex IR is an architecture-neutral intermediate representation.
Unlike some IRs in systems similar to Vex, it is not like assembly
language (ie. a list of instructions). Rather, it is more like the
IR that might be used in a compiler.

Compared with assembly language, VEX IR is closer to the intermediate language used inside a compiler.

Code blocks
~~~~~~~~~~~
The code is broken into small code blocks ("superblocks", type:
'IRSB'). Each code block typically represents from 1 to perhaps 50
instructions. IRSBs are single-entry, multiple-exit code blocks.
Each IRSB contains three things:

An IRSB is a single-entry, multiple-exit code block, comparable to the Trace granularity in Intel Pin.

- a type environment, which indicates the type of each temporary
value present in the IRSB

[Example:]

(*ir_block).tyenv
    -types
      -[0] Ity_I32
      -[1] Ity_I32
    -types_size 0x00000008
    -types_used 0x00000002

 

types_used indicates how many temporaries are in use; the types array holds the type of each temporary.
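The relationship between types, types_size and types_used can be sketched in C as follows. This is a minimal mock, not the real libvex code: the names DemoType, DemoTypeEnv, demo_empty_tyenv, demo_new_temp and demo_type_of_temp are hypothetical, and the initial capacity of 8 and the doubling growth policy are assumptions chosen to match the dump above.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for VEX's type environment (hypothetical names,
   not the real libvex definitions). */
typedef enum { TY_INVALID = 0, TY_I8, TY_I16, TY_I32, TY_I64 } DemoType;

typedef struct {
    DemoType *types;      /* types[i] is the type of temporary t<i> */
    int       types_size; /* number of allocated slots */
    int       types_used; /* number of temporaries actually in use */
} DemoTypeEnv;

/* Create an empty environment with room for 8 temporaries (assumed capacity). */
static DemoTypeEnv demo_empty_tyenv(void)
{
    DemoTypeEnv env;
    env.types_size = 8;
    env.types_used = 0;
    env.types = malloc((size_t)env.types_size * sizeof *env.types);
    assert(env.types != NULL);
    return env;
}

/* Allocate a new temporary of type ty and return its index,
   doubling the array when every slot is taken. */
static int demo_new_temp(DemoTypeEnv *env, DemoType ty)
{
    if (env->types_used == env->types_size) {
        int new_size = 2 * env->types_size;
        DemoType *grown = realloc(env->types, (size_t)new_size * sizeof *grown);
        assert(grown != NULL);
        env->types = grown;
        env->types_size = new_size;
    }
    env->types[env->types_used] = ty;
    return env->types_used++;
}

/* Look up the type of temporary tmp; only indices below types_used are valid. */
static DemoType demo_type_of_temp(const DemoTypeEnv *env, int tmp)
{
    assert(tmp >= 0 && tmp < env->types_used);
    return env->types[tmp];
}
```

Calling demo_new_temp twice with TY_I32 on a fresh environment yields indices 0 and 1 and leaves types_size at 8 and types_used at 2, the same shape as the dump for t0 and t1.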

 

- a list of statements, which represent code
[Example:]

stmts_size    0x00000003    int
stmts_used    0x00000003    int
- (*ir_block).stmts[0]
    tag Ist_IMark
- (*ir_block).stmts[1]
    tag Ist_WrTmp
- (*ir_block).stmts[2]
    tag Ist_Put

 

Statements are likewise stored in the stmts array; stmts_used is the number of statements actually in use.

 

 

- a jump that exits from the end of the IRSB
[Example:]

jumpkind    Ijk_Boring

 

The final printed result is as follows:

0x77D699A0: movl %esi,%esp
IRSB {
   t0:I32   t1:I32           [2 temporaries]

   ------ IMark(0x77D699A0, 2, 0) ------ [3 statements, counting the IMark but not the last line, which updates the IP register and is generated automatically]
   t0 = GET:I32(32)           [the whole line is a statement; GET:I32(32) is an expression]
   PUT(24) = t0
   PUT(68) = 0x77D699A2:I32; exit-Boring
}

 

The second statement, stmts[1], can be broken down further:

- (*ir_block).stmts[1]
    tag    Ist_WrTmp
      .tmp    0
        .tag    Iex_Get
          .offset    32
          .ty    Ity_I32
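The breakdown above is a tagged union: a statement carries a tag plus tag-specific fields, and some of those fields point to expressions that are tagged in the same way. The C sketch below models just enough of that shape to reproduce the printed lines for a WrTmp and a Put. All Demo* names are hypothetical stand-ins rather than the libvex types, and the type is kept as a plain string for brevity.

```c
#include <stdio.h>

/* Hypothetical, cut-down expression: only Get and RdTmp are modelled. */
typedef enum { DEX_GET, DEX_RDTMP } DemoExprTag;

typedef struct {
    DemoExprTag tag;
    union {
        struct { int offset; const char *ty; } Get;   /* read guest state */
        struct { int tmp; } RdTmp;                    /* read a temporary */
    } u;
} DemoExpr;

/* Hypothetical, cut-down statement: only WrTmp and Put are modelled. */
typedef enum { DST_WRTMP, DST_PUT } DemoStmtTag;

typedef struct {
    DemoStmtTag tag;
    union {
        struct { int tmp; const DemoExpr *data; } WrTmp;   /* t<tmp> = data */
        struct { int offset; const DemoExpr *data; } Put;  /* PUT(offset) = data */
    } u;
} DemoStmt;

/* Format an expression into buf; returns snprintf's result, or -1 for an unknown tag. */
static int demo_fmt_expr(char *buf, size_t n, const DemoExpr *e)
{
    switch (e->tag) {
    case DEX_GET:
        return snprintf(buf, n, "GET:%s(%d)", e->u.Get.ty, e->u.Get.offset);
    case DEX_RDTMP:
        return snprintf(buf, n, "t%d", e->u.RdTmp.tmp);
    }
    return -1;
}

/* Format a statement into buf by dispatching on its tag first,
   then on the tag of the expression it holds. */
static int demo_fmt_stmt(char *buf, size_t n, const DemoStmt *s)
{
    char rhs[64];
    switch (s->tag) {
    case DST_WRTMP:
        demo_fmt_expr(rhs, sizeof rhs, s->u.WrTmp.data);
        return snprintf(buf, n, "t%d = %s", s->u.WrTmp.tmp, rhs);
    case DST_PUT:
        demo_fmt_expr(rhs, sizeof rhs, s->u.Put.data);
        return snprintf(buf, n, "PUT(%d) = %s", s->u.Put.offset, rhs);
    }
    return -1;
}
```

Building a DST_WRTMP statement with tmp 0 whose data is a DEX_GET at offset 32 with type "I32" formats as the same text as the second statement in the printed block.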

 

Because the blocks are multiple-exit, there can be additional
conditional exit statements that cause control to leave the IRSB
before the final exit. Also because of this, IRSBs can cover
multiple non-consecutive sequences of code (up to 3). These are
recorded in the type VexGuestExtents (see libvex.h).

Statements and expressions
~~~~~~~~~~~~~~~~~~~~~~~~~~
Statements (type 'IRStmt') represent operations with side-effects,
eg. guest register writes, stores, and assignments to temporaries.
Expressions (type 'IRExpr') represent operations without
side-effects, eg. arithmetic operations, loads, constants.
Expressions can contain sub-expressions, forming expression trees,
eg. (3 + (4 * load(addr1))).

Statements may have side effects, whereas expressions are pure: evaluating one changes no state.
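To make the expression-tree idea concrete, here is a small C sketch that evaluates a tree such as (3 + (4 * load(addr1))) against a read-only memory array. It is an illustration only: DemoOp, DemoNode and demo_eval are hypothetical names, memory is modelled as an array of 32-bit words indexed by word, and only four node kinds exist. Because the evaluator takes the memory as const and writes nothing but its own result, it mirrors the "no side effects" property of expressions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node kinds for a tiny expression tree. */
typedef enum { DX_CONST, DX_LOAD, DX_ADD, DX_MUL } DemoOp;

typedef struct DemoNode {
    DemoOp op;
    uint32_t value;              /* DX_CONST: the constant */
    const struct DemoNode *lhs;  /* DX_ADD/DX_MUL: left operand; DX_LOAD: address */
    const struct DemoNode *rhs;  /* DX_ADD/DX_MUL: right operand */
} DemoNode;

/* Evaluate the tree rooted at e. Memory is read-only and word-indexed.
   Returns 0 and stores the result in *out, or -1 if a load is out of range. */
static int demo_eval(const DemoNode *e, const uint32_t *mem,
                     size_t mem_words, uint32_t *out)
{
    uint32_t a, b;
    switch (e->op) {
    case DX_CONST:
        *out = e->value;
        return 0;
    case DX_LOAD:
        /* The address is itself a sub-expression. */
        if (demo_eval(e->lhs, mem, mem_words, &a) != 0 || a >= mem_words)
            return -1;
        *out = mem[a];
        return 0;
    case DX_ADD:
    case DX_MUL:
        if (demo_eval(e->lhs, mem, mem_words, &a) != 0 ||
            demo_eval(e->rhs, mem, mem_words, &b) != 0)
            return -1;
        *out = (e->op == DX_ADD) ? a + b : a * b;
        return 0;
    }
    return -1;
}
```

With a memory word of 10 at index 2 and addr1 a constant 2, the tree for 3 + (4 * load(addr1)) evaluates to 43, and evaluating it again gives the same answer because nothing was modified.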

ST denotes a store, which moves data from a register to memory; LD denotes a load, which moves data from memory to a register.

 


 

Expression types

typedef
   enum {
      Iex_Binder=0x15000,
      Iex_Get,
      Iex_GetI,
      Iex_RdTmp,
      Iex_Qop,
      Iex_Triop,
      Iex_Binop,
      Iex_Unop,
      Iex_Load,
      Iex_Const,
      Iex_Mux0X,
      Iex_CCall
   }
   IRExprTag;

 

 

 

Statement types

typedef
   enum {
      Ist_NoOp=0x19000,
      Ist_IMark,     /* META */
      Ist_AbiHint,   /* META */
      Ist_Put,
      Ist_PutI,
      Ist_WrTmp,
      Ist_Store,
      Ist_CAS,
      Ist_LLSC,
      Ist_Dirty,
      Ist_MBE,       /* META (maybe) */
      Ist_Exit
   }
   IRStmtTag;
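A typical way to consume these tags is a switch over the enum. The sketch below copies the IRStmtTag enum from above so that it compiles on its own (in real code it would presumably come from the VEX header instead), and adds two hypothetical helpers, demo_stmt_tag_name and demo_is_meta. Treating Ist_MBE as non-META is an assumption here, since the listing only marks it "META (maybe)".

```c
/* Copied from the listing above so the example is self-contained. */
typedef enum {
    Ist_NoOp = 0x19000,
    Ist_IMark,     /* META */
    Ist_AbiHint,   /* META */
    Ist_Put,
    Ist_PutI,
    Ist_WrTmp,
    Ist_Store,
    Ist_CAS,
    Ist_LLSC,
    Ist_Dirty,
    Ist_MBE,       /* META (maybe) */
    Ist_Exit
} IRStmtTag;

/* Hypothetical helper: map a tag to a printable name. */
static const char *demo_stmt_tag_name(IRStmtTag tag)
{
    switch (tag) {
    case Ist_NoOp:    return "Ist_NoOp";
    case Ist_IMark:   return "Ist_IMark";
    case Ist_AbiHint: return "Ist_AbiHint";
    case Ist_Put:     return "Ist_Put";
    case Ist_PutI:    return "Ist_PutI";
    case Ist_WrTmp:   return "Ist_WrTmp";
    case Ist_Store:   return "Ist_Store";
    case Ist_CAS:     return "Ist_CAS";
    case Ist_LLSC:    return "Ist_LLSC";
    case Ist_Dirty:   return "Ist_Dirty";
    case Ist_MBE:     return "Ist_MBE";
    case Ist_Exit:    return "Ist_Exit";
    }
    return "unknown";
}

/* Hypothetical helper: 1 for tags the listing marks unconditionally as META,
   0 otherwise. Ist_MBE is counted as non-META (an assumption). */
static int demo_is_meta(IRStmtTag tag)
{
    return tag == Ist_IMark || tag == Ist_AbiHint;
}
```

Since the enum starts at 0x19000 and counts upward, Ist_IMark is 0x19001 and Ist_Exit is 0x1900B, which is handy when a debugger shows a raw tag value instead of the name.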