使用Gem5自定义RISC-V指令集

环境搭建

Gem5 编译和示例运行

Gem5的编译,需要依赖一些库文件,在ubuntu上面可以执行下条命令完成安装:

sudo apt-get install git build-essential python-dev python scons zlib1g zlib1g-dev m4 libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev

当所有这些库文件安装好以后,就可以克隆gem5的仓库了,有两个链接:

git clone https://gem5.googlesource.com/public/gem5
https://github.com/gem5/gem5(会以每15分钟的频率同步上面的链接)

接下来就可以使用scons工具编译Gem5模拟器了,里面有不同的ISA模型,选择RISC-V就行,在编译的时候,可以根据需要选择不同的类型和CPU模型,具体可以查看Gem5的帮助文档:

scons build/riscv/gem5.opt -j4

第一次编译,会花费比较长的时间,后续的编译都是增量编译。 如果编译过程没有什么问题,你就会得到一个模拟器的可执行文件:gem5.opt。为了验证正确与否,可以执行以下预先编译好的hello world程序。

./build/RISCV/gem5.opt configs/example/se.py -c tests/test-progs/hello/bin/riscv/linux/hello --output=sim.out --errout=sim.err

执行的log如下所示:

gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Nov 25 2019 17:19:20
gem5 started Nov 28 2019 14:07:22
gem5 executing on llvm-vm, pid 61827
command line: ./build/RISCV/gem5.opt configs/example/se.py -c tests/test-progs/hello/bin/riscv/linux/hello --output=sim.out --errout=sim.err

Global frequency set at 1000000000000 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
0: system.remote_gdb: listening for remote gdb on port 7000
**** REAL SIMULATION ****
info: Entering event queue @ 0.  Starting simulation...
info: Increasing stack size by one page.
Exiting @ tick 2988000 because exiting with last active thread context
junningwu@llvm-vm:~/workspace/gem5$ cat m5out/sim.out 
Hello world!

至此,一个可以耍起来的Gem5环境已经搭建完毕。

RISC-V编译环境搭建

想要让Gem5模拟器执行RISC-V的程序代码,就还需要一个交叉编译环境(Cross Compiler),注意这里需要的是Linux的交叉工具链(riscv64-unknown-linux-gnu),而不是ELF的(riscv64-unknown-elf)。

当然,最直接的方式就是下载riscv官方仓库的代码进行编译,然而,无奈博主所在单位的网络不行,下载一直持续性中断,其他人可以尝试一下。

git clone https://github.com/riscv/riscv-gnu-toolchain
cd riscv-gnu-toolchain
git submodule update --init --recursive

博主尝试找了两个提供Linux交叉工具链的网站,但是都不凑效,一家是Arch Linux,另外一家是平头哥,感兴趣的可以下载来尝试尝试。平头哥家的工具链,添加了一些自己的指令,比如addsl,gem5的编译器不支持。

1. https://www.archlinux.org/packages/community/x86_64/riscv64-linux-gnu-gcc/
2. https://www.t-head.cn/file/download?spm=a2ouz.12987056.0.0.356648abVoxMr2&file=1571624106073/T-Head%20Tools%20package.zip

无奈之下,博主下载了CNRV(感谢群头,感谢SiFive),尽管最后的更新停留在2018.7.11日,但是下载好后,编译都正常,最重要的是可以运行。

https://pan.baidu.com/s/1J9N2VvfY9D6zakh8aMO5rg
GitHash: 397c395
MD5Sum: 63b711236a118df48d035b65b1fa5065
打包时间: 2018-Jun-11

解压编译安装后,你就会获得下列工具:

./configure --prefix=/opt/riscv
make

or

./configure --prefix=/opt/riscv
make linux

or

./configure --prefix=/opt/riscv --enable-multilib
make linux
junningwu@llvm-vm:~/workspace/freedom/rocket-chip/riscv-tools/riscv-tc/bin$ ls
elf2hex                              riscv64-unknown-linux-gnu-as
openocd                              riscv64-unknown-linux-gnu-c++
riscv64-unknown-elf-addr2line        riscv64-unknown-linux-gnu-c++filt
riscv64-unknown-elf-ar               riscv64-unknown-linux-gnu-cpp
riscv64-unknown-elf-as               riscv64-unknown-linux-gnu-elfedit
riscv64-unknown-elf-c++              riscv64-unknown-linux-gnu-g++
riscv64-unknown-elf-c++filt          riscv64-unknown-linux-gnu-gcc
riscv64-unknown-elf-cpp              riscv64-unknown-linux-gnu-gcc-7.2.0
riscv64-unknown-elf-elfedit          riscv64-unknown-linux-gnu-gcc-ar
riscv64-unknown-elf-g++              riscv64-unknown-linux-gnu-gcc-nm
riscv64-unknown-elf-gcc              riscv64-unknown-linux-gnu-gcc-ranlib
riscv64-unknown-elf-gcc-7.2.0        riscv64-unknown-linux-gnu-gcov
riscv64-unknown-elf-gcc-ar           riscv64-unknown-linux-gnu-gcov-dump
riscv64-unknown-elf-gcc-nm           riscv64-unknown-linux-gnu-gcov-tool
riscv64-unknown-elf-gcc-ranlib       riscv64-unknown-linux-gnu-gdb
riscv64-unknown-elf-gcov             riscv64-unknown-linux-gnu-gfortran
riscv64-unknown-elf-gcov-dump        riscv64-unknown-linux-gnu-gprof
riscv64-unknown-elf-gcov-tool        riscv64-unknown-linux-gnu-ld
riscv64-unknown-elf-gdb              riscv64-unknown-linux-gnu-ld.bfd
riscv64-unknown-elf-gprof            riscv64-unknown-linux-gnu-nm
riscv64-unknown-elf-ld               riscv64-unknown-linux-gnu-objcopy
riscv64-unknown-elf-ld.bfd           riscv64-unknown-linux-gnu-objdump
riscv64-unknown-elf-nm               riscv64-unknown-linux-gnu-ranlib
riscv64-unknown-elf-objcopy          riscv64-unknown-linux-gnu-readelf
riscv64-unknown-elf-objdump          riscv64-unknown-linux-gnu-run
riscv64-unknown-elf-ranlib           riscv64-unknown-linux-gnu-size
riscv64-unknown-elf-readelf          riscv64-unknown-linux-gnu-strings
riscv64-unknown-elf-run              riscv64-unknown-linux-gnu-strip
riscv64-unknown-elf-size             spike
riscv64-unknown-elf-strings          spike-dasm
riscv64-unknown-elf-strip            termios-xspike
riscv64-unknown-linux-gnu-addr2line  xspike
riscv64-unknown-linux-gnu-ar

编写一个简单的hello world程序,例如

#include <stdio.h>

int main(int argc, char* argv[])
{
    printf("Hello Haawking!\n");
    return 0;
}

编译程序:

../../../../freedom/rocket-chip/riscv-tools/riscv-tc/bin/riscv64-unknown-linux-gnu-gcc -static -v -o hello hello.c

执行的log如下所示:

junningwu@llvm-vm:~/workspace/gem5$ ./build/RISCV/gem5.opt configs/example/se.py -c tests/test-progs/hello_haawking/hello --output=sim.out --errout=sim.err
gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Nov 25 2019 17:19:20
gem5 started Nov 28 2019 11:46:28
gem5 executing on llvm-vm, pid 61610
command line: ./build/RISCV/gem5.opt configs/example/se.py -c tests/test-progs/hello_haawking/hello --output=sim.out --errout=sim.err

Global frequency set at 1000000000000 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
0: system.remote_gdb: listening for remote gdb on port 7000
**** REAL SIMULATION ****
info: Entering event queue @ 0.  Starting simulation...
info: Increasing stack size by one page.
Exiting @ tick 3370000 because exiting with last active thread context   
junningwu@llvm-vm:~/workspace/gem5$ cat m5out/sim.out 
Hello Haawking!

工具链新增指令

在工具链的不同层级加入一条或多条指令,其工作量和编程友好度是不同的,最直接也是最低层次的一种方法是添加机器码,这种方法最为直接,但是也是效率最低的一种方式,特别是当指令数量增加的时候,程序员需要记得一串二进制数字代表的指令含义。这种方法适用于添加一两条指令的初期探索阶段。

.word 0x34567bd0

另外一种更高层次的添加指令的方式是通过修改Binutlis的insn,这种层次的修改,相对于机器码来说,更高效一些。例如,一条add指令,其insn格式为

add a0,a1,a2  <------->   .insn r 0x33,0,0,a0,a1,a2

The RISC-V Instruction Set Manual Volume I: User-Level ISA lists 12 instruction formats where some of the formats have multiple variants. For the ‘.insn’ pseudo directive the assembler recognizes some of the formats. Typically, the most general variant of the instruction format is used by the ‘.insn’ directive. RISC-V Instruction Formats

R type: .insn r opcode, func3, func7, rd, rs1, rs2
     +-------+-----+-----+-------+----+-------------+
     | func7 | rs2 | rs1 | func3 | rd |      opcode |
     +-------+-----+-----+-------+----+-------------+
     31      25    20    15      12   7             0

R type with 4 register operands: .insn r opcode, func3, func2, rd, rs1, rs2, rs3
     +-----+-------+-----+-----+-------+----+-------------+
     | rs3 | func2 | rs2 | rs1 | func3 | rd |      opcode |
     +-----+-------+-----+-----+-------+----+-------------+
     31    27      25    20    15      12   7             0

I type: .insn i opcode, func3, rd, rs1, simm12
     +-------------+-----+-------+----+-------------+
     |      simm12 | rs1 | func3 | rd |      opcode |
     +-------------+-----+-------+----+-------------+
     31            20    15      12   7             0

S type: .insn s opcode, func3, rd, rs1, simm12
     +--------------+-----+-----+-------+-------------+-------------+
     | simm12[11:5] | rs2 | rs1 | func3 | simm12[4:0] |      opcode |
     +--------------+-----+-----+-------+-------------+-------------+
     31             25    20    15      12            7             0

SB type: .insn sb opcode, func3, rd, rs1, symbol
SB type: .insn sb opcode, func3, rd, simm12(rs1)
     +--------------+-----+-----+-------+-------------+-------------+
     | simm21[11:5] | rs2 | rs1 | func3 | simm12[4:0] |      opcode |
     +--------------+-----+-----+-------+-------------+-------------+
     31             25    20    15      12            7             0

U type: .insn u opcode, rd, simm20
     +---------------------------+----+-------------+
     |                    simm20 | rd |      opcode |
     +---------------------------+----+-------------+
     31                          12   7             0

UJ type: .insn uj opcode, rd, symbol
     +------------+--------------+------------+---------------+----+-------------+
     | simm20[20] | simm20[10:1] | simm20[11] | simm20[19:12] | rd |      opcode |
     +------------+--------------+------------+---------------+----+-------------+
     31           30             21           20              12   7             0

CR type: .insn cr opcode2, func4, rd, rs2
     +---------+--------+-----+---------+
     |   func4 | rd/rs1 | rs2 | opcode2 |
     +---------+--------+-----+---------+
     15        12       7     2        0

CI type: .insn ci opcode2, func3, rd, simm6
     +---------+-----+--------+-----+---------+
     |   func3 | imm | rd/rs1 | imm | opcode2 |
     +---------+-----+--------+-----+---------+
     15        13    12       7     2         0

CIW type: .insn ciw opcode2, func3, rd, uimm8
     +---------+--------------+-----+---------+
     |   func3 |          imm | rd' | opcode2 |
     +---------+--------------+-----+---------+
     15        13             7     2         0

CA type: .insn ca opcode2, func6, func2, rd, rs2
     +---------+----------+-------+------+--------+
     |   func6 | rd'/rs1' | func2 | rs2' | opcode |
     +---------+----------+-------+------+--------+
     15        10         7       5      2        0

CB type: .insn cb opcode2, func3, rs1, symbol
     +---------+--------+------+--------+---------+
     |   func3 | offset | rs1' | offset | opcode2 |
     +---------+--------+------+--------+---------+
     15        13       10     7        2         0

CJ type: .insn cj opcode2, symbol
     +---------+--------------------+---------+
     |   func3 |        jump target | opcode2 |
     +---------+--------------------+---------+
     15        13             7     2         0
	 

第三种方式,相较于第二种,编程更为友好一些。需要借助于riscv-opcodes工具以及riscv-binutils-gdb工具的修改。这种方法,程序员可以在c程序中内嵌汇编或者直接编写汇编程序,反汇编之后也可以看到相应的指令,例如,当增加一条mod指令后,就可以这样编程:

  asm volatile
  (
    "mod   %[z], %[x], %[y]\n\t"
    : [z] "=r" (c)
    : [x] "r" (a), [y] "r" (b)
  ) ; 

第四种方法,就是需要修改编译器,使得用户可以直接编写C等高级语言的应用程序。

修改编译器

如果编译器没有修改,直接编译上面的程序代码,就会报类似的错误:

junningwu@llvm-vm:~/workspace/gem5/tests/test-progs/mod_test$ ../../../../freedom/rocket-chip/riscv-tools/riscv-tc/bin/riscv64-unknown-linux-gnu-gcc --static -o mod_test mod_test.c 
mod_test.c: Assembler messages:
mod_test.c:38: Error: unrecognized opcode `mod a5,a5,a4'

这部分主要参考NITISH SRIVASTAVA 的这篇教程,感兴趣的可以看原文,我这里只是进行一些修改记录。Adding custom instruction to RISCV ISA and running it on gem5 and spike 首先在riscv-tools/riscv-opcodes/opcodes文件中添加下列代码,这里只是教程的示例,后面可能需要根据自己的设计,修改编码内容和赋值。

mod     rd rs1 rs2 31..25=1  14..12=0 6..2=0x1A 1..0=3

然后在当前目录下运行下列代码,获取MATCH和MASK。

cat opcodes-pseudo opcodes opcodes-rvc opcodes-rvc-pseudo opcodes-custom | ./parse-opcodes -c > ~/temp.h
#define MATCH_MOD 0x200006b                                                    
#define MASK_MOD 0xfe00707f

之后就是修改riscv-gnu-toolchain/riscv-binutils-gdb/include/opcode/riscv-opc.h和riscv-gnu-toolchain/riscv-binutils-gdb/opcodes/riscv-opc.c,分别修改头文件和源文件。在头文件里添加MATCH和MASK,在源文件里,添加如下代码:

const struct riscv_opcode riscv_opcodes[] =                                     
{                                                                               
/* name,      isa,   operands, match, mask, match_func, pinfo.  */              
....
....
....
{"mod",       "I",   "d,s,t",  MATCH_MOD, MASK_MOD, match_opcode, 0 }
....
....

关于结构体riscv_opcode的注释:

/* This structure holds information for a particular instruction.  */
struct riscv_opcode
{
  /* The name of the instruction.  */
  const char *name;
  /* The requirement of xlen for the instruction, 0 if no requirement.  */
  unsigned xlen_requirement;
  /* Class to which this instruction belongs.  Used to decide whether or
     not this instruction is legal in the current -march context.  */
  enum riscv_insn_class insn_class;
  /* A string describing the arguments for this instruction.  */
  const char *args;
  /* The basic opcode for the instruction.  When assembling, this
     opcode is modified by the arguments to produce the actual opcode
     that is used.  If pinfo is INSN_MACRO, then this is 0.  */
  insn_t match;
  /* If pinfo is not INSN_MACRO, then this is a bit mask for the
     relevant portions of the opcode when disassembling.  If the
     actual opcode anded with the match field equals the opcode field,
     then we have found the correct instruction.  If pinfo is
     INSN_MACRO, then this field is the macro identifier.  */
  insn_t mask;
  /* A function to determine if a word corresponds to this instruction.
     Usually, this computes ((word & mask) == match).  */
  int (*match_func) (const struct riscv_opcode *op, insn_t word);
  /* For a macro, this is INSN_MACRO.  Otherwise, it is a collection
     of bits describing the instruction, notably any relevant hazard
     information.  */
  unsigned long pinfo;
};

其实,我们发现RISC-V的opcode以及工具链大多沿用MIPS的,因此在MIPS opc的头文件里发现了一些更为详细的注释结果,可以供参考:

/* This structure holds information for a particular instruction.  */
struct mips_opcode
{
  /* The name of the instruction.  */
  const char *name;
  /* A string describing the arguments for this instruction.  */
  const char *args;
  /* The basic opcode for the instruction.  When assembling, this
     opcode is modified by the arguments to produce the actual opcode
     that is used.  If pinfo is INSN_MACRO, then this is 0.  */
  unsigned long match;
  /* If pinfo is not INSN_MACRO, then this is a bit mask for the
     relevant portions of the opcode when disassembling.  If the
     actual opcode anded with the match field equals the opcode field,
     then we have found the correct instruction.  If pinfo is
     INSN_MACRO, then this field is the macro identifier.  */
  unsigned long mask;
  /* For a macro, this is INSN_MACRO.  Otherwise, it is a collection
     of bits describing the instruction, notably any relevant hazard
     information.  */
  unsigned long pinfo;
  /* A collection of additional bits describing the instruction. */
  unsigned long pinfo2;
  /* A collection of bits describing the instruction sets of which this
     instruction or macro is a member. */
  unsigned long membership;
};
/* These are the characters which may appear in the args field of an
   instruction.  They appear in the order in which the fields appear
   when the instruction is used.  Commas and parentheses in the args
   string are ignored when assembling, and written into the output
   when disassembling.
   Each of these characters corresponds to a mask field defined above.
   "1" 5 bit sync type (OP_*_SHAMT)
   "<" 5 bit shift amount (OP_*_SHAMT)
   ">" shift amount between 32 and 63, stored after subtracting 32 (OP_*_SHAMT)
   "a" 26 bit target address (OP_*_TARGET)
   "b" 5 bit base register (OP_*_RS)
   "c" 10 bit breakpoint code (OP_*_CODE)
   "d" 5 bit destination register specifier (OP_*_RD)
   "h" 5 bit prefx hint (OP_*_PREFX)
   "i" 16 bit unsigned immediate (OP_*_IMMEDIATE)
   "j" 16 bit signed immediate (OP_*_DELTA)
   "k" 5 bit cache opcode in target register position (OP_*_CACHE)
       Also used for immediate operands in vr5400 vector insns.
   "o" 16 bit signed offset (OP_*_DELTA)
   "p" 16 bit PC relative branch target address (OP_*_DELTA)
   "q" 10 bit extra breakpoint code (OP_*_CODE2)
   "r" 5 bit same register used as both source and target (OP_*_RS)
   "s" 5 bit source register specifier (OP_*_RS)
   "t" 5 bit target register (OP_*_RT)
   "u" 16 bit upper 16 bits of address (OP_*_IMMEDIATE)
   "v" 5 bit same register used as both source and destination (OP_*_RS)
   "w" 5 bit same register used as both target and destination (OP_*_RT)
   "U" 5 bit same destination register in both OP_*_RD and OP_*_RT
       (used by clo and clz)
   "C" 25 bit coprocessor function code (OP_*_COPZ)
   "B" 20 bit syscall/breakpoint function code (OP_*_CODE20)
   "J" 19 bit wait function code (OP_*_CODE19)
   "x" accept and ignore register name
   "z" must be zero register
   "K" 5 bit Hardware Register (rdhwr instruction) (OP_*_RD)
   "+A" 5 bit ins/ext/dins/dext/dinsm/dextm position, which becomes
        LSB (OP_*_SHAMT; OP_*_EXTLSB or OP_*_STYPE may be used for
        microMIPS compatibility).
	Enforces: 0 <= pos < 32.
   "+B" 5 bit ins/dins size, which becomes MSB (OP_*_INSMSB).
	Requires that "+A" or "+E" occur first to set position.
	Enforces: 0 < (pos+size) <= 32.
   "+C" 5 bit ext/dext size, which becomes MSBD (OP_*_EXTMSBD).
	Requires that "+A" or "+E" occur first to set position.
	Enforces: 0 < (pos+size) <= 32.
	(Also used by "dext" w/ different limits, but limits for
	that are checked by the M_DEXT macro.)
   "+E" 5 bit dinsu/dextu position, which becomes LSB-32 (OP_*_SHAMT).
	Enforces: 32 <= pos < 64.
   "+F" 5 bit "dinsm/dinsu" size, which becomes MSB-32 (OP_*_INSMSB).
	Requires that "+A" or "+E" occur first to set position.
	Enforces: 32 < (pos+size) <= 64.
   "+G" 5 bit "dextm" size, which becomes MSBD-32 (OP_*_EXTMSBD).
	Requires that "+A" or "+E" occur first to set position.
	Enforces: 32 < (pos+size) <= 64.
   "+H" 5 bit "dextu" size, which becomes MSBD (OP_*_EXTMSBD).
	Requires that "+A" or "+E" occur first to set position.
	Enforces: 32 < (pos+size) <= 64.
   Floating point instructions:
   "D" 5 bit destination register (OP_*_FD)
   "M" 3 bit compare condition code (OP_*_CCC) (only used for mips4 and up)
   "N" 3 bit branch condition code (OP_*_BCC) (only used for mips4 and up)
   "S" 5 bit fs source 1 register (OP_*_FS)
   "T" 5 bit ft source 2 register (OP_*_FT)
   "R" 5 bit fr source 3 register (OP_*_FR)
   "V" 5 bit same register used as floating source and destination (OP_*_FS)
   "W" 5 bit same register used as floating target and destination (OP_*_FT)
   Coprocessor instructions:
   "E" 5 bit target register (OP_*_RT)
   "G" 5 bit destination register (OP_*_RD)
   "H" 3 bit sel field for (d)mtc* and (d)mfc* (OP_*_SEL)
   "P" 5 bit performance-monitor register (OP_*_PERFREG)
   "e" 5 bit vector register byte specifier (OP_*_VECBYTE)
   "%" 3 bit immediate vr5400 vector alignment operand (OP_*_VECALIGN)
   see also "k" above
   "+D" Combined destination register ("G") and sel ("H") for CP0 ops,
	for pretty-printing in disassembly only.
   Macro instructions:
   "A" General 32 bit expression
   "I" 32 bit immediate (value placed in imm_expr).
   "+I" 32 bit immediate (value placed in imm2_expr).
   "F" 64 bit floating point constant in .rdata
   "L" 64 bit floating point constant in .lit8
   "f" 32 bit floating point constant
   "l" 32 bit floating point constant in .lit4
   MDMX instruction operands (note that while these use the FP register
   fields, they accept both $fN and $vN names for the registers):  
   "O"	MDMX alignment offset (OP_*_ALN)
   "Q"	MDMX vector/scalar/immediate source (OP_*_VSEL and OP_*_FT)
   "X"	MDMX destination register (OP_*_FD) 
   "Y"	MDMX source register (OP_*_FS)
   "Z"	MDMX source register (OP_*_FT)
   DSP ASE usage:
   "2" 2 bit unsigned immediate for byte align (OP_*_BP)
   "3" 3 bit unsigned immediate (OP_*_SA3)
   "4" 4 bit unsigned immediate (OP_*_SA4)
   "5" 8 bit unsigned immediate (OP_*_IMM8)
   "6" 5 bit unsigned immediate (OP_*_RS)
   "7" 2 bit dsp accumulator register (OP_*_DSPACC)
   "8" 6 bit unsigned immediate (OP_*_WRDSP)
   "9" 2 bit dsp accumulator register (OP_*_DSPACC_S)
   "0" 6 bit signed immediate (OP_*_DSPSFT)
   ":" 7 bit signed immediate (OP_*_DSPSFT_7)
   "'" 6 bit unsigned immediate (OP_*_RDDSP)
   "@" 10 bit signed immediate (OP_*_IMM10)
   MT ASE usage:
   "!" 1 bit usermode flag (OP_*_MT_U)
   "$" 1 bit load high flag (OP_*_MT_H)
   "*" 2 bit dsp/smartmips accumulator register (OP_*_MTACC_T)
   "&" 2 bit dsp/smartmips accumulator register (OP_*_MTACC_D)
   "g" 5 bit coprocessor 1 and 2 destination register (OP_*_RD)
   "+t" 5 bit coprocessor 0 destination register (OP_*_RT)
   "+T" 5 bit coprocessor 0 destination register (OP_*_RT) - disassembly only
   MCU ASE usage:
   "~" 12 bit offset (OP_*_OFFSET12)
   "\" 3 bit position for aset and aclr (OP_*_3BITPOS)
   UDI immediates:
   "+1" UDI immediate bits 6-10
   "+2" UDI immediate bits 6-15
   "+3" UDI immediate bits 6-20
   "+4" UDI immediate bits 6-25
   Octeon:
   "+x" Bit index field of bbit.  Enforces: 0 <= index < 32.
   "+X" Bit index field of bbit aliasing bbit32.  Matches if 32 <= index < 64,
	otherwise skips to next candidate.
   "+p" Position field of cins/cins32/exts/exts32. Enforces 0 <= pos < 32.
   "+P" Position field of cins/exts aliasing cins32/exts32.  Matches if
	32 <= pos < 64, otherwise skips to next candidate.
   "+Q" Immediate field of seqi/snei.  Enforces -512 <= imm < 512.
   "+s" Length-minus-one field of cins/exts.  Enforces: 0 <= lenm1 < 32.
   "+S" Length-minus-one field of cins32/exts32 or cins/exts aliasing
	cint32/exts32.  Enforces non-negative value and that
	pos + lenm1 < 32 or pos + lenm1 < 64 depending whether previous
	position field is "+p" or "+P".
   Loongson-3A:
   "+a" 8-bit signed offset in bit 6 (OP_*_OFFSET_A)
   "+b" 8-bit signed offset in bit 3 (OP_*_OFFSET_B)
   "+c" 9-bit signed offset in bit 6 (OP_*_OFFSET_C)
   "+z" 5-bit rz register (OP_*_RZ)
   "+Z" 5-bit fz register (OP_*_FZ)
   Other:
   "()" parens surrounding optional value
   ","  separates operands
   "[]" brackets around index for vector-op scalar operand specifier (vr5400)
   "+"  Start of extension sequence.
   Characters used so far, for quick reference when adding more:
   "1234567890"
   "%[]<>(),+:'@!$*&\~"
   "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
   "abcdefghijklopqrstuvwxz"
   Extension character sequences used so far ("+" followed by the
   following), for quick reference when adding more:
   "1234"
   "ABCDEFGHIPQSTXZ"
   "abcpstxz"
*/

接下来就重新编译一下工具链riscv-gnu-toolchain,不出问题,应该就可以识别开头的那段程序了。 输出结果为:

junningwu@llvm-vm:~/workspace/gem5/tests/test-progs/mod_test$ ../../../../freedom/rocket-chip/riscv-tools/riscv-tc-new/bin/riscv64-unknown-linux-gnu-gcc  --static -o mod_test mod_test.c
junningwu@llvm-vm:~/workspace/gem5/tests/test-progs/mod_test$ ../../../../freedom/rocket-chip/riscv-tools/riscv-tc-new/bin/riscv64-unknown-linux-gnu-objdump -D mod_test | grep "mod"
mod_test:     file format elf64-littleriscv
   102ca:	02e787eb          	mod	a5,a5,a4
   1594e:	070030ef          	jal	ra,189be <_IO_switch_to_get_mode>
   15c16:	3cf230ef          	jal	ra,397e4 <_IO_switch_to_wget_mode>
   15cd4:	311230ef          	jal	ra,397e4 <_IO_switch_to_wget_mode>
   16c98:	527010ef          	jal	ra,189be <_IO_switch_to_get_mode>
   17068:	157010ef          	jal	ra,189be <_IO_switch_to_get_mode>

修改Gem5模拟器

首先我们先来看看两个程序示例,一个就是之前已经跑通过的hello haawking例子,另外一个就是刚刚编译通过的mod test例子。在没有修改过Gem5模拟器的时候,毫无意外,hello haawking例子应该可以正常运行,而mod test则肯定出错。 下面是仿真的结果,我们先来看一下:

junningwu@llvm-vm:~/workspace/gem5$ ./build/RISCV/gem5.debug configs/example/se.py -c tests/test-progs/hello_haawking/hello --output=sim.out --errout=sim.err
gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Nov 28 2019 18:57:55
gem5 started Dec  2 2019 09:23:29
gem5 executing on llvm-vm, pid 42656
command line: ./build/RISCV/gem5.debug configs/example/se.py -c tests/test-progs/hello_haawking/hello --output=sim.out --errout=sim.err

Global frequency set at 1000000000000 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
0: system.remote_gdb: listening for remote gdb on port 7000
**** REAL SIMULATION ****
info: Entering event queue @ 0.  Starting simulation...
info: Increasing stack size by one page.
Exiting @ tick 3370000 because exiting with last active thread context
junningwu@llvm-vm:~/workspace/gem5$ cat m5out/sim.out 
Hello Haawking!
junningwu@llvm-vm:~/workspace/gem5$ ./build/RISCV/gem5.debug configs/example/se.py -c tests/test-progs/mod_test/mod_test --output=sim.out --errout=sim.err
gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Nov 28 2019 18:57:55
gem5 started Dec  2 2019 09:24:23
gem5 executing on llvm-vm, pid 42672
command line: ./build/RISCV/gem5.debug configs/example/se.py -c tests/test-progs/mod_test/mod_test --output=sim.out --errout=sim.err

Global frequency set at 1000000000000 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
0: system.remote_gdb: listening for remote gdb on port 7000
**** REAL SIMULATION ****
info: Entering event queue @ 0.  Starting simulation...
info: Increasing stack size by one page.
panic: Unknown instruction 0x02e787eb at pc 0x00000000000102ca
Memory Usage: 687648 KBytes
Program aborted at tick 2660000
--- BEGIN LIBC BACKTRACE ---
./build/RISCV/gem5.debug(_Z15print_backtracev+0x32)[0x1257243]
./build/RISCV/gem5.debug(_Z12abortHandleri+0x6e)[0x1268dfb]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7f6656223390]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7f6654c2b428]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7f6654c2d02a]
./build/RISCV/gem5.debug(_ZN7PCEventD1Ev+0x0)[0xf1c520]
./build/RISCV/gem5.debug(_ZN8RiscvISA16UnknownInstFault8invokeSEEP13ThreadContextRK14RefCountingPtrI10StaticInstE+0xb6)[0x144e7da]
./build/RISCV/gem5.debug(_ZN8RiscvISA10RiscvFault6invokeEP13ThreadContextRK14RefCountingPtrI10StaticInstE+0x5ec)[0x144e51c]
./build/RISCV/gem5.debug(_ZN13BaseSimpleCPU9advancePCERKSt10shared_ptrI9FaultBaseE+0x12b)[0x13f8fb7]
./build/RISCV/gem5.debug(_ZN15AtomicSimpleCPU4tickEv+0xa0d)[0x13d5605]
./build/RISCV/gem5.debug[0x13d1385]
./build/RISCV/gem5.debug[0x13d5a4c]
./build/RISCV/gem5.debug(_ZNKSt8functionIFvvEEclEv+0x32)[0x1005318]
./build/RISCV/gem5.debug(_ZN20EventFunctionWrapper7processEv+0x1c)[0x1004e6a]
./build/RISCV/gem5.debug(_ZN10EventQueue10serviceOneEv+0xe9)[0x1265d89]
./build/RISCV/gem5.debug(_Z9doSimLoopP10EventQueue+0x1fa)[0x12708ac]
./build/RISCV/gem5.debug(_Z8simulatem+0x312)[0x12704fd]
./build/RISCV/gem5.debug[0x2792a3c]
./build/RISCV/gem5.debug[0x2790ac4]
./build/RISCV/gem5.debug[0x278d623]
./build/RISCV/gem5.debug[0x278d68f]
./build/RISCV/gem5.debug[0x12e080a]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x7852)[0x7f66564e07b2]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85c)[0x7f665661711c]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6ffd)[0x7f66564dff5d]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85c)[0x7f665661711c]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6ffd)[0x7f66564dff5d]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85c)[0x7f665661711c]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x6ffd)[0x7f66564dff5d]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x85c)[0x7f665661711c]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCode+0x19)[0x7f66564d8de9]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x613b)[0x7f66564df09b]
--- END LIBC BACKTRACE ---
Aborted (core dumped)

可以看到,出现了一个致命的错误,无法识别的指令,当我们打开反汇编代码之后,就可以看到pc值为0x102ca处为一条mod指令。

00000000000102ae <main>:
   102ae:	1101                	addi	sp,sp,-32
   102b0:	ec06                	sd	ra,24(sp)
   102b2:	e822                	sd	s0,16(sp)
   102b4:	1000                	addi	s0,sp,32
   102b6:	4795                	li	a5,5
   102b8:	fef42623          	sw	a5,-20(s0)
   102bc:	4789                	li	a5,2
   102be:	fef42423          	sw	a5,-24(s0)
   102c2:	fec42783          	lw	a5,-20(s0)
   102c6:	fe842703          	lw	a4,-24(s0)
   102ca:	02e787eb          	mod	a5,a5,a4
   102ce:	fef42223          	sw	a5,-28(s0)
   102d2:	fe442783          	lw	a5,-28(s0)
   102d6:	0007871b          	sext.w	a4,a5
   102da:	4785                	li	a5,1
   102dc:	00f70a63          	beq	a4,a5,102f0 <main+0x42>
   102e0:	0004c7b7          	lui	a5,0x4c
   102e4:	7f078513          	addi	a0,a5,2032 # 4c7f0 <tcache_thread_freeres+0x58>
   102e8:	2c6050ef          	jal	ra,155ae <_IO_puts>
   102ec:	57fd                	li	a5,-1
   102ee:	a801                	j	102fe <main+0x50>
   102f0:	0004d7b7          	lui	a5,0x4d
   102f4:	80078513          	addi	a0,a5,-2048 # 4c800 <tcache_thread_freeres+0x68>
   102f8:	2b6050ef          	jal	ra,155ae <_IO_puts>
   102fc:	4781                	li	a5,0
   102fe:	853e                	mv	a0,a5
   10300:	60e2                	ld	ra,24(sp)
   10302:	6442                	ld	s0,16(sp)
   10304:	6105                	addi	sp,sp,32
   10306:	8082                	ret

第一次尝试(按照教程)

编译器的修改如上面所示。接下来就是按照教程,修改Gem5模拟器。修改文件arch/riscv/isa/decoder.isa,找到add处,插入mod的译码部分。这里需要说明,教程中opcode的值是0x33,而在Gem5模拟器中,对一直为0x3的低两位做了移位处理,因此模拟器中的opcode为0x0c。

    0x33: decode FUNCT3 {
        format ROp {
            0x0: decode FUNCT7 {
                0x0: add();
                0x1: mul(, IntMultOp);
                0x10: mod();
                0x20: sub();
            }

接下来我们跟上面一样,编译gem5模拟器。但是奇迹并未出现,结果依然是无法识别指令。 我们分析一下反汇编的指令0x02e787eb。这里根据risc-v的手册,可以看到,opcode已经变化了。按照原理,手动写出来的机器码应该是0x025202b3。

第二次尝试

所以我们进行第二次尝试,分别进行如下修改。

///toolchain
mod     rd rs1 rs2 31..25=2  14..12=0 6..2=0x0C 1..0=3

#define MATCH_MOD 0x4000033
#define MASK_MOD  0xfe00707f

{"mod",       "I",   "d,s,t",  MATCH_MOD, MASK_MOD, match_opcode, 0 },
///gem5
0x0c: decode FUNCT3 {
            format ROp {
                0x0: decode FUNCT7 {
                    0x0: add();
                    0x1: mul(, IntMultOp);
                    0x2: mod();
                    0x20: sub();
                }

修改编译器后的反汇编代码:

00000000000102ae <main>:
   102ae:	1101                	addi	sp,sp,-32
   102b0:	ec06                	sd	ra,24(sp)
   102b2:	e822                	sd	s0,16(sp)
   102b4:	1000                	addi	s0,sp,32
   102b6:	4795                	li	a5,5
   102b8:	fef42623          	sw	a5,-20(s0)
   102bc:	4789                	li	a5,2
   102be:	fef42423          	sw	a5,-24(s0)
   102c2:	fec42783          	lw	a5,-20(s0)
   102c6:	fe842703          	lw	a4,-24(s0)
   102ca:	04e787b3          	mod	a5,a5,a4
   102ce:	fef42223          	sw	a5,-28(s0)
   102d2:	fe442783          	lw	a5,-28(s0)
   102d6:	0007871b          	sext.w	a4,a5
   102da:	4785                	li	a5,1
   102dc:	00f70a63          	beq	a4,a5,102f0 <main+0x42>
   102e0:	0004c7b7          	lui	a5,0x4c
   102e4:	7f078513          	addi	a0,a5,2032 # 4c7f0 <tcache_thread_freeres+0x58>
   102e8:	2c6050ef          	jal	ra,155ae <_IO_puts>
   102ec:	57fd                	li	a5,-1
   102ee:	a801                	j	102fe <main+0x50>
   102f0:	0004d7b7          	lui	a5,0x4d
   102f4:	80078513          	addi	a0,a5,-2048 # 4c800 <tcache_thread_freeres+0x68>
   102f8:	2b6050ef          	jal	ra,155ae <_IO_puts>
   102fc:	4781                	li	a5,0
   102fe:	853e                	mv	a0,a5
   10300:	60e2                	ld	ra,24(sp)
   10302:	6442                	ld	s0,16(sp)
   10304:	6105                	addi	sp,sp,32
   10306:	8082                	ret

修改模拟器后的仿真结果:

junningwu@llvm-vm:~/workspace/gem5$ ./build/RISCV/gem5.debug configs/example/se.py -c tests/test-progs/mod_test/mod_test --output=sim.out --errout=sim.err
gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Dec  2 2019 10:18:05
gem5 started Dec  2 2019 11:16:19
gem5 executing on llvm-vm, pid 15607
command line: ./build/RISCV/gem5.debug configs/example/se.py -c tests/test-progs/mod_test/mod_test --output=sim.out --errout=sim.err

Global frequency set at 1000000000000 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
0: system.remote_gdb: listening for remote gdb on port 7000
**** REAL SIMULATION ****
info: Entering event queue @ 0.  Starting simulation...
info: Increasing stack size by one page.
Exiting @ tick 3405000 because exiting with last active thread context
junningwu@llvm-vm:~/workspace/gem5$ cat m5out/sim.out 

HAAWKING TEST: PASSED

至此,根据教程进行指令集添加和仿真验证的工作就到一段路了。 One more Thing,再来看一下Gem5模拟器执行mod指令的trace文件,

./build/RISCV/gem5.debug --debug-flags=Exec --debug-file=trace.out configs/example/se.py -c tests/test-progs/mod_test/mod_test --output=sim.out --errout=sim.err
2653000: system.cpu: T0 : @main    : c_addi sp, -32             : IntAlu :  D=0x7ffffffffffffdc0
2653500: system.cpu: T0 : @main+2    : c_sdsp ra, 24(sp)          : MemWrite :  D=0x00000000000104aa A=0x7ffffffffffffdd8
2654000: system.cpu: T0 : @main+4    : c_sdsp s0, 16(sp)          : MemWrite :  D=0x00000000000107e8 A=0x7ffffffffffffdd0
2654500: system.cpu: T0 : @main+6    : c_addi4spn s0, sp, 32      : IntAlu :  D=0x7ffffffffffffde0
2655000: system.cpu: T0 : @main+8    : c_li a5, 5                 : IntAlu :  D=0x0000000000000005
2655500: system.cpu: T0 : @main+10    : sw a5, -20(s0)             : MemWrite :  D=0x0000000000000005 A=0x7ffffffffffffdcc
2656000: system.cpu: T0 : @main+14    : c_li a5, 2                 : IntAlu :  D=0x0000000000000002
2657000: system.cpu: T0 : @main+16    : sw a5, -24(s0)             : MemWrite :  D=0x0000000000000002 A=0x7ffffffffffffdc8
2658000: system.cpu: T0 : @main+20    : lw a5, -20(s0)             : MemRead :  D=0x0000000000000005 A=0x7ffffffffffffdcc
2659000: system.cpu: T0 : @main+24    : lw a4, -24(s0)             : MemRead :  D=0x0000000000000002 A=0x7ffffffffffffdc8
2660000: system.cpu: T0 : @main+28    : mod a5, a5, a4             : IntAlu :  D=0x0000000000000001
2661000: system.cpu: T0 : @main+32    : sw a5, -28(s0)             : MemWrite :  D=0x0000000000000001 A=0x7ffffffffffffdc4
2662000: system.cpu: T0 : @main+36    : lw a5, -28(s0)             : MemRead :  D=0x0000000000000001 A=0x7ffffffffffffdc4
2663000: system.cpu: T0 : @main+40    : addiw a4, a5, 0            : IntAlu :  D=0x0000000000000001
2663500: system.cpu: T0 : @main+44    : c_li a5, 1                 : IntAlu :  D=0x0000000000000001
2664000: system.cpu: T0 : @main+46    : beq a4, a5, 20             : IntAlu : 
2664500: system.cpu: T0 : @main+66    : lui a5, 77                 : IntAlu :  D=0x000000000004d000
2665000: system.cpu: T0 : @main+70    : addi a0, a5, -2040         : IntAlu :  D=0x000000000004c808
2665500: system.cpu: T0 : @main+74    : jal ra, 21174              : IntAlu :  D=0x00000000000102fc
Junning Wu wechat
Welcome To Contact Me with My Personal WeChat ID(Weixin)