Using Gem5 03 Gem5 Running Tutorial
Using gem5
• gem5 standard library
• Full system sim
• Accelerating simulation
• Instruction execution
• Adding an instruction
• gem5’s GPGPU
• Using other simulators w/ gem5
• Whatever you want!
Running Things on gem5
A presentation by
Maryam Babaie
OOO Action Item ☺
Launch codespace and run the following commands:
cd gem5
2. The m5 Utility
i. Examples on m5 Utility
3. Cross-compiling
4. Traffic Generator
Intro. to Syscall Emulation Mode
Previously on gem5: how to build & use
Building with SCons:
scons build/{ISA}/gem5.{variant} -j {cpus}
Once compiled, gem5 can then be run using:
build/{ISA}/gem5.{variant} [gem5 options] {simulation script} [script options]
Example:
build/X86/gem5.fast --outdir=simple_out configs/learning_gem5/part1/simple.py --l1i_size=32kB
Syscall Emulation Mode: no interaction with OS
What is Syscall Emulation?
Syscall Emulation (SE) mode does not model all the devices in a system.
Instead, SE only emulates Linux system calls, and only models user-mode code.
When to use/avoid Syscall Emulation?
If you do not need to model the OS, and you want extra performance,
then you should use SE mode.
However, if you need high fidelity modeling of the system, or if OS interactions like
page table walks are important, then you should use FS mode.
The m5 Utility
The m5 Utility API
“m5ops” are special opcodes that can be used in gem5 to issue special instructions to the simulator.
Options include:
• exit (delay): Stop the simulation in delay nanoseconds.
• resetstats (delay, period): Reset simulation statistics in delay nanoseconds; repeat this every period nanoseconds.
• dumpstats (delay, period): Save simulation statistics to a file in delay nanoseconds; repeat this every period nanoseconds.
• dumpresetstats (delay, period): Same as dumpstats followed by resetstats.
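The delay/period semantics of the options above can be sketched with a small toy model. This is not gem5 code: the function name `m5op_event_times`, the `horizon_ns` argument, and the convention that a period of 0 means "do not repeat" are assumptions made for illustration.

```python
def m5op_event_times(delay_ns, period_ns, horizon_ns, now_ns=0):
    """Toy model: times (in ns) at which a delayed, optionally periodic op fires.

    Assumptions (not taken from the slides): the first event fires delay_ns
    after now_ns, and a period of 0 means the event does not repeat.
    """
    if delay_ns < 0 or period_ns < 0:
        raise ValueError("delay and period must be non-negative")
    first = now_ns + delay_ns
    if first > horizon_ns:
        return []  # the op would fire after the window we are looking at
    if period_ns == 0:
        return [first]  # one-shot event
    # Repeat every period_ns until the end of the observed window.
    return list(range(first, horizon_ns + 1, period_ns))


if __name__ == "__main__":
    # A dumpstats-like request: delay = 1000 ns, period = 500 ns,
    # observed over the first 3000 ns of simulated time.
    print(m5op_event_times(1000, 500, 3000))
```

Under these assumptions, a request with a delay but no period produces a single event, which matches how exit (delay) is described.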
It is best to insert the option(s) directly in the source code of the application.
The m5ops.h header file, which has prototypes for all of these functions, must be included.
The application must then be linked against the appropriate libm5.a.
To build m5 and libm5.a, run the following command in the gem5/util/m5/ directory.
scons build/{TARGET_ISA}/out/m5
Note: if you are using an x86 host to build for another ISA, you need the cross-compiler for that ISA.
After building m5 and libm5.a as described, link them to your code:
• Include gem5/m5ops.h
• Add gem5/include to your compiler’s include search path.
Example 1
Config file:
materials/using-gem5/03-running/simple.py
Commands
cd gem5/util/m5
scons build/x86/out/m5
Compile the code: gcc materials/using-gem5/03-running/example1/se_example.cpp -o exampleBin
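The include and link steps can be combined into one compiler invocation. The sketch below only assembles such a command line; it does not run it. The helper name `m5_compile_command` is hypothetical, and the library directory (gem5/util/m5/build/{isa}/out) is an assumption inferred from the scons target shown above. `-I`, `-L`, and `-l` are the standard GCC options for include paths, library paths, and libraries.

```python
import shlex


def m5_compile_command(source, output, gem5_root="gem5", isa="x86", compiler="gcc"):
    """Build (but do not run) a compile command that includes and links m5ops.

    Assumption: libm5.a is located in <gem5_root>/util/m5/build/<isa>/out,
    inferred from the scons target used to build it.
    """
    include_dir = f"{gem5_root}/include"
    lib_dir = f"{gem5_root}/util/m5/build/{isa}/out"
    return [
        compiler, source, "-o", output,
        f"-I{include_dir}",  # so that <gem5/m5ops.h> can be found
        f"-L{lib_dir}",      # where the linker should look for libm5.a
        "-lm5",              # link against libm5.a
    ]


if __name__ == "__main__":
    cmd = m5_compile_command(
        "materials/using-gem5/03-running/example1/se_example.cpp", "exampleBin"
    )
    print(shlex.join(cmd))
```

Passing the command as a list keeps each argument separate, so it could be handed to `subprocess.run` without shell quoting issues.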
Note: if you try to run the output binary directly on your host, it will fail with an error.
SE mode uses the host for many things.
Run gem5:
gem5/build/X86/gem5.debug --debug-flags=ExecAll materials/using-gem5/03-running/simple.py > debugOut.txt
Example 2: checking a directory
Example 2 code:
materials/using-gem5/03-running/example2/dir_example.cpp
Config file:
materials/using-gem5/03-running/simple.py
Commands
Compile the code: g++ materials/using-gem5/03-running/example2/dir_example.cpp -o exampleBin
SE mode does not implement many things:
• Filesystem
• Most system calls
• I/O devices
• Interrupts
• TLB misses → Page table walks
• Context switches
• Multiple threads
You may have multithreaded execution, but there are no context switches and no spin locks.
Cross-compiling
Cross-compiling from one ISA to another.
Example: Cross-compiling
Host = X86 → Target: ARM64
(1) Build m5 utility for arm64
cd gem5/util/m5
Also, you need to let gem5 know where the libraries associated with the guest ISA are located, using “redirect”.
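Conceptually, a redirect maps a path that the guest program asks for onto the host directory that actually holds the guest-ISA libraries. The toy function below illustrates that mapping only; it is not gem5's implementation or configuration syntax, and the function name and the example host directory (/usr/aarch64-linux-gnu) are assumptions.

```python
import posixpath


def redirect_path(guest_path, redirects):
    """Toy model: map a guest path to a host path.

    redirects is a list of (guest_prefix, host_prefix) pairs. The longest
    matching guest prefix wins; paths with no match are returned unchanged.
    """
    guest_path = posixpath.normpath(guest_path)
    ordered = sorted(redirects, key=lambda pair: len(pair[0]), reverse=True)
    for guest_prefix, host_prefix in ordered:
        prefix = posixpath.normpath(guest_prefix).rstrip("/")
        # Match whole path components only, so "/lib" does not match "/libfoo".
        if guest_path == prefix or guest_path.startswith(prefix + "/"):
            return posixpath.normpath(host_prefix) + guest_path[len(prefix):]
    return guest_path


if __name__ == "__main__":
    # Hypothetical host location of the ARM64 libraries.
    redirects = [("/lib", "/usr/aarch64-linux-gnu/lib")]
    print(redirect_path("/lib/ld-linux-aarch64.so.1", redirects))
    print(redirect_path("/etc/hosts", redirects))
```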
Example: Cross-compiling (Dynamic)
You should modify the config file (simple.py) as follows:
Traffic Generator
A traffic generator is used for creating test cases for caches, interconnects, memory controllers, etc.
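[Figure: a CPU sends load/store requests through a port to the memory system (LLC shown) and receives data. A traffic generator (PyTrafficGen) takes the CPU's place, sending Rd/Wr requests through a port to the memory system and receiving data.]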
PyTrafficGen’s parameters allow you to control the characteristics of the generated traffic.
Parameter: Definition
• pattern: The pattern of generated addresses (linear or random).
• duration: The duration of generating requests in ticks (quantum of time in gem5).
• start address: The lower bound for addresses that the synthetic traffic will access.
• end address: The upper bound for addresses that the synthetic traffic will access.
• minimum period: The minimum timing difference between two consecutive requests in ticks.
• maximum period: The maximum timing difference between two consecutive requests in ticks.
• request size: The number of bytes that are read/written by each request.
• read percentage: The percentage of requests that are reads; the rest are writes.
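To make the parameters concrete, here is a toy generator in plain Python that produces a request stream from the same eight knobs. It is a sketch, not PyTrafficGen itself: the function name, the `seed` argument, the wrap-around of the linear pattern, and treating the end address as exclusive are all assumptions.

```python
import random


def generate_traffic(pattern, duration, start_addr, end_addr,
                     min_period, max_period, request_size, read_percent, seed=0):
    """Yield (tick, command, address) tuples for a toy synthetic request stream."""
    if pattern not in ("linear", "random"):
        raise ValueError("pattern must be 'linear' or 'random'")
    if not (0 <= min_period <= max_period) or max_period == 0:
        raise ValueError("need 0 <= min_period <= max_period and max_period > 0")
    num_blocks = (end_addr - start_addr) // request_size
    if num_blocks <= 0:
        raise ValueError("address range is smaller than one request")

    rng = random.Random(seed)  # seeded so that runs are repeatable
    tick, block = 0, 0
    while tick < duration:
        if pattern == "linear":
            # Walk the range one request at a time, wrapping at the end.
            addr = start_addr + block * request_size
            block = (block + 1) % num_blocks
        else:
            # Pick a request-aligned address uniformly at random.
            addr = start_addr + rng.randrange(num_blocks) * request_size
        command = "read" if rng.random() * 100 < read_percent else "write"
        yield tick, command, addr
        # The gap to the next request lies between the two period bounds.
        tick += rng.randint(min_period, max_period)


if __name__ == "__main__":
    for request in generate_traffic("linear", 500, 0, 256, 100, 100, 64, 100):
        print(request)
```

With equal minimum and maximum periods the request rate is fixed, which is a convenient way to target a specific bandwidth in a memory-system test.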
Example 3: PyTrafficGen
Code: “src/mem”
[Figure: PyTrafficGen, configured by pattern (linear/random), duration, Rd/Wr %, and address range, sends Rd/Wr requests through a port to the memory system and receives data.]
Memory System
• Memory Ctrlr: types (MemCtrl, HetMemCtrl, etc.); scheduling policy (FCFS, FRFCFS)
• Memory Interface: device type (DDRs, NVM, HBM, etc.); device size
• The m5 utility API is a useful tool for analyzing simulation behavior and performance.
• Cross-compilers should be used if the host and guest ISAs are different.
• A traffic generator can abstract away the details of a data requestor, such as a CPU, when generating test cases for memory systems.