by David Broman, 2017
From the root path, the regression test suite is executed by the command
>>make test
The regression test is not very developed and should include a large set of program tests.
Note: This text describes how to perform WCET analysis using abstract search manually. This is not the optimal way for an end user, but is a good way to learn the internals of the analysis tool.
Before you follow this tutorial, please go through the getting started guide.
Go to directory /runtime
, which contains all OCaml files that will be compiled together with the generated analysis file.
The first step is to generate the continuation passing style OCaml assembly code. We compile the file test/demo/fact.c
using KTA, but write the output file to the runtime directory.
The KTA command that we will use is called wcet
. By executing command
>>kta wcet ../test/demo/fact.c fact -compile -optimize 3 -cpsocaml > tmp.ml
we generate a file tmp.ml
that contains the continuation passing style code for performing abstract execution of the factorial function.
Please take a look at tmp.ml
. Compare the generated code with the MIPS code that you get using the disasm
command.
If we now want to perform WCET analysis on the factorial function, we need to compile and execute tmp.ml
. We can do this using the following command
>>ocamlbuild tmp.native -lib str -- -args a0=[5,5]
The output is as follows
Counter: 0
BCET: 25 cycles
WCET: 25 cycles
at = Any v0 = 120 v1 = 1 a0 = 1
a1 = Any a2 = Any a3 = Any t0 = Any
t1 = Any t2 = Any t3 = Any t4 = Any
t5 = Any t6 = Any t7 = Any s0 = Any
s1 = Any s2 = Any s3 = Any s4 = Any
s5 = Any s6 = Any s7 = Any t8 = Any
t9 = Any k0 = Any k1 = Any gp = Any
sp = 2147483640 fp = Any ra = Any
Note that the analysis was executed with the argument interval [5,5]
, which means that it was executed with the concrete value 5. Hence, it is correct that the output value for the factorial of 5 is 120. Note also that the best-case execution time (BCET) and the worst-case execution time (WCET) are both 25 cycles.
If we instead execute the program as follows
>>ocamlbuild tmp.native -lib str -- a0=[3,5]
it means that the input value is an abstract interval value, between values 3 and 5.
The output is then
Counter: 0
BCET: 15 cycles
WCET: 25 cycles
at = Any v0 = [6,120] v1 = 1 a0 = 1
a1 = Any a2 = Any a3 = Any t0 = Any
t1 = Any t2 = Any t3 = Any t4 = Any
t5 = Any t6 = Any t7 = Any s0 = Any
s1 = Any s2 = Any s3 = Any s4 = Any
s5 = Any s6 = Any s7 = Any t8 = Any
t9 = Any k0 = Any k1 = Any gp = Any
sp = 2147483640 fp = Any ra = Any
Note now that the WCET is 25 cycles, but the BCET is 15 cycles. Note also that we got an abstract interval value for the return value. Depending on the input value (3, 4, or 5), we will get an output value between 6 and 120.
There is an alternative way to run the KTA wcet analysis tool in one command. In that case an environment variable, KTA_WCET_RUNTIME_PATH, is used for the runtime path. The steps are the following:
>> export KTA_WCET_RUNTIME_PATH=/path/to/runtime
The KTA command, with $KTA_PATH=/path/to/kta, is:
>> kta wcet $KTA_PATH/test/demo/fact.c fact -compile -optimize 3 -args a0=[3,5]
and the output:
Counter: 0
BCET: 15 cycles
WCET: 25 cycles
at = Any v0 = [6,120] v1 = 1 a0 = 1
a1 = Any a2 = Any a3 = Any t0 = Any
t1 = Any t2 = Any t3 = Any t4 = Any
t5 = Any t6 = Any t7 = Any s0 = Any
s1 = Any s2 = Any s3 = Any s4 = Any
s5 = Any s6 = Any s7 = Any t8 = Any
t9 = Any k0 = Any k1 = Any gp = 4231232
sp = 2147483640 fp = Any ra = Any
Time Elapsed 0.000000s
The KTA code that generates the CPS (continuation-passing style) OCAML code is located in directory src/
. The actual wet analysis takes place executing the CPS code, which is linked together with the runtime files located in directory runtime/
.
To be updated
The CPS OCaml Code file is structured as follows:
-
A list of block identifier names and numbers. These basic block names correspond to the names of the basic block in the machine code. Compare with the disassembled code. The actual numbers are used for fast access into the array
bblocks
, which is defined at the end of the file. -
The actual program code. Each basic block is represented as a function (a co-routine) which is involved using continuation passing style. Note that each basic block contains function calls to functions that correspond to one assembly instruction.
-
The basic block table
bblocks
includes the following information:func
: the function defined above. This is used when calling another co-routine.name
: used for pretty printing.nextid
: identifier number to the next basic block, if no branch occurs. Note that thenext
call in the basic blocks is used to actually perform the jump. The distance number is computed by KTA by performing a graph search on the control flow graph.list
: distance to the final basic block. This distance number is used when the runtime system should choose when to jump.addr
: start (code) address for the basic block.caller
: used for function calls.
-
The
main
function starts the analysis process by calling functionanalyse
, located in moduleAbstractMIPS
.
Note that the CPS file is automatically generated by KTA.
This module contains the main functionality of the analysis runtime system.
Some of the main data structures:
-
mstate
: Contains the main state. This structure is passed around in all function calls.cblock
: Is the current block id that is being processed.pc
: current program counterpstate
: the current program state (containing registers, memory etc.)batch
: this is batch of program states that should be processed for the current block id. This is why the block id and the program counter is not part of the program state. All the processes in the batch (the list) are processed in turn, until it is empty. See functiondequeue
.bbtable
: this is just a pointer to the bbtable that is define in the CPS file.prio
: This is the main priority queue. When there are no more program states to process in a batch, thedequeue
function finds the next batch of program states to process.returnid
: used when returning from a function call.cstack
: when a function call takes place, the whole queue of program states are pushed on the stack. Hence, it always finishes the analysis of the deepest function call first.branch
: special branch handling forslt
andbeq
. This is the tricky thing thatslt
needs to cut abstract intervals.
-
progstate
: Contains an abstract state of the registers, the memory, BCET, and WCET values.
Here are some of the essential functions:
-
analyze
: located at the end of the file and is the start of the analysis. It callsanalyze_main
which starts up the actual analysis. -
analyze_main
: Initiates the main state and starts the process. -
enqueue
: One of the central functions. When the analysis is finished with a basic block, it enqueues the basic block again into the priority queue (see for instance functionnext
). Theenqueue
function inserts the basic block in the right place in the priority queue (which is for simplicity represented as a list). This is pretty fast, since often it will be placed on top of the list. -
dequeue
: is called to get the next basic block to process. First, it checks if there is anything to do on the batch. If not, it checks if it has finished the analysis, or if it is returning from a function call. If not, it dequeues one program state.
Immutable representation of an abstract interval register. As it turns out, this is in general faster than using a lazy variant.
Contain the interval arithmetic functions.
Contains the abstract memory definitions. The nomem
flag is used for testing. This abstract memory module is using immutable Maps and a lazy join, where elements are just jointed when they are read.