Programmers Manual FlexGripPlus SASS
Programmer’s Manual
Authors:
Josie Esteban Rodriguez Condia
Boyang Du
Gianluca Roascio
Eduard Sci
Juan David Guerrero Balaguera
All reported opcodes are fully compatible with the SASS assembly language of SM_1.0 for GPGPUs based on the G80 microarchitecture. The
opcodes were specifically determined to support the verification, testing, and operation of the FlexGripPlus GPGPU model.
The manual was developed by the Electronic CAD & Reliability Group (CAD) in the Department of Control and Computer Engineering (DAUIN).
Politecnico di Torino
Italy, 2020
The Floating Point Unit (FPU) extension and op-codes were developed in collaboration between Politecnico di Torino and the Grenoble Institute of
Technology.
The Special Functions Unit (SFU) extension and op-codes were developed in cooperation between Politecnico di Torino and Universidad
Pedagogica y Tecnologica de Colombia (UPTC).
All activities performed in the development of the FlexGripPlus GPGPU model were funded by the European Commission
through the Horizon 2020 RESCUE-ETN project under grant 722325. For more information: http://rescue-etn.eu/
https://github.com/Jerc007/Open-GPGPU-FlexGrip-
CAD Group
RESCUE-ETN
Supported instructions FlexGripPlus (SASS Opcode SM_1.0)
Content
Glossary:
TABLES OF SASS INSTRUCTIONS SUPPORTED IN FLEXGRIPPLUS
Table 1 Control-flow instructions supported in FlexGripPlus.
Table 2 Arithmetic and logic instructions in FlexGripPlus.
Table 3 Data handling and memory instructions in FlexGripPlus.
Table 4 Floating Point Unit (FPU) instructions supported in FlexGripPlus.
Table 5 Special function unit (SFU) instructions supported in FlexGripPlus.
INSTRUCTIONS
Control-flow Instructions
BRA instruction:
BAR instruction:
RET instruction:
SSY instruction:
NOP instruction:
TRAP instruction:
CAL instruction:
Arithmetic and logic instructions
I2I instruction (CVT):
IMUL Instruction:
IMUL32 Instruction:
IMUL32I Instruction:
SHL/SHR Instructions:
IADD Instruction:
IADD32 Instruction:
IADD32I Instruction:
IMAD Instruction:
IMAD32 Instruction:
IMAD32I Instruction:
LOP Instruction:
ISET Instruction:
Data handling and memory instructions
MVC Instruction:
GLD Instruction:
GST Instruction:
MOV Instruction: (check final details)
MOV32 Instruction:
MVI Instruction:
R2G Instruction:
R2A Instruction:
A2R Instruction:
ADA Instruction:
Floating point instructions
FADD32 Instruction:
FADD Instruction:
FADD32I Instruction:
FMUL Instruction:
FMUL32 Instruction:
FMUL32I Instruction:
FMAD32 Instruction:
FMAD32I Instruction:
F2F Instruction:
F2I Instruction:
I2F Instruction:
FSET Instruction:
RCP Instruction:
RCP32 Instruction:
Special function unit instructions
SIN instruction:
COS instruction:
RRO instruction: (Range Reduction Operation)
LG2 instruction:
EX2 instruction:
RSQ instruction:
Glossary:
In the description of the opcodes, the following words are employed to represent the resources in the GPGPU:
INSTRUCTIONS
Control-flow Instructions
BRA
BAR
RET
SSY
NOP
TRAP
CAL
BRA instruction:
Checked OK
This instruction changes the warp PC in the SM (branch operation in the GPU).
PC ← offset address
Mnemonics:
Direct:    BRA (offset address)                                      Example: BRA 0x1E0
Indirect:  BRA (predicate_register.condition), (offset address)      Example: BRA C0.NE, 0x1e0
(SASS_assembly_lib):
Format: BRA(int offset, int condition, int pred_reg_cond, int marker)
Note:
The original version of this instruction (in FlexGrip) was implemented with an 18-bit address limit: the high part of the memory address was not
implemented, so the FlexGrip instruction memory was limited to an 18-bit address pointer. This limitation was removed in the extended version of
the model.
BAR instruction:
Checked
Mnemonics:
BAR.(type)
(SASS_assembly_lib):
Pending…
RET instruction:
Checked YES
This instruction returns from a kernel execution or a thread path (taken-not taken) in case of divergence.
Mnemonics:
RET
RET Cx (COND) (predicate condition)
(SASS_assembly_lib):
Format: RET(int condition, int pred_reg_cond, int marker)
Condition: (43-39)
Pred_reg_cond: (45-44)
Marker: (1-0)
Note:
The original version of this instruction (in FlexGrip) was able to stop the kernel execution. The additional feature of returning from a thread path
was added in the improved version FlexGrip*.
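The bit positions listed for RET can be expressed as a small packing helper. The following is only a sketch under stated assumptions: `set_field` and `ret_fields_sketch` are hypothetical names that are not part of SASS_assembly_lib, and only the three fields listed above are filled in, so the opcode bits are left at zero and the result is not a complete instruction word.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write `value` into bits [hi:lo] of a 64-bit word. */
uint64_t set_field(uint64_t word, unsigned hi, unsigned lo, uint64_t value)
{
    assert(hi >= lo && hi < 64);
    unsigned width = hi - lo + 1;
    uint64_t mask = (width == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << width) - 1);
    assert((value & ~mask) == 0);            /* value must fit in the field */
    return (word & ~(mask << lo)) | (value << lo);
}

/* Hypothetical sketch: fill only the RET fields documented in this section.
 * Condition (43-39), Pred_reg_cond (45-44), Marker (1-0). */
uint64_t ret_fields_sketch(unsigned condition, unsigned pred_reg_cond, unsigned marker)
{
    uint64_t word = 0;                        /* opcode bits intentionally left at 0 */
    word = set_field(word, 43, 39, condition);
    word = set_field(word, 45, 44, pred_reg_cond);
    word = set_field(word, 1, 0, marker);
    return word;
}
```

For instance, `ret_fields_sketch(0x0f, 0, 1)` places the "always" condition code in bits 43-39 and a marker value of 1 in bits 1-0.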
SSY instruction:
Checked YES
This instruction defines the convergence point for a potential divergence in program kernels. It activates the divergence
stack module in the GPGPU.
Mnemonics:
SSY 0xd88
(SASS_assembly_lib):
Format: SSY(int offset, int condition, int pred_reg_cond, int marker)
The predicate condition is not used by this instruction, but it is present in the code and in the function description.
Offset: (24-9)
Condition: (43-39)
Pred_reg_cond: (45-44)
Marker: (1-0)
NOP instruction:
Checked YES
Mnemonics:
NOP
NOP.S (predicate condition)
(SASS_assembly_lib):
Format: NOP(int condition, int pred_reg_cond, int marker)
Condition: (43-39)
Pred_reg_cond: (45-44)
Marker: (1-0)
TRAP instruction:
Checked NO, implementation pending.
Mnemonics:
TRAP
(SASS_assembly_lib):
Pending…
CAL instruction:
Checked YES
(SASS_assembly_lib):
Pending…
Arithmetic and logic instructions
I2I
IMUL
IMUL32
IMUL32I
SHL
SHR
IADD
IADD32
IADD32I
IMAD
IMAD32
IMAD32I
LOP
ISET
I2I instruction (CVT):
This instruction converts integer values from one format to another. It is only available for integer
operands.
Mnemonics:
I2I.(destination operand format).(source operand format) (destination location of the operand), (source location of the operand)
The destination and source formats may be signed (S) or unsigned (U), and 8, 16, or 32 bits wide. The sources and destinations may be registers, shared
memory, constant memory, or global memory locations.
(SASS_assembly_lib):
Formats:
I2I_32_16(int dest_reg, int source_reg_1, char hilo_1, char sigd, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
I2I_U32_S32_abs2(int dest_reg, int source_reg_1, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
I2I_S32_S32_neg2(int dest_reg, int source_reg_1, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
I2I_32_16_shmem(int dest_reg, int addr_reg, int offset, char sigd, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
I2I_32_32_o0x7f(int source_reg_1, char sigd, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
I2I_32_16_BEXT(int dest_reg, int source_reg_1, char hilo_1, char sigd, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
I2I_32_16_BEXT_shmem(int dest_reg, int addr_reg, int offset, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
I2I_16_16_BEXT_shmem(int dest_reg, char hilo_d, int addr_reg, int offset, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
dest_reg: (8-2)
hilo_1: (9)
condition: (39-43)
Pred_reg_cond: (45-44)
set_pred: (38)
pred_reg_set: (38-37)
marker: (1-0)
source_reg_1: (10-15) or (16-22) depending on the function
sigd: (48) or (59)
addr_reg: (26-27) and (34-37) (address register Ax)
hilo_d: (2)
offset: (9-13)
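The conversions described in this section can be modelled in plain C. This is an illustrative sketch only, not the FlexGripPlus datapath: the function names are hypothetical, `src_bits` and `is_signed` loosely mirror the operand format and the `sigd` argument (whose encoding is not stated here), and the wrap-around behaviour of the absolute-value and negate variants is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: widen an 8-, 16- or 32-bit source value to 32 bits.
 * is_signed != 0 sign-extends, is_signed == 0 zero-extends. */
uint32_t i2i_extend_model(uint32_t src, unsigned src_bits, int is_signed)
{
    assert(src_bits == 8 || src_bits == 16 || src_bits == 32);
    if (src_bits == 32) {
        return src;                               /* nothing to extend */
    }
    uint32_t mask = (UINT32_C(1) << src_bits) - 1;
    uint32_t value = src & mask;                  /* keep only the source bits */
    uint32_t sign_bit = UINT32_C(1) << (src_bits - 1);
    if (is_signed && (value & sign_bit)) {
        value |= ~mask;                           /* replicate the sign bit upward */
    }
    return value;
}

/* Hypothetical model of an S32 to U32 absolute value (assumed to wrap for 0x80000000). */
uint32_t i2i_abs_model(uint32_t src)
{
    return (src & UINT32_C(0x80000000)) ? (UINT32_C(0) - src) : src;
}

/* Hypothetical model of an S32 to S32 negation (two's complement). */
uint32_t i2i_neg_model(uint32_t src)
{
    return UINT32_C(0) - src;
}
```

As an example, a signed 16-bit source holding 0x8001 widens to 0xFFFF8001, while the unsigned conversion of the same bits gives 0x00008001.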
IMUL Instruction:
Checked YES
This instruction performs the integer multiplication of two operands. The sources can be registers or constant memory locations. The operation of
this instruction can depend on predicate conditions.
Mnemonics:
IMUL.(destination operand format).(source operand format) (destination location of the operand), (source 1), (source 2)
The destination and source formats may be signed (S) or unsigned (U), and 16 or 32 bits wide. The sources and destinations may be registers, shared
memory, constant memory, or global memory locations.
(SASS_assembly_lib):
Formats:
IMUL_U16_U16_shmem(int dest_reg, int addr_reg, int offset, int source_reg_2, char hilo_2, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
IMUL_S16_S16_regs(int dest_reg, int source_reg_1, char hilo_1, int source_reg_2, char hilo_2, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
IMUL_S16_S16_shmem(int dest_reg, int addr_reg, int offset, int source_reg_2, int hilo_2, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
IMUL_U16_U16_regs(int dest_reg, int source_reg_1, char hilo_1, int source_reg_2, char hilo_2, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
Bits     Size  Field                                             Values
(32-33)  2     instr_marker                                      00 = normal reg access (load or store) (not extra instruction) (by default)
                                                                 01 = normal reg access (load or store) (with Join) (extra instruction)
                                                                 10 = normal reg access (load or store) (with Exit)
                                                                 11 = immediate
(34)     1     Not used                                          0
(35)     1     destination type                                  0 = Register destination, 1 = Internal operation
(36-37)  2     Predicate register set (enabling a new flag)      C0 = 00 (by default), C1 = 01, C2 = 10, C3 = 11
               or Not used
(38)     1     Set predicate register                            1 = enabled predicate register set
                                                                 0 = disabled predicate register set
(39-43)  5     predicate_condition                               condition code, as listed below

               Encoding  Name    Description                         Condition formula
               0x00      never   always false (not used)             0
               0x01      l       less than                           (S & ~Z) ^ O
               0x02      e       Equal                               Z & ~S
               0x03      le      less than or equal                  S ^ (Z | O)
               0x04      g       greater than                        ~Z & ~(S ^ O)
               0x05      lg      less or greater than / not equal    ~Z
               0x06      ge      greater than or equal               ~(S ^ O)
               0x07      lge     Ordered                             ~Z | ~S
               0x08      u       Unordered                           Z & S
               0x09      lu      less than or unordered              S ^ O
               0x0a      eu      equal or unordered                  Z
               0x0b      leu     not greater than                    Z | (S ^ O)
               0x0c      gu      greater than or unordered           ~S ^ (Z | O)
               0x0d      lgu     not equal to                        ~Z | S
               0x0e      geu     not less than                       (~S | Z) ^ O
               0x0f      always  always true (by default)            1
               0x10      o       Overflow                            O
               0x11      c       carry / unsigned not below          C
               0x12      a       unsigned above                      ~Z & C
               0x13      s       sign / negative                     S
               0x1c      ns      not sign / positive                 ~S
               0x1d      na      unsigned not above                  Z | ~C
               0x1e      nc      not carry / unsigned below          ~C
               0x1f      no      no overflow                         ~O

(44-45)  2     Input predicate register checked before operating C0 = 00, C1 = 01, C2 = 10, C3 = 11
(46-52)  7     Not used                                          000 0000
(53)     1     Shared memory use for Source_2?                   Yes = 1, No = 0
(54-60)  7     Not used                                          000 0000
(61-63)  3     Sub_op_code                                       000
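The condition formulas in the table above translate directly into code. The sketch below evaluates a 5-bit condition code against the four flags S (sign), Z (zero), O (overflow) and C (carry). The function name `eval_condition` and the return value of -1 for encodings that the table does not list (0x14 to 0x1b) are assumptions made for illustration; the hardware behaviour for those encodings is not described here.

```c
#include <assert.h>

/* Evaluate a predicate condition code using the formulas of the table above.
 * Returns 1 (condition holds), 0 (condition fails) or -1 (encoding not listed). */
int eval_condition(unsigned code, int s, int z, int o, int c)
{
    assert(code <= 0x1f);                 /* the field is 5 bits wide */
    s = !!s; z = !!z; o = !!o; c = !!c;   /* normalise every flag to 0 or 1 */
    switch (code) {
    case 0x00: return 0;                  /* never  */
    case 0x01: return (s & !z) ^ o;       /* l      */
    case 0x02: return z & !s;             /* e      */
    case 0x03: return s ^ (z | o);        /* le     */
    case 0x04: return !z & !(s ^ o);      /* g      */
    case 0x05: return !z;                 /* lg     */
    case 0x06: return !(s ^ o);           /* ge     */
    case 0x07: return !z | !s;            /* lge    */
    case 0x08: return z & s;              /* u      */
    case 0x09: return s ^ o;              /* lu     */
    case 0x0a: return z;                  /* eu     */
    case 0x0b: return z | (s ^ o);        /* leu    */
    case 0x0c: return !s ^ (z | o);       /* gu     */
    case 0x0d: return !z | s;             /* lgu    */
    case 0x0e: return (!s | z) ^ o;       /* geu    */
    case 0x0f: return 1;                  /* always */
    case 0x10: return o;                  /* o      */
    case 0x11: return c;                  /* c      */
    case 0x12: return !z & c;             /* a      */
    case 0x13: return s;                  /* s      */
    case 0x1c: return !s;                 /* ns     */
    case 0x1d: return z | !c;             /* na     */
    case 0x1e: return !c;                 /* nc     */
    case 0x1f: return !o;                 /* no     */
    default:   return -1;                 /* 0x14-0x1b: not listed in the table */
    }
}
```

With only the Z flag set, `eval_condition(0x02, 0, 1, 0, 0)` reports that the "e" (equal) condition holds, while code 0x04 ("g") fails for the same flags.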
IMUL32 Instruction:
Checked YES
Mnemonics:
IMUL32.(destination operand format).(source operand format) (destination location of the operand), (source 1), (source 2)
The destination and source formats may be signed (S) or unsigned (U), and 16 or 32 bits wide. The sources and destinations may be registers, shared
memory, constant memory, or global memory locations.
(SASS_assembly_lib):
Formats:
IMUL32_U16_U16_regs(int dest_reg, int source_reg_1, char hilo_1, int source_reg_2, char hilo_2)
IMUL32_U16_U16_shmem(int dest_reg, int addr_reg, int offset, int source_reg_2, char hilo_2)
IMUL32I Instruction:
Checked YES
This instruction performs the integer multiplication of two operands using an immediate operand. The sources can be registers.
Mnemonics:
IMUL32I.(destination operand format).(source operand format) (destination location of the operand), (source 1), (Imm)
The destination and source formats may be signed (S) or unsigned (U), and 16 or 32 bits wide. The sources and destinations may be registers, shared
memory, constant memory, or global memory locations.
(SASS_assembly_lib):
Formats:
IMUL_U16_U16_shmem(int dest_reg, int addr_reg, int offset, int source_reg_2, char hilo_2, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
IMUL_S16_S16_regs(int dest_reg, int source_reg_1, char hilo_1, int source_reg_2, char hilo_2, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
IMUL_S16_S16_shmem(int dest_reg, int addr_reg, int offset, int source_reg_2, int hilo_2, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
IMUL_U16_U16_regs(int dest_reg, int source_reg_1, char hilo_1, int source_reg_2, char hilo_2, int condition, int pred_reg_cond, char set_pred, int pred_reg_set, int marker)
Bits     Size  Field                                             Values
(32-33)  2     instr_marker                                      00 = normal reg access (load or store) (not extra instruction)
                                                                 01 = normal reg access (load or store) (with Join) (extra instruction)
                                                                 10 = normal reg access (load or store) (with Exit)
                                                                 11 = immediate (by default)
(34-59)  26    Source 2: High part of the 32-bit immediate value XXXX XXXX XXXX XXXX XXXX XXXX XXXX XXXX … XX
(60)     1     Not used                                          0
(61-63)  3     Sub_op_code                                       000
SHL/SHR Instructions:
Checked YES
These instructions perform logic shift operations (left or right) on 16- or 32-bit operands. The sources can be registers or constant
memory locations. The operation of these instructions can depend on predicate conditions.
Mnemonics:
(Predicate) LOP. (Logic Operation).(Size) Destination, Source_1, Source_2
Destination and source registers are 16 or 32 bits wide. Source_1 can be a register or a shared memory location. Source_2 can be a constant
memory location.
(SASS_assembly_lib):
Formats:
Pending
Bits     Size  Field                                             Values
(32-33)  2     instr_marker                                      00 = normal reg access (load or store) (not extra instruction) (by default)
                                                                 01 = normal reg access (load or store) (with Join) (extra instruction)
                                                                 10 = normal reg access (load or store) (with Exit)
                                                                 11 = immediate
(34)     1     Used for….                                        0
(35)     1     destination type                                  0 = Register destination
(36-38)  3     Not used                                          000
(39-43)  5     predicate condition to operate the instruction    condition code, as listed below

               Encoding  Name    Description                         Condition formula
               0x00      never   always false                        0
               0x01      l       less than                           (S & ~Z) ^ O
               0x02      e       Equal                               Z & ~S
               0x03      le      less than or equal                  S ^ (Z | O)
               0x04      g       greater than                        ~Z & ~(S ^ O)
               0x05      lg      less or greater than                ~Z
               0x06      ge      greater than or equal               ~(S ^ O)
               0x07      lge     Ordered                             ~Z | ~S
               0x08      u       Unordered                           Z & S
               0x09      lu      less than or unordered              S ^ O
               0x0a      eu      equal or unordered                  Z
               0x0b      leu     not greater than                    Z | (S ^ O)
               0x0c      gu      greater than or unordered           ~S ^ (Z | O)
               0x0d      lgu     not equal to                        ~Z | S
               0x0e      geu     not less than                       (~S | Z) ^ O
               0x0f      always  always true (by default)            1
               0x10      o       Overflow                            O
               0x11      c       carry / unsigned not below          C
               0x12      a       unsigned above                      ~Z & C
               0x13      s       sign / negative                     S
               0x1c      ns      not sign / positive                 ~S
               0x1d      na      unsigned not above                  Z | ~C
               0x1e      nc      not carry / unsigned below          ~C
               0x1f      no      no overflow                         ~O

(44-45)  2     Input_predicate_register                          C0 = 00, C1 = 01, C2 = 10, C3 = 11
               Used as: precondition to operate the instruction
(46-51)  6     Not used                                          00 0000
(52)     1     Source 2 selector                                 1 = Immediate value, 0 = Register
(53-57)  5     Not used                                          0 0000
(58)     1     Size of destination and source                    1 = 32 bits, 0 = 16 bits
(59)     1     Use of the sign during the shift                  1 = Signed, 0 = Unsigned
(60)     1     Not used                                          0
(61)     1     Operation code of the shift (Left or Right)       0 = SHL, 1 = SHR
(62-63)  2     Sub_opcode                                        11
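The field layout above can be read back from an instruction word with a few shifts and masks. The following sketch decodes the SHL/SHR fields of bits 32-63 into a struct. The names `get_field`, `shift_fields_t` and `decode_shift_fields` are hypothetical and do not come from the FlexGripPlus sources, and the word is assumed to be a 64-bit value whose bit numbering matches the table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the SHL/SHR fields documented in the table above. */
typedef struct {
    unsigned marker;             /* bits 32-33: instr_marker                    */
    unsigned condition;          /* bits 39-43: predicate condition code        */
    unsigned pred_reg;           /* bits 44-45: input predicate register C0..C3 */
    int      src2_is_immediate;  /* bit 52: 1 = immediate, 0 = register         */
    int      is_32bit;           /* bit 58: 1 = 32 bits, 0 = 16 bits            */
    int      is_signed;          /* bit 59: 1 = signed, 0 = unsigned            */
    int      is_shr;             /* bit 61: 0 = SHL, 1 = SHR                    */
} shift_fields_t;

/* Hypothetical helper: extract bits [hi:lo] (at most 32 bits wide) from a 64-bit word. */
unsigned get_field(uint64_t word, unsigned hi, unsigned lo)
{
    assert(hi >= lo && hi < 64 && (hi - lo) < 32);
    unsigned width = hi - lo + 1;
    return (unsigned)((word >> lo) & ((UINT64_C(1) << width) - 1));
}

/* Decode the documented fields; all other bits are ignored in this sketch. */
shift_fields_t decode_shift_fields(uint64_t word)
{
    shift_fields_t f;
    f.marker            = get_field(word, 33, 32);
    f.condition         = get_field(word, 43, 39);
    f.pred_reg          = get_field(word, 45, 44);
    f.src2_is_immediate = (int)get_field(word, 52, 52);
    f.is_32bit          = (int)get_field(word, 58, 58);
    f.is_signed         = (int)get_field(word, 59, 59);
    f.is_shr            = (int)get_field(word, 61, 61);
    return f;
}
```

A decoder of this kind can be handy when checking hand-assembled words against the table, for example to confirm that bit 61 selects SHR and bit 58 selects the 32-bit form.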
IADD Instruction:
Checked YES
This instruction performs the integer addition (32 or 16 bits) of two sources. The sources can be registers or shared memory locations. The
operation of this instruction can depend on predicate conditions. Moreover, the operation may modify some of the predicate flags.
Pred: Rx <- Ry + Rz
Mnemonics:
ADD (predicate_condition_out) (Destination register) (Predicate_condition_in), (source register 1), (source register 2)
The predicate_condition must be set beforehand by other instructions to be used as a condition for the addition operation.
Destination and source registers are 16 or 32 bits wide. The source registers can be selected among (R0 - Rn), where n is the total number of
registers employed by the application.
(SASS_assembly_lib):
Formats:
pending
Bits     Size  Field                                             Values
(32-33)  2     instr_marker                                      00 = normal reg access (load or store) (not extra instruction) (by default)
                                                                 01 = normal reg access (load or store) (with Join) (extra instruction)
                                                                 10 = normal reg access (load or store) (with Exit)
                                                                 11 = immediate
(34)     1     Used for….                                        0 1
(35)     1     destination type                                  0 = Register destination
(36-37)  2     Predicate register to be set                      C0 = 00 (by default), C1 = 01, C2 = 10, C3 = 11
               (enabling a new flag, only carry)
(38)     1     Set predicate register                            1 = Enable predicate register set
                                                                 0 = Disable predicate register set
(39-43)  5     predicate_condition                               condition code, as listed below

               Encoding  Name    Description                         Condition formula
               0x00      never   always false (not used)             0
               0x01      l       less than                           (S & ~Z) ^ O
               0x02      e       Equal                               Z & ~S
               0x03      le      less than or equal                  S ^ (Z | O)
               0x04      g       greater than                        ~Z & ~(S ^ O)
               0x05      lg      less or greater than / not equal    ~Z
               0x06      ge      greater than or equal               ~(S ^ O)
               0x07      lge     Ordered                             ~Z | ~S
               0x08      u       Unordered                           Z & S
               0x09      lu      less than or unordered              S ^ O
               0x0a      eu      equal or unordered                  Z
               0x0b      leu     not greater than                    Z | (S ^ O)
               0x0c      gu      greater than or unordered           ~S ^ (Z | O)
               0x0d      lgu     not equal to                        ~Z | S
               0x0e      geu     not less than                       (~S | Z) ^ O
               0x0f      always  always true (by default)            1
               0x10      o       Overflow                            O
               0x11      c       carry / unsigned not below          C
               0x12      a       unsigned above                      ~Z & C
               0x13      s       sign / negative                     S
               0x1c      ns      not sign / positive                 ~S
               0x1d      na      unsigned not above                  Z | ~C
               0x1e      nc      not carry / unsigned below          ~C
               0x1f      no      no overflow                         ~O

(44-45)  2     Input_predicate_register                          C0 = 00, C1 = 01, C2 = 10, C3 = 11
               Used as: precondition to operate the instruction
(46-53)  8     Source_register_2. It could come from:            Register case: (46-52) R0 = 00000 … R5 = 00101, R6 = 00110 …; (53) = 0
               1) GPRs                                           Constant memory: second part of the constant memory address,
               2) Constant memory                                  e.g. C[0x2][0x16] gives (46-53) = 0001 0110
               3) Shared memory                                  Shared memory: (46-52) = 000 0000; (53) = 1 (use of shared memory)
(54-57)  4     The first part of Source_2 when constant memory   First part of the constant memory address,
               is employed                                         e.g. C[0x2][0x16] gives (54-57) = 0010
(58)     1     W_32: operation at 32 or 16 bits                  16 bits = 0, 32 bits = 1
(59)     1     Sign of Source_2                                  Positive = 0, Negative = 1
(60)     1     Not used                                          0
(61-63)  3     Sub_op_code                                       000
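The constant-memory example in the table, C[0x2][0x16], can be reproduced with a short packing routine. This is a minimal sketch, assuming that the second index occupies bits 46-53 and the first index occupies bits 54-57 exactly as in the example. The function name `iadd_const_src2_sketch` is hypothetical, and the remaining IADD fields are left untouched.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: write a constant-memory Source_2 reference C[bank][offset]
 * into an IADD instruction word, following the example given in the table:
 *   offset -> bits 46-53 (8 bits), bank -> bits 54-57 (4 bits). */
uint64_t iadd_const_src2_sketch(uint64_t word, unsigned bank, unsigned offset)
{
    assert(bank <= 0xF);                       /* 4-bit field */
    assert(offset <= 0xFF);                    /* 8-bit field */
    word &= ~(UINT64_C(0xFF) << 46);           /* clear bits 46-53 */
    word &= ~(UINT64_C(0xF) << 54);            /* clear bits 54-57 */
    word |= (uint64_t)offset << 46;            /* second index of C[..][..] */
    word |= (uint64_t)bank << 54;              /* first index of C[..][..]  */
    return word;
}
```

Calling `iadd_const_src2_sketch(0, 0x2, 0x16)` yields 0001 0110 in bits 46-53 and 0010 in bits 54-57, matching the values quoted in the table.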
IADD32 Instruction:
Checked YES
This instruction performs the 32-bit integer addition of two sources, which can be registers or shared memory locations. The operation of this
instruction does not depend on predicate conditions.
Rz <- Rx + Ry
Mnemonics:
IADD32 (Destination register), (source register 1), (source register 2)
Destination and source registers are 32 bits wide. The source registers can be selected among (R0 - Rn), where n is the total number of registers
employed by the application.
(SASS_assembly_lib):
Formats:
pending
IADD32I Instruction:
Checked YES
This instruction performs the 32-bit integer addition of one source, which can be a register or a shared memory location, and one immediate
operand.
Rz <- Rx + Imm
Mnemonics:
IADD32I (Destination register), (source register 1), (Immediate value)
Destination and source registers are 32 bits wide. The source register can be selected among (R0 - Rn), where n is the total number of registers employed
by the application.
(SASS_assembly_lib):
Formats:
Pending
Bits     Size  Field                                             Values
(32-33)  2     instr_marker                                      00 = normal reg access (load or store) (not extra instruction)
                                                                 01 = normal reg access (load or store) (with Join) (extra instruction)
                                                                 10 = normal reg access (load or store) (with Exit)
                                                                 11 = immediate (by default)
(34)     1     Used for….                                        0
(35)     1     destination type                                  0 = Register destination
(36-59)  24    The high part of the immediate value (28 - 6)     High_Imm XXXX XXXX XXXX XXXX XXXX XX
(60)     1     Not used                                          0
(61-63)  3     Sub_opcode                                        000
IMAD Instruction:
Checked YES
This instruction performs the multiply-and-add of three operands of 16 or 32 bits. The sources can be registers or constant memory
locations. The operation of this instruction can depend on predicate conditions.
Mnemonics:
(Predicate) IMAD.(Size) Destination, Source_1, Source_2, Source_3
(SASS_assembly_lib):
Formats:
Pending
(32 - 33)2 Instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by
default)
01 = normal reg Access(load or store) (with Join) (extra
instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
34 Not used (Address reg [2]) 0
35 destination type 0 = Register 1 = No destination, internal operation only
(36 – 37)2 Register to be set as result of operation if enabled 00 = C0 (by default) 01 = C1
10 = C2 11 = C3
38 Set predicate register as result of operation 1: Enable predicate register 0: Disable predicate register set
set
(39 – 43) 5 Predicate condition to operate the instruction encoding name Description condition formula
0x00 never always false 0
0x01 l less tan (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater tan ~Z & ~(S ^ O)
0x05 lg less or greater tan ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z&S
0x09 lu less than or unordered S^O
0x0a eu equal or unordered Z
0x0b leu not greater tan Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less tan (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input_predicate_register C0= 00 (by default) C1= 01
Used as: precondition to operate the instruction C2= 10 C3= 11
(46 – 52) 7 Third register operand (Source_3) (32 bits) R0= 0000000, R5= 0000101, R6= 0000110…
53 Source 1 selector 1 = Shared memory 0 = Register
(54 –55) 2 Not used 00
56 Source 2 Immediate value? 1= Immediate value 0 = Register
57 Not used
58 Sign of the Source 1 1 = Negative 0 =Positive
59 Sign of the Source 3 1 = Negative 0 =Positive
60 Not Used 0
(61 – 63) 3 Sub_opcode 000
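The condition formulas listed for bits 39 – 43 can be evaluated directly from the four flags of a predicate register. The following Python sketch is only an illustration: the function name `eval_condition` and the representation of the flags S, Z, O and C as 0/1 integers are assumptions made for this example and are not part of FlexGripPlus or of SASS_assembly_lib. Encodings that the table does not list (0x14 to 0x1b) are rejected.

```python
def eval_condition(code, S, Z, O, C):
    """Evaluate one predicate condition code from the table (flags are 0 or 1)."""
    n = lambda x: x ^ 1  # one-bit complement (the "~" of the table)
    formulas = {
        0x00: 0,
        0x01: (S & n(Z)) ^ O,
        0x02: Z & n(S),
        0x03: S ^ (Z | O),
        0x04: n(Z) & n(S ^ O),
        0x05: n(Z),
        0x06: n(S ^ O),
        0x07: n(Z) | n(S),
        0x08: Z & S,
        0x09: S ^ O,
        0x0a: Z,
        0x0b: Z | (S ^ O),
        0x0c: n(S) ^ (Z | O),
        0x0d: n(Z) | S,
        0x0e: (n(S) | Z) ^ O,
        0x0f: 1,
        0x10: O,
        0x11: C,
        0x12: n(Z) & C,
        0x13: S,
        0x1c: n(S),
        0x1d: Z | n(C),
        0x1e: n(C),
        0x1f: n(O),
    }
    if code not in formulas:
        raise ValueError(f"condition code {code:#04x} is not listed in the table")
    return bool(formulas[code])
```

For example, with S = 1 and Z = O = C = 0, code 0x01 (l) evaluates to true and code 0x06 (ge) evaluates to false.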
IMAD32 Instruction:
Checked No
This instruction performs an integer multiply-add on three operands of 16 or 32 bits. The sources must be registers.
PrE: Rz <- ( Ry * Rx ) + Rz
The destination register should be one of the source operands of the MAD operation (Source 3, or Rz).
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
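The operation Rz <- ( Ry * Rx ) + Rz of IMAD32 can be modelled in a few lines. This is a minimal sketch, assuming that the result is truncated to the operand size (16 or 32 bits) with wrap-around on overflow; the manual does not state how overflow or the multiplier width are handled, and the helper name `imad` is hypothetical.

```python
def imad(rx, ry, rz, bits=32):
    """Return (ry * rx + rz) truncated to `bits` bits.

    ASSUMPTION: two's-complement wrap-around at the selected operand size.
    """
    if bits not in (16, 32):
        raise ValueError("operand size must be 16 or 32 bits")
    mask = (1 << bits) - 1
    return ((ry & mask) * (rx & mask) + (rz & mask)) & mask
```

Because the destination is also the third source, a caller would write the returned value back to Rz.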
IMAD32I Instruction:
Checked No
This instruction performs an integer multiply-add on three operands of 16 or 32 bits when one of the sources is an immediate value.
The remaining sources must be registers.
The destination register should be one of the source operands of the MAD operation (Source 3, or Rz).
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
(32 - 33) 2 instr_marker 00 = normal reg Access(load or store) (not extra instruction)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate (by default)
(34 – 59) 26 The high part of the immediate value High_Imm XXXX XXXX XXXX XXXX XXXX XX
60 Not Used 0
(61 – 63) Sub_opcode 000
LOP Instruction:
Checked YES
This instruction performs logic operations (AND, OR, XOR, PASS, and NOT) on operands of 16 or 32 bits. The sources can be registers or
shared memory locations. The execution of this instruction may depend on predicate conditions. Moreover, the operation may modify some
of the predicate flags.
Mnemonics:
(Predicate) LOP. (Logic Operation).(Size) Destiny, Source_1, Source_2
The Destiny and source registers are 16 or 32 bits wide. Source_1 can be a register or a shared memory location. Source_2 can be a constant
memory location.
(SASS_assembly_lib):
Formats:
Pending
(32 – 33) 2 instr_marker 00 = normal reg Access(load or store) (not extra instruction)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate (by default)
34 High_Address of 2_operand 0
35 destination type 0 = Register destination 1= Memory destination
(36 – 37) 2 Predicate register set (enabling a new flag) or Not used 00 = C0 (by default) 01 = C1
10 = C2 11 = C3
(38) 1 Set predicate register 1: Enable predicate register set 0: Disable predicate register set
(39 – 43) 5 predicate_condition to be considered to execute the instruction if the input predicate comparison is active encoding name Description condition formula
0x00 never always false (not used) 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 – 45) 2 Input predicate register to check before operating C0 = 00 C1 = 01
C2 = 10 C3 = 11
(46 – 47) 2 Logic_operation_selector AND = 00 OR = 01
XOR = 10 NOT = 11
(48 – 49) 2 Not used 00
50 Source 1 inverted 1 = inverted 0 = not inverted (not working)
51 Source 2 inverted 1 = inverted 0 = not inverted (not working)
52 Not used
53 Shared memory use for Source_2? Yes = 1 No = 0 (register use)
54 Use of constant memory as Source_2? Yes = 1 No = 0 (register use)
(55 – 56) 2 Index of the Constant memory space c[xx][xx] 00 (not supported in FLEXGRIPPLUS)
57 Not used 0
58 Size selector, Modifier 1 0: b16 1: b32
59 Size selector, Modifier 2 0: u16/u32 1: s16/s32
(60) 1 Not used 0
(61 – 63)3 Sub_op_code 000
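The effect of the Logic_operation_selector (bits 46 – 47) and of the size selector (bit 58) can be illustrated with a short Python model. This is a sketch under stated assumptions: the function name `lop` is hypothetical, and since the table does not say which operand NOT applies to, the sketch complements Source_1.

```python
LOGIC_OPS = {0b00: "AND", 0b01: "OR", 0b10: "XOR", 0b11: "NOT"}

def lop(selector, src1, src2, b32=True):
    """Apply the logic operation chosen by bits 46-47 to two operands."""
    mask = 0xFFFFFFFF if b32 else 0xFFFF  # bit 58: 1 = b32, 0 = b16
    a, b = src1 & mask, src2 & mask
    name = LOGIC_OPS[selector]
    if name == "AND":
        return a & b
    if name == "OR":
        return a | b
    if name == "XOR":
        return a ^ b
    # ASSUMPTION: NOT complements Source_1; the table does not name the operand.
    return ~a & mask
```

The result is always masked to the selected width, so a 16-bit NOT of 0x00FF yields 0xFF00 rather than a 32-bit value.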
ISET Instruction:
Checked YES
This instruction compares two integer sources. A destination register can be written if selected. As a consequence of the comparison, the
instruction updates one flag of a predicate register. It can also require an input predicate condition as a precondition for its
execution.
Pre: Rx vs Ry
Mnemonics:
ISET Destiny. (Predicate condition) , Source_1, Source_2
Source_1, Source_2, and Destiny are general purpose registers or constant memory parameters.
(SASS_assembly_lib):
Formats:
ISET_regs(char sigd, int dest_reg, int source_reg_1, int source_reg_2, int comparison, int condition, int pred_reg_cond, char set_pred,
int pred_reg_set, char output_reg, int marker)
(53) 1 Selection of the shared memory as one of the sources for comparison. 1 = Shared memory used 0 = Shared memory not used
(54) 1 Selection of constant memory as one of the sources for comparison. 1 = Constant memory used 0 = Constant memory not used
(55 – 57) 3 The high part of the second memory operand [XXXX][]
(58) 1 Size of operands 0: b16 1: b32
(59) 1 Signed or unsigned selection for destiny 0: u16/u32 (Unsigned) 1: s16/s32 (Signed)
(60) 1 Not used 0
(63 – 61) 3 Secondary Operation Code 011
MVC
GLD
GST
MOV
MOV32
MVI
R2G
R2A
A2R
ADA
MVC Instruction:
Checked YES
This instruction moves an immediate operand encoded in the opcode of the instruction. The immediate value can be combined with
one address register.
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
(32 - 33) 2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit) (another option)
11 = immediate
(34 – 35) 2 Not used 00
(36-37) 2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default) C2 = 10
C1 = 01 C3 = 11
38 Set predicate register as result of operation 1: Enable predicate register set 0: Disable predicate register set
(39 – 43) 5 Predicate condition to operate the instruction Encoding name Description condition formula
0x00 never always false 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input_predicate_register C0= 00 (by default) C1= 01
Used as: precondition to operate the instruction C2= 10 C3= 11
(46-47) 2 Size of movement (source size) 11= 32 bits 01= 16 bits
00= 8 bits
(48-53) 6 Not used 00 0000
54 Address Register or Imm address 1 = Immediate address 0 = Address register
(55-57) 3 Not used 000
58 Size of the destiny 1= 32 bits 0= 16 bits
59 Signed or unsigned sources 1=S16/S32 0= U16/U32 (Unsigned)
60 Not used 0
(61 – 63) 3 Sub_opcode 001
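The fields of the MVC table can be extracted from an instruction word with shifts and masks. The Python sketch below is only an illustration and rests on two assumptions that this section does not state explicitly: the bit numbers index a 64-bit instruction word whose bit 0 is the least significant bit, and multi-bit values are written in the table with the most significant bit first. The names `field` and `decode_mvc_high`, as well as the dictionary keys, are hypothetical.

```python
def field(instr, lo, hi):
    """Return bits lo..hi (inclusive) of a 64-bit instruction word as an integer."""
    return (instr >> lo) & ((1 << (hi - lo + 1)) - 1)

def decode_mvc_high(instr):
    """Split the upper half (bits 32-63) of an MVC instruction into its fields."""
    return {
        "instr_marker": field(instr, 32, 33),
        "pred_reg_set": field(instr, 36, 37),
        "set_pred": field(instr, 38, 38),
        "pred_condition": field(instr, 39, 43),
        "input_pred_reg": field(instr, 44, 45),
        "move_size": field(instr, 46, 47),     # 11 = 32 bits, 01 = 16 bits, 00 = 8 bits
        "imm_address": field(instr, 54, 54),   # 1 = immediate address, 0 = address register
        "dest_32bit": field(instr, 58, 58),
        "signed": field(instr, 59, 59),
        "sub_opcode": field(instr, 61, 63),    # 001 for MVC
    }
```

The same `field` helper applies to the other tables of this manual, since they all describe bit ranges of the same word.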
GLD Instruction:
Checked YES
This instruction loads an operand of 8, 16, or 32 bits from the main (global) memory of the GPGPU.
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
(32 - 33) 2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
(34 - 38) 5 Not used 0 0000
(39 – 43) 5 Predicate condition to operate the instruction Encoding name Description condition formula
0x00 never always false 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input_predicate_register C0= 00 (by default) C1= 01
Used as: precondition to operate the instruction C2= 10 C3= 11
(46 - 52) 7 Not used 000 0000
(53 – 55) 3 Destiny_move_size 000=DT_U8 (U8) 001=DT_S8 (S8)
010=DT_U16 (U16) 011=DT_S16 (S16)
100=DT_U64 (U64) (NOT supported) 101=DT_U128 (U128) (NOT supported)
110=DT_U32 (U32) 111=DT_S32 (S32)
(56-60) 5 Not used 0 0000
(61 – 63) 3 Sub_opcode 100 Load
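The Destiny_move_size field (bits 53 – 55) selects both the width and the signedness of the loaded operand. The following Python sketch shows one way to interpret it. It assumes that values narrower than 32 bits are zero-extended (unsigned types) or sign-extended (signed types) into the 32-bit destination register; the manual does not describe this step, and the names `GLD_TYPES` and `extend_loaded_value` are hypothetical.

```python
# code -> (name, width in bits, signed); None marks encodings that are not supported
GLD_TYPES = {
    0b000: ("U8", 8, False),
    0b001: ("S8", 8, True),
    0b010: ("U16", 16, False),
    0b011: ("S16", 16, True),
    0b100: None,  # U64, not supported
    0b101: None,  # U128, not supported
    0b110: ("U32", 32, False),
    0b111: ("S32", 32, True),
}

def extend_loaded_value(raw, size_code):
    """Widen a value read from global memory to a 32-bit register image."""
    entry = GLD_TYPES[size_code]
    if entry is None:
        raise ValueError("64- and 128-bit loads are not supported")
    _, width, signed = entry
    value = raw & ((1 << width) - 1)
    # ASSUMPTION: signed narrow values are sign-extended to 32 bits.
    if signed and value >> (width - 1):
        value -= 1 << width
    return value & 0xFFFFFFFF
```

With these assumptions, the byte 0xFF becomes 0x000000FF when loaded as U8 and 0xFFFFFFFF when loaded as S8.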
GST Instruction:
This instruction stores one operand from a general-purpose register into the global memory.
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
(32 - 33) 2 instr_marker 00 normal reg Access(load or store) (not extra instruction) (By default)
01 normal reg Access(load or store) (with Exit)
10 normal reg Access(load or store) (with Join) (extra instruction) (33=1, 32=0)
11 immediate
(34-38) 5 Not used 0 0000
(39 – 43) 5 predicate_condition encoding name Description condition formula
0x00 never always false 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 – 45) 2 Input_predicate_register C0= 00 (by default) C1= 01
C2= 10 C3= 11
(46 – 52) 7 Not used 000 0000
(53-55) 3 Move_operand_size U8 = 000 U64 = 100
S8 = 001 U128 = 101
U16 = 010 U32 = 110 (by default)
S16 = 011 S32 = 111
(56 - 60) 5 Not used 0 0000
(61 - 63) 3 Sub_Opcode 000=DT_U16 100=DT_S32
001=DT_S16 101=DT_S32 (by default)
010=DT_S16 110=DT_U32
011=DT_U32 111=DT_S32
MOV Instruction:
This instruction moves an operand from one general-purpose register into another.
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
(32 - 33) 2 instr_marker 00 normal reg Access(load or store) (not extra instruction) (By default)
01 normal reg Access(load or store) (with Exit)
10 normal reg Access(load or store) (with Join) (extra instruction) (33=1, 32=0)
11 immediate
34 Address register high part A4 = 1 Ax = 0
(35 - 38) 4 Not used 0000
(39 – 43) 5 predicate_condition encoding name Description condition formula
0x00 never always false 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 – 45) 2 Input_predicate_register C0= 00 (by default) C1= 01
C2= 10 C3= 11
(46-48) 3 Size of movement (source size) 000 DT_U8 100 DT_U32
001 DT_U16 101 DT_U32
010 DT_S16 110 DT_U32
011 DT_U32 111 DT_U32
49 ?? 1
(50 – 52) 3 Not used 000
53 Source 1 selector 1 = Shared memory 0 = Register
(54 – 57) 4 Not used 0000
58 Size of the destiny 1= 32 bits 0= 16 bits
59 Signed or unsigned sources 1=S16/S32 0= U16/U32 (Unsigned)
60 Not used 0
(61 - 63) 3 Sub_Opcode 000=DT_U16 100=DT_S32
001=DT_S16 101=DT_S32 (by default)
010=DT_S16 110=DT_U32
011=DT_U32 111=DT_S32
MOV32 Instruction:
Checked YES
This instruction moves an operand from one general-purpose register into another.
PrE: Rx <- Ry
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
MVI Instruction:
Checked YES
This instruction moves an immediate operand encoded in the opcode of the instruction.
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
(32 - 33) 2 instr_marker 00 = normal reg Access(load or store) (not extra instruction)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate (by default)
(34 - 59)26 Immediate high part 26 bits XX XXXX XXXX XXXX XXXX XXXX XXXX
60 Not used
(61 – 63) 3 Sub_opcode 000
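Bits 34 – 59 carry the 26 high bits of the 32-bit immediate. The Python sketch below illustrates how such a split could be computed. It assumes that the remaining 6 low bits of the immediate are stored in the lower half of the instruction word, whose layout is not covered by this table, and that the high part is simply the immediate shifted right by 6; the function names are hypothetical.

```python
def split_immediate(imm32):
    """Split a 32-bit immediate into (low 6 bits, high 26 bits)."""
    imm32 &= 0xFFFFFFFF
    return imm32 & 0x3F, imm32 >> 6

def mvi_high_word_bits(imm32):
    """Return the bits 32-59 contribution: instr_marker = 11 and the high immediate part."""
    _, high = split_immediate(imm32)
    # ASSUMPTION: bit 0 of the 64-bit word is the least significant bit.
    return (0b11 << 32) | (high << 34)
```

Bit 60 and the Sub_opcode (bits 61 – 63) are left at zero, which matches the values listed for MVI.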
R2G Instruction:
Checked YES
This instruction moves an operand from a general-purpose register to a shared memory location. The location in the shared
memory can be formed by combining an address register with an immediate value (an address offset).
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
(32 - 33) 2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit) (another option)
11 = immediate
(34 – 35) 2 Not used 00
(36-37) 2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default) C2 = 10
C1 = 01 C3 = 11
38 Set predicate register as result of operation 1: Enable predicate register set 0: Disable predicate register set
(39 – 43) 5 Predicate condition to operate the instruction Encoding name Description condition formula
0x00 never always false 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 – 45) 2 Input_predicate_register C0= 00 (by default) C1= 01
Used as: precondition to operate the instruction C2= 10 C3= 11
(46–52) 7 Source register R0= 0000000, R5=0000101, R6=0000110…
(53–54) 2 Size of movement (source size) 01= 32 bits
00= 16 bits
10= 8 bits
(55-57) 3 Not used 000
58 Size of the destiny 1= 32 bits 0= 16 bits
59 Signed or unsigned sources 1=S16/S32 0= U16/U32 (Unsigned)
60 Not used 0
(61 – 63) 3 Sub_opcode 111
R2A Instruction:
Checked YES
This instruction moves an operand from a general-purpose register to an address register, which is used to address the shared
or constant memories of the GPGPU.
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
(32 - 33) instr_marker 00 normal reg Access(load or store) (not extra instruction) (by default)
01 normal reg Access(load or store) (with Join) (extra instruction)
10 normal reg Access(load or store) (with Exit)
11 immediate
34 Not used 0
(35 – 38) 4 Not used 0000
(39 – 43) 5 Predicate condition to operate the instruction Encoding name Description condition formula
0x00 never always false 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 – 45) 2 Input_predicate_register C0= 00 (by default) C1= 01
C2= 10 C3= 11
(46 – 60) 15 Not used 000 0000 0000 0000
(61-63) Sub_Opcode 000 DT_U16
001 DT_S16
010 DT_S16
011 DT_U32
100 DT_S32
101 DT_S32
110 DT_U32 (by default)
111 DT_S32
A2R Instruction:
Checked YES
This instruction moves an operand from an address register to a general-purpose register.
Mnemonics:
(SASS_assembly_lib):
Formats:
Pending…
(32 - 33) instr_marker 00 normal reg Access(load or store) (not extra instruction) (by default)
01 normal reg Access(load or store) (with Join) (extra instruction)
10 normal reg Access(load or store) (with Exit)
11 immediate
34 Not used 0
(35 – 38) 4 Not used 0000
(39 – 43) 5 Predicate condition to operate the instruction Encoding name Description condition formula
0x00 never always false 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 – 45) 2 Input_predicate_register C0= 00 (by default) C1= 01
C2= 10 C3= 11
(46 – 60) 15 Not used 000 0000 0000 0000
(61-63) Sub_Opcode 010
ADA Instruction:
Checked YES
This instruction adds an immediate value to an address register. (The address registers are used to address the shared memory of
the GPGPU.)
Ax <- Ay + Imm
Mnemonics:
ADA (Destiny register), (source register), Imm
The Destiny and source registers are 32 bits wide. The source register seems to be selected among A0 - A3, whereas the destiny may be any of A0 – A15.
(SASS_assembly_lib):
Formats:
Not implemented yet… pending to describe
(32 - 33) 2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
(34)1 Not used 0
(35)1 destination type 0 = Register destination 1= Memory destination
(36-37)2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default)
C1 = 01
C2 = 10
C3 = 11
(38)1 Write enable / set predicate register 1 = write enabled (just for memory destination)
1 = enable predicate register set, 0 = disable predicate register set
Not used (0)
(39 – 43) 5 predicate_condition encoding name Description condition formula
0x00 never always false (not used) 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(45 - 59) Not used? High part of the Imm value, or from the source or destiny register? 0000 0000 0000 0000
(61 – 63)3 Sub_op_code 001
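The operation Ax <- Ay + Imm can be modelled on a small register file. This is a minimal sketch with a hypothetical function name `ada`. It follows the tentative observation above that the source is one of A0 - A3 and the destiny one of A0 – A15, and it assumes that the sum wraps around at 32 bits, which the manual does not state.

```python
def ada(addr_regs, dest, src, imm):
    """Model Ax <- Ay + Imm on a list of sixteen 32-bit address registers."""
    if not 0 <= src <= 3:
        raise ValueError("source is expected among A0 - A3")
    if not 0 <= dest <= 15:
        raise ValueError("destiny is expected among A0 - A15")
    # ASSUMPTION: the sum wraps around at 32 bits.
    addr_regs[dest] = (addr_regs[src] + imm) & 0xFFFFFFFF
    return addr_regs[dest]
```

For instance, with A1 = 0x10, calling `ada(regs, 4, 1, 0x20)` leaves 0x30 in A4.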
FADD32
FADD
FADD32I
FMUL
FMUL32
FMUL32I
FMAD
FMAD32
FMAD32I
F2F
F2I
I2F
FSET
RCP
RCP32
FADD32 Instruction:
Checked No, partially implemented, and checking in progress.
This instruction performs the single-precision (32-bit) floating-point addition of two sources. The sources can be registers or shared memory
locations.
Mnemonics:
FADD32 (Destiny register), (source register), (source register)
The Destiny and source registers are 32 bits wide. The source registers seem to be selected among R0 - Rn, where n is the total number of registers
employed by the application.
(SASS_assembly_lib):
Formats:
Not implemented yet…
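A single-precision addition can be emulated in Python by rounding every value to 32-bit precision with the standard `struct` module. This is only a sketch with hypothetical helper names. It assumes IEEE 754 round-to-nearest-even behaviour, whereas the manual does not state which rounding mode the FlexGripPlus FPU applies to FADD32.

```python
import struct

def to_f32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def fadd32(a, b):
    """Add two values in single precision (ASSUMPTION: round to nearest even)."""
    return to_f32(to_f32(a) + to_f32(b))
```

The sketch makes the limited precision visible: adding 1.0 to 16777216.0 (2^24) returns 16777216.0, because 16777217 is not representable in 32 bits.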
FADD Instruction:
Checked No, partially implemented, and checking in progress.
This instruction performs the single-precision (32-bit) floating-point addition of two sources. The sources can be registers or shared memory
locations. The execution of this instruction may depend on predicate conditions. Moreover, the operation may modify some of the
predicate flags.
Mnemonics:
FADD (predicate_condition) (Destiny register), (source register), (source register)
The predicate_condition must be previously set by other instructions to be used as a condition for the addition operation.
The Destiny and source registers are 32 bits wide. The source registers seem to be selected among R0 - Rn, where n is the total number of registers
employed by the application.
(SASS_assembly_lib):
Formats:
Not implemented yet…
(32 - 33) 2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
(34)1 Used for…. 0 1
(35)1 destination type 0 = Register destination 1= Memory destination
(36-37) 2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default)
C1 = 01
C2 = 10
C3 = 11
(38) 1 Set predicate register 1 = Enable predicate register set 0 = Disable predicate register set
(39 – 43)5 predicate_condition encoding name Description condition formula
0x00 never always false (not used) 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input predicate register to check before operating C0= 00 C1= 01
C2= 10 C3= 11
(46 - 53) 8 Source_register_2: It could be coming from: 4) GPRS 5) Constant memory 6) Shared memory
Register case: (46-52): R0= 00000 … R5= 00101 R6= 00110 … (53): 0
Constant memory: Second part of the constant memory, e.g. C[0x2][0x16]: (46-53) = 0001 0110
Shared memory: (46-52): 000 0000 (53): 1 (use of shared memory)
(54 – 57) 4 First part of the Source_2 when constant memory is employed First part of the constant memory, e.g. C[0x2][0x16]: (54-57) = 0010
58 Sign of Source_1 Positive = 0 Negative = 1
59 Sign of Source_2 Positive = 0 Negative = 1
60 Not used 0
(61 – 63)3 Sub_op_code 000
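The constant memory example of the table, C[0x2][0x16], maps to two bit fields: the second index goes to bits 46 – 53 and the first index to bits 54 – 57. The Python sketch below reproduces that mapping. The function and parameter names are hypothetical, and the sketch assumes that bit 0 of the 64-bit instruction word is the least significant bit.

```python
def encode_const_source2(first_part, second_part):
    """Return the bits 46-57 contribution of a constant operand C[first_part][second_part]."""
    if not 0 <= second_part <= 0xFF:
        raise ValueError("second part must fit in bits 46-53 (8 bits)")
    if not 0 <= first_part <= 0xF:
        raise ValueError("first part must fit in bits 54-57 (4 bits)")
    return (second_part << 46) | (first_part << 54)
```

For C[0x2][0x16] the function yields 0001 0110 in bits 46 – 53 and 0010 in bits 54 – 57, matching the values given in the table.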
FADD32I Instruction:
Checked No, partially implemented, and checking in progress.
This instruction performs the single-precision (32-bit) floating-point addition of one source and one immediate value. The source and
the destination can be registers.
Mnemonics:
FADD32I (Destiny register), (source register), (Immediate value)
The Destiny and source registers are 32 bits wide. The source register seems to be selected among R0 - Rn, where n is the total number of registers
employed by the application.
(SASS_assembly_lib):
Formats:
Not implemented yet…
(32 - 33)2 instr_marker 00 = normal reg Access(load or store) (not extra instruction)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate (by default)
(34 - 59) 26 The high part of the immediate value of 32 bits
(60) 1 Not used 0
(61 – 63)3 Sub_op_code 000
FMUL Instruction:
Checked No, partially implemented, and checking in progress.
This instruction performs the single-precision (32-bit) floating-point multiplication of two sources. The sources and the destination can be
registers, shared memory locations, constant memory locations, or immediate values. A predicate condition can be present as a precondition for
executing the operation.
Mnemonics:
FMUL. (Predicate condition) (Destiny), (Source_1), (Source_2)
Source_1 and Source_2 can be an immediate value, a shared memory location, or a constant memory element. In most cases, Source_1 can be a
shared memory location and, similarly, Source_2 can be a constant memory location.
(SASS_assembly_lib):
Formats:
Not implemented yet…
(32 - 33)2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
(34)1 Used for…. 0 1
(35)1 destination type 0 = Register destination 1= Memory destination
(36-37) 2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default) C1 = 01
C2 = 10 C3 = 11
(38) 1 Set predicate register 1 = Enable predicate register set 0 = Disable predicate register set
(39 – 43)5 predicate_condition encoding name Description condition formula
0x00 never always false (not used) 0
0x01 l less than (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z & S
0x09 lu less than or unordered S ^ O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input predicate register to check before operating C0= 00 C1= 01
C2= 10 C3= 11
(46 - 47) 2 Result rounding method No rounding = 00 Round toward zero = 11
(53) 1 Shared memory use for Source_2? Yes = 1 No = 0
(54) 1 Use of constant memory as Source_2? Yes = 1 No = 0
(58) 1 Sign of Source_1 Positive = 0, Negative = 1
(59) 1 Sign of Source_2 Positive = 0, Negative = 1
(60) 1 Not used 0
(61 – 63)3 Sub_op_code 000
FMUL32 Instruction:
Checked No, partially implemented, and checking in progress.
This instruction performs the single-precision (32-bit) floating-point multiplication of two sources. The sources and the destination can be
registers, shared memory locations, constant memory locations, or immediate values. This instruction does not take a predicate condition as a
precondition.
Mnemonics:
FMUL32 (Destiny), (Source_1), (Source_2)
Source_1 and Source_2 can be an immediate value, a shared memory location, or a constant memory element. In most cases, Source_1 can be a
shared memory location and, similarly, Source_2 can be a constant memory location.
(SASS_assembly_lib):
Formats:
Not implemented yet…
FMUL32I Instruction:
Checked No, partially implemented, and checking in progress.
This instruction performs the single-precision (32-bit) floating-point multiplication of two sources. The sources and the destination can be
registers, shared memory locations, constant memory locations, or immediate values. This instruction does not take a predicate condition as a
precondition.
Mnemonics:
FMUL32I (Destiny), (Source_1), Imm
(SASS_assembly_lib):
Formats:
Not implemented yet…
(32 - 33) 2 instr_marker 00 = normal reg Access(load or store) (not extra instruction)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate (by default)
(34 - 59) 26 The high part of the immediate value of 32 bits
(60) 1 Not used 0
(61 – 63)3 Sub_op_code 000
FMAD32 Instruction:
Checked No, partially implemented, and checking in progress.
This instruction performs a single-precision (32-bit) floating-point multiply-add on three sources. The sources and the destination
can be registers, shared memory locations, constant memory locations, or immediate values. This instruction does not take a predicate
condition as a precondition.
The destination register should be one of the source operands of the MAD operation.
Mnemonics:
(SASS_assembly_lib):
Not implemented yet…
(32 - 33)2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
(34)1 Used for…. 0 1
(35)1 destination type 0 = Register destination 1= Memory destination
(36-37) 2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default) C1 = 01
C2 = 10 C3 = 11
(38) 1 Set predicate register 1 = Enable predicate register set 0 = Disable predicate register set
(39 – 43)5 predicate_condition encoding name Description condition formula
0x00 never always false (not used) 0
0x01 l less tan (S & ~Z) ^ O
0x02 e Equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater tan ~Z & ~(S ^ O)
0x05 lg less or greater tan / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z&S
0x09 lu less than or unordered S^O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input predicate register to check before operating C0= 00 C1= 01
C2= 10 C3= 11
(46-52) 6 Source 3: It could be a register or a constant memory location
Register case: R0= 00000 …, R5= 00101, R6= 00110 …
Constant memory case: High part of the constant memory, (e.g.) C[0x2][0x16]: (46-52) = 00 0010
(53) 1 Shared memory used for Source_2? This bit indicates whether the shared memory is employed. Yes = 1 No = 0
(54) 1 Use of constant memory for Source_2 or Source_3? Yes = 1 No = 0
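The predicate_condition formulas listed for bits 39 - 43 can be read as Boolean expressions over the sign (S), zero (Z), overflow (O), and carry (C) flags. The following is a minimal Python sketch of that reading, assuming each flag is a single bit and that "~" denotes a one-bit logical NOT. The names `CONDITIONS`, `n`, and `evaluate_condition` are hypothetical and are not part of FlexGripPlus.

```python
# Hypothetical reference model of the predicate_condition formulas (bits 39-43).
# Assumption: S, Z, O, C are one-bit flags (0 or 1) and "~" is a one-bit NOT.

def n(bit):
    """One-bit logical NOT."""
    return bit ^ 1

# encoding -> (name, formula), transcribed from the table in this section
CONDITIONS = {
    0x00: ("never",  lambda S, Z, O, C: 0),
    0x01: ("l",      lambda S, Z, O, C: (S & n(Z)) ^ O),
    0x02: ("e",      lambda S, Z, O, C: Z & n(S)),
    0x03: ("le",     lambda S, Z, O, C: S ^ (Z | O)),
    0x04: ("g",      lambda S, Z, O, C: n(Z) & n(S ^ O)),
    0x05: ("lg",     lambda S, Z, O, C: n(Z)),
    0x06: ("ge",     lambda S, Z, O, C: n(S ^ O)),
    0x07: ("lge",    lambda S, Z, O, C: n(Z) | n(S)),
    0x08: ("u",      lambda S, Z, O, C: Z & S),
    0x09: ("lu",     lambda S, Z, O, C: S ^ O),
    0x0A: ("eu",     lambda S, Z, O, C: Z),
    0x0B: ("leu",    lambda S, Z, O, C: Z | (S ^ O)),
    0x0C: ("gu",     lambda S, Z, O, C: n(S) ^ (Z | O)),
    0x0D: ("lgu",    lambda S, Z, O, C: n(Z) | S),
    0x0E: ("geu",    lambda S, Z, O, C: (n(S) | Z) ^ O),
    0x0F: ("always", lambda S, Z, O, C: 1),
    0x10: ("o",      lambda S, Z, O, C: O),
    0x11: ("c",      lambda S, Z, O, C: C),
    0x12: ("a",      lambda S, Z, O, C: n(Z) & C),
    0x13: ("s",      lambda S, Z, O, C: S),
    0x1C: ("ns",     lambda S, Z, O, C: n(S)),
    0x1D: ("na",     lambda S, Z, O, C: Z | n(C)),
    0x1E: ("nc",     lambda S, Z, O, C: n(C)),
    0x1F: ("no",     lambda S, Z, O, C: n(O)),
}

def evaluate_condition(code, S, Z, O, C):
    """Return 1 if the condition with the given encoding holds, else 0."""
    _name, formula = CONDITIONS[code]
    return formula(S, Z, O, C)
```

For example, `evaluate_condition(0x01, S=1, Z=0, O=0, C=0)` evaluates the "l" (less than) formula with only the sign flag set.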
FMAD32I Instruction:
Checked No, partially implemented, and checking in progress.
This instruction performs a single-precision (32-bit) floating-point multiplication and addition among two sources and one immediate value.
In most cases the sources and the destination are general-purpose registers. This instruction does not take a predicate condition as a
precondition.
The destination register should be one of the source operands of the MAD operation.
Mnemonics:
(SASS_assembly_lib):
Formats:
Not implemented yet…
(32 - 33)2 instr_marker 00 = normal reg Access(load or store) (not extra instruction)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate (by default)
(34 - 59) 26 The high part of the immediate value of 32 bits
(60) 1 Not used 0
(61 – 63)3 Sub_op_code 000
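The fields of the immediate format listed above can be extracted from a 64-bit instruction word with shifts and masks. The sketch below assumes that bit 0 is the least significant bit of the 64-bit word; this bit ordering and the function name `decode_immediate_fields` are assumptions made for illustration and are not taken from the FlexGripPlus sources.

```python
# Hypothetical decoder for the upper-word fields of the immediate format.
# Assumption: bit 0 is the least significant bit of the 64-bit instruction word.

def decode_immediate_fields(word):
    """Extract instr_marker, the high part of the immediate, and Sub_op_code."""

    def bits(lo, hi):
        # Return the inclusive bit range [lo, hi] of the word as an integer.
        width = hi - lo + 1
        return (word >> lo) & ((1 << width) - 1)

    return {
        "instr_marker": bits(32, 33),  # 11 = immediate
        "imm_high": bits(34, 59),      # high 26 bits of the 32-bit immediate
        "sub_op_code": bits(61, 63),
    }
```

A word built with `(0b11 << 32) | (0x2ABCDEF << 34)` would decode to an instr_marker of 3 (immediate), an imm_high of 0x2ABCDEF, and a sub_op_code of 0.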
F2F Instruction:
Checked No, partially implemented, and checking in progress.
This instruction converts between two floating-point values. It is used to change the format of a value or to move a value
between floating-point operands. A predicate condition can be used as part of the preconditions.
Mnemonics:
(SASS_assembly_lib):
Formats:
Not implemented yet…
(32 - 33)2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
(34)1 Used for…. 0 1
(35)1 destination type 0 = Register destination 1= Memory destination
(36-37) 2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default) C1 = 01
C2 = 10 C3 = 11
(38) 1 Set predicate register 1 = Enable predicate register set 0 = Disable predicate register set
(39 – 43)5 predicate_condition encoding name Description condition formula
0x00 never always false (not used) 0
0x01 l less than (S & ~Z) ^ O
0x02 e equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z&S
0x09 lu less than or unordered S^O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input predicate register to check before operating C0= 00 C1= 01
C2= 10 C3= 11
(46) 1 Fixed value, purpose? 1
(52) 1 Absolute value in source_1 Yes = 1 No = 0
(54) 1 Use of constant memory for Source_2 or Source_3? Yes = 1 No = 0
F2I Instruction:
Checked No, partially implemented, and checking in progress.
This instruction converts a floating-point value into an integer. Predicate conditions can be used as part of the
preconditions to execute the instruction.
Mnemonics:
(SASS_assembly_lib):
Formats:
Not implemented yet…
(32 - 33)2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
(34)1 Used for…. 0 1
(35)1 destination type 0 = Register destination 1= Memory destination
(36-37) 2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default) C1 = 01
C2 = 10 C3 = 11
(38) 1 Set predicate register 1 = Enable predicate register set 0 = Disable predicate register set
(39 – 43)5 predicate_condition encoding name Description condition formula
0x00 never always false (not used) 0
0x01 l less than (S & ~Z) ^ O
0x02 e equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z&S
0x09 lu less than or unordered S^O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input predicate register to check before operating C0= 00 C1= 01
C2= 10 C3= 11
(46) 1 Fixed value, purpose? 1
(49-50) 2 Rounding mechanism 00 = no rounding 11 = toward zero
(54) 1 Use of constant memory for Source_2 or Source_3? Yes = 1 No = 0
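As an illustration of the rounding field (bits 49-50), the sketch below models only the documented "toward zero" setting (11), which discards the fractional part of the value. The function name `f2i` and the constant `ROUND_TO_ZERO` are hypothetical. The behaviour of the other settings, of out-of-range values, and of NaN inputs is not described in this section, so the sketch does not attempt to model them.

```python
import math

# Hypothetical model of the F2I rounding field (bits 49-50).
ROUND_TO_ZERO = 0b11  # "11 = toward zero" in the format table

def f2i(value, rounding_field):
    """Convert a finite float to an integer using the selected rounding mode."""
    if rounding_field == ROUND_TO_ZERO:
        # Truncation drops the fractional part: 2.9 -> 2 and -2.9 -> -2.
        return math.trunc(value)
    # Other settings are not specified in this section.
    raise NotImplementedError("rounding mode not modelled in this sketch")
```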
I2F Instruction:
Checked No, partially implemented, and checking in progress.
This instruction converts an integer into a single-precision (32-bit) floating-point value. Predicate conditions can be used as
part of the preconditions to execute the instruction.
Mnemonics:
(SASS_assembly_lib):
Formats:
Not implemented yet…
(32 - 33)2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
(34)1 Used for…. 0 1
(35)1 destination type 0 = Register destination 1= Memory destination
(36-37) 2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default) C1 = 01
C2 = 10 C3 = 11
(38) 1 Set predicate register 1 = Enable predicate register set 0 = Disable predicate register set
(39 – 43)5 predicate_condition encoding name Description condition formula
0x00 never always false (not used) 0
0x01 l less than (S & ~Z) ^ O
0x02 e equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z&S
0x09 lu less than or unordered S^O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input predicate register to check before operating C0= 00 C1= 01
C2= 10 C3= 11
(46) 1 Fixed value, purpose? 1
(48) 1 From signed value Source_1 1 = Yes 0 = No
(49-50) 2 Rounding mechanism 00 = no rounding 11 = toward zero
(54) 1 Use of constant memory for Source_2 or Source_3? Yes = 1 No = 0
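The effect of the "From signed value" bit (48) can be illustrated with a small Python sketch: the same 32-bit register pattern is read either as a two's-complement signed integer or as an unsigned integer before the conversion. The function name `i2f` is hypothetical, and the sketch uses the host's round-to-nearest conversion to single precision, which is an assumption and does not model the rounding field in bits 49-50.

```python
import struct

# Hypothetical model of I2F for a 32-bit source register.
# Assumption: signed values use two's complement; rounding is round-to-nearest.

def i2f(reg_bits, from_signed):
    """Convert a 32-bit register pattern to a single-precision float value."""
    reg_bits &= 0xFFFFFFFF
    if from_signed and (reg_bits & 0x80000000):
        value = reg_bits - (1 << 32)  # reinterpret as a negative integer
    else:
        value = reg_bits
    # Packing as "f" rounds the value to the nearest single-precision number.
    return struct.unpack("<f", struct.pack("<f", float(value)))[0]
```

With this reading, the pattern 0xFFFFFFFF converts to -1.0 when the signed bit is set, and to 4294967296.0 when it is clear, because 4294967295 is not exactly representable in single precision.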
FSET Instruction:
Checked No, partially implemented, and checking in progress.
This instruction compares two floating-point values and updates the predicate flags of one predicate register according to the
result of the comparison. A predicate condition can be part of the preconditions to execute the instruction. This instruction does not
modify the compared values, but it may write a destination register if one is selected as the logical output.
Mnemonics:
FSET (Affected predicate register and condition) (Source_1), (Source_2), (Input predicate condition)
Source_1, Source_2, and Destiny are general purpose registers or constant memory parameters.
(SASS_assembly_lib):
Formats:
Not implemented yet…
(32 - 33)2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
(34)1 Used for…. 0 1
(35)1 destination type 0 = Register destination 1= Memory destination
(36-37) 2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default) C1 = 01
C2 = 10 C3 = 11
(38) 1 Set predicate register 1 = Enable predicate register set 0 = Disable predicate register set
(39 – 43)5 predicate_condition encoding name Description condition formula
0x00 never always false (not used) 0
0x01 l less than (S & ~Z) ^ O
0x02 e equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z&S
0x09 lu less than or unordered S^O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input predicate register to check before operating C0= 00 C1= 01
C2= 10 C3= 11
(46-50) 5 Predicate condition to perform between the two main Sources. encoding name Description condition formula
0x00 never always false (not used) 0
0x01 L (LT) less than (S & ~Z) ^ O
0x02 E (EQ) equal Z & ~S
0x03 Le less than or equal S ^ (Z | O)
0x04 G (GT) greater than ~Z & ~(S ^ O)
0x05 Lg less or greater than / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z&S
0x09 lu less than or unordered S^O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
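The floating-point condition names in the table above combine the letters l (less), g (greater), e (equal), and u (unordered). The sketch below shows one way to read them directly as relations between Source_1 and Source_2, where "unordered" means that at least one operand is NaN. The function name `fset_compare` is hypothetical; how FlexGripPlus derives the S, Z, and O flags from the comparison is not modelled here.

```python
import math

# Hypothetical reading of the FSET condition names as float relations.
# Assumption: "unordered" means at least one operand is NaN.

def fset_compare(a, b, cond):
    """Return True if the named condition holds between a and b."""
    unordered = math.isnan(a) or math.isnan(b)
    lt = (not unordered) and a < b
    eq = (not unordered) and a == b
    gt = (not unordered) and a > b
    relations = {
        "never": False,
        "l": lt,
        "e": eq,
        "le": lt or eq,
        "g": gt,
        "lg": lt or gt,
        "ge": gt or eq,
        "lge": not unordered,            # ordered
        "u": unordered,
        "lu": lt or unordered,
        "eu": eq or unordered,
        "leu": lt or eq or unordered,    # not greater than
        "gu": gt or unordered,
        "lgu": lt or gt or unordered,    # not equal to
        "geu": gt or eq or unordered,    # not less than
        "always": True,
    }
    return relations[cond]
```

For instance, `fset_compare(1.0, 2.0, "l")` holds, while a comparison against NaN satisfies only the conditions whose name contains "u" (and "always").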
RCP Instruction:
Checked No, partially implemented, and checking in progress.
This instruction computes the reciprocal of a single-precision (32-bit) floating-point value. A predicate condition can be part of the
preconditions to execute the instruction.
Mnemonics:
Source_1, Source_2, and Destiny are general purpose registers or constant memory parameters.
(SASS_assembly_lib):
Formats:
Not implemented yet…
(32 - 33)2 instr_marker 00 = normal reg Access(load or store) (not extra instruction) (by default)
01 = normal reg Access(load or store) (with Join) (extra instruction)
10 = normal reg Access(load or store) (with Exit)
11 = immediate
(34)1 Used for…. 0 1
(35)1 destination type 0 = Register destination 1= Memory destination
(36-37) 2 Predicate register set (enabling a new flag) or Not used C0 = 00 (by default) C1 = 01
C2 = 10 C3 = 11
(38) 1 Set predicate register 1 = Enable predicate register set 0 = Disable predicate register set
(39 – 43)5 predicate_condition encoding name Description condition formula
0x00 never always false (not used) 0
0x01 l less than (S & ~Z) ^ O
0x02 e equal Z & ~S
0x03 le less than or equal S ^ (Z | O)
0x04 g greater than ~Z & ~(S ^ O)
0x05 lg less or greater than / not equal ~Z
0x06 ge greater than or equal ~(S ^ O)
0x07 lge Ordered ~Z | ~S
0x08 u Unordered Z&S
0x09 lu less than or unordered S^O
0x0a eu equal or unordered Z
0x0b leu not greater than Z | (S ^ O)
0x0c gu greater than or unordered ~S ^ (Z | O)
0x0d lgu not equal to ~Z | S
0x0e geu not less than (~S | Z) ^ O
0x0f always always true (by default) 1
0x10 o Overflow O
0x11 c carry / unsigned not below C
0x12 a unsigned above ~Z & C
0x13 s sign / negative S
0x1c ns not sign / positive ~S
0x1d na unsigned not above Z | ~C
0x1e nc not carry / unsigned below ~C
0x1f no no overflow ~O
(44 - 45) 2 Input predicate register to check before operating C0= 00 C1= 01
C2= 10 C3= 11
(60) 1 Not used 0
(61 – 63)3 Sub_op_code 011
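A software reference for the reciprocal can be useful when checking results of this instruction. The sketch below computes 1/x and rounds the result to single precision on the host. The names `to_f32` and `rcp` are hypothetical, the handling of zero follows IEEE 754 (signed infinity) as an assumption, and a hardware unit may return an approximation that differs in the last bits.

```python
import math
import struct

# Hypothetical reference model for RCP in single precision.

def to_f32(x):
    """Round a Python float to the nearest single-precision value."""
    try:
        return struct.unpack("<f", struct.pack("<f", x))[0]
    except OverflowError:
        # The magnitude exceeds the single-precision range.
        return math.copysign(math.inf, x)

def rcp(x):
    """Return 1/x rounded to single precision."""
    x32 = to_f32(x)
    if x32 == 0.0:
        # Assumption: the reciprocal of +/-0 is +/-infinity, as in IEEE 754.
        return math.copysign(math.inf, x32)
    return to_f32(1.0 / x32)
```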
RCP32 Instruction:
Checked No, partially implemented, and checking in progress.
This instruction computes the reciprocal of a single-precision (32-bit) floating-point value. This instruction does not require a predicate
condition to start the execution.
Mnemonics:
Source_1, Source_2, and Destiny are general purpose registers or constant memory parameters.
(SASS_assembly_lib):
Formats:
Not implemented yet…
SIN
COS
RRO
EX2
RSQ
LG2
SIN instruction:
Checked Not implemented
This instruction computes an approximation of the sine (SIN) of an input operand in 32-bit floating-point format.
Destiny_f ← SIN (Source_f)
Mnemonics:
Direct SIN: SIN Rx, Rx
(SASS_assembly_lib):
Not_available
Note:
No comments.
COS instruction:
Checked Not implemented
This instruction computes an approximation of the cosine (COS) of an input operand in 32-bit floating-point format.
Destiny_f ← COS (Source_f)
Mnemonics:
Direct COS: COS Rx, Rx
(SASS_assembly_lib):
Not_available
Note:
No comments.
RRO instruction:
This instruction reduces the range and adjusts the phase of an operand before a transcendental operation in the SFU. The operands are 32-bit
single-precision floating-point values.
Destiny_f ← RRO (Source_f, method_of_reduction)
Mnemonics:
Direct RRO: RRO Rx, Rx, method (SIN, Exp)
(SASS_assembly_lib):
Not_available
Note:
No comments.
LG2 instruction:
Checked Not implemented
Mnemonics:
Direct LG2: LG2 Ry, Rx
(SASS_assembly_lib):
Not_available
Note:
No comments.
EX2 instruction:
Checked Not implemented
Mnemonics:
Direct EX2: EX2 Ry, Rx
(SASS_assembly_lib):
Not_available
Note:
No comments.
RSQ instruction:
Checked Not implemented
This instruction calculates the reciprocal of the square root of an input operand in 32-bit single-precision floating-point format.
Destiny_f ← RSQ (Source_f)
Mnemonics:
Direct RSQ: RSQ Ry, Rx
(SASS_assembly_lib):
Not_available
Note:
No comments.
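A host-side reference for the reciprocal square root, Destiny_f ← 1 / sqrt(Source_f), can be sketched as follows. The names `to_f32` and `rsq` are hypothetical, the treatment of zero and negative inputs follows IEEE 754 conventions as an assumption, and an SFU implementation would normally return an approximation rather than the correctly rounded value.

```python
import math
import struct

# Hypothetical reference model for RSQ in single precision.

def to_f32(x):
    """Round a Python float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def rsq(x):
    """Return 1/sqrt(x) rounded to single precision."""
    x32 = to_f32(x)
    if math.isnan(x32) or x32 < 0.0:
        return math.nan  # assumption: negative inputs give NaN
    if x32 == 0.0:
        # Assumption: +/-0 gives +/-infinity, as in IEEE 754.
        return math.copysign(math.inf, x32)
    return to_f32(1.0 / math.sqrt(x32))
```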