# ARM Instruction Encoding

The following are the encodings of the ARM instruction set as bit patterns.

## Condition field

All instructions contain a 4-bit *condition field* in bits 31 to 28:

![[ARMConditionField.svg]]

Condition codes also have corresponding [[Arithmetic Logic Unit#Status flags|status flag states]].

>[!Condition codes]+

| Opcode [31:28] | Symbol | Meaning                           | Flag state                |
| -------------- | ------ | --------------------------------- | ------------------------- |
| `0000`         | EQ     | Equal                             | Z set                     |
| `0001`         | NE     | Not equal                         | Z clear                   |
| `0010`         | CS/HS  | Carry set/unsigned higher or same | C set                     |
| `0011`         | CC/LO  | Carry clear/unsigned lower        | C clear                   |
| `0100`         | MI     | Minus/negative                    | N set                     |
| `0101`         | PL     | Plus/positive or zero             | N clear                   |
| `0110`         | VS     | Overflow                          | V set                     |
| `0111`         | VC     | No overflow                       | V clear                   |
| `1000`         | HI     | Unsigned higher                   | C set and Z clear         |
| `1001`         | LS     | Unsigned lower or same            | C clear or Z set          |
| `1010`         | GE     | Signed greater than or equal      | N equal to V              |
| `1011`         | LT     | Signed less than                  | N not equal to V          |
| `1100`         | GT     | Signed greater than               | Z clear and N equal to V  |
| `1101`         | LE     | Signed less than or equal         | Z set or N not equal to V |
| `1110`         | AL     | Always (unconditional)            | -                         |

^e9df96

^9a9d76

***

## Branch and branch and link

![[ARMBranchBranchLinkEncoding.svg]]

- `cond` - the [[ARM Instruction Encoding#Condition field|condition field]].
- `L` - sets the branch to also link.
    - `1` means the branch links.
- `signed_immed_24` - used to calculate the *target address*.

>[!Calculating the target address]+
> 1. [[Signed Binary Number#Sign extension|Sign extending]] the 24-bit [[Signed Binary Number#Signed 2's complement|signed 2's complement]] immediate to 30 bits.
> 2. Shifting the result to the left by two bits, equivalent to multiplication by four, to form a 32-bit value.[^1]
> 3. Adding the 32-bit value to the contents of the program counter.
^6b6334

***

## Data processing instructions

![[ARMDataProcessingEncoding.svg]]

- `cond` - the [[ARM Instruction Encoding#Condition field|condition field]].
- `I` - distinguishes between *immediate* and *register* forms of `operand 2`.
    - `1` means operand 2 is an immediate.
- `opcode` - the operation code of the instruction.
- `S` - sets the instruction to update condition codes.
    - `1` means condition codes are updated.
- `Rn` - the first *source* operand register.
- `Rd` - the *destination* register.
- `operand 2` - either an immediate, register, shifted register, or rotated register. Refer to [[ARM Instruction Encoding#Data processing operands|data processing operands]].

>[!Operation codes]+

| Opcode [24:21] | Symbol | Operation                              | Action                                                                           |
| -------------- | ------ | -------------------------------------- | -------------------------------------------------------------------------------- |
| `0000`         | AND    | [[Conjunction\|Logical AND]]           | Rd $:=$ Rn AND operand 2                                                         |
| `0001`         | EOR    | [[Exclusive Disjunction\|Logical XOR]] | Rd $:=$ Rn EOR operand 2                                                         |
| `0010`         | SUB    | Subtract                               | Rd $:=$ Rn $-$ operand 2                                                         |
| `0011`         | RSB    | Reverse subtract                       | Rd $:=$ operand 2 $-$ Rn                                                         |
| `0100`         | ADD    | Add                                    | Rd $:=$ Rn $+$ operand 2                                                         |
| `0101`         | ADC    | Add with carry                         | Rd $:=$ Rn $+$ operand 2 $+$ [[Arithmetic Logic Unit#Status flags\|carry flag]] |
| `0110`         | SBC    | Subtract with carry                    | Rd $:=$ Rn $-$ operand 2 $-$ $\neg$(carry flag)                                  |
| `0111`         | RSC    | Reverse subtract with carry            | Rd $:=$ operand 2 $-$ Rn $-$ $\neg$(carry flag)                                  |
| `1000`         | TST    | Test                                   | Update flags after Rn AND operand 2                                              |
| `1001`         | TEQ    | Test equivalence                       | Update flags after Rn EOR operand 2                                              |
| `1010`         | CMP    | Compare                                | Update flags after Rn $-$ operand 2                                              |
| `1011`         | CMN    | Compare negated                        | Update flags after Rn $+$ operand 2                                              |
| `1100`         | ORR    | [[Disjunction\|Logical inclusive OR]]  | Rd $:=$ Rn OR operand 2                                                          |
| `1101`         | MOV    | Move                                   | Rd $:=$ operand 2                                                                |
| `1110`         | BIC    | Bit clear                              | Rd $:=$ Rn AND $\neg$(operand 2)                                                 |
| `1111`         | MVN    | Move not                               | Rd $:=$ $\neg$(operand 2)                                                        |

^97daa4

### Data processing operands

There are several possible formats used to form `operand 2`, based on the operand type and the type of shift used.

#### Immediate

![[ARMDataProcessingImmediateEncoding.svg|500]]

The value of `operand 2` is encoded as an 8-bit number `immed_8` rotated right by `2 x rotate_imm` bits. Thus, an immediate is only permitted if it can be produced by rotating an *8-bit number* right by an *even number of bits between 0 and 30*.

>[!Permitted immediate operands example]+
>$\text{FE}0_{16}=111111100000_{2}=11111110\;0000_{2}$ is a permitted operand since it is the 8-bit value $11111110_{2}$ rotated left by 4 bits, which is the same as rotating it right by $(32 - 4) = 28$ bits.
>
>Thus, $\text{FE}0_{16}$ can be encoded as `immed_8`$\;=11111110_{2}$ and `rotate_imm`$\;=14_{10}$.
>
>$\text{FE}2_{16}=111111100010_{2}=11111110\;0010_{2}$ is not a permitted operand: its set bits span 11 bit positions, so no 8-bit value can be rotated to produce it.

#### Shifted register by an immediate

![[ARMDataProcessingImmediateShiftEncoding.svg|500]]

- `shift_imm` - the number of bits to shift by. The range depends on the shift type.
    - `LSL` - a value between *0 and 31*.
    - `LSR` - a value between *1 and 32*; 32 is encoded as 0.
    - `ASR` - a value between *1 and 32*; 32 is encoded as 0.
    - `ROR` - a value between *1 and 31*; if `shift_imm = 0`, then a right rotation with extend `RRX` is performed.[^2]
- `Rm` - the register that is to be shifted.
- `shift` - the shift type, encoded as a 2-bit number.
Refer to the table below:

| Shift[6:5] | Symbol | Shift type                                       |
| ---------- | ------ | ------------------------------------------------ |
| `00`       | LSL    | [[Bit Shift#Logical shift\|Logical]] left        |
| `01`       | LSR    | Logical right                                    |
| `10`       | ASR    | [[Bit Shift#Arithmetic shift\|Arithmetic]] right |
| `11`       | ROR    | [[Bit Shift#Rotate\|Rotate]] right               |

^4a860f

#### Shifted register by a register

![[ARMDataProcessingRegisterShiftEncoding.svg|500]]

- `Rs` - the register whose *least significant byte* holds the number of bits to shift by.
- `Rm` - the register that is to be shifted.
- `shift` - the shift type, encoded as a 2-bit number. Refer to the table below:

| Shift[6:5] | Symbol | Shift type       |
| ---------- | ------ | ---------------- |
| `00`       | LSL    | Logical left     |
| `01`       | LSR    | Logical right    |
| `10`       | ASR    | Arithmetic right |
| `11`       | ROR    | Rotate right     |

***

## Load and store word or unsigned byte instructions

The load and store instructions use an addressing mode formed from two parts, the *base register* and the *offset*.

![[ARMLoadStoreEncoding.svg]]

- `cond` - the [[ARM Instruction Encoding#Condition field|condition field]].
- `I` - distinguishes between *immediate* and *register* forms of the offset.
    - `1` means the offset is a register.
- `P` - sets *pre-* or *post-indexing*.
    - `1` means either *pre-indexed* or *offset* addressing, depending on `W`.
- `U` - indicates whether the offset is *added* to the base or *subtracted* from it.
    - `1` means the offset is added to the base.
- `B` - distinguishes between *word* and *unsigned byte* access.
    - `1` means an unsigned byte is transferred.
- `W` - sets whether the base register is updated.
    - `1` (with `P = 1`) means the calculated memory address is written back to the base register, giving *pre-indexed addressing*.
- `L` - distinguishes between *load* and *store*.
    - `1` means load.
- `Rn` - the *base* register.
- `Rd` - the *source* or *destination* register.
- `offset` - either an immediate, register, shifted register, or rotated register.
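
As a rough illustration of the fields listed above, the following Python sketch unpacks a 32-bit load/store instruction word and classifies its addressing mode from `P` and `W`. The function name `decode_load_store` is hypothetical, and the bit positions are an assumption taken from the usual ARM layout (bits [27:26] equal to `01`, `I` in bit 25, `P` to `L` in bits 24 to 20), since the encoding diagram here is an image.

```python
def decode_load_store(instr: int) -> dict:
    """Split a 32-bit load/store word or unsigned byte instruction into fields.

    Assumed layout: cond[31:28] 01[27:26] I[25] P[24] U[23] B[22] W[21] L[20]
    Rn[19:16] Rd[15:12] offset[11:0].
    """
    if (instr >> 26) & 0b11 != 0b01:
        raise ValueError("not a load/store word or unsigned byte instruction")

    p = (instr >> 24) & 1
    w = (instr >> 21) & 1

    # P = 0 selects post-indexing; with P = 1, W picks pre-indexed or offset.
    if p == 0:
        mode = "post-indexed"
    elif w == 1:
        mode = "pre-indexed"
    else:
        mode = "offset"

    return {
        "cond": (instr >> 28) & 0xF,
        "I": (instr >> 25) & 1,
        "P": p,
        "U": (instr >> 23) & 1,
        "B": (instr >> 22) & 1,
        "W": w,
        "L": (instr >> 20) & 1,
        "Rn": (instr >> 16) & 0xF,
        "Rd": (instr >> 12) & 0xF,
        "offset": instr & 0xFFF,  # raw bits [11:0]; meaning depends on I
        "mode": mode,
    }
```

For example, `decode_load_store(0xE5910004)`, the word for `LDR r0, [r1, #4]`, reports `L = 1`, `Rn = 1`, `Rd = 0`, an offset of 4, and offset addressing.
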
### Offset and indexed addressing

#### Offset addressing

In offset addressing, the *target address* is formed by *adding* the offset to, or *subtracting* it from, the base register. The load/store operation then takes place at the target address. The base register is *not updated*.

#### Indexed addressing

##### Pre-indexed addressing

Pre-indexed addressing is the same as offset addressing, except that the target address is *written back* to the base register after the load/store operation.

##### Post-indexed addressing

In post-indexed addressing, the load/store operation takes place at the address held in the *base register*. Only *after* the operation is done is the target address calculated using the offset. The base register is *always updated* with the target address.

### Offsets

#### Immediate offset

![[ARMLoadStoreImmediateEncoding.svg|500]]

- `offset_12` - an unsigned 12-bit number.

#### Register, shifted register, and rotated register offset

![[ARMLoadStoreRegisterEncoding.svg|500]]

- `shift_imm` - the number of bits to shift by. The range depends on the shift type.
    - `LSL` - a value between *0 and 31*.
    - `LSR` - a value between *1 and 32*; 32 is encoded as 0.
    - `ASR` - a value between *1 and 32*; 32 is encoded as 0.
    - `ROR` - a value between *1 and 31*; if `shift_imm = 0`, then a right rotation with extend `RRX` is performed.[^2]
- `shift` - the shift type, encoded as a 2-bit number. Refer to the table below.
- `Rm` - the register to be shifted.

For an unaltered register as the offset, bits [11:4] are all left as 0s, which is equivalent to encoding `LSL #0`.

| Shift[6:5] | Symbol | Shift type                                       |
| ---------- | ------ | ------------------------------------------------ |
| `00`       | LSL    | [[Bit Shift#Logical shift\|Logical]] left        |
| `01`       | LSR    | Logical right                                    |
| `10`       | ASR    | [[Bit Shift#Arithmetic shift\|Arithmetic]] right |
| `11`       | ROR    | [[Bit Shift#Rotate\|Rotate]] right               |

[^1]: ARM instructions are *32 bits, or 4 bytes,* wide.
[^2]: A *right* [[Bit Shift#Rotate through carry|rotation with extend]] (RRX) is encoded as a right rotation (`ROR`) with `shift_imm = 0`. It performs a 33-bit right rotation by one bit, using the *carry flag* as the 33rd bit.
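
The `shift_imm` special cases listed for shifted registers (32 encoded as 0 for `LSR` and `ASR`, and `ROR` with 0 meaning RRX) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the name `shift_by_immediate` is hypothetical, the carry flag is passed in as a plain argument, and the carry-out that a real shifter also produces is not modelled.

```python
MASK32 = 0xFFFFFFFF


def shift_by_immediate(rm: int, shift: int, shift_imm: int, carry: int = 0) -> int:
    """Apply the shift selected by `shift` (bits [6:5]) and `shift_imm` to a 32-bit value."""
    if not 0 <= shift_imm <= 31:
        raise ValueError("shift_imm is a 5-bit field")
    rm &= MASK32

    if shift == 0b00:  # LSL: 0 to 31
        return (rm << shift_imm) & MASK32
    if shift == 0b01:  # LSR: 1 to 32, with 32 encoded as 0
        return rm >> (shift_imm or 32)
    if shift == 0b10:  # ASR: 1 to 32, with 32 encoded as 0
        signed = rm - (1 << 32) if rm & 0x80000000 else rm
        return (signed >> (shift_imm or 32)) & MASK32
    if shift == 0b11:
        if shift_imm == 0:  # RRX: carry flag enters bit 31, value moves right by one
            return ((carry & 1) << 31) | (rm >> 1)
        return ((rm >> shift_imm) | (rm << (32 - shift_imm))) & MASK32  # ROR: 1 to 31
    raise ValueError("shift must be a 2-bit value")
```

With `shift = 0b11` and `shift_imm = 0`, a value of 1 and a set carry flag give `0x80000000`: the carry moves into bit 31 and the old bit 0 drops out.
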