Module 1B - ARM Cortex M0+ Core Architecture
Overview
References
ARMv6-M Architecture Reference Manual (ARM DDI 0419C)
Microcontroller vs. Microprocessor
Both have a CPU core to execute instructions
Microcontroller has peripherals for embedded interfacing and control
Analog
Non-logic level signals
Timing
Clock generators
Communications
◦ point to point
◦ network
Reliability and safety
Architectures and Memory Speed
Load/Store Architecture
Developed to simplify CPU design and improve performance
◦ Memory wall: CPU speeds keep improving faster than memory speeds
◦ Memory accesses slow down CPU, limit compiler optimizations
◦ Change instruction set to make most instructions independent of memory
Data processing instructions can access registers only
1. Load data into the registers
2. Process the data
3. Store results back into memory
More effective when more registers are available
Register/Memory Architecture
Data processing instructions can access memory or registers
Memory wall is not very high at lower CPU speeds (e.g. under 50 MHz)
ARM Processor Core Registers (32 bits each)
R0-R12 - General purpose registers for data processing
Memory Maps For Cortex M0+ and MCU
KL25Z128VLK4 memory map (16 KB SRAM, 128 KB Flash):
0x2000_0000 - 0x2000_2FFF   SRAM_U (upper 3/4 of the 16 KB SRAM)
0x1FFF_F000 - 0x1FFF_FFFF   SRAM_L (lower 1/4 of the 16 KB SRAM)
0x0000_0000 - 0x0001_FFFF   128 KB Flash
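The address ranges above can be checked with a small helper. This is only an illustrative sketch: the names `region_t` and `classify_address` are made up for this example, and it covers just the Flash and SRAM ranges listed on this slide (peripheral and system regions are lumped into "other").

```c
#include <stdint.h>

/* Hypothetical region names for the KL25Z128VLK4 ranges listed above. */
typedef enum {
    REGION_FLASH,   /* 0x0000_0000 - 0x0001_FFFF */
    REGION_SRAM_L,  /* 0x1FFF_F000 - 0x1FFF_FFFF */
    REGION_SRAM_U,  /* 0x2000_0000 - 0x2000_2FFF */
    REGION_OTHER    /* anything not listed on the slide */
} region_t;

/* Return the memory region that contains addr. */
region_t classify_address(uint32_t addr)
{
    if (addr <= 0x0001FFFFu) {
        return REGION_FLASH;
    }
    if (addr >= 0x1FFFF000u && addr <= 0x1FFFFFFFu) {
        return REGION_SRAM_L;
    }
    if (addr >= 0x20000000u && addr <= 0x20002FFFu) {
        return REGION_SRAM_U;
    }
    return REGION_OTHER;
}
```

Note that SRAM_L and SRAM_U are contiguous: the SRAM occupies one 16 KB block that straddles the 0x2000_0000 boundary.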
Endianness
For a multi-byte value, in what order are the bytes stored?
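To make the question concrete, the sketch below stores the same 32-bit value in both byte orders. The function names are hypothetical and chosen for this example only; `host_is_little_endian` reports which order the machine running the code happens to use.

```c
#include <stdint.h>

/* Little-endian: least significant byte at the lowest address. */
void store_le32(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)(value & 0xFFu);
    out[1] = (uint8_t)((value >> 8) & 0xFFu);
    out[2] = (uint8_t)((value >> 16) & 0xFFu);
    out[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Big-endian: most significant byte at the lowest address. */
void store_be32(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)((value >> 24) & 0xFFu);
    out[1] = (uint8_t)((value >> 16) & 0xFFu);
    out[2] = (uint8_t)((value >> 8) & 0xFFu);
    out[3] = (uint8_t)(value & 0xFFu);
}

/* Returns 1 if the host stores the least significant byte first. */
int host_is_little_endian(void)
{
    uint32_t probe = 1u;
    return *(const unsigned char *)&probe == 1u;
}
```

For the value 0x11223344, the little-endian layout is 44 33 22 11 and the big-endian layout is 11 22 33 44, reading from the lowest address upward.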
ARM, Thumb and Thumb-2 Instructions
ARM instructions optimized for resource-rich high-performance computing systems
Deeply pipelined processor, high clock rate, wide (e.g. 32-bit) memory bus
Low-end embedded computing systems are different
Slower clock rates, shallow pipelines
Different cost factors – e.g. code size matters much more, bit and byte operations critical
Modifications to ARM ISA to fit low-end embedded computing
1995: Thumb instruction set
◦ 16-bit instructions
◦ Reduces memory requirements (and performance slightly)
2003: Thumb-2 instruction set
◦ Adds some 32 bit instructions
◦ Improves speed with little memory overhead
CPU decodes instructions according to its current state (Thumb or ARM), which is selected by the T bit
Instruction Set
Cortex-M0+ core implements ARMv6-M Thumb instructions
Only uses Thumb instructions, always in Thumb state
Most instructions are 16 bits long, some are 32 bits
Most 16-bit instructions can only access low registers (R0-R7), but some can access high registers (R8-R15)
Conditional execution only supported for 16-bit branch
32 bit address space
Half-word aligned instructions
See ARMv6-M Architecture Reference Manual for specifics per instruction (Section A6.7)
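Because instructions are halfword aligned and come in two lengths, a decoder has to decide from the first halfword whether a second one follows. In the Thumb encoding, a first halfword whose top five bits are 0b11101, 0b11110 or 0b11111 starts a 32-bit instruction; anything else is a complete 16-bit instruction. A minimal sketch of that check, with a function name invented for this example:

```c
#include <stdint.h>

/* Return the instruction length in bytes (2 or 4), given the first
 * halfword of a Thumb instruction. */
int thumb_instruction_size(uint16_t first_halfword)
{
    unsigned top5 = (unsigned)first_halfword >> 11;  /* bits [15:11] */

    if (top5 == 0x1Du || top5 == 0x1Eu || top5 == 0x1Fu) {
        return 4;  /* first half of a 32-bit instruction */
    }
    return 2;      /* complete 16-bit instruction */
}
```

For example, 0x2001 (MOVS r0, #1) is a 16-bit instruction, while 0xF000 is the first halfword of a 32-bit BL.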
Assembly Instructions
Instruction Format: Labels
Instruction Format: Mnemonic
label mnemonic operand1, operand2, operand3 ; comments
Instruction Format: Operands
label mnemonic operand1, operand2, operand3 ; comments
Operands
Registers
Constants (called immediate values)
Number of operands varies
No operands: DSB
One operand: BX LR
Two operands: CMP R1, R2
Three operands: ADD R1, R2, R3
Four operands: MLA R1, R2, R3, R4 (general ARM example; MLA is not part of ARMv6-M)
Normally:
operand1 is the destination register; operand2 and operand3 are source operands.
operand2 is usually a register and is the first source operand.
operand3 may be a register, an immediate value, a register shifted by a constant number of bits, or a register plus an offset (used for memory access).
Instruction Format: Comments
ARM Instruction Format
label mnemonic operand1, operand2, operand3 ; comments
Update Condition Codes in APSR?
https://developer.arm.com/documentation/ddi0595/2021-12/AArch64-Registers/NZCV--Condition-Flags
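Instructions with the S suffix (ADDS, SUBS, ...) update the N, Z, C and V condition flags. As a rough illustration of what each flag means for a 32-bit addition, the sketch below computes them in C. The type `flags_t` and the function `adds_flags` are names made up for this example; this is a model of the flag definitions, not of the hardware.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical container for the four condition flags. */
typedef struct {
    bool n;  /* Negative: bit 31 of the result is set */
    bool z;  /* Zero: the result is 0 */
    bool c;  /* Carry: unsigned overflow out of bit 31 */
    bool v;  /* oVerflow: signed overflow */
} flags_t;

/* Model the flags that an ADDS-style 32-bit addition would produce.
 * The sum is written to *result if result is not NULL. */
flags_t adds_flags(uint32_t a, uint32_t b, uint32_t *result)
{
    uint32_t r = a + b;  /* wraps modulo 2^32 */
    flags_t f;

    f.n = (r >> 31) != 0u;
    f.z = (r == 0u);
    f.c = (r < a);       /* wrapped around, so there was a carry out */
    /* Signed overflow: operands share a sign and the result's sign differs. */
    f.v = ((~(a ^ b) & (a ^ r)) >> 31) != 0u;

    if (result != 0) {
        *result = r;
    }
    return f;
}
```

Adding 0xFFFFFFFF and 1 gives a zero result with Z and C set, while adding 0x7FFFFFFF and 1 gives 0x80000000 with N and V set.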
Instruction Set Summary
Instruction Type          Instructions
Move                      MOV
Load/Store                LDR, LDRB, LDRH, LDRSH, LDRSB, LDM, STR, STRB, STRH, STM
Add, Subtract, Multiply   ADD, ADDS, ADCS, ADR, SUB, SUBS, SBCS, RSBS, MULS
Compare                   CMP, CMN
Logical                   ANDS, EORS, ORRS, BICS, MVNS, TST
Shift and Rotate          LSLS, LSRS, ASRS, RORS
Stack                     PUSH, POP
Branch                    B, BL, B{cond}, BX, BLX
Extend                    SXTH, SXTB, UXTH, UXTB
Reverse                   REV, REV16, REVSH
Processor State           SVC, CPSID, CPSIE, BKPT
No Operation              NOP
Hint                      SEV, WFE, WFI, YIELD
Load/Store Register
ARM is a load/store architecture: data must be loaded into registers before it can be processed, because data processing instructions cannot operate on memory directly
Load-Modify-Store
C statement
x = x + 1;
Assume variable x resides in memory and is a 32-bit integer
3 Steps: Load, Modify, Store
[Figure: for x = x + 1; step 1 loads x from memory into a register, step 2 modifies it in the ALU, step 3 stores the result back to memory]
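The three steps can be spelled out in C by using a local variable to stand in for a CPU register. This is only a sketch of the idea; the function name is hypothetical, and `volatile` is used here so that each memory access in the source corresponds to one access in the generated code.

```c
#include <stdint.h>

/* x = x + 1, written as explicit load, modify and store steps. */
void increment_in_memory(volatile int32_t *x)
{
    int32_t reg = *x;  /* 1. Load: copy the value from memory into a "register" */
    reg = reg + 1;     /* 2. Modify: the ALU operates on the register */
    *x = reg;          /* 3. Store: write the result back to memory */
}
```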
Example 1: Adding Two Integers
int x = 1;
int y = 2;
int z;
Example 2: Set a Bit in C
a |= (1 << k)
or
a = a | (1 << k)
Example: k = 5
a a7 a6 a5 a4 a3 a2 a1 a0
1 << k 0 0 1 0 0 0 0 0
a | (1 << k) a7 a6 1 a4 a3 a2 a1 a0
The other bits should not be affected.
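The same operation as a small C function, shown as a sketch with an 8-bit value to match the a7..a0 picture above. The name `set_bit` is chosen for this example, and k is assumed to be in the range 0 to 7.

```c
#include <stdint.h>

/* Return a with bit k set; all other bits are left unchanged.
 * Assumes 0 <= k <= 7. */
uint8_t set_bit(uint8_t a, unsigned k)
{
    return (uint8_t)(a | (1u << k));
}
```

With k = 5 the mask is 0x20, so 0x81 becomes 0xA1, and a value that already has bit 5 set is returned unchanged.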
Set a Bit in Assembly
a |= (1 << 5)
Solution (assuming a is held in r0):
MOVS r4, #1 ; r4 = 1
LSLS r4, r4, #5 ; r4 = 1<<5
ORRS r0, r0, r4 ; r0 = r0 | 1<<5
Example 3: 64 Bit Addition
start
    ; C = A + B
    ; Two 64-bit integers A (r1, r0) and B (r3, r2).
    ; Result C (r5, r4)
    ; A = 00000002FFFFFFFF
    ; B = 0000000400000001
    LDR r0, =0xFFFFFFFF ; A's lower 32 bits
    LDR r1, =0x00000002 ; A's upper 32 bits
    LDR r2, =0x00000001 ; B's lower 32 bits
    LDR r3, =0x00000004 ; B's upper 32 bits
    ; Add A and B
    ADDS r4, r2, r0 ; C[31..0] = A[31..0] + B[31..0], update Carry
    MOV r5, r3      ; r5 = B[63..32]; MOV leaves the flags unchanged
    ADCS r5, r5, r1 ; C[63..32] = A[63..32] + B[63..32] + Carry (ARMv6-M only has the two-register ADCS form)
stop B stop
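The carry propagation in the example above can be mirrored in C using only 32-bit arithmetic. This is a sketch for checking the expected result, not a replacement for the assembly; the function name and parameter names are invented for this example.

```c
#include <stdint.h>

/* Add two 64-bit values held as (hi, lo) pairs of 32-bit words. */
void add64(uint32_t a_lo, uint32_t a_hi,
           uint32_t b_lo, uint32_t b_hi,
           uint32_t *c_lo, uint32_t *c_hi)
{
    uint32_t lo = a_lo + b_lo;               /* like ADDS: low words, wraps mod 2^32 */
    uint32_t carry = (lo < a_lo) ? 1u : 0u;  /* carry out of the low-word add */

    *c_lo = lo;
    *c_hi = a_hi + b_hi + carry;             /* like ADC/ADCS: high words plus carry */
}
```

For A = 0x00000002FFFFFFFF and B = 0x0000000400000001, the low words sum to 0 with a carry, and the high words give 2 + 4 + 1 = 7, so C = 0x0000000700000000.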