Section 4. Architecture
HIGHLIGHTS
This section of the manual contains the following major topics:
4.1 Introduction
4.2 Clocking Scheme/Instruction Cycle
4.3 Instruction Flow/Pipelining
4.4 I/O Descriptions
4.5 Design Tips
4.6 Related Application Notes
4.7 Revision History
2000 Microchip Technology Inc.
DS39504A-page 4-1
PIC18C Reference Manual
4.1 Introduction
The high performance of the PIC18CXXX devices can be attributed to a number of architectural
features commonly found in RISC microprocessors. These include:
• Harvard architecture
• Long Word Instructions
• Single Word Instructions
• Single Cycle Instructions
• Instruction Pipelining
• Reduced Instruction Set
• Register File Architecture
• Orthogonal (Symmetric) Instructions
Figure 4-2 shows a general block diagram for PIC18CXXX devices.
Harvard Architecture:
Harvard architecture has the program memory and data memory as separate memories which
are accessed from separate buses. This improves bandwidth over the traditional von Neumann
architecture, in which program and data are fetched from the same memory using the same bus.
To execute an instruction, a von Neumann machine must make one or more (generally more)
accesses across the 8-bit bus to fetch the instruction. Then data may need to be fetched,
operated on and possibly written. As can be seen from this description, the bus can become
extremely congested. With a Harvard architecture, the instruction is fetched in a single instruction
cycle (all 16 bits). While the program memory is being accessed, the data memory is on an
independent bus and can be read and written. These separate buses allow one instruction to
execute while the next instruction is fetched. A comparison of Harvard and von Neumann
architectures is shown in Figure 4-1.
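Figure 4-1: Harvard vs. von Neumann Block Architectures
[Block diagram. Harvard: the CPU connects to Data Memory over an 8-bit bus and to Program
Memory over a 16-bit bus. von Neumann: the CPU connects to a single Program and Data Memory
over an 8-bit bus.]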
Long Word Instructions:
Long word instructions use an instruction bus that is wider (more bits) than the 8-bit data memory
bus. This is possible because the two buses are separate. It allows instructions to be sized
differently than the 8-bit wide data word and allows a more efficient use of the program memory,
since the program memory width is optimized to the architectural requirements.
Single Word Instructions:
Single word instruction opcodes are 16 bits wide, making it possible for all but a few instructions
to be single word instructions. A 16-bit wide program memory access bus fetches a 16-bit
instruction in a single cycle. With single word instructions, the number of words of program
memory equals the number of instructions the device can hold. This means that all locations are
valid instructions.
Typically in the von Neumann architecture, most instructions are multi-byte. In general, a device
with 4 Kbytes of program memory would allow approximately 2K instructions. This 2:1 ratio is
a generalization and depends on the application code. Since each instruction may take multiple
bytes, there is no assurance that each location is a valid instruction.
Double Word Instructions:
Some operations require more information than can be stored in the 16 bits of a program memory
location. These operations require a double word instruction, and are therefore 32 bits wide.
Instructions that require this second instruction word are:
• Memory to memory move instruction (12 bits for each RAM address)
  - MOVFF SourceReg, DestReg
• Literal value to FSR move instruction (12 bits for data and 2 bits for FSR to load)
  - LFSR FSR#, Address
• Call and goto operations (20 bits for address)
  - CALL Address
  - GOTO Address
The first word indicates to the CPU that the next program memory location holds the additional
information for this instruction and is not itself an instruction. If the CPU tries to execute the
second word of an instruction (due to a software modified PC pointing to that location as an
instruction), the fetched data is executed as a NOP.
Execution of a double word instruction is not split between its two TCY cycles by an interrupt
request. That is, when an interrupt request occurs during the execution of a double word
instruction, the execution of the instruction is completed before the processor vectors to the
interrupt address. The interrupt latency is preserved.
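As a rough illustration of how a double word instruction occupies two program memory locations, the sketch below packs a MOVFF into two 16-bit words. The bit patterns used (a 1100 prefix on the first word, a 1111 prefix on the second word, and any 1111-prefixed word executing as a NOP) are assumptions taken from the PIC18 instruction set encoding rather than from this section, and the function names are hypothetical.

```python
# Assumed encodings (not stated in this section):
#   first word : 1100 ssss ssss ssss  (12-bit source RAM address)
#   second word: 1111 dddd dddd dddd  (12-bit destination RAM address)
MOVFF_FIRST_WORD = 0xC000
SECOND_WORD_PREFIX = 0xF000


def encode_movff(source_reg, dest_reg):
    """Pack a MOVFF into (first_word, second_word); hypothetical helper."""
    if not (0 <= source_reg <= 0xFFF and 0 <= dest_reg <= 0xFFF):
        raise ValueError("RAM addresses must fit in 12 bits")
    return (MOVFF_FIRST_WORD | source_reg, SECOND_WORD_PREFIX | dest_reg)


def executes_as_nop(word):
    """True if a 16-bit word would execute as a NOP under the assumed encoding."""
    return word == 0x0000 or (word & 0xF000) == 0xF000


first_word, second_word = encode_movff(0x123, 0x456)
print(hex(first_word), hex(second_word))
print(executes_as_nop(second_word))
```

If a software modified PC lands on the second word, this model treats it as a NOP, which matches the behavior described above.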
Instruction Pipeline:
The instruction pipeline is a two-stage pipeline that overlaps the fetch and execution of
instructions. The fetch of the instruction takes one TCY, while the execution takes another TCY.
However, due to the overlap of the fetch of the current instruction and the execution of the
previous instruction, an instruction is fetched and another instruction is executed every TCY.
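The overlap can be sketched as a small schedule generator. This is a minimal model of a two-stage fetch/execute pipeline with no PC-modifying instructions; the instruction labels I1, I2 and I3 and the function name are hypothetical.

```python
def pipeline_schedule(instructions):
    """Return (cycle, fetched, executed) tuples for a two-stage pipeline.

    In each TCY the next instruction is fetched while the one fetched in
    the previous TCY is executed. None marks an idle stage.
    """
    count = len(instructions)
    schedule = []
    for index in range(count + 1):
        fetched = instructions[index] if index < count else None
        executed = instructions[index - 1] if index >= 1 else None
        schedule.append((index + 1, fetched, executed))
    return schedule


for cycle, fetched, executed in pipeline_schedule(["I1", "I2", "I3"]):
    print(f"TCY{cycle}: fetch {fetched}, execute {executed}")
```

In this model, three instructions complete in four TCY: one cycle fills the pipeline, after which one instruction finishes every cycle.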
Single Cycle Instructions:
With the program memory bus being 16 bits wide, the entire instruction is fetched in a single
machine cycle (TCY), except for double word instructions. The instruction contains all the
information required and is executed in a single cycle. There may be a one cycle delay in execution
if the result of the instruction modifies the contents of the program counter. This requires the
pipeline to be flushed and a new instruction to be fetched.
Two Cycle Instructions:
Double word instructions require two cycles to execute, since the required information occupies
32 bits and the program memory bus fetches 16 bits per cycle.
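The cycle counts described under Single Cycle Instructions and Two Cycle Instructions can be summarized in a simplified model: one TCY per instruction, or two TCY if the instruction is a double word instruction or modifies the program counter. The function name and the sample program are hypothetical, and the model ignores any device-specific cases not described in this section.

```python
def execution_cycles(double_word=False, modifies_pc=False):
    """Instruction cycles (TCY) used by one instruction in this simplified model."""
    return 2 if (double_word or modifies_pc) else 1


# (description, double_word, modifies_pc) for a hypothetical instruction sequence
program = [
    ("single word, PC unchanged", False, False),
    ("double word move", True, False),
    ("single word, PC modified", False, True),
    ("single word, PC unchanged", False, False),
]

total_cycles = sum(execution_cycles(dw, pc) for _, dw, pc in program)
print(total_cycles)
```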
Reduced Instruction Set:
When an instruction set is well designed and highly orthogonal (symmetric), fewer instructions
are required to perform all needed tasks. With fewer instructions, the whole set can be learned
more rapidly.
Register File Architecture:
The register files/data memory can be directly or indirectly addressed. All special function
registers, including the program counter, are mapped in the data memory.
Orthogonal (Symmetric) Instructions:
Orthogonal instructions make it possible to carry out any operation on any register using any
addressing mode. This symmetrical nature and lack of “special instructions” make programming
simple yet efficient. In addition, the learning curve is reduced significantly. The Enhanced MCU
instruction set uses only three non-register oriented instructions, which are used for features of
the core. One is the SLEEP instruction, which places the device into the lowest power use mode.
The second is the CLRWDT instruction, which verifies the chip is operating properly by preventing
the on-chip Watchdog Timer (WDT) from overflowing and resetting the device. The third is the
RESET instruction, which resets the device.
Figure 4-2: General Enhanced MCU Block Diagram
[Block diagram. Labeled blocks include: Program Memory (up to 2M Bytes) with Address Latch and
Data Latch, Table Pointer<21>, TABLELATCH, ROMLATCH, 31 Level Stack, Program Counter (PCU, PCH,
PCL) with PCLATU and PCLATH, Instruction Register, Instruction Decode & Control, Data RAM (up to
4K address reach) with Address Latch and Data Latch, Address<12>, BSR, FSR0, FSR1 and FSR2 with
inc/dec logic, 8 x 8 Multiply with PRODH and PRODL, W, BITOP, ALU<8>, Data Bus<8>, Timing
Generation (OSC1/CLKIN, OSC2/CLKOUT, T1OSI, T1OSO, 4X PLL), Power-up Timer, Oscillator Start-up
Timer, Power-on Reset, Watchdog Timer, Brown-out Reset, Precision Bandgap Reference, MCLR, VDD,
VSS, and PORTA through PORTE plus a generic PORTx.
Peripheral Modules (Note 1): Timer0, Timer1, Timer2, Timer3, A/D Converter, CCPs, Enhanced CCPs,
Master Synchronous Serial Port, Addressable USART, CAN, USB, Other Peripherals.]
Note 1: Many of the general purpose I/O pins are multiplexed with one or more peripheral module
functions. The multiplexing combinations are device dependent.
4.2 Clocking Scheme/Instruction Cycle
The clock input is internally divided by four to generate four non-overlapping quadrature clocks,
namely Q1, Q2, Q3 and Q4. Internally, the program counter is incremented every Q1, and the
instruction is fetched from the program memory and latched into the instruction register in Q4.
The instruction is decoded and executed during the following Q1 through Q4. The clocks and
instruction execution flow are illustrated in Figure 4-3 and Example 4-1.
Figure 4-3: Clock/Instruction Cycle
[Timing diagram. Signals shown: Device Clock (OSC1 or T1OSCI), internal phase clocks Q1 through
Q4, PC, and CLKOUT (RC mode). Each instruction cycle (TCY1, TCY2, TCY3) spans Q1 through Q4.]
TCY1: PC value PC; Fetch INST (PC); Execute INST (PC-2)
TCY2: PC value PC+2; Fetch INST (PC+2); Execute INST (PC)
TCY3: PC value PC+4; Fetch INST (PC+4); Execute INST (PC+2)
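The divide-by-four relationship can be worked through numerically. The sketch below assumes a hypothetical 10 MHz clock input; the function names are illustrative and not part of any Microchip tool.

```python
PHASES = ("Q1", "Q2", "Q3", "Q4")


def instruction_cycle_ns(clock_hz):
    """Length of one TCY in nanoseconds: four input clock periods (Q1 through Q4)."""
    return 4 * 1_000_000_000 / clock_hz


def phase_of(clock_period_index):
    """Phase name for the n-th input clock period, counting from 0 at a Q1."""
    return PHASES[clock_period_index % 4]


print(instruction_cycle_ns(10_000_000))  # hypothetical 10 MHz clock input
print([phase_of(n) for n in range(6)])
```

Under this assumption, each phase lasts one input clock period and one instruction cycle lasts four of them.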
4.2.1 Phase Lock Loop (PLL)
The PLL multiplies the clock input by four. When the result is then internally divided by four, the
instruction cycle has the same frequency as the external clock. Four non-overlapping quadrature
clocks, namely Q1, Q2, Q3 and Q4, are still generated internally. Internally, the program counter
(PC) is incremented every Q1, and the instruction is fetched from the program memory and latched
into the instruction register in Q4. The instruction is decoded and executed during the following
Q1 through Q4. The clocks and instruction execution flow are illustrated in Figure 4-4 and
Example 4-1.
Figure 4-4: Clock/Instruction Cycle with PLL
[Timing diagram. Signals shown: PLL Output, internal phase clocks Q1 through Q4, PC, and
OSC2/CLKOUT (RC mode). Each instruction cycle (TCY1, TCY2, TCY3) spans Q1 through Q4.]
TCY1: PC value PC; Fetch INST (PC); Execute INST (PC-2)
TCY2: PC value PC+2; Fetch INST (PC+2); Execute INST (PC)
TCY3: PC value PC+4; Fetch INST (PC+4); Execute INST (PC+2)
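The effect of the PLL on the instruction rate can be checked with a short calculation. This is a sketch only: the 10 MHz input is a hypothetical value chosen for the arithmetic, and the function name is illustrative.

```python
def instruction_frequency_hz(clock_hz, pll_enabled=False):
    """Instruction rate: the internal clock (x4 when the PLL is used) divided by four."""
    internal_hz = clock_hz * 4 if pll_enabled else clock_hz
    return internal_hz / 4


external_hz = 10_000_000  # hypothetical external clock
print(instruction_frequency_hz(external_hz, pll_enabled=True))
print(instruction_frequency_hz(external_hz, pll_enabled=False))
```

With the PLL, the instruction rate in this model equals the external clock frequency; without it, the rate is one quarter of the external clock frequency.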