A NEW GENERATION OF TAG SRAMS: THE IDT71215 AND IDT71216
Integrated Device Technology, Inc.
APPLICATION NOTE AN-136
By Kelly Maas
INTRODUCTION
The 71215 and 71216 represent a new generation of
integrated Tag SRAMs. Just as earlier Tag SRAMs such as
the 71B74 were better suited to tag applications than
conventional SRAMs, the 71215/16 go a step further by
integrating new features that significantly ease the design of
high-performance cache subsystems for today’s high-speed
processors. These Tag RAMs are designed for easy interfacing
to Intel and PowerPC processors, but are flexible enough to be
used in other applications as well.
This application note first provides some background
information on caches, then describes in detail the architecture
and operation of the 71215 and 71216. This is followed by
three application examples, then a brief discussion of cache
coherency protocol implementation using these Tag RAMs.
Since the 71215 and 71216 are very similar, the descriptions
and explanations in this application note apply to both unless
otherwise noted.
CACHE AND TAG BASICS
For those new to caches, a brief review of cache basics may
be worthwhile. A cache is a memory that provides a CPU with
high speed access to a subset of the data from main memory.
Our discussions are focused on the secondary cache, which
is also known as the L2 cache, but it is not much different from
the faster primary (L1) cache residing inside most CPUs.
The cache consists of a controller, a data memory and a tag
memory. The data memory stores the active data from main
memory and is composed of either synchronous burst or
asynchronous SRAMs. The tag memory stores indexes (part of
the CPU address field) that indicate which data is stored in
the cache. Most caches also require at least one bit of memory
for each cache entry to indicate the valid or dirty status of
that entry. Figure 1 shows how the CPU address field relates to
the cache and the tag memory. This example includes valid and
dirty status bits, and represents a 512KB cache, 2GB cacheable
address space, 32-byte line size, and 8-byte word size.
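[Figure 1 diagram. CPU address bits run from A31 (MSB) down to A3 (LSB), with boundaries marked at A30/A19, A18/A5 and A4/A3. Labels: DATA SRAM ADDRESS, TAG ADDRESS, TAG MEMORY holding a 12-bit TAG plus 1-bit LINE VALID and 1-bit LINE DIRTY, COMPARATOR, MATCH to CACHE CONTROLLER.]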
Figure 1. CPU Address Field and the L2 Cache (Showing 512 KB cache size and 2 GB cacheable main memory)
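The address split in the Figure 1 example can be sketched in a few lines of Python. This is an illustration only: the function name and constants are hypothetical, and the field boundaries (A3-A18 for the data SRAMs, A5-A18 for the tag address, A19-A30 for the stored tag) are read from the figure labels for the 512KB, 32-byte-line, 8-byte-word case.

```python
# Illustrative sketch; field widths assume the Figure 1 example only.
WORD_BITS = 3    # A0-A2 select a byte within an 8-byte word
LINE_BITS = 5    # A0-A4 select a byte within a 32-byte line
INDEX_BITS = 14  # A5-A18 address one of 16K tag entries
TAG_BITS = 12    # A19-A30 are stored in the tag memory


def split_address(addr: int) -> dict:
    """Split a CPU byte address into the fields used by the L2 cache."""
    index = (addr >> LINE_BITS) & ((1 << INDEX_BITS) - 1)
    tag = (addr >> (LINE_BITS + INDEX_BITS)) & ((1 << TAG_BITS) - 1)
    # The data SRAMs are one word (8 bytes) wide, so they see A3-A18.
    data_bits = INDEX_BITS + LINE_BITS - WORD_BITS
    data_sram_addr = (addr >> WORD_BITS) & ((1 << data_bits) - 1)
    return {"tag": tag, "index": index, "data_sram_addr": data_sram_addr}


fields = split_address(0x12345678)
print({name: hex(value) for name, value in fields.items()})
```

The tag address and the line offset together span 19 bits, which is where the 512KB cache size comes from; the 12 tag bits above them extend the cacheable range to 2GB.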
The IDT logo is a registered trademark of Integrated Device Technology, Inc.
PowerPC is a trademark of International Business Machines Corporation
Pentium is a trademark of Intel Corporation
©1995 Integrated Device Technology, Inc. 1/95
BASIC TAG RAM ARCHITECTURE
Integrated Tag RAMs operate as ordinary SRAMs, but
have an additional access mode in which a word of data (an
index) is internally read (but not driven off-chip) and compared
with the CPU address driven onto the Tag RAM’s data bus.
Figure 2 shows the basic architecture of an integrated Tag
SRAM. The comparator indicates whether the cache holds
the data for the address supplied by the CPU or other bus
master. This is a critical timing path since this tag “hit” or “miss”
must be determined before the cache memory access can be
completed (or even started, in many cases). Note that tag
memories connect only to the CPU address bus and never to
the CPU data bus.
[Figure 2 block diagram. Labels: ADDRESS, MEMORY, DATA (TAG), DATA IN, DATA OUT, WRITE, READ, COMPARE, MATCH.]
Figure 2. Basic Integrated Tag SRAM Architecture
An additional feature of the Tag SRAM is that a portion of
the memory is resettable. This permits use of one bit of the
data field as a “valid” status bit. Upon system initialization,
when the cache contains random data, a quick reset will clear
the valid bit for every cache line so that all initial cache
accesses will result in a miss. A miss then causes the address
to be loaded into the Tag RAM, data from main memory to be
loaded into the data RAMs, and the valid bit to be set true. If
not included in the Tag RAM, this function requires an
additional 1-bit wide SRAM.
The reset feature of earlier Tag RAMs was sufficient for
implementation of a valid bit, but nothing more. Today’s
secondary caches frequently implement four-state write-back
protocols such as MESI, with multiprocessor applications
requiring five states (e.g. MOESI) or more. Hence, most
caches need a two- or three-bit status memory that is
accessed separately from the tag memory. It is used in
conjunction with the match output to determine the response to
a CPU memory access or a snoop. (A snoop is an operation
initiated by the system in order to maintain coherency between
the cache(s) and main memory.) This has typically been handled
with yet another RAM - a conventional separate I/O SRAM
organized as either x1 or x4. The 71215/16 includes a
three-bit status memory on chip.
THE 71215 AND 71216
As shown in Figure 3, these 16K x 15 RAMs are configured
internally as two memories: 16K x 12 for tag and 16K x 3 for
status. These two memories share the address bus but are
controlled independently. An important new feature is extra
pins and logic for generating BRDY (Intel’s Burst Ready) and
TA (PowerPC’s Transfer Acknowledge). These are CPU input
signals which are time critical in zero wait state secondary
caches. I/Os are 3.3V compatible and there is a low power
standby mode. All writes are synchronous as with burst data
SRAMs, while all reads and compares are asynchronous for
minimum delay. Two opposite polarity chip select pins are
provided for easy depth expansion.
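The compare and resettable valid bit described under BASIC TAG RAM ARCHITECTURE can be modeled behaviorally as follows. This is a minimal sketch under stated assumptions, not a description of the device internals: the class and method names are hypothetical, and gating the hit on the valid bit is a simplification of how a valid bit in the data field takes part in the compare.

```python
class TagRamModel:
    """Behavioral sketch of a direct-mapped tag store with a resettable valid bit."""

    def __init__(self, depth: int = 16 * 1024):
        # Power-up contents are arbitrary; stale values are used here to
        # show why a reset is needed before the first access.
        self.tags = [0x5A5] * depth
        self.valid = [True] * depth

    def reset(self) -> None:
        """Clear every valid bit so that all initial accesses miss."""
        self.valid = [False] * len(self.valid)

    def match(self, index: int, tag: int) -> bool:
        """Report a hit only when the entry is valid and the tags agree."""
        return self.valid[index] and self.tags[index] == tag

    def fill(self, index: int, tag: int) -> None:
        """After a miss, store the new tag and mark the line valid."""
        self.tags[index] = tag
        self.valid[index] = True


ram = TagRamModel()
ram.reset()
first_access = ram.match(3, 0x5A5)   # stale tag no longer hits after reset
ram.fill(3, 0x246)                   # miss handling loads the tag
second_access = ram.match(3, 0x246)
print(first_access, second_access)
```

Without the reset, the stale entry at index 3 would report a hit on whatever tag happened to be in the array at power-up.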
[Figure 3 block diagram. Blocks: 16K x 12 memory, 16K x 3 memory, match and BRDY logic, match register, control logic (chip enabling, resetting the 16K x 3 memory, powering down, disabling outputs). Signals: ADDR(0:13), TAG(0:11), WET, OET, WES, OES, VLDin/S1in, DTYin/S2in, WTin/S3in, VLDout, DTYout, WTout, MATCH, BRDY (TA), BRDYIN (TAIN), BRDYH (TAH), BRDYOE (TAOE), W/R (TT1), CS1, CS2, RESET, PWRDN, CLK, SFUNC.]
Figure 3. Simplified 71215 / 71216 Block Diagram (71216 signal names are in parentheses)
With a 16K x 12 tag memory, the 71215 and 71216 are
wider and deeper than most Tag RAMs. For a typical 64-bit
CPU with a 32-byte line size, the 16K depth supports a 512KB
cache while the 12-bit tag field supports 2GB of cacheable
main memory. Thus, only a single component is required for
most applications. Table 1 shows the relationships between
Tag RAM size, cache size, and cacheable main memory size.
The Tag depth is equal to the cache size divided by the line
size. The Tag width is equal to the base-2 log of the ratio of
main memory size to cache size.
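The two sizing rules above translate directly into arithmetic. The sketch below uses a hypothetical helper name and assumes power-of-two sizes and a direct-mapped cache, as in Table 1.

```python
KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3


def tag_ram_size(cache_bytes: int, main_memory_bytes: int, line_bytes: int = 32):
    """Return (depth, width) of the tag RAM for a direct-mapped cache."""
    depth = cache_bytes // line_bytes
    ratio = main_memory_bytes // cache_bytes
    # For a power-of-two ratio, bit_length() - 1 is the exact base-2 log.
    width = ratio.bit_length() - 1
    return depth, width


depth, width = tag_ram_size(512 * KB, 2 * GB)
print(f"{depth // 1024}K x {width}")
```

For the 512KB cache and 2GB cacheable space discussed above, this gives a 16K x 12 tag memory, which matches the organization of a single 71215 or 71216.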
TABLE 1: REQUIRED TAG RAM SIZE AS A FUNCTION OF CACHE SIZE AND
MAIN MEMORY SIZE
(For 32-byte line size and direct mapped cache architecture.)
              Cacheable Main Memory Size
Cache Size    64MB      256MB     1GB       2GB       4GB
128KB         4K x 9    4K x 11   4K x 13   4K x 14   4K x 15
256KB         8K x 8    8K x 10   8K x 12   8K x 13   8K x 14
512KB         16K x 7   16K x 9   16K x 11  16K x 12  16K x 13
1MB           32K x 6   32K x 8   32K x 10  32K x 11  32K x 12
For a 1MB cache and 4GB of cacheable main memory, two
of the devices may be cascaded in depth without any timing
penalty apart from increased capacitive loading. This is
accomplished with the two Chip Select pins. A low order
address signal may be connected to CS1 on one chip and to
CS2 on the other so that at any given time, one is selected and
the other is deselected. The deselected chip ignores all
control inputs (except RESET and PWRDN) and tri-states its
outputs so that the two chips can be conveniently bussed
together. As expected, worst case timing delays from the Chip
Select inputs are the same as for the Address inputs. When
only a single 71215 or 71216 is used in an application, CS1 is
tied to VSS and CS2 is tied to VCC.
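The depth-expansion scheme using the two Chip Select pins can be checked with a small truth-table model. This is a sketch under assumptions: CS1 is taken to be active low and CS2 active high (consistent with tying CS1 to VSS and CS2 to VCC for a single device), the unused select on each cascaded chip is assumed to be tied to its active level, and the function names are hypothetical.

```python
def chip_selected(cs1_n: int, cs2: int) -> bool:
    """Selected when CS1 (assumed active low) is 0 and CS2 (assumed active high) is 1."""
    return cs1_n == 0 and cs2 == 1


def selected_device(expansion_bit: int) -> int:
    """Return which of two cascaded devices (0 or 1) responds to an access."""
    # Device 0: the address bit drives CS1; CS2 is assumed tied high.
    dev0 = chip_selected(cs1_n=expansion_bit, cs2=1)
    # Device 1: the address bit drives CS2; CS1 is assumed tied low.
    dev1 = chip_selected(cs1_n=0, cs2=expansion_bit)
    # Exactly one device drives the shared bus at any time.
    assert dev0 != dev1
    return 0 if dev0 else 1


for bit in (0, 1):
    print(f"expansion address bit = {bit}: device {selected_device(bit)} selected")
```

Because the same address signal reaches complementary select inputs, no external decoder or inverter is needed, which is why the only cost is the added capacitive load.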
[Figure 4 detailed block diagram. Blocks: address register with a 0/1 select on ADDR (13:0), 16K x 12 tag memory and 16K x 3 status memory (each with a Data IN register and SA block), write (pos) pulse generator, reset (neg) pulse generator, COMPARE, and output registers for MATCH and BRDY (TA). Signals: CS1, CS2, TAG (11:0), OET, WET, WES, OES, CLK, RESET, PWRDN, SFUNC, VLD/S1, DTY/S2 and WT/S3 inputs and outputs, W/R (TT1), BRDYH (TAH), BRDYIN (TAIN), BRDYOE (TAOE), MATCH, BRDY (TA). A portion is marked 71216 only.]
Figure 4. Detailed 71215 / 71216 Block Diagram (71216 pin names are in parentheses)
The 71215/16 is shown in more detail in Figure 4. The tag
memory is controlled by the Write Enable Tag (WET) and
Output Enable Tag (OET) pins. During writes, WET is
synchronous to CLK, as are the input data (TAG0 - TAG11) and
address (A0 - A13). Note that WET has no effect on the TAG
output buffers, so OET must be high to disable the outputs
during writes. Reads are performed by deasserting WET and
asynchronously asserting OET. For cache architectures in
which the tag is never read (e.g. write-through caches), OET
may be tied to VCC. When both WET and OET are high, the
71215/16 is in the match mode, where the TAG0 - TAG11
inputs are compared with the stored data and are used to
generate the MATCH and BRDY/TA outputs. In both read and
match modes, the address path is flow-through for the fastest
possible response to a new address.
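The three tag port modes described above can be summarized as a decode of the two control pins. The sketch below is illustrative only: it treats WET and OET as active-low levels (0 = asserted), as the text implies, and the enum, the function name and the CONFLICT label for a write with the outputs left enabled are hypothetical.

```python
from enum import Enum


class TagMode(Enum):
    WRITE = "write"        # WET low, sampled synchronously with CLK
    READ = "read"          # WET high, OET low (asynchronous)
    MATCH = "match"        # WET and OET both high
    CONFLICT = "conflict"  # hypothetical label: write with outputs enabled


def tag_port_mode(wet_n: int, oet_n: int) -> TagMode:
    """Decode the tag port mode from the active-low WET and OET levels."""
    if wet_n == 0:
        # WET does not disable the TAG output buffers, so OET must be
        # high during a write to keep the bus free for the input data.
        return TagMode.WRITE if oet_n == 1 else TagMode.CONFLICT
    return TagMode.READ if oet_n == 0 else TagMode.MATCH


for wet_n in (0, 1):
    for oet_n in (0, 1):
        print(f"WET={wet_n} OET={oet_n}: {tag_port_mode(wet_n, oet_n).value}")
```

With OET tied to VCC, as suggested for write-through caches, only the write and match modes remain reachable.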
The three status bits of the 71215/16 are labeled VLD/S1,
DTY/S2, and WT/S3. The names are dual because the function of
each bit depends on the state of the static Status Function
(SFUNC) input signal. When SFUNC is low, the
status bits are said to be in a “dedicated” mode and are
referred to as Valid, Dirty and Write-Through. See Figure 5.
When SFUNC is high, the status bits play no special role within
the 71215/16 and are simply referred to as Status 1, Status 2
and Status 3. See Figure 6. The functionality of VLD and WT
in the dedicated mode is described later. DTY/S2 does not
have any special functionality within the 71215/16.
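The effect of SFUNC on the status bit naming can be written as a simple lookup. This is only a sketch of the naming convention described above; the function name is hypothetical and no device behavior beyond the names is modeled.

```python
def status_bit_names(sfunc: int) -> tuple:
    """Return the names of the three status bits for a given SFUNC level."""
    if sfunc == 0:
        # Dedicated mode: VLD and WT have special roles inside the device.
        return ("Valid", "Dirty", "Write-Through")
    # General-purpose mode: the bits play no special role inside the device.
    return ("Status 1", "Status 2", "Status 3")


for level in (0, 1):
    print(f"SFUNC={level}: {', '.join(status_bit_names(level))}")
```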
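[Figure 5 logic diagram. Labels: Address, MEMORY (V, D, WP), COMPARE, VLDin/S1in, DTYin/S2in, WTin/S3in, VLDout/S1out, DTYout/S2out, WTout/S3out, WET, WES, internal RESET, OE, CLK, MATCH, W/R (TT1), BRDYH (TAH), BRDYIN (TAIN), BRDYOE (TAOE), BRDY (TA). A portion is marked 71216 only.]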
Figure 5. Dedicated Mode Logic (71216 pin names are in parentheses)