# IDT79R3721 DRAM Controller







INTEGRATED DEVICE TECHNOLOGY, INC.

## The IDT R3721 DRAM Controller Hardware User's Manual

**Preliminary Information** 

**Revision 2.0** 

August 12, 1992

©1992 Integrated Device Technology, Inc.

## ABOUT THIS MANUAL

This manual is a detailed applications guide to using the IDT R3721 and IDT73720 to construct DRAM subsystems for an R3051 family CPU. It describes a wide variety of memory subsystems, on the assumption that the system designer will primarily study the types of subsystem appropriate to the application at hand; it is not assumed that each system designer will read the manual in its entirety.

In addition to the design information, the manual contains overview chapters on the DRAM controller (IDT R3721), Bus Exchanger (IDT73720) and R3051 family bus interface. Also included is a brief review of DRAM fundamentals.

A quantitative description of the R3721 electrical interface is provided in the data sheet for this product. Also included in the data sheets are the mechanical descriptions of the part, including packaging and pin-out.

Additional information on development tools, additional support chips, the R3051 family, and the use of these products in various applications is provided in separate data sheets and applications notes.

All of this information is readily available from your local IDT sales representative.

## CONTENTS OVERVIEW

**Chapter 1** contains a brief overview of the capabilities of the R3721 DRAM controller.

**Chapter 2** contains a description of the R3051 family bus interface.

**Chapter 3** contains a brief overview of the fundamentals of DRAM operation.

**Chapter 4** describes how the R3721 DRAM controller operates.

**Chapter 5** describes how to program the R3721 to enable the various features and timing models it supports.

**Chapter 6** describes the various interfaces of the R3721: how to connect it to the CPU, the DRAMs, and the data path, and how to use it with other memory subsystems.

**Chapter 7** describes the considerations involved in the construction of a non-interleaved DRAM subsystem. Various DRAM configurations are described.

**Chapter 8** provides a detailed analysis of a particular non-interleaved memory configuration. This chapter contains information on how to perform the timing analysis required to properly program the R3721 in such a system.

**Chapter 9** describes the considerations involved in the construction of an interleaved DRAM subsystem.

**Chapter 10** contains a detailed description of a particular interleaved memory configuration. This chapter also contains a detailed analysis of how to properly program the R3721 for an interleaved memory system.

**Chapter 11** describes the reset sequence, the refresh timing, and the clocking of the R3721.

**Appendix A** describes the IDT73720 Bus Exchanger.

Integrated Device Technology, Inc. reserves the right to make changes to its products or specifications at any time, without notice, in order to improve design or performance and to supply the best possible product. IDT does not assume any responsibility for use of any circuitry described other than the circuitry embodied in an IDT product. The Company makes no representations that circuitry described herein is free from patent infringement or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent, patent rights or other rights, of Integrated Device Technology, Inc.

#### LIFE SUPPORT POLICY

Integrated Device Technology's products are not authorized for use as critical components in life support devices or systems unless a specific written agreement pertaining to such intended use is executed between the manufacturer and an officer of IDT.

1. Life support devices or systems are devices or systems which (a) are intended for surgical implant into the body or (b) support or sustain life and whose failure to perform, when properly used in accordance with instructions for use provided in the labeling, can be reasonably expected to result in a significant injury to the user.
2. A critical component is any component of a life support device or system whose failure to perform can be reasonably expected to cause the failure of the life support device or system, or to affect its safety or effectiveness.

The IDT logo is a registered trademark and RISController, R3051, and RISChipset are trademarks of Integrated Device Technology, Inc. MIPS is a registered trademark of MIPS Computer Systems, Inc. UNIX is a registered trademark of AT&T.

All others are trademarks of their respective companies.



## TABLE OF CONTENTS

## Chapter 1: R3721 Overview

| Introduction                               | 1-1          |
|--------------------------------------------|--------------|
| Description                                | 1-2          |
| Configurability                            | 1-4          |
| Performance Considerations                 | 1-5          |
| Applications                               | 1-5          |
|                                            |              |
| Chapter 2: R3051 Family Interface Overview |              |
| R3051 Bus Interface Pin Description        | 2-2          |
| Read Transactions                          | 2-5          |
| Read Interface Timing Overview             | 2-5          |
| Memory Addressing                          | 2-5          |
| Bus Turn Around                            | 2-6          |
| Bringing Data into the Processor           | 2-7          |
| Terminating the Read                       | 2-8          |
| Read Timing Diagrams                       | 2-8          |
| Single Word Reads                          | 2-8          |
| Quad Word Reads                            | 2-8          |
| Write Interface                            | 2-15         |
| Types of Write Transactions                | 2-15         |
| Write Interface Timing Overview            | 2-16         |
| Memory Addressing                          | 2-16         |
| Data Phase                                 | 2-16         |
| Terminating the Write                      | 2-17         |
| Write Timing Diagrams                      | 2-18         |
| Basic Write                                | 2-18         |
| DMA Arbiter Interface                      | 2-19         |
| Interface Overview                         | 2-19         |
| DMA Arbiter Timing Diagrams                | 2-19         |
| Initiation of DMA Mastership               | 2-19         |
| Relinquishing Mastership Back to the CPU   | 2-20         |
|                                            |              |
| Chapter 3: Fundamentals of DRAM Operation  |              |
| Introduction                               | 3-1          |
| DRAM Architecture                          | 3-1          |
| Normal Access                              | 3-3          |
| Page Mode and Static Column Accesses       | 3-3          |
| DRAM Refresh and Pre-charge                | 3-3          |
| Memory System Configurations               | 3-6          |
| Summary                                    | 3-7          |
|                                            |              |
| Chapter 4: R3721 Operation Overview        |              |
| Introduction                               | 4-1          |
| R3051 Bus Interface                        | 4-1          |
| R3721 DRAM Interface                       | 4-2          |
| Pin Description                            | 4-5          |
|                                            |              |
| Chapter 5: Programming the R3721           |              |
| Introduction                               | 5-1          |
| The Mode Register                          | 5-1          |
| Programming the Mode Register              | 5-1          |
| DRAM Size Field                            | 5-1          |
| External Memory Configuration              | 5-2          |

| Write Near                                                                    | 5-2                          |
|-------------------------------------------------------------------------------|------------------------------|
| $\overline{RAS}$ to $\overline{CAS}$ Delay                                    | 5-2                          |
| RAS Timing                                                                    | 5-3                          |
| CAS Pulse Width                                                               | 5-4                          |
| CAS Pre-charge Time                                                           | 5-5                          |
| Refresh Period                                                                | 5-5                          |
| Delayed Chip-select                                                           | 5-6                          |
| Default Settings                                                              | 5-7                          |
| Writing to the Mode Register                                                  | 5-7                          |
| Auto Configuration Detection and Initialization                               | 5-8                          |
|                                                                               |                              |
| Chapter 6: R3721 Interfacing                                                  |                              |
| Introduction                                                                  | 6-1                          |
| R3051 Bus Interface                                                           | 6-1                          |
| R3721 DRAM Interface                                                          | 6-4                          |
| Data Path Control Interface                                                   | 6-6                          |
| Summary                                                                       | 6-8                          |
|                                                                               |                              |
| Chapter 7: The Use of the R3721 in a Non-interleaved Memory System            |                              |
| Introduction                                                                  | 7-1                          |
| Non-Interleaved System Design                                                 | 7-1                          |
| Single Read Transaction Timings                                               | 7-2                          |
| Start of Single Read Access                                                   | 7-2                          |
| Memory Control Signals for Single Read Accesses                               | 7-4                          |
| End of a Single Read Access                                                   | 7-5                          |
| Page Read Accesses                                                            | 7-8                          |
| Single Read Access Outside of Page                                            | 7-10                         |
| Single Write Transaction Timings                                              | 7-11                         |
| Start of Write Access                                                         | 7-11                         |
| Memory Control Signals for Single Write Accesses                              | 7-11                         |
| End of a Single Write Access                                                  | 7-11                         |
| Page Write Accesses                                                           | 7-13                         |
| Single Write Access Outside of Page                                           | 7-24                         |
| Partial Word Write Operation                                                  | 7-25                         |
| Quad Word Read Transaction Timings                                            | 7-26                         |
| Start of Quad Word Read Access                                                | 7-26                         |
| Memory Control Signals During Quad Word Read Accesses                         | 7-26                         |
| End of a Quad Word Read Access                                                | 7-26                         |
| Page Quad Word Read Accesses                                                  | 7-30                         |
|                                                                               |                              |
| Chapter 8: Application Example: A Non-interleaved Two Bank Memory System      |                              |
| Introduction                                                                  | 8-1                          |
| General System Description                                                    | 8-1                          |
| Detailed Description of the R3721 Connections                                 | 8-3                          |
| Multiple Banks of "x1" DRAMs                                                  | 8-4                          |
| Setting the Mode Register                                                     | 8-6                          |
| Derating Effect Due to Capacitive Loading                                     | 8-6                          |
| System Timing Diagrams                                                        | 8-15                         |
|                                                                               |                              |
| Chapter 9: The Use of the R3721 in an Interleaved Memory System               |                              |
| Introduction                                                                  | 9-1                          |
| Interleaved System Design                                                     | 9-1                          |
| Single Read Transaction Timings                                               | 9-2                          |
| Start of Single Read Access                                                   | 9-2                          |
| Memory Control Signals for Single Read Accesses                               | 9-3                          |
| End of a Single Read Access                                                   | 9-3                          |
| Page Read Accesses                                                            | 9-5                          |
| Single Read Access Outside of Page                                            | 9-6                          |

| Single Write Transaction Timings                                           | 9-7   |
|----------------------------------------------------------------------------|-------|
| Start of Write Access                                                      | 9-7   |
| Memory Control Signals for Single Write Accesses                           | 9-7   |
| End of a Single Write Access                                               | 9-8   |
| Page Write Accesses                                                        | 9-8   |
| Single Write Access Outside of Page                                        | 9-11  |
| Partial Word Write Operation                                               | 9-11  |
| Quad Word Read Transaction Timings                                         | 9-12  |
| Start of Quad Word Read Access                                             | 9-12  |
| Memory Control Signals During Quad Word Read Accesses                      | 9-13  |
| End of a Quad Word Read Access                                             | 9-14  |
| Page Quad Word Read Accesses                                               | 9-17  |
|                                                                            |       |
| Chapter 10: Application Example: An Interleaved Two Bank-Pair Memory System |       |
| Introduction                                                               | 10-1  |
| General System Description                                                 | 10-1  |
| Detailed Description of the R3721 Connections                              | 10-2  |
| Setting the Mode Register                                                  | 10-4  |
| Derating Effect Due to Capacitive Loading                                  | 10-4  |
| System Timing Diagrams                                                     | 10-12 |
|                                                                            |       |
| Chapter 11: Reset Initialization, Refresh and Input Clocking               |       |
| Introduction                                                               | 11-1  |
| Power-up and Reset                                                         | 11-1  |
| DRAM Initialization                                                        | 11-1  |
| CAS Before RAS Refresh Timings                                             | 11-2  |
| Input Clock Requirements                                                   | 11-4  |
|                                                                            |       |
| Appendix A: IDT73720 Bus Exchanger Overview                                |       |
| Introduction                                                               | A-1   |
| Major Features                                                             | A-1   |
| Description                                                                | A-2   |
| Architecture Overview                                                      | A-2   |
| Data Flow Control Signals                                                  | A-2   |
| Memory Read Operations                                                     | A-3   |
| Memory Write Operations                                                    | A-3   |
| Pin Description                                                            | A-4   |
|                                                                            |       |

## LIST OF FIGURES

| Figure 1.1      | R3721 Dynamic Memory Controller                                           | 1-2        |
|-----------------|---------------------------------------------------------------------------|------------|
| Figure 1.2      | R3051-based System Using R3721 DRAM Controller                            | 1-3        |
| Figure 2.1      | Start of Processor Read                                                   | 2-6        |
| Figure 2.2      | Data Sampling by R3051                                                    | 2-7        |
| Figure 2.3      | End of Read                                                               | 2-9        |
| Figure 2.4      | Single Word Read Cycle                                                    | 2-10       |
| Figure 2.5 (a)  | Start of Burst Quad Word Read                                             | 2-11       |
| Figure 2.5 (b)  | End of Burst Quad Word Read                                               | 2-12       |
| Figure 2.6 (a)  | Start of Throttled Quad Word Read                                         | 2-13       |
| Figure 2.6 (b)  | End of Throttled Quad Word Read                                           | 2-14       |
| Figure 2.7      | Start of Write                                                            | 2-17       |
| Figure 2.8      | End of Write                                                              | 2-18       |
| Figure 2.9      | Basic Write                                                               | 2-18       |
| Figure 2.10     | DMA Mastership Request                                                    | 2-20       |
| Figure 2.11     | Relinquishing DMA Mastership                                              | 2-21       |
| Figure 3.1      | Separate I/O "x1" DRAM                                                    | 3-2        |
| Figure 3.2      | Common I/O "x4" DRAM                                                      | 3-2        |
| Figure 3.3      | Normal DRAM Access                                                        | 3-4        |
| Figure 3.4      | Page Mode DRAM Access                                                     | 3-5        |
| Figure 3.5      | Interleaved Memory System                                                 | 3-6        |
| Figure 4.1      | R3721 State Machine                                                       | 4-3        |
| Figure 4.2      | Mode Register of DRAM Controller                                          | 4-4        |
| Figure 5.1      | The Mode Register                                                         | 5-1        |
| Figure 5.2      | RAS to CAS Delay                                                          | 5-3        |
| Figure 5.3      | RAS to Signals Timing                                                     | 5-4        |
| Figure 5.4      | CAS Pulse Width Timing                                                    | 5-4        |
| Figure 5.5      | CAS Pre-Charge Timing                                                     | 5-5        |
| Figure 5.6      | Chip-select Timing                                                        | 5-6        |
| Figure 5.7      | Settings of Mode Register at Power Up                                     | 5-7        |
| Figure 5.8      | Writing to the Mode Register                                              | 5-8        |
| Figure 6.1      | R3721 CPU Interface Connections                                           | 6-2        |
| Figure 6.2      | R3721 DRAM Control Interface                                              | 6-4        |
| Figure 6.3      | R3721 Data Path Interface to 74FCT245s                                    | 6-6        |
| Figure 6.4      | R3721 Data Path Interface to IDT73720 Bus Exchangers                      | 6-7        |
| Figure 7.1      | Settings of the Mode Register Used as an Example in this Chapter          | 7-1        |
| Figure 7.2 (a)  | Start of Single Read Access for Fast Chip-select                          | 7-3        |
| Figure 7.2 (b)  | Start of Single Read Access for Slow Chip-select                          | 7-4        |
| Figure 7.3 (a)  | DRAM Control for RCD=0 (Single Read)                                      | 7-5        |
| Figure 7.3 (b)  | DRAM Control for RCD=1 (Single Read)                                      | 7-5        |
| Figure 7.4 (a)  | End of Single Read Access, $\overline{CAS}$ Pulse = 1.5 Clock Cycle       | 7-6        |
| Figure 7.4 (b)  | End of Single Read Access, CAS Pulse = 2.5 Clock Cycle                    | 7-7        |
| Figure 7.5      | Example of a Single Read Access                                           | 7-8        |
| Figure 7.6      | Page Read Access Timing Diagram                                           | 7-9        |
| Figure 7.7      | Single Read Access Outside of Page                                        | 7-10       |
| Figure 7.8      | Start of a Single Write Access for Fast Chip-select                       | 7-12       |
| Figure 7.9 (a)  | End of Single Write Access, CAS pulse = 1.5 Clock Cycle                   | 7-14       |
| Figure 7.9 (b)  | End of Single Write Access, CAS pulse = 2.5 Clock Cycle                   | 7-15       |
| Figure 7.10     | Timing Diagrams for a Single Write Access                                 | 7-16       |
| Figure 7.11 (a) | Optimum Page Write in Two Clock Cycles. $\overline{CS}$ and Internal Page Comparator Bypassed. | 7-18 |

| Figure 7.11 (b) | 3 Clock Cycles Page Write with Slow Chip-select, $\overline{CAS}$ Pulse Width = 1.5 Clock Cycles, $\overline{CAS}$ Pre-charge = 0.5 Clock Cycles | 7-19 |
|-----------------|---------------------------------------------------------------------------|------|
| Figure 7.11 (c) | 3 Clock Cycles Page Write with Slow Chip-select, CAS Pulse Width = 2.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycles | 7-20 |
| Figure 7.11 (d) | 3 Clock Cycles Page Write with $\overline{CAS}$ Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycles. | 7-21 |
| Figure 7.11 (e) | Page Write Using Internal Comparator, WrNear Not Asserted | 7-22 |
| Figure 7.12     | Page Write Access Timing | 7-23 |
| Figure 7.13     | Single Read Followed by a Single Write Followed by a Single Read Access | 7-24 |
| Figure 7.14                                                                                                                                                                     | Single Write Access Outside of Page                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | 7-25                                                                                     |
| Figure 7.15 (a) | Quad Word Read Transaction Timing, $\overline{CAS}$ Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle | 7-27 |
| Figure 7.15 (b) | Quad Word Read Transaction Timing, $\overline{CAS}$ Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycles | 7-28 |
| Figure 7.16                                                                                                                                                                     | Quad Word Read Access Timing Diagrams                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | 7-29                                                                                     |
| Figure 7.17                                                                                                                                                                     | Page Quad Word Read Access Timing Diagrams                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 7-30                                                                                     |
| Figure 8.1                                                                                                                                                                      | General System Using the R3051 and the R3721                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | 8-2                                                                                      |
| Figure 8.2                                                                                                                                                                      | Address Decoder PAL Equations for DRAM_CS and MSel                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | 8-3                                                                                      |
| Figure 8.3 | Detailed Connections for the R3721 in a Two Bank Non-interleaved Memory System | 8-4 |
| Figure 8.4                                                                                                                                                                      | Analysis to Set RCD in the Mode Register                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | 8-9                                                                                      |
| Figure 8.5                                                                                                                                                                      | CAS Pulse Width Timing Analysis                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 8-10                                                                                     |
| Figure 8.6 (a)                                                                                                                                                                  | CAS Pre-charge Time Analysis                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | 8-12                                                                                     |
| Figure 8.6 (b)                                                                                                                                                                  | CAS Pre-charge Time Analysis During Writes                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 8-13                                                                                     |
| Figure 8.7                                                                                                                                                                      | Mode Register Settings for a Two Bank Non-interleaved System                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | 8-14                                                                                     |
| Figure 8.8                                                                                                                                                                      | Single Read Access to Bank(1)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | 8-15                                                                                     |
| Figure 8.9                                                                                                                                                                      | Single Write Access to Bank(0)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 8-16                                                                                     |
|                                                                                                                                                                                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |                                                                                          |
| Figure 9.1 | Settings of the Mode Register Used as an Example in this Chapter for Interleaved Memory Systems | 9-1 |
| Figure 9.2                                                                                                                                                                      | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |                                                                                          |
| Figure 9.2                                                                                                                                                                      | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 9-1<br>9-4                                                                               |
|                                                                                                                                                                                 | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 9-4                                                                                      |
| Figure 9.2<br>Figure 9.3                                                                                                                                                        | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |                                                                                          |
| Figure 9.2                                                                                                                                                                      | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 9-4<br>9-5                                                                               |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4                                                                                                                                          | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 9-4<br>9-5<br>9-6                                                                        |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5                                                                                                                            | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | 9-4<br>9-5                                                                               |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4                                                                                                                                          | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | 9-4<br>9-5<br>9-6<br>9-9                                                                 |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6                                                                                                              | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single<br>Read Access in Interleaved Systems                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | 9-4<br>9-5<br>9-6<br>9-9<br>9-10                                                         |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6<br>Figure 9.7                                                                                                | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single<br>Read Access in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 9-4<br>9-5<br>9-6<br>9-9                                                                 |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6                                                                                                              | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single<br>Read Access in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | 9-4<br>9-5<br>9-6<br>9-9<br>9-10<br>9-11                                                 |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6<br>Figure 9.7<br>Figure 9.8 (a)                                                                              | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single<br>Read Access in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | 9-4<br>9-5<br>9-6<br>9-9<br>9-10                                                         |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6<br>Figure 9.7                                                                                                | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single<br>Read Access in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems<br>Guad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | 9-4<br>9-5<br>9-6<br>9-9<br>9-10<br>9-11<br>9-14                                         |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6<br>Figure 9.7<br>Figure 9.8 (a)<br>Figure 9.8 (b)                                                            | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single<br>Read Access in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycle                                                                                                                                                                                                                                                                                                                                      | 9-4<br>9-5<br>9-6<br>9-9<br>9-10<br>9-11                                                 |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6<br>Figure 9.7<br>Figure 9.8 (a)                                                                              | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single<br>Read Access in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycle<br>Quad Word Read Access Timing diagrams for Interleaved                                                                                                                                                                                                                                                                                                                                           | 9-4<br>9-5<br>9-6<br>9-9<br>9-10<br>9-11<br>9-14<br>9-15                                 |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6<br>Figure 9.7<br>Figure 9.8 (a)<br>Figure 9.8 (b)<br>Figure 9.9                                              | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single<br>Read Access in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycle<br>Quad Word Read Access Timing diagrams for Interleaved<br>Memory System                                                                                                                                                                                                                                                                                                                          | 9-4<br>9-5<br>9-6<br>9-9<br>9-10<br>9-11<br>9-14                                         |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6<br>Figure 9.7<br>Figure 9.8 (a)<br>Figure 9.8 (b)                                                            | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single<br>Read Access in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycle<br>Quad Word Read Access Timing diagrams for Interleaved<br>Memory System<br>Page Quad Word Read Access Timing Diagrams for Interleaved                                                                                                                                                                                                | 9-4<br>9-5<br>9-6<br>9-9<br>9-10<br>9-11<br>9-14<br>9-15<br>9-16                         |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6<br>Figure 9.7<br>Figure 9.8 (a)<br>Figure 9.8 (b)<br>Figure 9.9                                              | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems<br>Page Read Access Timing Diagrams in Interleaved<br>Memory Systems<br>Single Read Access Outside of Page for the Interleaved<br>Memory System<br>Page Write Access Timing Diagrams in Interleaved Systems<br>Single Read Followed by a Single Write Followed by a Single<br>Read Access in Interleaved Systems<br>Single Write Access Outside of Page in Interleaved Systems<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle<br>Quad Word Read Transaction Timing in Interleaved Systems, CAS<br>Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycle<br>Quad Word Read Access Timing diagrams for Interleaved<br>Memory System                                                                                                                                                                                                                                                                                                                          | 9-4<br>9-5<br>9-6<br>9-9<br>9-10<br>9-11<br>9-14<br>9-15                                 |
| Figure 9.2<br>Figure 9.3<br>Figure 9.4<br>Figure 9.5<br>Figure 9.6<br>Figure 9.7<br>Figure 9.8 (a)<br>Figure 9.8 (b)<br>Figure 9.9<br>Figure 9.10                               | Chapter for Interleaved Memory Systems<br>Example of a Single Read Access for Interleaved<br>Memory Systems                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 9-4<br>9-5<br>9-6<br>9-9<br>9-10<br>9-11<br>9-14<br>9-15<br>9-16                         |
|                | Chapter for Interleaved Memory Systems | |
| Figure 9.2     | Example of a Single Read Access for Interleaved Memory Systems | 9-4 |
| Figure 9.3     | Page Read Access Timing Diagrams in Interleaved Memory Systems | 9-5 |
| Figure 9.4     | Single Read Access Outside of Page for the Interleaved Memory System | 9-6 |
| Figure 9.5     | Page Write Access Timing Diagrams in Interleaved Systems | 9-9 |
| Figure 9.6     | Single Read Followed by a Single Write Followed by a Single Read Access in Interleaved Systems | 9-10 |
| Figure 9.7     | Single Write Access Outside of Page in Interleaved Systems | 9-11 |
| Figure 9.8 (a) | Quad Word Read Transaction Timing in Interleaved Systems, CAS Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle | 9-14 |
| Figure 9.8 (b) | Quad Word Read Transaction Timing in Interleaved Systems, CAS Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycle | 9-15 |
| Figure 9.9     | Quad Word Read Access Timing Diagrams for Interleaved Memory System | 9-16 |
| Figure 9.10    | Page Quad Word Read Access Timing Diagrams for Interleaved Memory System | 9-18 |
| Figure 10.1    | | 10-2 |
| Figure 10.2    | | 10-3 |

| Figure          | Title                                                        | Page  |
|-----------------|--------------------------------------------------------------|-------|
| Figure 10.5 (a) | CAS Pre-charge Time Analysis                                 | 10-8  |
| Figure 10.5 (b) | CAS Pre-charge Time Analysis During Writes                   | 10-10 |
| Figure 10.6     | Mode Register Settings for a Two Bank Non-interleaved System | 10-11 |
| Figure 10.7     | Single Read Access to the Even Half-bank-pair 0              | 10-12 |
| Figure 10.8     | Single Read Access to the Odd Half-bank-pair 1               | 10-13 |
| Figure 10.9     | Single Write Access to the Odd Half-bank-pair 0              | 10-14 |
| Figure 10.10    | Single Write Access to the Even Half-bank-pair 1             | 10-15 |
| Figure 11.1     | Cold Start                                                   | 11-1  |
| Figure 11.2     | Warm Start                                                   | 11-1  |
| Figure 11.3     | Reset Initialization Sequence                                | 11-2  |
| Figure 11.4     | CAS-before-RAS Refresh Sequence                              | 11-3  |
| Figure 11.5     | R3721 Input Clock Requirements                               | 11-4  |
| Figure A.1      | Block Diagram of IDT73720 Bus Exchanger                      | A-1   |

#### LIST OF TABLES

| Table     | Title                                  | Page |
|-----------|----------------------------------------|------|
| Table 4.1 | Processor to DRAM Address Multiplexing | 4-1  |
| Table 4.2 | Bank Selection in Multi-Bank System    | 4-2  |





## INTRODUCTION

The R3721 is a member of the R3051<sup>™</sup> system support RISChipset<sup>™</sup>. The R3721 is a Dynamic RAM Memory Controller, designed to offer the same levels of system flexibility as the R3051 family.

The R3721 is responsible for translating between the R3051 family bus interface and the special control requirements of various DRAM based subsystems. The R3721 performs all necessary handshaking and timing control. All that is required to implement a DRAM sub-system for the R3051 family is the R3721, DRAMs, an address decoder, and some transceivers for the data path.

The R3721 has been designed so that a system's memory can be upgraded in the field. To upgrade to larger memory devices, or to increase the amount of memory, software merely needs to re-program the R3721 mode register at boot time. No complicated re-routing of address lines or modification of the data path is required. Thus, as with the R3051 family, a single footprint and base design can offer a wide variety of end products, depending on the frequency of devices selected, the amount of memory installed, and the specific R3051 family CPU selected.

The R3721 is packaged using a low-cost 84-pin PLCC package, and supports a wide variety of DRAM-based sub-systems:

- 256k x 1 through 4Mb x 4 DRAM devices
- 1 to 4 banks of DRAM
- non-interleaved or two-way interleaved
- Direct control of DRAM data path transceivers
- Direct handshake with R3051
- Supports all bus transfers of the R3051 family
- DRAM access times of 100 ns or faster
- Supports page mode operation of DRAMs (either read or write) using on-chip page detector
- CAS-before-RAS refresh
- Capability to drive up to 36 DRAMs directly
- Directly controls x8, x9, x32, and x36 DRAM Memory Modules
- Highly programmable DRAM timing control for optimum performance
- Supports various memory address decoding schemes

Figure 1.1 illustrates a block diagram of the R3721 DRAM controller.



Figure 1.1 R3721 Dynamic Memory Controller

#### DESCRIPTION

The R3721 DRAM controller contains all of the functional elements necessary to support the bus transaction requirements of the R3051 family.

The R3721 connects directly to the R3051 bus, and captures address and control information from the bus as the R3051, or a DMA controller, drives it.

The R3721 begins its transaction once its memory space is selected by an external address decoder.

The R3721 will generate all of the DRAM control signal sequencing required:

- Row address set-up to $\overline{RAS}$ asserted
- Row address hold from $\overline{RAS}$ asserted
- Column address set-up to $\overline{CAS}$ asserted
- $\overline{RAS}$ to $\overline{CAS}$ delay
- $\overline{CAS}$ to data valid (read)
- $\overline{WE}$ to $\overline{CAS}$ set-up (write)

In addition, the R3721 will manage the transceiver-based data path interface, to properly control the flow of data between the CPU bus and the DRAM devices. The R3721 can either control standard FCT245 type transceivers (non-interleaved memory systems), or use the high-performance 73720 Bus Exchanger (for interleaved or banked memory systems). Figure 1.2 illustrates a typical system composed of the R3051, R3721 DRAM controller, and 73720 bus exchanger.

Finally, the R3721 will provide the proper acknowledgement back to the R3051, at the optimum time. That is, the R3721 will generate  $\overline{ACK}$  and/or  $\overline{RdCEn}$ , according to the timing model for the DRAMs and the type of transfer requested.



Figure 1.2 R3051-based System Using R3721 DRAM Controller

The types of transactions performed include:

- **Single Datum Reads.** The R3051 can request a single datum read (one to four bytes). The DRAM controller is capable of processing that read as a standard access or as a page mode access, depending on its locality to the preceding access.
- **Single Datum Writes.** The R3051 can perform single datum writes (one to four bytes). The DRAM controller will process that write as either a standard write access or a page mode write access, depending on its locality to the preceding access. The R3721 can use the WrNear output from the R3051 to provide a quick address decode, and can retire near writes in two cycles in almost any speed system.
- **Quad Word Reads.** The R3051 requests quad word reads in response to cache misses. The DRAM controller uses page mode in the DRAM to provide the data. If the memory sub-system is fast enough, the data will be returned in a true "burst" response (one word per clock cycle after initial latency). Otherwise, the R3721 will control RdCEn to "throttle" the read response.

- **CAS-before-RAS Refresh.** The R3721 contains all logic to automatically perform DRAM refresh. The R3721 uses simple $\overline{CAS}$-before-$\overline{RAS}$ refresh signalling to perform DRAM refresh.

#### CONFIGURABILITY

The R3721 supports various memory configurations and speeds, as programmed into its on-chip, write-only mode register (Figure 1.3). The mode register allows the system designer to program the following system characteristics:

- Speed of address decoding
- Refresh rate
- $\overline{CAS}$ pre-charge time in page mode accesses
- $\overline{CAS}$ low time, which indicates data access time from assertion of $\overline{CAS}$
- $\overline{RAS}$ waveform, including low and high times
- $\overline{RAS}$ to $\overline{CAS}$ delay required
- Memory configuration (interleaved, etc.)
- DRAM size, from 256k x 1 through 4Mb x 4

At boot time, the processor programs the DRAM controller according to the type of memory system connected. Note that the DRAM controller may be reprogrammed; thus, it is possible to program in a "maximum case" value, perform memory diagnostics to determine the exact configuration, and then reprogram the device according to the actual system configuration. This capability is included to allow the system to re-configure itself at boot time, to further support various field and manufacturing options for the given system.

The DRAM controller performs all internal address shifting necessary to accommodate the various depths of DRAM. That is, all R3721's are connected to the address bus in the same fashion, regardless of the DRAM organization; logic internal to the R3721 multiplexes R3051 address lines to the appropriate DRAM row or column address lines, according to the type of device under the R3721's control. Again, this capability has been provided to allow field upgrades to higher density DRAM devices.
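The idea of depth-dependent address multiplexing can be sketched as follows. This is a minimal illustration only: the actual assignment of processor address lines to row and column address pins is defined by Table 4.1, and the simple "low bits to column, next bits to row" split used here is an assumption. The function and table names are hypothetical.

```python
# Multiplexed address pins per DRAM depth: 256k = 2**18 cells
# (9 row bits + 9 column bits), 1M = 2**20, 4M = 2**22.
ADDRESS_BITS = {"256k": 9, "1M": 10, "4M": 11}


def split_dram_address(word_address, depth):
    """Return (row, column) for a word address and a DRAM depth key.

    Assumption for illustration only: the low-order bits form the column
    address and the next bits form the row address.
    """
    bits = ADDRESS_BITS[depth]
    mask = (1 << bits) - 1
    column = word_address & mask
    row = (word_address >> bits) & mask
    return row, column


# The same word address splits differently for each device depth,
# while the connection to the address bus stays the same.
for depth in ("256k", "1M", "4M"):
    print(depth, split_dram_address(0x12345, depth))
```

Only the table entry changes between device depths; the caller always supplies the same address, which mirrors the field-upgrade property described above.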

#### PERFORMANCE CONSIDERATIONS

The R3721 has been optimized to obtain maximum performance from low-cost, commodity DRAMs such as 1Mb 70-80ns 256k x 4 devices. Whereas discrete or PAL-based DRAM control systems are typically limited to using only one edge of the processor SysClk output, the R3721 uses both clock edges of SysClk to provide finer granularity of timing in the DRAM sub-system and to achieve levels of performance that would be difficult and expensive to achieve in a discrete implementation.

| 15   | 14  | 13  | 12  | 11  | 10 | 9    | 8  | 7  | 6  | 5  | 4   | 3    | 2     | 1   | 0   |
|------|-----|-----|-----|-----|----|------|----|----|----|----|-----|------|-------|-----|-----|
| rsvd | DCS | RF2 | RF1 | RF0 | CP | rsvd | C0 | R2 | R1 | R0 | RCD | WrNr | Inlvd | DZ1 | DZ0 |

| Field    | Description          |
|----------|----------------------|
| rsvd     | Reserved             |
| DCS      | Delayed Chip Select  |
| RF(2:0)  | Refresh Period       |
| CP       | CAS Pre-charge Time  |
| C0       | CAS Low Time         |
| R(2:0)   | RAS Waveform         |
| RCD      | RAS to CAS delay     |
| WrNr     | Use WrNear on writes |
| Inlvd    | Interleaved DRAM     |
| DZ(1:0)  | DRAM Page Size       |

Figure 1.3 R3721 Mode Register
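As an illustration of the register layout, the sketch below packs named fields into a 16-bit value using the bit positions of Figure 1.3. The encodings of the individual fields are not given in this chapter, so the values passed in the example are placeholders; leaving the reserved bits (15 and 9) at zero is likewise an assumption. The names `MODE_FIELDS` and `pack_mode_register` are hypothetical.

```python
# (least significant bit, width) of each field, per Figure 1.3.
MODE_FIELDS = {
    "DCS":   (14, 1),  # Delayed Chip Select
    "RF":    (11, 3),  # Refresh Period, RF(2:0)
    "CP":    (10, 1),  # CAS Pre-charge Time
    "C0":    (8, 1),   # CAS Low Time
    "R":     (5, 3),   # RAS Waveform, R(2:0)
    "RCD":   (4, 1),   # RAS to CAS delay
    "WrNr":  (3, 1),   # Use WrNear on writes
    "Inlvd": (2, 1),   # Interleaved DRAM
    "DZ":    (0, 2),   # DRAM Page Size, DZ(1:0)
}


def pack_mode_register(**fields):
    """Pack field values into a 16-bit mode register image."""
    value = 0  # reserved bits 15 and 9 stay 0 (assumption, see lead-in)
    for name, field_value in fields.items():
        lsb, width = MODE_FIELDS[name]
        if not 0 <= field_value < (1 << width):
            raise ValueError(f"{name}={field_value} does not fit in {width} bit(s)")
        value |= field_value << lsb
    return value


# Placeholder field values, chosen only to show the bit placement.
image = pack_mode_register(RF=0b101, Inlvd=1, DZ=0b10)
print(f"{image:#06x}")
```

Keeping the layout in one table makes it straightforward to re-program the register at boot time once the installed memory configuration is known.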

For example, at 25MHz in a non-interleaved memory configuration, the R3721 can perform a standard read access in 5 cycles. Similarly, page mode writes are retired at the maximum processor rate of one write every two cycles.

Quad word reads obtain data at the rate of one word every two clock cycles. In a higher-performance memory configuration, interleaved memory can be used to increase the block refill rate to one word every clock cycle. This performance, coupled with the high cache hit rates inherent in the R3051 family, allows system designers to build high-performance, low-cost systems with a minimum of parts and design complexity.
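As a worked check of these figures, assuming a 25 MHz SysClk and 32-bit (4-byte) words, and counting only the refill rate (not the initial latency):

$$
T_{clk} = \frac{1}{25\ \text{MHz}} = 40\ \text{ns}, \qquad t_{read} = 5\,T_{clk} = 200\ \text{ns}
$$

$$
\text{non-interleaved: } \frac{4\ \text{bytes}}{2\,T_{clk}} = 50\ \text{Mbytes/s}, \qquad \text{interleaved: } \frac{4\ \text{bytes}}{T_{clk}} = 100\ \text{Mbytes/s}
$$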

Thus, the combination of the R3051 family CPU and R3721 DRAM controller allows the system designer to make cost/performance trade-offs without crippling the performance of the end product.

#### APPLICATIONS

The R3721 is a basic building block, which fits a broad range of applications, including graphics systems, laser printers, data communications, and other applications requiring a high-performance processor. The R3721 is designed to eliminate all glue logic and PALs from the DRAM control sub-system, easing design, reducing time-to-market, increasing performance, and lowering system cost.



## R3051 FAMILY INTERFACE OVERVIEW

**CHAPTER 2** 

The IDT R3051 family utilizes a simple, flexible bus interface to its external memory and I/O resources. The interface uses a single, multiplexed 32-bit address and data bus and a simple set of control signals to manage read and write operations. Complementing the basic read and write interface is a DMA Arbiter interface which allows an external agent to gain control of the memory interface to transfer data. This chapter provides an overview of the R3051 memory interface; additional detail is found in the "R3051 Hardware User's Guide".

The R3051 family supports the following types of operations on its interface:

**Write Operations:** The R3051 family utilizes an on-chip write buffer to isolate the execution core from the speed of external memory during write operations. The write interface of the R3051 family is thus designed to allow a variety of write strategies, from fast 2-cycle write operations through multiple wait-state writes.

The R3051 family supports the use of fast page mode writes by providing an output indicator,  $\overline{WrNear}$ , to indicate that the current write may be retired using a page mode access. This facilitates the rapid "flushing" of the on-chip write buffer, since the majority of processor writes will occur within a localized area of memory.

**Read Operations:** The processor executes read operations as the result of either a cache miss or an uncacheable reference. As with the write interface, the read interface has been designed to accommodate a wide variety of memory system strategies. There are two types of reads performed by the processor:

Quad word reads occur when the processor requests a contiguous block of four words from memory. Bursts occur in response to instruction cache misses, and may occur in response to a data cache miss. The processor incorporates an on-chip 4-deep read buffer which may be used to "queue up" the read response before passing it through to the high-bandwidth cache and execution core. Read buffering is appropriate in systems which require wait states between adjacent words of a block read. On the other hand, systems which use high-bandwidth memory techniques (such as memory interleaving) can effectively bypass the read buffer by providing words of the block at the processor clock rate. Note that the choice of burst vs. read buffering is independent of the initial latency of the memory; that is, burst mode can be used even if multiple wait states are required to access the first word of the block.

Single word reads are used for uncacheable references (such as I/O or boot code) and may be used in response to a data cache miss. The processor is capable of retiring a single word read in as few as two clock cycles.

**DMA Operations:** The R3051 family includes a DMA arbiter which allows an external agent to gain full control of the processor read and write interface. DMA is useful in systems which need to move significant amounts of data within memory (e.g. BitBlT operations) or move data between memory and I/O channels.

#### **R3051 BUS INTERFACE PIN DESCRIPTION**

This section describes the signals used in the above interfaces. Note that many of the signals have multiple definitions which are de-multiplexed either by the ALE signal or the  $\overline{Rd}$  and  $\overline{Wr}$  control signals. Note that signals indicated with an overbar are active low.

#### Address and Data Path

#### A/D(31:0) **I/O**

Address/Data: A 32-bit, time multiplexed bus which indicates the desired address for a bus transaction in one cycle, and which is used to transmit data between this device and external memory resources on other cycles.

Bus transactions on this bus are logically separated into two phases: during the first phase, information about the transfer is presented to the memory system to be captured using the ALE output. This information consists of:

- Address(31:4): The high-order address for the transfer is presented.
- $\overline{BE(3:0)}$: These strobes indicate which bytes of the 32-bit bus will be involved in the transfer. $\overline{BE(3)}$ indicates that AD(31:24) is used; $\overline{BE(2)}$ indicates that AD(23:16) is used; $\overline{BE(1)}$ indicates that AD(15:8) is used; and $\overline{BE(0)}$ indicates that AD(7:0) is used.

During write cycles, the bus contains the data to be stored and is driven from the internal write buffer. On read cycles, the bus receives the data from the external resource, in either a single word transaction or in a burst of four words, and places it into the on-chip read buffer.
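The byte-enable mapping described above can be sketched as a small helper. This is an illustration, not part of the R3051 documentation: the function name is hypothetical, and the four strobes are passed as a 4-bit integer in which a 0 bit means "asserted", since the strobes are active low.

```python
def active_byte_lanes(be_n):
    """Return the A/D bit ranges selected by the active-low strobes BE(3:0).

    be_n is a 4-bit value; bit i holds the level of BE(i), 0 = asserted.
    """
    lanes = []
    for i in range(3, -1, -1):
        if not (be_n >> i) & 1:          # low level: this byte takes part
            lanes.append((8 * i + 7, 8 * i))
    return lanes


print(active_byte_lanes(0b0000))  # full word transfer
print(active_byte_lanes(0b1110))  # only BE(0) asserted
```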

#### Addr(3:2) O

Low Address (3:2): A 2-bit bus which indicates which word is currently expected by the processor. Specifically, this two bit bus presents either the address bits for the single word to be transferred (writes or single word reads) or functions as a two bit counter starting at '00' for burst read operations.
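The behaviour of Addr(3:2) can be illustrated by listing the word addresses a memory system would see for each kind of read. The sketch assumes byte addresses as input; the function name is hypothetical.

```python
def read_word_addresses(byte_address, burst):
    """Return the word-aligned addresses presented during a read.

    For a burst, Addr(3:2) counts 0, 1, 2, 3 from the start of the
    four-word block; otherwise it carries bits 3:2 of the address.
    """
    if burst:
        block_base = byte_address & ~0xF
        return [block_base | (count << 2) for count in range(4)]
    return [byte_address & ~0x3]


for address in read_word_addresses(0x1FC00038, burst=True):
    print(hex(address))
```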

#### **Read and Write Control Signals**

#### ALE O

Address Latch Enable: Used to indicate that the A/D bus contains valid address information for the bus transaction. This signal is used by external logic (transparent latches) to capture the address for the transfer.

#### DataEn O

Data Input Enable: This signal indicates that the AD bus is no longer being driven by the processor during read cycles, and thus the external memory system may enable the drivers of the memory system onto this bus without having a bus conflict occur. During write cycles, or when no bus transaction is occurring, then this signal is negated.

#### Burst/WrNear O

**Burst Transfer:** On read transactions, this signal indicates that the current bus read is requesting a block of four contiguous words from memory (a burst read). This signal is asserted only in read cycles due to cache misses; it is asserted for all I-Cache miss read cycles, and for D-Cache miss read cycles if selected at device reset time.

**Write Near:** On write transactions, this output tells the external memory system that the bus interface unit is performing back-to-back write transactions to an address within the same 256 entry memory "page" as the prior write transaction. This signal is useful in memory systems which employ page mode or static column DRAMs.
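The "same page" test behind WrNear can be sketched as a comparison of page numbers. The sketch assumes that a "256 entry" page means 256 words of 4 bytes (1 Kbyte); this interpretation, and the function name, are assumptions rather than statements from this manual.

```python
PAGE_WORDS = 256          # "256 entry" page, taken here to mean words
BYTES_PER_WORD = 4
PAGE_BYTES = PAGE_WORDS * BYTES_PER_WORD


def is_write_near(previous_write, this_write):
    """True if two byte addresses fall in the same 256-word page."""
    return previous_write // PAGE_BYTES == this_write // PAGE_BYTES


print(is_write_near(0x1000, 0x13FC))  # same page
print(is_write_near(0x13FC, 0x1400))  # crosses a page boundary
```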

#### Rd O

**Read:** An output which indicates that the current bus transaction is a read.

#### Wr O

**Write:** An output which indicates that the current bus transaction is a write.

#### ACK I

**Acknowledge:** An input which indicates to the device that the memory system has sufficiently processed the bus transaction, and that the processor may either advance to the next write buffer entry (writes) or release the execution core to process the read data (reads).

#### RdCEn I

**Read Buffer Clock Enable:** An input which indicates to the device that the memory system has placed valid data on the AD bus, and that the processor may move the data into the on-chip Read Buffer.

#### BusError I

**Bus Error:** Input to the bus interface unit to terminate a bus transaction due to an external bus error. This signal is only sampled during read and write operations. If the bus transaction is a read operation, then the CPU will also take a bus error exception.

#### **Status Information**

#### Diag(1) O

**Diagnostic Pin 1.** This output indicates whether the current bus read transaction is due to an on-chip cache miss, and also presents part of the miss address. The value output on this pin is time multiplexed:

- **Cached:** During the phase in which the A/D bus presents address information, this pin is an active high output which indicates whether the current read is a result of a cache miss. The value of this pin at this time in other than read cycles is undefined.
- **Miss Address (3):** During the remainder of the read operation, this output presents address bit (3) of the address the processor was attempting to reference when the cache miss occurred. Regardless of whether a cache miss is being processed, this pin reports the transfer address during this time.

#### Diag(0) O

**Diagnostic Pin 0.** This output distinguishes cache misses due to instruction references from those due to data references, and presents the remaining bit of the miss address. The value output on this pin is also time multiplexed:

- **I/D:** If the "Cached" pin indicates a cache miss, then a high on this pin at this time indicates an instruction reference, and a low indicates a data reference. If the read is not due to a cache miss but rather an uncached reference ("Cached" is negated), then this pin is undefined during this phase.
- **Miss Address (2):** During the remainder of the read operation, this output presents address bit (2) of the address the processor was attempting to reference when the cache miss occurred. Regardless of whether a cache miss is being processed, this pin reports the transfer address during this time.
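The time multiplexing of the two diagnostic pins can be summarized with a short decoding sketch, for example as it might be used in a logic analyzer post-processing script. The function names and the returned strings are hypothetical; pin levels are passed as 0 or 1.

```python
def decode_address_phase(diag1, diag0):
    """Classify a read from Diag(1:0) sampled during the address phase."""
    if not diag1:                 # "Cached" negated: Diag(0) is undefined
        return "uncached read"
    return "instruction cache miss" if diag0 else "data cache miss"


def decode_data_phase(diag1, diag0):
    """Return bits (3:2) of the miss address from the rest of the read."""
    return (diag1 << 1) | diag0


print(decode_address_phase(1, 1))
print(decode_data_phase(1, 0))
```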

#### **DMA Arbiter Interface**

These signals are involved when the processor exchanges bus mastership with an external agent.

#### BusReq I

**DMA Arbiter Bus Request:** An input to the device which requests that the processor tri-state its bus interface signals so that they may be driven by an external master. The negation of this input releases the bus back to the R3051/52.

#### BusGnt O

**DMA Arbiter Bus Grant.** An output from the R3051/52 used to acknowledge that a  $\overline{\text{BusReq}}$  has been detected, and that the bus is relinquished to the external master.

#### **READ TRANSACTIONS**

The majority of the execution engine read requests are never seen at the memory interface, but rather are satisfied by the internal cache resources of the processor. Only in the cases of uncacheable references or cache misses do read transactions occur on the bus.

In general, there are only two types of read transactions: quad word reads and single word reads. Note that partial word reads of less than 32 bits can be thought of as a simple subset of the single word read, with only some of the byte enable strobes asserted.

Quad word reads occur only in response to cache misses. All instruction cache misses are processed as quad word reads; data cache misses may be processed as quad word reads or single word reads, depending on the mode selection made during reset initialization of the processor.

In processing reads, there are two parameters of interest. The first parameter is the initial latency to the first word of the read. This latency is influenced by the overall system architecture as well as the type of memory system being addressed: the time required to perform address decoding and bus arbitration, memory pre-charge requirements, and memory control requirements, as well as the memory access time. The initial latency is the only parameter of interest in single word reads.

The second parameter of interest (only in quad word refills) is the repeat rate of data; that is, the time required for subsequent words to be returned to the processor. Factors which influence the repeat rate include the memory system architecture, the types and speeds of devices used, and the sophistication of the memory controller: memory interleaving and the use of faster devices both serve to increase the repeat rate (that is, to minimize the amount of time between adjacent words).
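The two parameters combine in a simple way, as the sketch below shows. It is a simplified model under stated assumptions: the initial latency is counted in clock cycles up to delivery of the first word, each of the remaining three words follows after a fixed number of cycles, and any time the processor needs to empty its read buffer afterwards is ignored. The function name and the example values are illustrative.

```python
def quad_word_read_cycles(initial_latency, repeat_cycles):
    """Bus cycles to deliver four words: first word, then three repeats."""
    return initial_latency + 3 * repeat_cycles


# Same initial latency, two different repeat rates.
print(quad_word_read_cycles(initial_latency=5, repeat_cycles=2))
print(quad_word_read_cycles(initial_latency=5, repeat_cycles=1))
```

Under this model the repeat rate only matters for block refills; a single word read depends on the initial latency alone.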

The R3051 family has been designed to accommodate a wide variety of memory system designs, including no wait state operations (first word available in two cycles) and true burst operation (adjacent words every clock cycle), through simpler, slower systems incorporating many bus wait states to the first word and multiple clock cycles between adjacent words (this is accomplished by use of the on-chip read buffer). The R3721 DRAM controller supports these various schemes, according to the memory configuration under its control.

#### **READ INTERFACE TIMING OVERVIEW**

The read interface is designed to allow a variety of memory strategies. An overview of how data is transmitted from memory and I/O devices to the processor is discussed below. Note that multiplexing the address and data bus does not slow down read transactions: the address is on the A/D bus for only one-half clock cycle, so the data drivers can be enabled quickly; memory and I/O devices initiate their transfers based on addressing and chip enable, not on the availability of the bus. Thus, memory does not need to "wait" for the bus, and no performance penalty occurs.

#### **Memory Addressing**

A read transaction begins when the processor asserts its  $\overline{Rd}$  control output, and also drives the address and other control information onto the A/D and memory interface bus. Figure 2.1 illustrates the start of a processor read transaction, including the addressing of memory and the bus turn around.

The addressing occurs in a half-cycle of the $\overline{SysClk}$ output. At the rising edge of $\overline{SysClk}$, the processor will drive the read target address onto the A/D bus. At this time, ALE will also be asserted, to allow an external transparent latch to capture the address. Depending on the system design, address decoding could occur in parallel with address de-multiplexing (that is, the decoder could start on the assertion of ALE, and the output of the decoder captured by ALE), or could occur on the output side of the transparent latches. During this phase, $\overline{DataEn}$ will be held high indicating that memory drivers should not be enabled onto the A/D bus.

Figure 2.1 Start of Processor Read

Concurrent with driving addresses on the A/D bus, the processor will indicate whether the read transaction is a quad word read or single word read, by driving  $\overline{\text{Burst}}$  to the appropriate polarity (low for a quad word read). If a quad word read is indicated, the Addr(3:2) lines will drive '00' (the start of the block); if a single word (or subword) is indicated, the Addr(3:2) lines will indicate the word address for the transfer. The functioning of the counter during quad words is described later.

#### **Bus Turn Around**

Once the A/D bus has presented the address for the transfer, it is "turned around" by the processor to accept the incoming data. This occurs in the second phase of the first clock cycle of the read transaction, as illustrated in Figure 2.1.

The processor turns the bus around by carefully performing the following sequence of events:

- It negates ALE, causing the transparent address latches to capture the contents of the A/D bus.
- It disables its output drivers on the A/D bus, allowing it to be driven by an external agent. The processor design guarantees that ALE is negated prior to tri-stating the A/D bus.
- The processor then asserts $\overline{DataEn}$, to indicate that the bus may now be driven by the external memory resource. The processor design ensures that the A/D bus is released prior to $\overline{DataEn}$ being asserted. $\overline{DataEn}$ may be directly connected to the output enable of external memory, and no bus conflicts will occur.

Thus, the processor A/D bus is ready to be driven by the end of the second phase of the read transaction. At this time, it begins to look for the end of the read cycle.

#### Bringing Data into the Processor

Regardless of whether the transfer is a quad word read or a single word transfer, the basic mechanism for transferring data presented on the A/D bus into the processor is the same.

Although there are two control signals involved in terminating read operations, only the RdCEn signal is used to cause data to be captured from the bus.

The memory system asserts  $\overline{RdCEn}$  to indicate to the processor that it has (or will have) data on the A/D bus to be sampled. The earliest that  $\overline{RdCEn}$  can be detected by the processor is the rising edge of  $\overline{SysClk}$  after it has turned the bus around (start of phase 1 of the second clock cycle of the read).

If  $\overline{\text{RdCEn}}$  is detected as asserted (with adequate setup and hold time to the rising edge of  $\overline{\text{SysClk}}$ ), the processor will capture (with proper setup and hold time) the contents of the A/D bus on the immediately subsequent falling edge of  $\overline{\text{SysClk}}$ . This captures the data in the internal read buffer for later processing by the execution core/cache subsystem.

Figure 2.2 illustrates the sampling of data by an R3051/52.



Figure 2.2 Data Sampling by R3051

#### Terminating the Read

There are actually three methods for the external memory system to terminate an ongoing read operation:

- It can supply an  $\overline{ACK}$  (acknowledge) to the processor, to indicate that it has sufficiently processed the read request and has or will supply the requested data in a timely fashion. Note that  $\overline{ACK}$  may be signalled to the processor "early", to enable it to begin processing the read data even while additional data is brought from the A/D bus. This is applicable only in burst read operations.
- It can supply a BusError to the processor, to indicate that the requested data transfer has "failed" on the bus, and force the processor to take a bus error exception. Although the system interface behavior of the processor when BusError is presented is identical to the behavior when ACK is presented, no data will actually be written into the on-chip cache. Rather, the cache line will either remain unchanged, or will be invalidated by the processor, depending on how much of the read has already been processed.
- The external memory system can supply the requested data, using  $\overline{RdCEn}$  to enable the processor to capture data from the bus. The processor will "count" the number of times  $\overline{RdCEn}$  is sampled as asserted; once the processor counts that the memory system has returned the desired amount of data (one word or four words), it will implicitly "acknowledge" the read at the same time that it samples the last required  $\overline{RdCEn}$ . This approach leads to a simpler memory design at the cost of lower performance.
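The third method, in which the processor counts assertions of RdCEn, can be illustrated with a small behavioral sketch. This is a simplified model under stated assumptions: each list entry represents one rising edge of SysClk after the bus has been turned around, and the function name `implicit_ack_edge` is hypothetical. ACK and BusError are not modeled.

```python
def implicit_ack_edge(rdcen_samples, burst):
    """Return the index of the SysClk rising edge on which the read is
    implicitly acknowledged, or None if too little data was returned.

    rdcen_samples: list of bools, True where RdCEn is sampled asserted.
    burst:         True for a quad word read, False for a single word read.
    """
    words_needed = 4 if burst else 1
    words_seen = 0
    for edge, asserted in enumerate(rdcen_samples):
        if asserted:
            words_seen += 1
            # The read is acknowledged on the same edge that samples
            # the last required RdCEn.
            if words_seen == words_needed:
                return edge
    return None
```

In this model, a single word read with two bus wait cycles is described by `[False, False, True]` and is acknowledged on edge 2; a quad word read needs four asserted samples, with or without wait cycles between them.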

The R3721 always uses a properly timed ACK to terminate quad word reads. There are actually two phases of terminating the read: there is the phase where the memory system indicates to the processor that it has sufficiently processed the read request, and the internal read buffer can be released to begin refilling the internal caches; and there is the phase in which the read control signals are negated by the processor bus interface unit. The difference between these phases is due to block refill: the R3721 "releases" the execution core even though additional words of the block are still required; in that case, the processor will continue to assert the external read control signals until all four words are brought into the read buffer, while simultaneously refilling/executing based on the data already brought on board.

Figure 2.3 shows the timing of the control signals when the read cycle is being terminated.

#### **READ TIMING DIAGRAMS**

This section illustrates a number of timing diagrams applicable to R3051 family read transactions. These diagrams reference AC parameters whose values are contained in the R3051/52 data sheet.

#### Single Word Reads

Figure 2.4 illustrates the case of a single word read. In this figure, two bus wait cycles were required before the data was returned. Thus, two rising edges of  $\overline{SysClk}$  occurred where neither  $\overline{RdCEn}$  nor  $\overline{ACK}$  was asserted. On the third rising edge of  $\overline{SysClk}$ ,  $\overline{RdCEn}$  was asserted. Optionally,  $\overline{ACK}$  could also be asserted at this time, although it is not strictly necessary.

#### **Quad Word Reads**

Figure 2.5 (a, b) illustrates a block read in which bus wait cycles are required before the first word is brought to the processor, but in which additional words can be brought in at the processor clock rate. Thus, as with the no wait cycle operation,  $\overline{ACK}$  is returned simultaneously with the first  $\overline{RdCEn}$ . Figure 2.5 (a) illustrates the start of the block read, including initial wait cycles to the first word; Figure 2.5 (b) illustrates the activity which occurs as data is brought onto the chip and the read is terminated. The use of memory interleaving in the DRAM subsystem allows true burst operation.

Figure 2.3 End of Read

Figure 2.6 (a, b) illustrates a block read in which bus wait cycles are required before the first word is returned, and in which wait cycles are required between subsequent words: Figure 2.6 (a) illustrates the first two words of the block being brought on chip; Figure 2.6 (b) illustrates the last two words of the read, including the optimum timing of  $\overline{\text{ACK}}$ , and the negation of the read control signals.

In this diagram, the R3721 returns  $\overline{ACK}$  according to when the processor will empty the read buffer. The R3721 determines the optimal cycle to assert  $\overline{ACK}$ , allowing the CPU to restart even while data is read from the DRAMs. The timing of  $\overline{ACK}$  insures that the last data word is returned to the processor before it is emptied from the read buffer.



Figure 2.4 Single Word Read Cycle



Figure 2.5 (a) Start of Burst Quad Word Read



Figure 2.5 (b) End of Burst Quad Word Read



Figure 2.6 (a) Start of Throttled Quad Word Read





#### WRITE INTERFACE

The design goal of the write interface was to achieve two things:

- Insure that a relatively slow write cycle does not unduly degrade the performance of the processor. To this end, a four deep write buffer has been incorporated on chip. The role of the write buffer is to decouple the speed of the memory interface from the speed of the execution engine. The write buffer captures store information (data, address, and transaction size) from the processor at its clock rate, and later presents it to the memory interface at the rate it can perform the writes. Four such buffer entries are incorporated, thus allowing the processor to continue execution even when performing a quick succession of writes. Only when the write buffer is filled must the processor stall; simulations have shown that significantly less than 1% of processor clock cycles are lost to write buffer full stalls.
- Allow the memory system to optimize for fast writes. To this end, a number of design decisions were made: the WrNear signal is provided to allow page mode writes to be used in even simple memory systems; the A/D bus presents the data to be written in the second phase of the first clock cycle of a write transaction; and writes can be performed in as few as two clock cycles.
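The decoupling role of the four deep write buffer can be illustrated with a simplified queue model. This sketch is not a description of the actual R3051 implementation: it assumes that writes are retired strictly in order, that each write occupies the bus for a fixed number of cycles, and that reads never compete for the bus. The name `count_stall_cycles` and its parameters are hypothetical.

```python
from collections import deque

def count_stall_cycles(store_cycles, write_latency, depth=4):
    """Count CPU cycles lost to write-buffer-full stalls (simplified model).

    store_cycles:  sorted cycle numbers at which the CPU would issue a
                   store if it never stalled.
    write_latency: bus cycles the memory needs to retire one write.
    depth:         number of write buffer entries (four on the R3051).
    """
    retire_times = deque()  # cycle at which each buffered write is retired
    stall = 0               # stall cycles accumulated so far
    bus_free = 0            # cycle at which the bus can start the next write
    for want in store_cycles:
        now = want + stall
        # Entries already retired by 'now' no longer occupy the buffer.
        while retire_times and retire_times[0] <= now:
            retire_times.popleft()
        if len(retire_times) == depth:
            # Buffer full: the CPU waits for the oldest entry to retire.
            wait = retire_times.popleft() - now
            stall += wait
            now += wait
        start = max(now, bus_free)
        bus_free = start + write_latency
        retire_times.append(bus_free)
    return stall
```

Under these assumptions, four back-to-back stores never stall regardless of memory speed, while a fifth back-to-back store into a memory with a 10 cycle write must wait for the first write to retire.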

Although it may be counter-intuitive, a significant percentage of the bus traffic will in fact be processor writes to memory. This can be demonstrated if one assumes the following:

#### Instruction Mix:

| Operation Type           | Share of Instructions |
|--------------------------|-----------------------|
| ALU Operations           | 55%                   |
| Branch Operations        | 15%                   |
| Load Operations          | 20%                   |
| Store Operations         | 10%                   |

**Cache Performance**

| Cache                    | Hit Rate |
|--------------------------|----------|
| Instruction Hit Rate     | 98%      |
| Data Hit Rate            | 96%      |
Under these assumptions, in 100 instructions, the processor would perform:

2 Reads to process instruction cache misses on instruction fetches

 $4\% \times 20 = 0.8$  reads to process data cache misses on loads

10 store operations to the write through cache

Total: 2.8 reads and 10 writes

Thus, in this example, over 75% of the bus transactions are write operations, even though only 10 instructions were store operations, vs. 100 instruction fetches and 20 data fetches. Thus, it is appropriate to optimize the DRAM subsystem for page mode write operations.
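The arithmetic of this example can be reproduced with a few lines of code, which also makes it easy to try other instruction mixes or cache hit rates. The function name `bus_traffic` and its parameter names are illustrative; the model assumes, as the text does, one instruction fetch per instruction and a write-through cache in which every store reaches the bus.

```python
def bus_traffic(instructions, load_frac, store_frac, i_miss, d_miss):
    """Estimate bus reads and writes for a given instruction mix.

    instructions: number of instructions executed.
    load_frac:    fraction of instructions that are loads.
    store_frac:   fraction of instructions that are stores.
    i_miss:       instruction cache miss rate (1 - hit rate).
    d_miss:       data cache miss rate (1 - hit rate).
    Returns (reads, writes, write_share).
    """
    i_miss_reads = instructions * i_miss              # instruction fetch misses
    d_miss_reads = instructions * load_frac * d_miss  # load misses
    writes = instructions * store_frac                # write-through stores
    reads = i_miss_reads + d_miss_reads
    return reads, writes, writes / (reads + writes)
```

Calling `bus_traffic(100, 0.20, 0.10, 0.02, 0.04)` gives approximately 2.8 reads and 10 writes, so writes account for about 78% of the bus transactions.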

#### **TYPES OF WRITE TRANSACTIONS**

Unlike instruction fetches and data loads, which are usually satisfied by the on-chip caches and thus are not seen at the bus interface, all write activity is seen at the bus interface as single write transactions. There is no such thing as a "burst write"; the processor performs a word or subword write as a single autonomous bus transaction; however, the WrNear output does allow successive write transactions to be processed using page mode writes. This is particularly important when "flushing" the write buffer before performing a data read.

In processing writes, there is only one parameter of interest: the latency of the write. This latency is influenced by the overall system architecture as well as the type of memory system being addressed: time required to perform address decoding and bus arbitration, memory pre-charge requirements, and memory control requirements, as well as memory access time. WrNear may be used to reduce the latency of successive write operations. In addition, WrNear may be used to bypass the address decoder; if the memory controller retired the last transaction, and WrNear is asserted for this write, then obviously that memory controller will also be responsible for this transaction, and the system does not need to wait for the output of the address decoder.

The R3051 family has been designed to accommodate a wide variety of memory system designs, including no wait cycle operations (write completed in two cycles) through simpler, slower systems incorporating many bus wait cycles.

#### WRITE INTERFACE TIMING OVERVIEW

The protocol for transmitting data from the processor to memory and I/O devices is discussed below.

#### Memory Addressing

A write transaction begins when the processor asserts its  $\overline{Wr}$  control output, and also drives the address and other control information onto the A/D and memory interface bus. Figure 2.7 illustrates the start of a processor write transaction, including the addressing of memory and presenting the store data on the A/D bus.

The addressing occurs in a half-cycle of the SysClk output. At the rising edge of SysClk, the processor will drive the write target address onto the A/D bus. At this time, ALE will also be asserted, to allow an external transparent latch to capture the address. Depending on the system design, address decoding could occur in parallel with address de-multiplexing (that is, the decoder could start on the assertion of ALE, and the output of the decoder captured by ALE), or could occur on the output side of the transparent latches. During this phase, WrNear will also be determined and driven out by the processor.

#### **Data Phase**

Once the A/D bus has presented the address for the transfer, the address is replaced on the A/D bus by the store data. This occurs in the second phase of the first clock cycle of the write transaction, as illustrated in Figure 2.7.

The processor enters the data phase by performing the following sequence of events:

- It negates ALE, causing the transparent address latches to capture the contents of the A/D bus.
- It internally captures the data in a register in the bus interface unit, and enables this register onto its output drivers on the A/D bus. The processor design guarantees that the ALE is negated prior to the address being removed from the A/D bus.

Thus, the processor A/D bus is driving the store data by the end of the second phase of the write transaction. At this time, it begins to look for the end of the write cycle.



Figure 2.7 Start of Write

#### Terminating the Write

There are only two methods for the external memory system to terminate a write operation:

- It can supply an ACK (acknowledge) to the processor, to indicate that it has sufficiently processed the write request, and the processor may terminate the write.
- It can supply a BusError to the processor, to indicate that the requested data transfer has "failed" on the bus. The system interface behavior of the processor when BusError is presented is identical to the behavior when ACK is asserted. In the case of writes terminated by BusError, no exception is taken, and the data transfer cannot be retried.

Figure 2.8 shows the timing of the control signals when the write cycle is being terminated.

#### WRITE TIMING DIAGRAMS

This section illustrates a basic write from an R3051 family CPU. The values for the AC parameters referenced are contained in the R3051 family data sheet.

#### **Basic Write**

Figure 2.9 illustrates the case of a basic write. In this figure, two bus wait cycles were required before the data was retired. Thus, two rising edges of  $\overline{SysClk}$  occurred where  $\overline{ACK}$  was not asserted. On the third rising edge of  $\overline{SysClk}$ ,  $\overline{ACK}$  was asserted, and the write operation was terminated.







Figure 2.9 Basic Write

#### DMA ARBITER INTERFACE

The R3051 family contains provisions to allow an external agent to remove the processor from its memory bus, and thus perform transfers (DMA). These provisions use the DMA arbiter to coordinate the external request for mastership with the CPU read and write interface.

The DMA arbiter interface uses a simple two signal protocol to allow an external agent to obtain mastership of the external system bus. Logic internal to the CPU synchronizes the external interface to the internal arbiter unit to insure that no conflicts occur between the internal synchronous requesters (read and write engines) and the external asynchronous (DMA) requester.

#### **INTERFACE OVERVIEW**

An external agent indicates the desire to perform DMA requests by asserting the  $\overline{\text{BusReq}}$  input to the processor. DMA requests have the highest priority; thus, once the request is detected, the external agent is guaranteed to gain mastership at the next arbitration.

The CPU indicates that the external DMA cycle may begin by asserting its  $\overline{BusGnt}$  output on the rising edge of  $\overline{SysClk}$  after  $\overline{BusReq}$  is detected with appropriate set-up time to the external rising edge of  $\overline{SysClk}$ . During DMA cycles, the processor holds the following memory interface signals in tri-state:

- A/D Bus
- Addr(3:2)
- Interface control signals: Rd, Wr, DataEn, Burst/WrNear, and ALE
- Diag(1:0)

In addition to tri-stating these signals, the CPU will ignore transitions on  $\overline{RdCEn}$ ,  $\overline{ACK}$ , and  $\overline{BusError}$  during DMA cycles.

Thus, the DMA master can use the same memory control logic as that used by the CPU; it may use  $\overline{\text{Burst}}$ , for example, to obtain a burst of data from the memory; it may use  $\overline{\text{RdCEn}}$  to detect whether the memory has satisfied its request, etc. Thus, DMA can occur at the same speed at which the R3051 family allows data transfers on its bus (a peak of one word per clock cycle). During DMA cycles, the processor will continue to operate out of cache until it requires the bus.

The external agent indicates that the DMA transfer has terminated by negating the BusReq input to the processor, which is sampled on the rising edge of SysClk. BusGnt is negated on a falling edge of SysClk, so that it will be negated before the assertion of Rd or Wr for a subsequent transfer. On the next rising edge of SysClk, the processor will resume driving tri-stated signals.

Note that there is no hardware coherency mechanism defined for DMA transfers relative to either the internal caches or the write buffer. Software must explicitly manage DMA transfers to insure that data conflicts are avoided. This is an appropriate trade-off for the vast majority of embedded applications.

### DMA ARBITER TIMING DIAGRAMS

These figures reference AC timing parameters whose values are contained in the R3051 family data sheet.

#### **Initiation of DMA Mastership**

Figure 2.10 shows the beginning of a DMA cycle. Note that if BusReq were asserted while the processor was performing a read or write operation, BusGnt would be delayed until the next bus slot after the read or write operation is completed.

To initiate DMA, the processor must detect the assertion of BusReq with proper set-up time to SysClk. Once BusReq is detected, and the bus is free, the processor will grant control to the requesting agent by asserting its BusGnt output, and tri-stating its output drivers, from a rising edge of SysClk. The bus will remain in the control of the external master until it negates BusReq, indicating that the processor is once again the bus master.

#### **Relinquishing Mastership Back to the CPU**

Figure 2.11 shows the end of a DMA cycle. The next rising edge of  $\overline{SysClk}$  after the negation of  $\overline{BusReq}$  is sampled may actually be the beginning of a processor read or write operation.

To terminate DMA, the external master must negate the processor BusReq input. Once this is detected (with proper setup and hold time), the processor will negate its BusGnt output on the next falling edge of SysClk. It will also reenable its output drivers. Thus, the external agent must disable its output drivers by this clock edge, to avoid bus conflicts.
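The request and release sequence described in this section can be sketched as a coarse, per-clock ownership model. This is an illustration only: it works at whole clock granularity and does not model the half-cycle timing of BusGnt or the AC parameters. The function name `bus_owner_trace` and its inputs are hypothetical.

```python
def bus_owner_trace(bus_req, cpu_busy):
    """Return the bus owner ('CPU' or 'DMA') for each clock (coarse model).

    bus_req[i]:  True if BusReq is sampled asserted at rising edge i.
    cpu_busy[i]: True if a CPU read or write occupies the bus in clock i.
    """
    owner = "CPU"
    trace = []
    for req, busy in zip(bus_req, cpu_busy):
        if owner == "CPU" and req and not busy:
            # Request detected and bus free: BusGnt is asserted and the
            # CPU tri-states its output drivers.
            owner = "DMA"
        elif owner == "DMA" and not req:
            # BusReq negated: BusGnt is negated and the CPU resumes
            # driving the bus.
            owner = "CPU"
        trace.append(owner)
    return trace
```

In this model, a request that arrives while a CPU read or write is in progress is granted only once the bus is free, and the bus returns to the CPU as soon as BusReq is sampled negated.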



Figure 2.10 DMA Mastership Request



Figure 2.11 Relinquishing DMA Mastership



# FUNDAMENTALS OF DRAM OPERATION

#### INTRODUCTION

The R3051 typically executes out of cache. However, if the data or instruction it is attempting to fetch is not available from the on-chip caches, it must be fetched from main memory, as shown in Figure 3.1. Also, since the R3051 cache uses a write-through policy, all writes appear on the memory bus.

The effect of main memory on CPU performance depends on both the cache size, and the program running on the processor. The locality of memory references and the number of writes will vary with each program. The effect of main memory on CPU performance will be more pronounced for programs with either low locality, a large proportion of writes, or a system with a small cache.

Typical cost-sensitive embedded applications utilize DRAMs for the processor main memory, based on the density/cost/power of these devices. However, DRAMs require special control in order to access their contents. These control actions are performed by the R3721. In order to better understand the R3721, a basic understanding of the DRAM control requirements is required. Those familiar with the basics of DRAM may skip this chapter.

### **DRAM ARCHITECTURE**

There are two common configurations of DRAMs: "Separate I/O", and "Common I/O". Separate I/O devices typically are "x1" DRAMs (a single data bit of I/O). These store 1 bit of data in N different locations and provide two pins, D and Q, for input and output data. Other DRAM configurations are "x4" and "x8", and use a common I/O structure. To conserve package size these use bi-directional data pins for both input and output data. Figure 3.1 shows the internal organization of a "x1" DRAM with separate I/O; Figure 3.2 shows the internal organization of a "x4" DRAM with common I/O.

In order to achieve high density, DRAMs are implemented using a capacitive memory cell. While this memory cell can be implemented as a much smaller cell than a typical SRAM cell, these memory cells do have a capacitive time constant whereby they discharge their value over a relatively small time. Thus, DRAMs must be refreshed periodically, as described below.

Further, DRAMs exhibit a "pre-charge" requirement for both  $\overline{RAS}$  and  $\overline{CAS}$ . During this time, internal bit lines are precharged back to a high level prior to sampling or writing a particular bit cell. The requirements of the multiplexed address interface, pre-charge and refresh requirements, and the timing associated with general DRAM control make the design of a DRAM subsystem relatively more complex than a standard SRAM or EPROM subsystem. The R3721 manages all of these requirements in an R3051-based system.

DRAMs are available today in various densities, speeds, and modes. Densities vary from 256 Kbits to 4 Mbits, with 1 Mbit devices being common due to price and density. The selection of density and depth depends upon the size of main memory and the number of banks supported. Since the R3051 is a 32-bit-wide CPU, each DRAM bank must be 32 bits wide. For example, when using 4M-by-1 devices, up to 16 Mbytes of main memory can be supported per bank. Conversely, when using 256K-by-4 devices, only 1 Mbyte of memory can be supported per bank.
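The per-bank capacities quoted above follow directly from the 32-bit bank width. The short calculation below is a sketch for checking other device organizations; the function name `bank_capacity` is illustrative, and the model assumes a bank built from identical devices with no parity bits.

```python
def bank_capacity(depth, width_bits, bus_bits=32):
    """Return (devices_per_bank, bytes_per_bank) for one 32-bit DRAM bank.

    depth:      number of addressable locations per device (e.g. 4 * 2**20).
    width_bits: data bits per device (1 for "x1", 4 for "x4").
    bus_bits:   width of the bank, 32 for an R3051 family CPU.
    """
    devices = bus_bits // width_bits          # devices needed to span the bus
    total_bits = devices * depth * width_bits
    return devices, total_bits // 8

MBYTE = 2**20
```

With these definitions, `bank_capacity(4 * 2**20, 1)` gives 32 devices and 16 Mbytes, and `bank_capacity(256 * 2**10, 4)` gives 8 devices and 1 Mbyte.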










DRAM speeds are typically measured as address-to-data access times, which can vary from 150ns to 60ns and faster. Besides offering various access times, DRAMs are also available in various data access modes, such as Nibble mode, Page Mode, and Static Column mode.

#### **Normal Access**

Accessing data in DRAM differs from accessing data in an SRAM. DRAM uses a multiplexed addressing arrangement, which allows for a smaller package size but increases the complexity of the interface. The system must provide a row address followed by a column address. The row address is latched by the Row Address Strobe ( $\overline{RAS}$ ), while the column address is latched by the Column Address Strobe ( $\overline{CAS}$ ). To meet pre-charge requirements, both  $\overline{RAS}$  and  $\overline{CAS}$  must be kept inactive for a minimum pre-charge delay after a read cycle. Figure 3.3 shows DRAM access timing.

Note that this multiplexed address also corresponds well to the internal organization of the DRAM.

#### Page Mode and Static Column Accesses

Both Static Column and Page Mode DRAMs support a normal (slow) first access in which  $\overline{RAS}$  and  $\overline{CAS}$  are asserted as in a normal DRAM. At the end of the cycle,  $\overline{RAS}$  and/or  $\overline{CAS}$  can stay low, thus eliminating both the precharge requirement and the address multiplexing delay for subsequent accesses. In page-mode devices, row and column addresses are strobed into the DRAM on the first access, and only  $\overline{CAS}$  needs to be recycled for subsequent accesses which share the same Row address as the first access. However, the setup, hold, transition, and pre-charge times for  $\overline{CAS}$  must still be met. In Static Column mode, the setup, hold, transition and pre-charge delays associated with  $\overline{CAS}$  are eliminated by keeping  $\overline{CAS}$  as well as  $\overline{RAS}$  low after the first access. However, in Static Column mode the output buffers of the DRAM are enabled, thus consuming more power than Page Mode devices.

The use of page mode or static column mode allows high-bandwidth from relatively slow devices. In order to reduce power consumption of the system while maintaining high performance, the R3721 uses page mode accesses rather than static column. Using page mode allows fast writes, as well as high-performance single and quad word reads.

Figure 3.4 shows a series of DRAM page mode accesses.

#### **DRAM Refresh and Pre-charge**

Since DRAMs use a capacitive device to hold a memory bit, each DRAM cell must be periodically refreshed to ensure the cell does not lose its charge. Most DRAMs require a complete refresh every 4 to 8 ms. Most DRAMs reserve the most significant bit as a select bit for an internal multiplexer that selects data in two banks of arrays; in these DRAMs, the lower address bits access memory cells in both banks simultaneously. This halves the number of refresh cycles needed. For example, a DRAM with 1024 rows now only needs 512 refreshes within the refresh period.
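As a worked example of what these figures imply for a controller, the average time between refresh cycles can be computed from the refresh period and the number of refresh cycles required. The sketch assumes evenly distributed (one row at a time) refresh, which is an assumption of this example rather than a statement about how the R3721 schedules refresh; the function name `refresh_interval_us` is illustrative.

```python
def refresh_interval_us(refresh_cycles, period_ms):
    """Average time between refresh cycles, in microseconds.

    refresh_cycles: number of refresh cycles required per refresh period.
    period_ms:      refresh period of the DRAM in milliseconds.
    """
    return period_ms * 1000 / refresh_cycles
```

For a device needing 512 refresh cycles within 8 ms, `refresh_interval_us(512, 8)` gives 15.625, that is, one refresh cycle roughly every 15.6 microseconds.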







Figure 3.4 Page Mode DRAM Access

A variety of methods are used for refreshing DRAMs; the two most commonly used are  $\overline{CAS}$ -Before- $\overline{RAS}$  and  $\overline{RAS}$ -only refresh. The  $\overline{CAS}$ -before- $\overline{RAS}$  method does not require an externally-generated Row address; instead, the DRAM uses its own address counters. With  $\overline{RAS}$ -only refresh the refresh address must also be generated externally and thus requires additional components for the refresh address counter. Since virtually all modern DRAMs support  $\overline{CAS}$ -before- $\overline{RAS}$  refresh, this is the technique used by the R3721.

Another requirement of DRAM is pre-charge after every read operation. Performing a memory-read operation from a DRAM cell causes its capacitor to discharge slightly. To remedy this, data must be written back into the cell after each read operation. This write-back operation is called pre-charge and is automatically handled by the DRAM. However, the DRAM controller must insure that the DRAM  $\overline{RAS}$  and  $\overline{CAS}$  pre-charge timing requirements are met for proper operation.

### **MEMORY SYSTEM CONFIGURATIONS**

A DRAM memory system can be designed in a number of different ways. Two of the major ways are: interleaved and non-interleaved.

An interleaved memory system works by dividing the memory system into two or more arrays. Consecutive addresses are distributed amongst the arrays: for example, in a two-way interleaved memory system, even word addresses reside within one bank, while odd word addresses reside in another bank.

Interleaved memory systems offer the advantage of higher bandwidth during multi-word transactions. For example, if, by using page mode, a particular system can read a new value from each DRAM once every 80ns (including access time, prop delay, set-up time, and  $\overline{CAS}$  precharge), then using two way interleaving allows two new data values to be read each 80ns, or a new value every 40ns. For a 25MHz CPU, this is the maximum data rate of the CPU. Thus, two way interleaving is used to double the bandwidth from the memory to the CPU. Figure 3.5 illustrates a two-way interleaved memory system. The concept of interleaving can be extended to 4, 8, or n-way interleaved memory systems.
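The bandwidth argument above can be checked numerically. The following sketch uses hypothetical helper names and assumes word-aligned addresses, ideal overlap between the arrays, and a number of ways that evenly divides the access stream; it shows both the assignment of word addresses to arrays and the resulting time per word.

```python
def array_for_address(address, ways):
    """Index of the interleaved array holding the word at a byte address."""
    return (address >> 2) % ways     # consecutive words rotate through arrays

def word_period_ns(dram_cycle_ns, ways):
    """Average time per word when 'ways' arrays are accessed in overlap."""
    return dram_cycle_ns / ways

def cpu_clock_period_ns(mhz):
    """Clock period of the CPU in nanoseconds."""
    return 1000 / mhz
```

With an 80ns page mode cycle, `word_period_ns(80, 2)` gives 40.0, which matches `cpu_clock_period_ns(25)`, the peak rate of one word per clock for a 25MHz CPU. In the two-way case, `array_for_address` places even word addresses in array 0 and odd word addresses in array 1.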

The disadvantage of interleaved memory is that it requires multiple 32-bit DRAM arrays, each with independent 32-bit data busses that are transceivered/multiplexed onto a single CPU data bus. Thus, the minimum system configuration includes more memory (and thus cost) than the minimum configuration of a non-interleaved memory system. In addition, a memory data path for each memory array must be provided.



Figure 3.5 Interleaved Memory System

Independent of memory interleaving, multiple banks of memory can be under the control of a single DRAM controller and can use a single memory to CPU data path. In an interleaved system, each "bank" actually contains multiple 32-bit arrays of memory (for example, in a two-way interleaved system with two banks, there are two pairs of even and odd memory arrays; both even arrays use a single data path). While distinguishing between even and odd arrays in an interleaved memory system uses low order processor address bits, distinguishing between multiple banks of memory uses high-order bits. For the R3051 family, a quad word read will never cause a "bank crossing", and thus the relatively slow output enable/disable time of DRAMs is not a problem.

One might wonder why a system which includes two banks of DRAMs does not use interleaving to attain the benefit of higher bandwidth. There are a number of reasons such a decision might be made: some technical, and some business.

An interleaved memory system requires each 32-bit memory array to have its own dedicated data path to the DRAMs; this is because the tri-state enable/disable times of DRAMs are too slow to attain the desired bandwidth back to the CPU. In a banked system, DRAM output enable timing is not a critical parameter, and thus multiple banks can share the same data path with no performance penalty.

The other difficulty with interleaved systems comes from a business model. Interleaved systems require that the "base configuration" offer a minimum of two banks of memory, and that memory upgrades occur in pairs. A non-interleaved memory configuration can start with half as much memory in the base model, and a memory upgrade of only a single array can be offered. Thus, a less expensive base model can be offered, and less expensive upgrades can be offered. This may fit a particular marketing requirement of the machine.

#### SUMMARY

DRAMs offer advantages in terms of cost and density of memory. However, they also introduce complexity in their control and system interface. The R3721 automatically handles this interface between the R3051 and the DRAM sub-system, allowing the benefits of DRAMs to be attained with minimal cost and complexity to the system designer.



# **R3721 OPERATION OVERVIEW**

#### INTRODUCTION

The IDT79R3721 DRAM controller is a single chip DRAM controller for systems based on an IDT79R3051 family CPU. It provides all of the control timing to interface from the CPU address/data bus and control bus through to the DRAM address and control interface, and also provides control of the data path between the DRAMs and the CPU.

The R3721 has been designed to support a wide variety of DRAM subsystems across a wide frequency range of R3051 CPUs. This chapter is intended to provide an overview of these capabilities; subsequent chapters provide more in-depth details on how these features work, and the specific timing associated with various memory configurations.

### **R3051 BUS INTERFACE**

The R3721 is designed to reside directly on the R3051 family A/D and control busses. To complete the system design, an external address decoder is required, along with external data path chips such as the IDT73720 Bus Exchangers or IDT74FCT245 bi-directional transceivers.

Regardless of size or organization of DRAM, the R3721 is always connected to particular bits of the R3051 A/D bus. The R3721 uses programmed values for the DRAM size and configuration to internally multiplex R3051 address lines into the appropriate row and column addresses for the DRAM. Table 4.1 shows the internal multiplexing of addresses performed by the R3721. Table 4.2 shows the DRAM bank selection, and which RAS/CAS control signals are output.

| DRAM Address  | Interleaved | Non-Interleaved |
|---------------|-------------|-----------------|
| Column(8:0)   | A(11:3)     | A(10:2)         |
| Row(8:0)      | A(20:12)    | A(19:11)        |
| Bank Sel(1:0) | A21         | A(21:20)        |

Address assignment for 256K x 1 and 256K x 4 DRAMs

| DRAM Address  | Interleaved | Non-Interleaved |
|---------------|-------------|-----------------|
| Column(9:0)   | A(12:3)     | A(11:2)         |
| Row(9:0)      | A(22:13)    | A(21:12)        |
| Bank Sel(1:0) | A23         | A(23:22)        |

Address assignment for 1M x 1 and 1M x 4 DRAMs

| DRAM Address  | Interleaved | Non-Interleaved |
|---------------|-------------|-----------------|
| Column(10:0)  | A(13:3)     | A(12:2)         |
| Row(10:0)     | A(24:14)    | A(23:13)        |
| Bank Sel(1:0) | A25         | A(25:24)        |

Address assignment for 4M x 1 and 4M x 4 DRAMs

Table 4.1 Processor to DRAM Address Multiplexing
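The address assignments of Table 4.1 follow a regular pattern, which the sketch below reproduces in software. This is an illustration of the table only, not a description of the R3721 internal logic; the function names and the `ADDR_BITS` dictionary are hypothetical. The sketch does not return the processor address bit A2, which Table 4.1 leaves unassigned in interleaved mode.

```python
# Row/column address width for each DRAM depth in Table 4.1.
ADDR_BITS = {"256K": 9, "1M": 10, "4M": 11}

def field(value, lo, width):
    """Extract 'width' bits of 'value' starting at bit position 'lo'."""
    return (value >> lo) & ((1 << width) - 1)

def split_address(address, dram, interleaved):
    """Return (bank_sel, row, column) for a processor address per Table 4.1.

    address:     processor byte address.
    dram:        "256K", "1M" or "4M" (device depth).
    interleaved: True for an interleaved memory system.
    """
    n = ADDR_BITS[dram]
    col_lo = 3 if interleaved else 2        # column starts at A3 or A2
    row_lo = col_lo + n                     # row sits directly above column
    bank_lo = row_lo + n                    # bank select sits above the row
    bank_width = 1 if interleaved else 2    # one bit interleaved, two otherwise
    return (field(address, bank_lo, bank_width),
            field(address, row_lo, n),
            field(address, col_lo, n))
```

For example, for 256K devices in a non-interleaved system, the column is taken from A(10:2), the row from A(19:11), and the bank select from A(21:20), as listed in the first part of the table.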

| Bank Sel(1:0) | Non-Interleaved | Interleaved       |
|---------------|-----------------|-------------------|
| 00            | RAS(0)/CAS(3:0) | RAS(1:0)/CAS(3:0) |
| 01            | RAS(1)/CAS(3:0) | RAS(1:0)/CAS(3:0) |
| 10            | RAS(2)/CAS(3:0) | RAS(3:2)/CAS(3:0) |
| 11            | RAS(3)/CAS(3:0) | RAS(3:2)/CAS(3:0) |

Table 4.2. Bank Selection in Multi-Bank System
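The pattern in Tables 4.1 and 4.2 can be sketched in a few lines of Python. This is an illustrative model only, not a description of the R3721 internals: the names `ADDR_BITS`, `bits`, and `split_address` are invented for the example, and in the interleaved case the sketch simply assumes that the single bank-select bit listed in Table 4.1 chooses between the RAS(1:0) and RAS(3:2) pairs.

```python
# Row/column address width for each supported DRAM depth (from Table 4.1).
ADDR_BITS = {"256K": 9, "1M": 10, "4M": 11}


def bits(value, high, low):
    """Return the inclusive bit field value[high:low] as an integer."""
    return (value >> low) & ((1 << (high - low + 1)) - 1)


def split_address(addr, depth, interleaved):
    """Split a processor address into (row, column, asserted RAS indices)."""
    n = ADDR_BITS[depth]
    low = 3 if interleaved else 2          # lowest address bit used for the column
    column = bits(addr, low + n - 1, low)
    row = bits(addr, low + 2 * n - 1, low + n)
    if interleaved:
        # Assumption: the one bank-select bit picks a pair of RAS lines.
        pair = bits(addr, low + 2 * n, low + 2 * n)
        ras = (2 * pair, 2 * pair + 1)
    else:
        # Two bank-select bits pick exactly one RAS line (Table 4.2).
        ras = (bits(addr, low + 2 * n + 1, low + 2 * n),)
    return row, column, ras
```

For example, with 256K-deep DRAMs in a non-interleaved system, address 0x300004 decodes to row 0, column 1, and RAS(3).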

The R3721 monitors the processor ALE,  $\overline{Rd}$ ,  $\overline{Wr}$ , and  $\overline{Burst}/\overline{WrNear}$  control signals to determine the type of cycle in progress. The R3721 contains its own address latches, and aligns processor address outputs with DRAM Row and Column addresses.

If the external address decoder indicates that this transfer is intended for the DRAM sub-system, the R3721 generates the DRAM control sequence (using the timing programmed into the device during system boot). At the appropriate time, the DRAM controller returns the  $\overline{RdCEn}/\overline{ACK}$  handshake to the processor to indicate that the transaction is sufficiently completed.

The interface to  $\overline{ACK}$  and  $\overline{RdCEn}$  is performed using a tri-stateable output driver with an internal pull-up. This allows other tri-stateable sources to directly drive  $\overline{ACK}$  and  $\overline{RdCEn}$  without introducing combinatorial logic delays inherent in combining the acknowledgment of multiple memory subsystems.

#### **R3721 DRAM INTERFACE**

The R3721 has been designed to interface to a wide variety of DRAM subsystems. Various options include:

• Interleaved vs. Non-Interleaved

Interleaved memory subsystems offer higher system performance by providing higher bandwidth to the processor during quad word refills. However, an interleaved memory system requires a larger "base" amount of memory (two 32-bit arrays minimum) and a wider data path (one for each array, time multiplexed onto a single CPU bus).

The R3721 offers the system designer the flexibility to design either type of memory system. In fact, with proper planning, the system designer can offer a base model that does not perform memory interleaving, but allows field upgrades to perform interleaving (thus increasing both the memory and raw performance of the system).

• Various densities of DRAM

The R3721 allows the system designer to use DRAM densities from  $256K \times 1$  through  $4M \times 4$ . Thus, depending on the memory requirements of the application, the system designer can decide the appropriate memory subsystem for the application. In addition, the DRAM controller internally aligns the CPU address bus with the DRAM address lines; this allows a later field upgrade to increase the density of memory devices used without requiring jumpering of address lines. Table 4.1 shows the internal multiplexing of address lines which allows the R3721 to support varying densities of DRAMs, without changing its interface to the processor bus.

• Single bank or multiple banks of memory

The R3721 allows systems to be constructed with one to four banks of memory (each bank being a 32-bit-wide memory array), either interleaved or not. This allows various strategies for "field upgrades" of the DRAM memory sub-system.

The R3721 utilizes high-performance output drivers, and four sets of the RAS and CAS DRAM controls, to directly drive up to 36 DRAM devices without external drivers. The R3721 uses a high-power output driver with built-in series resistance to avoid the noise problems typically associated with driving large capacitive loads.

In addition to the capability to directly drive these large loads, the R3721 also allows the system designer to incorporate additional, external memory drivers if needed. The various timing options can be selected to accommodate the additional delay of buffer drivers in the DRAM subsystem.

The R3721 takes care of the particular case of partial writes.  $\overline{CAS}(3:0)$  are used to provide selective enabling of those DRAMs being written; that is, only those byte lanes involved in the write will have their corresponding  $\overline{CAS}$  signals asserted.

• Intelligent Control interface to take advantage of Page Mode DRAMs

The R3721 state machine was designed after extensive simulation of R3051 program behavior. Optimizations around typical locality of reference are included in the state machine for the R3051.

Figure 4.1 shows the basic state machine for the R3721. Note that it is optimized for series of page mode DRAM accesses.

Specifically, page mode is used for:

- Burst Refill

Page mode is used to obtain words within a quad word read. However, simulation has shown that the most likely next transfer is a single word write; thus,  $\overline{RAS}$  and  $\overline{CAS}$  are negated at the end of the burst refill to minimize the latency of subsequent operations due to  $\overline{RAS}$  precharge.

- Single Reads

After a single read, the DRAM controller will leave the DRAMs expecting a subsequent page mode access to the same page (either another read, a write, or a burst refill). The R3721 includes an on-chip page comparator which uses the DRAM density programmed into the device to determine whether or not a given access can take advantage of page mode.



Figure 4.1 R3721 State Machine

- Single Writes

After a single write, the DRAM controller will leave the DRAMs expecting a subsequent page mode access to the same page (either another write, a read, or a burst refill). The DRAM controller can use either WrNear, or its internal page comparator, to detect opportunity for page mode accesses.

Thus, the R3721 has truly been optimized to the operating environment of the R3051 based systems.
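The page-hit decision described above can be modeled as a comparison of every address bit above the column field. The sketch below is an assumption about what such a comparator must check, not the R3721's actual logic; `same_page` and its parameters are hypothetical names, and the column field position follows Table 4.1.

```python
def same_page(prev_addr, next_addr, column_bits, interleaved=False):
    """Return True if both addresses share the same bank and row.

    column_bits is 9, 10, or 11 for 256K-, 1M-, or 4M-deep DRAMs.
    """
    low = 3 if interleaved else 2       # lowest column address bit (Table 4.1)
    page_shift = low + column_bits      # first bit above the column field
    window = (1 << 26) - 1              # the controller only sees A/D(25:0)
    return ((prev_addr & window) >> page_shift) == ((next_addr & window) >> page_shift)
```

With 256K-deep devices in a non-interleaved system, addresses 0x1000 and 0x17FC fall in the same page, while 0x1000 and 0x1800 do not; with 1M-deep devices the page is twice as large and the second pair also shares a page.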

• Various speeds of DRAMs and Processors

The R3721 has been designed to support a wide range of processor frequencies, across a wide range of DRAM speeds. The system designer can configure varying times for the DRAM control signals. Programmable DRAM control parameters include:

- $\overline{RAS}$  to  $\overline{CAS}$  Delay

This allows the system designer to control a number of critical timings, including row address hold time from  $\overline{RAS}$  and the  $\overline{RAS}$  to  $\overline{CAS}$  delay requirements of the system.

- $\overline{RAS}$  and  $\overline{CAS}$  pulse widths

These parameters directly control the access time of the DRAM, and the resulting system performance.

- $\overline{RAS}$  and  $\overline{CAS}$  pre-charge times

These parameters allow the system designer to minimize the performance penalty of DRAM pre-charge, yet still ensure proper system operation.

- Refresh period

Depending on the system speed, the DRAM controller will be programmed with the appropriate counter value to ensure both proper refresh operation and that the maximum  $\overline{RAS}$  low time of the DRAM is not violated. The R3721 uses a  $\overline{CAS}$ -before- $\overline{RAS}$  refresh protocol to perform DRAM refresh.

- Address decode time

The DRAM controller can work in systems which can properly decode addresses within the first cycle of a transfer, for optimal performance. Alternatively, the DRAM controller can work with slower systems, requiring an extra half-cycle to perform proper address decoding.

• Various data path options

The R3721 directly controls the data path between the CPU and the DRAM sub-system. The R3721 can control either a set of IDT74FCT245s (for non-interleaved memory systems) or IDT73720s (for either multiple banked or interleaved memory configurations).

The R3721 allows this variety of options through the use of the on-chip mode register pictured in Figure 4.2. Subsequent chapters will discuss the fields of the Mode register, and the impact of the various options on system design.

| 15   | 14  | 13  | 12  | 11  | 10 | 9    | 8  | 7  | 6  | 5  | 4   | 3    | 2     | 1   | 0   |
|------|-----|-----|-----|-----|----|------|----|----|----|----|-----|------|-------|-----|-----|
| Rsvd | DCS | RF2 | RF1 | RF0 | CP | Rsvd | C0 | R2 | R1 | R0 | RCD | WrNr | Inlvd | DZ1 | DZ0 |

Figure 4.2 Mode Register of DRAM Controller

#### PIN DESCRIPTION

This section describes the signals used in the above interfaces. More detail on the actual use of these pins is found in other chapters. Note that signals indicated with an overbar are active low.

#### **R3051** Interface

### Reset

**Reset**: An active low input used to reset the DRAM controller state machines. Reset causes the R3721 to load the mode register with default values and to perform 16 CAS-before-RAS refresh cycles to initialize the DRAMs.

### A/D(25:0)

Address/Data(25:0): These signals are connected directly with A/D(25:0) of the R3051 family CPU. The DRAM controller uses these inputs to obtain:

| Field          | Use                                                                                                                             |
|----------------|---------------------------------------------------------------------------------------------------------------------------------|
| BE(3:0):       | Individual data byte enables used in write operations.                                                                          |
| Address(25:4): | Address bits used to select amongst banks of DRAMs, and Row and Column addresses, according to Tables 4.1 and 4.2.              |
| Data(15:0):    | During Mode register write operations, during the data phase the A/D bus carries the values to be written into the mode register. |

### Addr(3:2)

Low Order Address(3:2): These signals carry the word within quad word address currently expected by the processor. During single reads or writes, these inputs carry the specified address. During quad word reads, the DRAM controller uses an internal counter to manage word within quad word addressing, and thus ignores these inputs.

### ALE

Address Latch Enable: This signal is used to de-multiplex the A/D bus from address to data phase. The R3721 uses this signal to capture the current value of A/D(25:0) and Addr(3:2) during the address phase. The R3721 also uses this signal as the indication of the beginning of a memory transfer, and awaits the  $\overline{\text{CS}}$  input, according to the timing specified in the mode register.

### Rd

**Read**: Indicates that the current transfer is a read (single or burst).

### Wr

Write: Indicates that the current transfer is a write (near or not).

#### **Burst/WrNear**

Burst: During reads, this signal functions as the "Burst" indicator. If Burst is asserted during a read, the R3721 knows that a quad word read sequence is expected.

WrNear: During writes, this signal functions as the "Write Near" indicator. If the DRAM controller state machine is in the "IDLE,  $\overline{RAS}$  asserted" state, it may use this signal to process the write in two cycles.

### SysClk

**System Clock**: This is the master timing reference, and is a direct connection from the  $\overline{SysClk}$  output of the R3051 family processor. All timing events are referenced to the  $\overline{SysClk}$  input.

### CS

**DRAM Chip Select**: This input is provided by the external address decoder, and is used to indicate that this R3721 controls the DRAM responsible for processing this transfer. The R3721 uses the programmed value in the Mode Register to determine when to sample this input.

### **MSel**

**Mode Register Select**: This input is provided by the external address decoder, and is used to indicate that this transfer targets the internal mode register of the R3721. To write to the mode register, both  $\overline{CS}$  and  $\overline{MSel}$  must be asserted by the external address decoder.

### RdCEn

**Read Buffer Clock Enable**: This output to the R3051 processor indicates that the currently requested word will be available on its A/D bus at the next sampling clock edge (falling edge of SysClk).

This output is a tri-stateable output; it is only driven by the R3721 in transfers in which its  $\overline{CS}$  input is asserted at the proper time. It is internally pulled up, so that no external pull-up resistor is required.

### ACK

**Acknowledge**: This output to the R3051 family processor indicates that the R3721 has sufficiently processed the current transfer.

On read operations, the processor uses this information to determine when to begin emptying the read buffer into the on-chip cache. The timing of this output during quad word reads is determined by the R3721 for optimal performance. The R3721 will release the processor to begin execution as early as possible in the transfer, but will ensure that the fourth word of the quad read is available before the processor obtains it from the read buffer. Thus, the processor can simultaneously execute the incoming instruction stream even while the R3721 obtains the remaining words of the transfer.

On write operations, the processor uses this to terminate the write operation.

This output is a tri-stateable output; it is only driven by the R3721 in transfers in which its  $\overline{CS}$  input is asserted at the proper time. It is internally pulled up, so that no external pull-up resistor is required.

### **DRAM Interface**

### DAddr(10:0)

**DRAM Address**: These outputs are typically connected directly to the DRAM multiplexed row/column address inputs. Depending on the memory system organization and the organization of the DRAMs used, the R3721 will align the processor addresses with the DRAM addresses according to Table 4.1.

These outputs incorporate series resistors to eliminate overshoot and undershoot problems associated with large capacitive loads. In addition, high-drive capability has been incorporated in these outputs. Thus, the R3721 can directly drive large numbers of DRAMs or multiple SIMM modules.

#### **RAS(3:0)**

**Row Address Strobe**: These outputs are directly connected with the RAS inputs of the DRAMs on a bank basis, according to Table 4.2. The falling edge of this signal is used by the DRAM to capture the row address presented on DAddr(10:0).

In order to directly drive multiple DRAM devices, these signals provide high drive, and incorporate series resistors. Each  $\overline{RAS}$  signal may drive multiple loads with no system performance degradation.

### CAS(3:0)

**Column Address Strobe**: These outputs are directly connected with the  $\overline{CAS}$  inputs of the DRAMs on a byte basis. The R3051 processor may write partial word quantities, in which case the R3721 only enables those DRAMs in the byte lane being updated.  $\overline{CAS}(3)$  corresponds to  $\overline{BE}(3)$ ;  $\overline{CAS}(2)$  corresponds to  $\overline{BE}(2)$ ; etc. The falling edge of this signal is used by the DRAM to capture the column address presented on DAddr(10:0).

In order to directly drive multiple DRAM devices, these signals provide high drive, and incorporate series resistors. However, the propagation delay of  $\overline{CAS}$  is a system critical parameter; thus, no  $\overline{CAS}$  signal should drive more than 8 loads.
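The byte-lane correspondence for partial writes can be illustrated with a small model. The function names below are hypothetical, signal levels are represented as 4-bit integers with 0 meaning asserted (active low), and the sketch covers write cycles only.

```python
def cas_levels_for_write(be_n):
    """Return the active-low CAS(3:0) levels for a write.

    be_n holds the active-low BE(3:0) levels; CAS(n) follows BE(n),
    so only the byte lanes being written see CAS asserted (0).
    """
    return be_n & 0xF


def lanes_written(be_n):
    """List the byte lanes (0-3) whose CAS is asserted for this write."""
    return [lane for lane in range(4) if not (be_n >> lane) & 1]
```

A write with BE(1) and BE(0) asserted, for instance, asserts only CAS(1) and CAS(0), leaving the DRAMs on lanes 2 and 3 unchanged.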

### **WBank(3:0)**

**Bank Write Enable**: These outputs are used to individually control the write enables of various memory banks. In non-interleaved systems, all four outputs are asserted;  $\overline{RAS}$  selects the specific bank to be written. In interleaved systems, they are enabled in pairs; that is, writes to an even bank cause  $\overline{WBank}(2)$  and  $\overline{WBank}(0)$  to be asserted, while writes to an odd bank cause  $\overline{WBank}(3)$  and  $\overline{WBank}(1)$  to be asserted. Again, only the specific bank being written will have its  $\overline{RAS}$  asserted, and thus only that bank will be updated during the write.

During refresh cycles, these outputs are negated. This avoids accessing the "test mode" built into modern 4Mb DRAMs.

In order to directly drive multiple DRAM devices, these signals provide high drive, and incorporate series resistors.
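The write-enable behavior for normal write cycles can be summarized as follows. This is a minimal sketch of the rule stated above, with `wbank_asserted` as an invented name; refresh cycles, in which all outputs are negated, are not modeled.

```python
def wbank_asserted(bank, interleaved):
    """Return the set of WBank outputs asserted for a write to `bank` (0-3)."""
    if not interleaved:
        # All four are asserted; RAS alone selects the bank that is written.
        return {0, 1, 2, 3}
    # Interleaved: outputs are enabled in even or odd pairs.
    return {0, 2} if bank % 2 == 0 else {1, 3}
```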

#### OE

**DRAM Output Enable**: This output is directly connected to the output enable of common I/O DRAMs. It is connected to all DRAMs under the control of the R3721.

#### Data Path Control Interface

#### DByteEn(3:0)

**Data Path Byte Enable**: These outputs are four identical output enables for the transceivers in the DRAM data path. Even in the case of partial writes, all four enables will be asserted; CAS(3:0) will control which devices actually get updated.

In typical systems,  $\overline{DByteEn}$  is connected on a byte lane basis to evenly distribute the load. For example, if the data path interface uses 74FCT245s, then each  $\overline{DByteEn}$  is directly connected to the " $\overline{OE}$ " input of the transceiver on that byte lane. If the data path uses IDT73720 Bus Exchangers,  $\overline{DByteEn}(1:0)$  are connected to the Bus Exchanger on the lower half of the data bus (Data(15:0)), and  $\overline{DByteEn}(3:2)$  are connected to the Bus Exchanger on the upper half of the data bus (Data(31:16)).

### $T/\overline{R}$

**Transmit/Receive**: This signal indicates the direction of the data path, and is connected directly to the  $T/\overline{R}$  input of the 74FCT245 or IDT73720. A high output indicates that data is being transmitted from the CPU to the memory (write); a low output indicates a memory read.

### Path

**Path**: This signal is directly connected to the Path input of the IDT73720. It is used to specify the even or odd memory bank participating in the current transfer. This output is high if an even bank is the target of the transfer; it is low for an odd bank.

#### YZLEn

**Data Path Latch Enable**: This signal is connected to the YLEn and the ZLEn inputs of the IDT73720 Bus Exchanger. It is used to capture the data provided by both banks of memory of an interleaved system, for later sequencing onto the processor A/D bus. The latches are transparent when this output is high, and closed when it is low.



# **PROGRAMMING THE R3721**

**CHAPTER 5** 

### INTRODUCTION

This chapter describes the organization of the mode register and gives a detailed illustration of the timings involved with each mode. Topics include:

- A general overview of the various fields of the mode register and its operation.
- A detailed description of the timing diagrams related to each field of the mode register.
- The default settings of the mode register.
- A detailed explanation on how to access the mode register.

## THE MODE REGISTER

The mode register is a 16-bit write-only register used to configure the R3721 to adapt it to a variety of different applications. Figure 5.1 illustrates the mode register. The settings of the mode register influence the signals used to control the external DRAM banks as well as the signals involved in controlling the data path.

| 15   | 14  | 13  | 12  | 11  | 10 | 9    | 8  | 7  | 6  | 5  | 4   | 3    | 2     | 1   | 0   |
|------|-----|-----|-----|-----|----|------|----|----|----|----|-----|------|-------|-----|-----|
| Rsvd | DCS | RF2 | RF1 | RF0 | CP | Rsvd | C0 | R2 | R1 | R0 | RCD | WrNr | Inlvd | DZ1 | DZ0 |

Figure 5.1 The Mode Register

### **PROGRAMMING THE MODE REGISTER**

The mode register contains different fields that provide the R3721 with great flexibility in interfacing with a wide range of applications. Each field is used to control one aspect of the behavior of the R3721. All the fields get updated when writing to the mode register.

### **DRAM Size Field**

Bits 0 and 1 of the mode register are used to inform the R3721 of the organization of the DRAMs used in the system as follows:

| Bit 1<br>DZ1 | Bit 0<br>DZ0 | DRAM Page Size |
|--------------|--------------|----------------|
| 0            | 0            | 512 entries    |
| 0            | 1            | 1K entries     |
| 1            | 0            | 1K entries     |
| 1            | 1            | 2K entries     |

This allows the R3721 to control up to a maximum of 64 MBytes of memory.
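The 64 MByte figure can be checked with a short calculation, assuming the largest configuration described in this manual: four banks, each 32 bits (4 bytes) wide and 4M locations deep.

$$
4\ \text{banks} \times 4\text{M}\ \frac{\text{words}}{\text{bank}} \times 4\ \frac{\text{bytes}}{\text{word}} = 64\ \text{MBytes}
$$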

#### **External Memory Configuration**

Bit 2 of the mode register is used to program the physical configuration of the external memory and the data path.

| Bit 2<br>Inlvd | Memory Configuration                                                    |
|----------------|-------------------------------------------------------------------------|
| 0              | Non-Interleaved memory system                                           |
| 1              | Interleaved memory system and Bus Exchangers are used in the data path. |

The R3721 always assumes that Bus Exchangers are used in the data path for the interleaved configuration. In the Non-Interleaved configurations, it is possible to connect either standard transceivers or Bus Exchangers.

#### Write Near

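The R3721 can use the R3051  $\overline{\text{WrNear}}$  output to provide fast page mode writes. When use of  $\overline{\text{WrNear}}$  is disabled, the R3721 relies on its internal page comparator instead; the extra delay this introduces may be appropriate in certain memory configurations, as discussed in later chapters.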

| Bit 3<br>WrNr | Use of WrNear             |
|---------------|---------------------------|
| 0             | Use of WrNear is enabled  |
| 1             | Use of WrNear is disabled |

### $\overline{\text{RAS}}$ to $\overline{\text{CAS}}$ Delay

Bit 4 of the mode register specifies the delay from the assertion of the appropriate  $\overline{RAS}$  signal to the assertion of the related  $\overline{CAS}$  signal. This delay can be programmed to be either one clock cycle or two clock cycles. Figure 5.2 illustrates the effect of the RCD bit.

The DRAM controller always transitions the DAddr bus from Row Address to Column Address one-half clock cycle before the assertion of  $\overline{CAS}$ .

| Bit 4<br>RCD | RAS to CAS delay                                                 |
|--------------|------------------------------------------------------------------|
| 0            | One clock cycle delay from $\overline{RAS}$ to $\overline{CAS}$  |
| 1            | Two clock cycles delay from $\overline{RAS}$ to $\overline{CAS}$ |



Figure 5.2 RAS to CAS Delay

### **RAS** Timing

Bits 5, 6 and 7 of the mode register specify the width of the  $\overline{RAS}$  pulse in clock cycles as well as the  $\overline{RAS}$  pre-charge time. This field gives the system designer the freedom to choose from a wide range of DRAM speeds based on performance/cost criteria. Figure 5.3 illustrates the timings of the  $\overline{RAS}$  signals.

Logically speaking,  $\overline{RAS}$  precharge occurs in two parts. A portion of the precharge occurs at the start of the transfer, and varies in duration from one to three clock cycles (depending on the programmed value). The second portion occurs at the end of the transfer, and is always one clock cycle long. This distinction allows the DRAM controller to avoid additional  $\overline{RAS}$  pre-charge if the DRAM controller state machine was already in the "IDLE,  $\overline{RAS}$  negated" state.

| Bit 7<br>R2 | Bit 6<br>R1 | Bit 5<br>R0 | RAS Pulse Width | RAS Pre-charge |
|-------------|-------------|-------------|-----------------|----------------|
| 0           | 0           | 0           | 2 clock cycles  | 2 clock cycles |
| 0           | 0           | 1           | 3 clock cycles  | 2 clock cycles |
| 0           | 1           | 0           | 3 clock cycles  | 3 clock cycles |
| 0           | 1           | 1           | 4 clock cycles  | 2 clock cycles |
| 1           | 0           | 0           | 4 clock cycles  | 3 clock cycles |
| 1           | 0           | 1           | 4 clock cycles  | 4 clock cycles |
| 1           | 1           | 0           | Reserved        |                |
| 1           | 1           | 1           | Reserved        |                |



Figure 5.3 RAS Signals Timing

#### CAS Pulse Width

Bit 8 of the mode register specifies the  $\overline{CAS}$  pulse width in clock cycles. The  $\overline{CAS}$  pulse width can be programmed to be 1.5 or 2.5 clock cycles. Figure 5.4 illustrates the timings of the  $\overline{CAS}$  pulse width.

The  $\overline{\text{CAS}}$  pulse width, along with the  $\overline{\text{CAS}}$  precharge time, has the most dramatic impact on system performance. These parameters affect the performance of the various page mode accesses performed by the DRAM controller, and thus directly affect the timing of the  $\overline{\text{RdCEn}}$  and  $\overline{\text{ACK}}$  acknowledgment signals back to the processor.

| Bit 8<br>C0 | CAS Pulse Width  |
|-------------|------------------|
| 0           | 2.5 clock cycles |
| 1           | 1.5 clock cycles |



Figure 5.4 CAS Pulse Width Timing

#### **CAS** Pre-charge Time

Bit 10 of the mode register specifies the  $\overline{CAS}$  pre-charge time, which can be programmed to be either 0.5 or 1.5 clock cycles. Any combination of  $\overline{CAS}$  pulse width and  $\overline{CAS}$  pre-charge time is possible. Figure 5.5 illustrates the  $\overline{CAS}$  pre-charge timing.



Figure 5.5 CAS Pre-Charge Timing

### **Refresh Period**

Bits 11, 12 and 13 of the mode register specify the frequency of the input clock to the R3721. The R3721 loads an internal refresh timer with the appropriate value to refresh the DRAMs according to the table below.

The value is appropriate to avoid violating the  $10\,\mu s$  maximum  $\overline{RAS}$  low time specification for DRAMs.

| Bit 13<br>RF2 | Bit 12<br>RF1 | Bit 11<br>RF0 | SysClk<br>Freq. |
|---------------|---------------|---------------|-----------------|
| 0             | 0             | 0             | 4 MHz           |
| 0             | 0             | 1             | 8 MHz           |
| 0             | 1             | 0             | 12 MHz          |
| 0             | 1             | 1             | 16 MHz          |
| 1             | 0             | 0             | 20 MHz          |
| 1             | 0             | 1             | 25 MHz          |
| 1             | 1             | 0             | 33 MHz          |
| 1             | 1             | 1             | 40 MHz          |

### **Delayed Chip-select**

Bit 14 of the mode register specifies when the R3721 samples the Chip-Select input at the beginning of any access. The R3721 can be programmed to sample Chip-Select on either the first positive edge or the first negative edge of the clock following the negation of ALE.

This bit allows the R3721 to perform optimally in either a high-performance (or low frequency) system capable of rapidly decoding addresses, or in systems using a slower, or synchronous approach to address decoding. The R3721 needs to also be explicitly aware of transfers which do not use its memory devices; for example, it can use these cycles to perform a DRAM refresh without performance loss in the system.

The DCS bit also affects the operation of the R3721 for page writes. If the DCS bit is cleared, the R3721 can perform page writes in a minimum of two clock cycles. If the DCS bit is set, the R3721 can perform page writes in a minimum of three clock cycles. Figure 5.6 illustrates the timing of the Chip-Select and Mode-Select input pins.

| Bit 14<br>DCS | $\overline{\text{CS}}$ Sampling                                | Page Writes                               |
|---------------|----------------------------------------------------------------|-------------------------------------------|
| 0             | Sampled on the positive edge of the clock                      | 2 clock cycle page writes may be possible |
| 1             | Sampled on the negative edge of the clock after the negation of ALE | 2 clock cycle page writes not possible    |



Figure 5.6 Chip-select Timing

#### **DEFAULT SETTINGS**

At power up, the mode register is loaded with default values which correspond to the following system:

- DRAM page size: 512 entries
- System configuration: Non-interleaved
- WrNear for fast writes enabled.
- 2 clock cycles delay from  $\overline{RAS}$  assertion to  $\overline{CAS}$  assertion
- 4 clock cycles for the  $\overline{RAS}$  pulse width and the  $\overline{RAS}$  pre-charge time
- 2.5 clock cycles for the  $\overline{CAS}$  pulse width
- 1.5 clock cycles for the  $\overline{CAS}$  pre-charge time
- 25 MHz frequency of operation
- Delayed Chip\_Select.

Figure 5.7 illustrates the settings of the mode register at power up.

| 15   | 14  | 13  | 12  | 11  | 10  | 9    | 8  | 7  | 6  | 5  | 4   | 3    | 2     | 1   | 0   |
|------|-----|-----|-----|-----|-----|------|----|----|----|----|-----|------|-------|-----|-----|
| Rsvd | DCS | RF2 | RF1 | RF0 | CP  | Rsvd | C0 | R2 | R1 | R0 | RCD | WrNr | Inlvd | DZ1 | DZ0 |
| x    | 1   | 1   | 0   | 1   | 1   | x    | 0  | 1  | 0  | 1  | 1   | 0    | 0     | 0   | 0   |
| D15  | D14 | D13 | D12 | D11 | D10 | D9   | D8 | D7 | D6 | D5 | D4  | D3   | D2    | D1  | D0  |

Figure 5.7 Settings of Mode Register at Power Up
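As a cross-check of the field tables in this chapter, the following Python sketch packs the individual settings into a 16-bit mode word. It is an illustration rather than vendor-supplied code: `encode_mode` and its parameter names are invented here, the CP polarity (1 = 1.5 clock cycles) is inferred from the default settings because this chapter gives no table for bit 10, and a 1K page is encoded as DZ = 01 even though DZ = 10 is listed with the same meaning.

```python
PAGE_SIZE_DZ = {512: 0b00, 1024: 0b01, 2048: 0b11}     # 0b10 also selects 1K entries
RAS_TIMING = {(2, 2): 0b000, (3, 2): 0b001, (3, 3): 0b010,
              (4, 2): 0b011, (4, 3): 0b100, (4, 4): 0b101}  # (pulse, pre-charge) clocks
SYSCLK_MHZ_RF = {4: 0, 8: 1, 12: 2, 16: 3, 20: 4, 25: 5, 33: 6, 40: 7}


def encode_mode(page_size=512, interleaved=False, wrnear_enabled=True,
                rcd_clocks=2, ras=(4, 4), cas_width=2.5, cas_precharge=1.5,
                sysclk_mhz=25, delayed_cs=True):
    """Pack mode register fields; the defaults mirror the power-up settings."""
    word = PAGE_SIZE_DZ[page_size]                   # bits 1:0   DZ1, DZ0
    word |= int(interleaved) << 2                    # bit 2      Inlvd
    word |= int(not wrnear_enabled) << 3             # bit 3      WrNr (0 = enabled)
    word |= {1: 0, 2: 1}[rcd_clocks] << 4            # bit 4      RCD
    word |= RAS_TIMING[ras] << 5                     # bits 7:5   R2..R0
    word |= {2.5: 0, 1.5: 1}[cas_width] << 8         # bit 8      C0
    word |= {0.5: 0, 1.5: 1}[cas_precharge] << 10    # bit 10     CP (polarity inferred)
    word |= SYSCLK_MHZ_RF[sysclk_mhz] << 11          # bits 13:11 RF2..RF0
    word |= int(delayed_cs) << 14                    # bit 14     DCS
    return word                                      # reserved bits 9 and 15 stay 0
```

Called with no arguments, `encode_mode()` returns 0x6CB0, which matches the bit pattern of Figure 5.7 when the two reserved bits are taken as 0. Unsupported values (for example a 30 MHz clock) raise `KeyError` rather than being rounded silently.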

#### WRITING TO THE MODE REGISTER

The mode register is a 16-bit write only register that controls the internal operation of the R3721 DRAM controller. The different fields of the mode register control the behavior of various output control signals such as the  $\overline{\text{RAS}}$  and the  $\overline{\text{CAS}}$  signals. At power up, the mode register is initialized with the default settings illustrated in Figure 5.7. To obtain maximum performance out of the R3721 DRAM Controller, the mode register needs to be programmed to fit the application at hand.

To access the internal mode register of the R3721, the external address decoder must assert both the  $\overline{\text{CS}}$  and the  $\overline{\text{MSel}}$  lines. The assertion of the  $\overline{\text{CS}}$  line is important to distinguish among multiple R3721s in a single system. The internal mode register of the R3721 should be mapped in the uncacheable I/O space of the R3051.

The R3051 can access the mode register by proceeding with a standard write operation to the I/O location occupied by the mode register. The R3721 detects the assertion of both the  $\overline{CS}$  and the  $\overline{MSel}$  lines and determines that the access is for the internal mode register. The data present on the R3051 data bus A/D(15:0) is written into the mode register, regardless of system byte ordering. The R3721 returns the  $\overline{ACK}$  signal to the R3051 to terminate the write access to the mode register in 3 clock cycles. Thus, the write access to the mode register is always 3 clock cycles regardless of the configuration of the external memory system. The external state machine controlling the rest of the system should not assert the  $\overline{ACK}$  for writes to the mode register, since the R3721 DRAM Controller asserts  $\overline{ACK}$  with proper timing to terminate the write. Figure 5.8 illustrates the timing diagrams in writing to the mode register.

Note that it is recommended that writes to the mode register use '0' in the upper A/D bits (A/D(25:16)). This ensures compatibility with possible future versions of the DRAM controller. When writing to the mode register, the two reserved bits (bit 9 and bit 15) must be written as "0".
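Because the mode register is write-only, a bad value cannot be caught by reading it back. One possible safeguard, sketched here with the hypothetical helper `check_mode_word`, is to validate the word against the rules stated in this chapter before it is written.

```python
RESERVED_MASK = (1 << 15) | (1 << 9)    # bits 15 and 9 must be written as 0


def check_mode_word(word):
    """Return `word` unchanged, or raise ValueError if it breaks a stated rule."""
    if word >> 16:
        raise ValueError("upper A/D bits (25:16) should be written as 0")
    if word & RESERVED_MASK:
        raise ValueError("reserved bits 9 and 15 must be written as 0")
    if (word >> 5) & 0b111 in (0b110, 0b111):
        raise ValueError("R2..R0 encodings 110 and 111 are reserved")
    return word
```

The power-up pattern 0x6CB0 passes this check, while a word such as 0x8000 (reserved bit 15 set) is rejected.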





### AUTO CONFIGURATION DETECTION AND INITIALIZATION

Many of today's systems are designed to allow for future field upgrades of the base memory system to more memory banks and/or deeper DRAM devices. Although these upgrade strategies typically do not support moving from non-interleaved to interleaved systems, or from "x4" to "x1" devices, the ability to offer a base configuration (at a lower selling price) with capability upgrades is often a selling feature of the end product.

To use the R3721 in such systems, the software at boot-up should configure the mode register of the R3721 with the maximum memory size it can support according to the basic system design.

The software should then run diagnostics to determine whether or not the DRAM size used corresponds to the programmed size. The diagnostic software should also determine the presence of multiple banks. Typical strategies for such diagnostics include writing distinct values into a given location within each bank, and then reading the data back to see if any of the writes did not occur properly, or altered data previously written.

Once the configuration of the system is determined, the software should reprogram the mode register with the exact system configuration to obtain the maximum performance out of the R3721 DRAM Controller.
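One possible form of the bank-presence diagnostic is sketched below. It assumes the mode register has already been programmed for the maximum configuration, that DRAM is reached through an uncached pointer, and that nothing of value is stored in DRAM yet. The function names and the marker pattern are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distinct marker for each bank so that aliased banks can be told apart. */
static uint32_t bank_marker(unsigned bank)
{
    return 0xA5A50000u | bank;
}

/* Count the banks that respond, starting from bank 0.
 * `base` is the start of DRAM and `bank_words` is the size of one bank,
 * in 32-bit words, under the maximum-size mode setting. */
static unsigned r3721_probe_banks(volatile uint32_t *base,
                                  size_t bank_words, unsigned max_banks)
{
    unsigned bank;
    unsigned found = 0;

    assert(base != NULL && bank_words > 0);

    /* Write from the highest bank down. If an absent bank aliases onto a
     * lower one, the lower bank's own marker is then written last. */
    for (bank = max_banks; bank-- > 0; )
        base[(size_t)bank * bank_words] = bank_marker(bank);

    /* A bank is counted only if it still holds its own marker. */
    for (bank = 0; bank < max_banks; bank++) {
        if (base[(size_t)bank * bank_words] != bank_marker(bank))
            break;
        found++;
    }
    return found;
}
```

The same write-then-verify approach can be applied within a bank, at addresses that differ only in the highest row or column bit, to distinguish DRAM depths.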



# **R3721 INTERFACING**

# INTRODUCTION

This chapter describes the various hardware interfaces of the R3721. Included are discussions on:

- The R3051 processor interface, including the interface to the address decoder and the interface to other memory controllers.
- The interface to DRAM devices.
- The interface to the DRAM data path transceiver elements.

## **R3051 BUS INTERFACE**

The R3721 is designed to reside directly on the R3051 family A/D and control busses. To complete the system design, an external address decoder is required, and external data path chips such as the IDT73720 Bus Exchangers, or IDT74FCT245 bi-directional transceivers should be provided.

Regardless of size or organization of DRAM, the R3721 is always connected to specific bits of the R3051 A/D bus. The R3721 uses programmed values for the DRAM size and organization to internally multiplex R3051 address lines into the appropriate row and column addresses for the DRAM. Chapter 4 shows the internal address multiplexing of the R3721.

The R3721 monitors the processor ALE,  $\overline{Rd}$ ,  $\overline{Wr}$  and  $\overline{Burst}/\overline{WrNear}$  control signals to determine the type of cycle in progress. The R3721 contains its own address latches, and aligns processor address outputs with DRAM Row and Column addresses.

Figure 6.1 shows the processor interface with the R3721, including the interface to the address decoder.



Figure 6.1 R3721 CPU Interface Connections

#### **Processor Interface Signals**

The interface to the processor from the R3721 in general requires no external interface logic. This section describes how the R3721 signals are derived from the processor interface.

#### Reset


In most systems, this signal is directly connected with the same logic used to drive the R3051 processor Reset signal.

#### A/D(25:0)

The R3721 A/D(25:0) bus is directly connected to the R3051 family A/D bus. Regardless of the actual DRAM configuration, the R3721 is always connected the same way to the R3051 bus. Although not all systems require the high-order address lines, it is good practice to connect all of A/D(25:0) with the R3051. This allows greater flexibility in later upgrading to higher density DRAMs, or populating with additional DRAM banks.

#### Addr(3:2)

As with the A/D(25:0), these inputs are directly connected to the R3051 Addr(3:2) outputs.

#### ALE

This input is directly connected with the R3051 ALE output.

#### Rd

This input is directly connected with the R3051 $\overline{Rd}$ output.

#### Wr

This input is directly connected with the R3051  $\overline{Wr}$  output.


#### Burst/WrNear

This input is directly connected with the R3051 Burst/WrNear output.

#### SysClk

This input is directly connected with the R3051 SysClk output. It is not connected through a clock buffer, but rather directly connected with the CPU output.

#### $\overline{\mathbf{CS}}$

The  $\overline{\text{CS}}$  input is provided by the system address decoder to select the DRAM address space. In general, the address decoder looks at the R3051 output address (it may look at the address as captured by external transparent latches) to determine which memory resource is currently being accessed.

#### MSel

The  $\overline{\text{MSel}}$  input is provided by the system address decoder to select the R3721 mode register. In general, the address decoder looks at the R3051 output address (it may look at the address as captured by external transparent latches) to determine which memory resource is currently being accessed.

#### RdCEn

The R3721  $\overline{RdCEn}$  is a tri-stateable output. It is only driven during accesses in which the R3721  $\overline{CS}$  input is asserted. The connection between this output and the R3051  $\overline{RdCEn}$  input depends on the rest of the system. If the rest of the system is designed to provide a tri-stateable  $\overline{RdCEn}$ , then this output can be wire "OR"ed with the  $\overline{RdCEn}$  outputs of other memory subsystems, and tied directly to the  $\overline{RdCEn}$  input of the processor. Otherwise, a logic device must perform the logical negative true "OR" function. An internal pull-up is provided.

#### ACK

The R3721 $\overline{ACK}$ is a tri-stateable output. It is only driven during accesses in which the R3721 $\overline{CS}$ input is asserted. The connection between this output and the R3051 $\overline{ACK}$ input depends on the rest of the system. If the rest of the system is designed to provide a tri-stateable $\overline{ACK}$, then this output can be wire "OR"ed with the $\overline{ACK}$ outputs of other memory subsystems, and tied directly to the $\overline{ACK}$ input of the processor. Otherwise, a logic device must perform the "OR" function. An internal pull-up is provided.

#### The Address Map

In typical MIPS-based systems, such as those using the R3051, RAM is located in memory starting at physical address "0". Note that various aspects of the kernel implicitly assume that RAM will be available at this location; for example, the current exception handler is invoked at a very low physical address. Thus, typical systems will decode the DRAM accesses in a region beginning at physical address "0", and provide a  $\overline{CS}$  to the R3721.

The Mode Register of the DRAM controller is typically mapped as an I/O peripheral. In MIPS systems (and thus for the R3051), these are referenced through kseg1, which is unmapped and uncached. References to kseg1 are always translated to the lowest 512MB of the physical address space. The system architect typically will use an address well above the maximum amount of DRAM expected for the system, but avoid the address space reserved for the system EPROM (in typical systems, the EPROM is located in the address region accessed by the processor Reset exception vector, which is physical address 1FC0\_0000).

In systems which use multiple R3721 DRAM Controllers to manage separated DRAM subsystems, the system designer will typically arrange the address map so that all of the various DRAM subsystems combine to present a single contiguous RAM array as seen by software. That is, one DRAM controller may be selected to respond to references in the address range 0 to 16MB, and the next to respond to the address range 16MB to 32MB. Note that there is no particular reason for the various DRAM Controllers' mode registers to appear contiguous, and thus these are typically scattered throughout the I/O space to simplify address decoding.
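The address map discussion can be illustrated with a small model of the decoder. The sketch below assumes one R3721 with 16MB of DRAM at physical address 0 and a mode register at an arbitrarily chosen I/O address below the EPROM region; both values, and all names, are hypothetical. Per Chapter 5, a mode register access asserts both $\overline{CS}$ and $\overline{MSel}$.

```c
#include <assert.h>
#include <stdint.h>

#define KSEG1_BASE     0xA0000000u  /* MIPS kseg1: unmapped, uncached */
#define KSEG1_SIZE     0x20000000u  /* kseg1 covers the low 512MB */

/* Hypothetical map: one R3721 with 16MB of DRAM and one mode register. */
#define DRAM_PHYS_BASE 0x00000000u
#define DRAM_PHYS_SIZE 0x01000000u
#define MODE_REG_PHYS  0x1F000000u  /* assumed I/O address below the EPROM */

enum r3721_select {
    SEL_NONE,     /* neither CS nor MSel asserted */
    SEL_DRAM,     /* CS asserted */
    SEL_MODE_REG  /* CS and MSel both asserted */
};

/* Virtual address software uses to reach `phys` without caching. */
static uint32_t kseg1_address(uint32_t phys)
{
    assert(phys < KSEG1_SIZE);
    return KSEG1_BASE | phys;
}

/* Model of the external address decoder for this hypothetical map. */
static enum r3721_select r3721_decode(uint32_t phys)
{
    if (phys - DRAM_PHYS_BASE < DRAM_PHYS_SIZE)
        return SEL_DRAM;
    if ((phys & ~3u) == MODE_REG_PHYS)
        return SEL_MODE_REG;
    return SEL_NONE;
}
```

With a second controller, the decoder would simply gain another DRAM range (for example 16MB to 32MB) and another, not necessarily adjacent, mode register address.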

### **R3721 DRAM INTERFACE**

The R3721 has been designed to interface to a wide variety of DRAM subsystems. Various options include:

- Interleaved vs. Non-Interleaved
- Various densities of DRAM
- Single bank or multiple banks of memory
- Intelligent Control interface to take advantage of Page Mode DRAMs
- Various speeds of DRAMs and Processors
- Various data path options.

Later chapters describe specific memory configurations used with the R3051. Note that the R3721 provides enough output signals to interface with up to 4 memory banks, and to interface with devices as large as  $4M \times 4$ . Many systems will use less than the maximum amount of memory supported. It is typically good practice to route unused address and control lines to the memory array, to allow future or field upgrades to higher density devices or additional memory banks.

Figure 6.2 shows the R3721 DRAM Control interface. In general, the following strategies for interconnection apply:



Figure 6.2 R3721 DRAM Control Interface

#### DAddr(10:0)

These outputs are typically connected directly to the DRAM multiplexed row/column address inputs. According to the memory system organization and the organization of the DRAMs used, the R3721 will align the processor addresses with the DRAM addresses as described in Chapter 4.

These outputs incorporate series resistors to eliminate overshoot and undershoot problems associated with large capacitive loads. In addition, high-drive capability has been incorporated in these outputs. Thus, the R3721 can directly drive large numbers of DRAMs or multiple SIMM modules.

Certain system configurations, however, require too many DRAMs to drive directly from the R3721. Such systems can use external buffer/drivers, and select an appropriate system timing model.

#### **RAS**(3:0)

These outputs are typically directly connected with the  $\overline{RAS}$  inputs of the DRAMs on a bank basis, as described in Chapter 4. The falling edge of this signal is used by the DRAMs to capture the row address presented on DAddr(10:0).

In order to directly drive multiple DRAM devices, these signals provide high drive, and incorporate series resistors. Each  $\overline{\text{RAS}}$  signal may drive multiple loads with no system performance degradation. Certain system configurations, however, require too many DRAMs to drive directly from the R3721. Such systems can use external buffer/drivers, and select an appropriate system timing model.

#### **CAS**(3:0)

These outputs are directly connected with the $\overline{CAS}$ inputs of the DRAMs on a byte basis, according to Chapter 4 ($\overline{CAS}(0)$ corresponds to $\overline{BE}(0)$, etc.). The falling edge of this signal is used by the DRAM to capture the column address presented on DAddr(10:0).

In order to directly drive multiple DRAM devices, these signals provide high drive, and incorporate series resistors. However, the propagation delay of  $\overline{CAS}$  is a system critical parameter; thus, no  $\overline{CAS}$  signal should drive more than 8 loads. Certain system configurations require too many DRAMs to drive directly from the R3721. Such systems can use external buffer/drivers, and select a system timing model appropriately.

#### WBank(3:0)

These outputs are used to individually control the write enables of various memory banks. In non-interleaved systems, all four outputs are asserted, and  $\overline{RAS}$  is used to control which bank actually is written. In interleaved systems, they are enabled in pairs (writes to an even bank cause  $\overline{WBank}(2)$  and  $\overline{WBank}(0)$  to be asserted, etc.). Again, only the particular array being written will have its  $\overline{RAS}$  asserted. Thus, these outputs are connected directly to the  $\overline{WE}$  inputs of the DRAMs in a given bank (that is,  $\overline{WBank}(0)$  is connected to all of the  $\overline{WE}$  inputs of memory array 0, etc.).

During refresh cycles, these outputs are negated. This avoids accessing the "test mode" built into modern 4Mb DRAMs.

In order to directly drive multiple DRAM devices, these signals provide high drive, and incorporate series resistors. Certain system configurations, however, require too many DRAMs to drive directly from the R3721. Such systems can use external buffer/drivers, and select a system timing model appropriately.
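The $\overline{WBank}$ behavior described above can be summarized as a small truth function. This is only a sketch: the function name is hypothetical, and the odd-bank pairing ($\overline{WBank}(3)$ with $\overline{WBank}(1)$) is inferred from the "etc." in the text rather than stated explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return a 4-bit mask in which bit n is set when WBank(n) is asserted
 * (driven low) for a write to `bank`, or 0 during a refresh cycle. */
static uint8_t wbank_asserted(bool interleaved, unsigned bank, bool refresh)
{
    assert(bank < 4);

    if (refresh)
        return 0x0;          /* negated, so DRAM test mode is not entered */
    if (!interleaved)
        return 0xF;          /* all four asserted; RAS selects the bank */
    /* Interleaved: enabled in pairs. Odd-bank pairing is an assumption. */
    return (bank & 1u) ? 0xA : 0x5;
}
```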

#### OE

This output is directly connected to the output enable of common I/O DRAMs. It is connected to all DRAMs under the control of the R3721. This output also incorporates series resistors for driving large loads.

### DATA PATH CONTROL INTERFACE

In addition to directly interfacing to the DRAM devices, the R3721 directly controls the data path transceivers between the CPU and the DRAMs. The R3721 is designed to support the use of 74FCT245 type transceivers in non-interleaved configurations, or IDT73720 Bus Exchangers in banked or interleaved memory configurations. Figure 6.3 shows the interface between the R3721 and 74FCT245 transceivers; Figure 6.4 shows the interface to the IDT73720 Bus Exchanger.

The R3721 directly provides the control signals for the data path, eliminating logic (and timing delays) in this path. Typical systems are connected as described below:

#### DByteEn(3:0)

These outputs provide four identical copies of transceiver output enables. Note that  $\overline{CAS}$ , which is asserted on a byte basis, controls which DRAMs actually participate in the transfer. To equally balance the loads, these outputs are typically connected on a byte basis.

If the data path interface uses 74FCT245s, then the $\overline{DByteEn}$ is directly connected to the "$\overline{OE}$" input of the transceiver on that byte lane. If the data path uses IDT73720 Bus Exchangers, $\overline{DByteEn(1:0)}$ are connected to the Bus Exchanger on the lower half of the data bus (Data(15:0)), and $\overline{DByteEn(3:2)}$ are connected to the Bus Exchanger on the upper half of the data bus (Data(31:16)).

#### T/R

This signal indicates the direction of the data path, and is connected directly to the  $T/\overline{R}$  input of the 74FCT245 or IDT73720. This output is high during writes, and low during reads.



Figure 6.3 R3721 Data Path Interface to 74FCT245s

#### Path

This signal is directly connected to the Path input of the IDT73720. It is used to specify which memory array is participating in the current transfer. The R3721 outputs a high to enable an even bank, and a low for an odd bank.

#### YZLEn

This signal is connected to the YLEn and the ZLEn inputs of the IDT73720 Bus Exchanger. It is used to capture the data provided by both banks of memory of an interleaved system, for later sequencing onto the processor A/D bus. The latch is open when this output is high.



Figure 6.4 R3721 Data Path Interface to IDT73720 Bus Exchangers

### SUMMARY

The R3721 has been designed to eliminate virtually all glue logic when interfacing an R3051 CPU with DRAM memory devices. However, the R3721 allows the system designer maximum flexibility, by supporting a wide variety of memory systems, and by allowing the system architect to construct the address map appropriate to the target application.


# THE USE OF THE R3721 **IN A NON-INTERLEAVED MEMORY SYSTEM**

**CHAPTER 7** 

## INTRODUCTION

This chapter describes how to use the R3721 DRAM controller in a non-interleaved memory system. This chapter explains in detail the effect of various configurations of the mode register on the timings of the output signals. The design considerations discussed include:

- A detailed description of the design of a non-interleaved DRAM system.
- A detailed description of the timings for single read transactions.
- A detailed description of the timings for write transactions.
- A detailed description of the timings for quad read transactions.

## NON-INTERLEAVED SYSTEM DESIGN

A non-interleaved memory system consists of 1 to 4 banks of "x1" or "x4" DRAMs interfaced to the CPU by the DRAM controller. The R3721 uses the DRAM size information encoded in the mode register and selects the appropriate bank by decoding high-order address bits from the CPU. In the non-interleaved configuration, each $\overline{RAS}$ controls one bank while all the $\overline{CAS}$ signals are shared among the four banks (i.e., $\overline{RAS0}$ and $\overline{CAS(3:0)}$ control bank 0, and $\overline{RAS3}$ and $\overline{CAS(3:0)}$ control bank 3). The $\overline{CAS}$ signals should be distributed on a byte-lane basis; that is, all DRAMs on the byte lane corresponding to $\overline{BE}(0)$ should use $\overline{CAS}(0)$, etc. In the non-interleaved configuration, the R3721 supports either standard 74245 data transceivers or IDT73720 Bus Exchangers in the data path. In non-interleaved systems, the output data path control signals from the R3721 work identically for either type of data path device.

For the ease of discussion, all the timing diagrams illustrated in this chapter assume the settings of the mode register as shown in Figure 7.1. These settings correspond to a 25MHz non-interleaved system with the following:

- 256Kx4 DRAMs.
- WrNear enabled.
- $\overline{RAS}$ pulse width is 3 clock cycles, $\overline{RAS}$ pre-charge time is 2 clock cycles.
- $\overline{CAS}$ pulse width is 1.5 clock cycles.
- $\overline{CAS}$ pre-charge time is 0.5 clock cycle.
- 2 clock cycles delay from $\overline{RAS}$ to $\overline{CAS}$.
- Fast chip-select mode.

| 15   | 14  | 13  | 12  | 11  | 10  | 9    | 8  | 7  | 6  | 5  | 4   | 3    | 2      | 1   | 0   |
|------|-----|-----|-----|-----|-----|------|----|----|----|----|-----|------|--------|-----|-----|
| Rsvd | DCS | RF2 | RF1 | RF0 | CP  | Rsvd | C0 | R2 | R1 | R0 | RCD | WrNr | Intlvd | DZ1 | DZ0 |
| 0    | 0   | 1   | 0   | 1   | 0   | 0    | 1  | 0  | 0  | 1  | 1   | 0    | 0      | 0   | 0   |
| D15  | D14 | D13 | D12 | D11 | D10 | D9   | D8 | D7 | D6 | D5 | D4  | D3   | D2     | D1  | D0  |

Figure 7.1 Settings of the Mode Register Used as an Example in this Chapter
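For reference, the bit values in Figure 7.1 can be packed into a single 16-bit word in software. The sketch below uses the bit positions drawn in the figure; the structure and function names are hypothetical, and bit 2 is assumed to be the interleave select because the example system is non-interleaved and the bit is 0.

```c
#include <assert.h>
#include <stdint.h>

/* Field values for the mode register, positions as in Figure 7.1. */
struct r3721_mode {
    unsigned dcs;   /* bit 14: chip-select mode (0 = fast) */
    unsigned rf;    /* bits 13:11: RF(2:0) */
    unsigned cp;    /* bit 10: CAS pre-charge */
    unsigned c0;    /* bit 8: CAS pulse width */
    unsigned r;     /* bits 7:5: R(2:0) */
    unsigned rcd;   /* bit 4: RAS-to-CAS delay */
    unsigned wrnr;  /* bit 3: WrNear (0 = enabled) */
    unsigned bit2;  /* bit 2: assumed to be the interleave select */
    unsigned dz;    /* bits 1:0: DZ(1:0) */
};

/* Pack the fields; reserved bits 15 and 9 are always left at 0. */
static uint16_t r3721_mode_pack(const struct r3721_mode *m)
{
    assert(m->dcs <= 1 && m->cp <= 1 && m->c0 <= 1 && m->rcd <= 1);
    assert(m->wrnr <= 1 && m->bit2 <= 1);
    assert(m->rf <= 7 && m->r <= 7 && m->dz <= 3);

    return (uint16_t)((m->dcs << 14) | (m->rf << 11) | (m->cp << 10) |
                      (m->c0 << 8) | (m->r << 5) | (m->rcd << 4) |
                      (m->wrnr << 3) | (m->bit2 << 2) | m->dz);
}
```

Packing the values shown in Figure 7.1 (RF = 101, C0 = 1, R = 001, RCD = 1, all other fields 0) gives the word 0x2930.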

The timing diagrams illustrated in this chapter apply for single bank as well as for multiple bank non-interleaved systems. The Path signal is illustrated with two values, and the clock edge at which its value should change is indicated to accommodate multiple banks in systems using the Bus Exchanger in the data path. For even banks, the Path signal is always 1, for odd banks, the Path signal is 0. RASn is any one of the four RAS signals.

### SINGLE READ TRANSACTION TIMINGS

In general, there are only two types of read transactions from the R3051: quad word reads and single word reads. Quad word reads occur only in response to cache misses. All instruction cache misses are processed as quad word reads; data cache misses may be processed as quad word reads or single word reads, depending on the initialization of the processor. Uncached references are always processed as single datum reads. This section describes the timing diagrams involved in single datum reads; a later section describes quad word read operations. The R3721 only asserts the CAS signals corresponding to the byte lanes requested for the particular transfer.

### Start of Single Read Access

The R3721 determines the beginning of a single read access by monitoring the assertion of the ALE and the $\overline{Rd}$ signals from the R3051. The R3721 latches and multiplexes the input address from the R3051 and outputs the row address on the DAddr bus. If the fast chip-select mode is selected (DCS bit in mode register = 0), the $\overline{CS}$ input must be valid before the following rising edge of $\overline{SysClk}$ for the R3721 to respond to the access; otherwise the R3721 assumes the access to be outside of the memory space it controls and does not assert any control signals. In the slow chip-select mode, the $\overline{CS}$ input must be valid by the following falling edge of $\overline{SysClk}$. Figure 7.2 illustrates the beginning of a single read access for both cases.



Figure 7.2 (a) Start of Single Read Access for Fast Chip-select



Figure 7.2 (b) Start of Single Read Access for Slow Chip-select

## Memory Control Signals for Single Read Accesses

After the detection of the  $\overline{CS}$  signal, the R3721 starts to issue the various control signals to the DRAMs in the following way:

- On the rising edge of $\overline{\text{SysClk}}$ following $\overline{\text{CS}}$, the appropriate $\overline{\text{RASn}}$ signal is issued ($\overline{\text{RAS0}}$ for access to bank 0, $\overline{\text{RAS1}}$ for access to bank 1, etc.). The $\overline{\text{ACK}}$ and $\overline{\text{RdCEn}}$ signals are enabled and driven high.
- Depending on the value of the RCD bit in the mode register, the R3721 can proceed in two different ways:

If RCD=0, the column address is presented on the DAddr bus on the falling edge of $\overline{SysClk}$ following the assertion of the $\overline{RAS}$ signal. The appropriate $\overline{CAS}$ signals are asserted on the next rising edge of $\overline{SysClk}$ ($\overline{CAS0}$ for $\overline{BE0}$, $\overline{CAS1}$ for $\overline{BE1}$, etc.).

If RCD=1, the column address is presented on the DAddr bus on the falling edge of  $\overline{SysClk}$ , 1.5 clock cycles following the assertion of the  $\overline{RAS}$  signal. The  $\overline{CAS}$  signals are asserted on the following rising edge of  $\overline{SysClk}$ .

Figures 7.3 (a, b) illustrate the timing diagrams in issuing the control signals to the DRAMs for both values of the RCD bit.
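The two RCD cases can be expressed numerically. The sketch below counts time in half clocks of $\overline{SysClk}$, measured from the rising edge on which $\overline{RAS}$ is asserted; the structure and function names are hypothetical.

```c
#include <assert.h>

/* Event times in half clocks after the edge that asserts RAS. */
struct ras_cas_timing {
    unsigned col_addr;    /* column address driven on DAddr (falling edge) */
    unsigned cas_assert;  /* CAS asserted (rising edge) */
};

static struct ras_cas_timing ras_to_cas(unsigned rcd)
{
    struct ras_cas_timing t;

    assert(rcd <= 1);
    t.col_addr = rcd ? 3u : 1u;      /* 1.5 or 0.5 clocks after RAS */
    t.cas_assert = t.col_addr + 1u;  /* the next rising edge */
    return t;
}
```

With RCD=1, as in Figure 7.1, $\overline{CAS}$ is asserted 4 half clocks (2 clock cycles) after $\overline{RAS}$; with RCD=0 the delay is 1 clock cycle.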



Figure 7.3 (a) DRAM Control for RCD=0 (Single Read)



Figure 7.3 (b) DRAM Control for RCD=1 (Single Read)

#### End of a Single Read Access

Depending on the setting of the  $\overline{CAS}$  pulse width in the mode register, the  $\overline{CAS}$  signals are kept asserted for either 1.5 or 2.5 clock cycles, and negated by the falling edge of  $\overline{SysClk}$ . The R3721 is designed in such a way that the R3051 samples the data on the same falling edge used to negate the  $\overline{CAS}$  signals. Thus, the R3721 asserts  $\overline{ACK}$  and the  $\overline{RdCEn}$  one clock cycle before negating the  $\overline{CAS}$  signals, so that they are sampled by the processor one-half cycle before the data sample point. The  $\overline{ACK}$  and  $\overline{RdCEn}$  signals are asserted on the falling edge of  $\overline{SysClk}$  and kept asserted for one clock cycle.

To take advantage of the page mode capabilities of the DRAMs, the R3721 always assumes that any single read access will be followed by another access (read or write or quad word read) within the same DRAM page. Based on this assumption, the RAS signal remains asserted at the end of the single word read access to continue in the page mode of the DRAMs. Figure 7.4(a, b) illustrate the timing diagrams in ending a single read access for both values of the  $\overline{CAS}$  pulse width.
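The relationship between the $\overline{CAS}$ pulse width and the $\overline{ACK}$/$\overline{RdCEn}$ response can be worked through in the same half-clock units, this time measured from the rising edge on which $\overline{CAS}$ is asserted. The names below are hypothetical.

```c
#include <assert.h>

/* Event times in half clocks after the edge that asserts CAS. */
struct read_end_timing {
    unsigned ack_assert;   /* ACK and RdCEn asserted (falling edge) */
    unsigned ack_sampled;  /* R3051 samples ACK and RdCEn (rising edge) */
    unsigned cas_negate;   /* CAS negated and data sampled (falling edge) */
};

/* `cas_width` is the programmed CAS pulse width in half clocks:
 * 3 for 1.5 clock cycles, 5 for 2.5 clock cycles. */
static struct read_end_timing read_end(unsigned cas_width)
{
    struct read_end_timing t;

    assert(cas_width == 3 || cas_width == 5);
    t.cas_negate = cas_width;
    t.ack_assert = cas_width - 2u;   /* one clock before CAS is negated */
    t.ack_sampled = cas_width - 1u;  /* half a clock before the data */
    return t;
}
```

For the 1.5 cycle setting of Figure 7.1, the response is asserted half a clock after $\overline{CAS}$, sampled at one clock, and the data is sampled at 1.5 clocks.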





Figure 7.4 (b) End of Single Read Access,  $\overline{CAS}$  Pulse = 2.5 Clock Cycle



Figure 7.5 illustrates the complete control timings involved in a single read access for the settings of the mode register illustrated in Figure 7.1.

Figure 7.5 Example of a Single Read Access

#### **Page Read Accesses**

In order to reduce latency of memory operations, the R3721 attempts to use page mode transfers wherever possible. To support this operation, the R3721 incorporates an internal address comparator which compares high-order bits from the current transfer with high-order bits from the previous transfer (the previous transfer high-order bits are the current row address of the DRAMs). The R3721 determines the maximum page size of the memory system based on the DRAM size information encoded in the mode register.
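The comparison performed by the page comparator can be modeled in a few lines. This sketch assumes a 32-bit-wide non-interleaved bank in which the two lowest address bits select the byte and the next `col_bits` bits form the column address; the actual address multiplexing is the one defined in Chapter 4, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when `cur` falls in the DRAM page left open by `prev`, that is,
 * when all address bits above the column field are identical. */
static bool same_dram_page(uint32_t prev, uint32_t cur, unsigned col_bits)
{
    unsigned shift = 2u + col_bits;  /* byte-offset bits plus column bits */

    assert(shift < 32);
    return (prev >> shift) == (cur >> shift);
}
```

Under these assumptions, a 256Kx4 device with 9 column address bits gives a page of 512 words (2KB) per bank.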

Page read accesses take advantage of the previous transfer in that the $\overline{RAS}$ signal is already asserted and the DRAM already has accessed the appropriate row address. Page read accesses have timings similar to single read accesses, with the exception that no time is lost in re-asserting the $\overline{RAS}$ signal and re-multiplexing the row and column addresses.

Once the R3721 detects the start of a single read access from the R3051 and determines that it is within the current page, it outputs the column address to the DRAMs. On the following rising edge of $\overline{SysClk}$, the $\overline{CAS}$ signals are asserted in the fast chip-select mode. In the slow chip-select mode, the $\overline{CAS}$ signals are asserted on the second rising edge of $\overline{SysClk}$. The page read access is then terminated as for a single read access. Figure 7.6 illustrates the timing for a page read access for the settings of the mode register illustrated in Figure 7.1.



Figure 7.6 Page Read Access Timing Diagram

### Single Read Access Outside of Page

It may occur that the R3721 has left the DRAMs in page mode, but the subsequent access is outside of the current DRAM page. In this case, the transfer must be completed as a single read transaction. However, $\overline{RAS}$ must be pre-charged before the transfer begins.

Once the R3721 detects the start of a single read access from the R3051 and determines that it is not within the same page, it outputs the row address to the DRAMs. The RAS signal is negated on the second rising edge of SysClk. The RAS signal is kept high for the time specified in the mode register (a minimum of 2 clock cycles).

The access continues then as for a single read access:  $\overline{RAS}$  is asserted, the column address is presented, the  $\overline{CAS}$  signal asserted, and the response generated to the processor. The read access outside of page is then terminated as for a single read access. Figure 7.7 illustrates the timing diagrams for a read access outside of page for the settings of the mode register as illustrated in Figure 7.1.



Figure 7.7 Single Read Access Outside of Page

### SINGLE WRITE TRANSACTION TIMINGS

In the R3051 family, a significant percentage of the bus traffic is due to processor writes to memory. Unlike processor load instructions and instruction fetches, which are usually satisfied by the on-chip processor caches and thus not seen on the bus, all processor store instructions are seen at the bus interface as single write transactions. Note that there is no such thing as a "quad word" write; the R3051 performs a word or a subword write as a single autonomous bus transaction. However, the R3051 provides a WrNear signal to indicate that the present write has the same upper 22 address bits as the preceding write, and is used to optimally retire strings of write operations on the bus interface.

#### **Start of Write Access**

The R3721 determines the beginning of a single write access by monitoring the assertion of the ALE and the $\overline{Wr}$ signals from the R3051. The starting sequence for a single write access is very similar to the starting sequence of a single read access, which is illustrated in Figures 7.2 (a, b). The $\overline{WBank(3:0)}$ signals are asserted on the falling edge of $\overline{SysClk}$ after the detection of the $\overline{Wr}$ signal. In non-interleaved systems, all four $\overline{WBank}$ signals are identical copies; they are typically distributed one per bank to evenly reduce loading.

Figure 7.8 illustrates the starting sequence for a write access for the fast chip-select case.

#### Memory Control Signals for Single Write Accesses

The memory control signals sequence for single write accesses is very similar to the single read access sequence described earlier. Specifically, all of the discussion with respect to the timing of the  $\overline{RAS}$  and  $\overline{CAS}$  control signals and the row and column addresses are identical between reads and writes.

One difference between reads and writes arises in the case of partial word access. Partial writes must be handled specifically so that unaffected bytes within the word are not inadvertently written. The R3721 uses the  $\overline{CAS(3:0)}$  bus to provide individual byte enables to the DRAMs. These signals are derived from the  $\overline{BE(3:0)}$  outputs from the processor. During partial writes, only those bytes enabled by the processor have their corresponding DRAM enabled for writes, since only those DRAMs see  $\overline{CAS}$  asserted.

A final consideration in write activity is the availability of the data to the DRAMs. To eliminate the penalty typically associated with multiplexed busses, the R3051 drives valid data out one half cycle into the transfer. Thus, the write data is available early in the transfer, and the R3721 does not need to wait for the processor data.



Figure 7.8 Start of a Single Write Access for Fast Chip-select

#### End of a Single Write Access

Terminating a write access is different from terminating a read access, based on the bus interface of the CPU. On reads, the R3051 samples data one-half clock after it samples the control input asserted; on writes, the R3051 holds the write data one full cycle after it samples ACK. Thus, the DRAM controller can assert ACK early in the transfer, and be assured that it has a full cycle of valid data remaining.

In a write access, the R3721 asserts the  $\overline{ACK}$  signal half a clock cycle before the assertion of the  $\overline{CAS}$  signals. This means that the R3051 terminates the write access during the cycle in which the  $\overline{CAS}$  signals are asserted. This shortens the initial write latency by one clock cycle compared to the initial read latency. In this scheme, the data from the R3051 is guaranteed to be valid for one clock cycle after the assertion of  $\overline{CAS}$  since the R3051 doesn't negate the bus until one clock cycle after the detection of the  $\overline{ACK}$  signal. This clock cycle is longer than the required data hold time for most DRAMs. The R3721 also guarantees that the  $\overline{WBank}(3:0)$  and the  $\overline{DByteEn}(3:0)$  signals are valid and do not change for one clock cycle after the assertion of  $\overline{CAS}$ .

Depending on the encoding of the  $\overline{CAS}$  pulse width in the mode register, the  $\overline{CAS}$  signals are kept asserted for 1.5 or 2.5 clock cycles and negated on the falling edge of  $\overline{SysClk}$ . This means that  $\overline{CAS}$  will be kept asserted for 0.5 to 1.5 clock cycle into the next access. For systems which choose a  $\overline{CAS}$  low time of 2.5 cycles, there will be no penalty to a subsequent page read or write access, even though the  $\overline{CAS}$  signals are kept asserted into the next access.

Similar to a single read access, the R3721 assumes that any write access will be followed by another access (read, write or quad word read) within the same DRAM page. Based on this assumption, the $\overline{RAS}$ signal is kept asserted at the end of the single word write access to enable the page mode of the DRAMs. Figures 7.9 (a, b) illustrate the timing in ending a single write access for both values of the $\overline{CAS}$ pulse width.



Figure 7.9 (a) End of Single Write Access, CAS Pulse = 1.5 Clock Cycle



Figure 7.9 (b) End of Single Write Access, CAS Pulse = 2.5 Clock Cycle

Figure 7.10 illustrates the complete control timings involved in a single write access for the settings of the mode register as illustrated in Figure 7.1. By comparing Figure 7.10 with Figure 7.5, it can be noted that for the same system, the initial write latency is one clock cycle shorter than the initial read latency.







#### Page Write Accesses

The R3051 provides a WrNear signal to indicate that the present write has the same upper 22 address bits as the preceding write, a criterion compatible with virtually any DRAM. The R3721 has an internal page comparator that determines the true page size based on the DRAM size encoded in the mode register, thus optimizing for the particular memory system. Based on the internal page comparator, the R3721 can retire a page mode write in a minimum of 3 clock cycles, the same as for a page read access. However, the R3721 also uses the WrNear signal from the R3051; when WrNear is asserted, the R3721 can bypass its internal comparator and also the $\overline{CS}$ input detection, and retire the write access in the optimal time of 2 clock cycles.

The page write access takes advantage of the previous cycle in that the  $\overline{RAS}$  signal is already asserted. Page write accesses have similar timing to single write accesses, with the exception that no time is lost in re-asserting the  $\overline{RAS}$  signal and re-multiplexing the row and column addresses.

Once the R3721 detects the start of a single write access from the R3051 and determines that it is within the current DRAM page, it outputs the column address to the DRAMs. On the following rising edge of  $\overline{SysClk}$ , the  $\overline{CAS}$  signals are asserted in the fast chip-select mode. In the slow chip-select mode, the  $\overline{CAS}$  signals are asserted on the second rising edge of  $\overline{SysClk}$ . The page write access is then terminated as for a single write access.

The R3721 uses very specific rules to determine whether or not to bypass its internal page comparator by using the  $\overline{WrNear}$  signal from the R3051. All of the following conditions must be satisfied to achieve two cycle writes:

- settings in the mode register must be as follows:
  - fast chip select is enabled (DCS='0'),
  - $\overline{CAS}$  pre-charge = 0.5 clock cycle (CP='0'),
  - $\overline{CAS}$  pulse width = 1.5 clock cycle (C(1:0)='01')
  - WrNear is enabled (WrNr = '0')
- the previous access was a write access to the memory space controlled by the R3721 (CS was asserted during last transfer).

If both conditions are satisfied, the R3721 ignores the  $\overline{\text{CS}}$  input line and relies instead on the WrNear signal. If, at the detection of the write access, the WrNear signal is not asserted, the R3721 defaults to its standard mode of operation and retires a write in a minimum of 3 clock cycles (page mode) or longer. If the WrNear signal is asserted along with the Wr signal, the R3721 asserts the  $\overline{\text{ACK}}$  signal on the falling edge of  $\overline{\text{SysClk}}$  and retires the write in two clock cycles. This timing is illustrated in Figure 7.11 (a).
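The rules above can be summarized as a small decision function. The following Python sketch is illustrative only: the function name `min_page_write_cycles` and its argument names are hypothetical, and the bit encodings are the ones listed above. It returns the minimum number of clock cycles needed to retire a write that falls within the current DRAM page.

```python
def min_page_write_cycles(dcs, cp, c, wrnr, prev_write_to_r3721, wrnear_asserted):
    """Minimum clock cycles to retire an in-page write (illustrative model).

    dcs, cp and wrnr are single mode-register bits (0 or 1); c is the
    two-bit C(1:0) field. Encodings follow the list in the text.
    """
    mode_ok = (
        dcs == 0          # fast chip select enabled
        and cp == 0       # CAS pre-charge = 0.5 clock cycle
        and c == 0b01     # CAS pulse width = 1.5 clock cycles
        and wrnr == 0     # WrNear enabled
    )
    if mode_ok and prev_write_to_r3721 and wrnear_asserted:
        return 2          # page comparator and CS detection bypassed
    return 3              # standard page-mode write (minimum)
```

Under these assumptions, changing any single condition (for example, selecting slow chip-select with `dcs=1`) moves the result from 2 cycles to 3.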





Figure 7.11 (b) illustrates the timing diagrams for a page write where the  $\overline{CAS}$  pulse width is set for 1.5 clock cycles with slow chip-select and  $\overline{CAS}$  precharge = 0.5 clock cycles. In this case again, the internal page comparator is not bypassed and the write access is retired in 3 clock cycles.



Figure 7.11 (b) 3 Clock Cycles Page Write with Slow Chip-Select,  $\overline{CAS}$  Pulse Width = 1.5 Clock Cycles,  $\overline{CAS}$  Pre-charge = 0.5 Clock Cycles.

Figure 7.11(c) illustrates the timing diagrams for a page write where the  $\overline{CAS}$  pulse width is set for 2.5 clock cycles with slow chip-select and  $\overline{CAS}$  precharge = 0.5 clock cycles. In this case again, the internal page comparator is not bypassed since the system is not set-up for fast chip select, and the write access is retired in 3 clock cycles.

Note that when the  $\overline{CAS}$  pulse width is programmed as 2.5 cycles, slow  $\overline{CS}$  must be used. Otherwise, the DRAM write enable will be asserted too soon for the DRAMs, resulting in a spurious write cycle. This rule applies regardless of the  $\overline{CAS}$  pre-charge width selected.





Figure 7.11 (d) illustrates the timing diagrams for a page write where the  $\overline{CAS}$  pulse width is set for 1.5 clock cycles and  $\overline{CAS}$  pre-charge = 1.5 clock cycles. When  $\overline{CAS}$  pre-charge time is 1.5 clock cycles, back-to-back two cycle near writes are not possible, regardless of the setting of the DCS bit. In this case again, the internal page comparator is not bypassed and the write access is retired in 3 clock cycles.



Figure 7.11 (d) 3 Clock Cycles Page Write with CAS Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycles.

Figure 7.11 (e) illustrates the timing diagrams for a page write where the  $\overline{\text{WrNear}}$  signal is not issued and the R3721 relies on its internal page comparator to determine the page size. In this case again, the write access is retired in a minimum of 3 clock cycles. This timing will also be exhibited in systems which disable the  $\overline{\text{WrNear}}$  feature of the processor, via the mode register.







Figure 7.12 illustrates the timing diagrams for a page write access for the settings of the mode register as illustrated in Figure 7.1.

Figure 7.12 Page Write Access Timing

The concept of page write and page read applies for all cases: single reads followed by single writes or vice versa. Figure 7.13 illustrates the timing for a single read access followed by a single write access followed by a single read access, all within the same page and based on the settings of the mode register illustrated in Figure 7.1.



Figure 7.13 Single Read Followed by a Single Write Followed by a Single Read Access

### Single Write Access Outside of Page

Single write accesses outside of page are single write accesses from the R3051 that fall outside the DRAM page accessed by the previous single read or single write access. Such an access cannot take advantage of the previous cycle, and is therefore processed as a standard write. However, RAS must be pre-charged before the write is processed. The single write access outside of page has timing very similar to the single write access, with the exception that RAS is pre-charged before the row and column addresses are re-multiplexed.

Once the R3721 detects the start of a single write access from the R3051 and determines that it is not within the same page, it begins pre-charging  $\overline{RAS}$ . The  $\overline{RAS}$  signal is negated on the second rising edge of  $\overline{SysClk}$ . The  $\overline{RAS}$  signal is kept high for the time specified in the mode register, which is a minimum of 2 clock cycles. The access then continues as for a single write access: the  $\overline{RAS}$  signal is asserted, the column address is presented, the  $\overline{CAS}$  signals are asserted, and the access is terminated. Figure 7.14 illustrates the timing diagrams for a single write access outside of page for the settings of the mode register illustrated in Figure 7.1.



Figure 7.14 Single Write Access Outside of Page

#### **Partial Word Write Operation**

Partial word write accesses are standard write accesses from the R3051 with the exception that only selected bytes within a word are enabled. This information is provided by the  $\overline{BE(3:0)}$  signals from the R3051. The R3721 maps the  $\overline{BE(3:0)}$  signals from the R3051 directly into the  $\overline{CAS(3:0)}$  signals. For partial word accesses then, only the  $\overline{CAS}$  signals of the selected bytes will be asserted.
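As a minimal sketch of this direct mapping, the following Python model treats both signal groups as 4-bit active-low values (a 0 bit means asserted). The helper names `cas_from_be` and `asserted_lanes` are hypothetical and exist only for illustration.

```python
def cas_from_be(be_n):
    """Map the active-low BE(3:0) nibble directly onto CAS(3:0)."""
    return be_n & 0xF

def asserted_lanes(cas_n):
    """List the byte lanes whose active-low CAS bit is asserted (0)."""
    return [lane for lane in range(4) if not (cas_n >> lane) & 1]

# Half-word write on byte lanes 0 and 1: BE(3:0) = 1100 (active low).
print(asserted_lanes(cas_from_be(0b1100)))  # [0, 1]
```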

## **QUAD WORD READ TRANSACTION TIMINGS**

Quad word read operations are reads to the memory system in which the R3051 reads 4 contiguous words from memory, always starting on an even word boundary, and never crossing a DRAM page boundary. Quad word reads occur only in response to cache misses. All instruction cache misses are processed as quad word reads; data cache misses may be processed as quad word reads.

### Start of Quad Word Read Access

The start of a quad word read access is very similar to the start of a single read access and is described earlier. The only exception is that the  $\overline{\text{Burst}}$  signal from the R3051 is asserted at the same time as the  $\overline{\text{Rd}}$  signal.

### Memory Control Signals During Quad Word Read Accesses

The memory control signal sequence for a quad word read access is very similar to a single read access in the following way:

- The first word read from memory is treated exactly as a single read access and the timing has been specified earlier.
- To read the remaining 3 words from memory, the  $\overline{RAS}$  signal is kept asserted while the  $\overline{CAS}$  signals are toggled three extra times. After the first word is read, the  $\overline{CAS}$  signals are de-asserted on the falling edge of  $\overline{SysClk}$  (the same edge at which the R3051 samples the first word). The  $\overline{CAS}$  signals are then asserted on the rising edge of  $\overline{SysClk}$  after satisfying the  $\overline{CAS}$  precharge requirements encoded in the mode register. The  $\overline{CAS}$  signals are kept asserted for the time specified by the  $\overline{CAS}$  pulse width in the mode register. They are then de-asserted on the falling edge of  $\overline{SysClk}$ . This edge again corresponds to the edge at which the R3051 samples the next word. This process is repeated for the remaining three words.
- To enable the read buffer of the R3051, for every word available from the memory system the RdCEn is asserted for one clock cycle.

Figure 7.16 illustrates the control signals involved in the quad word read transactions.

### End of a Quad Word Read Access

To terminate a quad word read access, the memory system must return the  $\overline{ACK}$  signal back to the R3051. To take advantage of the R3051 instruction streaming and to ensure optimal performance, the  $\overline{ACK}$  signal must be asserted four clock cycles before the fourth word is sampled by the R3051. The R3721 makes internal calculations based on the settings of the mode register and always asserts the  $\overline{ACK}$  signal four clock cycles before the fourth word is ready.



Figure 7.15 (a) Quad Word Read Transaction Timing, CAS Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 0.5 Clock Cycle

At the end of quad word read, the R3721 always negates the RAS signal and exits the page mode of the DRAM. This feature has been incorporated because simulations have shown that a write outside of a page is the most likely transfer after a quad word read access. By doing this, the time lost to pre-charge the RAS signal is minimized for the next transaction. The RAS signal is always negated half a clock cycle after the negation of the CAS signals in any mode or configuration. Figure 7.15 (a) illustrates the timing of the control signals involved in a quad word read transaction for a CAS pulse width of 1.5 clock cycle and CAS pre-charge time of 0.5 clock cycles.

Figure 7.15 (b) illustrates the timing of the control signals involved in a quad word read transaction for a  $\overline{CAS}$  pulse width of 1.5 clock cycle and  $\overline{CAS}$  precharge time of 1.5 clock cycle.



Figure 7.15 (b) Quad Word Read Transaction Timing, CAS Pulse Width = 1.5 Clock Cycles, CAS Pre-charge = 1.5 Clock Cycle

In quad word read transactions, the rate at which the  $\overline{CAS}$  signals are toggled determine the speed at which the memory system will return the remaining words to the R3051. Figure 7.16 illustrates the complete control timings involved in a quad word read access for the settings of the mode register illustrated in Figure 7.1.



Figure 7.16 Quad Word Read Access Timing Diagrams

### Page Quad Word Read Accesses

Page quad word read accesses are quad word read accesses from the R3051 that fall within the same DRAM page as the previous single read or single write access.

The page quad word read access takes advantage of the previous cycle in that the  $\overline{RAS}$  signal is already asserted for the target row address. The page quad word read access has similar timing to the quad word read access, with the exception that no time is lost in re-asserting the  $\overline{RAS}$  signal and re-multiplexing the row and column addresses.

Once the R3721 detects the start of a quad word read access from the R3051 and determines that it is within the same page, it outputs the column address to the DRAMs. On the following rising edge of  $\overline{SysClk}$ , the  $\overline{CAS}$  signals are asserted in the fast chip-select mode. In the slow chip-select mode, the  $\overline{CAS}$  signals are asserted on the second rising edge of  $\overline{SysClk}$ . The access proceeds then as for a standard quad word read access. Figure 7.17 illustrates the timing diagrams for a page quad word read access for the settings of the mode register illustrated in Figure 7.1.



Figure 7.17 Page Quad Word Read Access Timing Diagrams



## APPLICATION EXAMPLE: A NON-INTERLEAVED TWO BANK MEMORY SYSTEM

**CHAPTER 8** 

## INTRODUCTION

This chapter is a detailed example on how to use the R3721 to interface a two bank, non-interleaved DRAM memory system to the R3051 RISController Family. It will describe the general system implementation and the connections between the R3721 and the rest of the system. It will also give a detailed explanation of how to set the mode register to adapt the R3721 to the application target. Finally, this chapter will summarize some of the timing diagrams involved in different types of accesses for the system presented in this chapter.

## **GENERAL SYSTEM DESCRIPTION**

In a typical system, the R3051 uses a 2x input clock for its internal operation and produces a 1x output clock  $\overline{SysClk}$  for use by the external system. Figure 8.1 illustrates a general purpose system based on the R3051. The system shown is a synchronous one where the memory controllers use the  $\overline{SysClk}$  to synchronize their operation to the R3051. The R3721 DRAM controller controls two 32-bit banks of non-interleaved DRAMs along with the data buffers (74FCT245 or the Bus Exchanger) that go with them. The rest of the system (EPROMs and I/O) is controlled by a separate external state machine, implemented in a couple of programmable logic devices, which is beyond the scope of this manual.



Figure 8.1 General System Using the R3051 and the R3721.

The R3721 DRAM Controller uses SysClk directly from the R3051, while the other memory subsystems may use a buffered version of SysClk to reduce the loading effect on the clock line. The R3721 connects directly to the multiplexed address/data bus of the R3051. The R3721 uses the ALE signal to latch the address of the current access. During writes to the internal mode register of the R3721, data is presented on the lower two bytes (A/D(15:0), regardless of endianness) of the multiplexed address/data bus and latched by the R3721 into its internal mode register. For the rest of the external system, standard latches such as IDT 74FCT373's demultiplex the R3051 address and data busses. The R3721 shares the control signals from/to the R3051 with the rest of the external system.

An address decoder PAL connects directly to the outputs of the address latches and provides the system with the required chip-select lines. The address decoder thus provides the R3721 DRAM Controller with the required  $\overline{CS}$  and  $\overline{MSel}$  enable lines. In this example, the R3721 controls two non-interleaved banks of 256K x 4 DRAMs that reside between addresses 0X0000\_0000 and 0X001F\_FFFF. The internal mode register of the R3721 resides in the I/O space, at address 0X0020\_0000. The address decoder PAL must generate the DRAM\_CS line for any access to the DRAM memory space and must issue both the DRAM\_CS and the  $\overline{MSel}$  lines for a write to the mode register. Figure 8.2 illustrates the address decoder PAL equations to produce the  $\overline{DRAM\_CS}$  and the  $\overline{MSel}$  lines.

| Output      |    | Product term                                            | Comment                  |
|-------------|----|---------------------------------------------------------|--------------------------|
| DRAM_CS NOT | =  | !LA31 AND !LA30 AND !LA29 AND !LA28 AND !LA22 AND !RD   | issue for reads          |
|             | OR | !LA31 AND !LA30 AND !LA29 AND !LA28 AND !LA22 AND !WR   | issue for writes         |
|             | OR | !LA31 AND !LA30 AND !LA29 AND !LA28 AND LA22 AND !WR;   | issue for access to MSel |
| MSEL NOT    | =  | !LA31 AND !LA30 AND !LA29 AND !LA28 AND LA22 AND !WR;   | issue for access to MSel |

LAxx is the latched address from the address latches 74FCT373's.

Figure 8.2 Address Decoder PAL Equations for DRAM\_CS and MSel.
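The decoder equations can be checked with a small software model. The Python sketch below is an illustration, not the PAL source: the function names `bit` and `decode` are hypothetical, both outputs are modeled as active low (0 = asserted), and the sketch assumes that the write term decodes the same LA31 to LA28 pattern (all zero) as the read term. It tests LA22 exactly as the equations do.

```python
def bit(addr, n):
    """Return bit n of a latched address."""
    return (addr >> n) & 1

def decode(addr, rd_n, wr_n):
    """Return (dram_cs_n, msel_n), both active low, for a latched address.

    rd_n and wr_n are the active-low Rd and Wr levels (0 = asserted).
    """
    # !LA31 AND !LA30 AND !LA29 AND !LA28
    low_region = not (bit(addr, 31) or bit(addr, 30)
                      or bit(addr, 29) or bit(addr, 28))
    la22 = bit(addr, 22)

    dram_read = low_region and not la22 and rd_n == 0
    dram_write = low_region and not la22 and wr_n == 0
    msel_write = low_region and la22 and wr_n == 0

    dram_cs_n = 0 if (dram_read or dram_write or msel_write) else 1
    msel_n = 0 if msel_write else 1
    return dram_cs_n, msel_n
```

For example, `decode(1 << 22, rd_n=1, wr_n=0)` returns `(0, 0)`: a write with LA22 set asserts both outputs, while a read of the same address asserts neither.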

#### **DETAILED DESCRIPTION OF THE R3721 CONNECTIONS**

In this example, the R3721 controls two banks of non-interleaved 256K x 4 DRAMs to obtain a maximum DRAM memory space of 2 MBytes. Each memory bank consists of 8 devices to interface to the R3051 32-bit data bus. Four standard data transceivers 74FCT245 in the data path isolate the DRAM banks from the R3051 multiplexed address/data bus. This will reduce the loading effect on the bus and prevent any contentions from occurring. Figure 8.3 illustrates the detailed connections among the various modules.
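The 2 MByte figure, and the number of byte-address lines it implies, follow from simple arithmetic. The short Python sketch below reproduces that calculation; the function name `dram_subsystem` and its parameters are hypothetical and are used only to make the arithmetic explicit.

```python
def dram_subsystem(banks, devices_per_bank, depth, width_bits):
    """Return (total_bytes, byte_address_lines) for a bank-organized array."""
    bank_bytes = devices_per_bank * depth * width_bits // 8
    total_bytes = banks * bank_bytes
    # Number of address bits needed to reach every byte in the array.
    address_lines = (total_bytes - 1).bit_length()
    return total_bytes, address_lines

# Two banks of eight 256K x 4 devices, as in this example.
total, lines = dram_subsystem(banks=2, devices_per_bank=8,
                              depth=256 * 1024, width_bits=4)
print(total, lines)  # 2097152 21
```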





The connections around the R3721 can be divided in several sections as follows:

- CPU connections:
  - The R3721 connects directly to the SysClk from the R3051 and synchronizes its internal operation to both edges of the clock.
  - The R3721 controls 2 MBytes of DRAMs and requires 21 address lines. In this case, the R3721 only needs to connect to A/D(20:0) from the R3051, and can have the other unused input lines tied to ground. However, it is good practice to connect all the A/D pins of the R3721 to the A/D pins of the R3051 (A/D(25:0)). This allows the system to be field upgradable to larger densities of DRAMs and/or more banks without modifications to the PCB.
  - Addr(3:2) from the R3051 are connected directly to Addr(3:2) on the R3721.
  - The ALE,  $\overline{\text{Rd}}$ ,  $\overline{\text{Wr}}$ , and  $\overline{\text{Burst/WrNear}}$  pins on the R3721 connect directly to the corresponding pins on the R3051.
  - ACK and RdCEn are pulled high internally and combined with similar signals from the rest of the system to form one set that is routed to the R3051.
  - The  $\overline{CS}$  and the  $\overline{MSel}$  pins are connected to the  $\overline{DRAM\_CS}$  and  $\overline{MSel}$  outputs of the address decoder PAL.

This set-up is appropriate for multiple banks of "x4" DRAMs, and for a single bank of "x1" DRAMs. Multiple banks of "x1" DRAMs require external buffers, as described later, but are directly analogous to this system.

- The connections for two banks of "x4" DRAMs should then be as follows:
  - $\overline{RAS0}$  is connected to all the  $\overline{RAS}$  input pins in bank 0 (8 devices).
  - RAS1 is connected to all the RAS input pins in bank 1 (8 devices).
  - $\overline{CAS(3:0)}$  will be directly mapped from the  $\overline{BE(3:0)}$  outputs from the R3051 and are connected to all the  $\overline{CAS}$  input pins in the corresponding byte lanes of both banks (4 devices per  $\overline{CAS}$ ).
  - RAS2 and RAS3 are not used.
  - $\overline{OE}$  is connected to all the  $\overline{OE}$  signals of all the DRAMs (16 devices)
  - WBank0 is connected to the WE input pins of all DRAMs in bank 0 (8 devices).
  - WBank1 is connected to the WE input pins of all DRAMs in bank 1 (8 devices).
  - WBank2 and WBank3 are not used
  - DAddr(8:0) will be connected to the 9 input address pins (A8:0) of the DRAMs in both banks (16 devices).

### Multiple Banks of "x1" DRAMs

For multiple banks of "x1" DRAMs, each bank consists of 32 devices for a 32-bit data bus. In such topologies, the number of DRAM devices could be as many as 128, which is much greater than the drive capacity of the output buffers of the R3721. The R3721 is designed to drive a maximum of 36 DRAM devices. In the case of very large loads, the use of external buffers or drivers is highly recommended. Note that the timing of  $\overline{CAS}$  is a particularly critical system parameter. The R3721 is defined for optimal performance when  $\overline{CAS}$  drives 8 devices or less; thus, when using multiple banks of "x1" devices,  $\overline{CAS}$  should be buffered to reduce loading. The drive capability of the RAS signals (RAS0 to RAS3) is specified for up to 36 devices, but since all the other output control signals will be buffered, the  $\overline{RAS}$  signals should also be buffered (to minimize timing skew).

For systems where only one or two banks of memory are used, the system designer should route the unused control outputs to unpopulated slots to allow for future field upgrades to denser DRAMs and/or extra memory banks. These signals include the unused RAS and WBank signals, and the high-order DAddr lines.

In Chapter 7, it was shown that care must be taken to ensure that no spurious writes occur during page mode writes. Specifically, note that the  $\overline{\text{WBank}}$  signal could be asserted on the same clock edge used to negate the  $\overline{\text{CAS}}$  signals. Most DRAMs require a "trch" of 0 ns. In most systems, this is usually guaranteed, since the  $\overline{\text{WBank}}$  line drives a larger load than  $\overline{\text{CAS}}$ . However, the system designer could choose more design margin by buffering  $\overline{\text{WBank}}$ .

It is also possible to use multiple R3721 DRAM controllers in systems with very deep memory requirements. In these systems, each R3721 can control up to 64 MBytes. The selection among the various R3721 sub-systems is performed using the  $\overline{CS}$  inputs from the system address decoder. The use of multiple R3721's can also serve to reduce the loading effect on the output control signals, and thus reduce memory latency. In a system with multiple R3721 subsystems, the address decoder would typically arrange the DRAM to appear contiguous in memory (starting at physical address "0"), while the mode registers may be scattered throughout the I/O space.

#### • Data Path connections:

In this example of a non-interleaved configuration, 74FCT245 data transceivers are used as data buffers and the connections are as follows:

- $T/\overline{R}$  is connected to the  $T/\overline{R}$  pins of all 4 transceivers (4 devices).
- DByteEn(3:0) are connected to the output enable (OE) input pins of the transceiver of the corresponding byte lane (1 device per DByteEn).
- Path and YZLEn are not used.

If the IDT73720 Bus Exchangers were used in the data path, the connections should be as follows:

- $T/\overline{R}$  is connected to the  $T/\overline{R}$  pins of both IDT73720 Bus Exchangers (2 devices).
- DByteEn(3:0) are connected to the output enable (OE) pin of the half of the Bus Exchanger of the corresponding byte lane (1 load each).
- Path will connect to the Path input pin of both Bus Exchangers (2 devices).
- YZLEn will connect to both the LEYX and LEZX pins of each Bus Exchanger (total of 4 loads).

Finally, note that it is recommended that pull-up or pull-down resistors be used on the data lines. This will reduce power consumption during partial word accesses and idle cycles, when the bus is not being actively driven.

## SETTING THE MODE REGISTER

In order to obtain the best performance of the R3721 DRAM Controller, the internal mode register must be programmed with the appropriate values tailored to the application at hand. In the example used in this chapter, the system is assumed to be a non-interleaved memory system running at 20 MHz using  $256K \times 4$  DRAMs with 100 ns of access time ("trac" = 100 ns). In order to determine proper values for the mode register, the system designer must consider the AC characteristics of the R3051, the R3721 and the data buffers (IDT73720's or IDT74FCT245's). In addition, the system designer must calculate the derating effect due to capacitive loading on the signal traces.

### Derating Effect Due to Capacitive Loading

Capacitive loading from the devices, the capacitance of the traces on the PC board, and the propagation delay of the signal travelling across the board all add delay to the signals. These factors are collectively known as derating factors. Derating factors are arrived at by making approximate calculations of the capacitance. The capacitance obtained is compared with the rated drive capability of the IC component. The effect of the additional capacitance on the timing is then computed based on data sheet deratings:

- 1. The typical derating factor of the output driver for standard logic devices is 1ns/50pF.
- 2. The derating factor of the output driver for the CPU's is 1ns/25pF.
- 3. The traces typically have a capacitance of 2pF/inch.
- 4. The signal propagates with a delay of about 0.2ns/inch on an FR4 substrate.

The system designer should consider the derating effects described above and should use these or other values appropriate to the specific design in question in order to calculate the worst case interface timing.

The derating delay due to capacitive loading, tdr, should be computed as follows:

$$
t_{dr} = (\text{trace length in inches} \times 0.2\ \text{ns/inch}) + \big[(\text{number of loads} \times \text{input capacitance per load}) - (\text{rated capacitive load of the output driver})\big] \times (\text{derating factor of the output driver})
$$

tdr = derating delay = \_\_ ns
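As a worked illustration of the tdr calculation, the Python sketch below evaluates the formula term by term. The function name `derating_delay_ns` and its parameter names are hypothetical, and the numbers in the usage line (a 5 inch trace, 8 loads of 7 pF each, a 50 pF rated load) are assumed values chosen only to exercise the formula; they are not taken from any data sheet.

```python
def derating_delay_ns(trace_in, loads, cap_per_load_pf, rated_load_pf,
                      derate_ns_per_pf, prop_ns_per_in=0.2):
    """Derating delay tdr in ns, following the formula in the text.

    derate_ns_per_pf is the driver derating factor, e.g. 1 ns / 50 pF
    for standard logic or 1 ns / 25 pF for the CPU outputs.
    """
    flight_ns = trace_in * prop_ns_per_in
    excess_pf = loads * cap_per_load_pf - rated_load_pf
    return flight_ns + excess_pf * derate_ns_per_pf

# Assumed example: standard logic driver (1 ns per 50 pF).
tdr = derating_delay_ns(trace_in=5, loads=8, cap_per_load_pf=7,
                        rated_load_pf=50, derate_ns_per_pf=1 / 50)
print(round(tdr, 2))
```

The sketch follows the formula as written, so a load below the rated capacitance yields a negative capacitive term; a designer may prefer to treat that case as zero.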

In addition, the system designer must consider the variations in time between the R3051 output clock high time and output clock low time. These clock variations, tvr, are expressed in the R3051 data sheet by the t32 and t33 parameters and are equal to:

tvr =  $\pm 2$  ns at 25 MHz and less

tvr =  $\pm 1$  ns at more than 25 MHz.

Obviously, this effect only needs to be considered for events which occur at half-clock cycle intervals; the R3051 guarantees that the period of  $\overline{SysClk}$  will be regular rising edge to rising edge or falling edge to falling edge.

The analysis to set the mode register should then be as follows:

### • DRAM Page Size field (DZ1:0):

The system designer should set this field depending on the size of the DRAMs used in the external system (from  $256K \times 1 \text{ to } 4M \times 4$ ). In the case of this example, the DRAM size used is  $256K \times 4$  and the DZ1:0 bits are set to "0 0".

### • External memory configuration (Inlvd):

The system designer has the choice between interleaved and noninterleaved configurations and the types of data buffers used for the noninterleaved configurations. For this example, 74FCT245 transceivers are used in a non-interleaved system and the Inlvd bit is set to "0".

### • WrNear :

The WrNr bit in the mode register can be used to force the DRAM controller to ignore the processor  $\overline{WrNear}$  output signal during write accesses. This feature is important for interleaved systems using DRAM SIMM modules, where the  $\overline{OE}$  of the DRAM banks are grounded. In these systems, a write to one array will cause the other array to be read; to avoid bus contention in consecutive writes, the WrNr bit forces near writes to be retired in three cycles (rather than two), thus allowing time for bus contention to be avoided. In this system, this is not a problem; the WrNr field is set to '0' to allow fast writes.

### • $\overline{RAS}$ to $\overline{CAS}$ delay (RCD):

The RAS to CAS delay is the delay in clock cycles from the assertion of a RAS signal to the assertion of the corresponding CAS signal(s). It is expressed in clock cycles. This parameter is defined from the "trcd" parameter found in the DRAM data sheets. As stated in DRAM data sheets, "trcd" is important during read accesses. If the actual RAS to CAS delay is less than the max "trcd" specified, then the access is controlled by the RAS strobe. On the other hand if the actual RAS to CAS delay is greater than the max "trcd" specified, then the access is controlled by the CAS strobe. Similarly, there are two criteria to consider in deciding on the settings of this bit.

- The row address hold time "trah" specified in the DRAM data sheet determines how long the row address must be held constant after the assertion of the RAS signal. This parameter is usually around 10 to 15ns. If the RCD bit is set to "0", DAddr will switch from the row address to the column address half a cycle after the assertion of the RAS signal. At 20MHz, this is equivalent to 25ns. If the RAS signal is heavily loaded, violation of this parameter could occur. In that case, setting RCD to "1" would be a more prudent choice.

- During single read accesses, or for the initial latency of quad word read accesses, if the actual  $\overline{RAS}$  to  $\overline{CAS}$  delay is less than the max "trcd" specified, then the first word access is controlled by the  $\overline{RAS}$  strobe. The system designer must make sure, in that case, that the data will be valid when the R3051 samples it. During read accesses, the R3051 samples the data on the same edge at which the  $\overline{CAS}$  signals are negated. The system designer should proceed with the following analysis for RCD set to "0" as shown in Figure 8.4:

| Term | Description                                    | Value                      |
|------|------------------------------------------------|----------------------------|
| tx1  | $\overline{RAS}$ to $\overline{CAS}$ delay     | 1 clock cycle minimum      |
| tx2  | $\overline{CAS}$ pulse width                   | + 1.5 clock cycles minimum |
| tx3  | total available                                | = 2.5 clock cycles         |

| Term | Description                                                 | Value       |
|------|-------------------------------------------------------------|-------------|
| tx5  | access time from $\overline{RAS}$ ("trac" max, DRAM d/s)    | \_\_ ns     |
| tx6  | delay through data buffers (max, '245 d/s)                  | + \_\_ ns   |
| tx7  | data setup time for R3051 (t2 max, R3051)                   | + \_\_ ns   |
| tx8  | max capacitive derating effect (tdr max)                    | + \_\_ ns   |
| tx9  | maximum time to obtain data                                 | = \_\_ ns   |

for a valid system, tx9 should be less than tx4.

In this example, RCD has been set to "0". This corresponds to one clock cycle from  $\overline{\text{RAS}}$  to  $\overline{\text{CAS}}$ .



Figure 8.4 Analysis to Set RCD in the Mode Register

### • RAS Timing (R2:0)

The  $\overline{\text{RAS}}$  timing field encodes the  $\overline{\text{RAS}}$  pulse width as well as the  $\overline{\text{RAS}}$  precharge time. The system designer must set these three bits such that the specified  $\overline{\text{RAS}}$  pulse width "tras" in the DRAM data sheets and the specified  $\overline{\text{RAS}}$ pre-charge time "trp" are not violated. In this example,  $\overline{\text{RAS}}$  pulse width is set to 3 clock cycles which is 150 ns, and is longer than the required 100 ns. The  $\overline{\text{RAS}}$  pre-charge time is set to 2 clock cycles which is 100 ns and longer than the required 70 or 80 ns. R2:0 are then set to "0 0 1".
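The check described above is a simple comparison of a cycle count, converted to nanoseconds, against a data sheet minimum. The Python sketch below repeats the arithmetic for this example; the function name `cycles_to_ns` and the variable names are hypothetical, and the 100 ns and 80 ns minimums are the example figures quoted above.

```python
def cycles_to_ns(cycles, clock_mhz):
    """Convert a number of clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz

CLOCK_MHZ = 20  # system clock used in this chapter's example

ras_pulse_ns = cycles_to_ns(3, CLOCK_MHZ)      # programmed RAS pulse width
ras_precharge_ns = cycles_to_ns(2, CLOCK_MHZ)  # programmed RAS pre-charge

print(ras_pulse_ns, ras_pulse_ns >= 100)        # 150.0 True
print(ras_precharge_ns, ras_precharge_ns >= 80)  # 100.0 True
```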

### • CAS pulse width (C0):

The R3721 is designed in such a way that during read accesses, the  $\overline{CAS}$  signals are negated by the same edge at which the R3051 samples data. Further, during most read accesses (single read or quad word reads) the data path is assumed to be set to pass the desired data to the CPU (outputs of data buffers enabled, data buffers in the receive mode, and the latches transparent). This means that from the  $\overline{CAS}$  strobe (or the  $\overline{RAS}$  strobe) the data coming out of the DRAMs passes through the data buffers directly to the R3051. Except for interleaved quad word read accesses, no latching of the data takes place. The system designer must ensure that the  $\overline{CAS}$  pulse width is long enough for the data to come out of the DRAMs, pass through the data buffers and meet the data setup time of the R3051.

The system designer should proceed with the following analysis illustrated in figure 8.5:

- ty1 =  $\overline{CAS}$  pulse width = 1.5 or 2.5 clock cycles
- ty1' = [ $\overline{CAS}$  pulse width \* (1/frequency of operation)] - tvr

The time needed for the data to be present at the input of the R3051 is:

| Term | Description                                              | Value     |
|------|----------------------------------------------------------|-----------|
| ty2  | SysClk to CAS low (t1 max, R3721)                        | \_\_ ns   |
| ty3  | access time from $\overline{CAS}$ ("tcac" max, DRAM d/s) | + \_\_ ns |
| ty4  | delay through the data buffer (max, '245 d/s)            | + \_\_ ns |
| ty5  | R3051 data input setup time (t1a max, R3051)             | + \_\_ ns |
| ty6  | max capacitive derating effect (tdr max)                 | + \_\_ ns |
| ty7  | max time for data to be ready                            | = \_\_ ns |

for proper operation, ty7 must be less than ty1'.

For the example in this chapter, the  $\overline{CAS}$  pulse width is set to 1.5 clock cycles, then:

- ty1 = 1.5 clock cycles
- ty1' = [1.5 \* 50 ns] - 2 ns = 73 ns
- ty2 = 8 ns
- ty3 = 25 ns
- ty4 = 7 ns
- ty5 = 6 ns
- ty6 = 5 ns (estimate)
- ty7 = 51 ns

ty7 is less than ty1' and the system should run properly.
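The budget above lends itself to a small calculation. The following Python sketch reproduces the arithmetic of the example; the function name and dictionary keys are illustrative only, and the delay values are the example figures from this chapter rather than data sheet limits for any particular device.

```python
def cas_read_budget(period_ns, cas_cycles, tvr_ns, delays_ns):
    """Compare the time available (ty1') with the time required (ty7).

    delays_ns maps each contribution (ty2..ty6) to its worst-case value in ns.
    Returns (available_ns, required_ns, ok).
    """
    available_ns = cas_cycles * period_ns - tvr_ns   # ty1'
    required_ns = sum(delays_ns.values())            # ty7
    return available_ns, required_ns, required_ns < available_ns


example = {
    "ty2": 8,   # SysClk to CAS low
    "ty3": 25,  # DRAM access time from CAS (tcac)
    "ty4": 7,   # data buffer delay
    "ty5": 6,   # R3051 data input setup time
    "ty6": 5,   # capacitive derating (estimate)
}
print(cas_read_budget(period_ns=50, cas_cycles=1.5, tvr_ns=2, delays_ns=example))
```

Re-running the function with the 2.5 clock cycle setting, or with the figures of a slower DRAM, shows quickly whether the longer $\overline{CAS}$ pulse width is needed.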





#### • CAS pre-charge time (CP):

Most DRAMs require a $\overline{CAS}$ pre-charge time of about 10 ns, which the half clock cycle setting satisfies (25 ns at 20 MHz). This setting is appropriate for most medium speed applications. However, the $\overline{CAS}$ pre-charge time is important during page mode operation of the DRAMs. There are two criteria to consider in setting this bit:

- During page read operations, $\overline{CAS}$ is pre-charged and then re-asserted to enable the next word from the DRAMs, as illustrated in figure 8.6 (a). In such situations, the next word to be read from the DRAMs will be available after a delay corresponding to:
  - access time from address "taa" or
  - access time from  $\overline{CAS}$  "tcac" or
  - access time from CAS pre-charge "tacp"

whichever is longer (as per DRAM data sheet). The system designer must then take into consideration the access from the  $\overline{CAS}$  pre-charge time. The analysis for the access from the assertion of the  $\overline{CAS}$  is the same as for the  $\overline{CAS}$  pulse width analysis in figure 8.5. The analysis for the  $\overline{CAS}$  pre-charge time is as follows:

tz1= time from  $\overline{CAS}$  pre-charge to when the next data word must be available. This time equals the sum of the  $\overline{CAS}$  pulse width and the  $\overline{CAS}$  precharge times with a minimum of 2 clock cycles and a maximum of 4 clock cycles.

tz1 of 3 or 4 clock cycles is irrelevant since the access will then completely be determined by the  $\overline{CAS}$  pulse width. This analysis will concentrate on the 2 clock cycle tz1 where the  $\overline{CAS}$  pre-charge time is 0.5 clock cycles and the  $\overline{CAS}$  pulse width is 1.5 clock cycles.

tz1' = time for next data element = 2 clock cycles \* (1/frequency of operation)

| Term | Contribution                                                          | Value |
|------|-----------------------------------------------------------------------|-------|
| tz2  | SysClk to $\overline{CAS}$ pre-charge (t1a max, R3721)                | ns    |
| tz3  | access time from $\overline{CAS}$ pre-charge ("tacp" max, DRAM d/s)   | ns    |
| tz4  | delay through the data buffer (max, '245 d/s)                         | ns    |
| tz5  | R3051 data input setup time (t1a max, R3051)                          | ns    |
| tz6  | max capacitive derating effect (tdr max)                              | ns    |
| tz7  | max time for data to be ready (tz2 + tz3 + tz4 + tz5 + tz6)           | ns    |

For proper operation, tz7 must be less than tz1'.

For the example of this chapter, the  $\overline{CAS}$  pulse width is set to 1.5 clock cycles and the  $\overline{CAS}$  pre-charge time is set to 0.5 clock cycles, then

- tz1 = 2.0 clock cycles
- tz1' = 2.0 \* 50 ns = 100 ns
- tz2 = 8 ns
- tz3 = 55 ns
- tz4 = 7 ns
- tz5 = 6 ns
- tz6 = 5 ns (estimate)
- tz7 = 81 ns

tz7 is less than tz1' and the system should run properly.



Figure 8.6 (a) CAS Pre-charge Time Analysis

- During page write accesses where the $\overline{CAS}$ pulse width is 1.5 clock cycles and the $\overline{CAS}$ pre-charge time is set to 0.5 clock cycles, the system designer must ensure that the data is available at the DRAM inputs before the $\overline{CAS}$ strobes are asserted. That is, the delay of the data from the R3051 through the data buffers, plus the DRAM data setup time, must be less than one half clock cycle (tw6). This timing analysis is illustrated in figure 8.6 (b) and is as follows:

tw6 = one half clock cycle - tvr

| Term | Contribution                                   | Value |
|------|------------------------------------------------|-------|
| tw1  | SysClk to data from the R3051 (t19 max, R3051) | ns    |
| tw2  | delay through the data buffer (max, '245 d/s)  | ns    |
| tw3  | DRAM data setup time ("tds" min, DRAM d/s)     | ns    |
| tw4  | max capacitive derating effect (tdr max)       | ns    |
| tw5  | max time for data to be ready (tw1 + tw2 + tw3 + tw4) | ns |

This time (tw5) must be less than tw6 for proper operation. For the example of this chapter, the $\overline{CAS}$ pulse width is set to 1.5 clock cycles and the $\overline{CAS}$ pre-charge time is set to 0.5 clock cycles, then

- tw1 = 10 ns
- tw2 = 7 ns
- tw3 = 0 ns
- tw4 = 5 ns (estimate)
- tw5 = 22 ns

tw5 is less than tw6, which is 25 - 2 = 23 ns, and the system should run properly. For well laid-out systems the derating factor could be reduced to 3 or 4 ns, providing more margin for the DRAM data setup time.



Figure 8.6 (b) CAS Pre-charge Time Analysis During Writes
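To see how sensitive the page write timing is to the derating estimate, the margin can be tabulated for a few values. The Python sketch below assumes the example figures of this chapter (50 ns clock, tvr of 2 ns, tw1 = 10 ns, tw2 = 7 ns, tw3 = 0 ns); the function name is hypothetical.

```python
def page_write_margin(period_ns, tvr_ns, tw1_ns, tw2_ns, tw3_ns, derating_ns):
    """Return the setup margin tw6 - tw5 in ns; it must be positive."""
    tw6 = 0.5 * period_ns - tvr_ns                  # time available
    tw5 = tw1_ns + tw2_ns + tw3_ns + derating_ns    # time required
    return tw6 - tw5


# Margin for derating estimates of 5, 4 and 3 ns.
for derating in (5, 4, 3):
    margin = page_write_margin(50, 2, 10, 7, 0, derating)
    print(f"derating {derating} ns -> margin {margin} ns")
```

The margin is only about 1 ns with the 5 ns estimate, which is why a careful layout matters for this setting.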

#### • Refresh Period (RF2:0):

The refresh period must be set according to the frequency of operation. In this example, the RF2:0 bits are set for 20 MHz operation at "1 0 0".

#### • Delayed Chip Select (DCS):

The delayed chip select bit must be set if the external address decoder is not fast enough to meet the fast chip select requirements; that is, if the external decoder cannot provide chip select within the first clock cycle of the access.

The delayed chip select feature can also be set to slow down the page write accesses. The main reason for this is demonstrated in figure 8.6 (b). Slowing down the page writes is appropriate when the delay through the data buffer is such that the data is not available to the DRAMs within half a clock cycle. In this case, setting the DCS bit will slow the page write operation as demonstrated in figure 7.12 (c). The slow $\overline{CS}$ mode must be enabled in systems using a 2.5 clock cycle $\overline{CAS}$ pulse width, to ensure proper write operation. It will also have the effect of adding an extra clock cycle for every access.

In this example, the data can be available to the DRAMs within the half clock cycle (25 ns), and thus the DCS bit is cleared.

Figure 8.7 illustrates the settings of the mode register used for this system.

| 15   | 14  | 13  | 12  | 11  | 10  | 9    | 8  | 7  | 6  | 5  | 4   | 3    | 2     | 1   | 0   |
|------|-----|-----|-----|-----|-----|------|----|----|----|----|-----|------|-------|-----|-----|
| Rsvd | DCS | RF2 | RF1 | RF0 | CP  | Rsvd | C0 | R2 | R1 | R0 | RCD | WrNr | Inlvd | DZ1 | DZ0 |
| 0    | 0   | 1   | 0   | 0   | 0   | 0    | 1  | 0  | 0  | 1  | 0   | 0    | 0     | 0   | 0   |
| D15  | D14 | D13 | D12 | D11 | D10 | D9   | D8 | D7 | D6 | D5 | D4  | D3   | D2    | D1  | D0  |

Figure 8.7 Mode Register Settings for a Two Bank Non-interleaved System
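For software that programs the controller, the bit layout of Figure 8.7 can be captured in a small packing routine. The Python sketch below is an illustration only: the field names are shortened placeholders, the bit positions are read off Figure 8.7, and bit 2 is assumed to be the interleave select because it is 0 in this example and 1 in the interleaved example of Chapter 9. Chapter 5 remains the reference for the actual encodings.

```python
# (name, least significant bit, width in bits), read off Figure 8.7.
# Bits 15 and 9 are reserved and left at 0.
FIELDS = [
    ("DZ", 0, 2),      # DRAM size, DZ1:0
    ("INLVD", 2, 1),   # assumed: interleave select
    ("WRNR", 3, 1),    # WrNear control
    ("RCD", 4, 1),     # RAS to CAS delay
    ("R", 5, 3),       # RAS timing, R2:0
    ("C0", 8, 1),      # CAS pulse width
    ("CP", 10, 1),     # CAS pre-charge time
    ("RF", 11, 3),     # refresh period, RF2:0
    ("DCS", 14, 1),    # delayed chip select
]


def pack_mode_register(**values):
    """Pack named field values into a 16-bit mode register word."""
    known = {name for name, _, _ in FIELDS}
    unknown = set(values) - known
    if unknown:
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    word = 0
    for name, lsb, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word |= value << lsb
    return word


# Settings of Figure 8.7: RF2:0 = 100, C0 = 1, R2:0 = 001, all others 0.
word = pack_mode_register(RF=0b100, C0=1, R=0b001)
print(f"{word:016b}  0x{word:04X}")
```

Keeping the layout in one table makes it straightforward to check a register value against the figure bit by bit.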

### SYSTEM TIMING DIAGRAMS

In general, Chapter 7 illustrated all of the relevant protocol for noninterleaved systems. Figures 8.8 and 8.9 are provided to show the exact signalling of RAS, which is used to select a particular bank during a particular access.

Figure 8.8 illustrates the timing diagrams involved in a single read access to bank 1 starting from an idle,  $\overline{RAS}$  asserted state. Bank 1 is selected by only asserting the  $\overline{RAS}(1)$  signal. Note that in this drawing, a  $\overline{RAS}$  precharge is required prior to the  $\overline{RAS}(1)$  pulse.



Figure 8.8 Single Read Access to Bank(1)

Figure 8.9 illustrates the timing diagrams involved in a single write access to bank 0 starting from an idle, $\overline{RAS}$ asserted state. Again, note that the current $\overline{RAS}$ is for bank 1; thus, a $\overline{RAS}$ precharge cycle is required.



Figure 8.9 Single Write Access to Bank(0)

**CHAPTER 9** 



# THE USE OF THE R3721 IN AN INTERLEAVED MEMORY SYSTEM

# INTRODUCTION

This chapter describes the use of the R3721 DRAM controller in an interleaved memory system. Included in this chapter is a discussion of:

- The effect of various configurations of the mode register on the timings of the output signals.
- A description of an interleaved DRAM memory system connected to the R3051.
- A detailed illustration of the timing diagrams involved in the various processor memory transactions.

### **INTERLEAVED SYSTEM DESIGN**

An interleaved memory system consists of 1 or 2 bank-pairs of "x1" or "x4" DRAMs connected with the R3051 by the R3721 DRAM controller. Each memory bank-pair consists of an even half (32-bit bank) and an odd half (32-bit bank). Even halves and odd halves are determined by a low order address bit (the Addr2 bit), while the bank-pairs are selected by a high order address bit. Each half bank-pair in the interleaved configuration is analogous to a bank in the non-interleaved configurations. The R3721 uses the DRAM size information encoded in the mode register and selects the appropriate memory devices individually by decoding the address bits from the R3051 address/data bus. In the interleaved configuration, each RAS controls a single half-bank (i.e. $\overline{RAS0}$ controls the even half of bank 0, $\overline{RAS1}$ controls the odd half of bank 0, $\overline{RAS2}$ controls the even half of bank 1, and $\overline{RAS3}$ controls the odd half of bank 1). In the interleaved configuration, the R3721 directly controls the IDT 73720 Bus Exchangers in the data path.

The primary benefit of an interleaved memory configuration occurs in quad word reads. Interleaved memory does not reduce the latency of DRAM access, and thus does not benefit single transactions (single reads, single writes, page reads and page writes). In multiple word accesses, however, interleaved memory obtains higher bandwidth from the DRAM devices, and thus dramatically improves the performance of these accesses.

For ease of discussion, all timing diagrams illustrated in this chapter assume the settings of the mode register as shown in Figure 9.1. These settings correspond to an interleaved system with the following characteristics:

- 1M x 4 DRAMs.
- RAS pulse width is 3 clock cycles, RAS pre-charge time is 2 clock cycles.
- CAS pulse width is 1.5 clock cycles.
- CAS pre-charge time is 0.5 clock cycles.
- 2 clock cycles delay from $\overline{RAS}$ to $\overline{CAS}$.
- Fast chip-select mode.
- WrNear ignored.

| 15   | 14  | 13  | 12  | 11  | 10  | 9    | 8  | 7  | 6  | 5  | 4   | 3    | 2     | 1   | 0   |
|------|-----|-----|-----|-----|-----|------|----|----|----|----|-----|------|-------|-----|-----|
| Rsvd | DCS | RF2 | RF1 | RF0 | CP  | Rsvd | C0 | R2 | R1 | R0 | RCD | WrNr | Inlvd | DZ1 | DZ0 |
| 0    | 0   | 1   | 0   | 0   | 0   | 0    | 1  | 0  | 0  | 1  | 1   | 1    | 1     | 1   | 0   |
| D15  | D14 | D13 | D12 | D11 | D10 | D9   | D8 | D7 | D6 | D5 | D4  | D3   | D2    | D1  | D0  |

Figure 9.1 Settings of the Mode Register Used as an Example in this Chapter for Interleaved Memory Systems

The timing diagrams illustrated in this chapter apply for single bank-pair or two bank-pair interleaved systems. For accesses in the even array of a bank-pair, the Path signal is always high; for accesses in the odd array of a bank-pair, the Path signal is low. $\overline{RAS(even)}$ is either $\overline{RAS0}$ or $\overline{RAS2}$ while $\overline{RAS(odd)}$ is either $\overline{RAS1}$ or $\overline{RAS3}$. All four $\overline{CAS}$ signals are connected to every array in the system, on a byte-lane basis.
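The mapping between an access and the controlling $\overline{RAS}$ line and Path level can be summarized in a short sketch. The Python function below is hypothetical and assumes that Addr2 = 0 selects the even array; the bank-pair index is taken as an input because the high order address bit that selects it depends on the DRAM size.

```python
def select_half_bank(bank_pair, addr2):
    """Return (ras_index, path_level) for an access in an interleaved system.

    bank_pair: 0 or 1, decoded from a high order address bit.
    addr2:     0 for the even array, 1 for the odd array (assumption).
    """
    if bank_pair not in (0, 1) or addr2 not in (0, 1):
        raise ValueError("bank_pair and addr2 must each be 0 or 1")
    ras_index = 2 * bank_pair + addr2   # RAS0/RAS2 are even, RAS1/RAS3 are odd
    path_level = 1 if addr2 == 0 else 0  # Path high for even, low for odd
    return ras_index, path_level


for pair in (0, 1):
    for a2 in (0, 1):
        print(pair, a2, select_half_bank(pair, a2))
```

The index returned here identifies the $\overline{RAS}$ line that owns the addressed array; it does not model which $\overline{RAS}$ lines the controller actually asserts during a given transaction.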

### SINGLE READ TRANSACTION TIMINGS

In general, there are only two types of read transactions from the R3051: quad word reads and single datum reads. Quad word reads occur only in response to cache misses. All instruction cache misses are processed as quad word reads; data cache misses may be processed as quad word reads or single word reads. Uncached references are always processed as single word reads. This section describes the timing diagrams involved in single word reads; a later section in this chapter describes the quad word read operations.

#### Start of Single Read Access

The R3721 determines the beginning of a single read access by monitoring the assertion of the ALE and the $\overline{Rd}$ signals from the R3051. The R3721 multiplexes the input address from the R3051 according to the DRAM configuration and outputs the row address on the DAddr bus. If the fast chip-select mode is selected (DCS bit in mode register = 0), the $\overline{CS}$ input must be valid before the following rising edge of SysClk for the R3721 to respond to the access; otherwise the R3721 assumes the access to be outside of the memory space it controls and does not assert any DRAM control signals. In the slow chip-select mode, the $\overline{CS}$ input must be valid by the following falling edge of $\overline{SysClk}$. Chapter 7 illustrates the start of a single read. For an interleaved system, the R3721 will assert both the even and the odd RAS signals for any single read access. The level of the Path signal will direct the proper word from the indicated memory array to the R3051. This is important, since after the single read access, the page mode of the DRAMs is enabled and the following access could be in the even half-bank-pair or the odd half-bank-pair. By asserting both RAS control signals, the page mode of both halves of the DRAM bank-pair is enabled for subsequent page mode accesses.

### Memory Control Signals for Single Read Accesses

After the detection of the  $\overline{CS}$  signal, the R3721 starts to issue the various control signals to the DRAMs in the following way:

- On the rising edge of SysClk following CS, the appropriate RAS(even) and RAS(odd) signals are issued (RAS0 and RAS1 for access to bank-pair 0, RAS2 and RAS3 for access to bank-pair 1). The ACK and RdCEn outputs are enabled and driven to a level "high".
- Depending on the value of the RCD bit in the mode register, the R3721 can proceed in two different ways:

If RCD=0, the column address is presented on the DAddr bus on the falling edge of $\overline{SysClk}$ following the assertion of the $\overline{RAS}$ signals. The appropriate $\overline{CAS(3:0)}$ signals are asserted on the next rising edge of $\overline{SysClk}$ ($\overline{CAS0}$ for access to D(7:0), etc). The Path signal is set to 1 for accesses to the even half and set to 0 for accesses to the odd half. In the interleaved configurations, accesses to the even or odd halves are determined by the Addr2 bit from the R3051.

If RCD=1, the column address is presented on the DAddr bus on the falling edge of  $\overline{SysClk}$ , 1.5 clock cycles following the assertion of the  $\overline{RAS}$  signals. The  $\overline{CAS}$  signals are asserted on the following rising edge of  $\overline{SysClk}$ . The Path signal is set to 1 for accesses to the even half and set to 0 for accesses to the odd half.

#### End of a Single Read Access

Depending on the setting of the $\overline{CAS}$ pulse width in the mode register, the $\overline{CAS}$ signals are kept asserted for 1.5 or 2.5 clock cycles and negated on the falling edge of $\overline{SysClk}$. The R3721 is designed in such a way that the R3051 samples the data on the same falling edge used to negate the $\overline{CAS}$ signals. The R3721 asserts the $\overline{ACK}$ and $\overline{RdCEn}$ signals one clock cycle before negating the $\overline{CAS}$ signals. The $\overline{ACK}$ and $\overline{RdCEn}$ signals are asserted on the falling edge of $\overline{SysClk}$ and kept asserted for one clock cycle.

To take advantage of the page mode capabilities of the DRAMs, the R3721 always assumes that any single read access will be followed by another access (read or write or quad word reads) within the same DRAM page. Based on this assumption, both the RAS signals (RAS(even) and RAS(odd)) are kept asserted at the end of the single word read access to enable the page mode of the DRAMs. Chapter 7 illustrates the timing diagrams in ending a single read access for both values of the  $\overline{CAS}$  pulse width.

Figure 9.2 illustrates the complete control timings involved in a single read access for the settings of the mode register illustrated in Figure 9.1. This figure represents a generic timing diagram in which the access could be for the even or the odd half-bank-pair. This is why the Path signal is shown with both of its two possible values.





#### **Page Read Accesses**

Page read accesses are single read accesses from the R3051 that fall within the same DRAM page as the previous single read or single write access. The R3721 determines the maximum page size of the memory system based on the DRAM size information encoded in the mode register.

The page read access in interleaved memory systems takes advantage of the previous cycle in that the  $\overline{RAS}$  signals are already asserted. The page read access has very similar timing to the single read access with the exception that no time is lost in re-asserting the  $\overline{RAS}$  signals and re-multiplexing the row and column addresses.

Once the R3721 detects the start of a single read access from the R3051 and determines that it is within the same page, it outputs the column address to the DRAMs. The page read access is then terminated as for a single read access. Figure 9.3 illustrates the timing diagrams for a page read access for the settings of the mode register illustrated in Figure 9.1.



#### Single Read Access Outside of Page

Single read accesses outside of the current page are single read accesses from the R3051 that fall outside the DRAM page accessed by the previous single read or single write access. The read access outside of the current page cannot take advantage of the previous cycle, and thus $\overline{RAS}$ must be pre-charged before the read access is begun. Once $\overline{RAS}$ is pre-charged, the access continues as for a single word read.

Once the R3721 detects the start of a single read access from the R3051 and determines that it is not within the same page, it outputs the row address to the DRAMs. On the second rising edge of SysClk, both RAS signals are negated to begin the RAS pre-charge. The RAS signals are kept high for the time specified in the mode register (a minimum of 2 clock cycles). The access then continues as for a single read access: the RAS signals are asserted, the column address is presented, the CAS signals are asserted, and the access is terminated. Figure 9.4 illustrates the timing diagrams for a read access outside of page for the settings of the mode register as illustrated in Figure 9.1.



### SINGLE WRITE TRANSACTION TIMINGS

In the R3051 family, a significant percentage of the bus traffic will be processor writes to memory. This is due to the write-through nature of the processor data cache: all processor writes are propagated to the bus; however, the majority of reads are satisfied by the on-chip caches, and are not seen on the bus.

Note that for the R3051 there is no such thing as a "quad word" write; the R3051 performs a word or a subword write as a single autonomous bus transaction. However, the R3051 provides a WrNear signal to indicate that the present write has the same upper 22 address bits as the preceding write. This is equivalent to a DRAM memory page of 256 words, and will work for any of the DRAM sizes supported by the R3721.
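The relationship between the 22 upper address bits and the 256-word page can be checked with a short calculation. The Python sketch below assumes 32-bit byte addresses, so that 10 low order bits remain (1024 bytes, or 256 words); the function name is hypothetical and the sketch models only the address comparison, not the WrNear pin timing.

```python
ADDRESS_BITS = 32
COMPARED_BITS = 22
LOW_BITS = ADDRESS_BITS - COMPARED_BITS      # 10 bits not compared
PAGE_WORDS = (1 << LOW_BITS) // 4            # 1024 bytes / 4 bytes per word


def same_upper_22_bits(prev_addr, addr):
    """True when two byte addresses share their upper 22 address bits."""
    return (prev_addr >> LOW_BITS) == (addr >> LOW_BITS)


print(PAGE_WORDS)
print(same_upper_22_bits(0x0000_1000, 0x0000_13FC))  # last word of the same 1 KB block
print(same_upper_22_bits(0x0000_1000, 0x0000_1400))  # first word of the next block
```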

#### **Start of Write Access**

The R3721 determines the beginning of a single write access by monitoring the assertion of the ALE and the $\overline{Wr}$ signals from the R3051. The starting sequence for a single write access is very similar to the starting sequence of a single read access described earlier. The appropriate $\overline{WBank(3:0)}$ signals are asserted on the falling edge of $\overline{SysClk}$ after the detection of the $\overline{Wr}$ signal: $\overline{WBank(2)}$ and $\overline{WBank(0)}$ for writes to the even array, $\overline{WBank(3)}$ and $\overline{WBank(1)}$ for writes to the odd array. RAS then controls which even or odd array is actually written. Again, as for a single read access, both the RAS(even) and the RAS(odd) signals are asserted for a single write access to enable the page mode of the DRAMs in both halves of the interleaved bank-pair. Chapter 7 illustrates the starting sequence for a write access for the fast chip-select case.

#### Memory Control Signals for Single Write Accesses

After the detection of the  $\overline{CS}$  signal, the R3721 starts to issue the various control signals to the DRAMs in the following way:

- On the rising edge of SysClk following CS, the appropriate RAS(even) and RAS(odd) signals are issued. The ACK and RdCEn outputs are enabled and driven to a level "high".
- Depending on the value of the RCD bit in the mode register, the R3721 can proceed in two different ways:

If RCD=0, the column address is presented on the DAddr bus on the falling edge of $\overline{SysClk}$ following the assertion of the $\overline{RAS}$ signals. The appropriate $\overline{CAS}$ signals are asserted on the next rising edge of $\overline{SysClk}$. The Path signal is set to 1 for accesses to the even array, and set to 0 for accesses to the odd array. In the interleaved configurations, the Addr2 bit from the R3051 determines whether the access is to the even or odd array; to avoid writing to the wrong array, only the even or odd $\overline{WBank}$ signals are asserted.

If RCD=1, the column address is presented on the DAddr bus on the falling edge of  $\overline{SysClk}$ , 1.5 clock cycles following the assertion of the  $\overline{RAS}$  signals. The  $\overline{CAS}$  signals are asserted on the following rising edge of  $\overline{SysClk}$ .

In interleaved configurations, single write accesses differ from single read accesses in that only the even or odd pair of WBank signals is asserted. This, coupled with the assertion of RAS for the appropriate bank-pair, ensures that only the selected 32-bit array is written into. Finally, only those byte lanes being written have their CAS signals asserted, to handle the case of partial word writes. This ensures that the data is written in the right memory locations and that no wrong data is written in the other half of the memory bank-pair. The level of the Path signal will direct the R3051 data through the Bus Exchangers to the proper half of the memory bank-pair.

#### End of a Single Write Access

A single write access in an interleaved system is ended exactly as for a noninterleaved system. This operation is described in Chapter 7.

### **Page Write Accesses**

The R3051 provides a WrNear signal to indicate that the present write has the same upper 22 address bits as the preceding write. The R3721 has an internal page comparator that determines the actual DRAM page size according to the information encoded in the mode register. Based on the internal page comparator, the R3721 can retire a write in a minimum of 3 clock cycles, the same as for a page read access. However, the R3721 also uses the WrNear signal from the R3051 to bypass its internal comparator and $\overline{CS}$ detection and to retire a write access in the optimal time of 2 clock cycles.

The page write access takes advantage of the previous cycle in that the  $\overline{RAS}$  signals are already asserted. The page write access has a very similar timing to the single write access with the exception that no time is lost in re-asserting the  $\overline{RAS}$  signals and re-multiplexing the row and column addresses.

Once the R3721 detects the start of a single write access from the R3051 and determines that it is within the same page, it outputs the column address to the DRAMs. On the following rising edge of SysClk, the appropriate CAS signals are asserted in the fast chip-select mode. In the slow chip-select mode, the appropriate CAS signals are asserted on the second rising edge of SysClk. The page write access is then terminated as for a single write access.

The R3721 uses very specific rules to determine whether or not to bypass its internal page comparator and to use the WrNear signal from the R3051. All of the following conditions must be satisfied:

- The settings in the mode register are as follows:
  - fast chip-select is enabled (DCS = 0),
  - $\overline{CAS}$ pre-charge = 0.5 clock cycles (CP = 0),
  - $\overline{CAS}$ pulse width = 1.5 clock cycles (C1:0 = 01),
  - $\overline{\text{WrNear}}$ must be enabled ( $\overline{\text{WrNr}} = 0$ ).
- The previous access was a write access to the memory space controlled by the R3721 (CS has been asserted).

If both conditions are satisfied, the R3721 ignores the  $\overline{CS}$  input line and relies totally on the WrNear signal. If at the detection of the write access the WrNear signal is not asserted the R3721 defaults to its standard mode of operation and retires a write in a minimum of 3 clock cycles. If the WrNear signal is asserted along with the Wr signal, the R3721 asserts the ACK signal on the falling edge of SysClk and retires the write in two clock cycles.

With the  $\overline{CAS}$  pulse width selected as 2.5 cycles, the slow  $\overline{CS}$  mode must be selected, regardless of the  $\overline{CAS}$  pre-charge selected. This is required to assure proper timing during write operations, and to avoid spurious writes.

In this system, $\overline{WrNear}$ cannot be used to shorten the access, since the R3721 has to satisfy the $\overline{CAS}$ pulse width of 2.5 clock cycles. In this case the $\overline{CAS}$ is still asserted for 1.5 clock cycles into the next access and the $\overline{WrNear}$ cannot be used to do a 2 clock cycle write. The internal page comparator is not bypassed and the minimum write access requires 3 clock cycles.

When  $\overline{CAS}$  pre-charge time is 1.5 clock cycles, the  $\overline{CAS}$  signal will not be asserted until the third clock cycle and  $\overline{CS}$  has to be valid by then; thus, the delayed chip select has no impact on this access. In this case again, the internal page comparator is not bypassed and the write access is retired in 3 clock cycles.
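The rules above can be condensed into a small decision function. The Python sketch below is an informal model only: the function and parameter names are hypothetical, the mode register fields are passed as plain values, and the result is the minimum number of clock cycles for a write within the current page as stated in this section.

```python
def min_page_write_cycles(dcs, cp, cas_pulse_cycles, wrnr,
                          prev_was_r3721_write, wr_near_asserted):
    """Minimum clock cycles to retire a write within the current page.

    dcs:                  0 = fast chip-select, 1 = delayed chip-select
    cp:                   0 = 0.5 clock cycle CAS pre-charge
    cas_pulse_cycles:     1.5 or 2.5
    wrnr:                 0 = WrNear enabled, 1 = WrNear ignored
    prev_was_r3721_write: previous access was a write with CS asserted
    wr_near_asserted:     WrNear asserted together with Wr
    """
    bypass_allowed = (
        dcs == 0
        and cp == 0
        and cas_pulse_cycles == 1.5
        and wrnr == 0
        and prev_was_r3721_write
    )
    # Only when the comparator bypass applies AND WrNear is asserted
    # can the write be retired in 2 clock cycles.
    return 2 if bypass_allowed and wr_near_asserted else 3


# WrNear enabled and all conditions met: 2 clock cycle write.
print(min_page_write_cycles(0, 0, 1.5, 0, True, True))
# Settings of Figure 9.1 (WrNear ignored): 3 clock cycle write.
print(min_page_write_cycles(0, 0, 1.5, 1, True, True))
```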



Figure 9.5 illustrates the timing diagrams for a page write access to the even half-bank-pair for the settings of the mode register as illustrated in Figure 9.1.



The concept of page write and page read applies for all cases: single reads followed by single writes or vice versa. Figure 9.6 illustrates the timing diagrams for a single read access followed by a single write access followed by a single read access, all within the same page and based on the settings of the mode register illustrated in Figure 9.1.



Figure 9.6 Single Read Followed by a Single Write Followed by a Single Read Access in Interleaved Systems

### Single Write Access Outside of Page

Single write accesses outside of page are single write accesses from the R3051 that fall outside the DRAM page accessed by the previous single read or single write access. The write access outside of page cannot take advantage of the previous cycle, since $\overline{RAS}$ must be pre-charged. The single write access outside of page has very similar timing to the single write access with the exception that extra time is lost in pre-charging the $\overline{RAS}$ signals before re-multiplexing the row and column addresses.

Figure 9.7 illustrates the timing diagrams for a write access outside of page for the settings of the mode register illustrated in Figure 9.1.

#### **Partial Word Write Operation**

Partial word write accesses are standard write accesses from the R3051 with the exception that only selected bytes within a word are enabled. This information is provided by the  $\overline{BE(3:0)}$  signals from the R3051. The R3721 maps the  $\overline{BE(3:0)}$  from the R3051 directly into the  $\overline{CAS(3:0)}$  signals. For partial word write accesses then, only the  $\overline{CAS}$  signals of the selected bytes will be asserted.
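The byte enable to $\overline{CAS}$ mapping can be illustrated as follows. In this Python sketch the active-low pins are represented by booleans meaning "asserted" to avoid polarity confusion, the function name is hypothetical, and the data bit ranges follow the byte-lane convention mentioned earlier in this chapter ($\overline{CAS0}$ for D(7:0), and so on).

```python
def cas_lanes_for_write(byte_enables):
    """Map asserted byte enables to the CAS lanes that will be asserted.

    byte_enables: sequence of 4 booleans; element n is True when BE(n)
    is asserted by the R3051.
    Returns a dict {lane: (msb, lsb)} of the data bits written per lane.
    """
    if len(byte_enables) != 4:
        raise ValueError("expected exactly four byte enables")
    return {
        lane: (8 * lane + 7, 8 * lane)
        for lane, asserted in enumerate(byte_enables)
        if asserted
    }


# Halfword write on lanes 0 and 1: only CAS0 and CAS1 are asserted.
print(cas_lanes_for_write([True, True, False, False]))
```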





#### **QUAD WORD READ TRANSACTION TIMINGS**

Quad word read operations are reads to the memory system in which the R3051 reads 4 contiguous words from memory always starting on an even word boundary. Quad word reads occur only in response to cache misses. All instruction cache misses are processed as quad word reads; data cache misses may be processed as quad word reads or single word reads.

The advantage of an interleaved memory system lies in multiple word transactions, in this case quad word reads. In quad word read accesses, the interleaved memory system can produce the remaining three words (after the initial latency) at double the rate of a non-interleaved system. This is achieved by reading two words from the memory at the same time (a word from both arrays of the bank-pair). The memory will pass the first word to the CPU while the DRAM controller latches the second word into the Bus Exchangers. On the following clock edge it will release the latched word to the CPU. Simultaneously, the interleaved memory system will pre-charge the $\overline{CAS}$ signals and produce the remaining two words in a similar fashion. This has the effect of doubling the bandwidth of the memory system.

#### Start of Quad Word Read Access

The start of a quad word read access is very similar to the start of a single read access and is described earlier. The only exception is that the Burst signal from the R3051 is asserted at the same time as the  $\overline{Rd}$  signal. Chapter 7 illustrates the start of a quad word read access during the fast chip-select mode.

#### Memory Control Signals During Quad Word Read Accesses

After the detection of the  $\overline{CS}$  signal, the R3721 starts to issue the various control signals to the DRAMs in the following way:

- On the rising edge of SysClk following CS, the appropriate RAS(even) and RAS(odd) signals are issued. The ACK and RdCEn outputs are enabled and driven to a level "high".
- Depending on the value of the RCD bit in the mode register, the R3721 can proceed in two different ways:

If RCD=0, the column address is presented on the DAddr bus on the falling edge of $\overline{SysClk}$ following the assertion of the $\overline{RAS}$ signals. The $\overline{CAS}$ signals are asserted on the next rising edge of $\overline{SysClk}$. The assertion of $\overline{CAS}$ produces two 32-bit words Data0 (even) and Data1 (odd), one from each array of the bank-pair. The Path signal is set to 1 to access the even half and the YZLEn signal is set to 1 to enable the latches of the Bus Exchangers.

If RCD=1, the column address is presented on the DAddr bus on the falling edge of  $\overline{SysClk}$ , 1.5 clock cycles following the assertion of the RAS signals. The CAS signals are asserted on the following rising edge of  $\overline{SysClk}$ . The Path signal is set to 1 to access the even half-bank-pair and the YZLEn signal is set to 1 to enable the latches of the Bus Exchangers.

- After satisfying the  $\overline{CAS}$  pulse width requirement programmed in the mode register, the  $\overline{CAS}$  signals are negated on a falling edge of  $\overline{SysClk}$ . The  $\overline{CAS}$  signals are negated by the same clock edge the R3051 samples the first data element (Data0).
- At the same clock edge, the CAS signals are negated and the YZLEn signal is also negated. This closes the latches of the Bus Exchangers, which now store Data0 and Data1.
- Also at this same clock edge, the Path signal is set to 0. This enables the data element from the odd half-bank-pair which is stored in the Bus Exchangers to be routed to the R3051.
- One clock cycle later, on the falling edge of the clock, the R3051 samples the second data element stored in the Bus Exchangers (Data1). At this same falling clock edge, Path and YZLEn are reset to level 1. This re-enables the even half of the memory bank-pair and makes the latches transparent.
- After satisfying the CAS pre-charge time encoded in the mode register, the CAS signals are asserted on the rising edge of SysClk. This will again produce two data elements (Data2(even) and Data3(odd)), one from each half-bank-pair, and the same procedure as described above is repeated for a second time.
- To enable the read buffer of the R3051, RdCEn is asserted for one clock cycle for each word available from the memory system. In interleaved configurations, two data elements are produced at the same time and presented to the R3051 on two subsequent clock cycles; thus, RdCEn will usually be kept asserted for two clock cycles. If the system configuration is such that the 4 data elements can be presented to the R3051 on every subsequent falling edge of SysClk, the RdCEn signal will be kept asserted for 4 clock cycles before being negated.
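The sequence above can be summarized as a small timing model. The following is a minimal sketch, not a description of the R3721 internals: the function name and the convention of measuring time in SysClk cycles from the rising edge on which RAS is asserted are assumptions made for illustration. It only encodes the rules stated in the list (CAS asserted one or two cycles after RAS depending on RCD, even word sampled when CAS is negated, odd word sampled one cycle later, CAS re-asserted after the pre-charge time).

```python
def quad_read_sample_times(rcd, cas_pulse, cas_precharge):
    """Return the times (in SysClk cycles) at which the R3051 samples Data0..Data3.

    Time 0 is the rising edge of SysClk on which the RAS signals are asserted.
    cas_pulse and cas_precharge are given in clock cycles (e.g. 1.5 and 0.5).
    """
    # CAS is first asserted one cycle after RAS when RCD=0, two cycles when RCD=1.
    cas_assert = 1.0 if rcd == 0 else 2.0
    samples = []
    for _ in range(2):  # two CAS pulses, each producing an even/odd word pair
        cas_negate = cas_assert + cas_pulse       # falling edge: latches close
        samples.append(cas_negate)                # even word (Path = 1)
        samples.append(cas_negate + 1.0)          # odd word from the latches (Path = 0)
        cas_assert = cas_negate + cas_precharge   # rising edge after CAS pre-charge
    return samples


print(quad_read_sample_times(rcd=0, cas_pulse=1.5, cas_precharge=0.5))
print(quad_read_sample_times(rcd=0, cas_pulse=1.5, cas_precharge=1.5))
```

With a 1.5 cycle pulse width and a 0.5 cycle pre-charge time, the four words fall on consecutive falling edges, which is the case in which RdCEn stays asserted for four clock cycles. With a 1.5 cycle pre-charge time, a one-cycle gap appears between the second and third words.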

#### End of a Quad Word Read Access

To terminate a quad word read access, the memory system must return the  $\overline{ACK}$  signal back to the R3051. To take advantage of R3051 instruction streaming and to ensure optimal performance, the  $\overline{ACK}$  signal must be asserted four clock cycles before the fourth word is sampled by the R3051. The R3721 makes internal calculations based on the settings of the mode register and always asserts the  $\overline{ACK}$  signal four clock cycles before the fourth word is ready.

At the end of quad word read, the R3721 always negates both  $\overline{RAS}$  signals and exits the page mode of the DRAM. Simulations have shown that the most probable transfer after a quad word read is a non-page mode write; thus, the R3721 exits page mode to pre-charge  $\overline{RAS}$ , minimizing the access time of the subsequent cycle.





The $\overline{\text{RAS}}$ signals are always negated half a clock cycle after the negation of the $\overline{\text{CAS}}$ signals in any mode or configuration. Figure 9.8 (a) illustrates the timing of the control signals involved in a quad word read transaction for a $\overline{\text{CAS}}$ pulse width of 1.5 clock cycles and a $\overline{\text{CAS}}$ pre-charge time of 0.5 clock cycles.

Figure 9.8 (b) illustrates the timing of the control signals involved in a quad word read transaction for a $\overline{CAS}$ pulse width of 1.5 clock cycles and a $\overline{CAS}$ pre-charge time of 1.5 clock cycles.





In quad word read transactions, the rate at which the $\overline{\text{CAS}}$ signals are toggled determines the speed at which the memory system will return the remaining 3 words to the R3051. Figure 9.9 illustrates the complete control timings involved in a quad word read access for the settings of the mode register illustrated in Figure 9.1.





### Page Quad Word Read Accesses

Page quad word read accesses are quad word read accesses from the R3051 that fall within the same DRAM page as the previous single read or single write access.

The page quad word read access takes advantage of the previous cycle in that the  $\overline{RAS}$  signals are already asserted. The page quad word read access has a very similar timing to the quad word read access with the exception that no time is lost in re-asserting the  $\overline{RAS}$  signals and re-multiplexing the row and column addresses.

Once the R3721 detects the start of a quad word read access from the R3051 and determines that it is within the same page, it outputs the column address to the DRAMs. On the following rising edge of SysClk, the CAS signals are asserted in the fast chip-select mode. In the slow chip-select mode, the CAS signals are asserted on the second rising edge of SysClk. The access proceeds then as for a standard quad word read access. Figure 9.10 illustrates the timing diagrams for a page quad word read access for the settings of the mode register illustrated in Figure 9.1.



Figure 9.10 Page Quad Word Read Access Timing Diagrams for Interleaved Memory Systems



# APPLICATION EXAMPLE: AN INTERLEAVED TWO BANK-PAIR MEMORY SYSTEM

**CHAPTER 10** 

### INTRODUCTION

This chapter describes some of the system considerations appropriate in an interleaved system. In general, an interleaved system and a non-interleaved system are very similar; thus, the reader is referred to earlier chapters.

This chapter contains:

- The general system implementation and the connections between the R3721 and the rest of the system.
- A detailed explanation on how to set the mode register to adapt the R3721 to the application at hand.
- A summary of some of the timing diagrams involved for the different types of CPU accesses.

### **GENERAL SYSTEM DESCRIPTION**

In a typical system, the R3051 uses a 2x input clock for its internal operation and produces a 1x output clock SysClk for use by the external system. Figure 10.1 illustrates a general purpose system based on the R3051. The system shown is a synchronous one, where the external state machine uses SysClk to synchronize its operation to the R3051. The R3721 DRAM controller controls two bank-pairs of interleaved DRAMs along with two IDT73720 Bus Exchangers for the data path. The rest of the system (EPROMs and I/O) is controlled by a separate, external state machine implemented in a couple of programmable logic devices and is beyond the scope of this manual.

An address decoder PAL connects directly to the outputs of the address latches and provides the system with the required chip-select lines. The address decoder also provides the R3721 DRAM Controller with the required $\overline{CS}$ and $\overline{MSel}$ enable lines. The R3721 controls two interleaved bank-pairs of 1M x 4 DRAMs that reside between addresses 0x0000\_0000 and 0x00FF\_FFFC. The internal mode register of the R3721 resides in the uncached I/O space, at physical address 0x0100\_0000. The address decoder PAL must generate the DRAM\_CS line for any access to the DRAM memory space and must issue both the DRAM\_CS and the $\overline{MSel}$ lines for a write access to the mode register.
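The decoding rule described above can be sketched behaviourally as follows. This is an illustrative model only, not PAL equations: the function and constant names are hypothetical, the return values are booleans meaning "asserted" (the real lines are active low), and the behaviour for a read of the mode register address is an assumption, since the text only specifies the write case.

```python
DRAM_BASE = 0x0000_0000
DRAM_SIZE = 16 * 1024 * 1024      # 16 MBytes, last word at 0x00FF_FFFC
MODE_REG_ADDR = 0x0100_0000       # R3721 mode register, uncached I/O space


def decode(phys_addr, is_write):
    """Return (dram_cs, msel); True means the line is asserted."""
    if DRAM_BASE <= phys_addr < DRAM_BASE + DRAM_SIZE:
        return True, False        # any access to the DRAM space
    if phys_addr == MODE_REG_ADDR and is_write:
        return True, True         # mode register write: both lines
    return False, False           # assumption: everything else is ignored


print(decode(0x00FF_FFFC, is_write=False))
print(decode(MODE_REG_ADDR, is_write=True))
```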



Figure 10.1 General Interleaved Memory System Using the R3051, the R3721 and the IDT73720

### **DETAILED DESCRIPTION OF THE R3721 CONNECTIONS**

The R3721 controls two bank-pairs of interleaved 1M x 4 DRAMs to obtain a maximum DRAM memory space of 16 MBytes. Each memory bank-pair consists of two 32-bit wide arrays: an even array and an odd array. In the interleaved memory configuration, the IDT73720 Bus Exchangers are used in the data path to obtain the maximum performance out of the interleaved DRAM memory system. Two IDT73720 Bus Exchangers in the data path isolate the DRAM banks from the R3051 multiplexed address/data bus. This will reduce the loading effect on the bus and prevent contention from occurring. Figure 10.2 illustrates the detailed connections among the various modules.

The connections around the R3721 can be divided in several sections as described in Chapter 8. In this system,  $\overline{RAS(0)}$  is connected to the even array of the first bank-pair; RAS(1) to the odd array; etc.





Figure 10.2 Detailed Connections for the R3721 in a Two Bank-pairs Interleaved DRAM Memory System

# SETTING THE MODE REGISTER

In order to obtain the best performance from the R3721 DRAM Controller, the internal mode register must be programmed with the appropriate values tailored to the application at hand. In the example used in this chapter, the system is assumed to be a two memory bank-pairs, interleaved memory system running at 25MHz using 1M x 4 DRAMs with 80 ns access time ("trac" = 80 ns). The analysis used in this chapter to set the mode register is the same one used in Chapter 8. In order to determine the proper values for the mode register, the system designer must consider the AC characteristics of the R3051, the R3721 and the IDT73720. In addition, the system designer must calculate the derating effect due to capacitive loading on the signal traces.

### Derating Effect Due to Capacitive Loading

Capacitive loading from the devices and from the traces on the PC board, together with the propagation delay of signals travelling across the board, adds delay to the signals. These factors are collectively known as derating factors. Derating factors are arrived at by making approximate calculations of the capacitance and comparing the result with the rated drive capability of the IC component. The effect of additional capacitance on the timing is computed based on "rules of thumb":

- 1. The derating factor of the output driver for standard logic devices is 1ns/50pF.
- 2. The derating factor of the output driver for the CPUs is 1ns/25pF.
- 3. The traces have a capacitance of 2pF/inch.
- 4. The signal travels at a rate of 0.2ns/inch on an FR4 substrate.

The system designer should consider the derating effects described above and should use these or other values appropriate to the specific design in question in order to calculate the worst case interface timing.

The derating delay due to capacitive loading tdr should be computed as follows:

```
tdr = trace length in inches * 0.2ns/inch +
[(number of loads * input capacitance per load) -
( rated capacitive load of the output driver)] *
the derating factor of the output driver
```

```
tdr = derating delay = __ ns
```
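As a rough illustration, the tdr formula above can be evaluated with a few lines of Python. The function name and the example inputs (a 6 inch trace driving eight 10 pF loads from a driver rated at 50 pF) are hypothetical values chosen only to exercise the formula; real values must come from the board layout and the device data sheets. The sketch follows the formula exactly as written, so it does not add the 2pF/inch trace capacitance to the load.

```python
STANDARD_LOGIC_NS_PER_PF = 1.0 / 50.0   # rule of thumb 1: 1ns/50pF
CPU_NS_PER_PF = 1.0 / 25.0              # rule of thumb 2: 1ns/25pF
PROPAGATION_NS_PER_INCH = 0.2           # rule of thumb 4: FR4 substrate


def derating_delay_ns(trace_len_in, n_loads, cin_pf, rated_load_pf, derate_ns_per_pf):
    """Evaluate tdr = trace delay + (actual load - rated load) * derating factor."""
    flight_time = trace_len_in * PROPAGATION_NS_PER_INCH
    excess_load_pf = n_loads * cin_pf - rated_load_pf
    return flight_time + excess_load_pf * derate_ns_per_pf


# Hypothetical example: 6 inch trace, 8 loads of 10 pF, driver rated at 50 pF.
tdr = derating_delay_ns(6, 8, 10, 50, STANDARD_LOGIC_NS_PER_PF)
print(f"tdr = {tdr:.2f} ns")
```

For these inputs the result is about 1.8 ns: 1.2 ns of trace delay plus 0.6 ns for the 30 pF of load beyond the rated value.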

In addition, the system designer must consider the variations in time between the R3051 output clock high time and output clock low time. These variations in the clock  $t_{vr}$  are expressed in the R3051 data sheet by the t32 and the t33 parameters and are equal to:

- $t_{vr} = \pm 2ns$  at 25MHz and less.
- $t_{vr} = \pm 1ns$  at more than 25MHz.

Obviously, this effect only needs to be considered for events which occur at half-clock cycle intervals; the R3051 guarantees that the period of  $\overline{SysClk}$  will be regular rising edge to rising edge or falling edge to falling edge.

The analysis to set the mode register should then be as follows:

#### • DRAM Page Size field (DZ1:0):

The system designer should set this field depending on the page size of the DRAMs used in the external system (from 256K x 1 to 4M x 4). In the case of this example, the DRAM size used is 1M x 4 and the DZ1:0 bits are set to "1 0".

#### • External memory configuration (Inlvd):

The system designer has the choice between interleaved and non-interleaved configurations and the types of data buffers used for the non-interleaved configurations. For this example, an interleaved memory system is used and two IDT73720 Bus Exchangers must be used in the data path. The configuration bit Inlvd is set to "1".

#### • $\overline{RAS}$ to $\overline{CAS}$ delay (RCD):

The $\overline{RAS}$ to $\overline{CAS}$ delay is the delay in clock cycles from the assertion of a $\overline{RAS}$ signal to the assertion of the corresponding $\overline{CAS}$ signal(s). This parameter is derived from the "trcd" parameter found in the DRAM data sheets. As stated in the DRAM data sheets, "trcd" is important during read accesses. If the actual $\overline{RAS}$ to $\overline{CAS}$ delay is less than the max "trcd" specified, then the access is controlled by the $\overline{RAS}$ strobe. On the other hand, if the actual $\overline{RAS}$ to $\overline{CAS}$ delay is greater than the max "trcd" specified, then the access is controlled by the $\overline{CAS}$ strobe. There are two criteria to consider in deciding on the setting of this bit.

- The row address hold time "trah" specified in the DRAM data sheet determines how long the row address must be held constant after the assertion of the RAS signal. This parameter is usually 10 to 15 ns. If the RCD bit is set to "0", DAddr will switch from the row address to the column address half a cycle after the assertion of the RAS signal. At 25 MHz, this is equivalent to 20 ns. If the RAS signal is heavily loaded, violation of this parameter could occur. In that case setting RCD to "1" would be a more prudent choice.
- During single read accesses or for the initial latency of quad word read accesses, if the actual $\overline{RAS}$ to $\overline{CAS}$ delay is less than the max "trcd" specified, then the first word access is controlled by the $\overline{RAS}$ strobe. The system designer must make sure, in that case, that the data will be valid when the R3051 samples it. During read accesses, the R3051 samples the data at the same edge on which the $\overline{CAS}$ signals are negated. The system designer should proceed with the following analysis for RCD set to "0" as shown in Figure 10.3:

|     | $\overline{RAS}$ to $\overline{CAS}$ delay | = | 1 clock cycle minimum | + |
|-----|---------------------------------------------|---|-----------------------|---|
|     | $\overline{CAS}$ pulse width | = | 1.5 clock cycles minimum | |
| tx3 | total available | = | 2.5 clock cycles | |
| tx4 | minimum time available from assertion of $\overline{RAS}$ | = | [tx3 * (1/frequency of operation)] - tvr | |
|     |  | = | [2.5 clock cycles * (1/frequency of operation)] - tvr | |
| tx5 | access time from $\overline{RAS}$ ("trac" max, DRAM d/s) | = | ns | + |
| tx6 | delay through IDT73720 (max, d/s) | = | ns | + |
| tx7 | data setup time for R3051 (t2 max, R3051) | = | ns | + |
| tx8 | max capacitive derating effect (tdr max) | = | ns | |
| tx9 | maximum time to obtain data | = | ns | |

For a valid system, tx9 should be less than tx4.
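As a worked instance of this budget, the figures quoted in this chapter (25 MHz operation, so a 40 ns clock period; $t_{vr} = 2$ ns; "trac" = 80 ns) can be substituted as below. The remaining terms depend on the data sheets and the board, so they are left symbolic; this is an illustration of the arithmetic rather than a statement about any particular design.

$$
\begin{aligned}
t_{x4} &= 2.5 \times 40\ \text{ns} - t_{vr} = 100\ \text{ns} - 2\ \text{ns} = 98\ \text{ns} \\
t_{x9} &= t_{x5} + t_{x6} + t_{x7} + t_{x8} < t_{x4} \\
t_{x6} + t_{x7} + t_{x8} &< 98\ \text{ns} - 80\ \text{ns} = 18\ \text{ns}
\end{aligned}
$$

In other words, with RCD set to "0" and 80 ns DRAMs, the Bus Exchanger delay, the R3051 setup time and the capacitive derating together must fit within 18 ns.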

In this example, RCD has been set to "1". This corresponds to two clock cycles from  $\overline{RAS}$  to  $\overline{CAS}$ . In this case, the data in read accesses is controlled by the  $\overline{CAS}$  strobes.



Figure 10.3 Analysis to Set the RCD Bit in the Mode Register

#### • RAS Timing (R2:0):

The  $\overline{\text{RAS}}$  timing field encodes the  $\overline{\text{RAS}}$  pulse width as well as the  $\overline{\text{RAS}}$  pre-charge time. The system designer must set these three bits such that the specified  $\overline{\text{RAS}}$  pulse width "tras" in the DRAM data sheets and the specified  $\overline{\text{RAS}}$  pre-charge time "trp" are not violated. In this example,  $\overline{\text{RAS}}$  pulse width is set to 3 clock cycles, which is 120 ns and is longer than the required 80 ns. The  $\overline{\text{RAS}}$  pre-charge time is set to 2 clock cycles which is 80 ns and longer than the required 60 to 70 ns. R2:0 are then set to "0 0 1".

#### • CAS pulse width (CO):

The R3721 is designed in such a way that during read accesses, the $\overline{CAS}$ signals are negated at the same edge at which the R3051 samples the data. For timing analysis, during read accesses (single read or quad word reads) the data path is assumed to be correctly configured (outputs of data buffers enabled, data buffers in the receive mode, and the latches transparent). This means that from the $\overline{CAS}$ strobe (or the $\overline{RAS}$ strobe) the data coming out of the DRAMs passes through the data buffers directly to the R3051. Except for interleaved quad word read accesses, no latching of the data takes place. Under these circumstances, the system designer must ensure that the $\overline{CAS}$ pulse width is long enough for the data to come out of the DRAMs, pass through the data buffers and meet the data setup time of the R3051.

The system designer should proceed with the following analysis illustrated in Figure 10.4:




For any configuration, a CAS pulse width of 2.5 clock cycles must also use the delayed  $\overline{CS}$ . This is required to avoid spurious writes.

#### • CAS pre-charge time (CP):

Most DRAMs require a  $\overline{CAS}$  pre-charge time of about 10 ns, which is roughly equivalent to half a clock cycle at 25 MHz. This set up is appropriate for most medium speed applications. However, the  $\overline{CAS}$ pre-charge time is important during the page mode operation of the DRAMs. There are two criteria to consider in setting this bit:

- During page read operations (page read accesses following page write accesses or quad word read accesses), the CAS is pre-charged and then re-asserted to enable the next word from the DRAMs, as illustrated in Figure 10.5 (a). In such situations, the next word to be read from the DRAMs will be available after a delay corresponding to:
  - access time from address "taa" or
  - access time from CAS "tcac" or
  - access time from CAS pre-charge "tacp"

whichever is longer (as per DRAM data sheet). The system designer must then take into consideration the access from the  $\overline{CAS}$  pre-charge time. The analysis for the access from the assertion of the  $\overline{CAS}$  is the same as for the  $\overline{CAS}$  pulse width analysis in Figure 10.4. The analysis for the  $\overline{CAS}$  pre-charge time is as follows:

tz1 = time from  $\overline{CAS}$  negated to when the next data word must be available. This time equals the sum of the  $\overline{CAS}$  pulse width and the  $\overline{CAS}$ precharge times with a minimum of 2 clock cycles and a maximum of 4 clock cycles.

tz1 of 3 or 4 clock cycles is irrelevant since the access will then completely be determined by the  $\overline{CAS}$  pulse width. The analysis will concentrate on the 2 clock cycle tz1 where the  $\overline{CAS}$  pre-charge time is 0.5 clock cycles and the  $\overline{CAS}$  pulse width is 1.5 clock cycles.





| tz1' | = | time for next data element = 2 clock cycles * (1/frequency of operation) | = | ns |
|------|---|---------------------------------------------------------------------------|---|----|
| tz2  | = | SysClk to CAS pre-charge (t1a max, R3721)                                 | = | ns |
| tz3  | = | access time from $\overline{CAS}$ pre-charge ("tacp" max, DRAM d/s)       | = | ns |
| tz4  | = | delay through the data buffer (max, '245 d/s)                             | = | ns |
| tz5  | = | R3051 data input setup time (t1a max, R3051)                              | = | ns |
| tz6  | = | max capacitive derating effect (tdr max)                                  | = | ns |
| tz7  | = | max time for data to be ready (tz2 + tz3 + tz4 + tz5 + tz6)               | = | ns |

For proper operation, tz7 must be less than tz1'.

For this example, the  $\overline{CAS}$  pulse width is set to 1.5 clock cycles and the  $\overline{CAS}$  pre-charge time is set to 0.5 clock cycles:

tz1 = 2.0 clock cycles

tz1' = 2.0 \* 40 ns = 80 ns

tz2 = 7 ns

tz3 = 45 ns

tz4 = 6.5 ns

tz5 = 5 ns

tz6 = 5 ns (estimate)

tz7 = 68.5 ns

tz7 is less than tz1' and the system should run properly.
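The check above is a simple sum-and-compare, which can be scripted so that it is easy to repeat for other DRAM speed grades or clock rates. The sketch below uses the example values quoted above; the helper name `timing_budget` and the dictionary layout are illustrative choices, not part of any IDT tool.

```python
def timing_budget(available_ns, delays_ns):
    """Return (total delay, remaining margin) for a set of delays in ns."""
    total = sum(delays_ns.values())
    return total, available_ns - total


SYSCLK_PERIOD_NS = 40.0                # 25 MHz operation
tz1_prime = 2.0 * SYSCLK_PERIOD_NS     # 1.5 cycle pulse width + 0.5 cycle pre-charge

delays = {
    "tz2": 7.0,   # SysClk to CAS pre-charge
    "tz3": 45.0,  # access time from CAS pre-charge ("tacp")
    "tz4": 6.5,   # delay through the data buffer
    "tz5": 5.0,   # R3051 data input setup time
    "tz6": 5.0,   # capacitive derating (estimate)
}

tz7, margin = timing_budget(tz1_prime, delays)
print(f"tz7 = {tz7} ns, margin = {margin} ns")
```

A positive margin (11.5 ns here) means the 0.5 clock cycle pre-charge setting satisfies the read timing for these values.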

- During page write accesses where the $\overline{\text{CAS}}$ pulse width is 1.5 clock cycles and the $\overline{\text{CAS}}$ pre-charge time is 0.5 clock cycles, the system designer must ensure that the data is available at the DRAM inputs before asserting the $\overline{\text{CAS}}$ strobes. That is, the delay of the data from the R3051 through the data buffers, plus the DRAM data setup time, must be less than one half clock cycle, which is tw6. This timing analysis is illustrated in Figure 10.5 (b). The $\overline{\text{CAS}}$ pre-charge time analysis during writes is as follows:

tw6 = one half clock cycle - tvr = ns

| tw1 = | SysClk to data from the R3051 (t19 max, R3051) | = | ns | + |
|-------|------------------------------------------------|---|----|---|
| tw2 = | delay through the data buffer (max, '245 d/s)  | = | ns | + |
| tw3 = | DRAM data setup time ("tds" min, DRAM d/s)     | = | ns | + |
| tw4 = | max capacitive derating effect (tdr max)       | = | ns |   |
| tw5 = | max time for data to be ready                  | = | ns |   |

This time (tw5) must be less than tw6 for the proper operation of the system.



Figure 10.5 (b) CAS Pre-charge Time Analysis During Writes

For this example, the  $\overline{CAS}$  pulse width is set to 1.5 clock cycles and the  $\overline{CAS}$  pre-charge time is set to 0.5 clock cycles:

tw1 = 9 ns

tw2 = 6.5 ns

tw3 = 0 ns

tw4 = 5 ns (estimate)

tw5 = 20.5 ns

tw5 is greater than tw6 which is 18 ns (20–2 = 18 ns) and the  $\overline{CAS}$  pre-charge time should be set to 1.5 clock cycles. However, the  $\overline{CAS}$  pre-charge time will be set to 0.5 clock cycles in order not to slow down the quad word accesses. To ensure proper operation during write accesses, the WrNr bit in the mode register will be set to 1. In this case, page write accesses will take a minimum of 3 clock cycles and no violation of the DRAM specification will occur.
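Written out with the example values above (25 MHz, so half a clock cycle is 20 ns, and $t_{vr} = 2$ ns), the write-path comparison is:

$$
\begin{aligned}
t_{w5} &= t_{w1} + t_{w2} + t_{w3} + t_{w4} = 9 + 6.5 + 0 + 5 = 20.5\ \text{ns} \\
t_{w6} &= \tfrac{1}{2}\,(40\ \text{ns}) - t_{vr} = 20 - 2 = 18\ \text{ns} \\
t_{w5} - t_{w6} &= 2.5\ \text{ns}
\end{aligned}
$$

The half-cycle budget is therefore missed by 2.5 ns for these values, which is the shortfall that the slower page write timing is used to absorb.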

A final consideration for setting the  $\overline{CAS}$  precharge time stems from the particular nature of interleaved systems. Specifically, if the  $\overline{CAS}$  pulse width selected is 2.5 clock cycles, then the system must also use the slow  $\overline{CS}$  mode. This is required to avoid bus contention when switching between reads and writes.

#### • Refresh Period (RF2:0):

The refresh period must be set according to the frequency of operation. In this example, the RF2:0 bits are set to "1 0 1" for 25 MHz operation.

#### • Delayed Chip Select (DCS):

The delayed chip select bit must be set if the external address decoder is not fast enough to meet the fast chip select requirements, that is, if the external decoder cannot provide chip select within the first clock cycle of the access. In this example, where the system is running at 25 MHz, for fast chip select operation the $\overline{DRAM\_CS}$ line must be ready within 40 ns. If the DCS bit is set to "1", the $\overline{DRAM\_CS}$ line needs to be valid within 60 ns, which is easily achievable.

The delayed chip select bit can also be set to slow down the page write accesses. Slowing down the page write accesses is appropriate when the delay through the data buffer is such that the data is not available to the DRAM within half a clock cycle. In this case, setting the DCS bit will slow the page write operation as demonstrated in Figure 9.12 (c). It will also have the effect of adding an extra clock cycle to the initial latency during page read accesses (quad word reads included) while keeping the repetition rate (remaining words in a page) at its peak.

In the example used in this chapter, the DCS bit is set to "0" since there is no need of slowing down the page write accesses. Figure 10.6 illustrates the settings of the mode register used for this system.

#### • Ignore WrNear (WrNr):

The WrNr bit in the mode register can be used to force the DRAM controller to ignore the processor  $\overline{WrNear}$  output during write accesses. This feature is important for interleaved systems using DRAM SIMM modules, where the  $\overline{OE}$  of the DRAMs is grounded; if  $\overline{OE}$  is available from the SIMM, higher performance is possible by enabling the  $\overline{WrNear}$  from the CPU. In systems with no  $\overline{OE}$  control, a write to one array will cause a read to the other array in the bank-pair; to avoid bus contention in consecutive writes, the WrNr bit forces near writes to be retired in a minimum of three cycles (rather than two), thus allowing time to avoid bus contention. In this system, WrNr is set to '1' to slow writes and avoid bus contention.

#### SYSTEM TIMING DIAGRAMS

The following section will present different timing diagrams for bus accesses based on the system described earlier in this chapter. These timing diagrams illustrate the various CPU accesses possible, and are provided to illustrate the complete functionality of the R3721 DRAM Controller.

| 15   | 14  | 13  | 12  | 11  | 10  | 9    | 8  | 7  | 6  | 5  | 4   | 3    | 2     | 1   | 0   |
|------|-----|-----|-----|-----|-----|------|----|----|----|----|-----|------|-------|-----|-----|
| Rsvd | DCS | RF2 | RF1 | RF0 | CP  | Rsvd | Co | R2 | R1 | R0 | RCD | WrNr | Inlvd | DZ1 | DZ0 |
| 0    | 0   | 1   | 0   | 1   | 0   | 0    | 1  | 0  | 0  | 1  | 1   | 1    | 1     | 1   | 0   |
| D15  | D14 | D13 | D12 | D11 | D10 | D9   | D8 | D7 | D6 | D5 | D4  | D3   | D2    | D1  | D0  |

Figure 10.6 Mode Register Settings for a Two Bank-pair Interleaved System
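For software that writes the mode register, the bit layout of Figure 10.6 can be packed into a 16-bit word as sketched below. The field positions are taken from the figure; the helper name `pack_mode_register` and the dictionary representation are illustrative assumptions, and the reserved bits (15 and 9) are simply left at 0.

```python
# Field name -> (least significant bit position, width in bits), per Figure 10.6.
FIELDS = {
    "DZ": (0, 2), "Inlvd": (2, 1), "WrNr": (3, 1), "RCD": (4, 1),
    "R": (5, 3), "Co": (8, 1), "CP": (10, 1), "RF": (11, 3), "DCS": (14, 1),
}


def pack_mode_register(**values):
    """Combine the named field values into one 16-bit mode register word."""
    word = 0
    for name, value in values.items():
        lsb, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bit(s)")
        word |= value << lsb
    return word


# Settings chosen in this chapter for the 25 MHz interleaved example.
mode = pack_mode_register(DCS=0, RF=0b101, CP=0, Co=1, R=0b001,
                          RCD=1, WrNr=1, Inlvd=1, DZ=0b10)
print(f"0x{mode:04X}")
```

For the settings in the figure this yields 0x293E, which matches the bit pattern 0010 1001 0011 1110 read from D15 down to D0.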

Figure 10.7 illustrates the timing diagrams involved in a single read access to the even memory array of bank-pair 0 starting from an idle state where no  $\overline{RAS}$  signal was asserted.





Figure 10.8 illustrates the timing diagrams involved in a single read access to the odd memory array of bank-pair 1 starting from an idle,  $\overline{RAS}$  asserted state. Pre-charging  $\overline{RAS}$  has to occur because the previous access occurred to the other bank-pair.



Figure 10.8 Single Read Access to the Odd Half-bank-pair 1

Figure 10.9 illustrates the timing diagrams involved in a single write access to the odd memory array of bank-pair 0, starting from an idle state where no  $\overline{RAS}$  signal was asserted.









Figure 10.10 Single Write Access to the Even Half-bank-pair 1



# RESET INITIALIZATION, REFRESH AND INPUT CLOCKING

#### INTRODUCTION

This chapter discusses the system housekeeping for the R3721:

- The reset initialization sequence performed by the R3721 DRAM controller to initialize the DRAM memory banks under its control.
- The  $\overline{CAS}$ -before- $\overline{RAS}$  refresh sequence used by the R3721.
- The input clocking requirements of the R3721.

#### **POWER-UP AND RESET**

The R3721 uses the same Reset pulse as the R3051 in order to synchronize its operation to the R3051. The R3721 has the same requirements as the R3051 in terms of the power on reset pulse width, and the warm reset pulse width. Figure 11.1 illustrates the power on requirements of the R3721 DRAM Controller.



Figure 11.1 Cold Start

Figure 11.2 illustrates the warm reset requirements of the R3721 DRAM Controller.



Figure 11.2 Warm Reset

#### DRAM INITIALIZATION

Reset causes the internal mode register of the R3721 to be loaded with the default values (illustrated in Chapter 4), and all the output control signals are negated. All the internal counters are cleared. The internal refresh timer is loaded with the refresh interval count that corresponds to the default settings of the mode register.

The R3721 DRAM Controller proceeds to initialize the complete memory system by issuing 15 consecutive  $\overline{CAS}$ -before- $\overline{RAS}$  refresh cycles. These 16 refresh cycles reset the internal row counter of the DRAMs. Figure 11.3 illustrates the reset initialization sequence of the R3721.



Figure 11.3 Reset Initialization Sequence

During the reset initialization sequence, the R3721 uses the default settings of the mode register to control the widths of both the RAS and the CAS strobes. The default values of the mode register call for the largest RAS pulse width and pre-charge time. This ensures that during reset initialization, the specified DRAM parameters (RAS pulse width, CAS pulse width, etc.) are not violated.

# **CAS BEFORE RAS REFRESH TIMINGS**

The R3721 has a built-in refresh timer that issues a refresh request at a maximum interval time of 9.6  $\mu sec$ . The refresh timer gets loaded with the appropriate number of clock counts that are encoded in the refresh field of the mode register.

The maximum refresh interval of 9.6 $\mu$sec ensures that the maximum specified $\overline{RAS}$ pulse width of 10 $\mu$sec (as per DRAM data sheets) is never violated. This feature is very important since in page mode the $\overline{RAS}$ signal can be kept asserted for long periods of time.
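To give a feel for the numbers involved, the count of SysClk cycles that fits in a 9.6 µsec interval can be computed as below. This is only the arithmetic; the function name is hypothetical, and the actual counts selected by each RF2:0 encoding are defined by the R3721 itself and are not reproduced here. Integer arithmetic with rounding down is used so that the resulting interval never exceeds 9.6 µsec.

```python
def refresh_clock_count(sysclk_khz, interval_ns=9600):
    """Largest whole number of SysClk cycles that fits in the refresh interval."""
    # ns * kHz = 1e-9 s * 1e3 cycles/s = 1e-6 cycles, hence the division by 1e6.
    return (interval_ns * sysclk_khz) // 1_000_000


for mhz in (20, 25, 40):
    print(mhz, "MHz ->", refresh_clock_count(mhz * 1000), "clocks")
```

At 25 MHz the interval corresponds to 240 clock cycles.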

Figure 11.4 illustrates the timing for a $\overline{CAS}$-before-$\overline{RAS}$ refresh sequence. During the refresh sequence all the DRAM control signals are negated with the exception of the $\overline{RAS}$ and the $\overline{CAS}$ signals. This ensures that, for 4 Mbit DRAMs, the $\overline{WBank(3:0)}$ signals are not asserted during the refresh, and thus the test mode of 4 Mbit DRAMs is not enabled. At the end of a refresh sequence or initialization sequence, all the control signals are negated. During a refresh sequence, the R3721 asserts all the $\overline{RAS}$ and all the $\overline{CAS}$ signals as a single set.



Figure 11.4 CAS-before-RAS Refresh Sequence

#### • Priority Scheme

To resolve conflicts between an internal refresh request and external bus access requests, the R3721 implements the following built-in refresh priority scheme:

- If ALE and  $\overline{CS}$  are detected at the same time as an internal refresh request, the R3721 gives the priority to the refresh request and services it. At the same time, the R3721 registers the fact that a transfer is pending and will service it at the end of the refresh sequence.
- If a refresh request occurs during the time the R3721 is servicing a bus access, the refresh sequence will be delayed until the end of the bus access.
- If a transfer is detected (ALE and $\overline{CS}$ asserted) during the time the R3721 is servicing a refresh request, the bus access is delayed until the end of the refresh sequence. Additionally, the R3721 will enable its $\overline{RdCEn}$ and $\overline{ACK}$ output drivers.
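The three rules above amount to a small arbitration function. The sketch below is a behavioural illustration only: the state names, the function name and the return convention are assumptions, and the case of an idle controller receiving a single request (which the list does not spell out) is assumed to be serviced immediately.

```python
def arbitrate(state, refresh_req, bus_req):
    """Return (serviced_now, deferred) for one decision point.

    state is "idle", "bus_access" or "refresh". Each returned element is
    "refresh", "bus_access" or None.
    """
    if state == "idle":
        if refresh_req:
            # Simultaneous requests: refresh wins, the transfer is held pending.
            return "refresh", ("bus_access" if bus_req else None)
        return ("bus_access" if bus_req else None), None
    if state == "bus_access":
        # A refresh request waits until the current bus access completes.
        return "bus_access", ("refresh" if refresh_req else None)
    if state == "refresh":
        # A new transfer waits until the refresh sequence completes.
        return "refresh", ("bus_access" if bus_req else None)
    raise ValueError(f"unknown state: {state}")


print(arbitrate("idle", refresh_req=True, bus_req=True))
print(arbitrate("bus_access", refresh_req=True, bus_req=False))
```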

# INPUT CLOCK REQUIREMENTS

The R3721 uses the SysClk output directly from the R3051 to synchronize its operation to the R3051 timing requirements. The R3721 uses both edges of the SysClk to control its internal state machine. The requirements for the input clock to the R3721 are illustrated in Figure 11.5.



Figure 11.5 R3721 Input Clock Requirements




# IDT73720 BUS EXCHANGER OVERVIEW

APPENDIX A

#### **INTRODUCTION:**

The IDT73720 Bus Exchanger is designed to interface multiple memory busses to a single CPU data bus. It is used in systems which implement multiple banks within a memory subsystem—either interleaved, or banked for deeper memories.

This appendix provides an overview of the IDT73720 16-bit Bus Exchanger, shown in Figure A.1. Detailed information on the pin-out, packaging, and electrical specifications can be found in the data sheet for this device, also available from IDT.

#### **MAJOR FEATURES:**

- High speed 16-bit bus exchange for interbus communication in the following environments:
  - Multi-way interleaving memory
  - Multiplexed address and data busses
- Direct Interface to R3051 Family RISChipSet<sup>™</sup>
  - R3051<sup>™</sup> Family of Integrated RISController<sup>™</sup> CPUs
  - R3721 DRAM Controller
- Supports R3051 family systems from 20 to 40MHz
- Interfaces a single CPU bus to interleaved memory systems
- Data path for read and write operations
- Low noise 12mA TTL level outputs
- Simplifies data path design in high-performance memory systems
- Bidirectional 3-Bus Architecture: X, Y, Z
  - One CPU Bus: X
  - Two (interleaved or banked) memory busses: Y & Z
  - Each bus can be independently latched
- Byte control on all three busses
- Source terminated outputs for low noise and undershoot control
- 68-pin PLCC package
- High-performance CMOS technology



Figure A.1 Block Diagram of IDT73720 Bus Exchanger

#### **DESCRIPTION:**

The IDT73720 Bus Exchanger is a high speed 16-bit bus exchange device intended for inter-bus communication in multi-way interleaving or multi-banked memory systems.

The 73720 Bus Exchanger provides data path support in an R3051 family system utilizing interleaved or banked memory techniques. The Bus Exchanger is responsible for interfacing between the CPU A/D bus (CPU address/data bus) and multiple memory data busses. The R3721 DRAM Controller has been designed to directly control a pair of Bus Exchangers as the data path between the CPU bus and DRAM memory busses.

The 73720 uses a three bus architecture (X, Y, Z), with control signals suitable for simple transfer between the CPU bus (X) and either memory bus (Y or Z). The Bus Exchanger features independent read and write latches for each memory bus, thus supporting a variety of memory strategies. Y and Z ports support individual byte output enables to independently enable upper and lower bytes.

#### **ARCHITECTURE OVERVIEW:**

The Bus Exchanger is used to service both read and write operations between the CPU and the dual memory busses. It includes independent data path elements for reads from and writes to each of the memory banks (Y and Z). Data flow control is managed by a simple set of control signals, analogous to a simple transceiver. In short, the Bus Exchanger allows bidirectional communication between ports X and Y and ports X and Z.

The data path elements for each memory port include:

**Read Latch**: Each of the memory ports Y and Z contains a transparent latch to capture the contents of the memory bus. Each latch features an independent latch enable. During reads, the R3721 will assert the YZLEN output (tied to the LEYX and LEZX inputs of the 73720) to cause the bus exchanger to capture the data from the interleaved memory array.

**Write Latch**: Each memory port Y and Z contains an independent latch to capture data from the CPU bus during writes. Each memory port write latch features an independent latch enable, allowing write data to be directed to a specific memory port without disrupting the other memory port. The R3721 uses the write data path as a simple data transceiver, and thus does not need to latch data into either of the write latches.
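The behavior of a transparent latch, as used for the read and write latches, can be sketched as follows. The class name `TransparentLatch` and the `update` method are hypothetical; the model assumes the polarity given in the pin descriptions of this appendix (latch open when the enable is high, data held after the high-to-low transition) and ignores setup and hold timing.

```python
class TransparentLatch:
    """Behavioral model of one 16-bit transparent latch."""

    def __init__(self):
        self.held = 0  # value seen at the output

    def update(self, data_in, enable):
        """Apply the inputs and return the latch output.

        enable = 1: latch is open, output follows data_in.
        enable = 0: latch is closed, output keeps the last value
                    that was present while the latch was open.
        """
        if enable:
            self.held = data_in & 0xFFFF
        return self.held


read_latch = TransparentLatch()
print(hex(read_latch.update(0x1234, enable=1)))  # 0x1234 (flows through)
print(hex(read_latch.update(0xBEEF, enable=0)))  # 0x1234 (captured value held)
```

Holding `enable` at 1 permanently reduces the latch to a plain buffer, which is how the write latches are used in an R3721 system.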

#### DATA FLOW CONTROL SIGNALS

**$T/\overline{R}$ (Transmit/Receive)**: This signal controls the direction of data transfer. A transmit is used for CPU writes, and a receive is used for read operations. The R3721 $T/\overline{R}$ output has been designed to drive either a pair of Bus Exchangers or four IDT 74FCT245s.

**Path**: The path control signal is used to select between the even memory path Y and the odd memory path Z during read or write operations. Path selects the memory port to be connected to the CPU bus (X-port). The R3721 uses a Path value of "1" for the even memory port, and "0" for the odd port.

In an interleaved memory system, data is captured into the Bus Exchanger using the R3721 YZLEn output; the R3721 then uses the Path signal to sequence the even read latch, followed by the odd read latch, onto the CPU port. Thus, both words are returned to the processor in back-to-back cycles.

 $\overline{OEH}$ ,  $\overline{OEL}$  are the output enable control signals to select upper or lower bytes of all three ports. These signals, in conjunction with  $T/\overline{R}$  and Path, determine the current output ports of the Bus Exchanger.
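As an illustration of how these control signals combine, the following sketch decodes one set of input levels into a data direction and the enabled byte lanes. The function name `decode_controls` and its argument names are hypothetical, the latches are not modeled, and the behavior with both output enables high (no lanes driven) is an assumption.

```python
def decode_controls(t_r, path, oe_upper_n, oe_lower_n):
    """Return (source_port, destination_port, enabled_byte_lanes).

    t_r        : 1 = transmit (CPU write), 0 = receive (CPU read)
    path       : 1 = even port Y, 0 = odd port Z (R3721 convention)
    oe_upper_n : active-low output enable for the upper byte
    oe_lower_n : active-low output enable for the lower byte
    """
    memory_port = "Y" if path else "Z"
    if t_r:
        source, destination = "X", memory_port   # write: CPU bus to memory
    else:
        source, destination = memory_port, "X"   # read: memory to CPU bus

    lanes = []
    if not oe_upper_n:
        lanes.append("upper")
    if not oe_lower_n:
        lanes.append("lower")
    return source, destination, lanes


print(decode_controls(t_r=0, path=1, oe_upper_n=0, oe_lower_n=0))
# ('Y', 'X', ['upper', 'lower'])
print(decode_controls(t_r=1, path=0, oe_upper_n=1, oe_lower_n=0))
# ('X', 'Z', ['lower'])
```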

#### **MEMORY READ OPERATIONS**

Memory reads can be thought of as occurring in two distinct stages. During the first stage, the data present at the memory port is captured by the read latch for that memory port. During a subsequent phase, data is brought from a selected read latch to the CPU A/D port X by using output enable control.

The read operation is selected by driving  $T/\overline{R}$  low. The Path input selects the memory port (Y or Z), and the LEYX/LEZX inputs enable data capture into the corresponding read latch.

The read latches enable the R3721 to perform high-performance bursts in interleaved memory systems. The R3721 reads both banks simultaneously; once the data is available, the R3721 closes the Bus Exchangers' read latches, capturing the data. The DRAM Controller then sequences the data onto the CPU bus, while simultaneously pre-charging  $\overline{CAS}$ , and re-asserting  $\overline{CAS}$  to obtain the second pair of words. In many systems, this strategy allows the DRAM controller to return all four words of a quad word read at the maximum data rate of the CPU.

Note that the Bus Exchanger may be used as a data transceiver by holding the latches open. In this case, the two phases of the read operation are compressed into a single activity. This is how the R3721 uses the Bus Exchanger during single read operations, banked memory configurations, and for the first word of an even-odd pair in an interleaved memory system.
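The ordering of a quad word read from an interleaved memory can be sketched as below. This is a simplified model under stated assumptions: the function name `quad_word_read` is hypothetical, each word is treated as a single integer (in hardware a 32-bit word passes through a pair of 16-bit Bus Exchangers), and no distinction is made between the first word of a pair flowing through an open latch and the second word being held in a closed one. Timing, CAS pre-charge, and the control outputs themselves are not modeled.

```python
def quad_word_read(even_bank, odd_bank):
    """Return four words in the order they reach the CPU bus.

    even_bank, odd_bank: two words each, one per CAS cycle.
    """
    returned = []
    for cas_cycle in range(2):
        # Both banks are read simultaneously and captured in the
        # Y (even) and Z (odd) read latches.
        y_latch = even_bank[cas_cycle]
        z_latch = odd_bank[cas_cycle]
        # Path = 1 selects the even port first, then Path = 0 the odd port.
        for path in (1, 0):
            returned.append(y_latch if path else z_latch)
    return returned


words = quad_word_read(even_bank=[0xA0, 0xA2], odd_bank=[0xA1, 0xA3])
print([hex(w) for w in words])  # ['0xa0', '0xa1', '0xa2', '0xa3']
```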

#### **MEMORY WRITE OPERATIONS**

The R3721 always uses the Bus Exchanger as a simple transceiver during write operations. Thus, CPU data is never latched into the Bus Exchanger write latches.

The R3721 uses the  $T/\overline{R}$ , Path, and  $\overline{OEU}/\overline{OEL}$  control signals to steer processor data into the DRAM arrays. The write operation is selected by driving  $T/\overline{R}$  high, and the Path input selects the memory port (Y or Z). The LEXY/LEXZ inputs are not used, and thus are tied high to enable CPU data to flow through the write latches.

# PIN DESCRIPTION

This section describes the signals used by the Bus Exchanger. More detail on these pins may be found in the IDT73720 data sheet; more detail on the R3721 interface can be found in other chapters of this manual. Note that signals indicated by an overbar are active low.

#### X(0:15)

**Bidirectional Data Port X**. In an R3721 system, this is connected to the CPU's A/D (Address/Data) bus.

#### Y(0:15)

**Bidirectional Data Port Y**. In an R3721-based system, this is connected to the even path or lower bank of memory.

#### Z(0:15)

**Bidirectional Data Port Z**. In an R3721-based system, this is connected to the odd path or upper bank of memory.

#### LEXY

**Latch Enable input for the Y-Write Latch**. In an R3721 system, this is tied high.

#### LEXZ

**Latch Enable input for the Z-Write Latch**. In an R3721 system, this is tied high.

#### LEYX

**Latch Enable input for the Y-Read Latch**. The Y-Read Latch is open when LEYX is high. Data from the even path Y is latched on the high-to-low transition of LEYX. In an R3721 system, this is tied to the YZLEn output of the R3721.

#### LEZX

**Latch Enable input for the Z-Read Latch**. The Z-Read Latch is open when LEZX is high. Data from the odd path Z is latched on the high-to-low transition of LEZX. In an R3721 system, this is tied to the YZLEn output of the R3721.

#### PATH

**Even/Odd Path Selection**. When high, PATH enables data transfer between the X-Port and the Y-port (even path). When low, PATH enables data transfer between the X-Port and the Z-port (odd path).

#### $T/\overline{R}$

**Transmit/Receive Data**. When  $T/\overline{R}$  is high, Port X is enabled to transfer (write) data into the memory port specified by PATH. When  $T/\overline{R}$  is low, Port X is enabled to receive (Read) data from the memory port specified by PATH.

#### $\overline{OEH}$

**Output Enable for Upper byte**. When low, the Upper byte of data is transferred to the port specified by PATH in the direction specified by  $T/\overline{R}$ .

#### $\overline{OEL}$

**Output Enable for Lower byte**. When low, the Lower byte of data is transferred to the port specified by PATH in the direction specified by  $T/\overline{R}$ .

Pin types: X(0:15), Y(0:15), and Z(0:15) are bidirectional (I/O). All other pins described in this section are inputs (I).



Integrated Device Technology, Inc.

2975 Stender Way P.O. Box 58015 Santa Clara, CA 95052-8015 (408) 727-6116 FAX 408-492-8674