OCP-IP Open Core Protocol Specification 1.0
Release 1.0
2001 OCP-IP Association, All Rights Reserved.
Open Core Protocol Reference Document Revision 002
This document, including all software described in it, is furnished under the terms of the Open Core Protocol Specification License Agreement (the "License") and may only be used or copied in accordance with the terms of the License. The information in this document is a work in progress, jointly developed by the members of OCP-IP Association ("OCP-IP") and is furnished for informational use only.
The technology disclosed herein may be protected by one or more patents, copyrights, trademarks and/or trade secrets owned by or licensed to OCP-IP. OCP-IP reserves all rights with respect to such technology and related materials. Any use of the protected technology and related material beyond the terms of the License without the prior written consent of OCP-IP is prohibited.
This document contains material that is confidential to OCP-IP and its members and licensors. The user should assume that all materials contained and/or referenced in this document are confidential and proprietary unless otherwise indicated or apparent from the nature of such materials (for example, references to publicly available forms or documents). Disclosure or use of this document or any material contained herein, other than as expressly permitted, is prohibited without the prior written consent of OCP-IP or such other party that may grant permission to use its proprietary material.
The trademarks, logos, and service marks displayed in this document are the registered and unregistered trademarks of OCP-IP, its members and its licensors. The following trademarks of Sonics, Inc. have been licensed to OCP-IP: FastForward, SonicsIA, CoreCreator, SiliconBackplane, SiliconBackplane Agent, MultiChip Backplane, InitiatorAgent Module, TargetAgent Module, ServiceAgent Module, SOCCreator, and Open Core Protocol.
The copyright and trademarks owned by OCP-IP, whether registered or unregistered, may not be used in connection with any product or service that is not owned, approved or distributed by OCP-IP, and may not be used in any manner that is likely to cause customer confusion or that disparages OCP-IP. Nothing contained in this document should be construed as granting by implication, estoppel, or otherwise, any license or right to use any copyright without the express written consent of OCP-IP, its licensors or a third party owner of any such trademark. Printed in the United States of America. Part number: 161-000125-0001
EXCEPT AS OTHERWISE EXPRESSLY PROVIDED, THE OPEN CORE PROTOCOL (OCP) SPECIFICATION IS PROVIDED BY OCP-IP TO MEMBERS "AS IS" WITHOUT WARRANTY OF ANY KIND, EXPRESS, IMPLIED OR STATUTORY, INCLUDING BUT NOT LIMITED TO ANY IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. OCP-IP SHALL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL OR CONSEQUENTIAL DAMAGES OF ANY KIND OR NATURE WHATSOEVER (INCLUDING, WITHOUT LIMITATION, ANY DAMAGES ARISING FROM LOSS OF USE OR LOST BUSINESS, REVENUE, PROFITS, DATA OR GOODWILL) ARISING IN CONNECTION WITH ANY INFRINGEMENT CLAIMS BY THIRD PARTIES OR THE SPECIFICATION, WHETHER IN AN ACTION IN CONTRACT, TORT, STRICT LIABILITY, NEGLIGENCE, OR ANY OTHER THEORY, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
Contents
1 Overview
  OCP Characteristics
  Compliance
Part I Specification
2 Theory of Operation
3 Signals and Encoding
  Dataflow Signals
  Basic Signals
  Simple Extensions
  Complex Extensions
  Sideband Signals
  Reset, Interrupt, Error, and Core-specific Flag Signals
  Control and Status Signals
  Test Signals
  Scan Interface
  Clock Control Interface
  Debug and Test Interface
  Signal Configuration
  Dataflow Signals
  Sideband and Test Signals
  Transfer Effects
  Burst Definition
  Packing
  Burst Types
  Threads and Connections
  OCP Configuration
  Signal Options
  Protocol Options
  Minimum Implementation
  OCP Interface Compatibility
5 Timing Diagrams
  Simple Write and Read Transfer
  Request Handshake
  Request Handshake and Separate Response
  Pipelined Request and Response
  Datahandshake Extension
  Minimum Parameters
  Hold-time Parameters
  Physical Design Parameters
  Connecting Two OCP Cores
7 Core Performance
  Q-Master
  Features
  Configuration Settings
  Q-Slave
  Features
  Configuration Settings
  OCP Merger
  Configuration Restrictions
  Functionality
  OCP Monitor
  Components
  Instances
  Chip Interfaces
  Syntax Specification
  Sample Chip RTL Configuration File
12 Interface Configuration File
13 Chip Synthesis Configuration File
  Configuration File Sections
  Version
  Technology Section
  Chip Section
  Instance Section
  Syntax
  Technology Variable Syntax
  Chip Section Syntax
  Instance Section Syntax
  Sample Chip Synthesis Configuration File
14 Core Synthesis Configuration File
  Version Section
  Clock Section
  Area Section
  Port Constraints Section
  Max Delay Constraints
  False Path Constraints
  Sample Core Synthesis Configuration File
15 Package File
Part III Guidelines
16 Developers Guidelines
  Signal Timing
  Multi-threaded OCP Implementation
  Slave
  Master
  State Machine Examples
  Simple OCP Extensions
  Complex OCP Extensions
  Threads
  Connections
  Sideband Signals
  Debug and Test Interface
  Scan Control
  Clock Control
17 Timing Guidelines
  Level0 Timing
  Level1 Timing
  Level2 Timing
Index
Introduction
The Open Core Protocol (OCP) delivers the first openly licensed, core-centric protocol that comprehensively describes the system-level integration requirements of intellectual property (IP) cores. While other bus and component interfaces address only the data flow aspects of core communications, the OCP unifies all inter-core communications, including sideband control and test harness signals. The OCP's synchronous unidirectional signaling produces simplified core implementation, integration, and timing analysis.
OCP eliminates the task of repeatedly defining, verifying, documenting and supporting proprietary interface protocols. The OCP readily adapts to support new core capabilities while limiting test suite modifications for core upgrades.
Clearly delineated design boundaries enable cores to be designed independently of other system cores, yielding definitive, reusable IP cores with reusable verification and test suites.
Any on-chip interconnect can be interfaced to the OCP, rendering it appropriate for many forms of on-chip communications:
• Dedicated peer-to-peer communications, as in many pipelined signal processing applications such as MPEG2 decoding.
• Simple slave-only applications such as slow peripheral interfaces.
• High-performance, latency-sensitive, multi-threaded applications, such as multi-bank DRAM architectures.
The OCP supports very high performance data transfer models ranging from simple request-grants through pipelined and multi-threaded objects. Higher complexity SOC communication models are supported using thread identifiers to manage out-of-order completion of multiple concurrent transfer sequences.
The OCP-IP CoreCreator tool automates the tasks of building, simulating, verifying and packaging OCP-compatible cores. IP core products can be fully "componentized" by consolidating core models, timing parameters, synthesis scripts, verification suites and test vectors in accordance with the OCP specification.
Library Overview
The Open Core Protocol Specification provides technical details for working with the Open Core Protocol Interface. It is the hardware companion to the CoreCreator Guide. This document also includes information on the Socket Transaction Language (STL), behavioral models, and chip file formats.
The CoreCreator Guide is for design engineers preparing cores with CoreCreator. The document is a reference for using the GUI and the command line interface tools. It is the tools companion to the Open Core Protocol Specification.
The CoreCreator Tutorials step through the CoreCreator tools flow, from core creation through core packaging.
Overview
The Open Core Protocol (OCP) defines a high-performance, bus-independent interface between IP cores that reduces design time, design risk, and manufacturing costs for SOC designs. An IP core can be a simple peripheral core, a high-performance microprocessor, or an on-chip communication subsystem such as a wrapped on-chip bus. The Open Core Protocol:
• Achieves the goal of IP design reuse. The OCP transforms IP cores, making them independent of the architecture and design of the systems in which they are used.
• Optimizes die area by configuring into the OCP only those features needed by the communicating cores.
• Simplifies system verification and testing by providing a firm boundary around each IP core that can be observed, controlled, and validated.
The approach adopted by the Virtual Socket Interface Alliance's (VSIA) Design Working Group on On-Chip Buses (DWGOCB) is to specify a bus wrapper to provide a bus-independent Transaction Protocol-level interface to IP cores. The OCP is equivalent to VSIA's proposed Virtual Component Interface (VCI). While the VCI addresses only data flow aspects of core communications, the OCP is a superset of VCI that also supports configurable sideband control signaling and test harness signals. Only the OCP defines protocols to unify all of the inter-core communication.
OCP Characteristics
The OCP defines a point-to-point interface between two communicating entities such as IP cores and bus interface modules (bus wrappers). One entity acts as the master of the OCP instance, and the other as the slave. Only the master can present commands and is the controlling entity. The slave responds to commands presented to it, either by accepting data from the master, or presenting data to the master. For two entities to communicate in a peer-to-peer fashion, there need to be two instances of the OCP connecting them - one where the first entity is a master, and one where the first entity is a slave. Figure 1 shows a simple system containing a wrapped bus and three IP core entities: one that is a system target, one that is a system initiator, and an entity that is both.
Figure 1 System Showing Wrapped Bus and OCP Instances (labels: bus wrapper interface module, On-Chip Bus)
The characteristics of the IP core determine whether the core needs master, slave, or both sides of the OCP; the wrapper interface modules must act as the complementary side of the OCP for each connected entity.
A transfer across this system occurs as follows. A system initiator (as the OCP master) presents command, control, and possibly data to its connected slave (a bus wrapper interface module). The interface module plays the request across the on-chip bus system. The OCP does not specify the embedded bus functionality. Instead, the interface designer converts the OCP request into an embedded bus transfer. The receiving bus wrapper interface module (as the OCP master) converts the embedded bus operation into a legal OCP command. The system target (OCP slave) receives the command and takes the requested action.
Each instance of the OCP is configured (by choosing signals or bit widths of a particular signal) based on the requirements of the connected entities and is independent of the others. For instance, system initiators may require more address bits in their OCP instances than do the system targets; the extra address bits might be used by the embedded bus to select which bus target is addressed by the system initiator.
The OCP is flexible. There are several useful models for how existing IP cores communicate with one another. Some employ pipelining to improve bandwidth and latency characteristics. Others use multiple-cycle access models, where signals are held static for several clock cycles to simplify timing analysis and reduce implementation area. Support for this wide range of behavior is possible through the use of synchronous handshaking signals that allow both the master and slave to control when signals are allowed to change.
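The effect of a synchronous handshake on a multiple-cycle access can be illustrated with a small cycle-count model. This is only a sketch in Python, not part of the specification: it assumes the master holds its request unchanged until it samples the slave's accept signal (SCmdAccept, described with the basic signals) at 1 on a rising clock edge, and the function name and trace format are hypothetical.

```python
def cycles_until_accepted(accept_per_cycle):
    """Return how many clock cycles the master holds one request.

    accept_per_cycle is a hypothetical trace of SCmdAccept values (0 or 1),
    one entry per rising edge of Clk, starting with the cycle in which the
    master first presents the request.
    """
    for cycle, s_cmd_accept in enumerate(accept_per_cycle, start=1):
        if s_cmd_accept == 1:
            # Accepted on this edge; the master may now change its signals.
            return cycle
    raise RuntimeError("request was never accepted in this trace")


# A slave that accepts immediately versus one that inserts two wait cycles.
fast = cycles_until_accepted([1])
slow = cycles_until_accepted([0, 0, 1])
```

In this model the slave alone decides how long the request signals stay static, which is how a slow core can stretch an access over several cycles without changing the master.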
Compliance
For a core to be considered OCP compliant, it must satisfy the following conditions:
1. Each OCP interface on the core must:
   • Contain at least the basic OCP signals described in Basic Signals on page 12.
   • Comply with the protocol semantics for the OCP defined in Chapter 4 that are required to support the minimum command and response set defined in Minimum Implementation on page 36.
   • Specify the timing of each signal using the terminology defined in Chapter 6 on page 55.
2. The core and its interfaces must be described using the syntax defined in Chapter 10 on page 97. The interfaces must have their timing defined as described in Chapter 6 on page 55. Any non-OCP interfaces must be further described using the interface configuration file described in Chapter 12 on page 115.
Part I Specification
Theory of Operation
The Open Core Protocol interface addresses communications between the functional units (or IP cores) that comprise a system on a chip. The OCP provides independence from bus protocols without having to sacrifice high-performance access to on-chip interconnects. By designing to the interface boundary defined by the OCP, you can develop reusable IP cores without regard for the ultimate target system.
Given the wide range of IP core functionality, performance and interface requirements, a fixed definition interface protocol cannot address the full spectrum of requirements. The need to support verification and test requirements increases the complexity of the interface. To address this spectrum of interface definitions, the OCP defines a highly configurable interface. The OCP's structured methodology includes all of the signals required to describe an IP core's communications including data flow, control, and verification and test signals.
This chapter provides an overview of the concepts behind the Open Core Protocol, introduces the terminology used to describe the interface and offers a high-level view of the protocol.
Bus Independence
A core utilizing the OCP can be interfaced to any bus. A test of any bus-independent interface is to connect a master to a slave without an intervening on-chip bus. This test not only drives the specification towards a fully symmetric interface but helps to clarify other issues.
For instance, device selection techniques vary greatly among on-chip buses. Some use address decoders. Others generate independent device select signals (analogous to a board level chip select). This complexity should be hidden from IP cores, especially since in the directly-connected case there is no decode/selection logic. OCP-compliant slaves receive device selection information integrated into the basic command field.
Arbitration schemes vary widely. Since there is virtually no arbitration in the directly-connected case, arbitration for any shared resource is the sole responsibility of the logic on the bus side of the OCP. This permits OCP-compliant masters to pass a command field across the OCP that the bus interface logic converts into an arbitration request sequence.
Commands
There are two basic commands, Read and Write, and two command extensions.
The Broadcast command has the same protocol semantics as Write; the difference is that the master indicates that it is attempting to write to several or all remote target devices that are connected on the other side of the slave. As such, Broadcast is typically useful only for slaves that are in turn a master on another communication medium (such as an attached bus).
The second command extension, Read Exclusive (ReadEx), has protocol semantics that are similar to Read, but guarantees sufficient resource locking to support atomic read-modify-write or swap semantics. On receiving a ReadEx command, the slave attempts to acquire exclusive access to the addressed resource. Once the slave returns data from that address, the master can assume that it has obtained exclusive access and issue a Write command. The Write command notifies the slave to update the address (which must match the ReadEx address), and then to release exclusive access to the memory location.
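The ReadEx and Write pairing can be modeled in a few lines of Python. This is a behavioral sketch under stated assumptions, not an implementation from the specification: the LockingSlave class, its single lock, and the atomic_increment helper are hypothetical, and a real slave that cannot acquire the lock would more likely delay its response than raise an error.

```python
class LockingSlave:
    """Hypothetical slave model: a memory plus one exclusive-access lock."""

    def __init__(self):
        self.mem = {}
        self.locked_addr = None  # address held by an outstanding ReadEx

    def read_ex(self, addr):
        # Acquire exclusive access to the resource, then return its data.
        if self.locked_addr is not None:
            raise RuntimeError("resource already locked")
        self.locked_addr = addr
        return self.mem.get(addr, 0)

    def write(self, addr, data):
        if self.locked_addr is not None:
            # The Write that follows a ReadEx must use the same address.
            if addr != self.locked_addr:
                raise ValueError("Write address must match the ReadEx address")
            self.locked_addr = None  # the Write releases exclusive access
        self.mem[addr] = data


def atomic_increment(slave, addr):
    """Read-modify-write built from one ReadEx and one matching Write."""
    value = slave.read_ex(addr)
    slave.write(addr, value + 1)
    return value + 1
```

The point of the sketch is the ordering: exclusive access begins when the ReadEx data is returned and ends with the Write to the same address.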
Address/Data
Wide widths, characteristic of shared on-chip address and data buses, make tuning the OCP address and data widths essential for area-efficient implementation. Only those address bits that are significant to the IP core should cross the OCP to the slave. The OCP address space is flat and composed of 8-bit bytes (octets).
To increase transfer efficiencies, many IP cores have data field widths significantly greater than an octet. The OCP directly supports up to 128-bit data fields, allowing it to transfer 16 bytes simultaneously. The OCP refers to the chosen data field width as the word size of the OCP. The term word is used in the traditional computer system context; that is, a word is the natural transfer unit of the block. OCP masters or slaves with natural word sizes that are not directly supported are zero-extended to the next power-of-two boundary. For instance, a 12-bit DSP core would typically employ a 16-bit OCP and provide zeros on the upper nibble.
Transfers of less than a full word of data are supported by providing byte enable information that specifies which octets are to be transferred. Assembling octets into larger aggregates follows the rule that the aggregate is addressed at the OCP word-aligned address of the lowest octet in the aggregate. The lowest octet in the aggregate is also the least significant octet in the aggregate, making the OCP little-endian.
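The addressing and packing rules above can be checked with a short Python sketch. The function names are hypothetical and chosen for illustration; data_wdth is taken to be the OCP word size in bits and is assumed to be a power-of-two multiple of 8.

```python
def word_aligned_address(byte_addr, data_wdth):
    """Address of the OCP word that contains byte_addr (data_wdth in bits)."""
    bytes_per_word = data_wdth // 8
    return byte_addr & ~(bytes_per_word - 1)


def pack_word(octets):
    """Assemble octets into one word; octets[0] is the lowest-addressed byte
    and therefore the least significant (little-endian)."""
    return int.from_bytes(bytes(octets), byteorder="little")


def zero_extend(sample, natural_bits, ocp_bits):
    """Place a natural_bits-wide value on an ocp_bits-wide data field,
    leaving the unused upper bits at 0."""
    assert natural_bits <= ocp_bits
    return sample & ((1 << natural_bits) - 1)
```

For example, on a 32-bit OCP the byte at address 0x1007 belongs to the word addressed at 0x1004, the octets 0x78, 0x56, 0x34, 0x12 (lowest address first) form the word 0x12345678, and a 12-bit sample 0xABC travels on a 16-bit OCP as 0x0ABC.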
Pipelining
The OCP allows pipelining of transfers. To support this feature, the return of read data and the provision of write data may be delayed after the presentation of the associated request.
Response
The OCP separates requests from responses. A slave can accept a command request from a master on one cycle and respond in a later cycle. The division of request from response permits pipelining. The OCP provides the option of having responses for Write commands, or completing them immediately without a response (posted write model).
Burst
To provide high transfer efficiency, burst support is essential for many IP cores. The extended OCP supports annotation of transfers with burst information but requires that appropriately sequenced addresses accompany each successive command in the burst. This simplifies the requirements for address sequencing/burst count processing in the slave.
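Because every command in a burst carries its own address, the master is the side that generates the address sequence. The Python sketch below shows what that could look like for an incrementing burst; it assumes, for illustration only, that each successive address advances by one OCP word, and the function name is hypothetical.

```python
def incrementing_burst_addresses(start_addr, num_transfers, data_wdth):
    """Byte address the master presents with each command of the burst.

    Assumes an incrementing burst that advances by one OCP word per
    transfer; data_wdth is the OCP word size in bits.
    """
    bytes_per_word = data_wdth // 8
    if start_addr % bytes_per_word:
        raise ValueError("start address must be aligned to the OCP word size")
    return [start_addr + i * bytes_per_word for i in range(num_transfers)]
```

With this arrangement the slave never has to compute the next address or keep a burst counter; it can treat the burst annotation as a hint and service each command using the address presented with it.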
While the notion of a thread is a local concept between a master and a slave communicating over an OCP, it is possible to globally pass information from initiator to target using connection identifiers. Connection information helps to identify the initiator and determine priorities at the target.
Dataflow Signals
The dataflow signals consist of a small set of required signals called the basic OCP and optional signals that can be configured to support additional core communication requirements. The optional dataflow signals are grouped into simple and complex extensions. The naming conventions for dataflow signals use the prefix M for signals driven by the OCP master and S for signals driven by the OCP slave.
Basic Signals
Table 1 lists the basic OCP signals that must be present in any OCP interface.
Table 1 Basic OCP Signals
Name        Width           Driver  Function
Clk         1               varies  OCP clock
MAddr       1-32            master  Transfer address
MCmd        3               master  Transfer command
MData       8/16/32/64/128  master  Write data
SCmdAccept  1               slave   Slave accepts transfer
SData       8/16/32/64/128  slave   Read data
SResp       2               slave   Transfer response
Clk
Clock signal for the OCP. All interface signals are synchronous to the rising edge of Clk.
MAddr
The transfer address, MAddr, specifies the slave-dependent address of the resource targeted by the current transfer. To configure this field, use the addr_wdth parameter. MAddr is a byte address that must be aligned to the OCP word size (data_wdth). If the OCP word size is larger than a single byte, the aggregate is addressed at the OCP word-aligned address of the lowest byte in the OCP word, and the lowest order address bits are hardwired to 0.
MCmd
Transfer command. This signal indicates the type of transfer at the OCP. Commands are encoded as follows.
Table 2 Command Encoding
MCmd[2:0]  Transaction Type  Mnemonic
0 0 0      Idle              IDLE
0 0 1      Write             WR
0 1 0      Read              RD
0 1 1      ReadEx            RDEX
1 0 0      Reserved
1 0 1      Reserved
1 1 0      Reserved
1 1 1      Broadcast         BCST
The set of allowable commands can be limited using the readex_enable and broadcast_enable parameters as described in Protocol Options on page 35.
MData
Write data. This field carries data from the master to the slave. The data width is configured using the data_wdth parameter. If the OCP word size is larger than a single byte, the lowest addressed byte is also the least significant in the aggregate. Together with the previously described addressing rule, this makes the OCP little-endian.
SCmdAccept
Slave accepts transfer. A value of 1 on the SCmdAccept signal indicates that the slave accepts the master's transfer request.
SData
Read data. This field carries data from the slave to the master. The field width is configured using the data_wdth parameter. If the OCP word size is larger than a single byte, the lowest addressed byte is also the least significant in the aggregate.
SResp
Response field from the slave to a transfer request from the master. Response encoding is as follows.
Table 3 Response Encoding
SResp[1:0]  Response             Mnemonic
0 0         No response          NULL
0 1         Data valid / accept  DVA
1 0         Reserved
1 1         Response error       ERR
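For use in a testbench or behavioral model, the command and response encodings of Tables 2 and 3 can be captured as enumerations. The sketch below is one possible Python rendering; the class and function names are hypothetical, and reserved encodings are deliberately left out so that they decode to None.

```python
from enum import IntEnum


class MCmd(IntEnum):
    """Transfer commands, keyed by the 3-bit MCmd[2:0] value (Table 2)."""
    IDLE = 0b000
    WR = 0b001
    RD = 0b010
    RDEX = 0b011
    BCST = 0b111


class SResp(IntEnum):
    """Slave responses, keyed by the 2-bit SResp[1:0] value (Table 3)."""
    NULL = 0b00
    DVA = 0b01
    ERR = 0b11


def decode_mcmd(bits):
    """Return the MCmd member for a 3-bit value, or None if it is reserved."""
    try:
        return MCmd(bits)
    except ValueError:
        return None
```

Treating reserved values as None rather than as errors is a design choice of this sketch; a checker might instead flag them as protocol violations.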
Simple Extensions
Table 4 lists the simple OCP extensions. The extensions add address spaces, byte enables, burst support, datahandshake, and response flow control.
Table 4 Simple OCP Extensions
Name         Width       Driver  Function
MAddrSpace   1-8         master  Address space
MBurst       3           master  Burst code
MByteEn      1/2/4/8/16  master  Byte enables
MDataValid   1           master  Write data valid
MRespAccept  1           master  Master accepts response
SDataAccept  1           slave   Slave accepts write data
MAddrSpace
Address space. This field is an extension of the MAddr field and is used to indicate the address region of a transfer. Examples of address regions are the register space versus the regular memory space of a target core or the user versus supervisor space for an initiator core. The MAddrSpace field is configured into the OCP using the addrspace parameter. The width of the MAddrSpace field is configured using the addrspace_wdth parameter. While the encoding of the MAddrSpace field is core-specific, it is recommended that target cores use 0 to indicate the internal register space.
MBurst
Burst type. This signal allows linking related transfers into a burst transaction. It is configured into the OCP using the burst parameter. It encodes both the burst type and the burst code, as shown in Table 5.
Table 5 Burst Encoding
MBurst[2:0]  Burst Type           Burst Code
0 0 0        All                  LAST
0 1 0        Incrementing         TWO
1 0 0        Incrementing         FOUR
1 1 0        Incrementing         EIGHT
0 0 1        Custom (packed)      DFLT1
0 1 1        Custom (not packed)  DFLT2
1 0 1        Streaming            STRM
1 1 1        Incrementing         CONT
MByteEn
Byte enables. This field indicates which bytes within the OCP word are part of the current transfer. There is one bit in MByteEn for each byte in the OCP word. Setting MByteEn[n] to 1 indicates that the byte associated with data wires [(8n + 7):8n] should be transferred. The MByteEn field is configured into the OCP using the byteen parameter. The allowable patterns on MByteEn can be limited using the force_aligned parameter as described on page 35.
MDataValid
Write data valid. When set to 1, this bit indicates that the data on the MData field is valid. Use the datahandshake parameter to configure this field into the OCP.
MRespAccept
Master response accept. The master indicates that it accepts the current response from the slave with a value of 1 on the MRespAccept signal. Use the respaccept parameter to enable this field into the OCP.
SDataAccept
Slave accepts write data. The slave indicates that it accepts pipelined write data from the master with a value of 1 on SDataAccept. Use the datahandshake parameter to configure this field into the OCP.
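The MByteEn rule, where bit n selects data wires [(8n + 7):8n], can be sketched in Python as follows. The helper names are hypothetical, and the merge into a previously stored word is an assumption about how a slave might apply a partial write rather than behavior defined here.

```python
def enabled_lanes(m_byte_en, data_wdth):
    """List (byte index n, high wire, low wire) for every enabled byte."""
    lanes = []
    for n in range(data_wdth // 8):
        if (m_byte_en >> n) & 1:
            lanes.append((n, 8 * n + 7, 8 * n))
    return lanes


def apply_write(old_word, m_data, m_byte_en, data_wdth):
    """Merge m_data into old_word, updating only the enabled bytes."""
    result = old_word
    for _, _, low in enabled_lanes(m_byte_en, data_wdth):
        mask = 0xFF << low
        result = (result & ~mask) | (m_data & mask)
    return result
```

On a 32-bit OCP, an MByteEn value of 0b0101 selects wires [7:0] and [23:16], so writing 0xAABBCCDD over a stored 0x11223344 would leave 0x11BB33DD under this model.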
Complex Extensions
Table 6 shows a list of complex OCP extensions. The extensions add support for threads and connections.
Table 6 Complex OCP Signal Extensions
Name           Width  Driver  Function
MConnID        1-8    master  Connection identifier
MDataThreadID  1-4    master  Write data thread identifier
MThreadBusy    1-16   master  Master thread busy
MThreadID      1-4    master  Request thread identifier
SThreadBusy    1-16   slave   Slave thread busy
SThreadID      1-4    slave   Response thread identifier
MConnID
Connection identifier. This variable-width field provides the binary encoded connection identifier associated with the current transfer request. To configure this field use the connid parameter. The field width is configured with the connid_wdth parameter.
MDataThreadID
Write data thread identifier. This variable-width field provides the thread identifier associated with the current write data. The field carries the binary-encoded value of the thread identifier. MDataThreadID is required if threads is greater than 1 and the datahandshake parameter is included. MDataThreadID has the same width as MThreadID and SThreadID.
MThreadBusy
Master thread busy. The master notifies the slave that it cannot accept any responses associated with certain threads. The MThreadBusy field is a vector (one bit per thread). A value of 1 on any given bit indicates that the thread associated with that bit is busy. Bit 0 corresponds to thread 0, and so on. The width of the field is set using the threads parameter. It is legal to enable a one-bit MThreadBusy interface for a single-threaded OCP. To configure this field, use the mthreadbusy parameter.
MThreadID
Request thread identifier. This variable-width field provides the thread identifier associated with the current transfer request. If threads is greater than 1, this field is enabled. The field width is log2(threads), rounded up to the next whole integer.
SThreadID
Response thread identifier. This variable-width field provides the thread identifier associated with the current transfer response. If threads is greater than 1, this field is enabled. The field width is log2(threads), rounded up to the next whole integer.
SThreadBusy
Slave thread busy. The slave notifies the master that it cannot accept any new requests associated with certain threads. The SThreadBusy field is a vector (one bit per thread). A value of 1 on any given bit indicates that the thread associated with that bit is busy. Bit 0 corresponds to thread 0, and so on. The width of the field is set using the threads parameter. It is legal to enable a one-bit SThreadBusy interface for a single-threaded OCP. To configure this field, use the sthreadbusy parameter.
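The width rule for the thread identifier fields and the one-bit-per-thread layout of the busy vectors can be expressed compactly in Python. This is an illustrative sketch with hypothetical function names; it uses integer arithmetic for the rounded-up base-2 logarithm to avoid floating-point error.

```python
def thread_id_width(threads):
    """Width in bits of MThreadID, SThreadID, and MDataThreadID.

    Returns 0 when threads is 1, because the thread ID fields are only
    enabled when threads is greater than 1.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    if threads == 1:
        return 0
    # Equals log2(threads) rounded up to the next whole integer.
    return (threads - 1).bit_length()


def busy_threads(thread_busy, threads):
    """Thread numbers whose bit is 1 in an MThreadBusy or SThreadBusy vector.

    Bit 0 corresponds to thread 0, bit 1 to thread 1, and so on.
    """
    return [t for t in range(threads) if (thread_busy >> t) & 1]
```

With threads set to 5, the identifier fields are 3 bits wide and the busy vectors are 5 bits wide; a busy vector of 0b00101 marks threads 0 and 2 as busy.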
Sideband Signals
Sideband signals are OCP signals that are not part of the dataflow phases, and so can change asynchronously with the request/response flow. Sideband signals convey control information such as reset, interrupt, error, and core-specific flags. They also exchange control and status information between a core and an attached system. All sideband signals are optional. Table 7 shows a list of the sideband extensions to the OCP.
Table 7 Sideband OCP Signals
Name          Width   Driver   Function
MFlag         1-8     master   Master flags
Reset_n       1       varies   Synchronous reset
SError        1       slave    Slave error
SFlag         1-8     slave    Slave flags
SInterrupt    1       slave    Slave interrupt
Control       116     system   Core control information
ControlBusy   1       core     Hold control information
ControlWr     1       system   Control information has been written
Status        116     core     Core status information
StatusBusy    1       core     Status information is not consistent
StatusRd      1       system   Status information has been read
Reset_n   Function
0         Reset Active
1         Reset Disabled
SError
Slave error. With a value of 1 on the SError signal the slave indicates an error condition to the master. The SError field is configured with the serror parameter.

SFlag
Slave flags. This variable-width set of signals allows the slave to communicate out-of-band information to the master. Encoding is completely core-specific. To configure this field into the OCP, use the sflag parameter. To configure the width of this field, use the sflag_wdth parameter.

SInterrupt
Slave interrupt. The slave may generate an interrupt with a value of 1 on the SInterrupt signal. The SInterrupt field is configured with the interrupt parameter.
StatusRd Core status event. This signal is set to 1 by the system to indicate that status information is read by the system. To configure this field into the OCP, use the statusrd parameter.
Test Signals
The test signals add support for scan, clock control, and IEEE 1149.1 (JTAG). All test signals are optional.
Table 9 Test OCP Signals
Name       Width   Driver   Function
Scanctrl   1-256   system   Scan control signals
Scanin     0-256   system   Scan data in
Scanout    0-256   core     Scan data out
ClkByp     1       system   Enable clock bypass mode
TestClk    1       system   Test clock
TCK        1       system   Test clock
TDI        1       system   Test data in
TDO        1       core     Test data out
TMS        1       system   Test mode select
TRST_N     1       system   Test reset
Scan Interface
The Scanctrl, Scanin, and Scanout signals together form a scan interface into a given IP core.

Scanctrl
Scan mode control signals. Use this variable-width field to control the scan mode of the core. A scanport_wdth > 0 configures this field into the OCP interface. Use the scanctrl_wdth parameter to configure the width of this field.

Scanin
Scan data in for a core's scan chains. Use the scanport_wdth parameter to configure this field into the OCP interface and control its width.

Scanout
Scan data out from the core's scan chains. Use the scanport_wdth parameter to configure this field into the OCP interface and control its width.
Signal Configuration
The set of signals making up the OCP interface is configurable to match the characteristics of the IP core. Throughout this chapter, configuration parameters were mentioned that control the existence and width of various OCP fields. Table 10 on page 21 summarizes the configuration parameters, sorted by interface signal group.
Table 10 OCP Signal Configuration Parameters
Group      Signal
Basic      Clk, MAddr, MCmd, MData, SCmdAccept, SData, SResp
Simple
Complex
Sideband   SFlag, SInterrupt, Status, StatusBusy (9), StatusRd (10)
Test       ClkByp, Scanctrl, Scanin, Scanout, TCK, TDI, TDO, TestClk, TMS, TRST_N
1 MByteen has a width of data_wdth/8.
2 MDataValid and SDataAccept are both configured by the same parameter (datahandshake); either both exist or neither.
3 MDataThreadID is included if threads is greater than 1 and the datahandshake parameter is set.
4 MThreadBusy has a width equal to threads. It may be included for single-threaded OCP interfaces.
5 MThreadID and SThreadID are both configured by the same parameter (threads); either both exist or neither.
6 SThreadBusy has a width equal to threads. It may be included for single-threaded OCP interfaces.
7 ControlBusy can only be included if both Control and ControlWr exist.
8 ControlWr can only be included if Control exists.
9 StatusBusy can only be included if Status exists.
10 StatusRd can only be included if Status exists.
Signal Directions
Figure 2 on page 24 summarizes all OCP signals. The direction of some signals (for example, MCmd) depends on whether the module acts as a master or slave, while the direction of other signals (for example, Control) depends on whether the module acts as a system or a core. The combination of these two choices gives a possibility of six different module configurations as shown in Table 11.
Table 11 Module Configuration Based on Signal Directions
Acts as System   Acts as Core
System Master    Core Master
System Slave     Core Slave
For example, if a module acts as OCP master and also as system, it is designated a system master. There are two special signals, Clk and Reset_n, that are typically either supplied by the system module (if one of the two modules acts as a system), or alternatively can be supplied by a third (external) entity that is neither of the two modules connected through the OCP interface.
Figure 2
[Signal summary diagram. Legend: Required, Optional.]
Master/Slave signals (Data Flow groups: Request, Response, Data Handshake; Sideband): Clk, MCmd, MAddr, MAddrSpace, MBurst, MByteEn, MThreadID, MConnID, SCmdAccept, SResp, SData, SThreadID, MRespAccept, MDataValid, MData, MDataThreadID, SDataAccept, MThreadBusy, SThreadBusy, Reset_n, SInterrupt, SError, SFlag, MFlag
System/Core signals (Sideband, Test): Control, ControlWr, ControlBusy, Status, StatusRd, StatusBusy, Scanctrl, Scanin, Scanout, ClkByp, TestClk, TCK, TDI, TDO, TMS, TRST_N
Protocol Semantics
This chapter defines the semantics of the OCP protocol by assigning meanings to the signal encodings described in the preceding chapter. Figure 3 provides a graphic view of the hierarchy of elements that compose the OCP.
Figure 3 Hierarchy of Elements
Transaction
  Transfer, Transfer, ..., Transfer
    Phase, Phase, ..., Phase
      Group (with timing information)
        Signal, Signal, ..., Signal
Signal Groups
Some OCP fields are grouped together because they must be active at the same time. The data flow signals are divided into three signal groups: request signals, response signals, and datahandshake signals. A list of the signals that belong to each group is shown in Table 12.
Table 12 OCP Signal Groups
Group           Signal          Condition
Request         MAddr           always
                MAddrSpace      always
                MBurst          always
                MByteEn         always
                MCmd            always
                MConnID         always
                MData*          datahandshake = 0
                MThreadID       always
Response        SData           always
                SResp           always
                SThreadID       always
Datahandshake   MData*          datahandshake = 1
                MDataThreadID   always
                MDataValid      always
*MData belongs to the request group, unless the datahandshake configuration parameter is enabled. In that case it belongs to the datahandshake group.
Combinational Dependencies
It is legal for some signal or signal group outputs to be derived from inputs combinationally, that is, without an intervening latch point. To avoid combinational loops, other outputs cannot be derived in this manner. Figure 4 describes a partial order of combinational dependency. For any arrow shown, the signal or signal group that is pointed to can be derived combinationally from the signal at the point of origin of the arrow or another signal earlier in the dependency chain. No other combinational dependencies are allowed.
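Figure 4
[Diagram of the partial order of combinational dependency between slave and master signals; the labels include MThreadBusy and MRespAccept.]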
Combinational paths are not allowed within the sideband and test signals, or between those signals and the data flow signals. The only legal combinational dependencies are within the data flow signals. Data flow signals, however, may be combinationally derived from Reset_n. For timing purposes, some of the allowed combinational paths are designated as preferred paths and are described in Allowed Combinational Paths for Level2 Timing on page 175.
Dataflow Signals
Signals in a signal group must all be valid at the same time. The request group is valid whenever a command other than Idle is presented on the MCmd field. The response group is valid whenever a response other than Null is presented on the SResp field. The datahandshake group is valid whenever a 1 is presented on the MDataValid field.
The accept signal associated with a signal group is valid only when that group is valid. The SCmdAccept signal is valid whenever a command other than Idle is presented on the MCmd field. The MRespAccept signal is valid whenever a response other than Null is presented on the SResp field. The SDataAccept signal is valid whenever a 1 is presented on the MDataValid field.
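The validity rules above can be summarized in a small Python sketch. It is only an illustration: the string mnemonics "IDLE" and "NULL" follow the timing diagrams in this document, and the function and dictionary names are hypothetical.

```python
# Accept signal associated with each dataflow signal group.
ACCEPT_SIGNAL = {
    "request": "SCmdAccept",
    "response": "MRespAccept",
    "datahandshake": "SDataAccept",
}


def valid_groups(mcmd: str, sresp: str, mdatavalid: int) -> set[str]:
    """Return the signal groups that are valid in the current cycle."""
    groups = set()
    if mcmd != "IDLE":        # any command other than Idle
        groups.add("request")
    if sresp != "NULL":       # any response other than Null
        groups.add("response")
    if mdatavalid == 1:
        groups.add("datahandshake")
    return groups


def valid_accept_signals(mcmd: str, sresp: str, mdatavalid: int) -> set[str]:
    """An accept signal is valid only while its group is valid."""
    return {ACCEPT_SIGNAL[g] for g in valid_groups(mcmd, sresp, mdatavalid)}
```

For instance, a cycle with a write command, a Null response, and MDataValid at 0 has only the request group valid, so only SCmdAccept carries meaning in that cycle.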
The signal groups map on a one-to-one basis to protocol phases. All signals in the group must be held steady from the beginning of a protocol phase until the end of that phase. Outside of a protocol phase, all signals in the corresponding group (except for the signal that defines the beginning of the phase) are "don't care". In addition, the MData field is a "don't care" during read-type requests, and the SData field is a "don't care" for responses to write-type requests.

A request phase begins whenever the request group becomes active. It ends when the SCmdAccept signal is sampled by Clk as 1 during a request phase.

A response phase begins whenever the response group becomes active. It ends when the MRespAccept signal is sampled by Clk as 1 during a response phase. If MRespAccept is not configured into the OCP interface (respaccept = 0), then MRespAccept is assumed to be on; that is, the response phase is exactly one cycle long.

A datahandshake phase begins whenever the datahandshake signal group becomes active. It ends when the SDataAccept signal is sampled by Clk as 1 during a datahandshake phase.
For all phases, it is legal to assert the corresponding accept signal in the cycle that the phase begins, allowing the phase to complete in a single cycle.
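The begin and end rules for a phase can be sketched as a simple per-cycle trace walker. This is a minimal illustration, assuming one (group_active, accept) sample per Clk cycle; the function name and the trace representation are hypothetical.

```python
def phase_lengths(cycles):
    """Return the length in Clk cycles of each completed phase.

    cycles: iterable of (group_active, accept) pairs, one per Clk cycle.
    A phase begins when the group becomes active and ends in the cycle
    in which the accept signal is sampled as 1.
    """
    lengths = []
    current = 0  # cycles spent so far in the phase in progress
    for active, accept in cycles:
        if not active:
            if current:
                # The group must be held steady until it is accepted.
                raise ValueError("group deasserted before being accepted")
            continue
        current += 1
        if accept:
            lengths.append(current)
            current = 0
    return lengths
```

When MRespAccept is not configured (respaccept = 0), the accept sample for the response phase can be modeled as always 1, which makes every response phase one cycle long.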
Phases in a Transfer
An OCP transfer consists of several phases as shown in Table 13. Every transfer has a request phase. Depending on the type of transfer and the OCP configuration, the datahandshake or response phase is optional.
Table 13 OCP Phases in an OCP Transfer
MCmd               Phases                   Condition
Read, ReadEx       Request, response        always
Write, Broadcast   Request                  datahandshake = 0
Write, Broadcast   Request, datahandshake   datahandshake = 1
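Table 13 can be read as a lookup from command and configuration to the phases of a transfer. The sketch below expresses that lookup in Python; the function name is hypothetical and the command names are written as they appear in the table.

```python
def transfer_phases(mcmd: str, datahandshake: bool) -> list[str]:
    """Phases that make up one transfer, following Table 13."""
    if mcmd in ("Read", "ReadEx"):
        return ["request", "response"]
    if mcmd in ("Write", "Broadcast"):
        if datahandshake:
            return ["request", "datahandshake"]
        return ["request"]
    raise ValueError(f"no transfer is associated with command {mcmd!r}")
```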
A response phase cannot begin before the associated request phase begins, but can begin in the same Clk cycle. A response phase cannot end before the associated request phase ends, but can end in the same Clk cycle.
Ungrouped Signals
Signals not covered in the description of signal groups and phases are MThreadBusy and SThreadBusy. The timing cycle of the transition of each bit that makes up each of these two fields is not specified relative to the other dataflow signals. This means that there is no specific time for an OCP master or slave to drive these signals, nor a specific time for the signals to have the desired flow-control effect. It follows that MThreadBusy and SThreadBusy can only be treated as a hint. To prevent blocking of multi-threaded OCP interfaces, the sender of a ThreadBusy signal needs to produce the signal in the cycle after the last accepted request or response phase. The receiver must take the signal into account for the thread selection of the current cycle. For a further description, see Figure 22 on page 162.
Reset_n must be asserted for at least 16 cycles of Clk to ensure that the master and slave reach a consistent internal state. The master and slave must each be able to reach their reset state regardless of the values presented on the OCP signals. If the master or slave requires more than 16 cycles of Reset_n assertion, the requirement must be documented in the IP core specifications. At the same clock edge that Reset_n is sampled deasserted, all OCP interface signals must be valid. In particular, it is legal for the master to begin its first request phase in the same clock cycle that Reset_n is deasserted.
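A testbench might check the minimum reset duration along the following lines. This is a sketch only: it assumes Reset_n is sampled once per Clk cycle and is active at 0, and the function name and default argument are hypothetical.

```python
def reset_pulses_ok(reset_n_samples, min_cycles: int = 16) -> bool:
    """Check that every completed assertion of Reset_n lasts long enough.

    reset_n_samples: one sample of Reset_n per Clk cycle (0 = reset active).
    An assertion still in progress at the end of the trace is not judged.
    """
    run = 0  # consecutive cycles with Reset_n asserted
    for sample in reset_n_samples:
        if sample == 0:
            run += 1
        else:
            if 0 < run < min_cycles:
                return False
            run = 0
    return True
```

A core that documents a longer requirement could be checked by passing a larger min_cycles value.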
Test Signals
Scanin and Scanout are "don't care" while Scanctrl is inactive (but the encoding of inactive for Scanctrl is core-specific). TestClk is "don't care" while ClkByp is 0. The timing of TRST_N, TCK, TMS, TDI, and TDO is specified in the IEEE 1149.1 standard.
Transfer Effects
A successful transfer is one that completes without error. For write-type requests (Write and Broadcast commands), there is no response and therefore no in-band error indication. For read-type requests (Read and ReadEx commands), a DVA response indicates a successful transfer. This section defines the effect that a successful transfer has on a slave. The request acts on the location addressed by the value passed over the MAddr and MAddrSpace fields (if present) during the request phase. The transfer effects of each command type are:

Idle
None.

Read
Returns the latest value of the addressed location on the SData field.

ReadEx
Returns the latest value of the addressed location on the SData field. Sets a lock for the initiating thread on at least that location. The next request on the thread that issued a ReadEx must be a Write to the same address (and with the same byte enables, if applicable). Competing requests of any type from other threads to a locked location are blocked from proceeding until the lock is unset. If the ReadEx request returns an ERR response, it is target-specific whether the lock is actually set or not.

Write
Places the value on the MData field in the addressed location. Unlocks access to the location if locked by a ReadEx.

Broadcast
Places the value on the MData field in the addressed location that may map to more than one target in a system-dependent way.

If a transfer is unsuccessful, the effect of the transfer is unspecified. It is up to higher-level protocols to determine what happened and handle any cleanup.
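The transfer effects can be illustrated with a small behavioral model of a slave. This is a sketch under several assumptions that are not part of the OCP: the class name, the word-granular lock, the default value of 0 for unwritten locations, and the "BLOCKED" return value, which merely stands in for a request that the slave does not let proceed. The model does not enforce the rule that a ReadEx must be followed by a Write on the same thread.

```python
class SlaveModel:
    """Hypothetical behavioral model of transfer effects on a slave."""

    def __init__(self):
        self.mem = {}     # address -> stored value
        self.locks = {}   # address -> thread holding the ReadEx lock

    def _blocked(self, addr, thread):
        owner = self.locks.get(addr)
        return owner is not None and owner != thread

    def request(self, cmd, addr=None, thread=0, data=None):
        if cmd == "Idle":
            return None
        if self._blocked(addr, thread):
            return "BLOCKED"  # modeling convention, see lead-in
        if cmd == "Read":
            return self.mem.get(addr, 0)
        if cmd == "ReadEx":
            self.locks[addr] = thread        # lock for the initiating thread
            return self.mem.get(addr, 0)
        if cmd == "Write":
            self.mem[addr] = data
            self.locks.pop(addr, None)       # a Write releases the lock
            return None
        if cmd == "Broadcast":
            self.mem[addr] = data            # one target of possibly many
            return None
        raise ValueError(f"unknown command {cmd!r}")
```

In this model a ReadEx from thread 0 followed by a Write from thread 1 to the same address leaves the memory unchanged until thread 0 issues its own Write.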
Burst Definition
A burst is a set of transfers that are linked together into a transaction by applying a burst code to each transfer. The burst code specifies the type of burst and can also specify some information about the number of transfers left in the burst. Each individual transfer in the burst has complete request information. The burst code itself is only a hint to improve system performance and can be ignored by the slave. OCP burst codes are described in Table 14. MBurst field burst codes are specified in Table 5 on page 14. The OCP protocol places few restrictions on the values in the burst field because the information is only advisory. The main requirement is that a burst of one type must be completed (by issuing at least one command with the burst code LAST) before a burst of another type can begin. Single (non-burst) transfers always use a burst code of LAST.
Table 14 Burst Codes
Name    Type                  Transfers Remaining
LAST    (any)                 1 (end burst)
TWO     incrementing          at least 2 (burst length known)
FOUR    incrementing          at least 4 (burst length known)
EIGHT   incrementing          at least 8 (burst length known)
CONT    incrementing          at least 2 (burst length unknown)
STRM    streaming             at least 2
DFLT1   custom (packed)       at least 2
DFLT2   custom (not packed)   at least 2
Packing
The concept of packing must be defined before discussing burst types. When a transfer is issued across an OCP, there are two related concepts, OCP data width (or OCP word size) and natural transfer size. The natural transfer size is the size of the data units being transferred, and is not necessarily the same as the OCP word size. For example, it is possible to simultaneously transfer two, two-byte data units over a four-byte wide OCP. Packing refers to the aggregation of multiple data units of the natural transfer size into a single OCP transfer. Packing allows the system to make use of the burst attributes to improve the overall data transfer efficiency in the face of multiple OCP interfaces of different data widths. For example, if a bridge is translating a narrow OCP to a wide OCP, it can aggregate (or pack) the incoming narrow transfers into a smaller number of outgoing wide transfers. Burst types support either maximal packing, or no packing. Maximal packing means that as many natural transfer units as are available and as can fit into the OCP data word must be aggregated each time. No packing means that only one natural transfer unit is passed across the OCP at a time, even if the OCP word size is larger than the natural transfer size of the data.
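The following Python sketch illustrates maximal packing of narrow data units into wider OCP words, as a bridge might do. The byte lane ordering is not defined in this section, so the sketch assumes that the first unit occupies the least significant lanes; the function name is hypothetical.

```python
def pack_units(units, unit_bytes: int, word_bytes: int) -> list[int]:
    """Maximally pack natural-size data units into OCP words.

    Assumption: earlier units go into the less significant byte lanes.
    The final word may hold fewer units if no more are available.
    """
    if word_bytes % unit_bytes:
        raise ValueError("word size must be a multiple of the unit size")
    per_word = word_bytes // unit_bytes
    unit_mask = (1 << (8 * unit_bytes)) - 1
    words = []
    for i in range(0, len(units), per_word):
        word = 0
        for lane, unit in enumerate(units[i:i + per_word]):
            word |= (unit & unit_mask) << (8 * unit_bytes * lane)
        words.append(word)
    return words
```

With two-byte units on a four-byte OCP, three units are carried in two transfers instead of three, which is the efficiency gain that packing is intended to provide.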
Burst Types
Of the different burst types, two are very specific and two are left largely unspecified for core-specific customization:
• Incrementing bursts
• Streaming bursts
• Custom burst patterns that can be packed
• Custom burst patterns that cannot be packed
Table 15 summarizes the attributes of the burst types. All transfers in a given burst must use the same MCmd, MAddrSpace, MThreadID, and MConnID.
Table 15 Burst Types*
Attribute          Incrementing Burst                                Streaming Burst          DFLT1   DFLT2
Command            read or write                                     read or write
Address sequence   incremented by OCP data width on every transfer   same on every transfer
Burst codes        any combination of TWO, FOUR, EIGHT, CONT         STRM
Byte enables       no restrictions
Packing            maximally packed
* All transfers in a given burst must use the same MCmd, MAddrSpace, MThreadID, and MConnID for all burst types.
Incrementing Bursts
Incrementing bursts are read or write bursts that are characterized by an address that increases by the data width of the OCP on every transfer; that is, the word-aligned address increments by the word size. A typical use for an incrementing burst is a DMA transfer from DRAM. For incrementing bursts the burst codes give a hint of how many total transfers remain (including the current one). Since not all transfer lengths are supported, the master rounds down to the nearest power of two. If the burst length is not known by the master (as in PCI bursts, for example), the burst code CONT (continue) is used. Incrementing bursts are governed by the following rules:
• The address of every transfer in the burst is incremented by the OCP data width. Bursts must not wrap around the OCP address size.
• Within a given burst, incrementing burst codes (that is, TWO, FOUR, EIGHT, and CONT) can be freely intermixed. This makes it legal to issue a read with a burst code of EIGHT and then immediately follow that with another read (with an appropriately incremented address) that specifies a burst code LAST to end the burst.
• Incrementing bursts have no byte enable restrictions; each transfer in the burst can have any byte enable pattern.
• Incrementing bursts must be maximally packed. This means that if the natural transfer width of the burst is smaller than the OCP word size, the transfers are packed together to match the OCP word size. The packed burst of narrow transfers must be indistinguishable from a burst with a natural transfer width that matches the OCP word size. This satisfies the requirement that addresses in incrementing bursts always increase by the OCP word size.
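The rounding-down of the remaining transfer count to a burst code can be sketched as follows. The sketch assumes byte addressing, so the address advances by the word size in bytes on each transfer; the function names are hypothetical.

```python
def burst_code(remaining: int, length_known: bool = True) -> str:
    """Burst code hint for a transfer, counting the current transfer."""
    if remaining < 1:
        raise ValueError("remaining must be at least 1")
    if remaining == 1:
        return "LAST"
    if not length_known:
        return "CONT"
    if remaining >= 8:
        return "EIGHT"
    if remaining >= 4:
        return "FOUR"
    return "TWO"


def incrementing_burst(start_addr: int, length: int, word_bytes: int):
    """List of (address, burst code) pairs for a burst of known length."""
    return [
        (start_addr + i * word_bytes, burst_code(length - i))
        for i in range(length)
    ]
```

For a burst of eight transfers this produces the codes EIGHT, FOUR, FOUR, FOUR, FOUR, TWO, TWO, LAST, because seven, six, and five remaining transfers all round down to FOUR.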
Streaming Bursts
Streaming bursts are characterized by the same address for all transfers within the burst. They are used to support clients such as networking and communications cores, where the address identifies the port and the transfers tend to be data-only. Examples include modems, Ethernet controllers, and A/D converters. A streaming burst can be either a read or a write burst. The address must remain constant during a streaming burst transaction. Burst code STRM is used for every transfer except the last. The byte enables representing the natural data width for a given transfer in the streaming burst must all be enabled. This means that the byte enable pattern is the same for every transfer in the transaction, and that there are no holes in the byte enable pattern. A streaming burst cannot be packed and cannot be wider than the OCP word size, so if a streaming burst with a natural transfer width of two bytes is sent over a 32-bit wide OCP, not all OCP byte enables are enabled.
Custom Bursts
Custom burst types apply to bursts with different command, address sequence, or byte enable attributes than the pre-defined incrementing and streaming burst types. The difference between the two custom burst types is that one is packed and the other is not. Packing allows for efficient transfer of bursts over OCPs of varying width. To allow packing, the address behavior of a burst is restricted according to the maximally packed guidelines described in Incrementing Bursts on page 33. As a result, the address sequencing behavior for transfers smaller than the OCP word size must be incrementing. A typical use of packed custom bursts is to support behavior such as critical-word-first cache line refill. The burst type is thread and connection specific. By using multiple threads or connections, a slave can support many different burst styles, such as different cache line sizes for different processors.
about which transfers are part of which thread must be maintained by the bridge, but the actual assignment of thread IDs is done on a per-OCP-interface basis. There is no way for a slave on the far side of a bridge to extract the original thread ID unless the slave design comprehends the characteristics of the bridge. Use connections whenever information about a request must be sent end-to-end from master to slave. Any bridges in the path between the end-to-end partners preserve the connection ID, even as thread IDs are re-assigned on each OCP interface in the path. The MConnID field transfers the connection ID during the request phase. Since this establishes the mapping onto a thread ID, the other phases do not require a connection ID but are unambiguous with only a thread ID. The SThreadBusy and MThreadBusy signals are used to indicate that a particular thread is busy. If another request (or response) is issued on a busy thread, it is unlikely that the request (or response) will be accepted. These signals provide hints that allow the master or slave to avoid having the interface blocked until the thread becomes non-busy. For more information, see Ungrouped Signals on page 29.
OCP Configuration
Signal Options
The configuration parameters described in Signal Configuration on page 21 not only configure the corresponding signal into the OCP interface, but also enable the function. For example, if the burst parameter is enabled, the MBurst field is added and the interface also supports burst extensions as described in Burst Definition on page 31.
Protocol Options
Not all devices support all allowable byte enable patterns. The force_aligned parameter limits byte enable patterns on the OCP to be power-of-two in size and aligned to that size. A master with this option set must not generate any byte enable patterns that are not force aligned. A slave with this option set cannot handle any byte enable patterns that are not force aligned.
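A check for force-aligned byte enable patterns might look like the sketch below. It assumes that bit i of the byte enable value corresponds to byte lane i, that the enabled bytes must be contiguous, and that an all-zero pattern is rejected; the last point in particular is an assumption of this sketch and is not stated above. The function name is hypothetical.

```python
def is_force_aligned(byteen: int, num_bytes: int) -> bool:
    """True if the pattern is a power-of-two size aligned to that size."""
    lanes = [i for i in range(num_bytes) if (byteen >> i) & 1]
    if not lanes:
        return False  # assumption: an empty pattern is not force aligned
    size = len(lanes)
    start = lanes[0]
    contiguous = lanes == list(range(start, start + size))
    power_of_two = (size & (size - 1)) == 0
    aligned = start % size == 0
    return contiguous and power_of_two and aligned
```

On a four-byte OCP, the patterns 0011 and 1100 pass this check, while 0110 (misaligned) and 0111 (three bytes) do not.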
Not all devices support the optional commands ReadEx and Broadcast. The readex_enable and broadcast_enable parameters indicate whether an optional command is supported. A master with one of these options not set must not generate the corresponding command. A slave with one of these options not set cannot handle the corresponding command.
The burst_aligned parameter provides information about the size and alignment of bursts issued by an attached master core and can be used to optimize the system. Setting burst_aligned requires all incrementing bursts to:
• Have exactly a power of two transfers, even if error responses are received.
• Use only burst codes EIGHT, FOUR, TWO, and LAST in the appropriate sequence. For example, a burst of size 8 has the following sequence of burst codes: EIGHT, FOUR, FOUR, FOUR, FOUR, TWO, TWO, LAST.
• Have all byte enables turned on in every transfer (or use an OCP without byte enables).
• Have their starting address aligned with their total burst size.
Bursts of sizes larger than 8 are allowed; however, the target cannot immediately tell the burst size, since OCP burst codes top out at EIGHT.
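The size and alignment requirements of burst_aligned can be checked with a short sketch. It assumes byte addressing and takes the total burst size to be the number of transfers multiplied by the OCP word size in bytes; the function name is hypothetical, and the burst code and byte enable requirements are not checked here.

```python
def is_burst_aligned(start_addr: int, length: int, word_bytes: int) -> bool:
    """Check the size and alignment rules for a burst_aligned burst."""
    power_of_two = length >= 1 and (length & (length - 1)) == 0
    if not power_of_two:
        return False
    total_bytes = length * word_bytes  # assumption: total burst size in bytes
    return start_addr % total_bytes == 0
```

For example, an eight-transfer burst on a four-byte OCP covers 32 bytes, so under these assumptions it may start at address 0x40 but not at 0x10.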
Minimum Implementation
A minimal OCP implementation must support at least the basic OCP dataflow signals. OCP-compatible masters and slaves must support the command types Idle, Read, and Write. Support for ReadEx and Broadcast is optional. OCP-compatible masters and slaves must support response types NULL and DVA. The ERR response type is optional and should only be included if the OCP-compatible slave has the ability to report errors. All OCP masters must be able to accept the ERR response.
3. At the signal level, two interfaces are compatible if the tie-off rules, described in the next section, are observed for any mismatch at the signal level. Unless otherwise specified, all tied-off inputs are tied to logical zero.
SDataAccept
Must be configured to match (follows from MDataValid).

MConnID
If the master includes the field but the slave does not (or if the master is wider than the slave), bits are lost. If the master does not include the field but the slave does (or the master is narrower than the slave), any missing slave input bits are tied to zero. Only the smaller set of connection IDs can be used.

MThreadID
If the master includes the field but the slave does not (or the master is wider than the slave), bits are lost. If the master does not include the field but the slave does (or the master is narrower than the slave), any missing slave input bits are tied to zero. Only the smaller set of thread IDs can be used.

MDataThreadID
Same as MThreadID.

MThreadBusy
If the master includes the field but the slave does not (or the master is wider than the slave), bits are lost. If the master does not include the field but the slave does (or the master is narrower than the slave), missing slave input bits are tied to zero. No useful information can be passed on the non-matching signals, so on the corresponding threads the interface may become blocked.

SThreadID
Same as MThreadID, except that the master and slave are reversed.

SThreadBusy
Same as MThreadBusy, except that the master and slave are reversed.

Reset_n
Need not match. If master or slave does not include Reset_n, it must present a valid OCP encoding on all its OCP outputs at all times.

SInterrupt
If the master enables the field but the slave does not, the master input is tied to zero. If the master does not enable the field but the slave does, the bit is lost. In either case no interrupt can be communicated over the OCP.

SError
If the master includes the field but the slave does not, the master input is tied to zero. If the master does not include the field but the slave does, the bit is lost. In either case no out-of-band error can be communicated from slave to master over the OCP.

SFlag
If the master includes the field but the slave does not (or the master is wider than the slave), the missing master input bits are tied to zero. If the master does not include the field but the slave does (or the master is narrower than the slave), any extra bits are lost. In either case no information can be passed on the non-matching flag signals.

MFlag
Same as SFlag, except that master and slave are reversed.

The remaining sideband signals and all the test signals must be configured to match exactly.
Width Mismatch
For signals that allow a connection with a width mismatch and that use a binary encoding (MAddr, MAddrSpace, MConnID, MThreadID, MDataThreadID, SThreadID), it is always the most significant bits that are lost. This is also the case for the MThreadBusy and SThreadBusy signals. For the SFlag and MFlag signals any arbitrary bits can be dropped and these must be specified explicitly.
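For the binary-encoded fields, the effect of a width mismatch can be sketched as a mask. The sketch models the value seen by the receiving side when the driver and receiver widths differ: extra most significant bits are lost, and missing bits are tied to zero. A width of 0 models a field that is not included. The function names are hypothetical.

```python
def connect(value: int, driver_width: int, receiver_width: int) -> int:
    """Value seen by the receiver across a width mismatch.

    Most significant bits beyond the receiver width are lost; receiver
    bits beyond the driver width are tied to zero.
    """
    value &= (1 << driver_width) - 1           # what the driver can express
    return value & ((1 << receiver_width) - 1)  # what the receiver can see


def usable_id_count(driver_width: int, receiver_width: int) -> int:
    """Only the smaller set of identifiers can be used."""
    return 1 << min(driver_width, receiver_width)
```

For example, a 4-bit MThreadID value of binary 1011 arrives at a 2-bit slave input as binary 11, so only four thread IDs are usable on that interface.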
Timing Diagrams
The timing diagrams within this chapter look at signals at strategic points and are not intended to provide full explanations but rather, highlight specific areas of interest. The diagrams are provided solely as examples. For related information about phases, see Signal Timing and Protocol Phases on page 27. Most fields are unspecified whenever their corresponding phase is not asserted. This is indicated by the striped pattern in the waveforms. For example, when MCmd is IDLE the request phase is not asserted, so the values of MAddr, MData, and SCmdAccept are unspecified. Subscripts on labels in the timing diagrams denote transfer numbers that can be helpful in tracking a transfer across protocol phases. For a description of timing diagram mnemonics, see Tables 2 and 3 on page 13.
[Timing diagram. Request phase signals: MCmd, MAddr, MData, SCmdAccept. Response phase signals: SResp, SData.]
MCmd:   IDLE, WR1, IDLE, RD2, IDLE
MAddr:  A1, A2
MData:  D1
SResp:  NULL, DVA2, NULL
SData:  D2
Sequence
A. The master starts a request phase on clock 1 by switching the MCmd field from IDLE to WR. At the same time, it presents a valid address (A1) on MAddr and valid data (D1) on MData. The slave asserts SCmdAccept in the same cycle, making this a 0-latency transfer.
B. The slave captures the values from MAddr and MData and uses them internally to perform the write. Since SCmdAccept is asserted, the request phase ends.
C. The master starts a read request by driving RD on MCmd. At the same time, it presents a valid address on MAddr. The slave asserts SCmdAccept in the same cycle for a request accept latency of 0.
D. The slave captures the value from MAddr and uses it internally to determine what data to present. The slave starts the response phase by switching SResp from NULL to DVA. The slave also drives the selected data on SData. Since SCmdAccept is asserted, the request phase ends.
E. The master recognizes that SResp indicates data valid and captures the read data from SData, completing the response phase. This transfer has a request-to-response latency of 1.
Request Handshake
Figure 6 illustrates the basic flow-control mechanism for the request phase using SCmdAccept. There are three write transfers, each with a different request accept latency.
Figure 6 Request Handshake
[Timing diagram. Request phase signals: MCmd, MAddr, MData, SCmdAccept. Response phase signals: SResp, SData.]
MCmd:   IDLE, WR1, WR2, IDLE, WR3, IDLE
MAddr:  A1, A2, A3
MData:  D1, D2, D3
SResp:  NULL
Sequence
A. The master starts a write request by driving WR on MCmd and valid address and data on MAddr and MData, respectively. The slave asserts SCmdAccept in the same cycle, for a request accept latency of 0.
B. The master starts a new transfer in the next cycle. The slave captures the write address and data. It deasserts SCmdAccept, indicating that it is not yet ready for a new request.
C. Recognizing that SCmdAccept is not asserted, the master holds all request phase signals (MCmd, MAddr, and MData). The slave asserts SCmdAccept in the next cycle, for a request accept latency of 1.
D. The slave captures the write address and data.
E. After 1 idle cycle, the master starts a new write request. The slave deasserts SCmdAccept.
F. Since SCmdAccept is asserted, the request phase ends. SCmdAccept was low for 2 cycles, so the request accept latency for this transfer is 2. The slave captures the write address and data.

[Timing diagram. Request phase signals: MCmd, MAddr, MData, SCmdAccept. Response phase signals: SResp, SData.]
MCmd:   IDLE, RD1, IDLE
MAddr:  A1
SResp:  NULL, DVA1, NULL
SData:  D1
Sequence
A. The master starts a request phase by issuing the RD command on the MCmd field. At the same time, it presents a valid address on MAddr. The slave is not ready to accept the command yet, so it deasserts SCmdAccept.
B. The master sees that SCmdAccept is not asserted, so it keeps all request phase signals steady. The slave may be using this information for a long decode operation, and it expects the master to hold everything steady until it asserts SCmdAccept.
C. The slave asserts SCmdAccept. The master continues to hold the request phase signals.
D. Since SCmdAccept is asserted, the request phase ends. The slave captures the address, and although the request phase is complete, it is not ready to provide the response, so it continues to drive NULL on the SResp field. For example, the slave may be waiting for data to come back from an off-chip memory device.
E. The slave is ready to present the response, so it issues DVA on the SResp field, and drives the read data on SData.
F. The master sees the DVA response and captures the read data.
[Timing diagram. Request phase signals: MCmd, MAddr, MData, SCmdAccept. Response phase signals: SResp, SData.]
MCmd:   IDLE, RD1, IDLE, RD2, RD3, IDLE
MAddr:  A1, A2, A3
SResp:  NULL, DVA1, NULL, DVA2, NULL, DVA3, NULL
SData:  D1, D2, D3
Sequence
A. The master starts the first read request, driving RD on MCmd and a valid address on MAddr. The slave asserts SCmdAccept, for a request accept latency of 0. When the slave sees the read command, it responds with DVA on SResp and valid data on SData. This requires a combinational path in the slave from MCmd, and possibly other request phase fields, to SResp, and possibly other response phase fields.
B. Since SCmdAccept is asserted, the request phase ends. The master sees that SResp is DVA and captures the read data from SData. Because the request was accepted and the response was presented in the same cycle, the request-to-response latency is 0.
C. The master launches a read request, and the slave asserts SCmdAccept.
D. The master sees that SCmdAccept is asserted, so it can launch a third read even though the response to the previous read has not been received. The slave captures the address of the second read and begins driving DVA on SResp and the read data on SData.
E. Since SCmdAccept is asserted, the third request ends. The master sees that the slave has produced a valid response to the second read and captures the data from SData. The request-to-response latency for this transfer is 1.
F. The slave has the data for the third read, so it drives DVA on SResp and the data on SData.
G. The master captures the data for the third read from SData. The request-to-response latency for this transfer is 2.
Non-Pipelined Read
Figure 9 shows three read transfers to a slave that cannot pipeline responses after requests. This is the typical behavior of legacy computer bus protocols with a single WAIT or ACK signal. In each transfer, SCmdAccept is asserted in the same cycle that SResp is DVA. Therefore, the request-to-response latency is always 0, but the request accept latency varies from 0 to 2.
Figure 9 Non-Pipelined Read
[Timing diagram. Signals shown: Clk; request phase MCmd, MAddr, MData, SCmdAccept; response phase SResp, SData. Three reads RD1 to RD3 are issued at addresses A1 to A3; the responses DVA1 to DVA3 return data D1 to D3.]
Sequence
A. The master starts the first read request, driving RD on MCmd and a valid address on MAddr. The slave asserts SCmdAccept, for a request accept latency of 0. When the slave sees the read command, it responds with DVA on SResp and valid data on SData. (This requires a combinational path in the slave from MCmd, and possibly other request phase fields, to SResp, and possibly other response phase fields.)
B. The master launches another read request. It also sees that SResp is DVA and captures the read data from SData. The slave is not ready to respond to the new request, so it deasserts SCmdAccept.
C. The master sees that SCmdAccept is low and extends the request phase. The slave is now ready to respond in the next cycle, so it simultaneously asserts SCmdAccept and drives DVA on SResp and the selected data on SData. The request accept latency is 1.
D. Since SCmdAccept is asserted, the phase ends. The master sees that SResp is now DVA and captures the data.
E. The master launches a third read request. The slave deasserts SCmdAccept.
F. The slave asserts SCmdAccept after 2 cycles, so the request accept latency is 2. It also drives DVA on SResp and the read data on SData.
G. The master sees that SCmdAccept is asserted, ending the phase. It also sees that SResp is now DVA and captures the data.
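The request accept latencies quoted in this sequence can be reproduced from a cycle-by-cycle trace. The following Python sketch is illustrative only: the trace representation (one (MCmd, SCmdAccept) pair sampled per clock cycle) and the function name are assumptions made for this example, not part of the OCP definition.

```python
def request_accept_latencies(trace):
    """Return the request accept latency of each request in a trace.

    trace is a list of (mcmd, scmdaccept) tuples, one per clock cycle
    (an assumed representation). A request phase is in progress while
    mcmd is not "IDLE" and ends in the first cycle in which scmdaccept
    is True. The latency is the number of cycles the request was held
    before that accepting cycle.
    """
    latencies = []
    waited = 0
    for mcmd, scmdaccept in trace:
        if mcmd == "IDLE":
            waited = 0
            continue
        if scmdaccept:
            latencies.append(waited)
            waited = 0
        else:
            waited += 1
    return latencies


# Hypothetical trace modeled on the non-pipelined read description.
non_pipelined = [
    ("RD", True),     # first read accepted immediately
    ("RD", False),    # second read held for one cycle
    ("RD", True),     # second read accepted
    ("IDLE", False),
    ("RD", False),    # third read held for two cycles
    ("RD", False),
    ("RD", True),     # third read accepted
]
print(request_accept_latencies(non_pipelined))
```

Because the non-pipelined slave asserts SCmdAccept in the same cycle that it drives DVA, the accept latency computed here is also the total wait seen by the master for each read.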
Burst Read
Figure 10 illustrates a burst read transaction that is composed of four pipelined burst read transfers. An additional field, MBurst, is added to the request phase, indicating the type of the burst and the number of transfers that the master expects. In this diagram, MData and SData are assumed to be 32 bits.
Figure 10 Burst Read
[Timing diagram. Signals shown: Clk; request phase MCmd, MAddr, MBurst, MData, SCmdAccept; response phase SResp, SData. Four reads RD1 to RD4 are issued at addresses 0x0, 0x4, 0x8, and 0xC with MBurst codes FOUR, TWO, TWO, and LAST; the responses DVA1 to DVA4 return data D1 to D4.]
Sequence
A. The master starts the burst read by driving RD on MCmd, the first address of the burst on MAddr, and the burst code FOUR on MBurst. The burst code indicates that this is an incrementing burst and that four or more transfers are expected. The slave is ready for anything, so it asserts SCmdAccept.
B. The master issues the next read in the burst. MAddr is set to the next word-aligned address. For 32-bit words, the address is incremented by 4. The master also changes MBurst to TWO, meaning that two or more transfers remain in the transaction.
C. The master issues the next read in the burst, incrementing MAddr and leaving MBurst set to TWO, because there are still two or more transfers remaining. The slave is now ready to respond to the first read in the burst, so it drives DVA on SResp and valid data on SData. The request-to-response latency for this transfer is 2.
D. The master issues the final read in the burst, incrementing MAddr and setting MBurst to LAST. The master also captures the data for the first read from the slave. The slave responds to the second transfer. The request-to-response latency for this transfer is 2, although it is possible for the slave to introduce more latency for each response in a burst transaction. (In OCP, bursts do not impose any additional constraints on protocol timing.)
E. The master captures the data for the second read from the slave. The slave responds to the third transfer.
F. The master captures the data for the third read from the slave. The slave responds to the fourth and last transfer.
G. The master captures the data for the last read from the slave.
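The way MAddr and MBurst evolve over the burst can be sketched in a few lines of Python. This is a minimal illustration, not a complete model of MBurst: it uses only the three burst codes named in the sequence above (FOUR, TWO, LAST) with the meanings given there, and the function name and tuple layout are assumptions made for this example.

```python
def burst_read_requests(start_addr, count, word_bytes=4):
    """Build the request sequence for an incrementing burst read.

    Returns a list of (mcmd, maddr, mburst) tuples, one per transfer.
    The burst code is chosen from the number of transfers remaining,
    counting the current one: FOUR for four or more, TWO for two or
    three, LAST for the final transfer.
    """
    if count < 1:
        raise ValueError("a burst needs at least one transfer")
    requests = []
    for index in range(count):
        remaining = count - index
        if remaining >= 4:
            code = "FOUR"
        elif remaining >= 2:
            code = "TWO"
        else:
            code = "LAST"
        requests.append(("RD", start_addr + index * word_bytes, code))
    return requests


for mcmd, maddr, mburst in burst_read_requests(0x0, 4):
    print(mcmd, hex(maddr), mburst)
```

For a four-transfer burst of 32-bit words starting at address 0, the sketch produces the codes FOUR, TWO, TWO, LAST with the address stepping by 4, which is the pattern described in steps A through D.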
Response Accept
Figure 11 shows examples of the response accept extension used with two read transfers. An additional field, MRespAccept, is added to the response phase. This signal may be used by the master to flow-control the response phase.
Figure 11 Response Accept
[Timing diagram. Signals shown: Clk; request phase MCmd, MAddr, MData, SCmdAccept; response phase SResp, SData, MRespAccept. Two reads RD1 and RD2 are issued at addresses A1 and A2; the responses DVA1 and DVA2 return data D1 and D2.]
Sequence
A. The master starts a read request by driving RD on MCmd and a valid address on MAddr. The slave asserts SCmdAccept immediately, and it drives DVA on SResp as soon as it sees the read request. The master is not ready to receive the response for the request it just issued, so it deasserts MRespAccept.
B. Since SCmdAccept is asserted, the request phase ends. The master continues to deassert MRespAccept, however. The slave holds SResp and SData steady.
C. The master starts a second read request. It is finally ready for the response from its first request, so it asserts MRespAccept. This corresponds to a response accept latency of 2.
D. Since SCmdAccept is asserted, the request phase ends. The master captures the data for the first read from the slave. Since MRespAccept is asserted, the response phase ends. The slave is not ready to respond to the second read, so it drives NULL on SResp.
E. The slave responds to the second read by driving DVA on SResp and the read data on SData. The master is not ready for the response, however, so it deasserts MRespAccept.
F. The master asserts MRespAccept, for a response accept latency of 1.
G. The master captures the data for the second read from the slave. Since MRespAccept is asserted, the response phase ends.
Datahandshake Extension
Figure 12 shows three write transfers using the datahandshake extension. This extension adds the datahandshake phase, which is completely independent of the request and response phases. Two new signals, MDataValid and SDataAccept, are added, and MData is moved from the request phase to the datahandshake phase.
Figure 12 Datahandshake Extension
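[Timing diagram. Signals shown: Clk; request phase MCmd, MAddr, SCmdAccept; datahandshake phase MDataValid, MData, SDataAccept. Three writes WR1 to WR3 are issued at addresses A1 to A3; the write data D1 to D3 is transferred in the datahandshake phase.]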
Sequence
A. The master starts a write request by driving WR on MCmd and a valid address on MAddr. It does not yet have the write data, however, so it deasserts MDataValid. The slave asserts SCmdAccept. It does not need to assert or deassert SDataAccept yet, because MDataValid is still deasserted.
B. The slave captures the write address from the master. The master is now ready to transfer the write data, so it asserts MDataValid and drives the data on MData, starting the datahandshake phase. The slave is ready to accept the data immediately, so it asserts SDataAccept. This corresponds to a data accept latency of 0.
C. The master deasserts MDataValid since it has no more data to transfer. (Like MCmd and SResp, MDataValid must always be in a valid, specified state.) The slave captures the write data from MData, completing the transfer. The master starts a second write request by driving WR on MCmd and a valid address on MAddr.
D. Since SCmdAccept is asserted, the master immediately starts a third write request. It also asserts MDataValid and presents the write data of the second write on MData. The slave is not ready for the data yet, so it deasserts SDataAccept.
E. The master sees that SDataAccept is deasserted, so it holds the values of MDataValid and MData. The slave asserts SDataAccept, for a data accept latency of 1.
F. Since SDataAccept is asserted, the datahandshake phase ends. The master is ready to deliver the write data for the third request, so it keeps MDataValid asserted and presents the data on MData. The slave captures the data for the second write from MData, and keeps SDataAccept asserted, for a data accept latency of 0 for the third write.
G. Since SDataAccept is asserted, the datahandshake phase ends. The slave captures the data for the third write from MData.
Threaded Read
Figure 13 illustrates out-of-order completion of read transfers using OCP's thread extension. This diagram is developed from Figure 8 on page 45. The thread IDs, MThreadID and SThreadID, have been added, and the order of two of the responses has been changed.
Figure 13 Threaded Read
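[Timing diagram. Signals shown: Clk; request phase MThreadID, MCmd, MAddr, MData, SCmdAccept; response phase SThreadID, SResp, SData. Three reads RD1 to RD3 are issued at addresses A1 to A3 on threads 0, 1, and 0; the responses return in the order DVA1, DVA3, DVA2 with data D1, D3, D2 on threads 0, 0, and 1.]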
Sequence
A. The master starts the first read request, driving RD on MCmd and a valid address on MAddr. It also drives a 0 on MThreadID, indicating that this read request is for thread 0. The slave asserts SCmdAccept, for a request accept latency of 0. When the slave sees the read command, it responds with DVA on SResp and valid data on SData. The slave also drives a 0 on SThreadID, indicating that this response is for thread 0.
B. Since SCmdAccept is asserted, the phase ends. The master sees that SResp is DVA and captures the read data from SData. Because the request was accepted and the response was presented in the same cycle, the request-to-response latency is 0.
C. The master launches a new read request, but this time it is for thread 1. The slave asserts SCmdAccept; however, it is not ready to respond.
D. Since SCmdAccept is asserted, the master can launch another read request. This request is for thread 0, so MThreadID is switched back to 0. The slave captures the address of the second read for thread 1, but it begins driving DVA on SResp, data on SData, and a 0 on SThreadID. This means that it is responding to the third read, before the second read.
E. Since SCmdAccept is asserted, the third request ends. The master sees that the slave has produced a valid response to the third read and captures the data from SData. The request-to-response latency for this transfer is 0.
F. The slave has the data for the second read, so it drives DVA on SResp, data on SData, and a 1 on SThreadID.
G. The master captures the data for the second read from SData. The request-to-response latency for this transfer is 3.
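A master that receives responses out of order can use SThreadID to pair each response with its request. The Python sketch below shows one way to do this, assuming that responses on the same thread return in the order their requests were issued. The function name, the request tags, and the list-based representation are hypothetical and serve only to illustrate the sequence above.

```python
from collections import defaultdict, deque


def match_responses(requests, response_threads):
    """Pair responses with requests using thread IDs.

    requests is a list of (mthreadid, tag) in issue order.
    response_threads is a list of SThreadID values in arrival order.
    Each response is matched to the oldest outstanding request on the
    same thread (an in-order-per-thread assumption).
    """
    pending = defaultdict(deque)
    for thread_id, tag in requests:
        pending[thread_id].append(tag)

    completed = []
    for thread_id in response_threads:
        if not pending[thread_id]:
            raise ValueError(f"no outstanding request on thread {thread_id}")
        completed.append(pending[thread_id].popleft())
    return completed


# Requests as issued in the threaded read: threads 0, 1, 0.
issued = [(0, "RD1"), (1, "RD2"), (0, "RD3")]
# Responses as returned: SThreadID 0, 0, 1.
print(match_responses(issued, [0, 0, 1]))
```

With the thread IDs from the threaded read example, the third read completes before the second, while the two reads on thread 0 still complete in issue order.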
Timing Specification
To enable two entities to be connected together and communicate over an OCP interface, the protocols, signals, and pin-level timing must be compatible. This chapter describes how to define interface timing for a core. This process can be applied to OCP and non-OCP interfaces.
When implementing IP cores in a technology-independent manner, it is difficult to specify only one timing number for the interface signals, since timing is dependent on technology, library, and design tools. The methodology specified in this chapter allows the timing of interface signals to be specified in a technology-independent way.
A core's timing parameters are collected into the core synthesis configuration file described in Chapter 14 on page 141.
Timing Parameters
There is a set of minimum timing parameters that must be specified for a core interface. Additional optional parameters supply more information to help the system designer integrate the core. Hold-time parameters allow hold time checking. Physical-design parameters provide details on the assumptions used for deriving pin-level timing.
Minimum Parameters
At a minimum, the timing of an OCP interface is specified in terms of two parameters:
setuptime is the latest time an input signal is allowed to change before the rising edge of the clock.
c2qtime is the latest time an output signal is guaranteed to become stable after the rising edge of the clock.
[Figure: output logic and input logic of two connected cores, with c2qtime and setuptime shown within 1 clock cycle.]
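When two cores are connected, the system integrator can use these two parameters to check whether a signal launched by one core arrives in time at the other. The Python sketch below is a simplified illustration under stated assumptions: all values are in the same time unit, the wire_delay and clock_skew arguments are optional margins added for this example, and the function names are hypothetical.

```python
def setup_slack(c2qtime, setuptime, clock_period,
                wire_delay=0.0, clock_skew=0.0):
    """Return the timing slack of a connection between two cores.

    c2qtime is taken from the driving core's output and setuptime from
    the receiving core's input. A non-negative result means the signal
    is stable early enough before the next rising clock edge.
    """
    used = c2qtime + wire_delay + setuptime + clock_skew
    return clock_period - used


def meets_setup_timing(c2qtime, setuptime, clock_period,
                       wire_delay=0.0, clock_skew=0.0):
    """True if the connection fits within one clock cycle."""
    return setup_slack(c2qtime, setuptime, clock_period,
                       wire_delay, clock_skew) >= 0


# Example values (hypothetical, in nanoseconds).
print(setup_slack(3.0, 2.0, 10.0))
print(meets_setup_timing(6.0, 5.0, 10.0))
```

The first call leaves slack in the cycle, while the second combination of values does not fit in a 10 ns clock period and would need a slower clock or a faster implementation.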
Hold-time Parameters
Hold-time parameters are needed to allow the system integrator to check hold time requirements. On the output side, c2qtimemin specifies the minimum time for a signal to propagate from a flip-flop to the given output pin. On the input side, holdtime specifies the minimum time for a signal to propagate from the input pin to a flip-flop.
[Figure: drivingcellpin driving the input logic of the core.]
For an output signal, the variable loadcellpin indicates the input load of the gate that the signal is expected to drive. The variable loads indicates how many loadcellpins the signal is expected to drive. Additionally, information on the capacitive load of the wire must be included. There are two options: either the variable wireloaddelay is specified, as shown in Figure 16, or the combination wireloadcapacitance/wireloadresistance is specified, as shown in Figure 17.
Figure 16 Variable Loads - wireloaddelay
[Diagram: output logic driving loads instances of loadcellpin through a wire characterized by wireloaddelay.]
For instructions on calculating a delay, refer to the Synopsys Design Compiler Reference.
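Figure 17
[Diagram: output logic driving loads instances of loadcellpin through a wire characterized by wireloadresistance and wireloadcapacitance.]
[Diagram: path from drivingcellpin through the core logic to loads * loadcellpin, with wireloadresistance and wireloadcapacitance on the wires. The timing annotation shows c2qtime, wireloaddelay, setuptime, and clock skew within 1 clock cycle.]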
Max Delay
In addition to the setup and c2qtime paths for a core, there may also be combinational paths between input and output ports. Use maxdelay to specify the timing for these paths.
Figure 19 Max Delay Timing
False Paths
It is possible to identify a path between two ports as being logically impossible. Such paths can be specified using the falsepath constraint syntax. For instructions on specifying the core's timing parameters, see Chapter 14 on page 141.
Core Performance
To make it easier for the system integrator to choose cores and architect the system, an IP core provider must document a core's performance characteristics. This chapter supplies a template for a core performance report on page 66, and directions on how to fill out the template.
Report Instructions
To document the core, you will need to provide the following information:
1. Core name. Identify the core by the name you assigned.
2. Core ID. Specify the precise identification of the core inside the system-on-chip. The information consists of the vendor code, core code, and revision code.
3. Core is/is not process dependent. Specify whether the core is process-dependent or not. This is important for the frequency, area, and power estimates that follow. If multiple processes are supported, name them here and specify corresponding frequency/area/power numbers separately for each core if they are known.
4. Frequency range for this core. Specify the frequency range that the core can run at. If there are conditions attached, state them clearly.
5. Area. Specify the area that the core occupies. State how the number was derived and be precise about the units used.
6. Power estimate. Specify an estimate of the power that the core consumes. This naturally depends on many factors, including the operations being processed by the core. State all those conditions clearly, and if possible, supply a file of vectors that was used to stimulate the core when the power estimate was made.
7. Special reset requirements. If the core needs Reset_n asserted for more than the default (16 OCP clock cycles), list the requirement.
8. Number of interfaces.
9. Interface information. For each OCP interface that the core provides, list the name and type. The remaining sections focus on the characteristics and performance of these OCP interfaces.
For master OCP interfaces:
a. Operations issued. State the types of operations issued (that is, reads, writes, and so on).
b. Issue rate (per OCP cycle for sequences of reads, writes, and interleaved reads/writes). State the maximum issue rate. Specify issue rates for sequences of reads, writes, and interleaved reads and writes.
c. Maximum number of operations outstanding (pipelining support). State the number of outstanding operations that the core can support, that is, whether there is support for pipelining.
d. Burst support and effect on issue rates. State whether the core has burst support, how it makes use of bursts, and how the use of bursts affects the issue rates.
e. High level flow-control. If the core makes use of high-level flow control, such as full/empty bits, state what these mechanisms are and how they affect performance.
f. Number of threads supported and use of those threads. State the number of threads supported. If more than one, explain the use of threads.
g. Connection ID support. Explain the use and meaning of connection information.
h. Use of side-band signals. For each sideband signal (such as SInterrupt, MFlag) explain the use of the signal.
i. Implementation restrictions. If the OCP interface has any implementation restrictions, they need to be clearly documented.
For slave OCP interfaces:
a. Operations supported. For slave interfaces, state the types of operations supported.
b. Unloaded latency for each operation (in OCP cycles). Describe the unloaded latency of each type of operation.
c. Throughput of operations (per OCP cycle for sequences of reads, writes, and interleaved reads/writes). State the maximum throughput of the operations for sequences of reads, writes, and interleaved reads and writes.
d. Maximum number of operations outstanding (pipelining support). State the number of outstanding operations that the core can support, that is, whether there is support for pipelining.
e. Burst support and effect on latency and throughput numbers. State whether the core has burst support, how it makes use of bursts, and how the use of bursts affects the latency and throughput numbers stated above.
f. High level flow-control. If the core makes use of high-level flow control, such as full/empty bits, state what these mechanisms are and how they affect performance.
g. Number of threads supported and use of those threads. State the number of threads supported. If more than one, explain the use of threads.
h. Connection ID support. Explain the use and meaning of connection information.
i. Use of side-band signals. For each sideband signal (such as SInterrupt, MFlag) explain the use of the signal.
j. Implementation restrictions. If the OCP interface has any implementation restrictions, they need to be clearly documented.
For every non-OCP interface, you will need to provide all of the same information as for OCP interfaces wherever it is applicable.
Sample Report
1. Core name: flashctrl
2. Core identity
Vendor code: 0x50c5
Core code: 0x002
Revision code: 0x1
3. Core is/is not process dependent: Not
4. Frequency range for this core: <100 MHz with NECCBC9-VX library
5. Area: 4400 gates (2-input NAND equivalent gates)
6. Power estimate: not available
7. Special reset requirements:
8. Number of interfaces: 2
9. Interface information:
Name: ip
Type: slave
For master OCP interfaces:
a. Operations issued
b. Issue rate (per OCP cycle for sequences of reads, writes, and interleaved reads/writes)
c. Maximum number of operations outstanding (pipelining support)
d. Burst support and effect on issue rates
e. High level flow-control
f. Number of threads supported and use of those threads
g. Connection ID and use of connection information
h. Use of side-band signals
i. Implementation restrictions
For slave OCP interfaces:
a. Operations supported: read, write
b. Unloaded latency for each operation (in OCP cycles): Register read or write: 1 cycle. The flash read takes SBFL_TAA (read access time), which can be changed by writing the corresponding register field of the emem configuration register. The flash write operation takes about 2000 cycles, since it has to go through a sequence of operations: writing the command register and reading the status register twice.
c. Throughput of operations (per OCP cycle for sequences of reads, writes, and interleaved reads/writes): No overlap of operations, therefore reciprocal of latency.
d. Maximum number of operations outstanding (pipelining support): No pipelining support.
e. Burst support and effect on latency and throughput numbers: No burst support.
f. High level flow-control
g. Number of threads supported and use of those threads
h. Connection ID and use of connection information: No connection information support.
i. Use of side-band signals: Reset_n, Control, SError. Control is used to provide additional write protection to critical blocks of flash memory. SError is used when an illegal width of write is performed. Only 16-bit writes are allowed to flash memory.
j. Implementation restrictions: Hitachi flash card HN29WT800. Only 1 flash ROM part is supported, therefore the CE_N is hardwired on the board. The ready signal, RDY_N, is not used since not all parts support it. For the BYTE_N signal, only 16-bit word transfers are supported.
For every non-OCP interface: Provide all of the same information as for OCP interfaces wherever it is applicable.
g. Connection ID and use of connection information
h. Use of side-band signals
i. Implementation restrictions
For slave OCP interfaces:
a. Operations supported
b. Unloaded latency for each operation (in OCP cycles)
c. Throughput of operations (per OCP cycle for sequences of reads, writes, and interleaved reads/writes)
d. Maximum number of operations outstanding (pipelining support)
e. Burst support and effect on latency and throughput numbers
f. High level flow-control
g. Number of threads supported and use of those threads
h. Connection ID and use of connection information
i. Use of side-band signals
j. Implementation restrictions
For every non-OCP interface: Provide all of the same information as for OCP interfaces wherever it is applicable.
Behavioral Models
Behavioral models can be used to validate OCP-compatible IP cores or for performance analysis of systems. The components in the behavioral library are:
Q-Master
Q-Slave
OCP Merger
OCP Monitor
Q-Master
Use Q-Master (Quick Master) behavioral models to stimulate cores while checking for OCP compliance and verifying core functionality and performance. Q-Master models drive the master side of an OCP interface. The models take in an assembled Socket Transaction Language (STL) script, and drive the OCP signals accordingly during system simulation. There are two types of Q-Master models. Quick System Master (QSMaster) is a behavioral OCP master used to model a system. Quick Core Master (QCMaster) is a behavioral OCP master used to model a core. The primary difference between these two behavioral models is their interaction with the OCP status and control signals. The differences are only visible in the STL statements supported by the different models (see Socket Transaction Language on page 85).
Features
Q-Master inherits the OCP features from the interface it is connected to, and configures itself accordingly. For example, if burst is enabled on the OCP, Q-Master drives burst codes on the interface. If burst is not enabled, no burst information is driven. All legal OCP configurations except those with reset disabled are supported.
Q-Master reads test vectors from a memory image file that is produced by the STL assembler, ocpasm. Q-Master drives OCP requests to and waits for OCP responses from the attached slave. Upon reaching the end of the test vector file, Q-Master sends Idle commands.
While Q-Master has the ability to drive different MThreadID values onto the OCP interface, it is not a truly multi-threaded model. To build a multi-threaded master with independent STL scripts, use multiple Q-Master models and merge their OCP interfaces into a single OCP interface using the OCP Merger model.
Configuration Settings
The following settings control Q-Master functionality.
Test Length
The maxtest_size setting controls the maximum test vector size that can be loaded into the Q-Master memory. The setting is specified in units of commands.
Table 10 Q-Master Test Memory Size
Name            Default   Function
maxtest_size    10000     maximum test size
Time Stamp
When timestamp_enable is on, cycle directives in the input Socket Transaction Language are obeyed; that is, the Q-Master model waits for the specified time to arrive until it issues the command. If timestamp_enable is off, cycle directives are ignored and the commands are issued as soon as possible. This latter mode is useful in some debugging situations, but the most common use is with timestamp_enable on.
Table 11 Q-Master Time Stamp
Name               Default   Function
timestamp_enable   1         obey cycle directives (timestamps)
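The effect of timestamp_enable on command issue can be illustrated with a small Python sketch. It is a simplification that assumes every command is accepted in the cycle it is issued and that at most one command is issued per cycle. The function name and the list of timestamps are hypothetical and do not describe the Q-Master implementation.

```python
def issue_cycles(timestamps, timestamp_enable=True):
    """Return the cycle in which each command is issued.

    timestamps holds the cycle directive of each command in script
    order. With timestamp_enable on, a command waits until its
    timestamp arrives. With it off, timestamps are ignored and
    commands are issued as soon as possible.
    """
    cycles = []
    next_free = 0  # earliest cycle in which the next command can go out
    for timestamp in timestamps:
        if timestamp_enable:
            issue = max(timestamp, next_free)
        else:
            issue = next_free
        cycles.append(issue)
        next_free = issue + 1
    return cycles


script = [0, 5, 6, 20]
print(issue_cycles(script, timestamp_enable=True))
print(issue_cycles(script, timestamp_enable=False))
```

With timestamps obeyed, the commands keep the spacing written in the script. With timestamps ignored, the same commands are issued back to back from cycle 0.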
MRespAccept Delay
If an OCP with MRespAccept extension is used, Q-Master models can delay the driving of MRespAccept upon receiving a response from the slave. The delay value (in OCP cycles) is specified by mrespaccept_delay. A value of 0 for mrespaccept_delay means that the response is accepted in the same cycle that it is received by Q-Master. If an OCP without the MRespAccept extension is used this setting is ignored.
Table 12 Q-Master Delay for MRespAccept
Name                Default   Function
mrespaccept_delay   0         MRespAccept delay in OCP cycles
MThreadBusy Control
When MThreadBusy is configured to be part of the OCP, the MThreadBusy signal associated with a thread goes active (high) for a number of cycles equal to mthreadbusy_delay after a response has been accepted on that thread. When mthreadbusy_delay is 0, the MThreadBusy signals never go active. If an OCP without the MThreadBusy extension is used this setting is ignored.
Table 13 Q-Master MThreadBusy Control
Name                Default   Function
mthreadbusy_delay   0         MThreadBusy delay in OCP cycles
MData Delay
When the OCP has the datahandshake extension enabled, write data is sent mdata_delay cycles after the request is sent. A value of 0 for mdata_delay means that the write data is sent in the same cycle as the write command. If an OCP without the datahandshake extension is used, this setting is ignored.
Table 14 Q-Master Delay for MData
Name          Default   Function
mdata_delay   0         MData delay in OCP cycles
Q-Slave
Q-Slave (Quick Slave) models are used to model simple OCP slaves such as memories and I/O devices. You can configure them to mimic common types of behavior such as SRAM, DRAM, etc. Quick System Slave (QSSlave) is a behavioral OCP slave used to model a system. Quick Core Slave (QCSlave) is a behavioral OCP slave used to model a core. The primary difference between these two behavioral models is their interaction with the OCP status and control signals. Because the two models are so similar, they are described together in this section, and only differences between the models are explicitly pointed out.
Features
Q-Slave inherits an OCP configuration from the interface to which it is connected. All legal OCP configurations are supported, except those with reset disabled.
Q-Slave waits for OCP requests from, and drives OCP responses to, the attached master. Q-Slave can behave as a simple memory model, with user-defined response latency and burst behavior, as well as the ability to pre-load memory data at the start of simulation. Q-Slave can also model read and write FIFOs. You can assign specific memory addresses inside Q-Slave to trigger changes on the OCP sideband signals.
When the MAddrSpace extension is present on the OCP, Q-Slave supports multiple address spaces in the memory area. All address spaces are completely independent of one another in terms of the storage accessed.
Q-Slave supports multiple threads. With independent threads, it is possible that two or more threads are ready to return a response to the OCP interface in the same cycle. Q-Slave resolves this contention with a simple round-robin response arbitration.
Q-Slave supports the OCP ReadEx command, but with slightly looser semantics than described in Protocol Semantics on page 25. Instead of locking out all competing commands, only competing ReadEx commands are locked out. The implementation aliases all addresses together, permitting only one lock to be active at any moment in time.
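Round-robin response arbitration among threads can be sketched as follows. The text states only that the arbitration is a simple round-robin, so the exact policy shown here (start searching at the thread after the one served last) is an assumption, and the function name and arguments are hypothetical.

```python
def round_robin_pick(ready_threads, last_served, num_threads):
    """Pick the next thread allowed to return a response.

    ready_threads is the set of thread IDs with a response ready in
    this cycle. The search starts at the thread after last_served and
    wraps around, so every ready thread is eventually served.
    Returns None if no thread is ready.
    """
    for offset in range(1, num_threads + 1):
        candidate = (last_served + offset) % num_threads
        if candidate in ready_threads:
            return candidate
    return None


# Threads 0 and 2 are both ready, and thread 0 was served last.
print(round_robin_pick({0, 2}, last_served=0, num_threads=4))
# In the following cycle thread 2 was served last, so thread 0 wins.
print(round_robin_pick({0, 2}, last_served=2, num_threads=4))
```

Starting the search after the last served thread keeps one busy thread from starving the others, which is the usual reason for choosing round-robin over a fixed priority.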
Configuration Settings
The following settings control Q-Slave functionality.
Memory Size
The size of the target memory is set with mem_2size. The memory size is in units of address bits and does not include MAddrSpace (if present).
Table 15
Name        Default   Function
mem_2size   14        memory size (address bits)
Name                Default   Function
meminit_fixed       0         initialize memory with fixed data
meminit_fixeddata   0         data to initialize memory
Table 17
Name               Default   Function
meminit_preloadb   0         preload memory from binary file
meminit_preloadh   0         preload memory from hex file
Latency
The delay in OCP cycles from request acceptance to response is controlled by the latency variable. The minimum value for latency is 0. The maximum value is 255. If latency is set to 0, SResp can occur in the same cycle as SCmdAccept. The latency parameter can be annotated with a thread identifier to specify the latency for a particular thread. For example, latency2 sets the response latency for thread 2.
Table 18 Q-Slave Latency
Name          Default                    Function
latency[n]*   3 (QCSlave), 6 (QSSlave)   response latency [on thread n]
Burst Latency
Setting burstlat_enable to 1 puts the Q-Slave in burst mode. Non-burst transactions and the first transfers in a burst have a latency equal to latency. Subsequent transfers in a burst have a latency of burstlat_cycles, which must be smaller than or equal to latency. If the burst extension is not used this setting is ignored. When burstlat_1thread is set to 0, multiple bursts can be outstanding and the performance of a burst on one thread is not affected by a burst on another thread. When burstlat_1thread is 1, the core is modelled as a single threaded core that can only have one burst outstanding over all threads. If a new burst is started while there is an ongoing burst on a different thread, the outstanding burst is interrupted. The next request on that interrupted thread once again sees latency rather than the shorter burstlat_cycles. All of these parameters can also be annotated with a thread ID to allow settings for specific threads. For example, burstlat_enable1 enables burst latency behavior on thread 1.
Table 19 Q-Slave Burst Latency
Name                  Default   Function
burstlat_enable[n]*   0         enable special latency for bursts [on thread n]
burstlat_cycles[n]*   1         latency for burst transactions [on thread n]
burstlat_1thread      0         single-threaded burst behavior
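The relationship between latency and burstlat_cycles can be shown with a short Python sketch that lists the response latency of each transfer in one uninterrupted burst. The function name is hypothetical, and the behavior with burstlat_enable off (every transfer uses latency) is an assumption based on the description above rather than a documented rule.

```python
def burst_transfer_latencies(num_transfers, latency, burstlat_cycles,
                             burstlat_enable=True):
    """Return the response latency of each transfer in one burst.

    The first transfer always sees latency. With burstlat_enable on,
    the remaining transfers see the shorter burstlat_cycles.
    """
    if burstlat_cycles > latency:
        raise ValueError("burstlat_cycles must not exceed latency")
    if num_transfers < 1:
        return []
    if not burstlat_enable:
        return [latency] * num_transfers
    return [latency] + [burstlat_cycles] * (num_transfers - 1)


# Four-transfer burst with latency 3 and burstlat_cycles 1.
print(burst_transfer_latencies(4, latency=3, burstlat_cycles=1))
print(burst_transfer_latencies(4, latency=3, burstlat_cycles=1,
                               burstlat_enable=False))
```

If burstlat_1thread is 1 and a burst on another thread interrupts this one, the next request on the interrupted thread would again see latency. That case is not modeled in this sketch.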
Name              Default   Function
randmode_enable   0         enable random latency mode
* If specified, the per-thread value is always used. If a per-thread value is not specified, the global, thread independent value is used. If neither value is specified, the per-thread default value is used.
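The precedence described in this note can be written as a small lookup function. The Python sketch below assumes that settings are held in a dictionary keyed by parameter name, with the thread number appended for per-thread values (as in latency2). The dictionary representation and the function name are assumptions made for this example.

```python
def resolve_setting(settings, name, thread, per_thread_default):
    """Resolve a per-thread setting using the precedence in the note.

    1. A per-thread value (for example "latency2") is always used.
    2. Otherwise the global, thread-independent value is used.
    3. Otherwise the per-thread default value is used.
    """
    per_thread_key = f"{name}{thread}"
    if per_thread_key in settings:
        return settings[per_thread_key]
    if name in settings:
        return settings[name]
    return per_thread_default


# Hypothetical configuration: global latency 4, thread 2 overridden to 7.
config = {"latency": 4, "latency2": 7}
print(resolve_setting(config, "latency", thread=2, per_thread_default=3))
print(resolve_setting(config, "latency", thread=0, per_thread_default=3))
print(resolve_setting({}, "latency", thread=0, per_thread_default=3))
```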
Name              Default   Function
limitreq_enable   0         limit outstanding requests per thread
limitreq_max      4         maximum number of outstanding requests per thread
SCmdAccept Delay
Q-Slave models can delay the driving of SCmdAccept upon receiving a request from the master. The delay value (in OCP cycles) is specified by scmdaccept_delay. A value of 0 for scmdaccept_delay means that the request is accepted, if possible, in the same cycle that it is received by Q-Slave.
Table 22 Q-Slave Delay for SCmdAccept
Name               Default   Function
scmdaccept_delay   0         SCmdAccept delay in OCP cycles
SDataAccept Delay
If datahandshake is enabled, Q-Slave models can delay the driving of SDataAccept upon receiving write data from the master. The delay value (in OCP cycles) is specified by sdataaccept_delay. A value of 0 for sdataaccept_delay means that the write data is accepted in the same cycle that it is received by Q-Slave. This setting is ignored if the datahandshake extension is not used.
Table 23 Q-Slave Delay for SDataAccept
Name                Default   Function
sdataaccept_delay   0         SDataAccept delay in OCP cycles
Writing any data to the interrupt_setaddr location causes the SInterrupt signal to go high. Writing any data to the interrupt_resetaddr location causes the SInterrupt signal to go low again. The two addresses must be different. The number of cycles of delay between writing of the memory location and the setting or resetting of the SInterrupt signal is configured using interrupt_delay. The same applies for the SError signal.
Table 24 Q-Slave SInterrupt Control
Name                 Default  Function
interrupt_setaddr    x3fd0    address to set SInterrupt
interrupt_resetaddr  x3fc0    address to reset SInterrupt
interrupt_delay      1        delay cycles from write to set/reset SInterrupt signal
Table 25
Name             Default  Function
error_setaddr    x3fb0    address to set SError signal
error_resetaddr  x3fa0    address to reset SError signal
error_delay      1        delay cycles from write to set/reset SError signal
SFlag signals are set using the SFlag and SFlagMask registers. Reading the SFlag register retrieves the current value of the SFlag signals. Bit 0 contains the value of SFlag0 and so on. Setting of SFlags is accomplished by writing to the SFlag register, but is subject to the SFlagMask register. Only SFlags that have their bit set in the SFlagMask register are subject to change when the SFlag register is written. Use sflag_delay to configure the number of cycles of delay between writing to the sflag_addr memory location and setting or resetting of the SFlag signals.

The default value of the SFlag register is to have all flag bits set to 0. The default value of the SFlagMask register is to have all bits corresponding to SFlags set to 1, that is, the setting of all flags is enabled.

The addresses of the SFlag register and SFlagMask register are determined by the parameters sflag_addr and sflagmask_addr respectively. Separate the SFlag and SFlagMask registers by the width of an OCP word. In the case of a write, leaving insufficient space between the registers will produce unpredictable results.

The current values of the MFlag signals can be retrieved by reading the MFlag register. The location of this register is set with the mflag_addr parameter.
Table 26
Name            Default  Function
sflag_addr      0x3d00   address of SFlag register
sflagmask_addr  0x3d80   address of SFlagMask register
sflag_delay     1        delay cycles from write to SFlag register
mflag_addr      0x3e00   address of MFlag register
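The masked update of the SFlag register described above can be sketched in a few lines of Python. This is only an illustration of the bit arithmetic, not the Q-Slave implementation; the function name write_sflag and the width argument are hypothetical.

```python
def write_sflag(current: int, written: int, mask: int, width: int) -> int:
    """Return the SFlag value after a write, honoring SFlagMask.

    Hypothetical helper: bits whose mask bit is 1 take the written value,
    bits whose mask bit is 0 keep their current value.
    """
    limit = (1 << width) - 1
    return ((current & ~mask) | (written & mask)) & limit


# Four flags, only SFlag0 and SFlag1 enabled in the mask.
new_value = write_sflag(current=0b0101, written=0b1010, mask=0b0011, width=4)
print(bin(new_value))
```

With the default mask (all ones) the written value replaces every flag bit, which matches the statement that the setting of all flags is enabled by default.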
SThreadBusy Control
Control over the behavior of SThreadBusy signals is through the sthreadbusy_enable parameter. When set to 1, the OCP SThreadBusy signals are asserted when no further requests can be accepted on a given thread. When set to 0, the SThreadBusy signals are never asserted, allowing the OCP interface to block any requests that are sent on a thread that cannot accept any more requests. This setting is ignored if the SThreadBusy extension is not used.
Table 27 Q-Slave SThreadBusy Control
Name                Default  Function
sthreadbusy_enable  1        enable SThreadBusy behavior
Error Response
Random error responses on the SResp field can be generated by setting the resperr_enable variable to 1. The resperr_rate value determines the rate at which error responses are generated: on average, one error response is generated for every resperr_rate Read or ReadEx commands.
Table 28 Q-Slave Error Response
Name            Default  Function
resperr_enable  0        generate response errors
resperr_rate    100      average response error interval
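One way to picture an "average response error interval" is an independent random draw per read with probability 1/resperr_rate. The text does not state which distribution the model uses, so the sketch below is an assumption, and the function name is hypothetical.

```python
import random


def is_error_response(rng: random.Random, resperr_rate: int) -> bool:
    """Assumed model: each Read/ReadEx fails independently with p = 1/rate."""
    return rng.random() < 1.0 / resperr_rate


rng = random.Random(0)  # fixed seed so the run is repeatable
reads = 100_000
errors = sum(is_error_response(rng, 100) for _ in range(reads))
print(f"{errors} errors in {reads} reads")
```

With the default resperr_rate of 100, roughly one read in a hundred would return an error under this assumption.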
To turn on write FIFO mode, set wrfifo_enable. The address of the FIFO is configured with wrfifo_addr, and the drain rate with wrfifo_rate. The write FIFO is filled with data by writing to the specified address, and a data word is drained once every wrfifo_rate OCP cycles. The write FIFO size is specified using wrfifo_size. If too many words are written into the write FIFO too quickly, it fills up and causes the OCP interface to stop accepting requests.

The write FIFO can be configured to give a warning of impending emptying using an interrupt. For this mode to work, SInterrupt must be configured into the OCP. The high and low marks for this warning interrupt are set with wrfifo_high and wrfifo_low respectively. SInterrupt is set when the FIFO reaches the low mark (that is, it is about to go empty) and is reset when the FIFO reaches the high mark.

The read FIFO has very similar behavior except that it is filled automatically and is drained by reading the FIFO. The read FIFO sends an interrupt when it reaches the high mark, indicating that it is about to overflow.
Table 29 Q-Slave Read/Write FIFO Mode Parameters
Name           Default  Function
wrfifo_enable  0        enable write FIFO mode
wrfifo_addr    0x0      write FIFO address
wrfifo_size    20       write FIFO size
wrfifo_high    10       write FIFO high mark: reset interrupt
wrfifo_low     3        write FIFO low mark: set interrupt
wrfifo_rate    10.0     write FIFO drain rate (cycles to drain one word)
rdfifo_enable  0        enable read FIFO mode
rdfifo_addr    0x0      read FIFO address
rdfifo_size    20       read FIFO size
rdfifo_high    10       read FIFO high mark: set interrupt
rdfifo_low     3        read FIFO low mark: reset interrupt
rdfifo_rate    10.0     read FIFO fill rate (cycles to fill one word)
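The write FIFO behavior (fill on write, drain every wrfifo_rate cycles, interrupt with high and low marks) can be approximated by a small Python model. This is a sketch under stated assumptions: the class name is hypothetical, the marks are treated as inclusive thresholds, and an empty FIFO is assumed to start with the interrupt set.

```python
class WriteFifoModel:
    """Toy model of write FIFO mode using the Table 29 defaults."""

    def __init__(self, size=20, high=10, low=3, rate=10):
        self.size, self.high, self.low, self.rate = size, high, low, rate
        self.level = 0          # words currently stored
        self.cycles = 0         # OCP cycles elapsed
        self.interrupt = True   # assumption: empty FIFO is below the low mark

    def _update_interrupt(self):
        if self.level <= self.low:
            self.interrupt = True     # about to go empty
        elif self.level >= self.high:
            self.interrupt = False    # refilled to the high mark

    def write(self) -> bool:
        """Store one word; return False when the FIFO is full."""
        if self.level >= self.size:
            return False
        self.level += 1
        self._update_interrupt()
        return True

    def tick(self):
        """Advance one OCP cycle; drain one word every `rate` cycles."""
        self.cycles += 1
        if self.cycles % self.rate == 0 and self.level > 0:
            self.level -= 1
            self._update_interrupt()


fifo = WriteFifoModel()
for _ in range(10):
    fifo.write()
print(fifo.level, fifo.interrupt)   # interrupt cleared at the high mark
for _ in range(70):
    fifo.tick()
print(fifo.level, fifo.interrupt)   # interrupt set again at the low mark
```

Between the two marks the interrupt keeps its previous state, which gives the hysteresis implied by having separate set and reset thresholds.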
Table 30
Name             Default  Function
controlreg_addr  x3fe0    address of control register
statusreg_addr   x3ff0    address of status register
For QCSlave, a set of special registers exists to drive the status signals and observe the control signals. The location of these registers is controlled by statusdrive_addr and controlobserve_addr respectively.
Table 31 QCSlave Control/Status Registers
Name                 Default  Function
controlobserve_addr  x3fe0    address to observe Control signals
statusdrive_addr     x3ff0    address to drive Status signals
In addition to interacting with the OCP control and status signals, QCSlave can also drive ControlBusy and StatusBusy. When controlbusye_enable is set, QCSlave drives the ControlBusy signal active for controlbusy_cycles for every control write (that is, for every assertion of the ControlWr signal). When statusbusye_enable is set, QCSlave drives the StatusBusy signal active for statusbusy_cycles for every status read (that is, for every assertion of the StatusRd signal).
Table 32 QCSlave ControlBusy and StatusBusy Behavior
Name                 Default  Function
controlbusye_enable  0        causes QCSlave to assert ControlBusy when ControlWr is seen asserted
controlbusy_cycles   2        number of cycles to assert ControlBusy after ControlWr is seen asserted
statusbusye_enable   0        causes QCSlave to assert StatusBusy when StatusRd is seen asserted
statusbusy_cycles    2        number of cycles to assert StatusBusy after StatusRd is seen asserted
OCP Merger
The OCP Merger behavioral model merges two or more OCP interfaces into a single OCP interface to provide an environment for multi-threaded Socket Transaction Language tests. OCP Merger allows multiple QCMasters (each with its own single-threaded STL script) to be connected together, yielding a multi-threaded OCP interface as shown in Figure 20. Besides merging the incoming requests and distributing the responses, OCP Merger allows the masters connected to it to communicate using core flags.
Figure 20
[Figure: OCP Merger block showing Slave 0 Interface and further slave interfaces on one side and the OCP Master Interface on the other.]
Configuration Restrictions
Two or more OCP masters and a single OCP slave can be connected to OCP Merger. The slave interfaces act as OCP system interfaces and must be connected to cores. The master interface acts as an OCP core interface and must be connected to a system. Observe the following rules:
1. Masters must be connected to OCP slave interfaces 0 through n, with no unconnected slave interface in the 0-n range.
2. All OCP interfaces must be identically configured with the following exceptions:
   - The OCP master interface must include reset; the OCP slave interfaces need not.
   - The OCP slave interfaces must be single-threaded, while the OCP master interface has as many threads as there are slave interfaces.
   - SThreadBusy is not required on the OCP slave interfaces, even if it is on the OCP master interface.
   - The OCP slave interfaces can have wider MFlag or SFlag fields than the OCP master interface to allow communication between the attached masters. mflag_wdth must be the same on all slave interfaces. sflag_wdth must be the same on all slave interfaces. The difference between mflag_wdth on the slave interfaces and mflag_wdth on the master interface must be identical to the difference between the sflag_wdth on the slave interfaces and sflag_wdth on the master interface.
   - No OCP interface can have any test signals enabled.
3. To avoid blocking the shared master interface, use SThreadBusy on the master interface and either disable MRespAccept or enable MThreadBusy on all of the master and slave interfaces.
Functionality
OCP Merger has no storage. Any request received on one of the slave interfaces is either immediately forwarded to the master interface, or not accepted at the slave. If multiple masters send a request at the same time, round-robin arbitration is used to access the master interface. All sideband signals coming from the OCP slave interfaces (except surplus MFlag signals) are individually ORed together and passed to the corresponding OCP master interface sideband signals. All sideband signals coming from the OCP master interface are fanned out to the corresponding OCP slave interface sideband signals. Surplus OCP slave interface MFlag signals are individually ORed together and fanned out to the corresponding surplus SFlag signals on the OCP slave interfaces as shown in Figure 21.
Figure 21 Merger Signal Processing
[Figure: MFlags and SFlags of slave interfaces slv0, slv1, and slv2, combined with the MFlag and SFlag signals of the master interface.]
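The arbitration and sideband merging described in this section can be sketched as follows. The text only says that round-robin arbitration is used, so the starting point and search order here are assumptions, and both function names are hypothetical.

```python
from typing import List, Optional


def round_robin_grant(requests: List[bool], last_grant: int) -> Optional[int]:
    """Pick the next requesting slave interface after the last one granted.

    Assumed policy: search starts at last_grant + 1 and wraps around.
    Returns None when no interface is requesting.
    """
    n = len(requests)
    for offset in range(1, n + 1):
        idx = (last_grant + offset) % n
        if requests[idx]:
            return idx
    return None


def merge_sideband(per_slave_values: List[int]) -> int:
    """OR the same sideband field from every slave interface together."""
    merged = 0
    for value in per_slave_values:
        merged |= value
    return merged


print(round_robin_grant([True, True, False], last_grant=0))
print(bin(merge_sideband([0b01, 0b10, 0b00])))
```

Because OCP Merger has no storage, an interface that is not granted in a cycle simply does not have its request accepted in that cycle.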
OCP Monitor
The OCP monitor (OCPMON) tracks the data on the OCP interface and writes it out on a per cycle basis. The output of OCPMON can be post-processed by the ocpdis, ocpperf and ocpcheck tools to view and validate OCP performance and correctness. For more information, see the CoreCreator Guide. The OCP monitor supports the full functionality and configurability of the OCP interface. While the OCP monitor supports test signals, it does not monitor them. The following settings control the behavior of the OCP monitor.
Simulation Termination
The OCP monitor terminates simulation once a specified number of consecutive idle request cycles is seen on the corresponding OCP. This feature is turned on using watch_enable and the number of idles to watch for can be specified using watch_maxidles.
Table 33 OCPMON Simulation Termination
Name            Default  Function
watch_enable    1        enable watching for idle request cycles
watch_maxidles  1000     maximum number of consecutive idles before simulation is stopped
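The idle watch can be pictured as a counter that resets on every active request cycle. The sketch below is illustrative only; the function name is hypothetical, and it assumes that simulation stops in the cycle where the count reaches watch_maxidles.

```python
from typing import Iterable, Optional


def cycle_of_termination(request_active: Iterable[bool],
                         watch_maxidles: int = 1000) -> Optional[int]:
    """Return the cycle index at which the idle watch would stop simulation."""
    idles = 0
    for cycle, active in enumerate(request_active):
        idles = 0 if active else idles + 1   # any request resets the count
        if idles >= watch_maxidles:
            return cycle
    return None  # the trace ended before the limit was reached


trace = [True, False, False, True, False, False, False]
print(cycle_of_termination(trace, watch_maxidles=3))
```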
If the monitor detects Xs after initialization has completed, simulation stops and the monitor issues an error message.
WGL Format
The OCP monitor can also output a trace of the monitored signals in WGL format. wgl_enable turns on this mode, and wgl_slave controls whether the WGL is generated from the master or the slave point of view.
Table 34 OCPMON WGL Format
Name        Default  Function
wgl_enable  0        generate WGL output file
wgl_slave   1        generate WGL from slave point of view
Basic Commands
Basic commands correspond on a one-for-one basis to the OCP commands. See Chapter 4 Protocol Semantics on page 25 for a definition of OCP command semantics. The format for a basic command is:
[cycle:] [tid[/cid]] cmd [as/]address [(be)] [data [(mask)]]
Items in brackets [ ] are optional. If an optional field is shown in parentheses ( ), the parentheses must be included. Only the cmd and address fields are required. The simplest possible command is:
read 0x0
which consists of a command (read) and an address (0x0 = 0 in hex). This command reads from address 0. The following sections describe the fields of a basic command and how they are used.
Cycle
The first OCP cycle in which this command can be executed. If the STL command is processed before the time specified by cycle, the command is held until that time before being issued to the OCP. This also delays all of the commands that follow. If the STL command is processed after the time specified by cycle, it is issued immediately. A cycle must be specified as a decimal number and must be followed by a colon.
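The effect of the cycle field on issue times can be sketched as below. This is a simplified illustration: the function name is hypothetical, and it assumes that every command occupies exactly one OCP cycle, which ignores accept and response delays.

```python
from typing import List, Optional


def issue_cycles(cycle_fields: List[Optional[int]], start: int = 0) -> List[int]:
    """Compute when each command is issued given its optional cycle field."""
    now = start
    issued = []
    for cycle in cycle_fields:
        if cycle is not None and cycle > now:
            now = cycle          # command is held until its cycle
        issued.append(now)       # otherwise it is issued immediately
        now += 1                 # assumption: one cycle per command
    return issued


# Second command asks for cycle 10, third for cycle 5 (already in the past).
print(issue_cycles([None, 10, 5, None]))
```

Holding the second command until cycle 10 also pushes back every later command, which is the delay the paragraph above describes.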
CMD
All basic commands except Idle require an address and may also have additional required fields. For example the Write command requires the data to be written. Table 35 lists the options for the cmd field. The Read and Write commands can also be annotated with a specific burst code.
Table 35 cmd Field Options
Command    Additional field  Description
Idle                         Send Idle command over the OCP (See also the Idle Macro on page 91)
Read                         Read from the specified address
Write      data              Write the data to the specified address
Broadcast  data              Broadcast the data to everyone at the specified address
Readex                       Exclusive read from the specified address
Command            Description
{burst_code}xread  Read from the specified address, using the specified burst code. For example, the command 4xread produces a read command with a burst code of 4.
0xwrite is exactly equivalent to write.
A command can have a size suffix of 8, 16, 32, 64, 128, or 256 that describes the data size to be read or written. Tests written with the size suffix perform the same transfers on OCPs with different data widths, thereby improving the reusability of the test. If a size is not specified, it is assumed to be the size of the OCP. If the OCP data width is smaller than the size suffix, the command is translated into several burst transfers. For example, on a 32-bit OCP the following command to write to address 0:
write64 0 0x0123456789abcdef
is translated to a burst of two 32-bit writes.
Address
The address to read from or write to. The address can be specified as a decimal or hexadecimal (using the prefix 0x) number. An address is zero-extended or truncated to the width of the OCP address field. For basic commands, the address must be aligned with the OCP data width. If the OCP has 8-bit-wide data, then addresses can be 0, 1, 2, 3, etc. If the data width is 16 bits, then addresses must be 0, 2, 4, etc. If the data width is 32 bits, then the addresses must be 0, 4, 8, etc.
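The truncation and alignment rules for addresses reduce to two small checks, sketched here in Python. Both helper names are hypothetical and the code is only an illustration of the arithmetic.

```python
def normalize_address(address: int, addr_width_bits: int) -> int:
    """Truncate an address to the width of the OCP address field."""
    return address & ((1 << addr_width_bits) - 1)


def is_aligned(address: int, data_width_bits: int) -> bool:
    """True when the address is a multiple of the OCP data width in bytes."""
    return address % (data_width_bits // 8) == 0


print(is_aligned(4, 32), is_aligned(2, 32), is_aligned(2, 16))
print(hex(normalize_address(0x12345, 16)))
```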
Data
The data to be written for a write command. Data can be specified as a decimal or hexadecimal (using the prefix 0x) number. Data is zero-extended to the OCP data width. When an optional data field is used in a read, readex, or xread command, it specifies the expected response data of the read. The QMaster checks that the result of the read is the specified data.
Mask
When a mask value is present with the data in a read, readex, or xread command, only the data bits set to one in the mask field are used to test the read response data. For example:
read 0x4 0x3 (0x1)
has a data field of 3 and a mask of 1. This mask means that only the rightmost bit is checked when the result of the read is known.
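The masked comparison of read data can be expressed as a single bitwise test. The sketch below uses a hypothetical function name and is not taken from the QMaster implementation.

```python
from typing import Optional


def read_matches(actual: int, expected: int, mask: Optional[int] = None) -> bool:
    """Compare read data with the expected data, looking only at masked bits."""
    if mask is None:
        return actual == expected           # no mask: every bit is checked
    return (actual & mask) == (expected & mask)


# read 0x4 0x3 (0x1): only the rightmost bit of the response is checked.
print(read_matches(actual=0x5, expected=0x3, mask=0x1))
print(read_matches(actual=0x2, expected=0x3, mask=0x1))
```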
Macro Statements
Macro statements result in the issuing of one or more basic commands, reducing the need to specify long sequences of commands such as for burst reads and writes. Macro statements, except for burst write random and burst write numeric, employ the format:
[cycle:] [tid[/cid]] macro_cmd [as/]address [count] [(be)] [data [mask]]
Burst Fill
This set of macro statements performs a block of write transfers starting at the specified address using the designated data.
bfill [as/]address count data
cfill [as/]address count data
nfill [as/]address count data
When this statement is processed it is converted into a number of individual xwrite commands. The address parameter specifies the starting address of the burst and must be aligned with the OCP data width. The parameter count determines the number of xwrite commands needed and the parameter data is the data to be written. bfill results in an incrementing burst write so only the commands 6xwrite, 4xwrite, 2xwrite, and 0xwrite are generated. cfill results in a continuous burst write. It generates the commands 7xwrite and 0xwrite as these are the legal write commands for continuous burst. nfill results in a streaming burst write (no increment). It generates the commands 5xwrite and 0xwrite. Unlike bfill and cfill, the nfill macro does not cause the address to be incremented for every transfer so all writes are to the same address.
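The expansion of the fill macros into xwrite commands can be sketched as follows. The text lists which burst codes each macro may generate but not the rule for choosing among them, so the selection rule here (the code reflects the minimum number of transfers still remaining, with 0 on the last transfer) is an assumption; the function names are hypothetical.

```python
from typing import List


def burst_code(macro: str, remaining: int) -> int:
    """Choose a burst code for a transfer with `remaining` transfers left."""
    if remaining == 1:
        return 0                      # last transfer of the burst
    if macro == "cfill":
        return 7                      # continuous burst
    if macro == "nfill":
        return 5                      # streaming burst, no increment
    # bfill: assumed to advertise at least 8, 4, or 2 remaining transfers
    for code, minimum in ((6, 8), (4, 4), (2, 2)):
        if remaining >= minimum:
            return code
    raise ValueError("remaining must be at least 1")


def expand_fill(macro: str, address: int, count: int, data: int,
                data_bytes: int) -> List[str]:
    """Expand bfill/cfill/nfill into individual xwrite commands."""
    commands = []
    for i in range(count):
        code = burst_code(macro, count - i)
        # nfill keeps writing to the same address; the others increment
        addr = address if macro == "nfill" else address + i * data_bytes
        commands.append(f"{code}xwrite {addr:#x} {data:#x}")
    return commands


for line in expand_fill("bfill", 0x0, 3, 0xFF, data_bytes=4):
    print(line)
```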
For example, the macro:
bwrite 0x0 2 ndata
results in a write of 0 at address 0, and a write of 1 at the next address (depending on the OCP data width). Assuming a data width of 8 bytes, the preceding macro produces the following basic commands:
2xwrite 0x0 0x0
0xwrite 0x8 0x1
Burst Read
Burst read macros produce a block of read transfers starting at the specified address. When this macro is processed, it is converted into a number of individual xread commands. The parameter count determines the number of generated xread commands. Unlike basic read commands, expected data and mask values are not supported for these macros.
bread [as/]address count (byteenable)
cread [as/]address count
nread [as/]address count
bread results in an incrementing burst read, so only the commands 6xread, 4xread, 2xread, and 0xread are generated. cread results in a continuous burst read. It generates the commands 7xread and 0xread, as these are the legal read commands for continuous bursts. nread results in a streaming burst read (no increment). It generates the commands 5xread and 0xread. Because the nread macro does not cause the address to be incremented for every transfer, all reads are from the same address.
write 0x0 (2) 0x70
That is, a write to address 0x0 with byte enable set to 2 and data of 0x70. If the explicit width is larger than the data width of the OCP, the macro produces more than one command. Explicit width macros work with other macro statements and basic commands. nfill only works if explicit width > the data width of the OCP. For the same 16-bit OCP, the explicit width macro statement:
bfill8 0x1 4 0xfe
results in:
2xwrite 0x0 (2) 0xfe00
2xwrite 0x2 (3) 0xfefe
xwrite 0x4 (1) 0x00fe
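The address, byte enable, and data values in the bfill8 example can be reproduced with a short calculation. The sketch assumes that byte lane k of an OCP word carries the byte at aligned address + k (which is what the example values imply); the function name is hypothetical and burst codes are left out.

```python
from typing import List, Tuple


def expand_byte_fill(address: int, count: int, byte_value: int,
                     ocp_bytes: int) -> List[Tuple[int, int, int]]:
    """Map `count` byte writes onto aligned OCP words.

    Returns (aligned address, byte enable, data) for each word touched.
    """
    words = {}  # aligned address -> [byte_enable, data]
    for byte_addr in range(address, address + count):
        lane = byte_addr % ocp_bytes
        base = byte_addr - lane
        entry = words.setdefault(base, [0, 0])
        entry[0] |= 1 << lane                   # enable this byte lane
        entry[1] |= byte_value << (8 * lane)    # place the byte in its lane
    return [(base, be, data) for base, (be, data) in sorted(words.items())]


# bfill8 0x1 4 0xfe on a 16-bit (2-byte) OCP
for addr, be, data in expand_byte_fill(0x1, 4, 0xFE, ocp_bytes=2):
    print(f"{addr:#x} ({be}) {data:#06x}")
```

The three tuples correspond to the three write commands listed above: a partial first word, a full middle word, and a partial last word.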
Idle Macro
The Idle command can be issued multiple times using the idle macro.
idle count
The parameter count specifies the number of idle commands to issue. If count is not specified, it defaults to one. Each Idle command takes a single OCP cycle to complete.
Behavioral Statements
Behavioral statements allow you to set and observe sideband signals and control the execution of the STL trace.
Signal
The signal statement sets OCP sideband signals (MFlag) to specified values. The syntax is:
signal flags[=value[(mask)]]|[controlwr=value] [statusrd=value] [controlbusy=value][statusbusy=value]
The optional value parameter specifies the pattern to set on MFlag. If no value is given, all MFlag bits are set to 1. The mask parameter specifies which MFlag bits to modify. It defaults to affect all flags. For example:
signal flags=0x2
sets MFlag[1] and clears all other MFlag bits. The statement:
signal flags=0x2(0x2)
Wait
The wait statement waits until some conditions are true. There are two types of wait statements. The wait or statement waits until at least one of the conditions is true, while the wait and statement waits until all of the conditions are true. If neither and nor or is specified, the default is wait and. The syntax is:
wait [and|or] [response=value] [flags=value[(mask)]] [interrupt=value] [threadbusy=value[(mask)]] [controlwr=value] [statusrd=value] [controlbusy=value] [statusbusy=value]
A response condition waits for the response to an earlier statement. For example:
wait response=-3
waits for the response to the statement issued three statements before. The flags condition waits for the OCP SFlag field to contain the specified value. A mask is optionally applied to look only at some specified bits (as described for the signal statement). For example:
wait flags=0x0310
waits for SFlag bits 9, 8, and 4 to be set. While other conditions are similar to the flags condition, each looks at a different signal field in the OCP interface as shown in Table 36.
Table 36 Behavioral Statement Conditions
Condition    OCP signal
flags        SFlag
interrupt    SInterrupt
threadbusy   SThreadBusy
controlwr    ControlWr
statusrd     StatusRd
controlbusy  ControlBusy
statusbusy   StatusBusy
Restrictions
The controlwr and statusrd conditions can only be used for a core (for instance for QCMaster) while the controlbusy and statusbusy conditions can only be used for a system (for instance, for QSMaster).
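How the and/or forms of the wait statement combine their conditions can be sketched as below. The names are hypothetical, and each condition is reduced to a boolean that has already been evaluated against the current signal values.

```python
from typing import Iterable, Optional


def flags_match(sflag: int, value: int, mask: Optional[int] = None) -> bool:
    """Evaluate a flags condition, optionally restricted to masked bits."""
    if mask is None:
        return sflag == value
    return (sflag & mask) == (value & mask)


def wait_satisfied(condition_results: Iterable[bool], mode: str = "and") -> bool:
    """wait and needs every condition; wait or needs at least one."""
    results = list(condition_results)
    return all(results) if mode == "and" else any(results)


conditions = [flags_match(0x0310, 0x0310), False]   # flags met, interrupt not
print(wait_satisfied(conditions, "and"), wait_satisfied(conditions, "or"))
```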
Control/Status Statements
This set of statements is used to set and observe the OCP Control and Status signals. The following statements are for cores (for instance, QCMaster):
readcontrol [data]
writestatus data
The readcontrol statement reads the values on the OCP control field. The optional data parameter specifies expected data. The writestatus statement sends the specified data to the OCP status field. The following statements are for a system (for instance, QSMaster):
writecontrol data
readstatus [data]
The writecontrol statement writes the specified data to the OCP control field. The readstatus statement reads the OCP status field. The optional data parameter specifies expected data.
10
Syntax
The syntax for the core RTL configuration file is:
<core_stmt>: core_language verilog|vhdl
           | icon <file_name>
           | core_id <vendor_code> <core_code> <revision_code> [<comment>]
           | interface <interface_name> bundle <bundle_name> [{<interface_body>*}]
           | addr_region <name> {<addr_region_body>*}
Components
This section describes the core RTL configuration file components.
Version Statement
The version statement identifies the version of the core RTL configuration file format. The version string consists of major and minor version numbers separated by a period. The current version of the file is 2.1.
Core_language Statement
This construct specifies the core description language and uses the syntax:
core_language verilog|vhdl
Icon Statement
This statement specifies the icon to display on a core. The syntax is:
icon <file_name>
file_name is the name of the graphic file, without any directory names. Store the file in the design directory of the core. For example:
icon "myCore.ppm"
The supported graphic formats are GIF, PPM, and PGM. Graphics should be no larger than 80x80 pixels. Since the text used for the core is white, use a dark background for your icon, otherwise it will be difficult to read.
Core_id Statement
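The core_id statement provides identifying information to the tools about the core. This information is required. Syntax of the core_id statement is:
core_id <vendor_code> <core_code> <revision_code> [<comment>]
The fields are vendor_code, core_code, revision_code, and an optional comment.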
0x200 - 0x2FF: Bridges
  Sum values from the following choices plus offset 0x200:
  Domain:
    0x00 - 0x7F: Computing
      0x00 - 0x3F: PCs
        0x00: ISA (inc. EISA)
        0x01 - 0x0F: Reserved
        0x10: PCI (33MHz/32b)
        0x11: PCI (66MHz/32b)
        0x12: PCI (33MHz/64b)
        0x13: PCI (66MHz/64b)
        0x14 - 0x1F: AGP, etc.
      0x40 - 0x7F: Reserved
    0x80 - 0xBF: Telecom
      0xA0 - 0xAF: ATM
        0xA0: Utopia Level 1
        0xA1: Utopia Level 2
        ...
    0xC0 - 0xFF: Datacom
0x300 - 0x3FF: Reserved
0x400 - 0x5FF: Other processors (enumerate types: MPEG audio, MPEG video, 2D Graphics, 3D Graphics, packet, cell, QAM, Viterbi, Huffman, QPSK, etc.)
0x600 - 0x7FF: I/O (enumerate types: Serial UART, Parallel, keyboard, mouse, gameport, USB, 1394, Ethernet 10/100/1000, ATM PHY, NTSC, audio in/out, A/D, D/A, I2C, PCI, AGP, ISA, etc.)
0x800 - 0xFFF: Vendor-defined (explicitly left up to vendor)
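The ranges listed above lend themselves to a simple lookup. The sketch below covers only the ranges that appear in this excerpt; the function names are hypothetical, and the bridge calculation assumes that a single Domain value is added to the 0x200 offset.

```python
CORE_CODE_RANGES = [
    (0x200, 0x2FF, "Bridges"),
    (0x300, 0x3FF, "Reserved"),
    (0x400, 0x5FF, "Other processors"),
    (0x600, 0x7FF, "I/O"),
    (0x800, 0xFFF, "Vendor-defined"),
]


def classify_core_code(code: int) -> str:
    """Return the category of a core code, for the ranges listed here."""
    for low, high, name in CORE_CODE_RANGES:
        if low <= code <= high:
            return name
    return "not listed in this excerpt"


def bridge_core_code(domain_value: int) -> int:
    """Assumed reading of the rule: bridge code = 0x200 + Domain value."""
    return 0x200 + domain_value


pci_bridge = bridge_core_code(0x10)   # PCI (33MHz/32b)
print(hex(pci_bridge), classify_core_code(pci_bridge))
```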
Interface Statement
The interface statement defines and names the interfaces of a core. The interface name is required so that cores with multiple interfaces can specify to which interface a particular connection should be made. Syntax for the interface statement is:
interface <interface_name> bundle <bundle_name> [{<interface_body>*}]
<interface_body>: interface_type <type_name>
                | port <port_name> net <net_name>
                | prefix <name>
                | param <name> <value>
                | location (n|e|w|s) <number>
For example:
interface "xyz" bundle ocp
Interface_type Statement
The interface_type statement defines characteristics of the bundle. Typically, the different types specify whether the core drives or receives a particular signal within the bundle. Syntax for the interface_type statement is:
interface_type <type_name>
The type_name must be a type defined in the bundle definition. If the bundle is OCP, the allowed types are: master, system_master, system_master_clkout, slave, system_slave, system_slave_clkout, and monitor. To define a type, specify it in the interface configuration file (described on page 115).
Port Statement
Use the port statement to rename the ports corresponding to nets within the bundle. Syntax for the port statement is:
port <port_name> net <net_name>
Prefix Command
The prefix command lets you prepend a string to signal names defined in the bundle to generate module port names that appear on the chip interface. Syntax for the prefix command is:
prefix <name>
For example, the statement prefix external adds external to the front of all port names of that interface, taking the body of the name from the bundle or from a port statement.
param interrupt 1
Location Statement
The location statement provides a way for the core to indicate where to place this interface when drawing the core. The location is specified by indicating a compass direction of north (n), south (s), east (e), or west (w) and a number. The number indicates a percentage from the top or left edge of the block. Syntax for the location statement is:
location (n|e|w|s) <number>
For example:
location s 50
# define the module
version 2.1
module flashctrl {
  core_id 0xBBBB 0x001 0x1 Flash/Rom Controller
  # This core is written in VHDL
  core_language vhdl
  # Use the Vista icon
  icon vista.ppm
  addr_region FLASHCTRL0 {
    addr_base 0x0
    addr_size 0x100000
  }
  # one of the interfaces is an OCP slave using the pre-defined OCP bundle
  interface tp bundle ocp {
    # this is a slave type ocp interface
    interface_type slave
    # this OCP is a basic interface with byteen support plus the SError
    # and Reset_n signals
    param reset 1
    param byteen 1
    param force_aligned 0
    param burst 0
    param datahandshake 0
    param threadid 0
    param respaccept 0
    param connid 0
    param interrupt 0
    param mthreadbusy 0
    param sthreadbusy 0
    param mflag 0
    param sflag 0
    param serror 1
    param control 0
    param controlwr 0
    param controlbusy 0
    param status 0
    param statusrd 0
    prefix tp
    # since the signal names do not exactly match the signal
    # names within the bundle, they must be explicitly linked
    port Reset_ni net Reset_n
    port Clk_i net Clk
    port TMCmd_i net MCmd
    port TMAddr_i net MAddr
    port TMBurst_i net MBurst
    port TMByteEn_i net MByteEn
    port TMData_i net MData
    port TCCmdAccept_o net SCmdAccept
    port TCResp_o net SResp
    port TCData_o net SData
    port TCError_o net SError
    # stick this interface in the middle of the top of the module
    location n 50
  } # close interface tp definition
  # The other interface is to the flash device defined in an interface file
  # Define the interface for the Flash control
  interface emem bundle flash {
    # the type here indicates direction and drive of the control signals
    interface_type controller
    # since this module has direction indication on some of the signals
    # (_o,_b) and is missing assertion level indicators _n on some of the
    # signals, the names must again be directly linked to the signal
    # names within the bundle
    port Addr_o net addr
    port Data_b net data
    port OE net oe_n
    port WE net we_n
    port RP net rp_n
    port WP net wp_n
    # all of the signals on this port have the prefix emem_
    prefix "emem_"
    # stick this interface in the middle of the bottom of the module
    location s 50
  }
}
bundle flash {
  #types of flash interfaces
  #controller: flash controller; flash: flash device itself.
  interface_types controller flash
  net addr {
    #Address to the Flash device
    direction output input
    width 19
  }
  net data {
    #Read or Write Data
    direction inout inout
    width 16
  }
  net oe_n {
    # Output Enable, active low.
    direction output input
  }
  net we_n {
    # Write Enable, active low.
    direction output input
  }
  net rp_n {
    # Reset, active low.
    direction output input
  }
  net wp_n {
    # Write protect bit, Active low.
    direction output input
  }
}
11
version <version_string>
chip <chip_name> {
  <chip_stmt>*
}
<chip_stmt>:
  | connection <name> bundle <bundle_type> [{<connection_body>*}]
  | instance <inst_name> module <mod_name> [{<instance_body>*}]
  | interface <name> bundle <bundle_type> [{<interface_body>*}]
The version statement is generated by the tools to identify the version of the file format and should not be modified.
Connections
The components of a chip are joined together using connections. A connection statement instantiates a bundle, which is a group of signals. Bundles have a type that describes the bundle. For example, the OCP is a bundle consisting of the signals of the OCP. Bundles can have optional characteristics that refine the description. OCP is a bundle type that is pre-defined. Cores that have non-OCP interfaces need to define a bundle type that describes that interface. See page 101 for a description of how bundles are defined and for the options available to OCP bundles. A bundle requires a name so that it can be referenced from other parts of the file to make a connection. The syntax is:
Param Statement
A param statement configures a component. The syntax is:
Location Statement
Specifies the location of a connection, interface, and instance in the design. This is where these graphical objects are displayed in the design view. Specified as an (x, y) coordinate pair for instances and connections, or anchor direction (north, east, west, south) and offset for interfaces. It is recommended that you change these settings using the GUI and not the text editor. The syntax is:
Connection Example
The following example shows a connection containing burst and interrupt options:
Instances
The instance statement is used to instantiate a component and contains the following elements:
<instance_body>:
  | param <name> <value>
  | location <number> <number>
  | interface <name> connection <name> [{<interface_body>*}]
Param Statement
A param statement configures an entity. The syntax is:
Location Statement
Specifies the location of a connection, interface, and instance in the design. This is where these graphical objects are displayed in the design view. Specified as an (x, y) coordinate pair for instances and connections, or anchor direction (north, east, west, south) and offset for interfaces. It is recommended that you change these settings using the GUI and not the text editor. The syntax is:
Interface Statement
The interface statement within an instance body makes connections to a particular interface on a component. The interface being connected to is the interface name as defined by the module, typically in the core's rtl.conf file. The syntax of an interface statement within an instance is:
Chip Interfaces
An interface statement indicates that the signals in the bundle are external to the chip. The syntax is:
interface <interface_name> bundle <bundle_name> [{<interface_body>*}]
<interface_body>:
  | param <name> <param_value>
  | location (n|e|w|s| <number>) <number>
  | port <port_name> net <net_name>
  | interface_type <intf_type_name>
  | prefix <name>
Param Statement
A param statement configures the interface. The syntax is:
Location Statement
Specifies the location of a connection, interface, and instance in the design. This is where these graphical objects are displayed in the design view. Specified as an (x, y) coordinate pair for instances and connections, or anchor direction (north, east, west, south) and offset for interfaces. It is recommended that you change these settings using the GUI and not the text editor.
Port Statement
The port statement allows you to rename the nets within a bundle. All nets within the bundle should be enumerated. If any are left out, it is assumed that the <port_name> and the <net_name> are the same. The syntax of a port statement is:
Interface_type Statement
Use the interface_type statement to define the type of interface that is instantiated. The syntax of an interface_type statement is:
interface_type <intf_type_name>
For example:
interface_type master
Prefix Statement
The prefix statement allows you to prepend a string to signal names that appear on the chip interface. The syntax is:
prefix <name>
For example:
prefix "external"
Syntax Specification
The syntax for an entire chip RTL configuration file is shown below. A # symbol marks the text as a comment until the next new line.
version <value>
chip <chip_name> {
  <chip_stmt>*
}
<chip_stmt>:
  | connection <connection_name> bundle <bundle_name> [{<connection_body>*}]
  | instance <inst_name> module <module_name> [{<inst_body>*}]
  | interface <interface_name> bundle <bundle_name> [{<interface_body>*}]
<connection_body>: <param> | <location>
<inst_body>:
  | param <param_name> <param_value>
  | location <number> <number>
  | interface <name> connection <name> [{<interface_body>*}]
<interface_body>:
  | param <name> <value>
  | location n|e|w|s| <number> <number>
  | port <port_name> net <net_name>
  | interface_type <name>
  | prefix <name>
version 2.1
chip "testbench" {
    interface "mii_" bundle "testbench_mii" {
        interface_type "controller"
        location 310 210
    }
    interface "ip_" bundle "ocp" {
        interface_type "master"
        location 280 380
        param byteen 1
        param burst 1
        param data_wdth 32
    }
    interface "tp_" bundle "ocp" {
        interface_type "slave"
        location 335 380
        param byteen 1
        param burst 1
        param sflag 1
        param addr_wdth 16
        param data_wdth 32
    }
    instance "testbench0" module testbench {
        location 280 255
        interface "ip" connection "ip_" { location s 13 }
        interface "tp" connection "tp_" { location s 81 }
        interface "mii" connection "mii_" { location n 50 }
    }
    instance "testbench_ip" module ocpmon {
        param watch_enable 0
        param wgl_slave 0
        location 285 353
        interface "ip" connection "ip_" { }
    }
    instance "testbench_tp" module ocpmon {
        param watch_enable 0
        location 340 353
        interface "ip" connection "tp_" { }
    }
}
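As a rough illustration of the comment rule, the following Python sketch splits configuration text into tokens and discards everything from a # symbol to the end of the line. It is not part of the toolset. The function name tokenize and the use of Python's shlex module are assumptions made for this example, and the sketch presumes that a # inside a quoted string is not treated as a comment.

```python
import shlex


def tokenize(text):
    """Split configuration text into tokens, dropping '#' comments.

    Hypothetical helper: quoted strings are returned without their quotes,
    and braces are only separated when surrounded by white space.
    """
    lexer = shlex.shlex(text, posix=True)
    lexer.commenters = "#"          # a '#' starts a comment that runs to end of line
    lexer.whitespace_split = True   # split on white space only
    return list(lexer)


if __name__ == "__main__":
    sample = (
        'version 2.1 # file format version\n'
        'chip "testbench" {\n'
        '  interface "ip_" bundle "ocp" { # master side\n'
        '    param data_wdth 32\n'
        '  }\n'
        '}\n'
    )
    print(tokenize(sample))
```

A real parser would go on to match each opening brace with its closing brace and check the statements against the syntax above. The sketch stops at tokens.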
12
Name the file <bundle-name>_intfc.conf where bundle-name is the name given to the bundle that is being defined in the file. The syntax of the interface configuration file is:
<bundle_stmt>:
    interface_types <interface_type-name>+
    | net <net_name> {<net_stmt>*}

<net_stmt>:
    direction (input|output|inout)+
    | width <number-of-bits>
    | vhdl_type <type-string>
version
The version statement identifies the version of the interface configuration file format. The version string consists of major and minor version numbers separated by a decimal point, for example 2.1.

bundle
This statement is required and indicates that a bundle is being defined instead of a core or a chip. Make the bundle-name the same as the name used in the file name.

interface_types
The interface_types statement lists the legal values for the interface types listed in Table 37. Interface types are used by the toolset in conjunction with the direction statement to determine whether an interface uses a net as an input or output signal. This statement is required and must have at least one type defined.
Table 37 Interface Types
Type                  Connect To                                     Cannot Connect To
Master                System_slave, System_slave_clkout, Slave*      Master, System_master, System_master_clkout
Slave                 System_master, System_master_clkout, Master*   Slave, System_slave, System_slave_clkout
System_master         Slave, System_slave*, System_slave_clkout*     Master, System_master, System_master_clkout
System_master_clkout  Slave, System_slave*                           Master, System_master, Clkout type
System_slave          Master, System_master*, System_master_clkout*  Slave, System_slave, System_slave_clkout
System_slave_clkout   Master, System_master*                         Slave, System_slave, Clkout type
Monitor               Any except monitor                             Monitor

* Clk and Reset_n signals cannot be enabled or conflicting control, status or test signals.
net
The net statement defines the signals that comprise the bundle. There should be one net statement for each signal that is part of the bundle. A net can also represent a bus of signals. In this case the net width is specified using the width statement. If no width statement is provided, the net width defaults to one. A bundle is required to contain at least one net. The net-name field is the same as the one used in the net-name field of the port statements in the core rtl and chip rtl configuration files.

direction
The direction statement indicates whether the net is an input, output, or inout. This field is required and must have as many direction-values as there are interface types. The order of the values must duplicate the order of the interface types in the interface_types statement. The legal values are input, output, and inout.

vhdl_type
By default VHDL signals and ports are assumed to be std_logic and std_logic_vector, but if you have ports on a core that are of a different type, the vhdl_type command can be used on a net. This type will be used only when soccomp is run with the design_top=vhdl option to produce a VHDL top-level netlist.

The following example defines an SRAM interface. The bundle being defined is called sram16.
bundle "sram16" {
    # Two interface types are defined, one is labeled
    # "controller" and the other is labeled "memory"
    interface_types controller memory

    # A net named Address is defined to be part of this bundle.
    net "Address" {
        # The direction of the "Address" net is defined to be
        # "output" for interfaces of type "controller" and "input"
        # for interfaces of type "memory".
        direction output input
        # The width statement indicates that there are 14 bits in
        # the "Address" net.
        width 14
    }
    net "WData" {
        direction output input
        width 16
    }
    net "RData" {
        # The direction of the "RData" net is defined to be
        # "input" for interfaces of type "controller" and "output" for
        # interfaces of type "memory".
        direction input output
        width 16
    }
    net "We_n" {
        direction output input
    }
}
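The positional pairing between interface_types and direction can be sketched in Python as follows. This is only an illustration of the rule described above, not the toolset's implementation. The dictionary layout and the helper names direction_for and net_width are assumptions made for this example.

```python
# The sram16 bundle from the example, held in a plain dictionary.
SRAM16 = {
    "interface_types": ["controller", "memory"],
    "nets": {
        "Address": {"direction": ["output", "input"], "width": 14},
        "WData": {"direction": ["output", "input"], "width": 16},
        "RData": {"direction": ["input", "output"], "width": 16},
        "We_n": {"direction": ["output", "input"]},  # no width statement
    },
}


def direction_for(bundle, net_name, interface_type):
    """Return the direction of a net as seen by one interface type."""
    types = bundle["interface_types"]
    directions = bundle["nets"][net_name]["direction"]
    # One direction value is required per interface type, in the same order.
    if len(directions) != len(types):
        raise ValueError(f"net {net_name}: expected {len(types)} direction values")
    return directions[types.index(interface_type)]


def net_width(bundle, net_name):
    """Return the net width; it defaults to one when no width is given."""
    return bundle["nets"][net_name].get("width", 1)


if __name__ == "__main__":
    print(direction_for(SRAM16, "RData", "memory"))
    print(net_width(SRAM16, "We_n"))
```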
13
Version
The version of the synthesis configuration file is required. Specify the version with the version command, for example: version 1.1.
Technology Section
The synthesis configuration files use values called technology variables. The values of these variables must be supplied so that the underlying configuration files can be evaluated. To help you determine appropriate settings for these variables, a program has been developed, the Technology Compiler (techcomp). The Technology Compiler extracts as much of the information as possible from a cell library. Details on running the Technology Compiler are provided in the Man Pages or the CoreCreator Guide.

The following descriptions provide details on the technology variables. (TC) identifies values that can be derived using the Technology Compiler from a specific library.

capacitancescale
Default units of capacitance in the technology files are assumed to be in picofarads. This variable allows constraints that need to be specified in library units access to the correct multiplier for conversion in the synthesis library being used. Most libraries are specified in picofarads so this value is typically 1.0. (TC) For example:

param capacitancescale .001

A capacitancescale of .001 would mean that the library units are in femtofarads.

defaultc2qtime
The clock-to-output time is the worst-case timing value from a register or input port to an output port. The default value is assigned to any output port that is not specifically constrained. It should not be used in a core constraint file as it does not show core capabilities. The Technology Compiler provides a default of 1/8 of a clock period. This particular value is not derived but is a recommended synthesis guideline. For example:
param defaultc2qtime {[expr $clockperiod * .125]}
param defaultc2qtime {[expr $timescale * 1000]}
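To make the arithmetic behind a clock-relative default concrete, the short Python sketch below evaluates the 1/8 guideline for a few clock periods. It only mirrors the expression $clockperiod * .125. The function name default_c2qtime and the sample periods are assumptions made for this illustration.

```python
def default_c2qtime(clockperiod_ns, fraction=0.125):
    """Return a clock-to-output default as a fraction of the clock period.

    Both the argument and the result are in nanoseconds.
    """
    return clockperiod_ns * fraction


if __name__ == "__main__":
    for period in (5.0, 10.0, 20.0):   # sample periods, chosen arbitrarily
        print(f"period {period} ns: defaultc2qtime {default_c2qtime(period)} ns")
```

With a 10 ns clock, for instance, the guideline yields a default clock-to-output time of 1.25 ns.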
Constant values are specified in nanoseconds. For consistency, use expressions that can be interpreted in nanoseconds.

defaultc2qtimemin
The minimum clock-to-output time is the best-case timing value from a register or input port to the output port. The default value is assigned to any output port that is not specifically constrained. It should not be used in a core constraint file as it does not show core capabilities. The Technology Compiler provides a default of 0, the most conservative value. This particular value is not derived but is a recommended synthesis guideline. For example:
param defaultc2qtimemin {[expr $clockperiod * .01]}
param defaultc2qtimemin {[expr $timescale * 100]}
Constant values are specified in nanoseconds. For consistency, use expressions that can be interpreted in nanoseconds.

defaultcriticalrange
The critical range value specifies the paths to be worked on by the synthesis tool during optimization. Typically, the tool works on the most critical path, optimizing it to meet timing requirements, and then proceeds to the next most critical path. When a critical range is specified, the tool works on all paths from the most critical to those within the specified value. (TC) A larger value makes synthesis take longer to run. A default value of 10% of the clock period is provided by the Technology Compiler. This value is not derived but is a recommended synthesis guideline described in Synopsys documentation. For example:
param defaultcriticalrange {[expr $clockperiod * .1]}
param defaultcriticalrange {[expr $timescale * 1000]}
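The effect of a critical range can be pictured with a simplified Python model: given the delay of each path, select every path whose delay lies within the critical range of the worst one. This is a sketch of the idea only and does not describe how any synthesis tool is implemented. The function name, the path names, and the delay values are hypothetical.

```python
def paths_in_critical_range(path_delays_ns, critical_range_ns):
    """Return the names of paths within critical_range_ns of the worst delay."""
    worst = max(path_delays_ns.values())
    return sorted(
        name
        for name, delay in path_delays_ns.items()
        if worst - delay <= critical_range_ns
    )


if __name__ == "__main__":
    delays = {"path_a": 10.0, "path_b": 9.5, "path_c": 8.0}   # hypothetical delays
    # A range of 10% of a 10 ns clock period is 1.0 ns.
    print(paths_in_critical_range(delays, 1.0))
    # A range of 0 selects only the most critical path.
    print(paths_in_critical_range(delays, 0.0))
```

A larger range selects more paths, which is consistent with the longer synthesis run time noted above.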
Constant values are specified in nanoseconds. For consistency, use expressions that can be interpreted in nanoseconds.

defaultfalldelaymin
defaultfalldelaymax
Default values to use for the best/worst case clock fall time. This is the time it takes from the source of the clock to reach the shortest/furthest endpoint of the clock tree (falling edge on the endpoint). These values depend on the clock tree layout methodology being used and cannot be derived or defaulted by the Technology Compiler. The values are set by those responsible for clock tree layout. For example:
defaultfanoutload
This variable specifies the value of one fanout in the synthesis library fanout units. It is used to limit the fanout of input ports of a core or agent. This is done using the maxfanout keyword for the port and specifying it in terms of the number of defaultfanoutloads. (TC) For example:
param defaultfanoutload 1
defaultholdtime
The default hold time is the hold time value from an input port to a register or an output port. The default is assigned to any input port that is not specifically constrained. The Technology Compiler provides a default of 0, the most conservative value. This value is not derived but is a recommended synthesis guideline. For example:
param defaultholdtime {[expr $clockperiod * .01]}
param defaultholdtime {[expr $timescale * 100]}
Constant values are specified in nanoseconds. For consistency, use expressions that can be interpreted in nanoseconds.

defaultloadcellpin
Name of the load library/cell/pin that represents a typical gate that would be on the fanout for a net. This value is specified to the synthesis tool as the gate to use in load calculations for output ports of a module. (TC) Values must be a string that specifies the logical name of the synthesis library, the cell from the library, and the pin from which the load calculation is derived. The pin is optional. For example:
param defaultloads 4
defaultminusuncertainty
Default value to use for the worst case clock skew between two endpoints of a clock network when checking setup time. This value is dependent on the clock tree layout methodology being used and cannot be derived or defaulted by the Technology Compiler. The value is set by those responsible for clock tree layout. For example:
param defaultsetuptime {[expr $clockperiod * .375]}
param defaultsetuptime {[expr $timescale * 3000]}
Constant values are specified in nanoseconds. For consistency, use expressions that can be interpreted in nanoseconds.

gatedelay
gatedelaymin
The maximum and minimum delay of a typically loaded gate in nanoseconds. Some cores may have setup/c2q times specified as a number of gate delays instead of as a percentage of the clock period, although this is not the preferred unit. (TC) For example:
gatesize Specifies the size of a standard two input NAND gate (normally accepted as a size of one gate) in the area units of the technology library. This value can be used for conversion between the number of gates and area. (TC) For example:
Values must be a string that specifies the logical name of the synthesis library, the cell from the library, and the output pin that will be driving an input to the core. The pin is optional. (TC) For example:
If constraints are specified using these values they will show up as additional loads and resistances on output ports of a module. (TC) For example:
param longnetrcresistance {[expr $resistancescale * .09]}
param longnetrccapacitance {[expr $capacitancescale * .12]}
Constant values are specified in picofarads or kOhms. For consistency, use expressions that can be interpreted in picofarads or kOhms.

lowdrivegatepin
See highdrivegatepin.
maxoperatingconditions minoperatingconditions Operating conditions specify the appropriate scaling or tables to use when calculating timing values during the synthesis process. The maxoperatingconditions variable should match the string used in the synthesis library for the worst case timing condition. The minoperatingconditions variable should match the string used in the synthesis library for the best case timing condition. If neither variable is specified, synthesis will be run with the library's default operating condition. (TC) For example
resistancescale Default units of resistance in the technology files are assumed to be in kOhms. This variable allows constraints that need to be specified in library units access to the correct multiplier for conversion in whatever synthesis library is being used. Most libraries are specified in kOhms so this value is typically 1.0. (TC) For example:
technologyvalue Specifies which timing values should be selected for the agent timing. The current choices are 15u, 18u, or 25u which correlate to .15, .18, and .25 micron respectively. These strings are used to form the name of the timing file that will be loaded when running socmap, SOCCreator, or CoreCreator. The default value is 18u. This value is not derived by the technology compiler but is a built-in default. For example:
enclosed Calculates the wire capacitance of each net using the wire load model set on the smallest subdesign that completely encloses that net. To use the set_wire_load_selection_group and set_wire_load_min_block_size commands, you must specify this mode. Enclosed is the default. top Calculates the wire capacitance of all nets using the wire load model set on the top-level design. segmented Calculates the wire capacitance for each segment of a net that crosses hierarchical subdesigns. Calculations are based on the wire load model set on the subdesign containing that segment. The total wire capacitance of the net is equal to the sum of the wire capacitances of the segments. For example:
Chip Section
The chip section of the file allows you to define clock variables and specify which technology is being used.
Defining Clocks
You must specify details about the clocks that are being brought into each design. This information includes the clock period, rise/fall time, delay or latency, and clock skew. Clock periods can only be defined in the chip synthesis configuration file.
Clock Variables
The variables that can be used to specify information about a clock are: falldelaymin falldelaymax Values to use for the best/worst case clock fall time. This is the time it takes from the source of the clock to reach the shortest/furthest endpoint of the clock tree (falling edge on the endpoint). The technology defaults of defaultfalldelaymin/defaultfalldelaymax are used. For recommended values, check with the designer of the clock tree layout. For example:
minusuncertainty The value to use for the worst case clock skew between two endpoints of a clock network when checking setup time. The technology default of defaultminusuncertainty is used if this value is not specified. For recommended values, check with the designer of the clock tree layout. For example:
or expressions specified for this variable define when the clock first rises and first falls during the clock period. More complex waveform specifications are currently not supported. If required, the resulting constraints file for the block being synthesized must be edited to supply the correct waveform.
Instance Section
The ports of the IP blocks all have default constraints defined in the core_syn configuration file. To override any of these on an instance by instance basis, use the instance section. You can also specify overrides for the max delay and false paths constraints.
drivingcellpin This variable describes which cell in the synthesis library is expected to be driving the input. To maintain portability set this variable to one of the technology values of high/medium/lowdrivegatepin. Values are a string that specifies the logical name of the synthesis library, the cell from the library, and the pin that will be driving an input for the core. The pin is optional. For example:
maxfanout
This keyword limits the fanout of an input port to a specific number of fanouts. To maintain portability, express the setting of this variable in terms of the technology variable defaultfanoutload. Constant values are specified in library units. For example:
Specify constant values in expressions that result in kOhms for resistance and picofarads (pf) for capacitance. For example:
wireloadresistance {[expr $resistancescale * .09]}
wireloadcapacitance {[expr $capacitancescale * .12]}

wireloadresistance {$mediumnetrcresistance}
wireloadcapacitance {$mediumnetrccapacitance}
Syntax
Parameter values are specified using Tcl syntax. Observe the following syntax conventions:

Enclose all expr statements within braces, { }, to differentiate between expressions that are to be evaluated while the file is being parsed and those that are to be evaluated during synthesis constraint file generation.

Although not required by Tcl, enclose strings within quotation marks, " ", to show that they are different from keywords.

Specify keywords using lower case.
Expressions can use any of the technology or environment variables, and any of the following: clockperiod This variable should only be used in calculations of timing values for ports. When evaluating expressions that use $clockperiod, the program will determine which clock the port is relative to, determine its period (in nanoseconds), and apply that value to the equation. For example:
Each entry within the technology section is specified by the keyword param followed by the name of the variable and its setting. The following example shows a sample technology section of the chip synthesis configuration file:
technology "foo" {
    param highdrivegatepin "foo/invtr_10x/x"
    param mediumdrivegatepin "foo/invtr_2x/x"
    param lowdrivegatepin "foo/invtr/x"
    param defaultloadcellpin "foo/invtr/i0"
    param defaultloads 4
    param longnetdelay 5
    param mediumnetdelay 3
    param shortnetdelay 1
    param longnetrcresistance 1
    param longnetrccapacitance 2
    param mediumnetrcresistance 0.5
    param mediumnetrccapacitance 1
    param shortnetrcresistance 0.1
    param shortnetrccapacitance 0.5
    param defaultsetuptime {[expr $clockperiod * .375]}
    param defaultholdtime 0
    param defaultc2qtime {[expr $clockperiod * .125]}
    param defaultc2qtimemin 0
    param timescale 1
    param capacitancescale 1
    param resistancescale 1
    param defaultminusuncertainty 0.5
    param defaultplusuncertainty 0.2
    param defaultrisetime 0.0
    param defaultfalltime {[expr $clockperiod / 2.0]}
    param defaultrisedelaymin 0.2
    param defaultrisedelaymax 1.0
    param defaultfalldelaymin 0.2
    param defaultfalldelaymax 1.0
    param defaultcriticalrange 1.0
    param gatedelay 0.33
    param gatedelaymin 0.29
    timingtabledefaultfactor 1.2
    timingtableglobalfactor 1.1
}
chip <chipName> {
    clock <clockName> {
        period <Value>
        risetime <Value>
        falltime <Value>
        plusuncertainty <Value>
        minusuncertainty <Value>
        risedelaymin <Value>
        risedelaymax <Value>
        falldelaymin <Value>
        falldelaymax <Value>
    }
    usetechnology <technology name>
}
The clockName is the name of the top level port or signal associated with the clock. The technology name is the name of the technology section to be used with this chip. At the start of the file create a description of all of the clocks in the design. For example:
chip myChip {
    clock sbClk {
        period 10.0
        risetime {$defaultrisetime}
        falltime {$defaultfalltime}
        minusuncertainty 0.5
        plusuncertainty 0.3
        risedelaymax 1.2
        risedelaymin 0.4
        falldelaymax {$defaultfalldelaymax}
        falldelaymin {$defaultfalldelaymin}
    }
}
instance <instanceName> {
    port <portName> {
        drivingcellpin <drivingCellName>
        setuptime <Value>
        holdtime <Value>
        maxfanout <Value>
    }
    port <portName> {
        loadcellpin <loadCellPinName>
        loads <Value>
        wireloadresistance <Value>
        wireloadcapacitance <Value>
        c2qtime <Value>
        c2qtimemin <Value>
    }
    maxdelay {
        delay <delayValue> fromport <portName> toport <portName>
    }
    falsepath {
        fromport myInPort1 toport myOutPort1
    }
}
For example:
version 1.1
technology "foo" {
    param highdrivegatepin "foo/invtr_10x/x"
    param mediumdrivegatepin "foo/invtr_2x/x"
    param lowdrivegatepin "foo/invtr/x"
    param defaultloadcellpin "foo/invtr/i0"
    param defaultloads 4
    param longnetdelay 5
    param mediumnetdelay 3
    param shortnetdelay 1
    param longnetrcresistance 1
    param longnetrccapacitance 2
    param mediumnetrcresistance 0.5
    param mediumnetrccapacitance 1
    param shortnetrcresistance 0.1
    param shortnetrccapacitance 0.5
    param defaultsetuptime {[expr $clockperiod * .375]}
    param defaultholdtime 0
    param defaultc2qtime {[expr $clockperiod * .125]}
    param defaultc2qtimemin 0
    param timescale 1
    param capacitancescale 1
    param lengthscale 1
    param resistancescale 1
    param defaultminusuncertainty 0.5
    param defaultplusuncertainty 0.2
    param defaultrisetime 0.0
    param defaultfalltime {[expr $clockperiod / 2.0]}
    param defaultrisedelaymin 0.2
    param defaultrisedelaymax 1.0
    param defaultfalldelaymin 0.2
    param defaultfalldelaymax 1.0
    param defaultcriticalrange 1.0
    param gatedelay 0.33
    param gatedelaymin 0.29
    timingtabledefaultfactor 1.2
    timingtableglobalfactor 1.1
}
chip "chip" {
    clock "sbClk" {
        period 10
        plusuncertainty 0.3
        minusuncertainty 0.6
        risedelaymin 0.2
        risedelaymax 0.4
    }
    usetechnology foo
}
instance "read_regs" {
    port "ipReset_ni" {setuptime 2}
    port "ipMCmd_i" {setuptime 2}
    port "ipMAddr_i" {setuptime 2}
    port "ipMWidth_i" {setuptime 2}
    port "ipMData_i" {setuptime 2}
    port "ipSCmdAccept_o" {c2qtime 2}
    port "ipSResp_o" {c2qtime 2}
    port "ipSData_o" {c2qtime 2}
    maxdelay {
        delay 2 fromport MData_i toport SResp_o
    }
    falsepath {
        fromport MData_i toport SData_o
    }
}
14
Syntax Conventions
Observe the following syntax conventions:

Enclose all expr statements within braces, { }, to differentiate between expressions that are to be evaluated while the file is being parsed and those that are to be evaluated during synthesis constraint file generation.

Although not required by Tcl, enclose strings within quotation marks, " ", to show that they are different from keywords.

Specify keywords using lower case.
Parameter values are specified using Tcl syntax. Expressions can use any of the technology or environment variables, and any of the following: clockperiod This variable should only be used in calculations of timing values for ports. When evaluating expressions that use $clockperiod, the program will determine which clock the port is relative to, determine its period (in nanoseconds), and apply that value to the equation. For example:
sbclockperiod This variable is set to the period of the main system clock typically referred to as the SB clock. It is typically used when a value needs to be some multiple of the sbclock, such as for non-OCP clocks. For example:
Version Section
The version of the core synthesis configuration file is required. Specify the version with the version command, for example: version 1.1
Clock Section
If you have non-OCP clocks for an IP block or want to specify the worstcasedelay of any clock used in the core, specify the names of the clocks in the core synthesis configuration file. Use the following syntax to specify the name of the clock and its worstcasedelay:
clock myClock
worstcasedelay The worst case delay value is the longest path through the core or instance for a particular clock. The value is used to check that the core can meet the timing requirements of the current design. To help make this value more portable, you may want to use the technology variable gatedelay. For example:
Area Section
The area is the size in gates of the core or instance. By specifying the size in gates the area can be calculated based on the size of a typical two input nand gate in a particular synthesis library. For example:
Constant values are specified in two-input NAND gate equivalents. For consistency, use expressions that can be interpreted in gates.
clockname myClock
drivingcellpin This variable describes which cell in the synthesis library is expected to be driving the input. To maintain portability set this variable to use one of the technology values of high/medium/lowdrivegatepin. Values are a string that specifies the logical name of the synthesis library, the cell from the library, and the pin that will be driving an input for the core. The pin is optional. For example:
maxfanout
This keyword limits the fanout of an input port to a specified number of fanouts. To maintain portability, set this variable in terms of the technology variable defaultfanoutload. Constant values are specified in library units. For example:
Specify constant values as expressions that result in kOhms for resistance and picofarads (pf) for capacitance. For example:
wireloadresistance {[expr $resistancescale * .09]}
wireloadcapacitance {[expr $capacitancescale * .12]}

wireloadresistance {$mediumnetrcresistance}
wireloadcapacitance {$mediumnetrccapacitance}
port <portName> {
    clockname <clockName>
    drivingcellpin <drivingCellName>
    setuptime <Value>
    holdtime <Value>
    maxfanout <Value>
}
Examples
In the following example, the clock is not specified since this is an OCP port and is known to be controlled by the OCP clock. If a clock were specified as something other than the OCP clock, an error would result.
port <portName> {
    clockname <clockName>
    loadcellpin <loadCellPinName>
    loads <Value>
    wireloadresistance <Value>
    wireloadcapacitance <Value>
    wireloaddelay <Value>
    c2qtime <Value>
    c2qtimemin <Value>
}
You cannot specify both wireloaddelay and wireloadresistance/capacitance for the same port.
Examples
In the following example, the clock is not specified since this is an OCP port and is known to be controlled by the OCP clock.
port SCmdaccept_o {
    loadcellpin {$defaultloadcellpin}
    loads {$defaultloads}
    wireloadresistance {$mediumnetrcresistance}
    wireloadcapacitance {$mediumnetrccapacitance}
    c2qtime {[expr $clockperiod * 0.2]}
}
In the following example, the clock to output time is required to be 2 ns. Time constants are assumed to be in nanoseconds. Use the timescale variable to convert library units to nanoseconds.
port SResp_o {
    loadcellpin {$defaultloadcellpin}
    loads {$defaultloads}
    wireloadresistance {$mediumnetrcresistance}
    wireloadcapacitance {$mediumnetrccapacitance}
    c2qtime 2
}
The following example shows how to associate a clock to an output port.
In the following example, a maxdelay of 3 ns is specified for the combinational path between myInPort1 and myOutPort1. A maxdelay of 50% of the clockperiod is specified for the path between myInPort2 and myOutPort2. The braces around the expression delay evaluation until the expression is used by the mapping program.
maxdelay {
    delay 3 fromport myInPort1 toport myOutPort1
    delay {[expr $clockperiod * .5]} fromport myInPort2 toport myOutPort2
}
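The deferred evaluation that the braces provide can be imitated in Python by storing an expression as a function that is called only once the clock period is known. The sketch below is an analogy for the behavior described above, not a description of the mapping program. The data layout and the helper name resolve_delay are assumptions made for this example.

```python
# Each entry holds either a constant delay in nanoseconds or a function of
# the clock period, standing in for a braced {[expr ...]} expression.
MAXDELAYS = [
    {"delay": 3.0, "fromport": "myInPort1", "toport": "myOutPort1"},
    {
        "delay": lambda clockperiod: clockperiod * 0.5,
        "fromport": "myInPort2",
        "toport": "myOutPort2",
    },
]


def resolve_delay(entry, clockperiod_ns):
    """Return the delay in nanoseconds, evaluating deferred expressions."""
    delay = entry["delay"]
    return delay(clockperiod_ns) if callable(delay) else delay


if __name__ == "__main__":
    for entry in MAXDELAYS:
        delay = resolve_delay(entry, 10.0)   # assumes a 10 ns clock period
        print(entry["fromport"], "->", entry["toport"], delay, "ns")
```

With a 10 ns clock, the second path resolves to 5 ns, while the constant 3 ns delay of the first path is unaffected by the clock period.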
version 1.1
port Reset_ni {
    drivingcellpin {$mediumdrivegatepin}
    setuptime {[expr $clockperiod * .5]}
}
port MCmd_i {
    drivingcellpin {$mediumdrivegatepin}
    setuptime {[expr $clockperiod * .9]}
}
port MAddr_i {
    drivingcellpin {$mediumdrivegatepin}
    setuptime {[expr $clockperiod * .5]}
}
port MWidth_i {
    drivingcellpin {$mediumdrivegatepin}
    setuptime {[expr $clockperiod * .5]}
}
port MData_i {
    drivingcellpin {$mediumdrivegatepin}
    setuptime {[expr $clockperiod * .5]}
}
port SCmdAccept_o {
    loadcellpin {$defaultloadcellpin}
    loads {$defaultloads}
    wireloaddelay {$mediumnetdelay}
    c2qtime {[expr $clockperiod * .9]}
}
port SResp_o {
    loadcellpin {$defaultloadcellpin}
    loads {$defaultloads}
    wireloaddelay {$mediumnetdelay}
    c2qtime {[expr $clockperiod * .8]}
}
port SData_o {
    loadcellpin {$defaultloadcellpin}
    loads {$defaultloads}
    wireloaddelay {$mediumnetdelay}
    c2qtime {[expr $clockperiod * .8]}
}
maxdelay {
    delay 2 fromport MData_i toport SResp_o
}
falsepath {
    fromport MData_i toport SData_o
}
15
Package File
The package file, written in Tcl format, contains information about the name of the core and the files that are to be packaged in an archive file. In the package file, specify the name of the core using the syntax:
model            ibis, spice, trace generator, etc.
report           Reports about the core
scanscript       Scan generation scripts
scanvector       Scan vectors
synscript        DC shell script to run synthesis
synshellscript   Shell script to invoke synthesis program
test             Test file
rtl              RTL source file
wgl              Wgl format test vector file
icon             Core icon file
set corename core
set files(billofmaterials) "core.pkg"
set files(chiprtlconf) "core_rtl.conf"
set files(chipsynconf) "core_syn.conf"
set files(corertlconf) "synek/synek_rtl.conf"
set files(coresynconf) "synek/synek_syn.conf"
set files(devicedriver) "software/synek.asm"
set files(doc) "myDocs/synekInstall.doc myDocs/synekDataSheet.doc"
set files(interfaceconf) "myInterfaces/myBundle_intfc.conf"
set files(method) "mytools/ocpanalyze mytools/ocpsimulate"
set files(model) "myModels/synek.ibis myModels/synek.tgen"
set files(report) "myReport/synek_syn.log myReport/synek_scantest.log"
set files(scanvector) "scanTest/synek.vec"
set files(scanscript) "scanTest/synek.scan"
set files(synscript) "synek/synek.scr"
set files(synshellscript) "synek/synek.sh"
set files(test) "mytest/master/config/test1.S mytest/slave/io/iorw1.PMDL"
set files(rtl) "synek/synek.v"
set files(wgl) "scanTest/synek.wgl"
set files(icon) "mips.gif"
16
Developers Guidelines
This chapter collects examples and implementation tips to help you make effective use of the Open Core Protocol and does not provide any additional specification material.
Signal Timing
The Open Core Protocol data transfer model allows many different types of existing legacy IP cores to be bridged to the OCP without adding expensive glue logic structures that include address or data storage. As such, it is possible to draw many state machine diagrams that are compliant with the OCP protocol. This section describes some common state machine models that can be used with the OCP, together with guidance on the use of those models. Two-way handshaking is the general principle of the dataflow signals in the OCP interface. A group of signals is asserted and must be held steady until the corresponding accept signal is asserted. This allows the receiver of a signal to force the sender to hold the signals steady until it has completely processed them. This principle produces implementations with fewer latches for temporary storage.
Request Phase
The request phase begins when the master drives MCmd to a value other than Idle. When MCmd != Idle, MCmd is referred to as asserted. All of the other request phase outputs of the master must become valid during the same clock cycle as MCmd asserted, and be held steady until the request phase ends. The request phase ends when SCmdAccept is sampled asserted (true) by the rising edge of Clk. The slave can assert SCmdAccept in the same cycle that MCmd is asserted, or stay negated for several Clk cycles. The latter choice allows the slave to force the master to hold its request phase outputs until the slave can accomplish its access without latching address or data signals.
The slave designer chooses the delay between MCmd asserted and SCmdAccept asserted based on the desired area, timing, and throughput characteristics of the slave. As the request phase does not begin until MCmd is asserted, SCmdAccept is a "don't care" while MCmd is not asserted, so SCmdAccept can be asserted before MCmd. This allows some area-sensitive, low-frequency slaves to tie SCmdAccept asserted, as long as they can always complete their transfer responsibilities in the same cycle that MCmd is asserted. Since an MCmd value of Idle specifies the absence of a valid command, the master can assert MCmd independently of the current setting of SCmdAccept.

The highest throughput that can be achieved with the OCP is one data transfer per Clk cycle. High-throughput slaves can approach this rate by providing sufficient internal resources to end most request phases in the same Clk cycle that they start. This implies a combinational path from the master's MCmd output into slave logic, then back out the slave's SCmdAccept output and back into a state machine in the master. If the master has additional requests to present, it can start a new request phase on the next Clk cycle. Achieving such high throughput in high-frequency systems requires careful design, including cycle time budgeting as described in "Level2 Timing" on page 174.
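The request phase rules can be exercised with a small cycle-by-cycle model. The Python sketch below takes the values of MCmd and SCmdAccept sampled at each rising edge of Clk and reports the cycle in which each request phase starts and ends. It is an illustrative model under simplifying assumptions, not a reference implementation: the function name request_phases is hypothetical, commands are plain strings, and only MCmd (not the other request phase signals) is checked for being held steady.

```python
IDLE = "Idle"


def request_phases(mcmd, scmdaccept):
    """Return (start_cycle, end_cycle) for each request phase.

    mcmd and scmdaccept hold one sampled value per rising edge of Clk.
    """
    phases = []
    start = None   # cycle in which the current request phase began
    held = None    # command the master must hold until it is accepted
    for cycle, (cmd, accept) in enumerate(zip(mcmd, scmdaccept)):
        if start is None:
            if cmd == IDLE:
                continue               # SCmdAccept is a don't care here
            start, held = cycle, cmd   # MCmd asserted: request phase begins
        elif cmd != held:
            raise ValueError(f"MCmd changed during a request phase at cycle {cycle}")
        if accept:                     # SCmdAccept sampled asserted: phase ends
            phases.append((start, cycle))
            start = held = None
    return phases


if __name__ == "__main__":
    mcmd = [IDLE, "Write", "Write", "Write", "Read", IDLE]
    accept = [True, False, False, True, True, False]
    print(request_phases(mcmd, accept))
```

In this trace the Write is held for three cycles because the slave delays SCmdAccept, while the Read is accepted in the cycle in which it is presented. A slave that ties SCmdAccept asserted ends every request phase in the cycle it starts.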
Response Phase
The response phase begins when the slave drives SResp to a value other than NULL. When SResp != NULL, SResp is referred to as asserted. All of the other response phase outputs of the slave must become valid during the same Clk cycle as SResp asserted, and be held steady until the response phase ends. The response phase ends when MRespAccept is sampled asserted (true) by the rising edge of Clk; if MRespAccept is not configured into a particular OCP, MRespAccept is assumed to be always asserted (that is, the response phase always ends in the same cycle it begins). If present, the master can assert MRespAccept in the same cycle that SResp is asserted, or it may stay negated for several Clk cycles. The latter choice allows the master to force the slave to hold its response phase outputs so the master can finish the transfer without latching the data signals. Since the response phase does not begin until SResp is asserted, MRespAccept is a don't care while SResp is not asserted, so MRespAccept can be asserted before SResp. Since an SResp value of NULL specifies the absence of a valid response, the slave can assert SResp independently of the current setting of MRespAccept.
In high-throughput systems, careful use of MRespAccept can result in significant area savings. To maintain high throughput, systems traditionally introduce pipelining, where later requests begin before earlier requests have finished. Pipelining is particularly important to optimize Read accesses to main memory. The OCP supports pipelining with its basic request/response protocol, since a master is free to start the second request phase as soon as the first has finished (before the first response phase, in many cases). However, without MRespAccept, the master must have sufficient storage resources to receive all of the data it has requested. This is not an issue for some masters, but can be expensive when the master is part of a bridge between subsystems such as computer buses. While the original system initiator may have enough storage, the intermediate bridge may not. If the slave has storage resources (or the ability to flow control data that it is requesting), then allowing the master to de-assert MRespAccept enables the system to operate at high throughput without duplicating worst-case storage requirements across the die.
Most simple or low-throughput slave IP cores need not implement MRespAccept. Misuse of MRespAccept makes the slave's job more difficult, because it adds extra conditions (and states) to the slave's logic.
Since the OCP is a point-to-point interface, Write commands can be considered complete as soon as the data has been accepted by the slave (at the end of the request or datahandshake phases, depending on the configuration of the OCP). As a result, only Read commands require a response phase on the OCP. A common term for this type of write behavior is write posting.
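The storage trade-off behind MRespAccept can be illustrated with a rough cycle-level model in Python. Everything here is an assumption made for the sketch: the function `deliver_responses`, the idea of a master-side buffer with `buffer_slots` entries, and a downstream consumer that drains one entry every `drain_every` cycles. The slave is assumed to present SResp=DVA on every cycle until all responses are delivered.

```python
def deliver_responses(num_responses, buffer_slots, drain_every):
    """Return (cycles_taken, max_buffer_occupancy) for a burst of read responses.

    The master asserts MRespAccept only while its buffer has a free slot,
    which forces the slave to hold its response phase outputs otherwise.
    """
    occupancy = 0
    delivered = 0
    max_occupancy = 0
    cycle = 0
    while delivered < num_responses:
        # Hypothetical downstream consumer frees one slot every drain_every cycles.
        if cycle % drain_every == 0 and occupancy > 0:
            occupancy -= 1
        m_resp_accept = occupancy < buffer_slots
        if m_resp_accept:
            occupancy += 1   # response phase ends in this cycle
            delivered += 1
        max_occupancy = max(max_occupancy, occupancy)
        cycle += 1
    return cycle, max_occupancy


small_buffer = deliver_responses(num_responses=4, buffer_slots=2, drain_every=2)
large_buffer = deliver_responses(num_responses=4, buffer_slots=4, drain_every=2)
```

With the smaller buffer the master stalls the slave for one cycle but never holds more than two entries; with the larger buffer no stall occurs, at the cost of more storage in the master.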
Datahandshake Phase
The datahandshake phase offers throughput advantages for high-performance subsystems (particularly memory). When the datahandshake phase is not present in a configured OCP, MData becomes a request phase signal. The datahandshake phase begins when the master asserts MDataValid. The other datahandshake phase outputs of the master must become valid during the same Clk cycle that MDataValid is asserted, and be held steady until the datahandshake phase ends. The datahandshake phase ends when SDataAccept is sampled asserted (true) by the rising edge of Clk. The slave can assert SDataAccept in the same cycle that MDataValid is asserted, or it can stay negated for several Clk cycles. The latter choice allows the slave to force the master to hold its datahandshake phase outputs so the slave can accomplish its access without latching data signals. The datahandshake phase does not begin until MDataValid is asserted. While MDataValid is not asserted, SDataAccept is a don't care, so SDataAccept can be asserted before MDataValid. Since MDataValid not being asserted specifies the absence of valid data, the master can assert MDataValid independently of the current setting of SDataAccept.
Request Phase
If enabled, a slave's SThreadBusy request phase output should not depend upon the current state of any other OCP signal. SThreadBusy should be stable early enough in the cycle so that the master can factor the current SThreadBusy into the decision of which thread to present a request; that is, all of the master's request phase outputs may depend upon the current SThreadBusy. SThreadBusy is a hint, so the master is not required to include a combinational path from SThreadBusy into MCmd, but such paths would be typical. The slave should guarantee that SThreadBusy becomes stable early in the OCP cycle. A common goal is that SThreadBusy be driven directly from a flip-flop in the slave.
A master's request phase outputs should not depend upon any current slave output other than SThreadBusy. This ensures that there is no combinational loop in the case where the slave's SCmdAccept depends upon the current MCmd.
If a slave's SCmdAccept request phase output is based upon the master's request phase outputs from the current cycle, there is a combinational path from MCmd to SCmdAccept. Otherwise, SCmdAccept may be driven directly from a flip-flop, or based upon some other OCP signals. It is legal for SCmdAccept to be derived from MRespAccept. This case arises when the slave delays SCmdAccept to force the master to hold the request fields for a multicycle access. Once read data is available, the slave attempts to return it by asserting SResp. If the OCP has MRespAccept enabled, the slave then must wait for MRespAccept before negating SResp, so it may need to continue to hold off SCmdAccept until it sees MRespAccept asserted.
While the phase relationships of the OCP specification do not allow the response phase to end before the request phase, it is legal for both phases to complete in the same OCP cycle. The worst-case combinational path for the request phase could be:
Clk -> SThreadBusy -> MCmd -> SResp -> MRespAccept -> SCmdAccept -> Clk
The preceding path has too much latency at typical clock frequencies. Fortunately, a multi-threaded slave (with SThreadBusy enabled) is not likely to exhibit non-pipelined read behavior, so this path is unlikely to prove useful. Slave designers need to limit the combinational paths visible at the OCP. By pipelining the read request, the previous path could be:
Clk -> SThreadBusy -> MCmd -> Clk
Clk -> SCmdAccept -> Clk    # Slave accepts if pipeline reg empty
Clk -> SResp -> Clk
Clk -> MRespAccept -> Clk
Response Phase
If enabled, a master's MThreadBusy response phase output should not be dependent upon the current state of any other OCP signal. From the perspective of the OCP, MThreadBusy should become stable early enough in the cycle that the slave can factor the current MThreadBusy into the decision on which thread to present a response; that is, all of the slave's response phase outputs may depend upon the current MThreadBusy. MThreadBusy is a hint, so the slave is not required to include a combinational path from MThreadBusy into SResp, but such paths would be typical. Therefore, the master should guarantee that MThreadBusy becomes stable early in the OCP cycle. A common goal is that MThreadBusy be driven directly from a flip-flop in the master.
The slave's response phase outputs should not depend upon any current master output other than MThreadBusy. This ensures that there is no combinational loop in the case where the master's MRespAccept depends upon the current SResp.
The master's MRespAccept response phase output may or may not be based upon the slave's response phase outputs from the current cycle. In the former case, there is a combinational path from SResp to MRespAccept. In the latter case, MRespAccept can be driven directly from a flip-flop; MRespAccept should not be dependent upon other master outputs.
Datahandshake Phase
The master's datahandshake phase outputs should not depend upon any current slave output other than SThreadBusy. This ensures that there is no combinational loop in the case where the slave's SDataAccept depends upon the current MDataValid. The slave's SDataAccept output may or may not be based upon the master's datahandshake phase outputs from the current cycle. In the former case, there is a combinational path from MDataValid to SDataAccept. In the latter case, SDataAccept should be driven directly from a flip-flop; SDataAccept should not be dependent upon other master outputs.
Figure 22
[Block diagram not reproduced. Labels: Master, Slave, Request Group, 60%.]
Slave
Information about space available on the per-port buffers comes out of a latch and is used to generate SThreadBusy information, which must be generated within the initial 10% of the OCP cycle (as described in Level2 Timing on page 174). These signals are also used to generate SCmdAccept: if a particular port has room, a command on the corresponding thread is accepted. The correct port information is selected through a mux driven by MThreadID at 50% of the clock cycle, making it easy to produce SCmdAccept by 75% of the OCP cycle. When the request group arrives at 60% of the OCP cycle, it is used to update the buffer status, which in turn becomes the SThreadBusy information for the next cycle.
Master
The master keeps information on which threads have commands ready to be presented (thread valid bits). When SThreadBusy arrives at 10% of the OCP clock, it is used to mask off requests; that is, any thread that has its SThreadBusy signal set is not allowed to participate in arbitration for the OCP. The remaining thread valid bits are fed to thread arbitration, the result of which is the winning thread identifier, MThreadID. This is passed to the slave at 50% of the OCP clock period. It is also used to select the winning thread's request group, which is then passed to the slave at 60% of the clock period. When the SCmdAccept signal arrives from the slave, it is used to compute the new thread valid bits for the next cycle.
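The masking and arbitration step described above can be sketched in Python as follows. The text does not say which arbitration policy the master uses, so the round-robin order, the `last_winner` argument, and the function name `arbitrate` are assumptions chosen for illustration.

```python
def arbitrate(thread_valid, sthreadbusy, last_winner=-1):
    """Pick the thread whose request is presented this cycle.

    thread_valid[i] is True when thread i has a command ready.
    sthreadbusy[i] is True when the slave reports thread i as busy.
    Returns the winning MThreadID, or None when MCmd stays Idle.
    """
    num_threads = len(thread_valid)
    # Busy threads are masked off before arbitration.
    eligible = [valid and not busy
                for valid, busy in zip(thread_valid, sthreadbusy)]
    # Assumed policy: round-robin, starting after the previous winner.
    for offset in range(1, num_threads + 1):
        thread_id = (last_winner + offset) % num_threads
        if eligible[thread_id]:
            return thread_id
    return None


# Thread 0 is ready but busy in the slave, so thread 1 wins.
winner = arbitrate([True, True, False], [True, False, False])

# The only ready thread is busy: no request is presented.
no_winner = arbitrate([True, False], [True, False])
```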
Sequential Master
The first example is a medium-throughput, high-frequency master design. To achieve high frequency, the implementation is a completely sequential (that is, Moore state machine) design. Figure 23 shows the state machine associated with the master's OCP.
Figure 23 Sequential Master
[State diagram not reproduced. States and outputs: Idle (MCmd=Idle), Write (MCmd=Write), Read (MCmd=Read), Wait Resp (MCmd=Idle). Arc labels: ~(WrReq | RdReq), WrReq, RdReq, ~SCmdAccept, SCmdAccept, SCmdAccept & (SResp != NULL), SResp != NULL.]
Not shown is the internal circuitry of the master. It is assumed that the master provides the state machine with two control wire inputs, WrReq and RdReq, which ask the state machine to initiate a write transfer and a read transfer, respectively. The state machine indicates back to the master the completion of a transfer as it transitions to its Idle state.
Since this is a Moore state machine, the outputs are only a function of the current state. The master cannot begin a request phase by asserting MCmd until it has entered a requesting state (either write or read), based upon the WrReq and RdReq inputs. In the requesting states, the master begins a request phase that continues until the slave asserts SCmdAccept. At this point, a Write command is complete, so the master transitions back to the idle state. In case of a Read command, the next state is dependent upon whether the slave has begun the response phase or not. Since MRespAccept is not available in the Basic OCP, the response phase always ends in the cycle it begins, so the master may transition back to the idle state if SResp is asserted. If the response phase has not begun, then the next state is wait resp, where the master waits until the response phase begins.
The maximum throughput of this design is one transfer every other cycle, since each transfer ends with at least one cycle of idle. The designer could improve the throughput (given a cooperative slave) by adding the state transitions marked with dashed lines. This would skip the idle state when there are more pending transfers by initiating a new request phase on the cycle after the previous request or response phase. Also, the Moore state machine adds up to a cycle of latency onto the idle to request transition, depending on the arrival time of WrReq and RdReq. This cost is addressed in Combinational Master on page 165.
The benefits of this design style include very simple timing, since the master request phase outputs deliver a full cycle of setup time, and minimal logic depth associated with SResp.
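The sequential master described above can be sketched as a Moore machine in Python. This is a behavioral illustration, not the specification's implementation: the string encodings, the function name `master_next_state`, and the choice to give WrReq priority when both WrReq and RdReq are asserted are assumptions. The optional dashed arcs are left out.

```python
IDLE, WRITE, READ, WAIT_RESP = "Idle", "Write", "Read", "WaitResp"

# Moore outputs: MCmd depends only on the current state.
MCMD_BY_STATE = {IDLE: "Idle", WRITE: "Write", READ: "Read", WAIT_RESP: "Idle"}


def master_next_state(state, wr_req, rd_req, scmdaccept, sresp):
    """Return the next state, evaluated at the rising edge of Clk."""
    if state == IDLE:
        if wr_req:            # assumed priority: write before read
            return WRITE
        if rd_req:
            return READ
        return IDLE
    if state == WRITE:
        # A Write is complete as soon as the request phase ends.
        return IDLE if scmdaccept else WRITE
    if state == READ:
        if not scmdaccept:
            return READ       # keep holding the request phase outputs
        # Without MRespAccept the response phase ends in the cycle it begins.
        return IDLE if sresp != "NULL" else WAIT_RESP
    # WAIT_RESP: request accepted, response not yet seen.
    return IDLE if sresp != "NULL" else WAIT_RESP


# A read whose response arrives one cycle after the request is accepted.
trace = [IDLE]
for inputs in [(False, True, False, "NULL"),   # RdReq arrives
               (False, False, True, "NULL"),   # SCmdAccept, no response yet
               (False, False, False, "DVA")]:  # response phase
    trace.append(master_next_state(trace[-1], *inputs))
mcmd_trace = [MCMD_BY_STATE[s] for s in trace]
```

The trace shows the extra cycle between the request arriving and MCmd being driven, which is the latency cost of the Moore style noted above.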
Sequential Slave
An analogous design point on the slave side is shown in Figure 24. This slave's OCP logic is a Moore state machine. The slave is capable of servicing an OCP read with one Clk cycle latency. On an OCP write, the slave needs the master to hold MData and the associated control fields steady for a complete cycle so the slave's write pulse generator will store the desired data into the desired location. The state machine reacts only to the OCP (the internal operation of the slave never prevents it from servicing a request), and the only non-OCP output of the state machine is the enable (WE) for the write pulse generator.
Figure 24 Sequential OCP Slave
[State diagram not reproduced. States and outputs: Idle, Write (SCmdAccept, SResp=NULL, WE), Read (SCmdAccept, SResp=DVA, ~WE). Arc labels: (MCmd == Idle), (MCmd == Write), (MCmd == Read).]
The state machine begins in an idle state, where it de-asserts SCmdAccept and SResp. When it detects the start of a request phase, it transitions to either a read or a write state, based upon MCmd. Since the slave will always complete its task in one cycle, both active states end the request phase (by asserting SCmdAccept), and the read state also begins the response phase. Since there is no MRespAccept in the Basic OCP, the response phase will end in the same cycle it begins. Finally, the state machine triggers the write pulse generator in its write state, since the request phase outputs of the master will be held steady until the state machine transitions back to idle.
As in the sequential master shown in Figure 23 on page 163, this state machine limits the maximum throughput of the OCP to one transfer every other cycle. There is no simple way to modify this design to achieve one transfer per cycle, since the underlying slave is only capable of one write every other cycle. With a Moore machine representation, the only way to achieve one transfer per cycle is to assert SCmdAccept unconditionally (since it cannot react to the current request phase signals until the next Clk cycle). Solving this performance issue requires a combinational state machine.
Since the outputs depend only upon the current state, the sequential OCP slave has attractive timing properties. It will operate at very high frequencies (providing the internal logic of the slave can run that quickly). This state machine can be extended to accommodate slaves with internal latency of more than one cycle by adding a counting state between idle and one or both of the active states.
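The sequential slave and its throughput limit can be sketched in Python as follows. The state and output encodings and the helper `count_transfers` are assumptions for the example; the driver models an idealized master that always has another request of the same command to present.

```python
# Moore outputs of the slave, one entry per state.
SLAVE_OUTPUTS = {
    "Idle":  {"SCmdAccept": False, "SResp": "NULL", "WE": False},
    "Write": {"SCmdAccept": True,  "SResp": "NULL", "WE": True},
    "Read":  {"SCmdAccept": True,  "SResp": "DVA",  "WE": False},
}


def slave_next_state(state, mcmd):
    """Return the slave's next state at the rising edge of Clk."""
    if state == "Idle":
        if mcmd in ("Write", "Read"):
            return mcmd       # start servicing the request
        return "Idle"
    return "Idle"             # both active states last exactly one cycle


def count_transfers(num_cycles, mcmd="Read"):
    """Count request phases that end within num_cycles of back-to-back requests."""
    state = "Idle"
    completed = 0
    for _ in range(num_cycles):
        if SLAVE_OUTPUTS[state]["SCmdAccept"]:
            completed += 1    # SCmdAccept sampled asserted: request phase ends
        state = slave_next_state(state, mcmd)
    return completed


reads_in_8_cycles = count_transfers(8)
```

Even with a master that never inserts idle cycles of its own, the slave alternates between Idle and an active state, so only every other cycle completes a transfer.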
Combinational Master
Sequential Master on page 163 describes the transfer latency penalty associated with a Moore state machine implementation of an OCP master. An attractive approach to improving overall performance while reducing circuit area is to consider a combinational Mealy state machine representation. Assuming that the internal master logic is clocked from Clk, it is acceptable for the master's outputs to be dependent on the current state, the internal RdReq and WrReq signals, and the slave's outputs, since all of these are synchronous to Clk.
Figure 25 shows a Mealy state machine for the OCP master. The assumptions about the internal master logic are the same as in Sequential Master on page 163, except that there is an additional acknowledge (Ack) signal output from the state machine to the internal master logic to indicate the completion of a transfer.
This state machine asserts MCmd in the same cycle that the request arrives from the internal master logic, so transfer latency is improved. In addition, the state machine is simpler than the Moore machine, requiring only two states instead of four. The request state is responsible for beginning and waiting for the end of the request phase. The wait resp state is only used on Read commands where the slave does not assert SResp in the same cycle it asserts SCmdAccept. The arcs described by dashed lines are optional features that allow a transition directly from the end of the response phase into the beginning of the request phase, which can reduce the turn-around delay on multicycle Read commands.
Figure 25
[State diagram not reproduced. States: Request, Wait Resp. Arcs are labeled input/output and are drawn as either required or optional. Arc labels:
~(WrReq | RdReq) / MCmd=Idle, ~Ack
WrReq / MCmd=Write, Ack=SCmdAccept
RdReq & ~SCmdAccept / MCmd=Read, ~Ack
RdReq & SCmdAccept & (SResp != NULL) / MCmd=Read, Ack]
The cost of this approach is in timing. Since the master request phase outputs become valid a combinational logic delay after RdReq and WrReq, there is less setup time available to the slave. Furthermore, if the slave is capable of asserting SCmdAccept on the first cycle of the request phase, then the total path is:
Clk -> (RdReq WrReq) -> MCmd -> SCmdAccept -> Clk.
To successfully implement this path at high frequency requires careful analysis. The effort is appropriate for highly latency-sensitive masters such as CPU cores. At much lower frequencies, where area is often at a premium, the Mealy OCP master is attractive because it has fewer states and the timing constraints are much simpler to meet. This style of master design is appropriate for both the highest-performance and lowest-performance ends of the spectrum. A Moore state machine implementation may be more appropriate at medium performance.
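The combinational master can be sketched as a Mealy machine in Python, where the outputs depend on the inputs as well as the state. The four arcs out of the request state follow the labels given for Figure 25. The arc into wait resp (request accepted with SResp still NULL) and the arcs out of it are inferred from the surrounding text and should be read as assumptions, as should the function name and string encodings. The optional arcs are omitted.

```python
def mealy_master(state, wr_req, rd_req, scmdaccept, sresp):
    """Return (next_state, MCmd, Ack) for the current cycle."""
    if state == "WaitResp":
        # Inferred arcs: wait until the response phase begins, then acknowledge.
        done = sresp != "NULL"
        return ("Request" if done else "WaitResp", "Idle", done)
    # state == "Request"
    if wr_req:
        # WrReq / MCmd=Write, Ack=SCmdAccept
        return ("Request", "Write", scmdaccept)
    if rd_req:
        if not scmdaccept:
            # RdReq & ~SCmdAccept / MCmd=Read, ~Ack
            return ("Request", "Read", False)
        if sresp != "NULL":
            # RdReq & SCmdAccept & (SResp != NULL) / MCmd=Read, Ack
            return ("Request", "Read", True)
        # Inferred arc: accepted, but the response has not begun yet.
        return ("WaitResp", "Read", False)
    # ~(WrReq | RdReq) / MCmd=Idle, ~Ack
    return ("Request", "Idle", False)


# MCmd is driven in the same cycle that WrReq arrives (no Moore-style delay).
same_cycle_write = mealy_master("Request", True, False, True, "NULL")
```

Compared with a Moore implementation, MCmd here is a combinational function of WrReq and RdReq, which is the source of both the latency improvement and the tighter timing path discussed above.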
Combinational Slave
Peak OCP data throughput of one transfer per cycle is most commonly achieved using a combinational Mealy state machine implementation. If a slave can satisfy the request phase in the cycle it begins and deliver read data in the same cycle, the Mealy state machine representation is degenerate: there is only one state in the machine. The state machine always asserts SCmdAccept in the first request phase cycle, and asserts SResp in the same cycle for Read commands.
Figure 26
[State diagram not reproduced. Single state: Idle. Arc label: (MCmd == Read) / SCmdAccept, SResp=DVA.]
The implementation shown in Figure 26 offers the ideal throughput of one transfer per cycle. This approach typically works best for low-speed I/O devices with FIFOs, medium-frequency but low-latency asynchronous SRAM controllers, and fast register files. This is because the timing path looks like:
Clk -> (master Logic) -> MCmd -> (Access internal slave resource) -> SResp -> Clk.
This path is simplest to make when:
SResp can be determined based only on MCmd assertion (and not other request phase fields nor internal slave conditions)
To satisfy the access time and operating frequency constraints of higher-performance slaves such as main memory controllers, the OCP supports transfer pipelining. From the state machine perspective, pipelining splits the slave state machine into two loosely-coupled machines: one that accepts requests, and one that produces responses. Such machines are particularly useful with the burst extensions to the OCP.
Burst Extension
The burst extension is typically used along with pipelined master and slave devices. For a pipelined OCP device, the request phase is de-coupled from the response phase - that is, the request phase may begin and end several cycles before the associated response phase begins and ends. As such, it is useful to think of separate, loosely-coupled state machines to support either the master or the slave.
Datahandshake Extension
The datahandshake extension allows the de-coupling of a write address from write data. It is typically only useful for master and slave devices that require the throughput advantages available through transfer pipelining.
Threads
The thread capability relies on a thread ID to identify and separate independent transfer streams (threads). The master labels each request with the thread ID that it has assigned to the thread. The thread ID is passed to the slave on MThreadID together with the request (MCmd). When the slave returns a response, it also provides the thread ID (on SThreadID) so the master knows which request is now complete. The transfers in each thread must remain in-order with respect to each other (as in the basic OCP), but the order between threads can change between request and response. The thread capability allows a slave device to optimize its operations. For instance, a multi-bank SDRAM controller could respond to a second read request referencing an open page before opening a new page in a different bank to service a first read on a different thread. In multi-threaded, multi-initiator systems, it is frequently useful to associate a transfer request with a thread operating on a particular initiator. Initiator identification enables busy targets (such as multi-bank SDRAM controllers) to prioritize requests. For instance, a memory controller that is completing a first burst transaction needs to determine which of several pending requests to service next. In many systems, the decision should be based upon priority; for example, real-time frame buffer transfers have higher priority than polygon data transfers. The priority is typically associated with the requesting initiator. In some cases, the initiator has several assigned priorities based upon the type of request. If different request types are assigned to different system initiator threads, the priority is dependent upon a thread within a particular initiator. For devices where these concerns are important, the OCP complex extensions support connections.
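The ordering rule described above (in order within a thread, free to reorder between threads) can be sketched in Python with one FIFO of outstanding requests per thread. The class `ThreadTracker` and its methods are hypothetical names used only for this illustration; the request tags stand in for whatever bookkeeping a real master keeps.

```python
from collections import deque


class ThreadTracker:
    """Track outstanding requests per thread, indexed by thread ID."""

    def __init__(self, num_threads):
        # One FIFO per thread keeps transfers in order within that thread.
        self.outstanding = [deque() for _ in range(num_threads)]

    def issue(self, mthreadid, tag):
        """Record a request presented with the given MThreadID."""
        self.outstanding[mthreadid].append(tag)

    def complete(self, sthreadid):
        """Return the request completed by a response carrying SThreadID."""
        return self.outstanding[sthreadid].popleft()


tracker = ThreadTracker(num_threads=2)
tracker.issue(0, "read A")   # first read, on thread 0
tracker.issue(1, "read B")   # second read, on thread 1
tracker.issue(0, "read C")

# The slave may answer thread 1 first, for example because its page is open.
completion_order = [tracker.complete(1), tracker.complete(0), tracker.complete(0)]
```

Because the thread ID directly indexes a small table, no associative matching is needed to pair a response with its request.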
Connections
Connections are closely related to threads, but can have end-to-end meaning in the system, rather than the local meaning (that is, master to slave) of a thread. The connection ID and thread ID seem to provide similar functionality, so it is useful to consider why the OCP needs both. A thread ID is an identifier of local scope that simply identifies transfers between the master and slave. In contrast, the connection ID is an identifier of global scope that identifies transfers between a system initiator and a system target. A thread ID must be small enough (that is, a few bits) to efficiently index tables or state machines within the master and slave. There are usually more connection IDs in the system than any one slave is prepared to simultaneously accept. Using a connection ID in place of a thread ID requires expensive matching logic in the master to associate the returned connection ID (from the slave) with specific requests or buffer entries.
Using a networking analogy, the thread ID is a level-2 (data link layer) concept, whereas the connection ID is more like a level-3 (transport/session layer) concept. Some OCP slaves only operate at level-2, so it doesn't make sense to burden them or their masters with the expense of dealing with level-3 resources. Alternatively, some slaves need the features of level-3 connections, so in this case it makes sense to pass the connection ID through to them.
From a state machine perspective, multi-threaded behavior is most frequently implemented using one state machine per thread. The only added complexity is the arbitration scheme between threads. This is unavoidable, since the entire purpose for building a multi-threaded OCP is to support concurrency, which directly implies contention for any shared resources, such as the OCP wires.
The MDataThreadID signal simplifies the implementation of the datahandshake extension along with threading, by providing the thread ID associated with the current write data transfer. Without this signal, the slave state machines must cooperate to track the Write command order (among threads).
The thread busy signals provide status information that allows the master's arbiter to determine which threads will not accept requests. That information also allows the slave's arbiter to determine which threads will not accept responses. These signals provide for cooperation between the master and the slave to ensure that requests do not get presented on busy threads.
Sideband Signals
The sideband signals provide a means of transmitting control-oriented information. Since the signals are rarely performance sensitive, implementors are strongly encouraged to ensure that all sideband signals are driven stable very early in the OCP clock cycle; that is, that sideband outputs come directly out of core flip-flops. Sideband inputs should similarly be allowed to arrive very late in the OCP clock cycle; that is, sideband inputs should be registered almost immediately by the receiving core. Cores that do not implement this conservative timing may require modification to achieve timing convergence.
Scan Control
The width and meaning of the Scanctrl field is user-defined. At a minimum this field carries a signal to specify when the device is in scan chain shifting mode. The signal can be used for the scan clock if scan-clock style flip-flops are being used. When this is a multi-bit field, another common signal to carry would be one specifying the scan mode. This signal can be used to put the IP core into any special test mode that is necessary before scanning and application of ATPG vectors can begin.
Clock Control
The clock control test extensions are included to ease the integration of IP cores into full or partial scan test environments and support of debug scan operations in designs that use clock sources other than Clk. When an external clock source exists (for example, non-Clk derived clock), the ClkByp signal specifies a bypass of the external clock. In that case the TestClk signal usually becomes the clock source. The TestClk toggles in the correct sequence for applying ATPG vectors, stopping the internal clocks, and doing scan dumps as required by the user.
17
Timing Guidelines
To provide core timing information to system designers, characterize each core into one of the following timing categories:
Level0 identifies the core interface as having been designed without adhering to any specific timing guidelines.
Level1 timing represents conservative interface timing.
Level2 represents high performance interface timing.
One category is not necessarily better than another. The timing categories are an indication of the timing characteristics of the core that allow core designers to communicate at a very high level about the interface timing of the core. Table 38 represents the inter-operability of two OCP interfaces.
Table 38 Core Interface Compatibility
          Level0   Level1   Level2
Level0    X        X        X
Level1    X        V        V
Level2    X        V        V*
X   inter-operability not guaranteed
V   inter-operability guaranteed
V*  high performance inter-operability but some minor changes may be required
The timing guidelines apply to dataflow and sideband signals only. There is no timing guideline for the scan and test related signals.
Timing numbers are specified as a percentage of the minimum supported clock cycle (at maximum operating frequency). If a core is specified at 100 MHz (a 10 ns cycle) and the c2qtime is given as 30%, the actual c2qtime is 3 ns.
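The conversion from a percentage to an absolute time can be written out as a short Python helper. The function name `percent_to_ns` is an assumption for the example; the arithmetic is the one used in the paragraph above.

```python
def percent_to_ns(max_frequency_mhz, percent):
    """Convert a timing percentage into nanoseconds.

    The reference is the minimum supported clock cycle, that is, the period
    at the core's maximum operating frequency.
    """
    min_cycle_ns = 1000.0 / max_frequency_mhz
    return min_cycle_ns * percent / 100.0


# A core specified at 100 MHz with a c2qtime of 30%.
c2qtime_ns = percent_to_ns(100, 30)
```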
Level0 Timing
Level0 timing indicates that the core developer has not followed any specific guideline in designing the core interface. There is no guarantee that the interface can operate with any other core interface. Inter-operability for the core will need to be determined by comparing timing specifications for two interfaces on a per-signal basis.
Level1 Timing
Level1 timing indicates that a core has been developed for minimum timing work during system integration. The core uses no more than 25% of the clock period for any of its signals, either at the input (setuptime) or at the output (outputtime). A core interface in this category must not use any of the combinational paths allowed in the OCP interface. Since inputs and outputs each only use 25%, 50% of the cycle remains available. This means that a Level1 core can always connect to other Level1 and Level2 cores without requiring any additional modification.
Level2 Timing
Level2 timing indicates that a core interface has been developed for high performance timing. A Level2 compliant core provides or uses signals according to the timing values shown in Table 39. There are separate values for single-threaded and multi-threaded OCP interfaces. The number for each signal indicates the percentage of the minimum cycle time at which the signal is available, that is the outputtime at the output. Setuptime at the input is calculated by subtracting the number given from the minimum cycle time. For example, a time of 30% indicates that the outputtime is 30% and the setuptime is 70% of the minimum clock period. In addition to meeting the timing indicated in Table 39, a Level2 compliant core must not use any combinational paths other than the preferred paths listed in Table 40. There is no margin between outputtime and setuptime. When using Level2 cores, extra work may be required during the physical design phase of the chip to meet timing requirements for a given technology/library.
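The Level1 and Level2 budgets described above reduce to simple percentage arithmetic, sketched here in Python. The function names are assumptions for the example, and the 30% value is the illustrative figure from the text, not an entry from Table 39.

```python
def level2_setuptime(outputtime_percent):
    """Setuptime implied by a Level2 outputtime, in percent of the minimum cycle."""
    return 100 - outputtime_percent


def margin_percent(driver_outputtime, receiver_setuptime):
    """Fraction of the cycle left over between a driver and a receiver."""
    return 100 - driver_outputtime - receiver_setuptime


# Two Level1 cores: each side uses at most 25%, so half the cycle remains.
level1_margin = margin_percent(25, 25)

# Two Level2 cores on a signal with a 30% outputtime: no margin is left.
level2_margin = margin_percent(30, level2_setuptime(30))
```

The zero margin in the second case is why Level2 connections may need extra work during physical design.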
Table 39
Signal
Control, Status
ControlBusy, StatusBusy
ControlWr, StatusRd
Datahandshake Group (excluding MDataThreadID)
MDataThreadID
MRespAccept
MThreadBusy
MThreadID
Request Group (excluding MThreadID)
Reset_n
Response Group (excluding SThreadID)
SCmdAccept
SDataAccept
SError, SFlag, SInterrupt, MFlag
SThreadBusy
SThreadID
Table 40
Core     From              To
Master   SThreadBusy       Request Group
Master   SThreadBusy       Datahandshake Group
Master   Response Group    MRespAccept
Slave                      Response Group
Slave                      SCmdAccept and SDataAccept
Slave                      SCmdAccept and SDataAccept
Index
A
addr_base statement 102 addr_size statement 102 addr_wdth parameter 12 address bus bridges 168 pipelining 168 region statement 102 space 9 STL 87 transfer 14 addrspace 14 addrspace_wdth 14 arbitration, shared resource 8 area savings 158, 165 ATPG vectors 171 nets 101, 111 signals 116 statement 116 type 108 burst alignment 36 burst_aligned parameter 36 code 14, 31 configuration parameters 74 custom 34 custom patterns 32 definition 31 extension 168 field values 31 fill 88 fill zero 89 incrementing 32 latency 74 linking transfers 14 maximal packing 32 MBurst signal 14 option coding 110 param statement 108, 109 packing 32 parameter 14 read macro 90 sequenced address 9 size 36 state machine 167 streaming 32, 34 support reporting instructions 62, 63 signals 14 transfers 168 type 14, 32 write numeric 89 random 89 burst_aligned parameter 36 burstlat_1thread 74 burstlat_cycles 74 burstlat_enable 74 bus bridges 168 independence 8 of signals 116 wrapper interface module 2 bwrite 89 byte enable field 15 MByteEn signal 15 STL 88
B
basic OCP signals 12 BCST 13 be field 88 behavioral model multi-threaded 79 OCP Merger 79 OCPMON 81 QCMaster 69 QCSlave 72 QSMaster 69 QSSlave 72 stimulus 85 behavioral statements 91 bfill 88 bread 90 Broadcast command description 8 device support 35 transfer affects 31 broadcast_enable parameter 35 buffer, large load 125 bundle characteristics 101 configuring 108, 109 defining core 101 non-OCP interface 115 testbench 108 external signals 110 name 100
C
c2qtime default 121 port constraints 145 timing 56 variable description 130 c2qtimemin 130, 145 capacitance nets 126 units 120 wireload 132, 147 wireloaddelay 132, 147 capacitancescale variable 120 capacitive load 57 cell library name 57 cfill 88 chip synthesis configuration file clocks 128 creating 119 parts 120 ports 130 sample 138 specifying technology variables 134 technology variables 120 chip_stmt 107 chipparam variable 134, 143 CID connection ID 86 Clk signal function 12 summary 21 ClkByp signal function 19 summary 22 test extensions 171 timing 30 clkctrl_enable parameter 20 clock attributes 120 bypass signal 20 control test extensions 171 description syntax 136 fall time 122, 124 gated test 20 name 136 non-OCP 144 period chip synconf file 128 rise/fall time 129 variable 129
portname 145 rise time 123, 129 signal 12 skew best case 123 hold time 129 setup time 129 worst case 123 test 20 to output time 130 variables 128 waveform editing 130 rise/fall time 124, 129 clockName 144 clockname field 145 clockperiod variable 133, 142 cmd field 86 combinational dependencies 26, 175 Mealy state machine 165 paths 59, 150 slave state machine 166 commands basic 8 extensions 8 mnemonic 13 required 36 comments 112 compatibility rules 36 complex OCP signals 15, 168 concurrency 168 configurable interfaces 102 connection description 169 identifier CID 86 definition 169 field 15 support 62, 63 transfer handling 35 uses 10 location 108, 109 statement 108 transfers 35 width mismatch 39 connid parameter 15 connid_wdth parameter 15 control event signal 18 field 18 information specifying 18 timing 30 parameter 18
statement 93 timing 30 Control signal emulation 78 function 17 summary 21 timing 30 control_wdth parameter 18 controlbusy parameter 18 ControlBusy signal emulation 79 function 17 summary 21 timing 30 controlobserve_addr 79 controlreg_addr 78 controlwr parameter 18 ControlWr signal 30 emulation 79 function 17 summary 21 core area 61, 144 code 98 compliance 3 control busy 18 event 18 information 18 delay calculation 125 description language 98 documentation 61 documentation template 66 frequency range 61 hold time 131 ID 61 interconnecting 58 interface defining 100 description 110 timing parameters 56 name 61 non-OCP interfaces 108 power consumption 61 process dependent 61 revision 98 RTL configuration file 97 status busy 18 event 19 information 18 synthesis configuration file defining 141 overriding 120 timing 55, 173 core_id statement 98
core_language statement 98 core_name 97 core_stmt 97 cread 90 custom bursts attributes 32 types 34 cwrite 89 cycle command 86 czero 89
D
data field 88 data width 9 data_wdth parameter 13 dataflow signals definitions 12 naming 12 timing 27 datahandshake extension 168 intra-phase output 161 parameter 15 phase active 28 order 28 timing 159 QMaster emulation 71 sequence 159 signal group 26 signals 14 throughput 159 ddr_space statement 102 debug and test interface 20, 171 defaultc2qtime variable 121 defaultc2qtimemin variable 121 defaultcriticalrange variable 121 defaultfalldelaymax variable 122 defaultfalldelaymin variable 122 defaultfalltime variable 124 defaultfanoutload variable 122 defaultholdtime variable 122 defaultloadcellpin variable 122 defaultloads variable 123 defaultminusuncertainty variable 123 defaultplusuncertainty variable 123 defaultrisedelaymax variable 123 defaultrisedelaymin variable 123 defaultrisetime variable 124 defaultsetuptime variable 124
direction statement 117 DRAM page-mode operation 168 driver strength 57 drivingcellpin parameter description 131 timing requirements 57 values 146 dumpOff 83 DVA 13
E
ERR 13 error report mechanisms 10 response emulation 77 signal 17 signal emulation 75 slave 18 error_delay 76 error_resetaddr 76 error_setaddr 76 exclusive access 8 explicit width macro 90 extended OCP signals 14, 168
F
falldelaymax variable 128 falldelaymin variable 128 falltime variable 129 false path constraints 59, 151 falsepath parameter 59, 137, 151 fanout maximum 147 net 122 port 132 value 122 FIFO modeling 72 flags core-specific 17 master 17 slave 18 flow-control 62, 63 force_aligned parameter 35
G
gate delay 124 load calculations 122 specifying 131
H
highdrivegatepin variable 125 high-frequency design 163 hold time checking 56 clock skew 129 input port 131 value 122 holdtime description 131, 146 timing 56
I
icon statement 98 idle macro 91 request cycles 82 implementation restrictions 62, 63 incrementing bursts attributes 32 rules 33 inout ports 148, 149 input load 57 port syntax 148 signal timing 56 instance body 109 location 108, 109 name 109 size 144 statement 109 interconnect delays 125, 126 interface characteristics 62 clock control 20 compatibility 36 configurable 102 configuration file 115 connections 110 core RTL description 97 debug and test 20 defining 110 location 102, 108, 109 multiple 100 parameters 102 scan 19 statement 100, 110 type statement 101
types 116 interface_type statement 111 interface_types statement 116 interfaceparam variable 134, 143 internal memory 73 scan techniques 171 interrupt parameter 18 processing 10 signal 17, 75 slave 18 interrupt_delay 76 interrupt_resetaddr 75 interrupt_setaddr 75
J
jtag_enable parameter 20 jtagtrst_enable 20
L
latency bursts 74 QCSlave 73 QSSlave 73 random 74 sensitive master 166 variable 74 least significant byte 13 level0 timing 173, 174 level1 timing 173, 174 level2 timing 173, 174 limitreq_enable 75 limitreq_max 75 little-endian 9, 13 loadcellpin description 131, 146 timing 57 loads parameter description 131, 146 timing 57 location statement 102, 108, 109 longest path 145 longnetdelay variable 125 longnetrccapacitance variable 126 longnetrcresistance variable 126 lowdrivegatepin variable 125
M
macro statement 88 MAddr signal function 12 signal mismatch 37 summary 21 MAddrSpace signal function 14 signal mismatch 37 summary 21 uses 168 mask value 88 master flags 17 interface documentation 62 reset 17, 29 response accept 15 signal compatibility 36 slave interaction 159 thread busy 16 maxdelay parameter description 150 syntax 137 timing 59 maxfanout variable 132, 147 maximal packing 32 maxoperatingconditions variable 126 maxtest_size 70 MBurst signal function 14 linking OCP transfers 168 summary 21 MByteEn signal function 14 signal mismatch 37 summary 21 MCmd signal function 12 signal mismatch 37 summary 21 MConnID signal function 15 signal mismatch 38 summary 21 MData signal data valid 15 description 12 request phase 159 signal mismatch 37 summary 21 mdata_delay 71 MDataThreadID signal datahandshake 170
function 15 signal mismatch 38 summary 21 MDataValid signal datahandshake 159 function 14 signal mismatch 37 summary 21 timing 159 mediumdrivegatepin variable 125 mediumnetdelay variable 125 mediumnetrccapacitance variable 126 mediumnetrcresistance variable 126 mem_2size 72 meminit_fixed 73 meminit_fixeddata 73 meminit_preloadb 73 meminit_preloadh 73 memory exclusive access 8 initialization 73 initialization file 73 model 72 size 72 special locations 75 target 72 mflag parameter 17 MFlag signal 17 signal mismatch 39 summary 21 mflag_addr 76 mflag_wdth parameter 17 minoperatingconditions variable 126 minusuncertainty variable 129 module name 109 MRespAccept signal definition 14 delay values 71 QMaster emulation 71 response phase 158 response phase output 161 saving area 158 signal mismatch 37 summary 21 uses 168 mrespaccept_delay 71 mthreadbusy parameter 16 MThreadBusy signal control 71 definition 15 information 35 intra-phase output 160 QMaster emulation 71
summary 21 timing cycle 29 mthreadbusy_delay 71 MThreadID signal function 15 signal mismatch 38 summary 21 multiplexed bus bridges 168 multi-threaded behavioral model 79
N
natural transfer size 32 ndata 89 nets bundle 111 capacitance 126 characterizing 117 closest 126 delays 125 fanout 122 load 123 longest 126 renaming 111 resistance 126 statement 116 nfill 88 nread 90 NULL 13 nwrite 89 nzero 89
O
OCP trace file 83 OCP Merger 79 ocpasm 70 OCPMON 81 optimization paths 121 out-of-band information 17 output port syntax 149 signal timing 56
P
package file 153 packing 32, 34 param statement chip interface 110 instance 109 RTL chip 108
param variable 133, 143 path longest 130, 132, 145 optimization 121 shortest 130, 131, 145 period variable 129 phase datahandshake 159 intra-phase 159 ordering between transfers 29 request 157 within transfer 28 protocol 27 timing 157 transfer 28 physical design parameters 57 pin driving input 131 output 131 level timing 56 load 122 pipeline decoupling request phase 168 request/response protocol 158 support 62, 63 transfer 167 without MRespAccept 158 write data, slave 15 plusuncertainty variable 129 point-to-point signals 8 port constraint variables 145 delay 150 fanout 132 inout 148, 149 input, syntax 148 longest path 130 module names 101 net delay 132 output load calculations 123, 131 number of loadcellpins 131 syntax 149 renaming 101 setup time 124 shortest path 130 statement 101, 111 timing constraints 141 timing value best-case 121 worst case 121, 125 posted write 9 power consumption estimates 61
low 125 pre-fetching 168 prefix command 101 prefix statement 111 protocol phases mapping 28 order 28 rules 28
Q
QCMaster configuration 70 features 69 instantiating 109 memory 70 multiple 79 multi-threaded model 70 test vectors 70 QCSlave configuration 72 features 72 memory modeling 72 multiple threads 72 QSMaster configuration 70 features 69 memory 70 multi-threaded model 70 test vectors 70 QSSlave configuration 72 description 72 error 76 flag 77 interrupt 76 memory modeling 72 multiple threads 72
R
randmode_enable 74 rdata 89 RDEX 13 rdfifo_ addr 78 enable 78 high 78 low 78 rate 78 size 78 read data field 13 FIFO 77 optimizing access 158
pre-fetch 168 Read command response phase 159 transfer affects 31 readcontrol statement 93 ReadEx command description 8 device support 35 emulation 72 transfer affects 31 readex_enable parameter 35 request idle cycles 82 number 75 phase decoupling 168 intra-phase 160 order 28 outputs 157, 160 signal group 28 timing 157 transfer ordering 29 worst-case combinational path 160 prioritize 169 signals active 27 group 26 thread identifier 16 reset OCP 29 parameter 17 signal 17 special requirements 62 Reset_n signal function 17 required cycles 30 signal mismatch 38 summary 21 timing 29 resistance nets 126 units 127 wireload 132, 147 wireloaddelay 132, 147 resistancescale variable 127 respaccept parameter 15 resperr_enable 77 resperr_rate 77 response accept 15 encoding 13 field 13 flow control signals 14 mnemonics 13 phase
active 28 intra-phase 160 order 29 slave 161 timing 158 required types 36 signal group 26 thread identifier 16 revision_code 98 risedelaymax variable 129 risedelaymin variable 129 risetime variable 129
S
sbclockperiod variable 133, 143 scan clock 171 control 171 data in 19 out 19 interface signals 19 mode control 19 Scanctrl signal 19 Scanin signal 19 Scanout signal 19 test environments 171 test mode 171 scanctrl_wdth parameter 19 Scanctrl signal function 19 summary 22 uses 171 Scanin signal function 19 summary 22 timing 30 Scanout signal function 19 summary 22 timing 30 scanport_wdth parameter 19 SCmdAccept signal definition 12 emulation 75 request phase 157 request phase output 160 signal mismatch 37 summary 21 scmdaccept_delay 75 SData signal function 12 signal mismatch 37 summary 21
SDataAccept signal datahandshake 159 emulation 75 function 14 signal mismatch 38 summary 21 sdataaccept_delay 75 serror parameter 18 SError signal 17 signal mismatch 38 summary 21 setMaxIdles 82 setuptime description 132, 147 timing 56 sflag parameter 18 SFlag signal behavioral models 75 function 17 signal mismatch 38 summary 22 sflag_addr 76 sflag_wdth 18 sflagmask_addr 76 shared resource arbitration 8 shortest path 145 shortnetdelay variable 125 shortnetrccapacitance variable 126 shortnetrcresistance variable 126 sideband signals definitions 17 timing 29, 170 signal basic OCP 12 configuration 21 control 85 dataflow 12 driver strength 57 extensions complex 15 simple 14 external 110 group division 26 mapping 28 interface compatibility 37 name 101 ordering 159 requirements 35 sideband 17 statement 91 test 19 tie-off rules 37 timing
input 56 output 56 requirements 27 restrictions 159 ungrouped 29 SInterrupt signal 17 signal mismatch 38 summary 22 slave combinational paths 160 error 18 flag description 18 signal emulation 75 interface documentation 62 interrupt 18 optimizing 169 pipelined write data 15 quick system 72 reset 17, 29 response field 13 response phase 161 signal compatibility 36 successful transfer 31 thread busy 16 transfer accept 13 write accept 15 Socket Transaction Language behavioral statements 91 command syntax 85 commands 85 cycle directives 70 macro statements 88 trace 91 SResp signal behavioral models 77 function 12 signal mismatch 37 summary 21 state machine combinational master 165 Mealy 165 slave 166 diagrams 157 multi-threaded behavior 170 sequential master 163 sequential slave 164 status busy 18 core 18 event 19 information emulation 78 signals 18 parameter 18 statement 93
timing 30 Status signal emulation 78 function 17 summary 22 status_wdth parameter 18 statusbusy parameter 18 StatusBusy signal emulation 79 function 17 summary 22 timing 30 statusbusy_cycles 79 statusbusye_enable 79 statusdrive_addr 79 statusrd parameter 19 StatusRd signal emulation 79 function 17 summary 22 timing 30 statusreg_addr 78 sthreadbusy parameter 16 SThreadBusy signal definition 15 emulation 75, 77 information 35 signal mismatch 38 slave request phase 160 summary 21 timing cycle 29 sthreadbusy_enable 77 SThreadID signal function 15 signal mismatch 38 summary 21 streaming bursts attributes 32 rules 34 device modeling 77 synchronous handshaking signals 3 interface 8 reset 17 system initiator 2 modeling 72 target 2
T
target memory 72 TCK signal function 19 summary 22 TDI signal function 19 summary 22 TDO signal function 19 summary 22 technology section 120 variables 120 Technology Compiler 120 technologyvalue variable 127 test clock 20 clock control extensions 171 data in 20 out 20 logic reset 20 mode 20 signals definitions 19 timing 29, 30 size 70 vector generation 85 testbench RTL configuration file 107 TestClk signal function 19 summary 22 timing 30 thread busy master 16 signals 35 slave 16 busy signals 170 description 169 end-to-end identification 35 identifier binary-encoded value 16 definition 169 request 16 response 16 TID 86 uses 34 mapping 34 multiple 34 number supported 62 ordering 9 state machine implementation 170 support 63 transfer order 169 throughput documenting 62
maximum 158 peak data 166 TID thread ID 86 timescale variable 127 timestamp_enable 70 timing best case 121, 126 calculating values 126 categories definitions 173 level0 174 level1 174 level2 174 chip port values 120 combinational path 27 combinational paths 59 constraints portable 120 ports 141 core 55 connecting 58 interface parameters 56 dataflow signals 27 default units 127 defaults, overriding 119 fast 125 IP blocks 119 max delay 59 parameters 145 pin-level 56 pins 57 sideband signals 29 signals 27, 159 test signals 29 worst case 121, 126 TMS signal function 19 summary 22 transfer accept 13 address 12 address region 14 affects on commands 31 assigning 34 burst linking 31, 168 requirements 32, 33 type 14 byte enable 15 command 12 concurrent activity 34 connection ID 15 data widths 9 efficiency 9, 32 linking 14, 168 natural size 32 optimizing performance 168
order 9, 29 out-of-order 34 phases 28 pipelining 9 successful 31 type 12 TRST_N signal function 19 summary 22
V
vendor code 98 version 144 command 120 statement 98, 107, 116 VHDL language specification 98 ports 117 signals 117 vhdl_type command 117 Virtual Socket Interface Alliance 1
W
wait statement 92 watch_enable 82 watch_maxidles 82 waveform, clock 124, 129 WGL format control 82 wgl_enable 82 wgl_slave 82 width data 9 mismatch 39 wireloadcapacitance description 132, 147 See also wireloaddelay timing 57 wireloaddelay description 132, 147 timing 57 wireloadmode variable 127 wireloadresistance description 132, 147 See also wireloaddelay timing 57 word size 9, 12 transfer 9 worstcasedelay 144 wrapper interface modules 2 wrfifo_
addr 78 enable 78 high 78 low 78 rate 78 size 78 write address, decoupling 168 data master to slave 13
slave accept 15 thread ID 16 valid 15 FIFO 77 posted 9, 159 Write command completion 159 transfer affects 31 writestatus statement 93
OCP-IP Association 5440 SW Westgate Drive, Suite 217 Portland, OR 97221 Ph: 503-291-2560 Fax: 503-297-1090 [email protected] www.ocpip.org