ADK
Copyright Information
© 2005 Analog Devices, Inc., ALL RIGHTS RESERVED. This document may not be reproduced in any form without prior, express written consent from Analog Devices, Inc. Printed in the USA.
Disclaimer
Analog Devices, Inc. reserves the right to change this product without prior notice. Information furnished by Analog Devices is believed to be accurate and reliable. However, no responsibility is assumed by Analog Devices for its use; nor for any infringement of patents or other rights of third parties which may result from its use. No license is granted by implication or otherwise under the patent rights of Analog Devices, Inc.
CONTENTS
INTRODUCTION
Design Advantages ........................................................................ 1-1 Architecture Overview ................................................................... 1-5 Processor Core ......................................................................... 1-5 Processing Elements ............................................................ 1-6 Program Sequence Control .................................................. 1-7 Processor Internal Buses .................................................... 1-10 Processor Peripherals .............................................................. 1-11 Dual-Ported Internal Memory (SRAM) ............................. 1-11 External Port ..................................................................... 1-12 I/O Processor .................................................................... 1-14 JTAG Port ............................................................................. 1-16 Differences From Previous SHARC Processors ............................. 1-16 Processor Core Enhancements ................................................ 1-17 Processor Internal Bus Enhancements ..................................... 1-17 Memory Organization Enhancements .................................... 1-18 External Port Enhancements .................................................. 1-18 Host Interface Enhancements ............................................ 1-18 Multiprocessor Interface Enhancements ............................. 1-19
IO Architecture Enhancements .............................................. 1-19 DMA Controller Enhancements ........................................ 1-19 Link Port Enhancements ................................................... 1-19 Instruction Set Enhancements ............................................... 1-20 For More Information About Analog Products ............................. 1-21 For Technical or Customer Support ............................................. 1-22 What's New in This Manual ....................................................... 1-22 Related Documents .................................................................... 1-23 Conventions ............................................................................... 1-24
PROCESSING ELEMENTS
Setting Computational Modes ...................................................... 2-4 32-Bit (Normal Word) Floating-Point Format .......................... 2-4 40-Bit Floating-Point Format .................................................. 2-5 16-Bit (Short Word) Floating-Point Format ............................. 2-6 32-Bit Fixed-Point Format ....................................................... 2-6 Rounding Mode ...................................................................... 2-7 Using Computational Status ......................................................... 2-8 Arithmetic Logic Unit (ALU) ........................................................ 2-9 ALU Operation ....................................................................... 2-9 ALU Saturation ..................................................................... 2-10 ALU Status Flags ................................................................... 2-11 ALU Instruction Summary .................................................... 2-12 Multiply-Accumulator (Multiplier) ........................................... 2-15 Multiplier Operation ............................................................. 2-15
Multiplier (Fixed-Point) Result Register ................................. 2-16 Multiplier Status Flags ........................................................... 2-19 Multiplier Instruction Summary ............................................ 2-20 Barrel-Shifter (Shifter) ................................................................. 2-23 Shifter Operation .................................................................. 2-23 Shifter Status Flags ................................................................ 2-27 Shifter Instruction Summary .................................................. 2-28 Data Register File ........................................................................ 2-30 Alternate (Secondary) Data Registers ........................................... 2-32 Multifunction Computations ...................................................... 2-34 Secondary Processing Element (PEy) ............................................ 2-37 Dual Compute Units Sets ...................................................... 2-39 Dual Register Files ................................................................. 2-42 Dual Alternate Registers ........................................................ 2-43 SIMD (Computational) Operations ....................................... 2-43 SIMD And Status Flags ......................................................... 2-46
PROGRAM SEQUENCER
Instruction Pipeline ...................................................................... 3-7 Instruction Cache ......................................................................... 3-8 Using the Cache .................................................................... 3-11 Optimizing Cache Usage ....................................................... 3-11 Branches and Sequencing ............................................................ 3-13 Conditional Branches ............................................................ 3-15 Delayed Branches .................................................................. 3-15
Restrictions and Limitations When Using Delayed Branches .......................................................... 3-19 Loops and Sequencing ................................................................ 3-22 Restrictions on Ending Loops ................................................ 3-25 Restrictions on Short Loops .................................................. 3-26 Loop Address Stack ............................................................... 3-29 Loop Counter Stack .............................................................. 3-30 Interrupts and Sequencing .......................................................... 3-34 Sensing Interrupts ................................................................. 3-40 Masking Interrupts ............................................................... 3-41 Latching Interrupts ............................................................... 3-42 Stacking Status During Interrupts .......................................... 3-44 Nesting Interrupts ................................................................. 3-45 Reusing Interrupts ................................................................ 3-47 Interrupting IDLE ................................................................ 3-48 Multiprocessing Interrupts .................................................... 3-49 Timer and Sequencing ................................................................ 3-50 Stacks and Sequencing ................................................................ 3-52 Conditional Sequencing .............................................................. 3-53 SIMD Mode and Sequencing ...................................................... 3-57 Conditional Compute Operations ......................................... 3-58 Conditional Branches and Loops ........................................... 3-59 Conditional Data Moves ....................................................... 3-59 Case 1: Complementary Register Pair Data Move .............. 3-60
Case 2: Uncomplemented-to-Complementary Register Move ................................................................ 3-63 Case 3: Complementary Register => Uncomplementary Register .......................................................................... 3-64 Case 4: Data Move Involves External Memory or IOP Memory Space ........................................................ 3-65 Conditional DAG Operations ................................................ 3-66
MEMORY
Internal Memory .......................................................................... 5-2 External Memory .......................................................................... 5-2 Processor Architecture .................................................................. 5-4 Off-Chip Memory and Peripherals Interface .................................. 5-6 Buses ............................................................................................ 5-7 Internal Address and Data Buses .............................................. 5-7 Internal Data Bus Exchange .................................................. 5-10 ADSP-21161 Memory Map ........................................................ 5-16 Internal Memory ................................................................... 5-16 Multiprocessor Memory ........................................................ 5-19 External Memory .................................................................. 5-22 Shadow Write FIFO .............................................................. 5-24 Memory Organization and Word Size .................................... 5-25 Placing 32-Bit Words and 48-Bit Words ............................ 5-25 Mixing 32-Bit and 48-Bit Words ....................................... 5-26 Restrictions on Mixing 32-Bit and 48-Bit Words ............... 5-28 48-Bit Word Allocation .................................................... 5-31 Setting Data Access Modes .......................................................... 5-32 SYSCON Register Control Bits ............................................. 5-32 Mode 1 Register Control Bits ................................................ 5-34 Mode 2 Register Control Bits ................................................ 5-34 Wait Register Control Bits ..................................................... 5-34 Using Boot Memory .............................................................. 5-35
Reading From Boot Memory ............................................. 5-35 Writing to Boot Memory ................................................... 5-36 Internal Interrupt Vector Table .............................................. 5-37 Internal Memory Data Width ................................................ 5-37 Memory Bank Size ................................................................ 5-38 External Bus Priority ............................................................. 5-39 Secondary Processor Element (PEy) ........................................ 5-39 Broadcast Register Loads ....................................................... 5-40 Illegal I/O Processor Register Access ....................................... 5-41 Unaligned 64-Bit Memory Access .......................................... 5-41 External Bank X Access Mode ................................................ 5-42 External Bank X Waitstates .................................................... 5-45 Using Memory Access Status ....................................................... 5-46 Accessing Memory ...................................................................... 5-46 Access Word Size ................................................................... 5-47 Long Word (64-Bit) Accesses ............................................. 5-48 Instruction Word (48-Bit) and Extended-Precision Normal Word (40-Bit) Accesses ...................................... 5-50 Normal Word (32-Bit) Accesses ......................................... 5-50 Short Word (16-Bit) Accesses ............................................ 5-51 SISD, SIMD, and Broadcast Load Modes ............................... 5-51 Single and Dual Data Accesses ............................................... 5-52 Data Access Options .............................................................. 5-52 Short Word Addressing of Single Data in SISD Mode ........ 5-54 Short Word Addressing of Single Data in SIMD Mode ....... 5-56
ADSP-21161 SHARC Processor Hardware Reference
Short Word Addressing of Dual-Data in SISD Mode ......... 5-58 Short Word Addressing of Dual-Data in SIMD Mode ....... 5-60 32-Bit Normal Word Addressing of Single Data in SISD Mode ................................................................... 5-62 32-Bit Normal Word Addressing of Single Data in SIMD Mode .................................................................. 5-64 32-Bit Normal Word Addressing of Dual Data in SISD Mode ................................................................... 5-66 32-Bit Normal Word Addressing of Dual Data in SIMD Mode .................................................................. 5-68 Extended Precision Normal Word Addressing of Single Data .................................................................... 5-70 Extended Precision Normal Word Addressing of Dual Data in SISD Mode ....................................................... 5-72 Extended-Precision Normal Word Addressing of Dual Data in SIMD Mode ..................................................... 5-74 Long Word Addressing of Single Data ............................... 5-76 Long Word Addressing of Dual Data in SISD Mode .......... 5-78 Long Word Addressing of Dual Data in SIMD Mode ........ 5-80 Mixed Word Width Addressing of Dual Data in SISD Mode ................................................................... 5-82 Mixed Word Width Addressing of Dual Data in SIMD Mode .................................................................. 5-84 Broadcast Load Access ...................................................... 5-86 Shadow Write FIFO Considerations in SIMD Mode .............. 5-95 Arranging Data in Memory ....................................................... 5-100 Executing Instructions From External Memory .......................... 5-101
32- to 48-Bit Packing Address Generation Scheme ............... 5-109 Total Program Size (32- to 48-Bit Packing) ...................... 5-110 16- to 48-Bit Packing Address Generation Scheme ............... 5-111 Total Program Size (16- to 48-Bit Packing) ...................... 5-111 8- to 48-Bit Packing Address Generation Scheme ................. 5-112 Total Program Size (8- to 48-Bit Packing) ........................ 5-113 No Packing (48- to 48-Bit) Address Generation Scheme ....... 5-113
I/O PROCESSOR
DMA Channel Allocation and Priorities ...................................... 6-16 DMA Interrupt Vector Locations ................................................. 6-18 Booting Modes ........................................................................... 6-20 DMA Controller Operation ........................................................ 6-20 Managing DMA Channel Priority .......................................... 6-22 Chaining DMA Processes ...................................................... 6-25 Transfer Control Block (TCB) Chain Loading ................... 6-26 Setting Up and Starting the Chain ..................................... 6-28 Inserting a TCB in an Active Chain ................................... 6-28 External Port DMA ..................................................................... 6-29 External Port Registers ........................................................... 6-30 External Port FIFO Buffers .................................................... 6-33 External Port DMA Data Packing .......................................... 6-34 32-Bit Bus Downloading ................................................... 6-37 16-Bit Bus Downloading ................................................... 6-38 8-Bit Bus Downloading ..................................................... 6-39
Boot Memory DMA Mode .................................................... 6-42 External Port Buffer Modes ................................................... 6-42 External Port Channel Priority Modes ................................... 6-43 External Port Channel Transfer Modes ................................... 6-46 External Port Channel Handshake Modes .............................. 6-47 Master Mode .................................................................... 6-50 Paced Master Mode .......................................................... 6-54 Slave Mode ....................................................................... 6-55 Handshake Mode ............................................................. 6-57 DMA Handshake Idle Cycle .................................................. 6-64 External-Handshake Mode ................................................ 6-66 Setting Up External Port DMA .............................................. 6-68 Bootloading Through The External Port ................................ 6-70 Host Processor Booting ..................................................... 6-72 PROM Booting ................................................................ 6-74 External Port DMA Programming Examples .......................... 6-76 Link Port DMA .......................................................................... 6-81 Link Port Registers ................................................................ 6-81 Link Port Buffer Modes ......................................................... 6-83 Link Port Channel Priority Modes ......................................... 6-83 Link Port Channel Transfer Modes ........................................ 6-85 Setting Up Link Port DMA ................................................... 6-86 Bootloading Through The Link Port ..................................... 6-88 Link Port DMA Programming Examples ................................ 6-90
Serial Port DMA ......................................................................... 6-95 Serial Port Registers ............................................................... 6-96 Serial Port Buffer Modes ........................................................ 6-97 Serial Port Channel Priority Modes ........................................ 6-99 Serial Port Channel Transfer Modes ....................................... 6-99 Setting Up Serial Port DMA ................................................ 6-100 SPORT DMA Programming Examples ................................. 6-102 SPI Port DMA .......................................................................... 6-108 SPI Port Registers ................................................................ 6-108 SPI Port Buffer .................................................................... 6-109 SPI DMA Channel Priority .................................................. 6-112 Setting Up SPI Port DMA .................................................... 6-112 Bootloading Through the SPI Port ....................................... 6-113 SPI Port DMA Programming Examples ................................ 6-116 Using I/O Processor Status ........................................................ 6-121 External Port Status ............................................................. 6-127 Link Port Status .................................................................. 6-131 Serial Port Status ................................................................. 6-135 SPI Port Status .................................................................... 6-137 Optimizing DMA Throughput .................................................. 6-139 Internal Memory DMA ....................................................... 6-139 External Memory DMA ....................................................... 6-140 System-Level Considerations ................................................ 6-144
EXTERNAL PORT
Setting External Port Modes .......................................................... 7-3 External Memory Interface ........................................................... 7-3 Banked External Memory ........................................................ 7-9 Boot Memory ....................................................................... 7-10 Idle Cycle ......................................................................... 7-10 Data Hold Cycle ............................................................... 7-12 Multiprocessor Memory Space Waitstates and Acknowledge ................................................................. 7-12 Timing External Memory Accesses ......................................... 7-13 Asynchronous Mode Interface Timing ............................... 7-14 Synchronous Mode Interface Timing ................................ 7-18 Synchronous Burst Mode Interface Timing ....................... 7-26 Using External SBSRAM ....................................................... 7-36 SBSRAM Restrictions ........................................................... 7-41 Host Processor Interface ............................................................. 7-42 Acquiring the Bus ................................................................. 7-44 Asynchronous Transfers ......................................................... 7-48 Host Transfer Timing ............................................................ 7-51 Host Interface Deadlock Resolution With SBTS .................... 7-54 Slave Reads and Writes .......................................................... 7-55 IOP Shadow Registers ....................................................... 7-55 Instruction Transfers ......................................................... 7-56 Slave Write Latency .......................................................... 7-56
Slave Reads ....................................................................... 7-57 Broadcast Writes .................................................................... 7-57 Data Transfers Through the EPBx Buffers .............................. 7-58 DMA Transfers ...................................................................... 7-58 Host Data Packing ................................................................. 7-59 Packing Mode Variations For Host Accesses ........................... 7-61 IOP Register Host Accesses ............................................... 7-62 LINK Port Buffer Access ................................................... 7-63 EPBx Buffer Accesses ........................................................ 7-64 8- to 32-Bit Data Packing .................................................. 7-66 16- to 32-Bit Packing ........................................................ 7-69 48-Bit Instruction Packing ................................................ 7-74 Host Interface Status ............................................................. 7-76 Interprocessor Messages and Vector Interrupts ........................ 7-76 Message Passing (MSGRx) ................................................ 7-77 Host Vector Interrupts (VIRPT) ........................................ 7-78 System Bus Interfacing .......................................................... 7-78 Access to the Processor Bus - Slave Processor ..................... 7-79 Access to the System Bus - Master Processor ...................... 7-79 Processor Core Access to System Bus ................................. 7-82 Deadlock Resolution ......................................................... 7-82 DMA Access to System Bus ............................................... 7-84 Multiprocessing With Local Memory ................................. 7-85 ADSP-21161 to Microprocessor Interface .......................... 7-85
Multiprocessor (MP) Interface .................................................... 7-87 Multiprocessing System Architectures .................................... 7-90 Data Flow Multiprocessing ............................................... 7-90 Cluster Multiprocessing .................................................... 7-91 Multiprocessor Bus Arbitration .............................................. 7-93 Bus Arbitration Protocol ................................................... 7-95 Bus Arbitration Priority (RPBA) ....................................... 7-98 Bus Mastership Timeout ................................................. 7-101 Priority Access ................................................................ 7-103 Bus Synchronization After Reset .......................................... 7-105 Booting Another Processor .................................................. 7-108 Multiprocessor Writes and Reads ......................................... 7-109 Instruction Transfers ....................................................... 7-110 Bus Lock and Semaphores ................................................... 7-110 Multiprocessor Interface Status ....................................... 7-112
SDRAM INTERFACE
SDRAM Pin Connections ............................................................. 8-7 SDRAM Timing Specifications ..................................................... 8-8 SDRAM Control Register (SDCTL) ............................................. 8-9 SDRAM Configuration for Runtime ........................................... 8-10 Setting the Refresh Counter Value (SDRDIV) ....................... 8-13 Setting the SDRAM Clock Enables ........................................ 8-14 Setting the Number of SDRAM Banks (SDBN) ..................... 8-15 Setting the External Memory Bank (SDEMx) ........................ 8-16
Setting the SDRAM Buffering Option (SDBUF) .................... 8-16 Selecting the CAS Latency Value (SDCL) ............................... 8-17 Selecting the SDRAM Page Size (SDPGS) .............................. 8-18 Setting the SDRAM Power-Up Mode (SDPM) ....................... 8-19 Starting the SDRAM Power-Up Sequence (SDPSS) ................ 8-19 Starting Self-Refresh Mode (SDSRF) ..................................... 8-20 Selecting the Active Command Delay (SDTRAS) ................... 8-20 Selecting the Precharge Delay (SDTRP) ................................. 8-21 Selecting the RAS-to-CAS Delay (SDTRCD) ......................... 8-21 SDRAM Controller Standard Operation ...................................... 8-22 Understanding DAG and DMA Operation ............................. 8-22 Multiprocessing Operation .................................................... 8-24 Accessing SDRAM ................................................................ 8-25 Address Mapping for SDRAM ........................................... 8-27 Understanding DQM Operation ............................................ 8-29 Executing a Parallel Refresh Command During Host Control ...................................................................... 8-29 Powering Up After Reset ........................................................ 8-30 Entering and Exiting Self-Refresh Mode ................................. 8-31 SDRAM Controller Commands .................................................. 8-31 Bank Activate (ACT) Command ............................................ 8-32 Mode Register Set (MRS) ...................................................... 8-32 Precharge Command (PRE) ................................................... 8-33 Read/Write Command ........................................................... 8-34 Read Commands ............................................................... 8-34
Write Commands ............................................................. 8-36 DMA Transfers ................................................................. 8-37 Refresh (REF) Command ...................................................... 8-37 Setting the Delay Between Refresh Commands .................. 8-37 Understanding Multiprocessing Operation ........................ 8-38 Self Refresh Command (SREF) .............................................. 8-39 Programming Example .......................................................... 8-40
LINK PORTS
Link Port to Link Buffer Assignment ............................................. 9-3 Link Port DMA Channels ............................................................. 9-4 Link Port Booting ......................................................................... 9-5 Setting Link Port Modes ............................................................... 9-5 Link Port Control Register (LCTL) Bit Descriptions ................ 9-7 Link Data Path and Compatibility Modes ................................ 9-9 Using Link Port Handshake Signals ............................................. 9-10 Using Link Buffers ...................................................................... 9-12 Core Processor Access To Link Buffers ................................... 9-13 Host Processor Access To Link Buffers ................................... 9-14 Using Link Port DMA ................................................................ 9-16 Using Link Port Interrupts .......................................................... 9-17 Link Port Interrupts With DMA Enabled .............................. 9-18 Link Port Interrupts With DMA Disabled ............................. 9-19 Link Port Service Request Interrupts (LSRQ) ......................... 9-19 Detecting Errors on Link Transmissions ...................................... 9-22
Link Port Programming Examples .......................................... 9-23 Using Token Passing With Link Ports .......................................... 9-27 Designing Link Port Systems ....................................................... 9-30 Terminations for Link Transmission Lines .............................. 9-30 Peripheral I/O Using Link Ports ............................................. 9-31 Data Flow Multiprocessing With Link Ports ........................... 9-33
SERIAL PORTS
Serial Port Pins ........................................................................... 10-3 SPORT Interrupts ...................................................................... 10-7 SPORT Reset .............................................................................. 10-8 SPORT Control Registers and Data Buffers ................................. 10-9 Serial Port Control Registers (SPCTLx) ................................ 10-14 Register Writes and Effect Latency ................................... 10-30 Transmit and Receive Data Buffers ....................................... 10-30 Clock and Frame Sync Frequencies (DIV) ............................ 10-33 Data Word Formats ................................................................... 10-35 Word Length ....................................................................... 10-36 Endian Format .................................................................... 10-36 Data Packing and Unpacking ............................................... 10-37 Data Type ....................................................................... 10-37 Companding ....................................................................... 10-39 Clock Signal Options ................................................................ 10-40 Frame Sync Options .................................................................. 10-41 Framed Versus Unframed ..................................................... 10-41
Internal Versus External Frame Syncs ................................... 10-42 Active Low Versus Active High Frame Syncs ........................ 10-43 Sampling Edge for Data and Frame Syncs ............................ 10-43 Early Versus Late Frame Syncs ............................................. 10-44 Data-Independent Transmit Frame Sync .............................. 10-45 SPORT Loopback .................................................................... 10-46 SPORT Operation Modes ......................................................... 10-47 I2S Mode ............................................................................ 10-48 Setting Internal Serial Clock and Frame Sync Rates ......... 10-49 I2S Control Bits ............................................................. 10-49 Setting Word Length (SLEN) .......................................... 10-49 Selecting Transmit Receive Channel Order (L_FIRST) .... 10-49 Selecting the Frame Sync Options (FS_BOTH) ............... 10-50 Enabling SPORT Master Mode (MSTR) ......................... 10-50 Enabling SPORT DMA (SDEN) .................................... 10-51 Multichannel Operation ...................................................... 10-52 Frame Syncs in Multichannel Mode ................................ 10-54 Multichannel Control Bits in SPCTL .............................. 10-55 Channel Selection Registers ............................................ 10-57 Transferring Data to Memory ................................................... 10-58 DMA Block Transfers .......................................................... 10-59 Setting Up DMA on SPORT Channels ........................... 10-60 SPORT DMA Parameter Registers ....................................... 10-61 SPORT DMA Chaining ................................................. 10-65
Single-Word Transfers .......................................................... 10-65 SPORT Pin/Line Terminations .................................................. 10-66 SPORT Programming Examples ................................................ 10-67
Interrupt and DMA Driven Transfers .............................. 11-26 Core Driven Transfers ..................................................... 11-26 Automatic Slave Selection ............................................... 11-26 User Controlled Slave Selection ....................................... 11-27 Slave Mode Operation ......................................................... 11-28 Error Signals and Flags ............................................................. 11-29 Multi-Master Error (MME) ................................................. 11-30 Transmission Error (TXE) ................................................... 11-30 Reception Error (RBSY) ...................................................... 11-31 SPI/Link Port DMA ................................................................. 11-32 DMA Operation in SPI Master Mode .................................. 11-32 DMA Operation in Slave Mode ........................................... 11-33 SPI Booting .............................................................................. 11-34 32-Bit SPI Host Boot .......................................................... 11-38 16-Bit SPI Host Boot .......................................................... 11-39 8-Bit SPI Host Boot ............................................................ 11-41 Multiprocessor SPI Port Booting ..................................... 11-42 SPI Programming Example ....................................................... 11-44
EMUPC Shift Register .......................................................... 12-7 EMUCTL Shift Register ........................................................ 12-8 EMUSTAT Shift Register .................................................... 12-11 BRKSTAT Shift Register ..................................................... 12-12 MEMTST Shift Register ...................................................... 12-13 PSx, DMx, IOx, and EPx (Breakpoint) Registers .................. 12-13 EMUN Register .................................................................. 12-16 EMUCLK and EMUCLK2 Registers ................................... 12-16 EMUIDLE Instruction ........................................................ 12-17 In Circuit Signal Analyzer (ICSA) Function ......................... 12-17 Boundary Register ..................................................................... 12-17 Device Identification Register .................................................... 12-28 Built-In Self-Test Operation (BIST) .......................................... 12-28 Private Instructions ................................................................... 12-28 References ................................................................................. 12-29
SYSTEM DESIGN
Pin Descriptions ......................................................................... 13-2 Input Synchronization Delay ............................................... 13-18 Pin States At Reset ............................................................... 13-19 Pull-Up and Pull-Down Resistors ......................................... 13-22 Clock Derivation ................................................................. 13-24 Timing Specifications ...................................................... 13-25 RESET and CLKIN ............................................................ 13-28 Reset Generators ................................................................. 13-31
Interrupt and Timer Pins .................................................... 13-33 Core-Based Flag Pins ........................................................... 13-34 Flag Inputs ..................................................................... 13-34 Flag Outputs .................................................................. 13-34 Programmable I/O Flags ..................................................... 13-35 Example #1: Configuring FLGx as Output Flags ............. 13-37 Example #2: Configuring FLGx as Input Flags ................ 13-38 System Design Considerations for Flags ............................... 13-38 Example #3: Programming 2:1 Clock Ratio ..................... 13-40 Example #4: Programming 3:1 Clock Ratio ..................... 13-40 Example #5: Programming 4:1 Clock Ratio ..................... 13-40 JTAG Interface Pins ............................................................ 13-41 Dual-Voltage Power-up Sequencing ........................................... 13-41 PLL Start-Up (Revisions 1.0/1.1) ........................................ 13-44 Power On Reset (POR) Circuit ....................................... 13-44 PLL CLKIN Enable Circuit ............................................ 13-46 PLL Start-Up (Revision 1.2) ................................................ 13-48 Designing For JTAG Emulation ................................................ 13-49 Target Board Connector ...................................................... 13-50 Layout Requirements ................................................................ 13-54 Power Sequence for Emulation .................................................. 13-56 Additional JTAG Emulator References ...................................... 13-56 Pod Specifications ..................................................................... 13-56 JTAG Pod Connector .......................................................... 13-57
3.3 V Pod Logic .................................................................. 13-58 2.5 V Pod Logic .................................................................. 13-59 Conditioning Input Signals ....................................................... 13-60 Link Port Input Filter Circuits ............................................. 13-60 RESET Input Hysteresis ...................................................... 13-61 Designing For High Frequency Operation ................................. 13-62 Clock Specifications and Jitter ............................................. 13-63 Clock Distribution .............................................................. 13-63 Point-to-Point Connections ................................................. 13-65 Signal Integrity .................................................................... 13-67 Other Recommendations and Suggestions ............................ 13-68 Decoupling Capacitors and Ground Planes .......................... 13-69 Oscilloscope Probes ............................................................. 13-70 Recommended Reading ....................................................... 13-71 Booting Single and Multiple Processors ..................................... 13-71 Multiprocessor Host Booting ............................................... 13-73 Multiprocessor EPROM Booting ......................................... 13-73 Booting From a Single EPROM ...................................... 13-73 Sequential Booting .......................................................... 13-74 Multiprocessor Link Port Booting ........................................ 13-75 Multiprocessor Booting From External Memory ................... 13-75 Data Delays, Latencies, and Throughput ................................... 13-76 Execution Stalls ................................................................... 13-77 DAG Stalls .......................................................................... 13-77
Memory Stalls ..................................................................... 13-77 IOP Register Stalls .............................................................. 13-78 DMA Stalls ......................................................................... 13-78 Link Port and Serial Port Stalls ............................................ 13-78
REGISTERS
Control and Status System Registers .............................................. A-2 Mode Control 1 Register (MODE1) ........................................ A-3 Mode Mask Register (MMASK) .............................................. A-8 Mode Control 2 Register (MODE2) ...................................... A-10 Arithmetic Status Registers (ASTATx and ASTATy) ............... A-13 Sticky Status Registers (STKYx and STKYy) .......................... A-18 User-Defined Status Registers (USTATx) ............................... A-22 Processing Element Registers ....................................................... A-23 Data File Data Registers (Rx, Fx, Sx) ..................................... A-23 Multiplier Results Registers (MRFx, MRBx) .......................... A-24 Program Memory Bus Exchange Register (PX) ....................... A-25 Program Sequencer Registers ....................................................... A-25 Interrupt Latch Register (IRPTL) .......................................... A-27 Interrupt Mask Register (IMASK) ......................................... A-31 Interrupt Mask Pointer Register (IMASKP) ........................... A-32 Link Port Interrupt Register (LIRPTL) .................................. A-34 Flag Value Register (FLAGS) ................................................. A-37 IOFLAG Value Register ........................................................ A-38 Program Counter Register (PC) ............................................. A-41
Program Counter Stack Register (PCSTK) ............................ A-44 Program Counter Stack Pointer Register (PCSTKP) .............. A-44 Fetch Address Register (FADDR) .......................................... A-44 Decode Address Register (DADDR) ...................................... A-44 Loop Address Stack Register (LADDR) ................................. A-45 Current Loop Counter Register (CURLCNTR) .................... A-45 Loop Counter Register (LCNTR) ......................................... A-45 Timer Period Register (TPERIOD) ....................................... A-46 Timer Count Register (TCOUNT) ....................................... A-46 Data Address Generator Registers ............................................... A-46 Index Registers (Ix) ............................................................... A-47 Modify Registers (Mx) .......................................................... A-47 Length and Base Registers (Lx,Bx) ........................................ A-47 I/O Processor Registers ............................................................... A-47 System Configuration Register (SYSCON) ............................ A-60 Vector Interrupt Address Register (VIRPT) ........................... A-63 External Memory Waitstate and Access Mode Register (WAIT) ............................................................................. A-65 System Status Register (SYSTAT) .......................................... A-69 SDRDIV Register (SDRDIV) ............................................... A-72 SDRAM Control Register (SDCTL) ..................................... A-73 External Port DMA Buffer Registers (EPBx) .......................... A-76 Message Registers (MSGRx) ................................................. A-77 PC Shadow Register (PC_SHDW) ........................................ A-77 MODE2 Shadow Register (MODE2_SHDW) ...................... A-78
Bus Time-Out Maximum Register (BMAX) ........................... A-79 Bus (Time-Out) Counter Register (BCNT) ........................... A-79 External Port DMA Control Registers (DMACx) ................... A-80 Internal Memory DMA Index Registers (IIx) ......................... A-87 Internal Memory DMA Modifier Registers (IMx) .................. A-87 Internal Memory DMA Count Registers (Cx) ........................ A-87 Chain Pointer For Next DMA TCB Registers (CPx) .............. A-88 General Purpose DMA Registers (GPx) ................................. A-89 External Memory DMA Index Registers (EIEPx) ................... A-89 External Memory DMA Modifier Registers (EMEPx) ............ A-89 External Memory DMA Count Registers (ECEPx) ................. A-90 DMA Channel Status Register (DMASTAT) .......................... A-90 Link Port Buffer Registers (LBUFx) ....................................... A-92 Link Port Buffer Control Register (LCTL) ............................. A-92 Link Port Service Request & Mask Register (LSRQ) ............... A-98 Serial Port Registers ............................................................. A-100 SPORT Serial Control Registers (SPCTLx) ..................... A-100 SPORT Multichannel Control Registers (SPxyMCTL) .... A-109 SPORT Transmit Buffer Registers (TXx) ......................... A-111 SPORT Receive Buffer Registers (RXx) ........................... A-111 SPORT Divisor Registers (DIVx) .................................... A-112 SPORT Count Registers (CNTx) .................................... A-113 SPORT Transmit Select Registers (MT2CSx and MT3CSx) .................................................................... A-113
SPORT Transmit Compand Registers (MT2CCSx and MT3CCSx) ................................................................. A-113 SPORT Receive Select Registers ..................................... A-114 SPORT Receive Compand Registers ............................... A-114 Serial Peripheral Interface Registers ........................................... A-114 SPI Port Status Register ...................................................... A-115 SPI Control Register (SPICTL) ........................................... A-117 SPI Receive Buffer Register (SPIRX) ................................... A-120 SPI Transmit Buffer Register (SPITX) ................................. A-121 Register and Bit #Defines (def21161.h) .................................... A-121
GLOSSARY INDEX
1 INTRODUCTION
Thank you for purchasing the Analog Devices SHARC digital signal processor (DSP).
Design Advantages
The ADSP-21161 processor is a high-performance 32-bit processor used for medical imaging, communications, military, audio, test equipment, 3D graphics, speech recognition, motor control, imaging, and other applications. This processor builds on the ADSP-21000 Family processor core to form a complete system-on-a-chip, adding a dual-ported on-chip SRAM, integrated I/O peripherals, and an additional processing element for Single-Instruction-Multiple-Data (SIMD) support.
The SHARC architecture balances a high-performance processor core with high-performance buses (PM, DM, IO). In the core, every instruction can execute in a single cycle. The buses and instruction cache provide rapid, unimpeded data flow to the core to maintain the execution rate.
Figure 1-1 shows a detailed block diagram of the processor, which illustrates the following architectural features:
• Two processing elements (PEx and PEy), each containing 32-bit IEEE floating-point computation units (multiplier, ALU, and shifter) and a data register file
• Program sequencer with related instruction cache, interval timer, and Data Address Generators (DAG1 and DAG2)
• Dual-ported SRAM
• External port for interfacing to off-chip memory such as SDRAM, peripherals, hosts, and multiprocessor systems
• Input/Output (IO) processor with integrated DMA controller, SPI-compatible port, serial ports, and link ports for point-to-point multiprocessor communications
• JTAG Test Access Port for emulation
Figure 1-1. ADSP-21161 SHARC Block Diagram
[Block diagram not reproduced. Labeled blocks: core processor (timer, 32 x 48-bit instruction cache, DAG1 and DAG2, and two sets of multiplier, barrel shifter, and ALU), dual-ported SRAM (block 0 and block 1), PM and DM data buses with bus connect (PX), external port (address bus mux, multiprocessor interface), JTAG test and emulation, and I/O processor (memory-mapped IOP registers for control, status, and data buffers; DMA controller; serial ports; link ports).]
Figure 1-1 also shows the three on-chip buses of the ADSP-21161 processor: the Program Memory (PM) bus, Data Memory (DM) bus, and Input/Output (IO) bus. The PM bus provides access to either instructions
or data. During a single cycle, these buses let the processor access two data operands from memory, access an instruction (from the cache), and perform a DMA transfer.
The buses connect to the ADSP-21161 processor external port, which provides the processor interface to external memory, memory-mapped I/O, a host processor, and additional multiprocessing ADSP-21161 processors. The external port performs bus arbitration and supplies control signals to shared, global memory and I/O devices.
Figure 1-2 illustrates a typical single-processor system. The ADSP-21161 processor includes extensive support for multiprocessor systems as well. For more information, see Multiprocessor (MP) Interface on page 7-87.
Further, the ADSP-21161 processor addresses the five central requirements for DSPs:
• Fast, flexible arithmetic computation units
• Unconstrained data flow to and from the computation units
• Extended precision and dynamic range in the computation units
• Dual address generators with circular buffering support
• Efficient program sequencing
Fast, Flexible Arithmetic. The ADSP-21000 Family processors execute all instructions in a single cycle. They provide fast cycle times and a complete set of arithmetic operations. The processor is IEEE floating-point compatible and allows either interrupt on arithmetic exception or latched status exception handling.
Unconstrained Data Flow. The ADSP-21161 processor has a Super Harvard Architecture combined with a 10-port data register file. In every cycle, the processor can write or read two operands to or from the register
file, supply two operands to the ALU, supply two operands to the multiplier, and receive three results from the ALU and multiplier. The processor's 48-bit orthogonal instruction word supports parallel data transfers and arithmetic operations in the same instruction.
Figure 1-2. Typical Single Processor System
[System diagram not reproduced. It shows the ADSP-21161 with its clock, control, address, and data connections to optional external memory and peripherals and to optional SDRAM.]
40-Bit Extended Precision. The processor handles 32-bit IEEE floating-point format, 32-bit integer and fractional formats (two's-complement and unsigned), and extended-precision 40-bit floating-point format. The processors carry extended precision throughout their computation units, limiting intermediate data truncation errors.
Dual Address Generators. The processor has two Data Address Generators (DAGs) that provide immediate or indirect (pre- and post-modify) addressing. Modulus, bit-reverse, and broadcast operations are supported with no constraints on data buffer placement.
Efficient Program Sequencing. In addition to zero-overhead loops, the processor supports single-cycle setup and exit for loops. Loops are both nestable (six levels in hardware) and interruptible. The processors support both delayed and non-delayed branches.
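The bit-reverse addressing mentioned above is typically used to reorder FFT input or output data. As a rough illustration only, the following C sketch shows the index mapping that such addressing produces. The function name `bit_reverse` is hypothetical, and the sketch models the arithmetic in software; it does not describe how the DAG hardware is programmed.

```c
#include <stdint.h>

/* Illustrative software model (assumption, not the DAG hardware):
 * reverse the low `bits` bits of `index`, as needed when reordering
 * the samples of a 2^bits-point FFT. */
uint32_t bit_reverse(uint32_t index, unsigned bits)
{
    uint32_t reversed = 0;
    for (unsigned i = 0; i < bits; i++) {
        /* Shift the result left and append the current LSB of index. */
        reversed = (reversed << 1) | (index & 1u);
        index >>= 1;
    }
    return reversed;
}
```

For an 8-point transform (3 index bits), index 1 (binary 001) maps to 4 (binary 100), and index 3 (011) maps to 6 (110).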
Architecture Overview
The ADSP-21161 processor forms a complete system-on-a-chip, integrating a large, high-speed SRAM and I/O peripherals supported by a dedicated I/O bus. The following sections summarize the features of each functional block in the ADSP-21161 processor SHARC architecture, which appears in Figure 1-1 on page 1-2. With each summary, a cross reference points to the sections where the features are described in greater detail.
Processor Core
The processor core of the ADSP-21161 processor consists of two processing elements (each with three computation units and data register file), a program sequencer, two data address generators, a timer, and an instruction cache. All digital signal processing occurs in the processor core.
Processing Elements
The processor core contains two processing elements (PEx and PEy). Each element contains a data register file and three independent computation units: an ALU, a multiplier with a fixed-point accumulator, and a shifter. To meet a wide variety of processing needs, the computation units process data in three formats: 32-bit fixed-point, 32-bit floating-point, and 40-bit floating-point. The floating-point operations are single-precision IEEE-compatible. The 32-bit floating-point format is the standard IEEE format, whereas the 40-bit extended-precision format has eight additional Least Significant Bits (LSBs) of mantissa for greater accuracy.
The ALU performs a set of arithmetic and logic operations on both fixed-point and floating-point formats. The multiplier performs floating-point or fixed-point multiplication and fixed-point multiply/add or multiply/subtract operations. The shifter performs logical and arithmetic shifts, bit manipulation, field deposit and extraction, and exponent derivation operations on 32-bit operands.
These computation units perform single-cycle operations; there is no computation pipeline. All units are connected in parallel, rather than serially. The output of any unit may serve as the input of any unit on the next cycle. In a multifunction computation, the ALU and multiplier perform independent, simultaneous operations.
Each processing element has a general-purpose data register file that transfers data between the computation units and the data buses and stores intermediate results. A register file has two sets (primary and secondary) of sixteen registers each, for fast context switching. All of the registers are 40 bits wide. The register file, combined with the core processor's Super Harvard architecture, allows unconstrained data flow between computation units and internal memory.
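To make the relationship between the 32-bit and 40-bit floating-point formats more concrete, the following C sketch models a 40-bit word in the low bits of a 64-bit integer. It assumes a layout of 1 sign bit, 8 exponent bits, and 31 fraction bits (the 23 IEEE fraction bits plus the eight extra LSBs described above). The function names are hypothetical, and `ext40_from_double` handles only normal, finite values inside the single-precision exponent range; it truncates rather than rounds.

```c
#include <stdint.h>
#include <string.h>

/* Widen an IEEE single to the assumed 40-bit layout:
 * the eight additional mantissa LSBs are simply zero. */
uint64_t ext40_from_float(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return (uint64_t)bits << 8;
}

/* Narrow a 40-bit word to an IEEE single by dropping the eight LSBs. */
float ext40_to_float(uint64_t ext)
{
    uint32_t bits = (uint32_t)(ext >> 8);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Build a 40-bit word from a double, keeping 31 fraction bits.
 * Assumption: d is normal, finite, and within float's exponent range. */
uint64_t ext40_from_double(double d)
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    uint64_t sign = bits >> 63;
    /* Re-bias the exponent from double (1023) to single (127). */
    int64_t exponent = (int64_t)((bits >> 52) & 0x7FF) - 1023 + 127;
    /* Keep the top 31 of the 52 double fraction bits. */
    uint64_t fraction = (bits & 0xFFFFFFFFFFFFFULL) >> (52 - 31);
    return (sign << 39) | ((uint64_t)exponent << 31) | fraction;
}
```

Under these assumptions, the value 1 + 2^-31 is distinguishable from 1.0 in the 40-bit word, whereas a 32-bit float would round it to 1.0.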
Primary Processing Element (PEx). PEx processes all computational instructions whether the processor is in Single-Instruction, Single-Data (SISD) or Single-Instruction, Multiple-Data (SIMD) mode. This element corresponds to the computational units and register file in previous ADSP-21000 family DSPs.
Secondary Processing Element (PEy). PEy processes each computational instruction in lock-step with PEx, but only processes these instructions when the processor is in SIMD mode. Because many operations are influenced by this mode, more information on SIMD is available in multiple locations:
• For information on PEy operations, see Processing Elements on page 2-1
• For information on data addressing in SIMD mode, see Addressing in SISD and SIMD Modes on page 4-18
• For information on data accesses in SIMD mode, see SISD, SIMD, and Broadcast Load Modes on page 5-51
• For information on multiprocessing in SIMD mode, see Multiprocessor (MP) Interface on page 7-87
• For information on SIMD programming, see the ADSP-21160 SHARC DSP Instruction Set Reference
Program Sequence Control
Internal controls for ADSP-21161 processor program execution come from four functional blocks: program sequencer, data address generators, timer, and instruction cache. Two dedicated address generators and a program sequencer supply addresses for memory accesses. Together the sequencer and data address generators allow computational operations to execute with maximum efficiency since the computation units can be devoted exclusively to processing data. With its instruction cache, the
ADSP-21161 processor can simultaneously fetch an instruction from the cache and access two data operands from memory. The data address generators implement circular data buffers in hardware.
Program Sequencer. The program sequencer supplies instruction addresses to program memory. It controls loop iterations and evaluates conditional instructions. With an internal loop counter and loop stack, the ADSP-21161 processor executes looped code with zero overhead. No explicit jump instructions are required to loop or to decrement and test the counter.
The ADSP-21161 processor achieves its fast execution rate by means of pipelined fetch, decode, and execute cycles. If external memories are used, they are allowed more time to complete an access than if there were no decode cycle.
Data Address Generators. The Data Address Generators (DAGs) provide memory addresses when data is transferred between memory and registers. Dual data address generators enable the processor to output simultaneous addresses for two operand reads or writes. DAG1 supplies 32-bit addresses to data memory. DAG2 supplies 32-bit addresses to program memory for program memory data accesses.
Each DAG keeps track of up to eight address pointers, eight modifiers, and eight length values. A pointer used for indirect addressing can be modified by a value in a specified register, either before (pre-modify) or after (post-modify) the access. A length value may be associated with each pointer to perform automatic modulo addressing for circular data buffers; the circular buffers can be located at arbitrary boundaries in memory. Each DAG register has a secondary register that can be activated for fast context switching.
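The post-modify and modulo addressing behavior described above can be sketched in C as a software model. The `dag_pointer` structure and `dag_post_modify` function are hypothetical names for illustration; the fields loosely mirror the index, length, and base values a DAG tracks, and the model assumes the magnitude of the modifier does not exceed the buffer length. It is not a description of the hardware implementation.

```c
#include <stdint.h>

/* Hypothetical software model of one DAG pointer set. */
typedef struct {
    uint32_t base;    /* first address of the circular buffer */
    uint32_t length;  /* number of locations; 0 means no wraparound */
    uint32_t index;   /* current address pointer */
} dag_pointer;

/* Post-modify access: return the current address, then advance the
 * index by `modify`, wrapping inside [base, base + length).
 * Assumption: |modify| <= length. */
uint32_t dag_post_modify(dag_pointer *p, int32_t modify)
{
    uint32_t address = p->index;
    int64_t next = (int64_t)p->index + modify;

    if (p->length != 0) {
        int64_t end = (int64_t)p->base + p->length;
        if (next >= end) {
            next -= p->length;          /* wrapped past the end */
        } else if (next < (int64_t)p->base) {
            next += p->length;          /* wrapped before the start */
        }
    }
    p->index = (uint32_t)next;
    return address;
}
```

With a four-location buffer starting at 0x1000 and a modifier of 1, four successive accesses visit 0x1000 through 0x1003, and the index then returns to 0x1000 without any explicit comparison in the calling code.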
Circular buffers allow efficient implementation of delay lines and other data structures required in digital signal processing, and are commonly used in digital filters and Fourier transforms. The DAGs automatically handle address pointer wraparound, reducing overhead, increasing performance, and simplifying implementation.
Interrupts. The ADSP-21161 processor has four external hardware interrupts: three general-purpose interrupts, IRQ2-0, and a special interrupt for reset. The processor also has internally generated interrupts for the timer, DMA controller operations, circular buffer overflow, stack overflows, arithmetic exceptions, multiprocessor vector interrupts, and user-defined software interrupts. For the general-purpose external interrupts and the internal timer interrupt, the ADSP-21161 processor automatically stacks the arithmetic status and mode (MODE1) registers in parallel with the interrupt servicing, allowing fifteen nesting levels of very fast service for these interrupts.
Context Switch. Many of the processor's registers have secondary registers that can be activated during interrupt servicing for a fast context switch. The data registers in the register file, the DAG registers, and the multiplier result register all have secondary registers. The primary registers are active at reset, while the secondary registers are activated by control bits in a mode control register.
Timer. The programmable interval timer provides periodic interrupt generation. When enabled, the timer decrements a 32-bit count register every cycle. When this count register reaches zero, the ADSP-21161 processor generates an interrupt and asserts its timer expired output. The count register is automatically reloaded from a 32-bit period register and the count resumes immediately.
Instruction Cache. The program sequencer includes a 32-word instruction cache that enables three-bus operation for fetching an instruction and two data values. The cache is selective; only instructions whose fetches
conflict with program memory data accesses are cached. This caching allows full-speed execution of core, looped operations such as digital filter multiply-accumulates and FFT butterfly processing.
Processor Internal Buses
The processor core has six buses: PM address, PM data, DM address, DM data, IO address, and IO data. Because of the processor's Super Harvard Architecture, data memory stores data operands, while program memory can store both instructions and data. This architecture allows dual data fetches when the instruction is supplied by the cache.
Bus Capacities. The PM address bus and DM address bus transfer the addresses for instructions and data. The PM data bus and DM data bus transfer the data or instructions from each type of memory. The PM address bus is 32 bits wide, allowing access of up to 62 Mwords for non-SDRAM and 254 Mwords for SDRAM banks of mixed instructions and data. The PM data bus is 64 bits wide from (8-, 16-, and 32-bits) to accommodate the 48-bit instructions and 32-bit data.
The DM address bus is 32 bits wide, allowing direct access of up to 4G words of data. The DM data bus is 64 bits wide. The DM data bus provides a path for the contents of any register in the processor to be transferred to any other register or to any data memory location in a single cycle. The data memory address comes from one of two sources: an absolute value specified in the instruction code (direct addressing) or the output of a data address generator (indirect addressing).
The IO address and IO data buses let the IO processor access internal memory for DMA without delaying the processor core. The IO address bus is 18 bits wide, and the IO data bus is 64 bits wide.
Data Transfers. Nearly every register in the processor core is classified as a Universal Register (UREG). Instructions allow transferring data between any two universal registers or between a universal register and memory. This support includes transfers between control registers, status registers,
and data registers in the register file. The PM bus connect (PX) registers permit data to be passed between the 64-bit PM data bus and the 64-bit DM data bus, or between the 40-bit register file and the PM data bus. These registers contain hardware to handle the data width difference. For more information, see Processing Element Registers on page A-23.
Processor Peripherals
The term processor peripherals refers to everything outside the processor core. The ADSP-21161 processor peripherals include internal memory, external port, I/O processor, JTAG port, and any external devices that connect to the processor.
Dual-Ported Internal Memory (SRAM)
The ADSP-21161 processor contains 1 megabit of on-chip SRAM, organized as two blocks of 0.5 Mbits. Each block can be configured for different combinations of code and data storage. Each memory block is dual-ported for single-cycle, independent accesses by the core processor and I/O processor or DMA controller. The dual-ported memory and separate on-chip buses allow two data transfers from the core and one from I/O, all in a single cycle.
All of the memory can be accessed as 16-, 32-, 48-, or 64-bit words. On the ADSP-21161 processor, the memory can be configured as a maximum of 32K words of 32-bit data, 64K words of 16-bit data, 21.25K words of 48-bit instructions (and 40-bit data), or combinations of different word sizes up to 1.0 Mbit.
The processor supports a 16-bit floating-point storage format, which effectively doubles the amount of data that may be stored on chip. Conversion between the 32-bit floating-point and 16-bit floating-point formats completes in a single instruction.
While each memory block can store combinations of code and data, accesses are most efficient when one block stores data, using the DM bus for transfers, and the other block stores instructions and data, using the PM bus for transfers. Using the DM bus and PM bus in this way, with one dedicated to each memory block, assures single-cycle execution with two data transfers. In this case, the instruction must be available in the cache. The processor uses its external port to maintain single-cycle execution when one of the data operands is transferred to or from off-chip.
External Port
The ADSP-21161 processor external port provides the processor interface to off-chip memory and peripherals. The 254 Mword off-chip address space is included in the unified address space of the ADSP-21161 processor. The separate on-chip buses (for PM address, PM data, DM address, DM data, IO address, and IO data) multiplex at the external port to create an external system bus with a single 24-bit address bus and a single 32-bit data bus. The ADSP-21161 processor on-chip DMA controller automatically packs external data into the appropriate word width during transfers.
The ADSP-21161 processor supports instruction packing modes to execute from 48-, 32-, 16-, and 8-bit wide memories. With the link ports disabled, the additional link port pins can be used to execute 48-bit wide instructions. The ADSP-21161 processor also includes 32- to 48-bit, 16- to 48-bit, and 8- to 48-bit execution packing for executing instructions directly from 32-bit, 16-bit, or 8-bit wide external memories. External SDRAM, SRAM, or SBSRAM can be 8, 16, or 32 bits wide for DMA transfers to or from external memory.
On-chip decoding of high-order address lines generates memory bank select signals for addressing external memory devices. The ADSP-21161 processor provides programmable memory waitstates and external memory acknowledge controls for interfacing to peripherals with variable access, hold, and disable time requirements.
SDRAM Interface. The ADSP-21161 processor's integrated on-chip SDRAM controller transfers data to and from synchronous DRAM (SDRAM) at the core clock frequency or one-half the core clock frequency. The synchronous approach, coupled with the core clock frequency, supports high-throughput data transfer: up to 400 Mbytes/second for 32-bit transfers and 600 Mbytes/second for 48-bit transfers. The SDRAM interface provides a glueless interface with standard SDRAMs (16 Mbits, 64 Mbits, 128 Mbits, and 256 Mbits) and includes options to support additional buffers between the ADSP-21161 processor and SDRAM. SDRAMs can be connected to any one of the processor's four external memory banks, with up to all four banks mapped to SDRAM. Systems with several SDRAM devices connected in parallel may require buffering to meet overall system timing requirements. The ADSP-21161 processor supports pipelining of the address and control signals to enable such buffering between itself and multiple SDRAM devices.

Host Processor Interface. The ADSP-21161 processor's host interface allows easy connection to standard 8-bit, 16-bit, and 32-bit microprocessor buses with little additional hardware. The interface supports asynchronous and synchronous transfers at speeds up to half the internal core clock rate of the ADSP-21161 processor. The host interface operates through the processor's external port and maps into the unified address space. Four channels of DMA are available for the host interface; code and data transfers occur with low software overhead. The host can directly read and write the IOP register space of the ADSP-21161 processor and can access the DMA channel setup and mailbox registers. The host can also perform DMA transfers to and from the internal memory of the processor. Vector interrupt support provides for efficient execution of host commands.
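The peak SDRAM figures quoted above follow from word width multiplied by transfer rate. The short Python sketch below reproduces them under the assumption of one word per clock at 100 MHz; that clock rate is not stated in this section and is only the value implied by the 400 and 600 Mbytes/second figures. The function name is illustrative.

```python
def peak_throughput_mbytes(word_bits, clock_mhz):
    """Peak transfer rate in Mbytes/second, assuming one word per clock."""
    return (word_bits // 8) * clock_mhz

# Assumption: implied by the quoted figures, not stated in the text.
ASSUMED_CLOCK_MHZ = 100

print(peak_throughput_mbytes(32, ASSUMED_CLOCK_MHZ))  # 32-bit transfers
print(peak_throughput_mbytes(48, ASSUMED_CLOCK_MHZ))  # 48-bit transfers
```

Running the controller at one-half the core clock frequency would halve both results.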
Multiprocessor System Interface. The ADSP-21161 processor offers powerful features tailored to multiprocessing systems. The unified address space allows direct interprocessor accesses of each ADSP-21161 processor's internal IOP registers. Distributed bus arbitration logic on the processor allows simple, glueless connection of systems containing up to six ADSP-21161 processors and a host processor. Master processor changeover incurs only one cycle of overhead. Bus arbitration handles either fixed or rotating priority. Processor bus lock allows indivisible read-modify-write sequences for semaphores. A vector interrupt capability is provided for interprocessor commands.

I/O Processor

The ADSP-21161 processor's Input/Output Processor (IOP) includes four serial ports, two link ports, a SPI-compatible port, and a DMA controller. One of the processes that the IO processor automates is booting. The processor can boot from the external port (with data from an 8-bit EPROM or a host processor) or a link port. Alternatively, a no-boot mode lets the processor start by executing instructions from external memory without booting.

Serial Ports. The ADSP-21161 processor features four synchronous serial ports that provide an inexpensive interface to a wide variety of digital and mixed-signal peripheral devices. The serial ports can operate at up to half the processor core clock rate. Programmable data direction provides greater flexibility for serial communications. Serial port data can automatically transfer to and from on-chip memory using DMA. Each of the serial ports offers a TDM multichannel mode (up to 128 channels) and supports μ-law or A-law companding. I2S support is also provided. The serial ports can operate with little-endian or big-endian transmission formats, with word lengths from 3 to 32 bits. The serial ports offer selectable synchronization and transmit modes. Serial port clocks and frame syncs can be internally or externally generated.
Link Ports. The ADSP-21161 processor features two 8-bit link ports that provide additional I/O capabilities. Link port I/O is especially useful for point-to-point interprocessor communication in multiprocessing systems. The link ports can operate independently and simultaneously. The data packs into 32-bit or 48-bit words, which the processor core can directly read or the IO processor can DMA-transfer to on-chip memory. Clock and acknowledge handshaking signals control link port transfers. Transfers are programmable as either transmit or receive.

Serial Peripheral (Compatible) Interface. The Serial Peripheral Interface (SPI) is an industry-standard synchronous serial link that enables the ADSP-21161 processor's SPI-compatible port to communicate with other SPI-compatible devices. SPI is a 4-wire interface consisting of two data pins, one device select pin, and one clock pin. It is a full-duplex synchronous serial interface, supporting both master and slave modes. It can operate in a multi-master environment by interfacing with up to four other SPI-compatible devices, acting as either a master or a slave device. The ADSP-21161 processor's SPI-compatible peripheral implementation also supports programmable baud rate and clock phase/polarities, and the use of open drain drivers to avoid data contention in the multi-master scenario.

DMA Controller. The ADSP-21161 processor's on-chip DMA controller allows zero-overhead data transfers without processor intervention. The DMA controller operates independently and invisibly to the processor core, allowing DMA operations to occur while the core is simultaneously executing its program. Both code and data can be downloaded to the ADSP-21161 processor using DMA transfers. DMA transfers can occur between the processor's internal memory and external memory, external peripherals, or a host processor. DMA transfers between external memory and external peripheral devices are another option.
External bus packing to 8-, 16-, 32-, 48-, or 64-bit words is automatically performed during DMA transfers.
Fourteen channels of DMA are available on the ADSP-21161 processor: two over the link ports (shared with SPI), eight over the serial ports, and four over the processor's external port. The external port DMA channels serve for host processor, other ADSP-21161 processors, memory, or I/O transfers.
JTAG Port
The JTAG port on the ADSP-21161 processor supports the IEEE 1149.1 Joint Test Action Group (JTAG) standard for system test. This standard defines a method for serially scanning the I/O status of each component in a system. Emulators use the JTAG port to monitor and control the processor during emulation. Emulators using this port provide full-speed emulation with access to inspect and modify memory, registers, and processor stacks. JTAG-based emulation is non-intrusive and does not affect target system loading or timing.
tion to each of the register files in parallel. Also, the ADSP-21161 processor permits register contents to be exchanged between the two processing elements' register files in a single cycle.
Multiprocessor Interface Enhancements

The ADSP-21161 processor's multiprocessor system interface supports greater throughput than the ADSP-2106x DSPs. The throughput between ADSP-21161 processors in a multiprocessing application increases due to new shared bus transfer protocols, shared bus cycle time improvements from the synchronous interface, and improvements in link port throughput. The external port supports glueless multiprocessing, with distributed arbitration for up to six ADSP-21161 processors.
IO Architecture Enhancements
The IO processor on the ADSP-21161 processor provides much greater throughput than the ADSP-2106x DSPs. This section describes how the link ports and DMA controller differ on the ADSP-21161 processor.

DMA Controller Enhancements

The ADSP-21161 processor's DMA controller supports 14 channels, compared to 10 on the ADSP-2106x DSPs. New packing modes support the 64-bit internal busing. To resolve potential deadlock scenarios, the DMA controller relinquishes the local bus in a similar fashion to the processor core when host logic asserts both HBR and SBTS.

Link Port Enhancements

The ADSP-21161 processor's two link ports provide greater throughput than the ADSP-2106x DSPs. The link port data bus on the ADSP-21161 processor is 8 bits wide (versus 4 bits on the ADSP-2106x DSPs). Link port clock control on the ADSP-21161 processor supports a wider frequency range.
Related Documents
For more information about Analog Devices DSPs and development products, see the following documents:

• ADSP-21161 SHARC DSP Microcomputer Data Sheet
• ADSP-21160 SHARC DSP Instruction Set Reference
• Getting Started Guide for VisualDSP++ & ADSP-21xxx Family DSPs
• VisualDSP++ User's Guide for ADSP-21xxx Family DSPs
• C/C++ Compiler & Library Manual for ADSP-21xxx Family DSPs
• Assembler Manual for ADSP-21xxx Family DSPs
• Linker & Utilities Manual for ADSP-21xxx Family DSPs

All the manuals are included in the software distribution CD-ROM. To access these manuals, use the Help Topics command in the VisualDSP++ environment's Help menu and select the Online Manuals book. From this Help topic, you can open any of the manuals, which are in Adobe Acrobat PDF format.
Conventions
The following are conventions that apply to all chapters. Note that additional conventions, which apply only to specific chapters, appear throughout this document.

Table 1-1. Notation Conventions

Example: Close command (File menu)
Description: Titles in reference sections indicate the location of an item within the VisualDSP++ environment's menu system (for example, the Close command appears on the File menu).

Example: {this | that}
Description: Alternative items in syntax descriptions appear within curly brackets and separated by vertical bars; read the example as this or that. One or the other is required.

Example: [this | that]
Description: Optional items in syntax descriptions appear within brackets and separated by vertical bars; read the example as an optional this or that.

Example: [this,…]
Description: Optional item lists in syntax descriptions appear within brackets delimited by commas and terminated with an ellipsis; read the example as an optional comma-separated list of this.

Example: .SECTION
Description: Commands, directives, keywords, and feature names are in text with letter gothic font.

Example: filename
Description: Non-keyword placeholders appear in text with italic style format.

Example: Note: For correct operation, ...
Description: A Note: provides supplementary information on a related topic. In the online version of this book, the word Note appears instead of this symbol.

Example: Warning: Injury to device users may result if ...
Description: A Warning: identifies conditions or inappropriate usage of the product that could lead to conditions that are potentially hazardous for device users. In the online version of this book, the word Warning appears instead of this symbol.
2 PROCESSING ELEMENTS
The processor's Processing Elements (PEx and PEy) perform numeric processing for digital signal processing algorithms. Each processing element contains a data register file and three computation units: an arithmetic/logic unit (ALU), a multiplier, and a shifter. Computational instructions for these elements include both fixed-point and floating-point operations, and each computational instruction can execute in a single cycle.

The computational units in a processing element handle different types of operations. The ALU performs arithmetic and logic operations on fixed-point and floating-point data. The multiplier does floating-point and fixed-point multiplication and executes fixed-point multiply/add and multiply/subtract operations. The shifter completes logical shifts, arithmetic shifts, bit manipulation, field deposit, and field extraction operations on 32-bit operands. The shifter can also derive exponents.

Data flow paths through the computational units are arranged in parallel, as shown in Figure 2-1. The output of any computation unit may serve as the input of any computation unit on the next instruction cycle. Data moving in and out of the computational units goes through a 10-port register file, consisting of sixteen primary registers and sixteen alternate registers. Two ports on the register file connect to the PM and DM data buses, allowing data transfer between the computational units and memory (and anything else) connected to these buses.
The processor's assembly language provides access to the data register files in both processing elements. The syntax lets programs move data to and from these registers and, through the naming conventions for the registers, specify a computation's data format at the same time. For information on the data register names, see Data Register File on page 2-30.

Figure 2-1 provides a graphical guide to the other topics in this chapter. First, a description of the MODE1 register shows how to set rounding, data format, and other modes for the processing elements. Next, an examination of each computational unit provides details on operation and a summary of computational instructions. Outside the computational units, details on register files and data buses identify how to flow data for computations. Finally, details on the processor's advanced parallelism reveal how to take advantage of multifunction instructions and SIMD mode.
[Figure 2-1: block diagram of a processing element, showing the PM and DM data buses, the register file (16 x 40-bit, R0 through R15), the multiplier, the shifter, the ALU, the MODE1 register, and the path to the program sequencer.]
For more information, see Numeric Formats on page C-1. This format is IEEE 754/854 compatible for single-precision floating-point operations in all respects except that:

• The processor does not provide inexact flags.
• NAN (Not-A-Number) inputs generate an invalid exception and return a quiet NAN (all 1s).
• Denormal operands flush to zero when input to a computation unit and do not generate an underflow exception. Any denormal or underflow result from an arithmetic operation flushes to zero and generates an underflow exception.
• The processor supports round-to-nearest and round-toward-zero modes, but does not support round to +Infinity and round to -Infinity.

IEEE single-precision floating-point data uses a 23-bit mantissa with an 8-bit exponent plus sign bit. In this case, the computation unit sets the eight LSBs of floating-point inputs to zeros before performing the operation. The mantissa of a result rounds to 23 bits (not including the hidden bit), and the 8 LSBs of the 40-bit result clear to zeros to form a 32-bit number, which is equivalent to the IEEE standard result. In fixed-point to floating-point conversion, the rounding boundary is always 40 bits even if the RND32 bit is set.
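Two of the behaviours described above, flushing denormal inputs to zero and clearing the eight LSBs of a 40-bit register for 32-bit floating-point operation, can be sketched in Python on raw bit patterns. This is an illustrative model, not the hardware's implementation; the function names are invented here, and preserving the sign bit of a flushed denormal is an assumption the text does not confirm.

```python
import struct

def float_to_bits(x):
    """IEEE 754 single-precision bit pattern of x as an unsigned 32-bit integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def flush_denormal_input(bits32):
    """Operand as seen by a computation unit: denormal inputs become zero."""
    exponent = (bits32 >> 23) & 0xFF
    mantissa = bits32 & 0x7FFFFF
    if exponent == 0 and mantissa != 0:
        # Assumption: the sign bit is preserved when flushing.
        return bits32 & 0x80000000
    return bits32

def clear_low_8(reg40):
    """Zero the eight LSBs of a 40-bit register value."""
    return reg40 & 0xFFFFFFFF00
```

For example, `flush_denormal_input(0x00000001)` models the smallest positive denormal entering a computation unit as zero, while a normal value such as `float_to_bits(1.0)` passes through unchanged.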
a given operation. All computational units read the upper 32 bits of data (inputs, operands) from the 40-bit registers (ignoring the 8 LSBs) and write results to the upper 32 bits (zeroing the 8 LSBs).
Rounding Mode
The TRUNC bit in the MODE1 register determines the rounding mode for all ALU operations, all floating-point multiplies, and fixed-point multiplies of fractional data. The processor supports two modes of rounding: round-toward-zero and round-toward-nearest. The rounding modes comply with the IEEE 754 standard and have the following definitions.

• Round-Toward-Zero (TRUNC bit=1). If the result before rounding is not exactly representable in the destination format, the rounded result is the number that is nearer to zero. This definition is equivalent to truncation.
• Round-Toward-Nearest (TRUNC bit=0). If the result before rounding is not exactly representable in the destination format, the rounded result is the number that is nearer to the result before rounding. If the result before rounding is exactly halfway between two numbers in the destination format (differing by an LSB), the rounded result is the number that has an LSB equal to zero.

Statistically, rounding up occurs as often as rounding down, so there is no large sample bias. Because the maximum floating-point value is one LSB less than the value that represents Infinity, a result that is halfway between the maximum floating-point value and Infinity rounds to Infinity in this mode.

Though these rounding modes comply with standards set for floating-point data, they also apply for fixed-point multiplier operations on fractional data. The same two rounding modes are supported, but only the round-to-nearest operation is actually performed by the multiplier. Using its local result register for fixed-point operations, the multiplier rounds to zero by reading only the upper bits of the result and discarding the lower bits.
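The two rounding definitions can be illustrated on a non-negative integer magnitude (for example, a mantissa) from which some low-order bits must be dropped. The Python sketch below is a minimal model under that assumption; the function names are hypothetical and it does not model sign handling or exponent adjustment.

```python
def round_toward_zero(value, drop_bits):
    """Discard the low drop_bits of a non-negative magnitude (truncation)."""
    return value >> drop_bits

def round_to_nearest_even(value, drop_bits):
    """Round a non-negative magnitude to nearest; ties go to the even LSB."""
    kept = value >> drop_bits
    remainder = value & ((1 << drop_bits) - 1)
    half = 1 << (drop_bits - 1)
    if remainder > half or (remainder == half and (kept & 1)):
        kept += 1
    return kept
```

With two bits dropped, the halfway cases `0b1010` and `0b1110` round to `0b10` and `0b100` respectively: each tie resolves to the neighbour whose LSB is zero, which is what removes the large-sample bias.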
STKYx and STKYy registers. Use the Bit Tst instruction to examine exception flags in the STKY register after a series of operations. If any flags are set, some of the results are incorrect. This method is useful when exception handling is not critical.
More information on ASTAT and STKY status appears in the sections that describe the computational units. For summaries relating instructions and status bits, see Table 2-1, Table 2-2, Table 2-4, Table 2-6, and Table 2-7.
ALU Operation
ALU instructions take one or two inputs: X input and Y input. These inputs (also known as operands) can be any data registers in the register file. Most ALU operations return one result; in add/subtract operations, the ALU operation returns two results, and in compare operations, the ALU operation returns no result (only flags are updated). ALU results can be returned to any location in the register file.
The processor transfers input operands from the register file during the first half of the cycle and transfers results to the register file during the second half of the cycle. With this arrangement, the ALU can read and write the same register file location in a single cycle. If the ALU operation is fixed-point, the inputs are treated as 32-bit fixed-point operands. The ALU transfers the upper 32 bits from the source location in the register file. For fixed-point operations, the result(s) are always 32-bit fixed-point values. Some floating-point operations (Logb, Mant and Fix) can also yield fixed-point results. The processor transfers fixed-point results to the upper 32 bits of the data register and clears the lower eight bits of the register. The format of fixed-point operands and results depends on the operation. In most arithmetic operations, there is no need to distinguish between integer and fractional formats. Fixed-point inputs to operations such as scaling a floating-point value are treated as integers. For purposes of determining status such as overflow, fixed-point arithmetic operands and results are treated as twos-complement numbers.
ALU Saturation
When the ALUSAT bit is set (1) in the MODE1 register, the ALU is in saturation mode. In this mode, all positive fixed-point overflows return the maximum positive fixed-point number (0x7FFF FFFF), and all negative overflows return the maximum negative number (0x8000 0000). When the ALUSAT bit is cleared (0) in the MODE1 register, fixed-point results that overflow are not saturated; the upper 32 bits of the result are returned unaltered. The ALU overflow flag reflects the ALU result before saturation.
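The effect of the ALUSAT bit on a fixed-point add can be modelled in a few lines of Python. This is a sketch only: `alu_add` is a hypothetical helper, operands are given as signed Python integers, and the non-saturating case is modelled as a plain 32-bit wraparound.

```python
INT32_MAX = 0x7FFFFFFF
INT32_MIN = -0x80000000

def alu_add(x, y, alusat):
    """Model Rn = Rx + Ry on signed 32-bit operands.

    Returns (result as a 32-bit pattern, overflow flag). The flag reflects
    the result before saturation, as described in the text.
    """
    exact = x + y
    overflow = exact > INT32_MAX or exact < INT32_MIN
    if overflow and alusat:
        result = INT32_MAX if exact > 0 else INT32_MIN
    else:
        result = exact  # wraps when masked to 32 bits below
    return result & 0xFFFFFFFF, overflow
```

For instance, `alu_add(0x7FFFFFFF, 1, True)` yields the pattern 0x7FFFFFFF with the overflow flag set, while the same operands with `alusat=False` wrap to 0x80000000.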
Flag update occurs at the end of the cycle in which the status is generated and is available on the next cycle. If a program writes the arithmetic status register or sticky status register explicitly in the same cycle that the ALU is performing an operation, the explicit write to the status register supersedes any flag update from the ALU operation.
Instruction              AZ  AV  AN  AC  AS  AI  AF
Rn = Rx + Ry             *   *   *   *   0   0   0
Rn = Rx - Ry             *   *   *   *   0   0   0
Rn = Rx + Ry + CI        *   *   *   *   0   0   0
Rn = Rx - Ry + CI - 1    *   *   *   *   0   0   0
Rn = (Rx + Ry)/2         *   0   *   *   0   0   0
COMP(Rx, Ry)             *   0   *   0   0   0   0
COMPU(Rx, Ry)            *   0   *   0   0   0   0
Rn = Rx + CI             *   *   *   *   0   0   0
Rn = Rx + CI - 1         *   *   *   *   0   0   0
Rn = Rx + 1              *   *   *   *   0   0   0
Rn = Rx - 1              *   *   *   *   0   0   0
Rn = -Rx                 *   *   *   *   0   0   0
Rn = ABS Rx              *   *   0   0   *   0   0
Rn = PASS Rx             *   0   *   0   0   0   0
Rn = Rx AND Ry           *   0   *   0   0   0   0
Rn = Rx OR Ry            *   0   *   0   0   0   0
Rn = Rx XOR Ry           *   0   *   0   0   0   0
Rn = NOT Rx              *   0   *   0   0   0   0
Rn = MIN(Rx, Ry)         *   0   *   0   0   0   0
Rn = MAX(Rx, Ry)         *   0   *   0   0   0   0
Rn = CLIP Rx BY Ry       *   0   *   0   0   0   0
Multiply-Accumulator (Multiplier)
The multiplier performs fixed-point or floating-point multiplication and fixed-point multiply/accumulate operations. Fixed-point multiply/accumulates are available with either cumulative addition or cumulative subtraction. Multiplier floating-point instructions operate on 32-bit or 40-bit floating-point operands and output 32-bit or 40-bit floating-point results. Multiplier fixed-point instructions operate on 32-bit fixed-point data and produce 80-bit results. Inputs are treated as fractional or integer, unsigned or twos-complement. Multiplier instructions include:

• Floating-point multiplication
• Fixed-point multiplication
• Fixed-point multiply/accumulate with addition, rounding optional
• Fixed-point multiply/accumulate with subtraction, rounding optional
• Rounding result register
• Saturating result register
• Clearing result register
Multiplier Operation
The multiplier takes two inputs: X input and Y input. These inputs (also known as operands) can be any data registers in the register file. The multiplier can accumulate fixed-point results in the local Multiplier Result (MRF) registers or write results back to the register file. The results in MRF can also be rounded or saturated in separate operations. Floating-point multiplies yield floating-point results, which the multiplier always writes directly to the register file.
The multiplier transfers input operands during the first half of the cycle and transfers results during the second half of the cycle. With this arrangement, the multiplier can read and write the same register file location in a single cycle.

For fixed-point multiplies, the multiplier reads the inputs from the upper 32 bits of the data registers. Fixed-point operands may be either both in integer format or both in fractional format. The format of the result matches the format of the inputs. Each fixed-point operand may be either an unsigned or a twos-complement number. If both inputs are fractional and signed, the multiplier automatically shifts the result left one bit to remove the redundant sign bit. The register names within the multiplier instruction specify the input data types: Fx for floating-point and Rx for fixed-point.
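The one-bit left shift for signed fractional inputs can be seen in a small Python model. Two signed 1.31 fractions multiply to a 2.62 product whose top two bits are both sign bits; shifting left by one yields a 1.63 result. The helper names are illustrative, and returning the value as an 80-bit pattern is a modelling choice made to match the width of the result register.

```python
MASK80 = (1 << 80) - 1

def to_signed32(bits):
    """Interpret a 32-bit pattern as a twos-complement integer."""
    bits &= 0xFFFFFFFF
    return bits - (1 << 32) if bits & 0x80000000 else bits

def multiply_signed_fractional(rx_bits, ry_bits):
    """Signed 1.31 x 1.31 multiply, returned as an 80-bit pattern."""
    product = to_signed32(rx_bits) * to_signed32(ry_bits)  # 2.62 format
    product <<= 1                    # drop the redundant sign bit: 1.63
    return product & MASK80

def multiply_signed_integer(rx_bits, ry_bits):
    """Signed integer multiply, returned as an 80-bit pattern (no shift)."""
    return (to_signed32(rx_bits) * to_signed32(ry_bits)) & MASK80
```

As a check, 0.5 x 0.5 uses the pattern 0x40000000 for both inputs; bits 63-32 of the result are 0x20000000, which is 0.25 in 1.31 format.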
[Figure 2-2. Multiplier Fixed-Point Result Placement: the 80-bit MRF register spans MRF2 (bits 79-64), MRF1 (bits 63-32), and MRF0 (bits 31-0). A fractional result occupies MRF1, with overflow in MRF2 and underflow in MRF0. An integer result occupies MRF0, with overflow in MRF1 and MRF2.]

Fractional results can be rounded to nearest before being sent to the register file. If rounding is not specified, discarding bits 31-0 effectively truncates a fractional result (rounds to zero). For more information on rounding, see Rounding Mode on page 2-7.

The MRF register is divided into MRF2, MRF1, and MRF0 registers, which can be individually read from or written to the register file. Each of these registers has the same format. When data is read from MRF2, it is sign-extended to 32 bits as shown in Figure 2-3. The processor zero-fills the eight LSBs of the 40-bit register file location when data is read from MRF2, MRF1, or MRF0 to the register file. When the processor writes data into MRF2, MRF1, or MRF0 from the 32 MSBs of a register file location, the eight LSBs are ignored. Data written to MRF1 is sign-extended to MRF2, repeating the MSB of MRF1 in the 16 bits of MRF2. Data written to MRF0 is not sign-extended.
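The transfer rules for the three parts of MRF can be sketched in Python by treating MRF as an 80-bit integer. The function names are hypothetical, and the model covers only the behaviour stated in the text: the 16/32/32-bit split, sign extension into MRF2 on a write to MRF1, and sign extension plus zero-filled LSBs on a read of MRF2.

```python
def split_mrf(mrf80):
    """Split an 80-bit MRF value into (MRF2, MRF1, MRF0)."""
    mrf0 = mrf80 & 0xFFFFFFFF
    mrf1 = (mrf80 >> 32) & 0xFFFFFFFF
    mrf2 = (mrf80 >> 64) & 0xFFFF
    return mrf2, mrf1, mrf0

def write_mrf1(mrf80, value32):
    """Write MRF1; its MSB is repeated in all 16 bits of MRF2."""
    value32 &= 0xFFFFFFFF
    mrf2 = 0xFFFF if value32 & 0x80000000 else 0x0000
    return (mrf2 << 64) | (value32 << 32) | (mrf80 & 0xFFFFFFFF)

def read_mrf2_to_register(mrf80):
    """Read MRF2 into a 40-bit register location.

    The 16-bit value is sign-extended to 32 bits and placed in the upper
    32 bits; the eight LSBs are zero-filled.
    """
    mrf2 = (mrf80 >> 64) & 0xFFFF
    if mrf2 & 0x8000:
        mrf2 |= 0xFFFF0000
    return mrf2 << 8
```

Writing 0x80000000 to MRF1 therefore leaves MRF2 at 0xFFFF, while MRF0 keeps whatever it held before the write.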
[Figure 2-3. MR Transfer Formats: layout of a 40-bit register file location for transfers with the MRF registers. For MRF2, 16 bits of sign extension precede the 16 MRF2 bits, followed by zero-filled LSBs. For MRF1 and MRF0, the 32 register bits are followed by 8 zero-filled LSBs.]

In addition to multiplication, fixed-point operations include accumulation, rounding, and saturation of fixed-point data. There are three MRF register operations: Clear, Round, and Saturate.

The clear operation (MRF=0) resets the specified MRF register to zero. Often, it is best to perform this operation at the start of a multiply/accumulate operation to remove results left over from the previous operation.

The rounding operation (MRF=Rnd MRF) applies only to fractional results, so integer results are not affected. This operation rounds the 80-bit MRF value to nearest at bit 32; that is, at the MRF1-MRF0 boundary. Rounding of a fixed-point result occurs either as part of a multiply or multiply/accumulate operation or as an explicit operation on the MRF register. The rounded result in MRF1 can be sent either to the register file or back to the same MRF register. To round a fractional result to zero (truncation) instead of to nearest, a program would transfer the unrounded result from MRF1, discarding the lower 32 bits in MRF0.

The saturate operation (MRF=Sat MRF) sets MRF to a maximum value if the MRF value has overflowed. Overflow occurs when the MRF value is greater than the maximum value for the data format (unsigned or twos-complement, and integer or fractional) as specified in the saturate instruction.
The six possible maximum values appear in Table 2-3. The result from MRF saturation can be sent either to the register file or back to the same MRF register.

Table 2-3. Fixed-Point Format Maximum Values (For Saturation)

Maximum Number (Hexadecimal)
Format                                  MRF2   MRF1        MRF0
2's complement fractional (positive)    0000   7FFF FFFF   FFFF FFFF
2's complement fractional (negative)    FFFF   8000 0000   0000 0000
2's complement integer (positive)       0000   0000 0000   7FFF FFFF
2's complement integer (negative)       FFFF   FFFF FFFF   8000 0000
Unsigned fractional number              0000   FFFF FFFF   FFFF FFFF
Unsigned integer number                 0000   0000 0000   FFFF FFFF
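The saturate operation can be sketched in Python using the limits from Table 2-3. The format keys ("SF", "SI", "UF", "UI") and the function names are labels invented for this sketch, not instruction syntax; MRF is modelled as an 80-bit pattern, interpreted as twos-complement for the signed formats and as unsigned otherwise.

```python
def pack(mrf2, mrf1, mrf0):
    """Combine MRF2, MRF1, and MRF0 into one 80-bit pattern."""
    return (mrf2 << 64) | (mrf1 << 32) | mrf0

def to_signed80(bits):
    """Interpret an 80-bit pattern as a twos-complement integer."""
    return bits - (1 << 80) if bits & (1 << 79) else bits

# (positive maximum, negative maximum or None) taken from Table 2-3.
LIMITS = {
    "SF": (pack(0x0000, 0x7FFFFFFF, 0xFFFFFFFF), pack(0xFFFF, 0x80000000, 0x00000000)),
    "SI": (pack(0x0000, 0x00000000, 0x7FFFFFFF), pack(0xFFFF, 0xFFFFFFFF, 0x80000000)),
    "UF": (pack(0x0000, 0xFFFFFFFF, 0xFFFFFFFF), None),
    "UI": (pack(0x0000, 0x00000000, 0xFFFFFFFF), None),
}

def saturate_mrf(mrf80, fmt):
    """Clamp an 80-bit MRF pattern to the maximum for the given format."""
    pos_max, neg_max = LIMITS[fmt]
    if neg_max is None:                       # unsigned formats
        return min(mrf80, pos_max)
    value = to_signed80(mrf80)
    if value > to_signed80(pos_max):
        return pos_max
    if value < to_signed80(neg_max):
        return neg_max
    return mrf80
```

A value that fits the format is returned unchanged, so applying the operation to an MRF that has not overflowed leaves it as it was.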
Multiplier operations also update four sticky status flags in the processing element's sticky status (STKYx and STKYy) registers. Table A-5 on page A-19 lists all the bits in these registers. The following bits in STKYx or STKYy flag multiplier status (a 1 indicates the condition). Once set, a sticky flag remains high until explicitly cleared:

• Multiplier fixed-point overflow. Bit 6 (MOS)
• Multiplier floating-point overflow. Bit 7 (MVS)
• Multiplier underflow. Bit 8 (MUS)
• Multiplier floating-point invalid operation. Bit 9 (MIS)

Flag update occurs at the end of the cycle in which the status is generated and is available on the next cycle. If a program writes the arithmetic status register or sticky register explicitly in the same cycle that the multiplier is performing an operation, the explicit write to ASTAT or STKY supersedes any flag update from the multiplier operation.
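Given the bit positions listed above, decoding the multiplier flags from a STKY value read back by software is a matter of masking. The Python sketch below is illustrative only; the dictionary and function names are invented here, and it models just the four multiplier bits.

```python
# Bit positions of the multiplier sticky flags, as listed in the text.
MULTIPLIER_STICKY_BITS = {"MOS": 6, "MVS": 7, "MUS": 8, "MIS": 9}

def multiplier_sticky_flags(stky):
    """Return the names of the multiplier sticky flags set in stky."""
    return [name for name, bit in MULTIPLIER_STICKY_BITS.items()
            if stky & (1 << bit)]

def clear_sticky_flag(stky, name):
    """Return stky with the named flag explicitly cleared."""
    return stky & ~(1 << MULTIPLIER_STICKY_BITS[name])
```

Because the flags are sticky, a check like this can run once after a block of multiplies rather than after each instruction.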
- indicates no effect. The Input Mods column indicates the types of optional modifiers that you can apply to the instruction's inputs. For a list of modifiers, see Table 2-5.

Table 2-4. Fixed-Point Multiplier Instruction Summary

Fixed-Point: For Input Mods, see Table 2-5

                         Input   ASTATx,y Flags     STKYx,y Flags
Instruction              Mods    MU  MN  MV  MI     MUS  MOS  MVS  MIS
Rn = Rx * Ry             1       *   *   *   0      -    **   -    -
MRF = Rx * Ry            1       *   *   *   0      -    **   -    -
MRB = Rx * Ry            1       *   *   *   0      -    **   -    -
Rn = MRF + Rx * Ry       1       *   *   *   0      -    **   -    -
Rn = MRB + Rx * Ry       1       *   *   *   0      -    **   -    -
MRF = MRF + Rx * Ry      1       *   *   *   0      -    **   -    -
MRB = MRB + Rx * Ry      1       *   *   *   0      -    **   -    -
Rn = MRF - Rx * Ry       1       *   *   *   0      -    **   -    -
Rn = MRB - Rx * Ry       1       *   *   *   0      -    **   -    -
MRF = MRF - Rx * Ry      1       *   *   *   0      -    **   -    -
MRB = MRB - Rx * Ry      1       *   *   *   0      -    **   -    -
Rn = SAT MRF             2       *   *   *   0      -    **   -    -
Rn = SAT MRB             2       *   *   *   0      -    **   -    -
MRF = SAT MRF            2       *   *   *   0      -    **   -    -
MRB = SAT MRB            2       *   *   *   0      -    **   -    -
Rn = RND MRF             3       *   *   *   0      -    **   -    -
Rn = RND MRB             3       *   *   *   0      -    **   -    -
MRF = RND MRF            3       *   *   *   0      -    **   -    -
MRB = RND MRB            3       *   *   *   0      -    **   -    -
MRF = 0
Barrel-Shifter (Shifter)
The shifter performs bit-wise operations on 32-bit fixed-point operands. Shifter operations include:

• Shifts and rotates from off-scale left to off-scale right
• Bit manipulation operations, including bit set, clear, toggle, and test
• Bit field manipulation operations, including extract and deposit
• Fixed-point/floating-point conversion operations, including exponent extract, number of leading 1s or 0s
Shifter Operation
The shifter takes from one to three inputs: X-input, Y-input, and Z-input. The inputs (also known as operands) can be any register in the register file. Within a shifter instruction, the inputs serve as follows.

• The X-input provides data that is operated on
• The Y-input specifies shift magnitudes, bit field lengths, or bit positions
• The Z-input provides data that is operated on and updated

In the following example, Rx is the X-input, Ry is the Y-input, and Rn is the Z-input. The shifter returns one output (Rn) to the register file.
Rn = Rn OR LSHIFT Rx BY Ry;
As shown in Figure 2-4, the shifter fetches input operands from the upper 32 bits of a register file location (bits 39-8) or from an immediate value in the instruction. The shifter transfers operands during the first half of the cycle and transfers the result to the upper 32 bits of a register (with the eight LSBs zero-filled) during the second half of the cycle. With this arrangement, the shifter can read and write the same register file location in a single cycle.

The X-input and Z-input are always 32-bit fixed-point values. The Y-input is a 32-bit fixed-point value or an 8-bit field (shf8), positioned in the register file. These inputs appear in Figure 2-4. Some shifter operations produce 8-bit or 6-bit results. As shown in Figure 2-5, the shifter places these results in either the shf8 field or the bit6 field and sign-extends the results to 32 bits. The shifter always returns a 32-bit result.
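The example instruction Rn = Rn OR LSHIFT Rx BY Ry can be modelled in Python on 32-bit values. This is a sketch with two assumptions not stated in this section: a positive shift value shifts left and a negative value shifts right, and the shift value is passed as an ordinary signed integer rather than decoded from the shf8 field. The function names are illustrative.

```python
def lshift(rx, shift):
    """Logical shift of a 32-bit value.

    Assumption: positive shifts left, negative shifts right. Bits shifted
    off either end are lost, so large magnitudes yield zero.
    """
    rx &= 0xFFFFFFFF
    if shift >= 0:
        return (rx << shift) & 0xFFFFFFFF
    return rx >> -shift

def or_lshift(rn, rx, shift):
    """Model Rn = Rn OR LSHIFT Rx BY shift: Rn is both input and output."""
    return (rn | lshift(rx, shift)) & 0xFFFFFFFF

def to_register40(result32):
    """Place a 32-bit result in the upper 32 bits of a 40-bit register."""
    return (result32 & 0xFFFFFFFF) << 8
```

The Z-input role of Rn is visible in `or_lshift`: existing bits in Rn survive and the shifted field is merged in.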
[Figure 2-4. Register File Fields for Shifter Instructions]

The shifter supports bit field deposit and bit field extract instructions for manipulating groups of bits within an input. The Y-input for bit field instructions specifies two 6-bit values: bit6 and len6, which are positioned in the Ry register as shown in Figure 2-5. The shifter interprets bit6 and len6 as positive integers. Bit6 is the starting bit position for the deposit or extract, and len6 is the bit field length, which specifies how many bits are deposited or extracted.
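A minimal Python sketch of how bit6 and len6 can be unpacked from the 32-bit Y-input follows. The field positions (bit6 in the six LSBs, len6 in the next six bits) are inferred from the example value 0x0000 0210 used in this chapter's figures, which corresponds to len6 = 8 and bit6 = 16; the function names are hypothetical.

```python
def decode_bit6_len6(ry32):
    """Return (bit6, len6) from the 32-bit fixed-point field of Ry."""
    bit6 = ry32 & 0x3F          # starting bit position
    len6 = (ry32 >> 6) & 0x3F   # field length in bits
    return bit6, len6

def encode_bit6_len6(bit6, len6):
    """Build a Y-input value from a start position and a field length."""
    return ((len6 & 0x3F) << 6) | (bit6 & 0x3F)
```

Note that these values refer to the upper 32 bits of the 40-bit register; the eight LSBs of the register are not part of the field.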
[Figure 2-5. Register File Fields for FDEP, FEXT Instructions: within the 32-bit field of Ry, bit6 occupies the six LSBs and len6 the next six bits. In the example shown, R2 = 0x0000 0210 00 gives len6 = 8 and bit6 = 16; with R1 = 0x0000 00FF 00, the deposit result is R0 = 0x00FF 0000 00.]

Field deposit (Fdep) instructions take a group of bits from the input register (starting at the LSB of the 32-bit integer field) and deposit the bits as directed anywhere within the result register. The bit6 value specifies the starting bit position for the deposit. Figure 2-7 shows how the inputs, bit6 and len6, work in a field deposit instruction (Rn=Fdep Rx By Ry). Figure 2-8 shows bit placement for the field deposit instruction R0 = FDEP R1 BY R2;.

Field extract (Fext) instructions extract a group of bits as directed from anywhere within the input register and place them in the result register (aligned with the LSB of the 32-bit integer field). The bit6 value specifies the starting bit position for the extract. Figure 2-8 shows bit placement for the following field extract instruction:

R3 = FEXT R4 BY R5;
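The deposit and extract operations described above can be sketched in Python on 32-bit fields. The model assumes bit6 sits in the six LSBs of Ry and len6 in the next six bits, and that result bits outside the field are zero, as in the figure examples. The function names are illustrative and variants of the instructions are not modelled.

```python
def fdep(rx, ry):
    """Model Rn = FDEP Rx BY Ry on 32-bit fields."""
    bit6 = ry & 0x3F
    len6 = (ry >> 6) & 0x3F
    field = rx & ((1 << len6) - 1)        # len6 bits taken from the LSB end
    return (field << bit6) & 0xFFFFFFFF   # deposited starting at bit6

def fext(rx, ry):
    """Model Rn = FEXT Rx BY Ry on 32-bit fields."""
    bit6 = ry & 0x3F
    len6 = (ry >> 6) & 0x3F
    return ((rx & 0xFFFFFFFF) >> bit6) & ((1 << len6) - 1)
```

With the figure values, `fdep(0x000000FF, 0x00000210)` places eight bits at bit 16, and `fext(0x87800000, 0x00000217)` pulls eight bits starting at bit 23. The second call takes R4 as 0x8780 0000, the value given by the binary digits in the figure.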
[Figure: field deposit instruction. Ry determines the length of the bit field to take from Rx and the starting position for the deposit in Rn. len6 is the number of bits to take from Rx, starting from the LSB of the 32-bit field. bit6 is the starting bit position for the deposit, referenced from the LSB of the 32-bit field.]
[Figure: example register contents for a field extract. R5 = 0x0000 0217 00 supplies len6 and bit6. R4 is labeled 0x8710 0000 00 and is drawn with the bit pattern 10000111 10000000 in its two most significant bytes. The result register holds 0x0000 000F 00, aligned with the reference point at the LSB of the 32-bit integer field.]
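The deposit and extract behavior described above can be sketched as a small behavioral model. This is an illustrative Python sketch, not processor code. The function names are hypothetical, the model works only on the 32-bit integer field, and the placement of len6 in bits 11-6 and bit6 in bits 5-0 is inferred from the example value 0x0210 (len6 = 8, bit6 = 16). Dropping deposited bits that fall past bit 31 is an assumption.

```python
MASK32 = 0xFFFFFFFF

def decode_ry(ry32):
    """Split the 32-bit integer field of Ry into (bit6, len6)."""
    bit6 = ry32 & 0x3F          # bits 5-0: starting bit position
    len6 = (ry32 >> 6) & 0x3F   # bits 11-6: field length
    return bit6, len6

def fdep(rx32, ry32):
    """Model of Rn = FDEP Rx BY Ry on the 32-bit integer field."""
    bit6, len6 = decode_ry(ry32)
    field = rx32 & ((1 << len6) - 1)   # len6 bits taken from the LSB of Rx
    return (field << bit6) & MASK32    # assumption: bits past bit 31 are dropped

def fext(rx32, ry32):
    """Model of Rn = FEXT Rx BY Ry on the 32-bit integer field."""
    bit6, len6 = decode_ry(ry32)
    return (rx32 >> bit6) & ((1 << len6) - 1)

# Deposit example from the text: 8 bits of 0xFF placed at bit 16.
print(hex(fdep(0x000000FF, 0x00000210)))
# Extract of 8 bits starting at bit 23 from the drawn bit pattern 0x87800000.
print(hex(fext(0x87800000, 0x00000217)))
```

The model ignores the 8 extension bits of the 40-bit register and any status flags the shifter may set.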
5. PEx Multiplier
6. PEy Multiplier
7. PEx Shifter
8. PEy Shifter
The data register file in Figure 2-1 on page 2-3 lists register names of R0 through R15 within PEx's register file. When a program refers to these registers as R0 through R15, the computational units treat the registers' contents as fixed-point data. To perform floating-point computations, refer to these registers as F0 through F15. For example, the following instructions refer to the same registers, but direct the computational units to perform different operations:
F0=F1 * F2;    /* floating-point multiply */
R0=R1 * R2;    /* fixed-point multiply */
The F and R prefixes on register names do not affect the 32-bit or 40-bit data transfer; the naming convention only determines how the ALU, multiplier, and shifter treat the data.
To maintain compatibility with code written for previous SHARC DSPs, the assembly syntax accommodates references to PEx data registers and PEy data registers. Code may only refer to the PEy data registers (S0 through S15) for data move instructions. The rules for using register names are as follows.
• R0 through R15 and F0 through F15 always refer to PEx registers for data move and computational instructions, whether the processor is in SISD or SIMD mode
• R0 through R15 and F0 through F15 refer to both PEx and PEy registers for computational instructions in SIMD mode
• S0 through S15 always refer to PEy registers for data move instructions, whether the processor is in SISD or SIMD mode
For more information on SISD and SIMD computational operations, see Alternate (Secondary) Data Registers on page 2-32. For more information on ADSP-21161 processor assembly language, see the ADSP-21160 SHARC DSP Instruction Set Reference.
which register is the current MRF or MRB. This swapping facilitates context switching. Unlike other registers that have alternates, both MRF and MRB are accessible at the same time. All fixed-point multiplies can accumulate results in either MRF or MRB, without regard to the state of the MODE1 register. With this arrangement, code can use the result registers as primary and alternate accumulators, or code can use these registers as two parallel accumulators. This feature facilitates complex math.
The MODE1 register controls the access to alternate registers. Table A-2 on page A-3 lists all the bits in MODE1. The following bits in MODE1 control alternate registers (a 1 enables the alternate set).
• Bit 2 (SRCU). Secondary registers for computation unit results
• Bit 7 (SRRFH). Secondary registers for hi register file, R8-R15 and S8-S15
• Bit 10 (SRRFL). Secondary registers for lo register file, R0-R7 and S0-S7
The following example demonstrates how code should handle the one cycle of latency from the instruction setting the bit in MODE1 to when the alternate registers may be accessed. Note that any instruction that does not access the switching register file can be used instead of a NOP instruction.
BIT SET MODE1 SRRFL;    /* activate alternate reg. file */
NOP;                    /* wait for access to alternates */
R0=7;
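The one-cycle latency can be pictured with a small behavioral model. This is a hedged Python sketch, not processor code: the class and method names are hypothetical, only the bit positions (2, 7, and 10) come from the text, and the model assumes exactly one instruction slot of delay between the write and its effect.

```python
# MODE1 bit masks for the alternate register sets (bit numbers from the text).
SRCU = 1 << 2
SRRFH = 1 << 7
SRRFL = 1 << 10

class Mode1LatencyModel:
    """Hypothetical model: a MODE1 write governs instructions two slots later."""

    def __init__(self):
        self.written = 0     # value most recently written to MODE1
        self.effective = 0   # value governing the instruction now executing

    def execute(self, set_bits=0):
        """Run one instruction; report whether it sees the alternate lo file."""
        uses_alternate_lo = bool(self.effective & SRRFL)
        self.effective = self.written   # the earlier write reaches the next slot
        self.written |= set_bits        # this instruction's own write, if any
        return uses_alternate_lo

model = Mode1LatencyModel()
trace = [
    model.execute(SRRFL),  # BIT SET MODE1 SRRFL;
    model.execute(),       # NOP;  (alternates not yet accessible)
    model.execute(),       # R0=7; (writes the alternate R0)
]
print(trace)
```

In this model the instruction in the NOP slot still sees the primary lo register file, which is why that slot must not access the switching registers.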
Multifunction Computations
Using the many parallel data paths within its computational units, the processor supports multiple-parallel (multifunction) computations. These instructions complete in a single cycle, and they combine parallel operation of the multiplier and the ALU or dual ALU functions. The multiple operations perform the same as if they were in corresponding single-function computations. Multifunction computations also handle flags in the same way as the single-function computations, except that in the dual add/subtract computation the ALU flags from the two operations are ORed together.
To work with the available data paths, the computation units constrain which data registers may hold the four input operands for multifunction computations. These constraints limit which registers may hold the X-input and Y-input for the ALU and multiplier. Figure 2-9 shows a computational unit and indicates which registers may serve as X-inputs and Y-inputs for the ALU and multiplier. For example, the X-input to the ALU can only be R8, R9, R10, or R11. Note that the shifter is gray in Figure 2-9 to indicate that there are no shifter multifunction operations.
[Figure 2-9 diagram: the register file (16 x 40-bit, R0 through R15) feeds the multiplier, shifter, and ALU, with MODE1 and a path to the program sequencer. The shifter is faded, indicating that it is not available for multifunction instructions.]
Figure 2-9. Input Registers for Multifunction Computations (ALU and Multiplier)
Table 2-8 through Table 2-11 list the multifunction computations. For more information on assembly language syntax, see the ADSP-21160 SHARC DSP Instruction Set Reference. In these tables, note the meaning of the following symbols.
• Rm, Ra, Rs, Rx, Ry indicate any register file location; fixed-point
• Fm, Fa, Fs, Fx, Fy indicate any register file location; floating-point
• R3-0 indicates data file registers R3, R2, R1, or R0, and F3-0 indicates data file registers F3, F2, F1, or F0
• R7-4 indicates data file registers R7, R6, R5, or R4, and F7-4 indicates data file registers F7, F6, F5, or F4
• R11-8 indicates data file registers R11, R10, R9, or R8, and F11-8 indicates data file registers F11, F10, F9, or F8
• R15-12 indicates data file registers R15, R14, R13, or R12, and F15-12 indicates data file registers F15, F14, F13, or F12
• SSFR indicates the X-input is signed, Y-input is signed, use Fractional inputs, and Rounded-to-nearest output
• SSF indicates the X-input is signed, Y-input is signed, use Fractional input
Table 2-8. Dual Add And Subtract
Ra = Rx + Ry, Rs = Rx - Ry
Fa = Fx + Fy, Fs = Fx - Fy
Another type of multifunction operation is also available on the processor, combining transfers between the results and data registers and transfers between memory and data registers. Like other multifunction instructions, these parallel operations complete in a single cycle. For example, the processor can perform the following multiply and parallel read of data memory:
MRF=MRF-R5*R0, R6=DM(I1,M2);
Or, the processor can perform the following result register transfer and parallel read:
R5=MR1F, R6=DM(I1,M2);
[Figure 2-10 diagram: the PM data bus and DM data bus (16/32/40/64 bits each) meet at the bus connect (PX) and serve the multiplier, barrel shifter, and ALU blocks of the two processing elements.]
Figure 2-10. Block Diagram Showing Secondary Execution Complex
The MODE1 register controls the operating mode of the processing elements. Table A-2 on page A-3 lists all the bits in MODE1. The PEYEN bit (bit 21) in the MODE1 register enables or disables the PEy processing element. When PEYEN is cleared (0), the ADSP-21161 processor operates in Single-Instruction-Single-Data (SISD) mode, using only PEx; this is the mode in which ADSP-2106x family DSPs operate. When the PEYEN bit is set (1), the ADSP-21161 processor operates in SIMD mode, using the PEx and PEy processing elements. There is a one cycle delay after PEYEN is set or cleared, before the change to or from SIMD mode takes effect.
To support SIMD, the processor performs the following parallel operations.
• Dispatches a single instruction to both processing elements' computation units
• Loads two sets of data from memory, one for each processing element
• Executes the same instruction simultaneously in both processing elements
• Stores data results from the dual executions to memory
Using the information here and in the ADSP-21160 SHARC DSP Instruction Set Reference, it is possible through SIMD mode's parallelism to double performance over similar algorithms running in SISD (ADSP-2106x processor compatible) mode.
The two processing elements are symmetrical, and each contains the following functional blocks.
• ALU
• Multiplier primary and alternate result registers
• Shifter
• Data register file and alternate register file
implicit relation between PEx and PEy data registers corresponds to complementary register pairs in Table 2-12. Any universal registers that don't appear in Table 2-12 have the same identities in both PEx and PEy. When a computation in SIMD mode refers to a register in the PEx column, the corresponding computation in PEy refers to the complementary register in the PEy column.
Table 2-12. SIMD Mode Complementary Register Pairs
PEx       PEy
R0        S0
R1        S1
R2        S2
R3        S3
R4        S4
R5        S5
R6        S6
R7        S7
R8        S8
R9        S9
R10       S10
R11       S11
R12       S12
R13       S13
R14       S14
ASTATx    ASTATy
STKYx     STKYy
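The pairing rule can be expressed as a simple lookup. The following Python sketch is illustrative only: the function name is hypothetical, and the R15/S15 pair is an assumption that extends the pattern of the listed rows (the text elsewhere names R0 through R15 and S0 through S15 as the PEx and PEy data registers).

```python
# Complementary pairs: data registers plus the two status register pairs.
# Assumption: R15/S15 follows the same pattern as the rows listed above.
PAIRS = {f"R{n}": f"S{n}" for n in range(16)}
PAIRS.update({"ASTATx": "ASTATy", "STKYx": "STKYy"})
# Make the lookup symmetric so PEy names map back to PEx names.
PAIRS.update({pey: pex for pex, pey in list(PAIRS.items())})

def complement(register):
    """Return the complementary register, or the register itself if unpaired."""
    return PAIRS.get(register, register)

print(complement("R3"), complement("ASTATy"), complement("I0"))
```

Registers without an entry, such as I0 in this sketch, map to themselves, mirroring the statement that unlisted universal registers have the same identity in both processing elements.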
Table 2-13 lists the multiplier result SIMD mode complementary register pairs. These multiplier result registers are not universal registers (UREGs) and cannot be accessed directly. These registers can be read and written with the following multiplier operations:
MRxF/B = Rn;
Rn = MRxF/B;
conditional for an instruction test in both the PEx and PEy elements, dividing control of the explicit and implicit transfers as detailed in Table 2-15.
Bidirectional register-to-register transfers in SIMD mode are allowed between a data register and DAG, control, or status registers. When the DAG, control, or status register is a source of the transfer, the destination can be a data register. This SIMD transfer duplicates the contents of the source register in a data register in both processing elements.
Careful programming is required when a DAG, control, or status register is a destination of a transfer from a data register. If the destination register has a complement (for example, ASTATx and ASTATy), the SIMD transfer moves the contents of the explicit data register into the explicit destination and moves the contents of the implicit data register into the implicit destination (the complement). If the destination register has no complement (for example, I0), only the explicit transfer occurs. Even if the code uses a conditional operation to select whether the transfer occurs, only the explicit transfer can take place if the destination register has no complement.
In the case where a DAG, control, or status register is both source and destination, the data move operation executes the same as if SIMD mode were disabled.
In both SISD and SIMD modes, the processor supports bidirectional register-to-register swaps. The swap always occurs between one register in each processing element's data register file. Register swaps use the special swap operator, <->. A register-to-register swap occurs when registers in different processing elements exchange values; for example, R0 <-> S1. Only single, 40-bit register-to-register swaps are supported; double register operations are not.
When they are unconditional, register-to-register swaps operate the same in SISD mode and SIMD mode. If a condition is added to the instruction in SISD mode, the condition tests only in the PEx element and controls the entire operation. If a condition is added in SIMD mode, the condition tests in both the PEx and PEy elements and controls the halves of the operation as detailed in Table 2-15.
Table 2-15. Register-To-Register Move Summary (SISD Versus SIMD)
Mode      Instruction                              Explicit Transfer                  Implicit Transfer
SISD (1)  IF condition compute, Rx = Ry;           Rx loaded from Ry                  None
          IF condition compute, Rx = Sy;           Rx loaded from Sy                  None
          IF condition compute, Sx = Ry;           Sx loaded from Ry                  None
          IF condition compute, Sx = Sy;           Sx loaded from Sy                  None
          IF condition compute, Rx <-> Sy;         Rx swaps to Sy, Sy swaps to Rx     None
SIMD (2)  IF condition compute, Rx = Ry;           Rx loaded from Ry                  Sx loaded from Sy
          IF condition compute, Rx = Sy;           Rx loaded from Sy                  Sx loaded from Ry
          IF condition compute, Sx = Ry;           Sx loaded from Ry                  Rx loaded from Sy
          IF condition compute, Sx = Sy;           Sx loaded from Sy                  Rx loaded from Ry
          IF condition compute, Rx <-> Sy; (3)     Rx swaps to Sy, Sy swaps to Rx     None
(1) In SISD mode, the conditional applies to the entire operation and is tested only against PEx's flags. When the condition tests true, the entire operation occurs.
(2) In SIMD mode, the conditional applies separately to the explicit and implicit transfers. Where the condition tests true (PEx for the explicit and PEy for the implicit), the operation occurs in that processing element.
(3) Register-to-register transfers (R0=S0) and register swaps (R0<->S0) do not cause a PMD bus conflict. These operations use only the DMD bus and a hidden 16-bit bus to do the two register moves.
SIMD conditional instructions with the same destination registers do not produce predictable transfers. For example, the instruction IF EQ R4 = R14 - R15, S4 = R6; may not work as expected, because in SIMD mode the computation and the move, through their explicit and implicit operations, both target R4 and S4. This kind of usage is prohibited.
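The explicit and implicit transfers for the four move forms in Table 2-15 can be sketched as follows. This Python model is illustrative only and its function names are hypothetical. It covers data register moves (Rx/Sx forms) and deliberately leaves out the swap operator and moves involving DAG, control, or status registers.

```python
def other_element(register):
    """Map R<n> to S<n> and S<n> to R<n> (the complementary data register)."""
    bank = "S" if register[0] == "R" else "R"
    return bank + register[1:]

def conditional_move(dest, src, simd, cond_pex, cond_pey=False):
    """Return the (destination, source) transfers performed by IF cond dest = src."""
    transfers = []
    if cond_pex:
        # Explicit transfer: gated by the condition tested in PEx.
        transfers.append((dest, src))
    if simd and cond_pey:
        # Implicit transfer: complementary registers, gated by PEy's condition.
        transfers.append((other_element(dest), other_element(src)))
    return transfers

# SIMD mode, condition true in both elements: R1 = S2 also loads S1 from R2.
print(conditional_move("R1", "S2", simd=True, cond_pex=True, cond_pey=True))
# SISD mode: only the explicit transfer exists.
print(conditional_move("R1", "S2", simd=False, cond_pex=True))
```

Passing a condition that is true only in PEy shows the case where the implicit transfer occurs without the explicit one.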
3 PROGRAM SEQUENCER
The processor's program sequencer implements program flow, constantly providing the address of the next instruction to be executed by other parts of the processor. Program flow in the processor is mostly linear, with the processor executing program instructions sequentially. This linear flow varies occasionally when the program uses non-sequential program structures, such as those illustrated in Figure 3-1. Non-sequential structures direct the processor to execute an instruction that is not at the next sequential address following the current instruction. These structures include:
• Loops. One sequence of instructions executes several times with zero overhead.
• Subroutines. The processor temporarily interrupts sequential flow to execute instructions from another part of program memory.
• Jumps. Program flow transfers permanently to another part of program memory.
• Interrupts. Subroutines in which a runtime event (not an instruction) triggers the execution of the routine.
• Idle. An instruction that causes the processor to cease operations and hold its current state until an interrupt occurs. Then, the processor services the interrupt and continues normal execution.
The sequencer manages execution of these program structures by selecting the address of the next instruction to execute. As part of its process, the sequencer handles the following tasks:
• Increments the fetch address
• Maintains stacks
• Evaluates conditions
• Decrements the loop counter
• Calculates new addresses
• Maintains an instruction cache
• Handles interrupts
To accomplish these tasks, the sequencer uses the blocks shown in Figure 3-2. The sequencer's address multiplexer selects the value of the next fetch address from several possible sources. The fetched address enters the instruction pipeline, made up of the fetch address register, decode address register, and program counter (PC). These contain the 24-bit addresses of the instructions currently being fetched, decoded, and executed. The PC couples with the PC stack, which stores return addresses and top-of-loop addresses. All addresses generated by the sequencer are 24-bit program memory instruction addresses.
To manage events, the sequencer's interrupt controller handles interrupt processing, determines whether an interrupt is masked, and generates the appropriate interrupt vector address.
With selective caching, the instruction cache lets the processor access data in program memory and fetch an instruction (from the cache) in the same cycle. The DAG2 data address generator outputs program memory data addresses.
[Figure 3-1 diagram: six panels of program flow. Linear flow (instructions at addresses N through N+5), Loop (DO UNTIL, instructions repeated N times), Jump (JUMP to another instruction sequence), Subroutine (CALL to a routine ending in RTS), Interrupt (IRQ vectors to a routine ending in RTI), and Idle (IDLE instruction, waiting for IRQ).]
Figure 3-1. Program Flow Variations
The sequencer evaluates conditional instructions and loop termination conditions by using information from the status registers. The loop address stack and loop counter stack support nested loops. The status stack stores status registers for implementing nested interrupt routines.
Table 3-1 and Table 3-2 list the registers within and related to the program sequencer. All registers in the program sequencer are universal registers, so they are accessible to other universal registers and to data memory. All the sequencer's registers and the tops of stacks are readable, and all these registers are writable, except for the fetch address, decode address, and PC. Pushing or popping the PC stack is done with a write to the PC stack pointer, which is readable and writable. Pushing or popping the loop address stack requires explicit instructions.
[Figure 3-2 diagram (sequencer block diagram): system registers (MODE1, MODE2, STKYx, STKYy, ASTATx, ASTATy, USTAT1 through USTAT4), timer (TPERIOD, TCOUNT, TIMEXP), condition logic, interrupt controller (interrupt latch IRPTL, interrupt mask IMASK, interrupt mask pointer IMASKP, interrupt vector), instruction cache, instruction pipeline (fetch address FADDR, decode address DADDR, program counter PC), program counter stack (top of PC stack PCSTK, PC stack pointer PCSTKP), and the next-address multiplexer with its sources (next address for linear flow, PC-relative address, direct branch, indirect branch, repeated address for IDLE, top of PC stack, interrupt vector), connected to the DM data bus, PM address bus, and PM data bus.]
A set of system control registers configures or provides input to the sequencer. These registers appear across the top and within the interrupt controller shown in Figure 3-2. A bit manipulation instruction permits setting, clearing, toggling, or testing specific bits in the system registers. For information on this instruction (Bit), see the ADSP-21160 SHARC DSP Instruction Set Reference.
Writes to some of these registers do not take effect on the next cycle. For example, after a write to the MODE1 register to enable ALU saturation mode, the change does not take effect until two cycles after the write. Also, some of these registers do not update on the cycle immediately following a write. An extra cycle is required before a read of the register returns the new value.
With the lists of sequencer and system registers, Table 3-1 and Table 3-2 summarize the number of extra cycles (latency) for a write to take effect (effect latency) and for a new value to appear in the register (read latency). A 0 indicates that the write takes effect or appears in the register on the next cycle after the write instruction is executed, and a 1 indicates one extra cycle.
Table 3-1. Program Sequencer Registers Read and Effect Latencies
Register    Contents                                        Bits   Read Latency   Effect Latency
FADDR       fetch address                                   24     -              -
DADDR       decode address                                  24     -              -
PC          execute address                                 24     -              -
PCSTK       top of PC stack                                 24     0              0
PCSTKP      PC stack pointer                                5      1              1
LADDR       top of loop address stack                       32     0              0
CURLCNTR    top of loop count stack (current loop count)    32     0              0
LCNTR                                                              0              0

Register    Contents                                        Bits
MODE1       mode control bits                               32
MODE2       mode control bits                               32
IRPTL       interrupt latch                                 32
IMASK       interrupt mask                                  32
IMASKP      interrupt mask pointer (for nesting)            32
MMASK       mode mask                                       32
FLAGS       flag inputs                                     32
LIRPTL      link port interrupt latch/mask                  32
ASTATX      arithmetic status flags                         32
ASTATY      arithmetic status flags                         32
STKYX       sticky status flags                             32
STKYY       sticky status flags                             32
USTAT1      user-defined status flags                       32
USTAT2      user-defined status                             32
USTAT3      user-defined status                             32
USTAT4      user-defined status                             32
The following sections in this chapter explain how to use each of the functional blocks in Figure 3-2:
• "Instruction Pipeline" on page 3-7
• "Instruction Cache" on page 3-8
• "Branches and Sequencing" on page 3-13
• "Loops and Sequencing" on page 3-22
• "Interrupts and Sequencing" on page 3-34
• "Timer and Sequencing" on page 3-50
• "Stacks and Sequencing" on page 3-52
• "Conditional Sequencing" on page 3-53
• "SIMD Mode and Sequencing" on page 3-57
Instruction Pipeline
The program sequencer determines the next instruction address by examining both the current instruction being executed and the current state of the processor. If no conditions require otherwise, the processor executes instructions from program memory in sequential order by incrementing the fetch address. Using its instruction pipeline, the processor processes instructions in three clock cycles:
• Fetch cycle. The processor reads the instruction from either the on-chip instruction cache or from program memory.
• Decode cycle. The processor decodes the instruction, generating conditions that control instruction execution.
• Execute cycle. The processor executes the instruction; the operations specified by the instruction complete in a single cycle.
These cycles overlap in the pipeline, as shown in Table 3-3. In sequential program flow, when one instruction is being fetched, the instruction fetched in the previous cycle is being decoded, and the instruction fetched two cycles before is being executed. Sequential program flow always has a throughput of one instruction per cycle.
Table 3-3. Pipelined Execution Cycles
Cycles     1      2      3      4      5
Fetch      0x08   0x09   0x0A   0x0B   0x0C
Decode            0x08   0x09   0x0A   0x0B
Execute                  0x08   0x09   0x0A
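The overlap in Table 3-3 follows from a fixed one-stage offset per cycle. The Python sketch below regenerates the table rows for purely sequential flow; the function name is hypothetical, and `None` marks a stage that holds no instruction yet.

```python
def pipeline_table(start_address, cycles):
    """Return (cycle, fetch, decode, execute) rows for sequential program flow."""
    rows = []
    for c in range(cycles):
        fetch = start_address + c
        # Each instruction reaches decode one cycle after fetch,
        # and execute two cycles after fetch.
        decode = start_address + c - 1 if c >= 1 else None
        execute = start_address + c - 2 if c >= 2 else None
        rows.append((c + 1, fetch, decode, execute))
    return rows

for cycle, fetch, decode, execute in pipeline_table(0x08, 5):
    fmt = lambda a: "----" if a is None else f"0x{a:02X}"
    print(cycle, fmt(fetch), fmt(decode), fmt(execute))
```

From cycle 3 onward every stage is occupied, which corresponds to the one-instruction-per-cycle throughput described above.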
Any non-sequential program flow can potentially decrease the processor's instruction throughput. Non-sequential program operations include:
• Program memory data accesses that conflict with instruction fetches
• Jumps
• Subroutine calls and returns
• Interrupts and returns
• Loops
Instruction Cache
Usually, the sequencer fetches an instruction from memory on each cycle. Occasionally, bus constraints prevent some of the data and instructions from being fetched in a single cycle. To alleviate these data flow constraints, the processor has an instruction cache, which appears in Figure 3-2. When the processor executes an instruction that requires data access over the PM data bus, a bus conflict occurs because the sequencer uses the PM data bus for fetching instructions. To avoid these conflicts, the processor caches the instructions whose fetches conflict, reducing delays. Except for enabling or disabling the cache, its operation requires no user intervention. For more information, see "Using the Cache" on page 3-11.
When the processor first encounters a fetch conflict, the processor must wait to fetch the instruction on the following cycle, causing a delay. The processor automatically writes the fetched instruction to the cache to prevent the same delay from happening again. The sequencer checks the instruction cache on every program memory data access. If the instruction needed is in the cache, the instruction fetch from the cache happens in parallel with the program memory data access, without incurring a delay.
Because of the three-stage instruction pipeline, as the processor executes an instruction (at address n) that requires a program memory data access, this execution creates a conflict with the instruction fetch (at address n+2), assuming sequential execution. The cache stores the fetched instruction (n+2), not the instruction requiring the program memory data access.
If the instruction needed to avoid a conflict is in the cache, the cache provides the instruction while the program memory data access is performed. If the needed instruction is not in the cache, the instruction fetch from memory takes place in the cycle following the program memory data access, incurring one cycle of overhead. The fetched instruction is loaded into the cache, if the cache is enabled and not frozen, so that it is available the next time the same conflict occurs.
Figure 3-3 shows a block diagram of the instruction cache. The cache holds 32 instruction-address pairs. These pairs (or cache entries) are arranged into 16 (15-0) cache sets according to the 4 least significant bits (3-0) of their addresses. The two entries in each set (entry 0 and entry 1) each have a valid bit, indicating whether the entry contains a valid instruction. The least recently used (LRU) bit for each set indicates which entry was not used last (0=entry 0 and 1=entry 1).
[Figure 3-3 diagram: cache sets 0 through 15, selected by address bits 3-0 (0000 through 1111). Each set has an LRU bit and two entries (entry 0 and entry 1); each entry holds a valid bit, an instruction, and address bits 23-4.]
Figure 3-3. Instruction Cache Architecture
The cache places instructions in entries according to the 4 LSBs of the instruction's address. When the sequencer checks for an instruction to fetch from the cache, it uses the 4 address LSBs as an index to a cache set. Within that set, the sequencer checks the addresses of the two entries, looking for the needed instruction. If the cache contains the instruction, the sequencer uses the entry and updates the LRU bit (if necessary) to indicate the other entry, the one that did not contain the needed instruction.
When the cache does not contain a needed instruction, the cache loads a new instruction and its address, placing these in the least recently used entry of the appropriate cache set and toggling the LRU bit (if necessary).
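The lookup and replacement rules can be sketched as a two-way set-associative model. This Python sketch is illustrative only: the class and method names are hypothetical, the model stores the full address as the tag rather than bits 23-4, and it ignores the cache enable and freeze controls.

```python
class InstructionCacheModel:
    """Hypothetical model: 16 sets, 2 entries per set, one LRU bit per set."""

    SETS = 16

    def __init__(self):
        # None marks an invalid entry (valid bit cleared).
        self.entries = [[None, None] for _ in range(self.SETS)]
        # lru[s] names the entry of set s that was NOT used last.
        self.lru = [0] * self.SETS

    def fetch(self, address):
        """Return True on a cache hit; on a miss, load the entry and return False."""
        index = address & 0xF              # 4 address LSBs select the set
        ways = self.entries[index]
        for entry in (0, 1):
            if ways[entry] == address:
                self.lru[index] = 1 - entry    # the other entry is now LRU
                return True
        victim = self.lru[index]           # replace the least recently used entry
        ways[victim] = address
        self.lru[index] = 1 - victim
        return False

cache = InstructionCacheModel()
# Three addresses that share set 3 cannot all stay resident in a 2-entry set.
for address in (0x103, 0x103, 0x203, 0x213, 0x103):
    print(hex(address), "hit" if cache.fetch(address) else "miss")
```

The final access misses because the two intervening loads into the same set displaced the first instruction.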
An example of inefficient cache code appears in Table 3-4. The program memory data access at address 0x0101 in the loop, Outer, causes the cache to load the instruction at 0x0103 (into set 3). Each time the program calls the subroutine, Inner, the program memory data accesses at 0x0201 and 0x0211 displace the instruction at 0x0103 by loading the instructions at 0x0203 and 0x0213 (also into set 3). If the program only calls the Inner subroutine rarely during the Outer loop execution, the repeated cache loads do not greatly influence performance. If the program frequently calls the subroutine while in the loop, the cache inefficiency has a noticeable effect on performance.
To improve cache efficiency on this code (if, for instance, execution of the Outer loop is time-critical), rearrange the order of some instructions. Moving the subroutine up one location (so that it starts at 0x0201) would work here, because with that order the two cached instructions end up in cache set 4 instead of set 3.
Table 3-4. Cache-Inefficient Code
Address    Instruction
0x0100     lcntr=1024, do Outer until LCE;
0x0101     r0=dm(i0,m0), pm(i8,m8)=f3;
0x0102     r1=r0-r15;
0x0103     if eq call (Inner);
0x0104     f2=float r1;
0x0105     f3=f2*f2;
0x0106     Outer: f3=f3+f4;
0x0107     pm(i8,m8)=f3;
...
0x0200     Inner: r1=R13;
0x0201     r14=pm(i9,m9);
...
0x0211     pm(i9,m9)=r12;
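The set conflict in this example can be checked with a few lines of arithmetic. The Python sketch below is illustrative; the function names are hypothetical, and it assumes sequential execution, so that the fetch conflicting with a program memory data access at address n is the one at n+2.

```python
def cached_fetch_address(pm_access_address):
    """Address of the instruction fetch that conflicts with the PM data access."""
    return pm_access_address + 2   # assumes sequential execution

def cache_set(address):
    """Cache set selected by the 4 least significant address bits."""
    return address & 0xF

original = [0x0101, 0x0201, 0x0211]   # PM data accesses as listed in the table
relocated = [0x0101, 0x0202, 0x0212]  # Inner moved up by one location

print([cache_set(cached_fetch_address(a)) for a in original])
print([cache_set(cached_fetch_address(a)) for a in relocated])
```

In the original layout all three cached instructions compete for the two entries of set 3; after relocation the subroutine's two instructions fall in set 4.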
There are a number of parameters that can be specified for branches:
• JUMP and CALL/return instructions can be conditional. The program sequencer can evaluate status conditions to decide whether to execute a branch. If no condition is specified, the branch is always taken. For more information on these conditions, see "Conditional Sequencing" on page 3-53.
• JUMP and CALL/return instructions can be immediate or delayed. Because of the instruction pipeline, an immediate branch incurs two lost (overhead) cycles. A delayed branch has no overhead. For more information, see "Delayed Branches" on page 3-15.
• JUMP instructions that appear within a loop or within an interrupt service routine have additional options. For information on the loop abort (LA) option, see "Loops and Sequencing" on page 3-22. For information on the loop re-entry (LR) option, see "Restrictions on Ending Loops" on page 3-25. For information on the clear interrupt (CI) option, see "Interrupts and Sequencing" on page 3-34.
The sequencer block diagram in Figure 3-2 on page 3-4 shows that branches can be direct or indirect. The difference is that the sequencer generates the address for a direct branch, and the PM data address generator (DAG2) produces the address for an indirect branch.
Direct branches are JUMP or CALL/return instructions that use an absolute address (one that does not change at runtime, such as a program label) or use a PC-relative address. Some instruction examples that cause a direct branch are:
JUMP fft1024;    /* where fft1024 is an address label */
CALL (pc,10);    /* where (pc,10) is a PC-relative address */
Indirect branches are JUMP or CALL/return instructions that use a dynamic address that comes from the PM data address generator. For more information on the data address generator, see Chapter 4, Data Address Generator. Some instruction examples that cause an indirect branch are:
JUMP (m8,i12);    /* where (m8,i12) are DAG2 registers */
CALL (m9,i13);    /* where (m9,i13) are DAG2 registers */
Conditional Branches
The sequencer supports conditional branches. These are JUMP or CALL/return instructions whose execution is based on testing an IF condition. For more information on condition types in IF condition instructions, see "Conditional Sequencing" on page 3-53. Note that the processor's Single-Instruction, Multiple-Data mode influences the execution of conditional branches. For more information, see "SIMD Mode and Sequencing" on page 3-57.
Delayed Branches
The instruction pipeline influences how the sequencer handles branches. For immediate branches in which JUMPs and CALL/return instructions are not specified as delayed branches (DB), two instruction cycles are lost (NOPs) as the pipeline empties and refills with instructions from the new branch. As shown in Table 3-5 and Table 3-6, the processor does not execute the two instructions after the branch, which are in the fetch and decode stages. For a CALL, the decode address (the address of the instruction after the CALL) is the return address. During the two lost (no-operation) cycles, the pipeline fetches and decodes the first instruction at the branch address.
Notes for Table 3-5: n is the branching instruction, and j is the instruction branch address.
1. n+1 suppressed
2. For call, n+1 pushed on PC stack
3. n+2 suppressed
Notes for Table 3-6: n is the branching instruction, and r is the instruction branch address.
1. n+1 suppressed
2. r (n+1 in Table 3-5) popped from PC stack
3. n+2 suppressed
For delayed branches (JUMP and CALL/return instructions with the delayed branch (DB) modifier), no instruction cycles are lost in the pipeline, because the processor executes the two instructions after the branch while the pipeline fills with instructions from the new branch. As shown in Table 3-7 and Table 3-8, the processor executes the two instructions after the branch, while the instruction at the branch address is fetched and decoded. In the case of a CALL, the return address is the third address after the branch instruction. While delayed branches use the instruction pipeline more efficiently than immediate branches, note that delayed branch code can be harder to understand because of the instructions between the branch instruction and the actual branch.
ADSP-21161 SHARC Processor Hardware Reference
Table 3-7. Pipelined Execution Cycles for Delayed Branch (JUMP or CALL)
Cycles     1      2       3      4
Fetch      n+2    j (1)   j+1    j+2
Decode     n+1    n+2     j      j+1
Execute    n      n+1     n+2    j
Note that n is the branching instruction, and j is the instruction branch address.
1. For call, n+3 pushed on PC stack
Notes for Table 3-8: n is the branching instruction, and r is the instruction branch address.
1. r (n+3 in Table 3-7) popped from PC stack
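The difference between immediate and delayed branches shows up in the execute stage. The Python sketch below is an illustrative model with hypothetical function names; the immediate-branch trace is based on the statement that the two instructions after the branch are suppressed and replaced by no-operation cycles.

```python
def execute_trace(delayed, cycles=4):
    """Instructions reaching the execute stage for a branch at n to address j."""
    if delayed:
        trace = ["n", "n+1", "n+2"]   # the two following instructions execute
    else:
        trace = ["n", "nop", "nop"]   # n+1 and n+2 are suppressed
    offset = 0
    while len(trace) < cycles:
        trace.append("j" if offset == 0 else f"j+{offset}")
        offset += 1
    return trace[:cycles]

def call_return_address(delayed):
    """Address pushed on the PC stack by a CALL at address n."""
    return "n+3" if delayed else "n+1"

print(execute_trace(delayed=False), call_return_address(delayed=False))
print(execute_trace(delayed=True), call_return_address(delayed=True))
```

Both traces reach the branch target in the fourth cycle; the delayed form does useful work in the two intervening cycles.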
Besides being more challenging to code, delayed branches impose some limitations that stem from the instruction pipeline architecture. Because the delayed branch instruction and the two instructions that follow it must execute sequentially, the instructions in the two locations that follow a delayed branch instruction cannot be any of the following:
• Other branches (no JUMP, CALL, or return instructions)
• Any stack manipulations (no PUSH or POP instructions or writes to the PC stack or PC stack pointer)
• Any loops or other breaks in sequential operation (no DO/UNTIL or IDLE instructions)
Development software for the processor should always flag these types of instructions as code errors in the two locations after a delayed branch instruction.
It is possible to follow a delayed branch instruction with a JUMP, CALL, or return instruction in one special case. If the sequential branch instructions use mutually exclusive conditions, one branch may follow another. The following example is valid.
if gt jump (PC, 7) (db);    // if greater than...
if le jump (PC,11) (db);    // if less than or equal...
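The delay-slot rules above lend themselves to a mechanical check. The following Python sketch is a rough illustration of such a check, not the behavior of any real assembler: the function names, the regular expressions, and the small table of mutually exclusive condition pairs (only GT/LE and EQ/NE, the pairs used in this chapter's examples) are all assumptions. It works on plain instruction text and does not strip comments.

```python
import re

BRANCH = re.compile(r"\b(jump|call|rts|rti)\b", re.IGNORECASE)
DELAYED = re.compile(r"\([^)]*\bdb\b[^)]*\)", re.IGNORECASE)
# PUSH/POP, DO/UNTIL, IDLE, or a write to the PC stack or its pointer
OTHER_BREAKS = re.compile(r"\b(push|pop|do|idle)\b|\bpcstkp?\s*=", re.IGNORECASE)
CONDITION = re.compile(r"^\s*if\s+(\w+)", re.IGNORECASE)
# Assumed table of mutually exclusive condition pairs (illustrative only)
EXCLUSIVE = {frozenset({"gt", "le"}), frozenset({"eq", "ne"})}


def condition(instr):
    """Return the lowercase condition code of an 'if <cond> ...' line, else None."""
    match = CONDITION.match(instr)
    return match.group(1).lower() if match else None


def is_delayed_branch(instr):
    return bool(BRANCH.search(instr) and DELAYED.search(instr))


def check_delay_slots(instrs):
    """Return sorted indices of instructions that are illegal in a delay slot."""
    flagged = set()
    for i, instr in enumerate(instrs):
        if not is_delayed_branch(instr):
            continue
        for k in (i + 1, i + 2):          # the two locations after the branch
            if k >= len(instrs):
                break
            slot = instrs[k]
            if BRANCH.search(slot):
                pair = frozenset({condition(instr), condition(slot)})
                if pair not in EXCLUSIVE:  # allowed only if mutually exclusive
                    flagged.add(k)
            elif OTHER_BREAKS.search(slot):
                flagged.add(k)
    return sorted(flagged)


print(check_delay_slots(["if gt jump (PC, 7) (db);", "if le jump (PC,11) (db);"]))
print(check_delay_slots(["jump foo(db);", "jump my(db);", "r0=r0+r1;", "r1=r1+r2;"]))
```

The first call reports nothing because the two conditions are mutually exclusive; the second flags the unconditional jump that sits in the first delay slot.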
Interrupt processing is also influenced by delayed branches and the instruction pipeline. Because the delayed branch instruction and the two instructions that follow it must execute sequentially, the processor does not immediately process an interrupt that occurs in between a delayed branch instruction and either of the two instructions that follow. Any interrupt that occurs during these instructions is latched, but is not processed until the branch is complete.
During a delayed branch, a program can read the PC stack or PC stack pointer immediately after a delayed call or return. This read shows that the return address on the PC stack has already been pushed or popped, even though the branch has not occurred yet.

Restrictions and Limitations When Using Delayed Branches

Besides being more challenging to code, delayed branches impose some limitations that stem from the instruction pipeline architecture. Because the delayed branch instruction and the two instructions that follow it must execute sequentially, the instructions in the two locations that follow a delayed branch instruction cannot be any of those described in the following five sections. Development software for the ADSP-21161 processor should always flag the operations described in the next five sections as code errors in the two locations after a delayed branch instruction.

Normally it is not valid for two conditional instructions that use the (DB) option to follow each other. Execution is allowed, however, when these instructions are mutually exclusive, as shown below.
if gt jump (PC, 7) (db);
if le jump (PC, 11) (db);
Other Jumps, or Calls with RTI, RTS

These instructions cannot be used when they follow a delayed branch instruction. This is shown in the following code that uses the JUMP instruction.
jump foo(db);
jump my(db);
r0=r0+r1;
r1=r1+r2;
In this case, the instruction r1=r1+r2, which follows the second jump, is not executed. Further, control jumps to my instead of foo, and the instruction at foo is executed as a delayed branch instruction of the second jump.

The exception is a pair of JUMP instructions that use mutually exclusive conditions, such as EQ (equal) and NE (not equal). If the first (EQ) condition evaluates true, then the NE conditional jump has no meaning and is the same as a NOP instruction. Code samples for these conditions are shown below.
if eq jump label1 (db);
if ne jump label1 (db);
nop;
nop;
Pushes or Pops of the PC Stack

In this case a push of the PC stack in a delayed branch is followed by a pop. If a value is pushed in the delayed branch of a call, it is first popped in the called subroutine. This is followed by an RTS instruction.
call foo (db);    /* first push because of call */
push PCSTK;       /* second push due to PCSTK */
nop;
foo;
This example shows that when a program pushes the PCSTK during a delay slot, the PC stack pointer is pushed onto the PCSTK. The following instructions are executed prior to executing the RTS.

pop PCSTK;
RTS (db);
nop;
nop;
If pushing the PC stack, a stack pop must be performed first. This is followed by an RTS instruction. If a value is popped inside a delayed branch, whatever subroutine return address is pushed is popped back, which is not allowed.

Writes to the PC Stack or PC Stack Pointer

The following two situations may arise when programs attempt to write to the PC stack inside a delayed branch.

1. Write to the PC stack inside a jump. If programs write into the PC stack inside a jump, one of the following situations can occur.
   a. The PC stack holds a value that has already been pushed onto it. When a program writes to the stack in this state, the original value is overwritten by the new value, so the original value is corrupted.
   b. The PC stack is empty. Programs cannot write to the PC stack when they are inside a jump. In this case the PC stack remains empty.
2. Write to the PC stack inside a call. If a program writes to the PC stack inside of a call, the value that is pushed onto the PC stack because of that call is overwritten by the value written onto the PC stack. Therefore, when a program performs an RTS, the program returns to the address written onto the PC stack and not to the address pushed while branching to the subroutine. For example:
The value 90114 is pushed onto the PC stack, while the value 9011C is written to the PC stack. Accordingly, the value 90114 is overwritten by the value 9011C in the PC stack, because the value written to the stack replaces the value that the call pushed. Therefore, when the program comes back by executing an RTS, the return is to address 9011C and not to 90114.

IDLE Instruction

An interrupt is needed to come out of the IDLE instruction. If a program places an IDLE instruction inside the delayed branch, the processor remains in the idled state because interrupts are latched but not serviced until the program exits a delayed branch.
The DO/UNTIL instruction uses the sequencer's loop and condition features, which appear in Figure 3-2 on page 3-4. These features provide efficient software loops without the overhead of additional instructions to branch, test a condition, or decrement a counter. The following code example shows a DO/UNTIL loop that contains three instructions and iterates 30 times.
LCNTR=30, DO the_end UNTIL LCE;    /*Loop iterates 30 times*/
R0=DM(I0,M0), F2=PM(I8,M8);
R1=R0-R15;
the_end: F4=F2+F3;                 /*Last instruction in loop*/
When executing a DO/UNTIL instruction, the program sequencer pushes the address of the loop's last instruction and the loop's termination condition onto the loop address stack. The sequencer also pushes the top-of-loop address (the address of the instruction following the DO/UNTIL instruction) onto the PC stack.

The sequencer's instruction pipeline architecture influences loop termination. Because instructions are pipelined, the sequencer must test the termination condition (and, if the loop is counter-based, decrement the counter) before the end of the loop. Based on the test's outcome, the next fetch either exits the loop or returns to the top-of-loop. The condition test occurs when the processor is executing the instruction two locations before the last instruction in the loop (at location e-2, where e is the end-of-loop address). If the condition tests false, the sequencer repeats the loop, fetching the instruction from the top-of-loop address, which is stored on the top of the PC stack. If the condition tests true, the sequencer terminates the loop, fetching the next instruction after the end of the loop and popping the loop and PC stacks.
A special case of loop termination is the loop abort instruction, JUMP (LA). This instruction causes an automatic loop abort when it occurs inside a loop. When the loop aborts, the sequencer pops the PC and loop address stacks once. If the aborted loop was nested, the single pop of the stacks leaves the correct values in place for the outer loop. Table 3-9 and Table 3-10 show the pipeline states for loop iteration and termination.

Table 3-9. Pipelined Execution Cycles for Loop Back (Iteration)

Cycles    1         2       3      4
Fetch     e         b [2]   b+1    b+2
Decode    e-1       e       b      b+1
Execute   e-2 [1]   e-1     e      b

Note that e is the loop end instruction, and b is the loop start instruction.
1. Termination condition tests false
2. Loop start address is top of PC stack
Note that e is the loop end instruction. 1. Termination condition tests true 2. Loop aborts and loop stacks pop
Note: n is the loop start instruction, and n+2 is the instruction after the loop. 1. Loop count (LCNTR) equals 3 2. No opcode latch or fetch address update; count expired tests true 3. Loop iteration aborts; PC and loop stacks pop
Table 3-12. Pipelined Execution Cycles for Single Instruction Counter-Based Loop With Two Iterations (Two Overhead Cycles)

Cycles   1     2         3                4                5     6
Fetch    n+2   n+1 [2]   n+1 [3]          n+2              n+3   n+4
Decode   n+1   n+1       n+1 -> nop [4]   n+1 -> nop [5]   n+2   n+3

Note: n is the loop start instruction, and n+2 is the instruction after the loop.
1. Loop count (LCNTR) equals 2
2. No opcode latch or fetch address update
3. Count expired tests true
4. Loop iteration aborts; PC and loop stacks pop; n+1 suppressed
5. n+1 suppressed
Table 3-13. Pipelined Execution Cycles for Two Instruction Counter-Based Loop With Two Iterations

Cycles    1       2              3              4              5              6
Fetch     n+2     n+1 [2]        n+2 [3]        n+3 [4]        n+4            n+5
Decode    n+1     n+2            n+1            n+2            n+3            n+4
Execute   n [1]   n+1 (pass 1)   n+2 (pass 1)   n+1 (pass 2)   n+2 (pass 2)   n+3

Note: n is the loop start instruction, and n+3 is the instruction after the loop.
1. Loop count (LCNTR) equals 2
2. PC stack supplies loop start address
3. Count expired tests true
4. Loop iteration aborts; PC and loop stacks pop
Table 3-14. Pipelined Execution Cycles for Two Instruction Counter-Based Loop With One Iteration (Two Overhead Cycles)

Cycles   1     2         3                4                5     6
Fetch    n+2   n+1 [2]   n+2 [3]          n+3              n+4   n+5
Decode   n+1   n+2       n+1 -> nop [4]   n+2 -> nop [5]   n+3   n+4

Note: n is the loop start instruction, and n+3 is the instruction after the loop.
1. Loop count (LCNTR) equals 1
2. PC stack supplies loop start address
3. Count expired tests true
4. Loop iteration aborts; PC and loop stacks pop; n+1 suppressed
5. n+2 suppressed
Processing of an interrupt is delayed by one cycle if the interrupt occurs during any of the following:

• The last iteration of a one-instruction loop that executes once or twice
• The last iteration of a two-instruction loop that executes once
• The cycle that follows one of these loops (which is an NOP)

Similarly, in a one-instruction loop that iterates at least three times, processing is delayed by one cycle if the interrupt occurs during the third-to-last iteration. For more information on pipeline execution during interrupts, see Interrupts and Sequencing on page 3-34.
Short non-counter-based loops terminate differently from short counter-based loops. These differences stem from the architecture of the pipeline and conditional logic:

• In a three-instruction non-counter-based loop, the sequencer tests the termination condition when the processor executes the top of loop instruction. When the condition tests true, the sequencer completes the iteration of the loop and terminates.
• In a two-instruction non-counter-based loop, the sequencer tests the termination condition when the processor executes the last (second) instruction. If the condition becomes true when the first instruction is executed, the condition tests true during the second instruction, and the sequencer completes one more iteration of the loop before exiting. If the condition becomes true during the second instruction, the sequencer completes two more iterations of the loop before exiting.
• In a one-instruction non-counter-based loop, the sequencer tests the termination condition every cycle. After the cycle when the condition becomes true, the sequencer completes three more iterations of the loop before exiting.
The sequencer pushes an entry onto the loop address stack when executing a DO/UNTIL or PUSH loop instruction. The stack entry pops off the stack two instructions before the end of its loop's last iteration or on a POP loop instruction. A stack overflow occurs if a seventh entry (one more than full) is pushed onto the loop stack. The stack is empty when no entries are occupied.

The loop stack's overflow or empty status is available. Because the sequencer keeps the loop stack and loop counter stack synchronized, the same overflow and empty flags apply to both stacks. These flags are in the sticky status register (STKYx). For more information on STKYx, see Table A-5 on page A-19. For more information on how these flags work with the loop stacks, see Loop Counter Stack on page 3-30. Note that a loop stack overflow causes a maskable interrupt.

Because the sequencer tests the termination condition two instructions before the end of the loop, the loop stack pops before the end of the loop's final iteration. If a program reads LADDR at either of these instructions, the value is already the termination address for the next loop stack entry.
register indicate the loop counter stack full and empty states. Table A-5 on page A-19 lists the bits in the STKYx register. The STKYx bits that indicate loop counter stack status are:

• Loop stacks overflowed. Bit 25 (LSOV) indicates that the loop counter stack and loop stack are overflowed (if 1) or not overflowed (if 0). A sticky bit.
• Loop stacks empty. Bit 26 (LSEM) indicates that the loop counter stack and loop stack are empty (if 1) or not empty (if 0). Not sticky, cleared by a PUSH.

Within the sequencer, the current loop counter (CURLCNTR) and loop counter (LCNTR) registers allow access to the loop counter stack. CURLCNTR tracks iterations for a loop being executed, and LCNTR holds the count value before the loop is executed. The two counters let the processor maintain the count for an outer loop, while a program is setting up the count for an inner loop.

The top entry in the loop counter stack (CURLCNTR) always contains the current loop count. This register is readable and writable over the DM Data bus. Reading CURLCNTR when the loop counter stack is empty returns the value 0xFFFF FFFF.

The sequencer decrements the value of CURLCNTR for each loop iteration. Because the sequencer tests the termination condition two instruction cycles before the end of the loop, the loop counter also decrements before the end of the loop. If a program reads CURLCNTR at either of the last two loop instructions, the value is already the count for the next iteration.

The loop counter stack pops two instructions before the end of the last loop iteration. When the loop counter stack pops, the new top entry of the stack becomes the CURLCNTR value (the count in effect for the executing loop). If there is no executing loop, the value of CURLCNTR is 0xFFFF FFFF after the pop.
Writing CURLCNTR does not cause a stack push. If a program writes a new value to CURLCNTR, the program changes the count value of the loop currently executing. When no DO/UNTIL LCE loop is executing, writing to CURLCNTR has no effect. Because the processor must use CURLCNTR to perform counter-based loops, some restrictions apply to how a program can write CURLCNTR. For more information, see Restrictions on Ending Loops on page 3-25.

The next-to-top entry in the loop counter stack (LCNTR) is the location on the stack that takes effect on the next loop stack push. To set up a count value for a nested loop without changing the count for the currently executing loop, a program writes the count value to LCNTR. A value of zero in LCNTR causes a loop to execute 2^32 times.

A DO/UNTIL LCE instruction pushes the value of LCNTR onto the loop count stack, making that value the new CURLCNTR value. Figure 3-4 demonstrates this process for a set of nested loops. The previous CURLCNTR value is preserved one location down in the stack.

If a program reads LCNTR when the loop counter stack is full, the stack returns invalid data. When the loop counter stack is full, the stack discards any data written to LCNTR. If a program reads LCNTR during the last two instructions of a terminating loop, the value of LCNTR is the last CURLCNTR value for the loop.
[Figure: loop counter stack contents (entries AAAA AAAA through FFFF FFFF) after successive pushes, with the CURLCNTR and LCNTR locations marked at each step; CURLCNTR reads 0xFFFF FFFF when the stack is empty.]
Figure 3-4. Pushing the Loop Counter Stack for Nested Loops
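The relationship between LCNTR, CURLCNTR, and the stack flags can be pictured with a small software model. The Python sketch below is illustrative only and makes several assumptions that the text does not state: the class and method names are invented, LCNTR is taken to start at zero, and a push onto a full stack is modeled as setting LSOV without storing the entry. The six-entry depth, the 0xFFFF FFFF empty value, the discarded LCNTR writes, and the 2^32 iteration count for zero come from the description above.

```python
EMPTY_VALUE = 0xFFFFFFFF   # CURLCNTR reads this when the stack is empty
DEPTH = 6                  # a seventh push overflows the stack


class LoopCounterStack:
    """Toy model of the loop counter stack (names are hypothetical)."""

    def __init__(self):
        self.entries = []   # entries[-1] plays the role of CURLCNTR
        self.lcntr = 0      # next-to-top location, used by the next push
        self.lsov = False   # sticky overflow flag (STKYx bit 25)

    @property
    def lsem(self):
        """Empty flag (STKYx bit 26); not sticky."""
        return not self.entries

    @property
    def curlcntr(self):
        return self.entries[-1] if self.entries else EMPTY_VALUE

    def write_lcntr(self, value):
        # Writes to LCNTR are discarded when the stack is full.
        if len(self.entries) < DEPTH:
            self.lcntr = value & 0xFFFFFFFF

    def do_until_lce(self):
        """Model the push performed by a DO/UNTIL LCE instruction."""
        if len(self.entries) == DEPTH:
            self.lsov = True   # assumption: the overflowing entry is dropped
            return
        self.entries.append(self.lcntr)

    def pop(self):
        if self.entries:
            self.entries.pop()


def iterations(lcntr):
    """Number of iterations a counter-based loop performs for a given LCNTR."""
    return 2**32 if lcntr == 0 else lcntr


stack = LoopCounterStack()
stack.write_lcntr(30)      # outer loop count
stack.do_until_lce()
stack.write_lcntr(5)       # inner count is set up without disturbing CURLCNTR
print(stack.curlcntr)      # still the outer loop's count
stack.do_until_lce()
print(stack.curlcntr)      # now the inner loop's count
```

Writing LCNTR while the outer loop runs leaves CURLCNTR untouched, which is the property that lets a program prepare a nested loop's count in advance.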
To process an interrupt, the processor's program sequencer does the following:

1. Outputs the appropriate interrupt vector address
2. Pushes the current PC value (the return address) onto the PC stack
3. Pushes the current value of the ASTATx,y and MODE1 registers onto the status stack (if the interrupt is IRQ2-0, timer, or VIRPT)
4. Sets the appropriate bit in the interrupt latch register (IRPTL)
5. Alters the interrupt mask pointer (IMASKP) to reflect the current interrupt nesting state, depending on the nesting mode

At the end of the interrupt service routine, the sequencer processes the return from interrupt (RTI) instruction and does the following:

1. Returns to the address stored at the top of the PC stack
2. Pops this value off of the PC stack
3. Pops the status stack (if the ASTATx,y and MODE1 status registers were pushed for the IRQ2-0, timer, or VIRPT interrupt)
4. Clears the appropriate bit in the interrupt latch register (IRPTL) and interrupt mask pointer (IMASKP)

Except for reset, all interrupt service routines should end with a return-from-interrupt (RTI) instruction. After reset, the PC stack is empty, so there is no return address. The last instruction of the reset service routine should be a jump to the start of your program.

If software writes to a bit in IRPTL forcing an interrupt, the processor recognizes the interrupt in the following cycle, and two cycles of branching to the interrupt vector follow the recognition cycle.
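The entry and RTI steps listed above mirror each other, which a small state model makes easy to see. The following Python sketch is a simplified illustration, not a description of the hardware: the class name, the source labels, the bit number, and the vector address used in the example are all hypothetical, and the private list that remembers which interrupt is in service is bookkeeping for the model only.

```python
# Sources that cause an automatic status stack push (IRQ2-0, timer, VIRPT).
STATUS_PUSH_SOURCES = {"IRQ0", "IRQ1", "IRQ2", "TIMER", "VIRPT"}


class SequencerModel:
    """Toy model of interrupt entry and return (names are hypothetical)."""

    def __init__(self):
        self.pc = 0
        self.astatx = self.astaty = self.mode1 = 0
        self.irptl = 0
        self.imaskp = 0
        self.pc_stack = []
        self.status_stack = []
        self._in_service = []   # model bookkeeping, not a hardware register

    def enter_interrupt(self, source, bit, vector):
        self.pc_stack.append(self.pc)               # push the return address
        pushed = source in STATUS_PUSH_SOURCES
        if pushed:                                   # push ASTATx,y and MODE1
            self.status_stack.append((self.astatx, self.astaty, self.mode1))
        self.irptl |= 1 << bit                       # latch bit
        self.imaskp |= 1 << bit                      # nesting state
        self._in_service.append((bit, pushed))
        self.pc = vector                             # branch to the vector

    def rti(self):
        bit, pushed = self._in_service.pop()
        self.pc = self.pc_stack.pop()                # return and pop PC stack
        if pushed:                                   # restore status if pushed
            self.astatx, self.astaty, self.mode1 = self.status_stack.pop()
        self.irptl &= ~(1 << bit)                    # clear latch bit
        self.imaskp &= ~(1 << bit)                   # clear mask pointer bit


model = SequencerModel()
model.pc = 0x100
model.mode1 = 0x1000
model.enter_interrupt("TIMER", bit=4, vector=0x40)   # hypothetical bit and vector
model.mode1 = 0                                      # the routine alters MODE1
model.rti()
print(hex(model.pc), hex(model.mode1))
```

After the RTI, both the PC and MODE1 hold their pre-interrupt values, reflecting the automatic restore that applies to IRQ2-0, timer, and VIRPT interrupts.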
The processor responds to interrupts in three stages: synchronization and latching (1 cycle), recognition (1 cycle), and branching to the interrupt vector (2 cycles). Table 3-15, Table 3-16, and Table 3-17 show the pipelined execution cycles for interrupt processing.

Table 3-15. Pipelined Execution Cycles for Interrupt During Single-Cycle Instruction

Cycles    1         2                3                4     5
Fetch     n+1       n+2 [2]          v [4]            v+1   v+2
Decode    n         n+1 -> nop [3]   n+2 -> nop [5]   v     v+1
Execute   n-1 [1]   n                nop              nop   v

Note that n is the single-cycle instruction, and v is the interrupt vector instruction.
1. Interrupt occurs
2. Interrupt recognized
3. n+1 pushed on PC stack; n+1 suppressed
4. Interrupt vector output
5. n+2 suppressed
Table 3-16. Pipelined Execution Cycles for Interrupt During Instruction With Conflicting PM Data Access (Instruction Not Cached)

Cycles   1     2       3         4       5     6
Fetch    n+1   - [2]   n+2 [4]   v [6]   v+1   v+2

Note that n is the conflicting instruction, and v is the interrupt vector instruction.
1. Interrupt occurs
2. Interrupt recognized, but not processed; PM data access
3. n+1 suppressed
4. Interrupt processed
5. n+1 suppressed
6. Interrupt vector output
7. n+1 pushed on PC stack; n+2 suppressed
Table 3-17. Pipelined Execution Cycles for Interrupt During Delayed Branch Instruction
Cycles 1 2 3 4 5 6 7 Fetch n+1 n+22 j j+1 v5 v+1 v+2
3
j+1nop6 v v+1
Note that n is the delayed branch instruction, j is the instruction at the branch address, and v is the interrupt vector instruction.
1. Interrupt occurs
2. Interrupt recognized, but not processed
3. Interrupt processed
4. For a Call, n+3 (return address) is pushed onto the PC stack; j suppressed
5. Interrupt vector output
6. j pushed on PC stack; j+1 suppressed
For most interrupts, internal and external, only one instruction is executed after the interrupt occurs (and before the two instructions are aborted) while the processor fetches and decodes the first instruction of the service routine. Because of the one-cycle delay between an arithmetic exception and the STKYx,y register update, interrupt processing starts two cycles after an arithmetic exception occurs. Table 3-18 lists the latency associated with the IRQ2-0 interrupts and the multiprocessor vector interrupt.

Table 3-18. Minimum Latency of the IRQ2-0 and VIRPT Interrupts

Interrupt   Minimum Latency
IRQ2-0      3 cycles
VIRPT       6 cycles
If nesting is enabled and a higher priority interrupt occurs immediately after a lower priority interrupt, the service routine of the higher priority interrupt is delayed by one additional cycle. This delay allows the first instruction of the lower priority interrupt routine to be executed before it is interrupted. For more information, see Nesting Interrupts on page 3-45.

Certain processor operations that span more than one cycle hold off interrupt processing. If an interrupt occurs during one of these operations, the processor latches the interrupt, but delays its processing. The operations that have delayed interrupt processing are as follows:

• A branch (JUMP or CALL/return) instruction and the following cycle, whether it is an instruction (in a delayed branch) or an NOP (in a non-delayed branch)
• The first of the two cycles used to perform a program memory data access and an instruction fetch when the instruction is not cached
• The third-to-last iteration of a one-instruction loop
• The last iteration of either a one-instruction loop executed once or twice or a two-instruction loop executed once, and the following cycle (which is an NOP)
• The first of the two cycles used to fetch and decode the first instruction of an interrupt service routine
• Any waitstates for external memory accesses
• Any external memory access required when the processor does not have control of the external bus, during a host bus grant or when the processor is a bus slave in a multiprocessing system
Sensing Interrupts
The processor supports two types of interrupt sensitivity (the signal shape that triggers the interrupt). On interrupt pins (IRQ2-0), either the input signal's edge or level can trigger an external interrupt.

The processor detects a level-sensitive interrupt if the signal input is low (active) when sampled on the rising edge of CLKIN. A level-sensitive interrupt must go high (inactive) before the processor returns from the interrupt service routine. If a level-sensitive interrupt is still active when the processor samples it after returning from its service routine, the processor treats the signal as a new request. The processor repeats the same interrupt routine without returning to the main program, assuming no higher priority interrupts are active.

The processor detects an edge-sensitive interrupt if the input signal is high (inactive) on one cycle and low (active) on the next cycle when sampled on the rising edge of CLKIN. An edge-sensitive interrupt signal can stay active indefinitely without triggering additional interrupts. To request another interrupt, the signal must go high, then low again.

Edge-sensitive interrupts require less external hardware compared to level-sensitive requests, because negating the request is unnecessary. An advantage of level-sensitive interrupts is that multiple interrupting devices may share a single level-sensitive request line on a wired-OR basis, allowing easy system expansion.

The MODE2 register controls external interrupt sensitivity. Table A-3 on page A-10 lists all bits in the MODE2 register. The following bits in MODE2 control interrupt sensitivity:

• Interrupt 0 Sensitivity. Bit 0 (IRQ0E) directs the processor to detect IRQ0 as edge-sensitive (if 1) or level-sensitive (if 0).
• Interrupt 1 Sensitivity. Bit 1 (IRQ1E) directs the processor to detect IRQ1 as edge-sensitive (if 1) or level-sensitive (if 0).
• Interrupt 2 Sensitivity. Bit 2 (IRQ2E) directs the processor to detect IRQ2 as edge-sensitive (if 1) or level-sensitive (if 0).

The processor accepts external interrupts that are asynchronous to the processor's clock (CLKIN), allowing external interrupt signals to change at any time. An external interrupt must be held low at least one CLKIN cycle to guarantee that the processor samples the signal. External interrupts must meet the setup and hold time requirements relative to the rising edge of CLKIN. For information on interrupt signal timing requirements, see the processor's Data Sheet.
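The difference between the two sensitivity modes is easiest to see on a sequence of pin samples. The Python sketch below is a simplified illustration under stated assumptions: the function names are invented, each list element stands for the pin level sampled on one rising edge of CLKIN (1 for high/inactive, 0 for low/active), and the model ignores whether a service routine is currently running, so it reports every sample at which a request would be seen.

```python
def irq_is_edge_sensitive(mode2, irq):
    """Read IRQ0E, IRQ1E, or IRQ2E (MODE2 bits 0, 1, 2) for irq = 0, 1, 2."""
    if irq not in (0, 1, 2):
        raise ValueError("irq must be 0, 1, or 2")
    return bool((mode2 >> irq) & 1)


def detect_requests(samples, edge_sensitive):
    """Return the sample indices at which an interrupt request is detected.

    samples: pin level at each rising CLKIN edge (1 = high, 0 = low/active).
    """
    hits = []
    for i, level in enumerate(samples):
        if edge_sensitive:
            # High on one sample and low on the next.
            if i > 0 and samples[i - 1] == 1 and level == 0:
                hits.append(i)
        elif level == 0:
            # Level-sensitive: low whenever sampled.
            hits.append(i)
    return hits


pin = [1, 0, 0, 1, 0]
print(detect_requests(pin, edge_sensitive=True))
print(detect_requests(pin, edge_sensitive=False))
```

With the same waveform, the edge-sensitive mode sees one request per high-to-low transition, while the level-sensitive mode keeps seeing a request for as long as the pin stays low.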
Masking Interrupts
The sequencer supports interrupt masking (latching an interrupt, but not responding to it). Except for the RESET and EMU interrupts, all interrupts are maskable. If a masked interrupt is latched, the processor responds to the latched interrupt if it is later unmasked. Interrupts can be masked globally or selectively. Bits in the MODE1, IMASK, and LIRPTL registers control interrupt masking. Table A-2 on page A-3 lists the bits in MODE1, Table A-9 on page A-27 lists the bits in IMASK, and Table A-10 on page A-34 lists the bits in LIRPTL. These bits control interrupt masking as follows:

• Global interrupt enable. MODE1, Bit 12 (IRPTEN) directs the processor to enable (if 1) or disable (if 0) all interrupts.
• Selective interrupt enable. IMASK, Bits 30-10 and 8-0, direct the processor to enable (if 1) or disable/mask (if 0) the corresponding interrupt.
• Selective link port interrupt enable. LIRPTL, Bits 17-16 (LPxMSK) direct the processor to enable (if 1) or disable/mask (if 0) the corresponding link port interrupt.
• SPI port interrupt enable. LIRPTL, Bit 18 (SPIRMSK) and Bit 19 (SPITMSK) direct the processor to enable (if 1) or disable/mask (if 0) the SPI port receive interrupt or transmit interrupt, respectively.

Except for the non-maskable interrupts and boot interrupts, all interrupts are masked at reset. For booting, the processor automatically unmasks and uses the external port (EPOI), link port (LP0I) or SPI port (SPIRI) interrupt after reset. Usage depends on whether the ADSP-21161 processor is booting from EPROM, host, SPI or link ports.
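The global and selective enables combine as a logical AND. The short Python sketch below illustrates that combination for interrupts controlled through IMASK only; the function name is an assumption, the second-level LIRPTL masks are not modeled, and the non-maskable RESET and EMU interrupts are outside its scope. The bit number used in the example is arbitrary.

```python
IRPTEN = 1 << 12   # MODE1 bit 12: global interrupt enable


def is_serviceable(mode1, imask, bit):
    """True if interrupt `bit` is enabled both globally and selectively."""
    globally_enabled = bool(mode1 & IRPTEN)
    selectively_enabled = bool(imask & (1 << bit))
    return globally_enabled and selectively_enabled


example_bit = 8   # arbitrary IMASK bit, for illustration only
print(is_serviceable(IRPTEN, 1 << example_bit, example_bit))
print(is_serviceable(0, 1 << example_bit, example_bit))
```

A latched interrupt that fails this test stays latched, so it is serviced later if the program sets the missing enable bit.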
Latching Interrupts
When the processor recognizes an interrupt, the processor's interrupt latch (IRPTL and LIRPTL) registers latch the interrupt, setting a bit to record that the interrupt occurred. The bits in these registers indicate all interrupts that are currently being serviced or are pending. Because these registers are readable and writable, any interrupt except reset can be set or cleared in software. Note that writing to the reset bit (bit 1) in IRPTL puts the processor into an illegal state.

When an interrupt occurs, the sequencer sets the corresponding bit in IRPTL or LIRPTL. During execution of the interrupt's service routine, the processor clears this bit during every cycle to prevent the same interrupt from being latched while its service routine is executing. After the return from interrupt (RTI), the sequencer stops clearing the latch bit. If necessary, it is possible to re-use an interrupt while it is being serviced. For more information, see Reusing Interrupts on page 3-47.

The interrupt latch bits in IRPTL correspond to interrupt mask bits in the IMASK register. In both registers, the interrupt bits are arranged in order of priority. The interrupt priority is from 0 (highest) to 31 (lowest). Interrupt priority determines which interrupt is serviced first when more than one occurs in the same cycle. Priority also determines which interrupts are nested when the processor has interrupt nesting enabled. For more information, see Nesting Interrupts on page 3-45.

While IRPTL latches interrupts for a variety of events, the LIRPTL register contains latch and mask bits only for link port and SPI DMA interrupts. A logical ORing of link port interrupts (masked-latch state) appears in the LPSUM bit in the IRPTL register. Because the LPSUM bit has a corresponding mask bit in the IMASK register, programs can use LPSUM for a second level of link port interrupt masking.

Multiple events can cause arithmetic interrupts: fixed-point overflow (FIXI) and floating-point overflow (FLTOI), underflow (FLTUI), and invalid operation (FLTII). To determine which event caused the interrupt, a program can read the arithmetic status flags in the STKYx or STKYy status registers. Table A-5 on page A-19 lists the bits in these registers. Service routines for arithmetic interrupts must clear the appropriate STKYx or STKYy bits to clear the interrupt. If the bits are not cleared, the interrupt is still active after the return from interrupt (RTI). Status bits in STKYy apply only in SIMD mode. For more information, see Secondary Processing Element (PEy) on page 2-37.

One event can cause multiple interrupts. The timer decrementing to zero causes two timer expired interrupts, TMZHI (high priority) and TMZLI (low priority). This feature allows selection of the priority for the timer interrupt. Programs should unmask the timer interrupt with the desired priority and leave the other one masked. If both interrupts are unmasked, IRPTL latches both interrupts when the timer reaches zero, and the processor services the higher priority interrupt first, and then the lower priority interrupt.
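Because bit position encodes priority (0 highest, 31 lowest), choosing the next interrupt to service amounts to finding the lowest-numbered bit that is both latched and unmasked. The Python sketch below illustrates that selection and the two-priority timer case. The function name is an assumption, and the TMZHI and TMZLI bit positions are hypothetical values chosen for the example; see the IRPTL register table for the actual positions.

```python
def highest_priority_pending(irptl, imask):
    """Return the bit number of the highest priority latched, unmasked interrupt.

    Priority runs from bit 0 (highest) to bit 31 (lowest). Returns None when
    nothing is both latched and unmasked.
    """
    pending = irptl & imask
    if pending == 0:
        return None
    return (pending & -pending).bit_length() - 1   # index of the lowest set bit


# Hypothetical bit positions, for illustration only.
TMZHI_BIT = 4
TMZLI_BIT = 14

latched = (1 << TMZHI_BIT) | (1 << TMZLI_BIT)      # timer reached zero
both_unmasked = latched
low_only = 1 << TMZLI_BIT

print(highest_priority_pending(latched, both_unmasked))
print(highest_priority_pending(latched, low_only))
```

With both timer interrupts unmasked, the high priority one is selected first; masking it leaves only the low priority interrupt eligible, which is how a program picks the timer's priority.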
The IRPTL also supports software interrupts. When a program sets the latch bit for one of these interrupts (SFT0I, SFT1I, SFT2I, or SFT3I), the sequencer services the interrupt, and the processor branches to the corresponding interrupt routine. Software interrupts have the same behavior as all other maskable interrupts.
The sequencer automatically pops the ASTATx, ASTATy, and MODE1 registers from the status stack during the return from interrupt instruction (RTI). In one other case, JUMP (CI), the sequencer pops the stack. For more information, see Reusing Interrupts on page 3-47.

Only the IRQ2-0, timer expired, and VIRPT interrupts cause the sequencer to push an entry onto the status stack. All other interrupts require either explicit saves and restores of affected registers or an explicit push or pop of the stack (PUSH/POP STS).

Pushing ASTATx, ASTATy, and MODE1 preserves the status and control bit settings. This feature allows a service routine to alter these bits with the knowledge that the original settings are automatically restored upon the return from the interrupt.

The top of the status stack contains the current values of ASTATx, ASTATy, and MODE1. Reading and writing these registers does not move the stack pointer. Explicit PUSH or POP instructions do move the status stack pointer.
Nesting Interrupts
The sequencer supports interrupt nesting (responding to another interrupt while a previous interrupt is being serviced). Bits in the MODE1, IMASKP, and LIRPTL registers control interrupt nesting. Table A-2 on page A-3 lists the bits in MODE1, Table A-9 on page A-27 lists the bits in IMASKP, and Table A-10 on page A-34 lists the bits in LIRPTL. These bits control interrupt nesting as follows:

• Interrupt nesting enable. MODE1 Bit 11 (NESTM). This bit directs the processor to enable (if 1) or disable (if 0) interrupt nesting.
• Interrupt Mask Pointer. IMASKP Bits 30-15, 13-10 and 8-0. These bits list the interrupts in priority order and provide a temporary interrupt mask for each nesting level.
• Link Port DMA Interrupt Mask Pointer. LIRPTL Bits 25-24 (LPxMSKP). These bits are the link port DMA interrupts in priority order. They provide a temporary interrupt mask for each nesting level.
• SPI Port DMA Interrupt Mask Pointer. LIRPTL Bits 27-26 (SPITMSKP and SPIRMSKP). These bits are the SPI port transmit and receive DMA interrupts respectively. They provide a temporary interrupt mask.

When interrupt nesting is disabled, a higher priority interrupt cannot interrupt a lower priority interrupt's service routine. Other interrupts are latched as they occur, but the processor processes them after the active routine finishes.

When interrupt nesting is enabled, a higher priority interrupt can interrupt a lower priority interrupt's service routine. Lower priority interrupts are latched as they occur, but the processor processes them after the nested routines finish.

Programs should change the interrupt nesting enable (NESTM) bit only while outside of an interrupt service routine or during the reset service routine.

If nesting is enabled and a higher priority interrupt occurs immediately after a lower priority interrupt, the service routine of the higher priority interrupt is delayed by one cycle. This delay allows the first instruction of the lower priority interrupt routine to be executed before it is interrupted.

When servicing nested interrupts, the processor uses the interrupt mask pointer (IMASKP) to create a temporary interrupt mask for each level of interrupt nesting; the IMASK value is not affected. The processor changes IMASKP each time a higher priority interrupt interrupts a lower priority service routine.
The bits in IMASKP correspond to the interrupts in order of priority. When an interrupt occurs, the processor sets its bit in IMASKP. If nesting is enabled, the processor uses IMASKP to generate a new temporary interrupt mask, masking all interrupts of equal or lower priority to the highest priority bit set in IMASKP and keeping higher priority interrupts the same as in IMASK. When a return from an interrupt service routine (RTI) is executed, the processor clears the highest priority bit set in IMASKP and generates a new temporary interrupt mask. The processor masks all interrupts of equal or lower priority to the highest priority bit set in IMASKP. The bit set in IMASKP that has the highest priority always corresponds to the priority of the interrupt being serviced. If an interrupt recurs while its service routine is running and nesting is enabled, the processor updates IRPTL, but does not service the interrupt. The processor waits until the return from interrupt (RTI) completes before vectoring to the service routine again. If nesting is not enabled, the processor masks out all interrupts and IMASKP is not used, but the processor still updates IMASKP to create a temporary interrupt mask. The interrupt controller uses the IMASKP register and the LPxMSKP, SPITMSKP, and SPIRMSKP bits of the LIRPTL register. These bits should not be modified.
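The temporary mask described above can be written as a small bit computation. The following Python sketch is an illustration under assumptions: the function names are invented, registers are treated as plain integers with bit 0 as the highest priority, and the non-nested case is modeled simply as masking everything while an interrupt is in service.

```python
def temporary_mask(imask, imaskp, nesting=True):
    """Return the effective interrupt mask while interrupts are in service.

    With nesting enabled, interrupts of equal or lower priority than the
    highest priority bit set in IMASKP are masked, and higher priority
    interrupts keep their IMASK setting. Bit 0 is the highest priority.
    """
    if imaskp == 0:
        return imask                  # nothing in service: IMASK applies as-is
    if not nesting:
        return 0                      # all interrupts masked during the routine
    top = (imaskp & -imaskp).bit_length() - 1   # highest priority bit in IMASKP
    return imask & ((1 << top) - 1)   # keep only strictly higher priorities


def imaskp_after_rti(imaskp):
    """Clear the highest priority (lowest numbered) bit set in IMASKP."""
    return imaskp & (imaskp - 1)


imask = 0xFF
print(bin(temporary_mask(imask, 0b1000)))              # servicing priority 3
print(bin(temporary_mask(imask, 0b1010)))              # nested: priority 1 over 3
print(bin(temporary_mask(imask, imaskp_after_rti(0b1010))))
```

The third line shows the mask reopening to the level of the outer routine once the nested routine's RTI clears its IMASKP bit, while IMASK itself never changes.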
Reusing Interrupts
When an interrupt occurs, the sequencer sets the corresponding bit in IRPTL. During execution of the service routine, the sequencer keeps this bit cleared: the processor clears the bit during every cycle, preventing the same interrupt from being latched while its service routine is already executing.
If necessary, it is possible to reuse an interrupt while it is being serviced. Using a JUMP clear interrupt, JUMP (CI), instruction in the interrupt service routine clears the interrupt, allowing its reuse while the service routine is executing. The JUMP (CI) instruction reduces an interrupt service routine to a normal subroutine, clearing the appropriate bit in the interrupt latch and interrupt mask pointer and popping the status stack. After the JUMP (CI) instruction, the processor stops automatically clearing the interrupt's latch bit, allowing the interrupt to latch again. When returning from a subroutine entered with a JUMP (CI) instruction, a program must use a return loop reentry, RTS (LR), instruction. For more information, see Restrictions on Ending Loops on page 3-25. The following example shows an interrupt service routine that is reduced to a subroutine with the (CI) modifier:
instr1;               /* Interrupt entry from main program */
JUMP(PC,3) (DB,CI);   /* Clear interrupt status */
instr3;
instr4;
instr5;
RTS (LR);             /* Use LR modifier with return from subroutine */
The JUMP (PC,3)(DB,CI) instruction only continues linear execution flow by jumping to the location PC + 3 (instr5). The two intervening instructions (instr3, instr4) are executed because of the delayed branch (DB). This JUMP instruction is only an example; a JUMP (CI) can be to any location.
Interrupting IDLE
The sequencer supports placing the processor in IDLE, a special instruction that halts the processor core in a low-power state. The halt lasts until an external interrupt (IRQ2-0), timer interrupt, DMA interrupt, or VIRPT vector interrupt occurs. When executing an IDLE instruction, the sequencer fetches one more instruction at the current fetch address and then suspends operation. The processor's I/O processor is not affected by the IDLE instruction; DMA transfers to or from internal memory continue uninterrupted.
The processor's internal clock and timer (if enabled) continue to run during IDLE. When an external interrupt (IRQ2-0), timer interrupt, DMA interrupt, or VIRPT vector interrupt occurs, the processor responds normally. After two cycles used to fetch and decode the first instruction of the interrupt service routine, the processor continues executing instructions normally.
Multiprocessing Interrupts
The sequencer supports a multiprocessor vector interrupt. The vector interrupt (VIRPT) permits passing interprocessor commands in multiple-processor systems. This interrupt occurs when an external processor (a host or another processor) writes an address to the VIRPT register, inserting a new vector address for VIRPT. The VIRPT register has space for the vector address and data for the service routine. Table A-19 on page A-64 lists the bits in the VIRPT registers. When servicing a VIRPT interrupt, the processor automatically pushes the status stack and executes the service routine located at the address specified in VIRPT. During the return from interrupt (RTI), the processor automatically pops the status stack.
To flag that a VIRPT interrupt is pending, the processor sets the VIPD bit in the SYSTAT register when the external processor writes to the VIRPT register. Programs passing interprocessor commands must monitor VIPD to check if the processor can receive a new VIRPT address, because:
If an external processor writes VIRPT while a previous vector is pending, the new VIRPT address replaces the previous pending one.
If an external processor writes VIRPT while a previous vector is executing, the new VIRPT address does not execute (no new interrupt is triggered).
When returning from a VIRPT interrupt, the processor clears the VIPD bit. Note that if a processor writes to its own VIRPT register, the write is ignored.
for 4 cycles (when the timer is enabled), as shown in Figure 3-5. On the clock cycle after TCOUNT reaches zero, the timer automatically reloads TCOUNT from the TPERIOD register. The TPERIOD value specifies the frequency of timer interrupts. The number of cycles between interrupts is TPERIOD + 1. The maximum value of TPERIOD is 2^32 - 1.
To start and stop the timer, programs use the MODE2 register's TIMEN bit. With the timer disabled (TIMEN=0), the program loads TCOUNT with an initial count value and loads TPERIOD with the number of cycles for the desired interval. Then, the program enables the timer (TIMEN=1) to begin the count.
When a program enables the timer, the timer starts decrementing the TCOUNT register at the end of the next clock cycle. If the timer is subsequently disabled, the timer stops decrementing TCOUNT after the next clock cycle, as shown in Figure 3-5.
The timer expired event (TCOUNT decrements to zero) generates two interrupts, TMZHI and TMZLI. For information on latching and masking these interrupts to select timer expired priority, see Latching Interrupts on page 3-42.
As with other interrupts, the sequencer needs two cycles to fetch and decode the first instruction of the timer expired service routine before executing the routine. The pipeline execution for the timer interrupt appears in Table 3-15 on page 3-36.
Programs can read and write the TPERIOD and TCOUNT registers by using universal register transfers. Reading the registers does not affect the timer. Note that an explicit write to TCOUNT takes priority over the sequencer's loading of TCOUNT from TPERIOD and the timer's decrementing of TCOUNT. Also note that TCOUNT and TPERIOD are not initialized at reset. Programs should initialize these registers before enabling the timer.
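The relationship between TPERIOD and the interrupt interval can be checked with a small model. The Python sketch below is an idealized illustration, not processor code: it decrements a count once per cycle and reloads it on the cycle after it reaches zero, as the text describes. The function names are hypothetical, and the simulation assumes TPERIOD and the initial TCOUNT are at least 1.

```python
MAX_TPERIOD = 2**32 - 1  # largest TPERIOD value stated in the text


def tperiod_for_interval(cycles):
    """TPERIOD value giving 'cycles' clock cycles between timer interrupts."""
    if not 1 <= cycles <= MAX_TPERIOD + 1:
        raise ValueError("interval out of range for a 32-bit TPERIOD")
    return cycles - 1        # interval = TPERIOD + 1


def expiry_cycles(tperiod, tcount, n_cycles):
    """Cycle numbers at which the modeled TCOUNT decrements to zero."""
    if tperiod < 1 or tcount < 1:
        raise ValueError("this simple model assumes tperiod >= 1 and tcount >= 1")
    expired = []
    for cycle in range(n_cycles):
        if tcount == 0:
            tcount = tperiod         # cycle after zero: reload from TPERIOD
        else:
            tcount -= 1              # normal decrement
            if tcount == 0:
                expired.append(cycle)
    return expired
```

With TPERIOD = 3 and TCOUNT initially 3, successive expiries in this model are four cycles apart, which agrees with the TPERIOD + 1 rule.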
[Figure 3-5 timing diagram: CLKIN and TCOUNT (N, then N-1) around a timer disable, where TIMEN is cleared in MODE2 and the timer becomes inactive]
PC stack full. Bit 21 (PCFL) indicates that the PC stack is full (if 1) or not full (if 0). Not a sticky bit; cleared by a POP.
PC stack empty. Bit 22 (PCEM) indicates that the PC stack is empty (if 1) or not empty (if 0). Not a sticky bit; cleared by a PUSH.
The PC stack full condition causes a maskable interrupt (SOVFI). This interrupt occurs when the PC stack has 29 locations filled (the almost full state). The PC stack full interrupt occurs when one location is left, because the PC stack full service routine needs that last location for its return address.
The address of the top of the PC stack is available in the PC stack pointer (PCSTKP) register. The value of PCSTKP is zero when the PC stack is empty, 1 through 30 when the stack contains data, and 31 when the stack overflows. This register is readable and writable. A write to PCSTKP takes effect after a one-cycle delay. If the PC stack has overflowed, a write to PCSTKP has no effect.
The overflow and full flags provide diagnostic aid only. Programs should not use these flags for runtime recovery from overflow. Note that the status stack overflow, loop stack overflow, and PC stack full conditions trigger a maskable interrupt.
The empty flags can ease stack saves to memory. Programs can monitor the empty flag when saving a stack to memory to determine when the processor has transferred all values.
Conditional Sequencing
The sequencer supports conditional execution with conditional logic that appears in Figure 3-2 on page 3-4. This logic evaluates conditions for conditional (IF) instructions and loop (DO/UNTIL) terminations. The conditions are based on information from the arithmetic status registers (ASTATx and ASTATy), the mode control 1 register (MODE1), the flag inputs,
and the loop counter. For more information on arithmetic status, see Using Computational Status on page 2-8. When in SIMD mode, conditional execution is affected by the arithmetic status of both processing elements. For information on conditional sequencing in SIMD mode, see SIMD Mode and Sequencing on page 3-57.
Each condition that the processor evaluates has an assembler mnemonic. The condition mnemonics for conditional instructions appear in Table 3-19. For most conditions, the sequencer can test both true and false states. For example, the sequencer can evaluate ALU equal-to-zero (EQ) and ALU not-equal-to-zero (NE).
To test conditions that do not appear in Table 3-19, a program can use the Test Flag (TF) condition generated from a Bit Test Flag (BTF) instruction. The TF flag is set or cleared as a result of a BIT TST or BIT XOR instruction, which can test the contents of any of the processor's system registers, including STKYx and STKYy.
Table 3-19. IF Condition and DO/UNTIL Termination Mnemonics
Condition From   Description        True if       Mnemonic
ALU              ALU = 0            AZ = 1        EQ
ALU              ALU ≠ 0            AZ = 0        NE
ALU              ALU > 0            See note 1    GT
ALU              ALU < 0            See note 2    LT
ALU              ALU ≥ 0            See note 3    GE
ALU              ALU ≤ 0            See note 4    LE
ALU              ALU carry          AC = 1        AC
ALU              ALU not carry      AC = 0        NOT AC
ALU              ALU overflow       AV = 1        AV
ALU              ALU not overflow   AV = 0        NOT AV
1 ALU greater than (GT) is true if: [AF and (AN xor (AV and ALUSAT)) or (AF and AN)] or AZ = 0
2 ALU less than (LT) is true if: [AF and (AN xor (AV and ALUSAT)) or (AF and AN and AZ)] = 1
3 ALU greater or equal (GE) is true if: [AF and (AN xor (AV and ALUSAT)) or (AF and AN and AZ)] = 0
4 ALU less or equal (LE) is true if: [AF and (AN xor (AV and ALUSAT)) or (AF and AN)] or AZ = 1
The two conditions that do not have complements are LCE/NOT LCE (loop counter expired/not expired) and TRUE/FOREVER. The context of these condition codes determines their interpretation. Programs should use TRUE and NOT LCE in conditional (IF) instructions. Programs should use FOREVER and LCE to specify loop (DO/UNTIL) termination. A DO FOREVER instruction executes a loop indefinitely, until an interrupt or reset intervenes.
There are some restrictions on how programs may use conditions in DO/UNTIL loops. For more information, see Restrictions on Ending Loops on page 3-25 and Restrictions on Short Loops on page 3-26.
The bus master (BM) condition indicates whether the processor is the current bus master in a multiprocessor system. To enable testing this condition, a program must clear the MODE1 register's Condition Code Select (CSEL) bits. Otherwise, the bus master condition is always false.
1 Complementary pairs are registers with SIMD complements, including PEx/y data registers and USTAT1/2, USTAT3/4, ASTATx/y, STKYx/y, and PX1/2 universal registers.
2 Uncomplemented registers are universal registers that do not have SIMD complements.
3 Post-modify operations follow this rule, but pre-modify operations always occur regardless of the outcome.
Case 1: Complementary Register Pair Data Move
In this case, data moves from a complementary register pair to a complementary register pair. The processor executes the explicit move depending on the evaluation of the conditional test in the PEx processing element and the implicit move depending on the evaluation of the conditional test in the PEy processing element.
Example: Register-to-Memory Move, PEx Explicit Register
IF EQ DM(I0,M0) = R2;
For this instruction, the processor is operating in SIMD mode, a register in the PEx data register file is the explicit register, and I0 is pointing to an even address in internal memory. This example uses indirect addressing; however, the same results occur using direct addressing. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-21.
Table 3-21. Register-to-Memory Moves: Complementary Pairs
Condition in PEx   Condition in PEy   Result
AZx                AZy                Explicit                                       Implicit
0                  0                  No data move occurs                            No data move occurs
0                  1                  No data move occurs from r2 to location I0     s2 transfers to location (I0+1)
1                  0                  r2 transfers to location I0                    No data move occurs from s2 to location (I0+1)
1                  1                  r2 transfers to location I0                    s2 transfers to location (I0+1)
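The four rows of Table 3-21 reduce to two independent tests, one per processing element. The Python sketch below models that behavior for the instruction IF EQ DM(I0,M0) = R2; in SIMD mode. It is an illustration under stated assumptions: memory is modeled as a dictionary, I0 is even and points to internal memory, and the function name is hypothetical.

```python
def simd_conditional_store(azx, azy, i0, r2, s2, memory):
    """Model of 'IF EQ DM(I0,M0) = R2;' in SIMD mode.

    azx, azy: AZ flags of PEx and PEy (EQ is true when AZ = 1).
    The explicit move (r2 -> I0) depends only on PEx;
    the implicit move (s2 -> I0+1) depends only on PEy.
    """
    if azx:
        memory[i0] = r2          # explicit move, gated by the PEx condition
    if azy:
        memory[i0 + 1] = s2      # implicit move, gated by the PEy condition
    return memory
```

For example, with AZx = 0 and AZy = 1 only location I0+1 is written, which matches the second row of the table.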
For this instruction, the processor is operating in SIMD mode, a register in the PEy data register file is the explicit register, and I0 is pointing to an even address in internal memory. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-22.
Table 3-22. Register-to-Register Moves: Complementary Pairs
Condition in PEx   Condition in PEy   Result
AZx                AZy                Explicit                                       Implicit
0                  0                  No data move occurs                            No data move occurs
0                  1                  No data move occurs from s2 to location I0     r2 transfers to location I0+1
1                  0                  s2 transfers to location I0                    No data move occurs from r2 to location I0+1
1                  1                  s2 transfers to location I0                    r2 transfers to location I0+1
For these instructions, the processor is operating in SIMD mode and registers in the PEx data register file are used as the explicit registers. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-23.
For these instructions, the processor is operating in SIMD mode and registers in the PEy data register file are used as explicit registers. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-24.
Table 3-24. Register-to-Register Moves: Complementary Register Pairs
Condition in PEx   Condition in PEy   Result
AZx                AZy                Explicit                                                  Implicit
0                  0                  No data move occurs                                       No data move occurs
0                  1                  No data move to registers s9, px and ustat1 occurs        r2 transfers to registers s9, px2, and ustat2
Case 2: Uncomplemented-to-Complementary Register Move
In this case, data moves from an uncomplemented register (Ureg without a SIMD complement) to a complementary register pair. The processor executes the explicit move depending on the evaluation of the conditional test in the PEx processing element. The processor executes the implicit move depending on the evaluation of the conditional test in the PEy processing element. In each processing element where the move occurs, the content of the source register is duplicated in the destination.
Example: Register-to-Register Move
IF EQ R1 = PX;
While PX1 and PX2 are complementary registers, the combined PX register has no complementary register. For more information, see Internal Data Bus Exchange on page 5-10. For this instruction the processor is operating in SIMD mode. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-24.
Case 3: Complementary-to-Uncomplemented Register Move
In this case, data moves from a complementary register pair to an uncomplemented register. The processor executes the explicit move to the uncomplemented universal register, depending on the condition test in the PEx processing element only. The processor does not perform an implicit move.
Example: Register-to-Register Move
IF EQ PX = R1;
For this instruction, the processor is operating in SIMD mode. The data movement resulting from the evaluation of the conditional test in the PEx and PEy processing elements is shown in Table 3-26.
Table 3-26. Complementary-to-Uncomplemented Move
Condition in PEx   Condition in PEy   Result
AZx                AZy                Explicit                 Implicit
0                  0                  px remains unchanged     No implicit move
0                  1                  px remains unchanged     No implicit move
For more details on PX register transfers, refer to Internal Data Bus Exchange on page 5-10.
Case 4: Data Move Involves External Memory or IOP Memory Space
Conditional data moves from a complementary register pair to an uncomplemented register with an access to external memory space or IOP memory space result in unexpected behavior and should not be used.
IF EQ DM(I0,M0) = R2;
IF EQ DM(I0,M0) = S2;
For these instructions, the processor is operating in SIMD mode and the explicit register is either a PEx register or a PEy register. I0 points to either external memory space or IOP memory space. This example uses indirect addressing; however, the same results occur using direct addressing.
The processor's Data Address Generators (DAGs) generate addresses for data moves to and from Data Memory (DM) and Program Memory (PM). By generating addresses, the DAGs let programs refer to addresses indirectly, using a DAG register instead of an absolute address. The DAG architecture, which appears in Figure 4-1, supports several functions that minimize overhead in data access routines. These functions include:
Supply address and post-modify: provides an address during a data move and auto-increments the stored address for the next move.
Supply pre-modified address: provides a modified address during a data move without incrementing the stored address.
Modify address: increments the stored address without performing a data move.
Bit-reverse address: provides a bit-reversed address during a data move without reversing the stored address.
Broadcast data moves: performs dual data moves to complementary registers in each processing element to support SIMD mode.
As shown in Figure 4-1, each DAG has four types of registers. These registers hold the values that the DAG uses for generating addresses. The four types of registers are:
Index registers (I0-I7 for DAG1 and I8-I15 for DAG2). An index register holds an address and acts as a pointer to memory. For example, the DAG interprets DM(I0,0) and PM(I8,0) syntax in an instruction as addresses.
Modify registers (M0-M7 for DAG1 and M8-M15 for DAG2). A modify register provides the increment or step size by which an index register is pre- or post-modified during a register move. For example, the DM(I0,M1) instruction directs the DAG to output the address in register I0 and then modify the contents of I0 using the M1 register.
Length and Base registers (L0-L7 and B0-B7 for DAG1 and L8-L15 and B8-B15 for DAG2). Length and base registers set up the range of addresses and the starting address for a circular buffer. For more information on circular buffers, see Addressing Circular Buffers on page 4-12.
[Figure 4-1 block diagram: I, M, L, and B register files (8 x 32 each), a MUX, modulus logic, post-modify addressing and update paths on 32-bit buses, address adjustment per word size (short, normal, or long), optional bit-reverse for I0 (DAG1) and I8 (DAG2), optional broadcast for I1 (DAG1) and I9 (DAG2), with MODE1, MODE2, and STKYx connections]
Figure 4-1. Data Address Generator (DAG) Block Diagram
Broadcast register loading enable, DAG2-I9. Bit 22 (BDCST9) enables register broadcast loads to complementary registers from I9 indexed moves (if 1) or disables broadcast loads (if 0).
SIMD mode enable. Bit 21 (PEYEN) enables computations in PEy, that is, SIMD mode (if 1), or disables PEy, that is, SISD mode (if 0). For more information on SIMD mode, see Secondary Processing Element (PEy) on page 2-37.
Secondary registers for DAG2 lo, I,M,L,B8-11. Bit 6 (SRD2L)
Secondary registers for DAG2 hi, I,M,L,B12-15. Bit 5 (SRD2H)
Secondary registers for DAG1 lo, I,M,L,B0-3. Bit 4 (SRD1L)
Secondary registers for DAG1 hi, I,M,L,B4-7. Bit 3 (SRD1H)
These bits select the corresponding secondary register set (if 1) or select the corresponding primary register set, the set that is available at reset (if 0).
Bit-reverse addressing enable, DAG1-I0. Bit 1 (BR0) enables bit-reversed addressing on I0 indexed moves (if 1) or disables bit-reversed addressing (if 0).
Bit-reverse addressing enable, DAG2-I8. Bit 0 (BR8) enables bit-reversed addressing on I8 indexed moves (if 1) or disables bit-reversed addressing (if 0).
For more information on setting up and using circular buffers, see Addressing Circular Buffers on page 4-12. When using circular buffers, the DAGs can generate an interrupt on buffer overflow (wrap around). For more information, see Using DAG Status on page 4-8.
1. Note that the letters a and b (as in Ma or Mb) indicate numbers for modify registers in DAG1 and DAG2. The letter a indicates a DAG1 register and can be replaced with 0 through 7. The letter b indicates a DAG2 register and can be replaced with 8 through 15.
The PEYEN bit (SISD/SIMD mode select) does not influence broadcast operations. Broadcast loading is particularly useful in SIMD applications where the algorithm needs identical data loaded into each processing element. For more information on SIMD mode (in particular, a list of complementary data registers), see Secondary Processing Element (PEy) on page 2-37.
[Figure 4-2 diagram: DAG register groups (I, M, L, and B registers) with the MODE1 bits, such as SRD1L and SRD1H, that select each group's alternate set]
Figure 4-2. Data Address Generator Primary and Alternate Registers
from the instruction setting the bit in MODE1 to when the alternate registers may be accessed. Note that any instruction that does not access the switching register file can be used instead of a NOP instruction.
BIT SET MODE1 SRD1L;   /* Activate alternate dag1 lo regs */
NOP;                   /* Wait for access to alternates */
R0=DM(i0,m1);
DM(0x51000), which is the bit-reverse of 0x8a000, then post modifies I0 for the next access with (0x8a000 + 0x4000000)=0x408a000, which is the bit-reverse of DM(0x51020) */
In addition to bit-reverse addressing mode, the processor supports a bit-reverse instruction (BITREV). This instruction bit-reverses the contents of the selected register. For more information on the BITREV instruction, see Modifying DAG Registers on page 4-17 or the ADSP-21160 SHARC DSP Instruction Set Reference.
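The address values quoted in the comment above can be reproduced with a short calculation. The Python sketch below is an illustration only; it assumes the reversal is taken over a full 32-bit word, and the function name is hypothetical.

```python
def bit_reverse32(value):
    """Reverse the bit order of a 32-bit value (bit 31 <-> bit 0, and so on)."""
    value &= 0xFFFFFFFF
    return int(f"{value:032b}"[::-1], 2)
```

Under that assumption, bit_reverse32(0x8a000) gives 0x51000, and bit_reverse32(0x8a000 + 0x4000000) gives 0x51020, the two values named in the comment. The modifier 0x4000000 has only bit 26 set, so in bit-reversed form it advances the address by 0x20.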
The DAGs can provide buffer overflow information when executing circular buffer addressing for I7 or I15. When a buffer overflow occurs (a circular buffering operation increments the I register past the end of the buffer), the appropriate DAG updates a buffer overflow flag in a sticky status (STKYx) register. A buffer overflow can also generate a maskable interrupt. Two ways to use buffer overflows from circular buffering are:
Interrupts. Enable interrupts and use an interrupt service routine to handle the overflow condition immediately. This method is appropriate if it is important to handle all overflows as they occur; for example, in a ping-pong or swap I/O buffer pointers routine.
STKYx registers. Use the BIT TST instruction to examine overflow flags in the STKY register after a series of operations. If an overflow flag is set, the buffer has overflowed (wrapped around) at least once. This method is useful when overflow handling is not critical.
DAG Operations
The processor's DAGs perform several types of operations to generate data addresses. As shown in Figure 4-1 on page 4-3, the DAG registers and the MODE1, MODE2, and STKYx registers all contribute to DAG operations. The following sections provide details on DAG operations:
Addressing With DAGs on page 4-10
Addressing Circular Buffers on page 4-12
Modifying DAG Registers on page 4-17
An important item to note from Figure 4-1 on page 4-3 is that the DAG automatically adjusts the output address per the word size of the address location (short word, normal word, or long word). This address adjustment lets internal memory use the address directly.
SISD/SIMD mode, access word size, and data location (internal/external) all influence data access operations.
[Figure 4-3 diagram: pre-modify and post-modify data paths, showing the I and M inputs, the I+M sum, and the address OUTPUT]
Figure 4-3. Pre-Modify and Post-Modify Operations
The difference between pre-modify and post-modify instructions in the processor's assembly syntax is the position of the index and modifier in the instruction. If the I register comes before the modifier, the instruction is a post-modify operation. If the modifier comes before the I register, the instruction is a pre-modify without update operation. The following instruction accesses the program memory location indicated by the value in I15 and writes the value I15 + M12 to the I15 register:
R6 = PM(I15,M12); /* Post-modify addressing with update */
By comparison, the following instruction accesses the program memory location indicated by the value I15 + M12 and does not change the value in I15:
R6 = PM(M12,I15); /* Pre-modify addressing without update */
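The two addressing forms differ only in which value goes to the address bus and whether the index register is updated. The Python sketch below models that difference as an illustration, not as processor code. The function names are hypothetical, and the 32-bit wrap of the sum is an assumption based on the 32-bit register width.

```python
MASK32 = 0xFFFFFFFF  # assumption: index arithmetic wraps at 32 bits


def post_modify(i, m):
    """PM(I,M) or DM(I,M): use I as the address, then update I to I + M.

    Returns (address_used, new_index_value).
    """
    return i, (i + m) & MASK32


def pre_modify(i, m):
    """PM(M,I) or DM(M,I): use I + M as the address and leave I unchanged.

    Returns (address_used, new_index_value).
    """
    return (i + m) & MASK32, i
```

For example, with I15 = 0x50000 and M12 = 4, post_modify() returns address 0x50000 and a new index of 0x50004, while pre_modify() returns address 0x50004 and leaves the index at 0x50000. The values are chosen only for illustration.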
Modify (M) registers can work with any index (I) register in the same DAG (DAG1 or DAG2). For a list of I and M registers and their DAGs, see Figure 4-2 on page 4-7. Instructions can use a number (immediate value), instead of an M register, as the modifier. The size of an immediate value that can modify an I register depends on the instruction type. For all single data access operations, modify immediate values can be up to 32 bits wide. Instructions that combine DAG addressing with computations limit the size of the modify immediate value. In these instructions (multifunction computations), the modify immediate values can be up to 6 bits wide. The following example instruction accepts up to 32-bit modifiers:
R1=DM(0x40000000,I1); /* DM address = I1+0x4000 0000 */
Note that pre-modify addressing operations must not change the memory space of the address. For example, pre-modifying an address in the processor's internal memory space should not generate an address in external memory space.
THE FOLLOWING SYNTAX SETS UP AND ACCESSES A CIRCULAR BUFFER WITH:
LENGTH = 11
BASE ADDRESS = 0X55000
MODIFIER = 4
BIT SET MODE1 CBUFEN;                      /* ENABLES CIRCULAR BUFFER ADDRESSING; JUST ONCE IN PROGRAM */
B0 = 0X55000;                              /* LOADS B0 AND I0 REGISTERS WITH BASE ADDRESS */
L0 = 0XB;                                  /* LOADS L0 REGISTER WITH LENGTH OF BUFFER */
M1 = 0X4;                                  /* LOADS M1 WITH MODIFIER OR STEP SIZE */
LCNTR = 11, DO MY_CIR_BUFFER UNTIL LCE;    /* SETS UP A LOOP CONTAINING BUFFER ACCESSES */
R0 = DM(I0,M1);                            /* AN ACCESS WITHIN THE BUFFER USES POST MODIFY ADDRESSING */
...                                        /* OTHER INSTRUCTIONS IN THE MY_CIR_BUFFER LOOP */
MY_CIR_BUFFER: NOP;                        /* END OF MY_CIR_BUFFER LOOP */
[Diagram: four columns, each listing buffer locations 0 through 10, marking accesses 1-3, 4-6, 7-9, and 10-11]
THE COLUMNS ABOVE SHOW THE SEQUENCE IN ORDER OF LOCATIONS ACCESSED IN ONE PASS. NOTE THAT "0" ABOVE IS ADDRESS DM(0X55000). THE SEQUENCE REPEATS ON SUBSEQUENT PASSES.
Figure 4-4. Circular Data Buffers
As shown in Figure 4-4, programs use the following steps to set up a circular buffer:
1. Enable circular buffering (BIT SET MODE1 CBUFEN;). This operation is only needed once in a program.
2. Load the buffer's base address into the B register. This operation automatically loads the corresponding I register.
3. Load the buffer's length into the corresponding L register. For example, L0 corresponds to B0.
4. Load the modify value (step size) into an M register in the corresponding DAG. For example, M0 through M7 correspond to B0. Alternatively, the program can use an immediate value for the modifier.
After this setup, the DAGs use the modulus logic in Figure 4-1 on page 4-3 to process circular buffer addressing.
On the ADSP-21161 processor, programs enable circular buffering by setting the CBUFEN bit in the MODE1 register. This bit has a corresponding mask bit in the MMASK register. Setting the corresponding MMASK bit causes the CBUFEN bit to be cleared following a push status instruction (PUSH STS) or the execution of an external interrupt, timer interrupt, or vectored interrupt. This feature lets programs disable circular buffering while in an interrupt service routine that does not use circular buffering. By disabling circular buffering, the routine does not need to save and restore the DAG's B and L registers.
Clearing the CBUFEN bit disables circular buffering for all data load and store operations. The DAGs perform normal post-modify load and store accesses instead, ignoring the B and L register values. Note that a write to a B register modifies the corresponding I register, independent of the state of the CBUFEN bit.
The MODIFY instruction executes independent of the state of the CBUFEN bit. The MODIFY instruction always performs a circular buffer modify of the index registers if the corresponding B and L registers are set up, independent of the state of the CBUFEN bit.
For revision 1.0 and greater of the ADSP-21161 processor, the Circular Buffer Enable bit (CBUFEN) in MODE1 is set (=1) upon reset. For earlier silicon revisions 0.x, this bit is cleared (=0) upon reset. This change was made to ensure code compatibility with the ADSP-2106x SHARC family (ADSP-21060/1/2 and ADSP-21065L), where circular buffering is active upon reset.
However, circular buffering is disabled upon reset for the ADSP-21160. Make note of this when porting code from the ADSP-21160 to the ADSP-21161 processor.
On the first post-modify access to the buffer, the DAG outputs the I register value on the address bus and then modifies the address by adding the modify value. If the updated index value is within the buffer length, the DAG writes the value to the I register. If the updated value is outside the buffer length, the DAG subtracts (for a positive modifier) or adds (for a negative modifier) the L register value before writing the updated index value to the I register. In equation form, these post-modify and wraparound operations work as follows:
If M is positive:
Inew = Iold + M       if Iold + M < Buffer base + length (end of buffer)
Inew = Iold + M - L   if Iold + M ≥ Buffer base + length (end of buffer)
If M is negative:
Inew = Iold + M       if Iold + M ≥ Buffer base (start of buffer)
Inew = Iold + M + L   if Iold + M < Buffer base (start of buffer)
The DAGs use all four types of DAG registers for addressing circular buffers. These registers operate as follows for circular buffering:
The index (I) register contains the value that the DAG outputs on the address bus.
The modify (M) register contains the post-modify amount (positive or negative) that the DAG adds to the I register at the end of each memory access. The M register can be any M register in the same DAG as the I register and does not have to have the same number. The modify value also can be an immediate value instead of an M register. The size of the modify value, whether from an M register or immediate, must be less than the length (L register) of the circular buffer.
The length (L) register sets the size of the circular buffer and the address range that the DAG circulates the I register through. L must be positive and cannot have a value greater than 2^31 - 1. If an L register's value is zero, its circular buffer operation is disabled.
The base (B) register, or the B register plus the L register, is the value that the DAG compares the modified I value with after each access. When the B register is loaded, the corresponding I register is simultaneously loaded with the same value. When I is loaded, B is not changed. Programs can read the B and I registers independently.
There is one set of registers (I7 and I15) in each DAG that can generate an interrupt on circular buffer overflow (address wraparound). For more information, see Using DAG Status on page 4-8.
When a program needs to use I7 or I15 without circular buffering and the processor has the circular buffer overflow interrupts unmasked, the program should disable the generation of these interrupts by setting the B7/B15 and L7/L15 registers to values that prevent the interrupts from occurring. If I7 were accessing the address range 0x1000 to 0x2000, the program could set B7=0x0000 and L7=0xFFFF. Because the processor generates the circular buffer interrupt based on the wraparound equations on page 4-15, setting the L register to zero does not necessarily achieve the desired results. If the program is using either of the circular buffer overflow interrupts, it should avoid using the corresponding I register(s) (I7 or I15) where interrupt branching is not needed.
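The wraparound equations can be exercised with the buffer from Figure 4-4 (base 0x55000, length 11, modifier 4). The Python sketch below is a behavioral illustration only, not processor code. The function names are hypothetical, and treating a zero length as plain post-modify follows the statement that a zero L register disables circular buffer operation.

```python
def circular_post_modify(i_old, m, base, length):
    """Next index value after a post-modify access in a circular buffer."""
    i_new = i_old + m
    if length == 0:
        return i_new                      # L = 0: circular buffering disabled
    if m >= 0 and i_new >= base + length:
        i_new -= length                   # passed the end of the buffer
    elif m < 0 and i_new < base:
        i_new += length                   # passed the start of the buffer
    return i_new


def access_sequence(base, length, m, count):
    """Addresses output on the address bus for 'count' post-modify accesses."""
    i = base                              # loading B also loads I
    addresses = []
    for _ in range(count):
        addresses.append(i)
        i = circular_post_modify(i, m, base, length)
    return addresses
```

For the Figure 4-4 buffer, eleven accesses visit offsets 0, 4, 8, 1, 5, 9, 2, 6, 10, 3, 7 from the base, so every location is touched once per pass and the twelfth access returns to the base address.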
In the case of circular buffer overflow interrupts, if CBUFEN = 1 and register L7 = 0 (or L15 = 0), then the CB7I (or CB15I) interrupt occurs at every change of I7 (or I15), after the index register (I7 or I15) crosses the base register (B7 or B15) value. This behavior is independent of the context of the DAG registers, both primary and alternate. When a Long word access, SIMD access, or Normal word access (with LW option) crosses the end of the circular buffer, the processor completes the access before responding to the end of buffer condition.
The BITREV instruction modifies and bit-reverses addresses in any DAG index register (I0-I15) without accessing memory. This instruction is independent of the bit-reverse mode. The BITREV instruction adds a 32-bit immediate value to a DAG index register, bit-reverses the result, and writes the result back to the same index register. The following example adds 4 to I1, bit-reverses the result, and updates I1 with the new value:
BITREV(I1,4);
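The effect of the instruction can be sketched in Python. This is an illustrative model, assuming the reversal spans all 32 bits of the index register; the helper names are hypothetical.

```python
def bitrev32(value):
    """Reverse the bit order of a 32-bit value."""
    value &= 0xFFFFFFFF
    return int(format(value, "032b")[::-1], 2)


def bitrev_instruction(index, immediate):
    """Model BITREV(Ix, imm): add, then bit-reverse, then write back."""
    return bitrev32((index + immediate) & 0xFFFFFFFF)
```

Under that assumption, BITREV(I1,4) with I1 initially zero leaves bit 29 set in I1, because bit 2 of the sum is mirrored to bit 29.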
Figure 4-5. Normal Word (32-bit) DAG Register Memory Transfers
The DAGs align extended-precision normal word (40-bit) addressed transfers or register-to-register transfers to bits 39-8 of the buses. These transfers between a 40-bit data register and 32-bit DAG1 or DAG2 registers use the 64-bit DM and PM data buses. Figure 4-6 illustrates these transfers.
[Figure: on the 64-bit DM or PM data bus, the 32-bit DAG register value occupies bits 39-8; bits 63-40 (0x0000 00) and bits 7-0 (0x00) are zero-filled.]
Figure 4-6. DAG Register to Data Register Transfers
Long word (64-bit) addressed transfers between memory and 32-bit DAG1 or DAG2 registers target double DAG registers and use the 64-bit DM and PM data buses. Figure 4-7 illustrates how the bus works in these transfers. If the Long word transfer specifies an even-numbered DAG register (e.g., I0 or I2), then the even-numbered register value transfers on the lower half of the 64-bit bus, and the even-numbered register + 1 value transfers on the upper half (bits 63-32) of the bus.
[Figure: the 64-bit DM or PM data bus carries one 32-bit DAG register on its upper half (bits 63-32) and one on its lower half (bits 31-0).]
Figure 4-7. Long Word DAG Register to Data Register Transfers
If the Long word transfer specifies an odd-numbered DAG register (e.g., I1 or B3), the odd-numbered register value transfers on the lower half of the 64-bit bus, and the odd-numbered register - 1 value (I0 or B2 in this example) transfers on the upper half (bits 63-32) of the bus.
In both the even- and odd-numbered cases, the explicitly specified DAG register sources or sinks bits 31-0 of the Long word addressed memory.
DAG registers are accessible in pair granularity for single-cycle access. The pairings are odd-even. For example, I0 and I1 are a pair, and I2 and I3 are a pair.
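The pairing rule for long word transfers can be sketched in Python. This is a minimal model of the bus image only; the dictionary of register values and the function name are hypothetical.

```python
def long_word_bus_image(regs, name):
    """Return the 64-bit bus value for a long word transfer of a DAG register.

    The explicitly named register sits on bits 31-0; its pair partner
    (register number with the lowest bit flipped) sits on bits 63-32.
    """
    kind, number = name[0], int(name[1:])
    partner = f"{kind}{number ^ 1}"
    low = regs[name] & 0xFFFFFFFF
    high = regs[partner] & 0xFFFFFFFF
    return (high << 32) | low
```

Naming I0 places I0 on the lower half and I1 on the upper half; naming I1 swaps the halves.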
Certain other sequences of instructions cause incorrect results on the processor and are flagged as errors by the processor assembler software. These types of instructions can execute on the processor, but cause incorrect results:
An instruction that stores a DAG register in memory using indirect addressing from the same DAG, with or without update of the index register. The instruction writes the wrong data to memory or updates the wrong index register. Do not try these:
DM(M2,I1)=I0;
or
DM(I1,M2)=I0;
These example instructions do not work because I0 and I1 are both DAG1 registers.
An instruction that loads a DAG register from memory using indirect addressing from the same DAG, with update of the index register. The instruction either loads the DAG register or updates the index register, but not both. Do not try this:
L2=DM(I1,M0);
This example instruction does not work because L2 and I1 are both DAG1 registers.
Table 4-3. Post-Modify Addressing, Modified By 6-Bit Data and Updating I Register
DM(I7-0,Data6)=Dreg; {DAG1}
PM(I15-8,Data6)=Dreg; {DAG2}
Dreg=DM(I7-0,Data6); {DAG1}
Dreg=PM(I15-8,Data6); {DAG2}
Table 4-5. Pre-Modify Addressing, Modified By 6-Bit Data (No I Register Update)
DM(Data6,I7-0)=Dreg; {DAG1}
PM(Data6,I15-8)=Dreg; {DAG2}
Dreg=DM(Data6,I7-0); {DAG1}
Dreg=PM(Data6,I15-8); {DAG2}
Table 4-6. Pre-Modify Addressing, Modified By 32-Bit Data (No I Register Update)
Ureg=DM(Data32,I7-0) (LW); {DAG1}
Ureg=PM(Data32,I15-8) (LW); {DAG2}
DM(Data32,I7-0)=Ureg (LW); {DAG1}
PM(Data32,I15-8)=Ureg (LW); {DAG2}
5 MEMORY
The ADSP-21161 processor contains a large, dual-ported internal memory for single-cycle, simultaneous, independent accesses by the core processor and I/O processor. The dual-ported memory, in combination with three separate on-chip buses, allows two data transfers from the core and one transfer from the I/O processor in a single cycle. Using the IO bus, the I/O processor provides data transfers between internal memory and the processor's communication ports (link ports, serial ports, and external port) without hindering the processor core's access to memory. This chapter describes the processor's memory and how to use it.
The processor provides access to external memory through the processor's external port. For information on connecting and timing accesses to external memory, see External Memory Interface on page 7-3.
The processor contains one megabit of on-chip SRAM, organized as two blocks of 0.5 Mbits. Each block can be configured for different combinations of code and data storage. All of the memory can be accessed as 16-bit, 32-bit, 48-bit, or 64-bit words. The memory can be configured in each block as a maximum of 16K words of 32-bit data, 8K words of 64-bit data, 32K words of 16-bit data, 10.67K words of 48-bit instructions (or 40-bit data), or combinations of different word sizes up to 0.5 Mbit. This gives a total for the complete internal memory: a maximum of 32K words of 32-bit data, 16K words of 64-bit data, 64K words of 16-bit data, and 21K words of 48-bit instructions (or 40-bit data).
The processor features a 16-bit floating-point storage format that effectively doubles the amount of data that may be stored on-chip. A single instruction converts the format from 32-bit floating-point to 16-bit floating-point.
While each memory block can store combinations of code and data, accesses are most efficient when one block (typically block 1) stores data for transfers over the DM bus, and the other block (typically block 0) stores instructions and data for transfers over the PM bus. Dedicating the DM bus and the PM bus to one memory block each assures single-cycle execution with two data transfers. In this case, the instruction must be available in the cache.
Internal Memory
The ADSP-21161 has 2 MBits of internal memory space; 1 MBit is addressable. The 1 MBit of memory is divided into two 0.5 MBit blocks: Block 0 and Block 1. The additional 1 MBit of the memory space is reserved on the ADSP-21161. Table 5-1 shows the maximum number of data or instruction words that can fit in each 0.5 MBit internal memory block.
Table 5-1. Words Per 0.5 MBit Internal Memory Block
Word Type                             Bits Per Word   Maximum Number of Words Per 0.5 MBit Block
Instruction                           48 bits         10.67K words
Long Word Data                        64 bits         8K words
Extended Precision Normal Word Data   40 bits         10.67K words
Normal Word Data                      32 bits         16K words
Short Word Data                       16 bits         32K words
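The word counts in Table 5-1 follow from the block size and the number of 16-bit memory columns each word type occupies. The Python sketch below reproduces them; the names are hypothetical, and rounding each word up to whole 16-bit columns (so that a 40-bit word takes three columns, like a 48-bit word) is how the table is read here.

```python
BLOCK_BITS = 512 * 1024   # one 0.5 Mbit internal memory block
COLUMN_BITS = 16          # memory is organized in 16-bit columns


def words_per_block(bits_per_word):
    """Maximum number of words of the given width in one block."""
    columns = -(-bits_per_word // COLUMN_BITS)   # ceiling division
    return BLOCK_BITS // (columns * COLUMN_BITS)
```

For example, 48-bit and 40-bit words both give 10,922 words (about 10.67K), while 32-bit words give 16,384 (16K).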
External Memory
While the processor's internal memory is divided into blocks, the processor's external memory spaces are divided into banks. The internal memory blocks and the external memory spaces may be addressed by either data address generator. External memory banks are fixed sizes that can be configured for various waitstate and access configurations. For more information, see External Memory on page 5-22.
There are 254 Mwords of external memory space that the processor can address. External memory connects to the processor's external port, which extends the processor's 24-bit address and 32-bit data buses off the processor. The processor can make 8-, 16-, 32-, or 48-bit accesses to external memory for instructions and 8-, 16-, or 32-bit accesses for data. Table 5-2 shows the access types and words for processor external memory accesses. The processor's DMA controller automatically packs external data into the appropriate word width during data transfer.
The external data bus can be expanded to 48 bits if the link ports are disabled and the corresponding full width instruction packing mode (IPACK) is enabled in the SYSCON register. Ensure that link ports are disabled when executing code from external 48-bit memory. For more information, see Executing Instructions From External Memory on page 5-101.
Table 5-2. Internal-to-External Memory Word Transfers (1)
Word Type            Transfer Type
Packed Instruction   32-, 16-, or 8- to 48-bit packing
Normal Word Data     32-bit word in 32-bit transfer
Short Word Data      Not supported
(1) For external port word alignment, see Figure 7-1 on page 7-2.
The total addressable space for the fixed external memory bank sizes depends on whether SDRAM or non-SDRAM (for example, SRAM or SBSRAM) is used. Each external memory bank for SDRAM can address 64M words. For non-SDRAM memory, each bank can address up to 16M words. The remaining 48M words are reserved. These reserved addresses for non-SDRAM accesses are aliased to the first 16M spaces within the bank. The total external memory available is given as follows:
3*(16M) + 14M = 62M (non-SDRAM banks)
3*(64M) + 62M = 254M (SDRAM banks)
Banks 1, 2, and 3 have the same amount of external memory (16M for non-SDRAM and 64M for SDRAM), while bank 0 is smaller (14M for non-SDRAM and 62M for SDRAM). The external memory address bus is 24 bits wide with four additional bank select (MSx) lines. For more information on the external memory, see the section External Memory on page 5-22.
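Bank 0 is smaller because external memory begins at normal word address 0x0020 0000, so the first 2M words of the bank's address span belong to internal and multiprocessor memory space. The Python sketch below, with hypothetical names, reproduces the totals on that reading.

```python
M = 1 << 20                 # 1M words
BANK0_START = 0x0020_0000   # first external (bank 0) address


def bank_sizes(bank_span):
    """Usable words in banks 0-3 for a given per-bank address span."""
    bank0 = bank_span - BANK0_START   # bank 0 loses the first 2M words
    return [bank0, bank_span, bank_span, bank_span]
```

With a 16M-word span (non-SDRAM) the four banks total 62M words; with a 64M-word span (SDRAM) they total 254M words.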
Processor Architecture
Most microprocessors use a single address and single data bus for memory access. This type of memory architecture is called Von Neumann architecture. DSPs, however, require greater data throughput than Von Neumann architecture provides, so many DSPs use memory architectures that have separate data and address buses for program and data storage. These two sets of buses let the processor retrieve a data word and an instruction simultaneously. This type of memory architecture is called Harvard architecture.
SHARC DSPs go a step further by using a Super Harvard architecture. This four-bus architecture has two address buses and two data buses, but provides a single, unified address space for program and data storage. While the Data Memory (DM) bus only carries data, the Program Memory (PM) bus handles instructions and data, allowing dual-data accesses.
Processor core and I/O processor accesses to internal memory are completely independent and transparent to one another. Each block of memory can be accessed by the processor core and I/O processor in every cycle; no extra cycles are incurred if the processor core and the I/O processor access the same block.
A memory access conflict can occur when the processor core attempts two accesses to the same internal memory block in the same cycle. When this conflict, known as a block conflict, occurs, an extra cycle is incurred. The DM bus access completes first and the PM bus access completes in the following (extra) cycle.
During a single-cycle, dual-data access, the processor core uses the independent PM and DM buses to simultaneously access data from both memory blocks. Though dual-data accesses provide greater data throughput, it is important to note some limitations on how programs may use them. The limitations on single-cycle, dual-data accesses are:
The two pieces of data must come from different memory blocks. If the core accesses two words from the same memory block over the same bus in a single instruction, an extra cycle is needed.
The data access execution may not conflict with an instruction fetch operation. The PM data bus tries to fetch an instruction in every cycle. If a data fetch is also attempted over the PM bus, an extra cycle may be required depending on the cache. If the cache contains the conflicting instruction, the data access completes in a single cycle and the sequencer uses the cached instruction. If the conflicting instruction is not in the cache, an extra cycle is needed to complete the data access and cache the conflicting instruction. For more information, see Instruction Cache on page 3-8.
For more information on how the buses access memory blocks, see Internal Memory on page 5-16.
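The two limitations can be summarized as a small cycle-count model. The Python sketch below is illustrative only: the function and parameter names are hypothetical, and treating the two penalties as additive when both apply is an assumption that this section does not confirm.

```python
def access_cycles(dm_block, pm_block=None, instruction_cached=True):
    """Estimate cycles for an instruction with one or two data accesses.

    dm_block: memory block (0 or 1) accessed over the DM bus.
    pm_block: block accessed over the PM bus, or None for a single access.
    instruction_cached: whether the conflicting instruction is in the cache.
    """
    cycles = 1
    if pm_block is not None:
        if pm_block == dm_block:
            cycles += 1   # block conflict: PM access completes one cycle later
        if not instruction_cached:
            cycles += 1   # PM data access collides with the instruction fetch
    return cycles
```

A dual-data access to different blocks with the instruction cached stays at one cycle; either conflict adds a cycle.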
Buses
As shown in Figure 5-1 on page 5-9, the processor has three sets of internal buses connected to its dual-ported memory: the Program Memory (PM) bus, Data Memory (DM) bus, and I/O Processor (IO) bus. The PM bus and DM bus share one memory port and the IO bus connects to the other port. Memory accesses from the processor's core (computational units, data address generators, or program sequencer) use the PM or DM buses, while the I/O processor uses the IO bus for memory accesses.
The processor core's PM bus and DM bus and the I/O processor's External Port (EP) bus can try to access multiprocessor memory space or external memory space in the same cycle. The processor has a two-level arbitration system to handle this conflicting access. Arbitration stems from a priority convention and the state of the SYSCON register's EBPRx bits. When arbitrating between the processor core buses, the DM bus always has priority over the PM bus. Arbitration between the winning core bus and the I/O processor's EP bus depends on the priority set with the EBPRx bits. For more information on setting this priority, see External Bus Priority on page 5-39.
Almost without exception, the processor's three buses can access all memory spaces, supporting all data sizes. There are three restrictions on the access of buses to memory. The limitations on the PM, DM, and IO buses are as follows:
The PM, DM, and IO buses make Normal Word addressing accesses to multiprocessor or external memory. These buses can make 40/48-bit data transfers by configuring the link data pins as additional data pins for external accesses. For more information, see Multiprocessor Memory on page 5-19.
The IO bus may not access the I/O processor's memory-mapped registers. For more information, see I/O Processor on page 6-1.
The IO bus may not use short word addressing for DMA operation.
Addresses for the PM and DM buses come from the processor's program sequencer and Data Address Generators (DAGs). The program sequencer generates 24-bit program memory addresses while DAGs supply 32-bit addresses for locations throughout the processor's memory spaces. The DAGs supply addresses for data reads and writes on both the PM and DM address buses, while the program sequencer uses only the PM address bus for sequencing execution.
Each DAG is associated with a particular data bus. DAG1 supplies addresses over the DM bus and DAG2 supplies addresses over the PM bus. For more information on address generation, see Program Sequencer on page 3-1 or Data Address Generator on page 4-1.
Because the processor's internal memory is arranged in four 16-bit wide by 8K high columns, memory is addressable in widths that are multiples of columns up to 64 bits: 1 column = 16-bit words, 2 columns = 32-bit words, 3 columns = 48- or 40-bit words, and 4 columns = 64-bit words. For more information on how the processor works with memory words, see Memory Organization and Word Size on page 5-25.
[Figure 5-1. Labels recovered from the block diagram: external port with 24-bit address and 32-bit data paths; bank 0 starting at normal word 0x200000; DM data bus; IO address bus; IO data bus; I/O processor; EP address and EP data. Any two paths can be used simultaneously; addresses and data follow parallel paths.]
The PM and DM data buses are 64 bits wide. Both data buses can handle long word (64-bit), normal word (32-bit), extended-precision normal word (40-bit), and short word (16-bit) data, but only the PM data bus carries Instruction words (48-bit).
[Figure: the combined PX register spans bits 63-0; PX2 occupies bits 63-32 and PX1 occupies bits 31-0.]
Figure 5-2. PM Bus Exchange (PX, PX1, and PX2) Registers
The PX1, PX2, and the combined PX register are Universal registers (UREG) that are accessible for register-to-register or memory-to-register transfers.
[Figure: instruction examples. R3 = PX; is a 40-bit register file transfer. R3 = PX1; or R3 = PX2; is a 32-bit register file transfer, with the remaining eight bits zero-filled.]
Figure 5-3. PX, PX1, and PX2 Register-to-Register Transfers
Register-to-register transfers with data registers are either 40-bit transfers for the combined PX or 32-bit transfers for PX1 or PX2. Figure 5-3 shows the bit alignment and gives an example of instructions for register-to-register transfers.
Figure 5-3 shows that during a transfer between PX1 or PX2 and a data register (DREG), the bus transfers the upper 32 bits of the register file and zero fills the eight LSBs. During a transfer between the combined PX register and a register file, the bus transfers the upper 40 bits of PX and zero fills the lower 24 bits.
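The bit alignment described for Figure 5-3 can be modeled in Python. This is a sketch of the alignment only, with hypothetical function names; data registers are treated as 40-bit integers and PX as a 64-bit integer.

```python
MASK32 = (1 << 32) - 1
MASK40 = (1 << 40) - 1


def px_to_dreg(px):
    """Combined PX to data register: the upper 40 bits of PX are transferred."""
    return (px >> 24) & MASK40


def dreg_to_px(dreg):
    """Data register to combined PX: lower 24 bits of PX are zero-filled."""
    return (dreg & MASK40) << 24


def px_half_to_dreg(px_half):
    """PX1 or PX2 to data register: upper 32 bits, eight LSBs zero-filled."""
    return (px_half & MASK32) << 8
```

For example, a PX value of 0x123456789ABCDEF0 yields the 40-bit register value 0x123456789A.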
Register-to-internal memory transfers over the DM or PM data bus are either 48-bit transfers for the combined PX or 32-bit transfers (on bits 31-0 of the bus) for PX1 or PX2. Figure 5-4 shows these transfers.
[Figure: instruction examples. PX = DM(0xC0000) (LW); is a DM or PM data bus transfer for the combined PX register. PM(I7,M7) = PX1; is a 32-bit DM or PM data bus transfer on bits 31-0.]
Figure 5-4. PX, PX1, PX2 Register-to-Memory Transfers on DM (LW) or PM (LW) Data Bus
Figure 5-4 shows that during a transfer between PX1 or PX2 and internal memory, the bus transfers the lower 32 bits of the register. During a transfer between the combined PX register and internal memory, the bus transfers the upper 48 bits of PX and zero fills the lower 8 bits.
The status of the memory block's Internal Memory Data Width (IMDWx) setting does not affect this default transfer size for PX to internal memory.
Figure 5-5 shows a PX register-to-external memory transfer. The PX register transfers the upper 32 bits of the PM data bus into PX1 and the lower 16 bits to PX2, zero filling the remaining 16 bits.
[Figure: the combined PX register shown as 64 bits, bits 63-0.]
Figure 5-5. PX Register-to-External Memory Transfers
Since there are 32 DATA pins on the ADSP-21161 processor, 40/48-bit data transfers using register-to-register transfers are not directly supported. To accomplish 40/48-bit data transfers with the PX register, you must configure the link data pins as additional data pins for external accesses. Full width instruction mode (IPACK) must be enabled in the SYSCON register. The 16 link data pins are configured as DATA pins and the processor fetches the upper 32 bits of instruction on the 32 DATA pins and the lower 16 bits of instruction on the link data pins. To transfer both 48-bit instructions and 40-bit double precision data to a register, you must swap the PX1 and PX2 registers. See the following code examples:
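Example 1: To transfer 48 bits from external memory to internal memory, use the following code:
PX = DM(EXT_MEMORY_LOC);    /* read the word from external memory */
R0 = PX1;                   /* swap PX1 and PX2 through R0 */
PX1 = PX2;
PX2 = R0;
DM(INT_MEMORY_LOC) = PX;    /* write the word to internal memory */
Example 2: To transfer 40-bit data from external memory to a register, use the following code:
PX = DM(EXT_MEMORY_LOC);    /* read the word from external memory */
R0 = PX1;                   /* swap PX1 and PX2 through R0 */
PX1 = PX2;
PX2 = R0;
R1 = PX;                    /* 40-bit move from PX to R1 */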
All transfers between the PX register and the I/O processor LBUFx registers are 48-bit transfers (most significant 48 bits of PX). All transfers between the PX register (or any other internal register or memory) and any I/O processor register (other than the EPBx or LBUFx) are 32-bit transfers (least significant 32 bits of PX). All transfers between the PX register and data registers (R0-R15 or S0-S15) are 40-bit transfers. The most significant 40 bits are transferred as shown in Figure 5-3 on page 5-11. Figure 5-6 shows the transfer size between PX and internal memory over the PM or DM data bus when using the long word (LW) option.
[Figure: the combined PX register shown as 64 bits, bits 63-0.]
Figure 5-6. PX Register-to-Memory Transfers on PM Data Bus
The LW notation in Figure 5-6 draws attention to an important feature of PX register-to-internal memory transfers over the PM or DM data bus for the combined PX register. PX transfers to memory are 48-bit (3-column) transfers on bits 0-31 of the PM or DM data bus, unless forced to be 64-bit (4-column) transfers with the LW (Long Word) mnemonic.
There is no implicit move when the combined PX register is used in SIMD mode. For example, in SIMD mode, the following moves could occur:
PX1 = R0;    /* R0 32-bit explicit move to PX1, and R1 32-bit implicit move to PX2 */
PX = R0;     /* R0 40-bit explicit move to PX, but no implicit move for R1 */
Internal Memory
The ADSP-21161's internal memory space appears in Figure 5-7. This memory space has four address regions.
I/O processor memory-mapped registers. This region ranges from address 0x0000 0000 through 0x0000 01FF (Normal Word).
Reserved memory. This region ranges from address 0x0000 0200 through 0x0001 FFFF. These addresses are not accessible.
[Figure 5-7. Internal memory map for Block 0 and Block 1. Each of the addressing types addresses the same physical memory but uses a different word width. Address labels recovered from the figure: 0x0002 0000, 0x0002 9FFF, 0x0004 0000, 0x0005 2AA9, 0x0005 3FFF, 0x0008 0000, 0x000A 7FFF. The low addresses are marked as reserved (I/O).]
Block 0 memory. This region, typically PM, ranges from address 0x0004 0000 through 0x0004 3FFF (Normal Word). DAG2 generates PM data addresses.
Block 1 memory. This region, typically DM, ranges from address 0x0005 0000 through 0x0005 3FFF (Normal Word). DAG1 generates DM data addresses.
The I/O processor's memory-mapped registers control the system configuration of the processor and I/O operations. For more information, see I/O Processor on page 6-1. These registers occupy consecutive 32-bit locations in this region. If a program uses long word addressing (forced with the LW mnemonic) to access this region, the access is only to the addressed 32-bit register, rather than to two adjacent I/O processor registers. The register contents are transferred on bits 31-0 of the data bus.
There are two exceptions to this one-at-a-time I/O processor register access rule:
Long word accesses to external port buffer (EPBx) or link port buffer (LBUFx) locations using the PX register access two adjacent 32-bit I/O registers.
Long word accesses to the external port data buffer locations (EPBx) in SIMD mode access two adjacent 32-bit I/O registers.
As shown in Figure 5-7 on page 5-17, the processor can address memory in Block 0 and Block 1 using long word, normal word, or short word addressing. The processor interprets the addressing mode from the address range for the access.
Though there are multiple addressing modes for each memory region, these different modes address the same physical memory. For example, the long word address 0x0002 0000 corresponds to the same locations as normal word addresses 0x0004 0000 and 0x0004 0001. This also corresponds to the same locations as short word addresses 0x0008 0000, 0x0008 0001, 0x0008 0002, and 0x0008 0003.
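The aliasing in this example amounts to multiplying the long word address by two or four. A minimal Python sketch, with hypothetical helper names, for 32-bit normal words and 16-bit short words:

```python
def normal_from_long(long_addr):
    """Normal word (32-bit) addresses covered by one long word address."""
    return (2 * long_addr, 2 * long_addr + 1)


def short_from_long(long_addr):
    """Short word (16-bit) addresses covered by one long word address."""
    return tuple(4 * long_addr + k for k in range(4))
```

Long word address 0x0002 0000 maps to normal word addresses 0x0004 0000 and 0x0004 0001, and to short word addresses 0x0008 0000 through 0x0008 0003. The sketch does not cover 48-bit or 40-bit normal word accesses, which use three columns.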
Figure 5-7 on page 5-17 also shows that there are gaps in the processor's memory map when using normal word addressing for 48-bit (instruction word) or 40-bit (extended-precision normal word) accesses. These gaps of missing addresses stem from the arrangement of this 3-column data in memory. For more information, see Memory Organization and Word Size on page 5-25.
Multiprocessor Memory
The ADSP-21161's multiprocessor memory space appears in Figure 5-8. This memory space has seven address regions that correspond to the IOP register space of the DSPs in a multiprocessing system. Each of the processors in such a system has a processor ID, which is set with the processor's ID2-0 pins. The address regions by processor ID are:
Internal memory with ID=001. This region ranges from address 0x0010 0000 through 0x0011 FFFF.
Internal memory with ID=010. This region ranges from address 0x0012 0000 through 0x0013 FFFF.
Internal memory with ID=011. This region ranges from address 0x0014 0000 through 0x0015 FFFF.
Internal memory with ID=100. This region ranges from address 0x0016 0000 through 0x0017 FFFF.
Internal memory with ID=101. This region ranges from address 0x0018 0000 through 0x0019 FFFF.
Internal memory with ID=110. This region ranges from address 0x001A 0000 through 0x001B FFFF.
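The regions listed above are evenly spaced, so the range for a given processor ID can be computed directly. The constant and function names in this Python sketch are hypothetical.

```python
MMS_BASE = 0x0010_0000      # start of the region for ID=001
REGION_SIZE = 0x0002_0000   # each processor ID owns 0x2 0000 addresses


def mms_range(processor_id):
    """Return (first, last) multiprocessor memory address for an ID of 1-6."""
    if not 1 <= processor_id <= 6:
        raise ValueError("processor ID must be between 1 and 6")
    first = MMS_BASE + (processor_id - 1) * REGION_SIZE
    return first, first + REGION_SIZE - 1
```

For example, ID=100 (4) gives 0x0016 0000 through 0x0017 FFFF, matching the list.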
[Figure 5-8. Memory map.
Internal memory space:
0x0000 0000   IOP registers
0x0002 0000   Long word addressing
0x0004 0000   Normal word addressing
0x0008 0000   Short word addressing
Multiprocessor memory space:
0x0010 0000   IOP space of ADSP-21161 with ID=001
0x0012 0000   IOP space of ADSP-21161 with ID=010
0x0014 0000   IOP space of ADSP-21161 with ID=011
0x0016 0000   IOP space of ADSP-21161 with ID=100
0x0018 0000   IOP space of ADSP-21161 with ID=101
0x001A 0000   IOP space of ADSP-21161 with ID=110
0x001C 0000   Reserved (through 0x001F FFFF)
External memory space:
0x0020 0000   Bank 0 (MS0)
0x0400 0000   Bank 1 (MS1)
0x0800 0000   Bank 2 (MS2)
0x0C00 0000   Bank 3 (MS3), through 0x0FFF FFFF
Normal word addressing: 32-bit data words. Short word addressing: 16-bit data words.]
It is important to note that programs may only use normal word addressing in multiprocessor memory space. Long or short word writes may corrupt valid data, and long or short word reads return invalid data.
The address range of the access determines which processor's internal memory is the multiprocessor memory access source or destination. Instead of using its own IOP register address range, a processor can access its IOP space through the corresponding address range in multiprocessor memory space. In this case, the processor reads or writes to its own IOP registers and does not make an access on the external system bus. Note that such self-accesses through multiprocessor memory space may only be accomplished with processor-core-generated addresses, not I/O processor-generated addresses. For more information on memory accesses in multiprocessor systems, see External Port on page 7-1.
Table 5-3 shows how the processor decodes and routes memory addresses over the DM and PM buses.
Table 5-3. Address Decoding For Memory Accesses
Address Bits (1)   Field   Description
ADDR31-28          NA      Reserved
ADDR27-24          V       Virtual address. Drives MS3-0 as follows:
                           00 = Depends on E, S and M bits; address corresponds to local processor's internal or external memory bank 0
                           01 = External memory bank 1, local processor
                           10 = External memory bank 2, local processor
                           11 = External memory bank 3, local processor
ADDR23-21          E (2)   Memory address.
                           00000[00] = Address in local or remote processor's internal memory space.
                           xxxxx[xx] = Based on V bits; address in one of local processor's four external memory banks.
ADDR20             M       Multiprocessor memory. If this bit is 1, the address is in multiprocessor memory space. If this bit is 0, the address is in IOP register space.
ADDR19-17          S (2)   IOP MMS accesses. Depends on M bit. When bit 20 is set to 1, bits 19:17 indicate the following:
                           000 = Address is in IOP space of processor with ID1
                           001 = Address is in IOP space of processor with ID2
                           010 = Address is in IOP space of processor with ID3
                           011 = Address is in IOP space of processor with ID4
                           100 = Address is in IOP space of processor with ID5
                           101 = Address is in IOP space of processor with ID6
ADDR16-0           NA      Internal memory and IOP register space.
(1) Setup and hold times for these address lines are specified in the processor Data Sheet.
(2) For a description of these address fields, see Multiprocessor Memory on page 5-19.
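As a rough illustration of the decoding, the Python sketch below classifies an address using the bank start addresses and the multiprocessor address ranges given in this chapter (0x0010 0000 for ID 1 through 0x001A 0000 for ID 6). The function name and return format are hypothetical, reserved bits 31-28 are ignored, and the reserved hole inside internal memory space is not modeled.

```python
def decode_address(addr):
    """Classify a 32-bit normal word address as (space, detail)."""
    bank = (addr >> 26) & 0x3          # banks 1-3 start at 0x0400 0000 steps
    if bank:
        return ("external", bank)
    if addr >= 0x0020_0000:            # bits above bit 20 set: bank 0
        return ("external", 0)
    if (addr >> 20) & 0x1:             # bit 20 set: multiprocessor memory space
        field = (addr >> 17) & 0x7     # bits 19-17 select the processor
        if field <= 5:
            return ("multiprocessor", field + 1)
        return ("reserved", None)      # 0x001C 0000 through 0x001F FFFF
    return ("internal", None)
```

For example, 0x0016 0000 decodes to the IOP space of the processor with ID 4, and 0x0400 0000 decodes to external bank 1.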
External Memory
The ADSP-21161's external memory space appears in Figure 5-9. The processor accesses external memory space through the external port, which multiplexes the processor core's PM and DM buses and the I/O processor's EP bus. To address this space, the processor's DAG1, DAG2, and I/O processor generate 32-bit addresses over the DM, PM, and EP address buses, allowing the processor to access the complete 254 Mword memory map. The program sequencer only generates 24-bit addresses over the PM bus, limiting sequencing to the low 62 Mwords (for SDRAM) or low 14 Mwords (for SRAM) of the memory map.
The external memory space has four banks (banks 0-3). The processor controls access to the banked regions with memory select lines (MS3-0) in addition to the memory address. Each region of external memory may be configured for access modes and waitstates. For more information on configuring external memory banks, see Setting Data Access Modes on page 5-32. For more information on accessing external memory, see External Port on page 7-1.
The external memory space can also accommodate an optional boot memory EPROM or FLASH. For more information, see Using Boot Memory on page 5-35.
[Figure 5-9. External memory space, always addressed as normal word.
Bank 0 (MS0): 0x0020 0000 through 0x00FF FFFF (non-SDRAM) or 0x03FF FFFF (SDRAM)
Bank 1 (MS1): 0x0400 0000 through 0x04FF FFFF (non-SDRAM) or 0x07FF FFFF (SDRAM)
The next bank starts at 0x0800 0000. The figure also labels the MS3 and BMS select lines.]
Figure 5-7 on page 5-17 shows the memory ranges for each data size in the processor's internal memory.
Figure 5-10. 48-bit Word Rotations
Mixing 32-Bit and 48-Bit Words
The processor's memory organization lets programs freely place memory words of all sizes (see Memory Organization and Word Size on page 5-25) with few restrictions (see Restrictions on Mixing 32-Bit and 48-Bit Words on page 5-28). This memory organization also lets programs mix (place in adjacent addresses) words of all sizes. This section discusses how to mix odd (3-column) and even (4-column) data words in the processor's memory.
Transition boundaries between 48-bit (3-column) data and any other data size can only occur at a 64-bit address boundary within either internal memory block. Depending on the ending address of the 48-bit words, there are zero, one, or two empty locations at the transition between the 48-bit (3-column) words and the 64-bit (4-column) words. These empty locations result from the column rotation for storing 48-bit words. The three possible transition arrangements appear in Figure 5-11, Figure 5-12, and Figure 5-13.
[Figure 5-11. Transitioning from 48-bit to 32-bit data with zero empty locations: 32-bit words 0 through 3 start immediately above the 48-bit word top address, with 48-bit words top, top-1, top-2, and top-3 below.]
[Figure: transitioning from 48-bit to 32-bit data with one empty location: one empty 16-bit location lies between the 48-bit word top address and 32-bit words 0 through 3.]
Figure 5-12. Mixed Instructions and Data With One Unused Location
Restrictions on Mixing 32-Bit and 48-Bit Words
There are some restrictions that stem from the memory column rotations for 3-column data (48- or 40-bit words) and relate to the way that 3-column data can mix with 4-column data (32-bit words) in memory. These restrictions apply to mixing 48- and 32-bit words, because the processor uses a normal word address to access both of these types of data even though 48-bit data maps onto 3 columns of memory and 32-bit data maps onto 2 columns of memory.
[Figure: transitioning from 48-bit to 32-bit data with two empty locations: two empty 16-bit locations lie between the 48-bit word top address and 32-bit words 0 through 3.]
Figure 5-13. Mixed Instructions and Data With Two Unused Locations
When a system has a range of 3-column (48-bit) words followed by a range of 2-column (32-bit) words, there is often a gap of empty 16-bit locations between the two address ranges. The size of the address gap varies with the ending address of the range of 48-bit words. Because the addresses within the gap alias to both 48- and 32-bit words, a 48-bit write into the gap corrupts 32-bit locations, and a 32-bit write into the gap corrupts 48-bit locations. The locations within the gap are only accessible with short word (16-bit) accesses.
Calculating the starting address for 4-column data that minimizes the gap after 3-column data is a useful calculation for programs that are mixing 3and 4-column data. Given the last address of the 3-column (48-bit) data, the starting address of the 32-bit range that most efficiently uses memory can be determined by the equation shown in Listing 5-1: Listing 5-1. Starting Address m = B + 2 [(n MOD 10,922) TRUNC((n MOD 10,922) / 4)] where: n is the number of contiguous 48-bit words allocated in the internal memory block (n < 21845) B is the base normal word address of the internal memory block; if {0 < n < 10,922} then B = 0x40000 (Block 0) else B = 0x50000 (Block 1) m is the first 32-bit normal word address to use after the end of 48-bit words Example 1: Calculating a starting address for a 32-bit addresses The last valid address is 0x42694. The number of 48-bit words (n) is given as follows: n = 0x42694 - 0x40000+1= 0x2695 When you convert 0x2695 to decimal representation, the result is 9877. The base (B) Normal word address of the internal memory block is 0x40000 since the condition: 0 < 10922 is TRUE.
The first 32-bit normal word address to use after the end of the 48-bit words is given by:

m = 0x40000 + 2 * [(9877 MOD 10,922) - TRUNC((9877 MOD 10,922) / 4)]
m = 0x40000 + 2 * (9877 - 2469)
m = 0x40000 + 14,816 (decimal)

Convert to a hexadecimal address: 14,816 (decimal) = 0x39E0

m = 0x40000 + 0x39E0 = 0x439E0

The first valid starting 32-bit address is 0x439E0. The starting address must be an even address.

48-Bit Word Allocation

Another useful calculation for programs that mix 3- and 4-column data is the amount of 3-column data that minimizes the gap before starting 4-column data. Given the starting address of the 4-column (32-bit) data, the number of 48-bit words to allocate that most efficiently uses memory can be determined as shown in Listing 5-2.

Listing 5-2. 48-Bit Word Allocation

n = TRUNC{(4/3) * [(1/2) * (m - B)]} + W

where:
m is the first 32-bit normal word address after the end of 48-bit words (0x3FFFF < m < 0x44000 for Block 0, 0x4FFFF < m < 0x54000 for Block 1)
B is the base normal word address of the internal memory block; if {0x3FFFF < m < 0x50000} then B = 0x40000 (Block 0) else B = 0x50000 (Block 1)
W is the number of offset words; if {B = 0x50000} then W = 43,690 else W = 0
n is the number of contiguous 48-bit words the system should allocate in the internal memory block
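The two listings can be checked numerically. The following is a minimal Python sketch, not part of the processor documentation: the function and constant names are hypothetical, the block bases and the constants 10,922 and 43,690 are taken from the listings as printed, and TRUNC is assumed to mean truncation toward zero of a non-negative value.

```python
WORDS_PER_BLOCK = 10_922   # modulus used in Listing 5-1
BLOCK0_BASE = 0x40000      # base normal word address, Block 0
BLOCK1_BASE = 0x50000      # base normal word address, Block 1


def first_32bit_address(n: int) -> int:
    """Listing 5-1: first 32-bit normal word address after n 48-bit words."""
    if not 0 < n < 21_845:
        raise ValueError("n must satisfy 0 < n < 21845")
    base = BLOCK0_BASE if n < WORDS_PER_BLOCK else BLOCK1_BASE
    r = n % WORDS_PER_BLOCK
    return base + 2 * (r - r // 4)


def words_48bit_to_allocate(m: int) -> int:
    """Listing 5-2: number of 48-bit words to allocate below 32-bit address m."""
    if 0x3FFFF < m < 0x44000:
        base, offset_words = BLOCK0_BASE, 0
    elif 0x4FFFF < m < 0x54000:
        base, offset_words = BLOCK1_BASE, 43_690   # W as printed in Listing 5-2
    else:
        raise ValueError("m is outside the ranges given in Listing 5-2")
    # TRUNC{(4/3) * (1/2) * (m - B)} equals floor(2 * (m - B) / 3) for m >= B.
    return (2 * (m - base)) // 3 + offset_words


# Worked example from the text: n = 9877 gives m = 0x439E0, and back again.
print(hex(first_32bit_address(9877)))
print(words_48bit_to_allocate(0x439E0))
```

Using integer arithmetic avoids floating-point rounding in the TRUNC step.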
Instruction Packing Mode. SYSCON Bits 30 and 31 (IPACK1 and IPACK0). These bits select the external packed instruction execution mode: 8- to 48-bit, 16- to 48-bit, 32- to 48-bit, or no packing.

External Bus Priority. SYSCON Bits 18-17 (EBPRx). This bit field selects the priority for the I/O processor's EP bus when both the core and the IOP attempt to access external memory.
[Figure: SYSCON register (0x0000); bit diagram not reproduced. Fields labeled in the diagram:]

IPACK   External Packed Instruction Execution Mode
        00 = 32-to-48 packed instruction execution
        01 = Full 48-bit instruction execution (no-packing mode)
        10 = 16-to-48 packed instruction execution
        11 = 8-to-48 packed instruction execution
EBPR    External Bus Priority
        00 = even priority between core processor and IOP bus
        01 = core processor priority
        10 = I/O processor priority
IMDW1   Internal Memory Block 1 Data Width: 0 = 32-bit data, 1 = 40-bit data
IMDW0   Internal Memory Block 0 Data Width: 0 = 32-bit data, 1 = 40-bit data
BSO, IIVT   (labels only in the recovered diagram)

Reset value as drawn: bit 16 = 1, all other bits = 0.
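As an illustration of reading one SYSCON field, here is a minimal Python sketch that extracts the EBPR field (bits 18-17) and maps it to the priority descriptions given in the register diagram. The function and table names are hypothetical; only the bit positions and encodings come from the text above.

```python
EBPR_SHIFT = 17      # EBPR occupies SYSCON bits 18-17
EBPR_MASK = 0b11

EBPR_MEANING = {
    0b00: "even priority between core processor and IOP bus",
    0b01: "core processor priority",
    0b10: "I/O processor priority",
}


def decode_ebpr(syscon: int) -> str:
    """Return the external bus priority described by a SYSCON value."""
    field = (syscon >> EBPR_SHIFT) & EBPR_MASK
    # Encoding 11 is not described in the diagram, so it is reported as such.
    return EBPR_MEANING.get(field, "not listed in the diagram")


print(decode_ebpr(0x00010000))   # reset value as drawn: bit 16 set only
```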
External Bank X Waitstates. WAIT Bits 4-2 (EB0WS), Bits 9-7 (EB1WS), Bits 14-12 (EB2WS), Bits 19-17 (EB3WS) and Bits 24-22 (RBWS). These bit fields independently select the number of waitstates for each of the external memory banks. After reset, the default number of waitstates is seven.
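The waitstate fields named above can be extracted or updated with simple shifts and masks. This is a minimal Python sketch with hypothetical helper names; the bit positions are the ones listed in the paragraph above, and the sample value 0x739C is the low 16 bits of the reset pattern drawn in the WAIT register diagram.

```python
# Least significant bit of each 3-bit waitstate field in the WAIT register.
WAITSTATE_LSB = {"EB0WS": 2, "EB1WS": 7, "EB2WS": 12, "EB3WS": 17, "RBWS": 22}
FIELD_MASK = 0b111


def decode_waitstates(wait: int) -> dict:
    """Return the waitstate count held in each field of a WAIT value."""
    return {name: (wait >> lsb) & FIELD_MASK for name, lsb in WAITSTATE_LSB.items()}


def set_waitstates(wait: int, name: str, count: int) -> int:
    """Return a copy of wait with one waitstate field replaced by count (0-7)."""
    if not 0 <= count <= 7:
        raise ValueError("waitstate count must fit in three bits")
    lsb = WAITSTATE_LSB[name]
    return (wait & ~(FIELD_MASK << lsb)) | (count << lsb)


fields = decode_waitstates(0x739C)
print(fields["EB0WS"], fields["EB1WS"], fields["EB2WS"])
```

The decoded banks 0 through 2 each report seven waitstates, which agrees with the stated default after reset.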
...program must unmask the DMA channel's interrupt in the IMASK register; if using external port DMA buffer zero (EP0I), the program could enable this interrupt by setting the EP0I bit to 1 in the IMASK register. For more information on external port DMA, see External Port DMA on page 6-29.

While a program may use any external port DMA channel for accessing boot memory, only DMA channel 10 has a fixed 8- to 48-bit packing mode for boot memory reads. By using DMA channel 10 to complete initial program loading, a program can take advantage of this special packing mode. When a program sets BSO, the processor ignores the DMA channel's packing mode (PMODE) bits for DMA channel 10 and forces 8- to 48-bit packing for reads. This 8-bit packing mode is used on DMA channel 10 during EPROM booting or on DMA reads when BSO is set.

While one of the external port DMA channels is making a DMA access to boot memory with the BSO bit set, none of the other three channels may make a DMA access to external (not boot) memory. Only external port DMA transfers assert BMS when BSO is set; processor core accesses to external memory always use the MSx pins. Because the processor core only accesses external (not boot) memory, programs can access external memory in between DMA accesses to boot memory.

Writing to Boot Memory

In systems using writable EEPROM or FLASH memory for boot memory, programs can write new data to the processor's boot memory using the boot select override (BSO) bit. As described in Reading From Boot Memory on page 5-35, setting (=1) the BSO bit overrides the external memory selects and asserts the processor's BMS pin for an external memory DMA transfer.
To write to boot memory using the BMS signal, programs must use DMA channels 11, 12, or 13, but not DMA channel 10. With the BSO bit set, programs should only use DMA channel 10 for reads. When BSO is set, programs can use DMA channels 11-13 with any settings in the channel's DMACx register, any packing mode, and any data or instructions.
If a program tries to write 40-bit data (for example, a data register-to-memory transfer), the transfer truncates the lower 8 bits from the register, writing only the 32 most significant bits. If a program tries to read 40-bit data (for example, a memory-to-data register transfer), the transfer zero-fills the lower 8 bits of the register, reading only the 32 most significant bits.

The Program Memory Bus Exchange (PX) register is the only exception to these transfer rules; all loads and stores of the PX register are performed as 48-bit accesses unless forced to 64-bit access with the LW mnemonic. If any 40-bit data must be stored in a memory block configured for 32-bit words, the program should use the PX register to access the 40-bit data in 48-bit words. Programs should take care not to corrupt any 32-bit data with this type of access. For more information, see Restrictions on Mixing 32-Bit and 48-Bit Words on page 5-28.

The long word (LW) mnemonic only affects normal word address accesses and overrides all other factors (SIMD, IMDWx).
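The truncation and zero-fill behavior for a memory block configured for 32-bit words can be modeled with two small helpers. This is a minimal Python sketch with hypothetical function names; it models a 40-bit data register as an integer and is not a description of the hardware data path.

```python
def store_normal_word(reg40: int) -> int:
    """Register-to-memory: drop the low 8 bits, keep the 32 most significant bits."""
    return (reg40 >> 8) & 0xFFFFFFFF


def load_normal_word(word32: int) -> int:
    """Memory-to-register: place the word in bits 39-8 and zero-fill bits 7-0."""
    return (word32 & 0xFFFFFFFF) << 8


register = 0x12345678AB                 # 40-bit register contents
memory_word = store_normal_word(register)
print(hex(memory_word))                 # low byte 0xAB is not stored
print(hex(load_normal_word(memory_word)))
```

A store followed by a load therefore returns the original register value with its low 8 bits cleared, which is why 40-bit data in such a block has to go through the PX register instead.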
one of the four banks, the processor asserts the corresponding memory select line (MS3-0). The size of the memory banks is 3.67 Mwords (SRAM) or 15.67 Mwords (SDRAM).
Accesses in SIMD mode transfer both an explicit (named) location and an implicit (un-named, complementary) location. The explicit transfer is a data transfer between the explicit register and the explicit address, and the implicit transfer is between the implicit register and the implicit address. For information on complementary (implicit) registers in SIMD mode accesses, see Secondary Processing Element (PEy) on page 2-37. For more information on complementary (implicit) memory locations in SIMD mode accesses, see Accessing Memory on page 5-46.
The post increment in the explicit operation is performed before the implicit instructions are executed.
tion Detected (IICDI) interrupt if the interrupt is enabled in the IMASK register. For more information, see Mode Control 2 Register (MODE2) on page A-10. The following code example shows the access for even and odd addresses. When accessing an odd address, the sticky bit is set to indicate the unaligned access.
bit set mode2 U64MAE;     //set testbit for align or unaligned 64 bit access
r0=0x11111111;
r1=0x22222222;
pm(0x4e800)=r0(lw);       //even address in 32 bit, access is aligned
pm(0x4e803)=r0(lw);       //odd address in 32 bit, sticky bit is set
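The alignment rule illustrated by the two stores above reduces to a test of the least significant address bit. A minimal Python sketch, with a hypothetical function name, assuming the access is a forced long word access through a normal word address:

```python
def forced_lw_is_aligned(normal_word_address: int) -> bool:
    """A forced long word access is aligned when the normal word address is even."""
    return (normal_word_address & 1) == 0


for address in (0x4E800, 0x4E803):
    state = "aligned" if forced_lw_is_aligned(address) else "unaligned"
    print(hex(address), state)
```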
that is enabled during reset or on DSPs with ID2-0=00x. The external bank access modes appear in Table 5-5. The WAIT register bit descriptions appear in Figure 5-15.

Table 5-5. External Bank Access Mode

EBxAM Field   External Bank Access Mode
00            Asynchronous. RD and WR strobes change before CLKOUT's edge. Accesses use the waitstate count setting from EBxWS AND require external acknowledge (ACK), allowing a de-asserted ACK to extend the access time.
01            Synchronous. RD and WR strobes change on CLKOUT's edge. Accesses use the waitstate count setting from EBxWS (minimum EBxWS=001) AND require external acknowledge (ACK), allowing a de-asserted ACK to extend the read access time. Writes are 0-wait state.
10            Synchronous. RD and WR strobes change on CLKOUT's edge. Accesses use the waitstate count setting from EBxWS (minimum EBxWS=001) AND require external acknowledge (ACK), allowing a de-asserted ACK to extend the read access time. Writes are 1-wait state.
11            Reserved
[Figure 5-15: WAIT register (0x0002) bit descriptions; bit diagram not reproduced. Fields labeled in the diagram:]

HIDMA   Handshake and Idle for DMA enable
        0 = no idle cycle
        1 = adds an idle cycle after every handshake DMA; DMAG asserted longer reduces bus contention for slower devices
RBAM    ROM Boot Access Mode
RBWS    ROM Boot Waitstates
EB3AM   External Bank 3 Access Mode
EB3WS   External Bank 3 Waitstates
EB2AM   External Bank 2 Access Mode
EB2WS   External Bank 2 Waitstates
EB1AM   External Bank 1 Access Mode
EB1WS   External Bank 1 Waitstates
EB0AM   External Bank 0 Access Mode
        00 = Async, uses both internal waitstate and external ACK
        01 = Sync (RD~ and WR~ change on CLKOUT's edge), minimum 2-cycle reads, 1-cycle writes (EB0WS=001)
        10 = Sync (RD~ and WR~ change on CLKOUT's edge), minimum 2-cycle reads, 2-cycle writes (EB0WS=001)
        11 = reserved
EB0WS   External Bank 0 Waitstates
        000 = 0 waitstates, no hold time cycle
        001 = 1 waitstate, no hold time cycle, minimum for sync
        010 = 2 waitstates, hold time cycle
        011 = 3 waitstates, hold time cycle
        100 = 4 waitstates, hold time cycle
        101 = 5 waitstates, hold time cycle
        110 = 6 waitstates, hold time cycle
        111 = 7 waitstates, hold time cycle
        (hold time cycles for Async mode only)

Reset value as drawn for bits 15-0: 0111 0011 1001 1100 (0x739C).
The processor applies hold time cycles regardless of the external bank access mode (EBxAM). For example, the asynchronous (ACK plus waitstate) mode could also have an associated hold time cycle.
Accessing Memory
The word widths of processor core accesses to internal memory include the following:
• 48-bit access for instruction words, extended-precision normal word (40-bit) data, and the PX register
• 64-bit access for long word data, and for normal word (32-bit) or PX register data with the LW mnemonic
• 32-bit access for normal word (32-bit) data
• 16-bit access for short word data
The processor determines whether a normal word access is 32- or 40-bit from the internal memory block's IMDWx setting. For more information, see Internal Memory Data Width on page 5-37. While mixed accesses of 48-bit words and 16-, 32-, or 64-bit words at the same address are not allowed, mixed reads and writes of 16-, 32-, and 64-bit words to the same address are allowed. For more information, see Restrictions on Mixing 32-Bit and 48-Bit Words on page 5-28.

The processor's DM and PM buses support 24 combinations of register-to-memory data access options. The following factors influence the data access type:
• Size of words: short word, normal word, extended-precision normal word, or long word
• Number of words: single- or dual-data move
• Mode of processor: SISD, SIMD, or broadcast load
The processor's external memory accommodates the following word sizes:
• 48-bit instruction words
• 40-bit extended-precision normal word data (accessed as 48-bit via PX)
• 32-bit normal word data

Long Word (64-Bit) Accesses

A program makes a long word (64-bit) access to internal memory using an access to a long word address. Programs can also make a 64-bit access through normal word addressing with the LW mnemonic or through a PX register move with the LW mnemonic. Programs may not use long word addressing to access multiprocessor memory space or external memory. The address ranges for internal memory accesses appear in Figure 5-7 on page 5-17. Because the ADSP-21161 processor's external port is 32 bits wide, SIMD and long word accesses to external memory are not supported.

When data is accessed using long word addressing, the data is always long word aligned on 64-bit boundaries in internal memory space. When data is accessed using normal word addressing and the LW mnemonic, the program should maintain this alignment by using an even normal word address (least significant bit of address = 0). This address selection aligns the normal word address with a 64-bit boundary (long word address).

All long word accesses load or store two consecutive 32-bit data values. The register file source or destination of a long word access is a set of two neighboring data registers in a processing element. In a forced long word access (one that uses the LW mnemonic), the even (normal word address) location moves to or from the explicit register in the neighbor-pair, and the odd (normal word address) location moves to or from the implicit register in the neighbor-pair. For example, the following long word moves could occur:
DM(0x40000) = R0 (LW);
/* The data in R0 moves to location DM(0x40000), and the data in R1 moves to location DM(0x40001) */

R0 = DM(0x40003) (LW);
/* The data at location DM(0x40002) moves to R0, and the data at location DM(0x40003) moves to R1 */
The example shows that R0 and R1 are neighbor registers in the same processing element. Table 5-7 lists the other neighbor register assignments that apply to long word accesses. In un-forced long word accesses (accesses to LW memory space), the processor places the lower 32 bits of the long word in the named (explicit) register and places the upper 32 bits of the long word in the neighbor (implicit) register.

Table 5-7. Neighbor Registers for Long Word Accesses

PEx neighbor registers    PEy neighbor registers
r0 neighbors r1           s0 neighbors s1
r2 neighbors r3           s2 neighbors s3
r4 neighbors r5           s4 neighbors s5
r6 neighbors r7           s6 neighbors s7
r8 neighbors r9           s8 neighbors s9
r10 neighbors r11         s10 neighbors s11
r12 neighbors r13         s12 neighbors s13
r14 neighbors r15         s14 neighbors s15
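The pairing rule for forced long word accesses can be sketched as a small model. This Python sketch uses hypothetical names and register numbers (0 for R0, 1 for R1, and so on); it follows the rule stated above that the even location pairs with the explicit register, and it assumes the neighbor relation in Table 5-7 is symmetric, so that the neighbor of register k is k with its lowest bit flipped.

```python
def forced_long_word_pairs(address: int, explicit_reg: int) -> dict:
    """Map each normal word address of a forced LW access to a register number."""
    even_address = address & ~1        # the access covers the aligned address pair
    implicit_reg = explicit_reg ^ 1    # neighbor per Table 5-7 (assumed symmetric)
    return {even_address: explicit_reg, even_address + 1: implicit_reg}


# DM(0x40000) = R0 (LW);   R0 <-> 0x40000, R1 <-> 0x40001
print(forced_long_word_pairs(0x40000, 0))
# R0 = DM(0x40003) (LW);   0x40002 -> R0, 0x40003 -> R1
print(forced_long_word_pairs(0x40003, 0))
```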
Programs can monitor for unaligned 64-bit accesses by enabling the U64MAE bit. For more information, see Unaligned 64-Bit Memory Access on page 5-41. The long word (LW) mnemonic only affects normal word address accesses and overrides all other factors (PEYEN, IMDWx).

Instruction Word (48-Bit) and Extended-Precision Normal Word (40-Bit) Accesses

The sequencer uses 48-bit memory accesses for instruction fetches. Programs can make 48-bit accesses with PX register moves, which default to 48-bit.

A program makes an extended-precision normal word (40-bit) access to internal memory using an access to a normal word address when that internal memory block's IMDWx bit is set (=1) for 40-bit words. Programs may not use extended-precision normal word addressing to access multiprocessor memory space or external memory. The address ranges for internal memory accesses appear in Figure 5-7 on page 5-17. For more information on configuring memory for extended-precision normal word accesses, see Internal Memory Data Width on page 5-37.

The processor transfers the 40-bit data to internal memory as a 48-bit value, zero-filling the least significant 8 bits on stores and truncating these 8 bits on loads. The register file source or destination of such an access is a single 40-bit data register.

Normal Word (32-Bit) Accesses

A program makes a normal word (32-bit) access to internal memory using an access to a normal word address when that internal memory block's IMDWx bit is cleared (=0) for 32-bit words. Programs use normal word addressing to access all processor memory spaces: internal, multiprocessor, and external memory space. The address ranges for memory accesses appear in Figure 5-7 on page 5-17 and Figure 5-9 on page 5-23.
The register file source or destination of a normal word access is a single 40-bit data register. The processor zero-fills the least significant 8 bits on loads and truncates these bits on stores. External memory space accesses using normal word addressing and the LW mnemonic perform a 32-bit access, not a 64-bit access.

Short Word (16-Bit) Accesses

A program makes a short word (16-bit) access to internal memory using an access to a short word address. Programs may not use short word addressing to access multiprocessor memory space or external memory. The address ranges for internal memory accesses appear in Figure 5-7 on page 5-17.

The register file source or destination of such an access is a single 40-bit data register. The processor zero-fills the least significant 8 bits on loads and truncates these bits on stores. Depending on the value of the SSE bit in the MODE1 system register, the processor loads the register's upper 16 bits by either:
• Zero-filling these bits if SSE=0
• Sign-extending these bits if SSE=1
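The short word load rule can be modeled as follows. This is a minimal Python sketch with a hypothetical function name; it assumes the register layout shown in the data flow figures later in this section, where the short word occupies bits 23-8 of the 40-bit register, bits 7-0 are zero, and bits 39-24 are the zero-filled or sign-extended upper 16 bits.

```python
def load_short_word(word16: int, sse: bool) -> int:
    """Return the 40-bit register value produced by loading a 16-bit short word."""
    word16 &= 0xFFFF
    negative = bool(word16 & 0x8000)
    upper = 0xFFFF if (sse and negative) else 0x0000   # bits 39-24
    return (upper << 24) | (word16 << 8)               # bits 7-0 stay zero


print(hex(load_short_word(0x8001, sse=False)))   # zero-filled upper bits
print(hex(load_short_word(0x8001, sse=True)))    # sign-extended upper bits
```

With SSE=1 the sign of the 16-bit value is preserved across the full register width, so a later arithmetic operation sees the same signed quantity.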
For examples of data flow paths for single- and dual-data transfers, see Data Access Options on page 5-52.
Symbols: LW = 64-bit data value (two 32-bit values), EW = 40-bit data value (48-bit value), NW = 32-bit data value, SW = 16-bit data value, and SWx2 = two 16-bit data values.
Short Word Addressing of Single Data in SISD Mode

Figure 5-16 displays one possible SISD mode, single data, short word addressed access. For short word addressing, the processor treats the data buses as four 16-bit short word lanes. The 16-bit value for the short word access transfers using the least significant short word lane of the PM or DM data bus. The processor drives the other short word lanes of the data buses with zeros.

In SISD mode, the instruction accesses PEx registers to transfer data from memory. This instruction accesses WORD X0, whose short word address has 00 for its least significant two bits of address. Other locations within this row have addresses with least significant two bits of 01, 10, or 11 and select WORD X1, WORD X2, or WORD X3 from memory, respectively. The syntax targets register RX in PEx. The example would target a PEy register if using the syntax SX.

The cross symbol in the PEx registers in Figure 5-16 indicates that the processor zero-fills or sign-extends the most significant 16 bits of the data register while loading the short word value into a 40-bit data register. This depends on the state of the SSE bit in the MODE1 system register. For short word transfers, the least significant 8 bits of the data register are always zero.
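The selection of a short word within a row, and its placement on the bus, can be sketched as follows. This Python model uses hypothetical names; a row is represented as a list of four 16-bit values [X0, X1, X2, X3], and the returned integer stands for the 64-bit data bus value.

```python
def sisd_short_word_read(row: list, short_word_address: int) -> int:
    """Return the 64-bit bus value for a SISD short word read from one row."""
    column = short_word_address & 0b11      # 00, 01, 10, 11 select X0..X3
    # The word travels on the least significant lane (bits 15-0);
    # the other three lanes (bits 63-16) are driven with zeros.
    return row[column] & 0xFFFF


row = [0x1111, 0x2222, 0x3333, 0x4444]      # WORD X0 .. WORD X3
print(hex(sisd_short_word_read(row, 0x80000)))
print(hex(sisd_short_word_read(row, 0x80002)))
```

The short word addresses in the usage lines are arbitrary illustrative values; only their two least significant bits matter to the model.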
[Figure 5-16: data flow for short word addressing of single data in SISD mode; diagram not reproduced. Block 0 (PM) holds WORD Y0 through WORD Y11 and block 1 (DM) holds WORD X0 through WORD X11. The bus not used by the instruction is marked NO ACCESS. On the bus in use, bits 15-0 carry WORD X0 and bits 63-48, 47-32, and 31-16 carry 0x0000. In RX, WORD X0 occupies bits 23-8 and bits 7-0 are 0x00.]

This example shows the data flow for the instruction:
RX = DM(SHORT WORD X0 ADDRESS);

Other instructions with similar data flows for SISD, short word, single-data transfers are:
UREG = PM(SHORT WORD ADDRESS);
UREG = DM(SHORT WORD ADDRESS);
PM(SHORT WORD ADDRESS) = UREG;
DM(SHORT WORD ADDRESS) = UREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.
Short Word Addressing of Single Data in SIMD Mode

Figure 5-17 displays one possible SIMD mode, single data, short word addressed access. For short word addressing, the processor treats the data buses as four 16-bit short word lanes. The explicitly addressed (named in the instruction) 16-bit value transfers using the least significant short word lane of the PM or DM data bus. The implicitly addressed (not named in the instruction, but inferred from the address in SIMD mode) short word value transfers using the 47-32 bit short word lane of the PM or DM data bus. The processor drives the other short word lanes of the PM or DM data buses with zeros.

The instruction explicitly accesses the register RX and implicitly accesses that register's complementary register, SX. This instruction uses a PEx register with an RX mnemonic. If the syntax named a PEy register SX as the explicit target, the processor would use that register's complement, RX, as the implicit target. For more information on complementary registers, see Secondary Processing Element (PEy) on page 2-37.

The cross symbol in the PEx and PEy registers in Figure 5-17 indicates that the processor zero-fills or sign-extends the most significant 16 bits of the data register while loading the short word value into a 40-bit data register. This depends on the state of the SSE bit in the MODE1 system register. For short word accesses, the least significant 8 bits of the data register are always zero.

Figure 5-17 shows the data path for one transfer. The processor accesses short words sequentially in memory. Table 5-9 shows the pattern of SIMD mode short word accesses. For more information on arranging data in memory to take advantage of this access pattern, see Arranging Data in Memory on page 5-100.
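The lane assignment for a SIMD short word transfer can be illustrated by packing the two values into a 64-bit bus word. This is a minimal Python sketch with hypothetical function names; it models only the lane positions stated above and does not model how the implicit address is derived.

```python
def simd_short_word_bus(explicit16: int, implicit16: int) -> int:
    """Pack explicit (bits 15-0) and implicit (bits 47-32) short words on the bus."""
    return ((implicit16 & 0xFFFF) << 32) | (explicit16 & 0xFFFF)


def split_simd_short_word_bus(bus: int) -> tuple:
    """Recover the (explicit, implicit) short words from a 64-bit bus value."""
    return bus & 0xFFFF, (bus >> 32) & 0xFFFF


bus = simd_short_word_bus(0x1234, 0xABCD)
print(hex(bus))                         # lanes 63-48 and 31-16 remain zero
print(split_simd_short_word_bus(bus))
```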
[Figure 5-17: data flow for short word addressing of single data in SIMD mode; diagram not reproduced. Block 0 (PM) holds WORD Y0 through WORD Y11 and block 1 (DM) holds WORD X0 through WORD X11. The bus not used by the instruction is marked NO ACCESS. On the bus in use, bits 15-0 carry WORD X0, bits 47-32 carry WORD X2, and bits 63-48 and 31-16 carry 0x0000. In RX, WORD X0 occupies bits 23-8 and bits 7-0 are 0x00.]

This example shows the data flow for the instruction:
RX = DM(SHORT WORD X0 ADDRESS);

Other instructions with similar data flows for SIMD, short word, single-data transfers are:
UREG = PM(SHORT WORD ADDRESS);
UREG = DM(SHORT WORD ADDRESS);
PM(SHORT WORD ADDRESS) = UREG;
DM(SHORT WORD ADDRESS) = UREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.
Short Word Addressing of Dual-Data in SISD Mode

Figure 5-18 displays one possible SISD mode, dual-data, short word addressed access. For short word addressing, the processor treats the data buses as four 16-bit short word lanes. The 16-bit values for short word accesses transfer using the least significant short word lanes of the PM and DM data buses. The processor drives the other short word lanes of the data buses with zeros.

Note that the accesses on both buses do not have to be the same word width. SISD mode dual-data accesses can handle any combination of short word, normal word, extended-precision normal word, or long word accesses. For more information, see Mixed Word Width Addressing of Dual Data in SISD Mode on page 5-82.

In SISD mode, the instruction explicitly accesses PEx registers. This instruction accesses WORD X0 in block 1 and WORD Y0 in block 0. Each of these words has a short word address with 00 for its least significant two bits of address. Other accesses within these 4-column locations have addresses with least significant two bits of 01, 10, or 11 and select WORD X1/Y1, WORD X2/Y2, or WORD X3/Y3 from memory, respectively. The syntax explicitly accesses registers RX and RY in PEx. The example would target PEy registers if using the syntax SX or SY.
[Figure 5-18: data flow for short word addressing of dual data in SISD mode; diagram not reproduced. Block 0 (PM) and block 1 (DM, holding WORD X0 through WORD X11) each supply one short word. On the DM data bus, bits 15-0 carry WORD X0 and the remaining lanes carry zeros.]

This example shows the data flow for the instruction:
RX = DM(SHORT WORD X0 ADDRESS), RA = PM(SHORT WORD Y0 ADDRESS);

Other instructions with similar data flows for SISD, short word, dual-data transfers are:
DREG = PM(SHORT WORD ADDRESS), DREG = DM(SHORT WORD ADDRESS);
PM(SHORT WORD ADDRESS) = DREG, DM(SHORT WORD ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.
The cross symbol in the PEx registers in Figure 5-18 indicates that the processor zero-fills or sign-extends the most significant 16 bits of the data register while loading a short word value into a 40-bit data register. This depends on the state of the SSE bit in the MODE1 system register. For short word accesses, the least significant 8 bits of the data register are always zero.

Short Word Addressing of Dual-Data in SIMD Mode

Figure 5-19 displays one possible SIMD mode, dual-data, short word addressed access. For short word addressing, the processor treats the data buses as four 16-bit short word lanes. The explicitly addressed (named in the instruction) 16-bit values transfer using the least significant short word lanes of the PM and DM data buses. The implicitly addressed (not named in the instruction, but inferred from the address in SIMD mode) short word values transfer using the 47-32 bit short word lanes of the PM and DM data buses. The processor drives the other short word lanes of the PM and DM data buses with zeros.

The accesses on both buses do not have to be the same word width. SIMD mode dual-data accesses can handle combinations of short word and normal word or extended-precision normal word and long word accesses. For more information, see Mixed Word Width Addressing of Dual Data in SIMD Mode on page 5-84.

The instruction explicitly accesses registers RX and RA, and implicitly accesses the complementary registers, SX and SA. This instruction uses PEx registers with the RX and RA mnemonics. If the syntax named PEy registers SX and SA as the explicit targets, the processor would use those registers' complements, RX and RA, as the implicit targets. For more information on complementary registers, see Secondary Processing Element (PEy) on page 2-37.
[Figure 5-19: data flow for short word addressing of dual data in SIMD mode; diagram not reproduced. On the PM data bus, bits 47-32 carry WORD Y2. On the DM data bus, bits 15-0 carry WORD X0, bits 47-32 carry WORD X2, and bits 63-48 and 31-16 carry 0x0000. In the register receiving WORD Y0, the word occupies bits 23-8 and bits 7-0 are 0x00.]

This example shows the data flow for the instruction:
RX = DM(SHORT WORD X0 ADDRESS), RA = PM(SHORT WORD Y0 ADDRESS);

Other instructions with similar data flows for SIMD, short word, dual-data transfers are:
DREG = PM(SHORT WORD ADDRESS), DREG = DM(SHORT WORD ADDRESS);
PM(SHORT WORD ADDRESS) = DREG, DM(SHORT WORD ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.
The cross symbol in the PEx and PEy registers in Figure 5-19 indicates that the processor zero-fills or sign-extends the most significant 16 bits of the data registers while loading the short word values into the 40-bit data registers. This depends on the state of the SSE bit in the MODE1 system register. For short word accesses, the least significant 8 bits of the data register are always zero.

Figure 5-19 shows the data path for one transfer. The processor accesses short words sequentially in memory. Table 5-9 on page 5-58 shows the pattern of SIMD mode short word accesses. For more information on arranging data in memory to take advantage of this access pattern, see Arranging Data in Memory on page 5-100.

32-Bit Normal Word Addressing of Single Data in SISD Mode

Figure 5-20 displays one possible SISD mode, single data, 32-bit normal word addressed access. For normal word addressing, the processor treats the data buses as two 32-bit normal word lanes. The 32-bit value for the normal word access transfers using the least significant normal word lane of the PM or DM data bus. The processor drives the other normal word lanes of the data buses with zeros.

In SISD mode, the instruction accesses a PEx register. This instruction accesses WORD X0, whose normal word address has 0 for its least significant address bit. The other access within this 4-column location has an address with a least significant bit of 1 and selects WORD X1 from memory. The syntax targets register RX in PEx. The example would target a PEy register if using the syntax SX.

For normal word accesses, the processor zero-fills the least significant 8 bits of the data register on loads and truncates these bits on stores to memory.
[Diagram not reproduced. Block 0 (PM) holds WORD Y words and block 1 (DM) holds WORD X0 through WORD X5. The bus not used by the instruction is marked NO ACCESS. On the bus in use, bits 31-0 carry WORD X0 and bits 63-48 and 47-32 carry 0x0000.]

This example shows the data flow for the instruction:
RX = DM(NORMAL WORD X0 ADDRESS);

Other instructions with similar data flows for SISD, normal word, single-data transfers are:
UREG = PM(NORMAL WORD ADDRESS);
UREG = DM(NORMAL WORD ADDRESS);
PM(NORMAL WORD ADDRESS) = UREG;
DM(NORMAL WORD ADDRESS) = UREG;

Figure 5-20. 32-Bit Normal Word Addressing of Single Data in SISD Mode
32-Bit Normal Word Addressing of Single Data in SIMD Mode

Figure 5-21 displays one possible SIMD mode, single data, normal word addressed access. For normal word addressing, the processor treats the data buses as two 32-bit normal word lanes. The explicitly addressed (named in the instruction) 32-bit value transfers using the least significant normal word lane of the PM or DM data bus. The implicitly addressed (not named in the instruction, but inferred from the address in SIMD mode) normal word value transfers using the most significant normal word lane of the PM or DM data bus.

In Figure 5-21, the explicit access targets the named register RX, and the implicit access targets that register's complementary register, SX. This case uses a PEx register with an RX mnemonic. If the syntax named a PEy register SX as the explicit target, the processor would use that register's complement, RX, as the implicit target. For more information on complementary registers, see Secondary Processing Element (PEy) on page 2-37.

For normal word accesses, the processor zero-fills the least significant 8 bits of the data register on loads and truncates these bits on stores to memory.

Figure 5-21 shows the data path for one transfer. For normal word accesses, the processor accesses normal words sequentially in memory. Table 5-9 shows the pattern of SIMD mode normal word accesses. For more information on arranging data in memory to take advantage of this access pattern, see Arranging Data in Memory on page 5-100.
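The normal word lane assignment in SIMD mode can be illustrated the same way as for short words, with two 32-bit lanes instead of four 16-bit lanes. This is a minimal Python sketch with hypothetical function names; it models only the lane positions stated above.

```python
def simd_normal_word_bus(explicit32: int, implicit32: int) -> int:
    """Pack explicit (bits 31-0) and implicit (bits 63-32) normal words on the bus."""
    return ((implicit32 & 0xFFFFFFFF) << 32) | (explicit32 & 0xFFFFFFFF)


def split_simd_normal_word_bus(bus: int) -> tuple:
    """Recover the (explicit, implicit) normal words from a 64-bit bus value."""
    return bus & 0xFFFFFFFF, (bus >> 32) & 0xFFFFFFFF


bus = simd_normal_word_bus(0x11111111, 0x22222222)
print(hex(bus))
print([hex(word) for word in split_simd_normal_word_bus(bus)])
```

Unlike the short word case, no lane is left unused, so a single SIMD normal word transfer fills the full 64-bit bus.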
[Diagram not reproduced. Block 0 (PM) and block 1 (DM) hold normal words in pairs. The bus not used by the instruction is marked NO ACCESS. On the bus in use, bits 31-0 carry WORD X0 and bits 63-32 carry WORD X1. WORD X0 is loaded into the PEx register and WORD X1 into the PEy register.]

This example shows the data flow for the instruction:
RX = DM(NORMAL WORD X0 ADDRESS);

Other instructions with similar data flows for SIMD, normal word, single-data transfers are:
UREG = PM(NORMAL WORD ADDRESS);
UREG = DM(NORMAL WORD ADDRESS);
PM(NORMAL WORD ADDRESS) = UREG;
DM(NORMAL WORD ADDRESS) = UREG;

Figure 5-21. 32-Bit Normal Word Addressing of Single Data in SIMD Mode
32-Bit Normal Word Addressing of Dual Data in SISD Mode

Figure 5-22 displays one possible SISD mode, dual data, 32-bit normal word addressed access. For normal word addressing, the processor treats the data buses as two 32-bit normal word lanes. The 32-bit values for normal word accesses transfer using the least significant normal word lanes of the PM and DM data buses. The processor drives the other normal word lanes of the data buses with zeros.

Note that the accesses on both buses do not have to be the same word width. SISD mode dual-data accesses can handle any combination of short word, normal word, extended-precision normal word, or long word accesses. For more information, see Mixed Word Width Addressing of Dual Data in SISD Mode on page 5-82.

In Figure 5-22, the access targets PEx registers in a SISD mode operation. This case accesses WORD X0 in block 1 and WORD Y0 in block 0. Each of these words has a normal word address with 0 for its least significant address bit. Other accesses within these 4-column locations have addresses with a least significant bit of 1 and select WORD X1/Y1 from memory. The syntax targets registers RX and RY in PEx. The example would target PEy registers if using the syntax SX or SY.

For normal word accesses, the processor zero-fills the least significant 8 bits of the data register on loads and truncates these bits on stores to memory.
Memory
This example shows the data flow for the instruction:
RX = DM(NORMAL WORD X0 ADDRESS), RY = PM(NORMAL WORD Y0 ADDRESS);

Other instructions with similar data flows for SISD, normal word, dual-data transfers are:
DREG = PM(NORMAL WORD ADDRESS), DREG = DM(NORMAL WORD ADDRESS);
PM(NORMAL WORD ADDRESS) = DREG, DM(NORMAL WORD ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-22. 32-Bit Normal Word Addressing of Dual Data in SISD Mode
32-Bit Normal Word Addressing of Dual Data in SIMD Mode

Figure 5-23 displays one possible SIMD mode, dual data, 32-bit normal word addressed access. For normal word addressing, the processor treats the data buses as two 32-bit normal word lanes. The explicitly addressed (named in the instruction) 32-bit values transfer using the least significant normal word lane of the PM or DM data bus. The implicitly addressed (not named in the instruction, but inferred from the address in SIMD mode) normal word values transfer using the most significant normal word lanes of the PM and DM data buses.

Note that the accesses on both buses do not have to be the same word width. SIMD mode dual-data accesses can handle combinations of short word and normal word or extended-precision normal word and long word accesses. For more information, see "Mixed Word Width Addressing of Dual Data in SIMD Mode" on page 5-84.

In Figure 5-23, the explicit access targets the named registers RX and RA, and the implicit access targets those registers' complementary registers SX and SA. This case uses PEx registers with the RX and RA mnemonics. If the syntax named PEy registers SX and SA as the explicit targets, the processor would use those registers' complements RX and RA as the implicit targets. For more information on complementary registers, see "Secondary Processing Element (PEy)" on page 2-37.

For normal word accesses, the processor zero-fills the least significant 8 bits of the data register on loads and truncates these bits on stores to memory.

Figure 5-23 shows the data path for one transfer. For normal word accesses, the processor accesses normal words sequentially in memory. Table 5-9 on page 5-58 shows the pattern of SIMD mode normal word accesses. For more information on arranging data in memory to take advantage of this access pattern, see "Arranging Data in Memory" on page 5-100.
This example shows the data flow for the instruction:
RX = DM(NORMAL WORD X0 ADDRESS), RY = PM(NORMAL WORD Y0 ADDRESS);

Other instructions with similar data flows for SIMD, normal word, dual-data transfers are:
DREG = PM(NORMAL WORD ADDRESS), DREG = DM(NORMAL WORD ADDRESS);
PM(NORMAL WORD ADDRESS) = DREG, DM(NORMAL WORD ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-23. 32-Bit Normal Word Addressing of Dual Data in SIMD Mode

ADSP-21161 SHARC Processor Hardware Reference
Extended Precision Normal Word Addressing of Single Data

Figure 5-24 displays one possible single data, 40-bit extended-precision normal word addressed access. For extended-precision normal word addressing, the processor treats each data bus as a 40-bit extended-precision normal word lane. The 40-bit value for the extended-precision normal word access transfers using the most significant 40 bits of the PM or DM data bus. The processor drives the lower 24 bits of the data buses with zeros.

In Figure 5-24, the access targets a PEx register in a SISD or SIMD mode operation; extended-precision normal word single-data accesses operate the same in SISD or SIMD mode. This case accesses WORD X0 with syntax that targets register RX in PEx. The example would target a PEy register if using the syntax SX.
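As an illustration of the lane placement described above, the following minimal Python sketch positions a 40-bit value in the most significant 40 bits of a 64-bit bus word and leaves the lower 24 bits zero. The function name is hypothetical, and the sketch models only the bit positions stated in the text, not bus timing or memory access.

```python
BUS_BITS = 64   # width of the PM or DM data bus
EP_BITS = 40    # width of an extended-precision normal word


def ep_word_on_bus(value40: int) -> int:
    """Place a 40-bit value in bus bits 63-24; bits 23-0 are driven to zero."""
    mask = (1 << EP_BITS) - 1
    return (value40 & mask) << (BUS_BITS - EP_BITS)
```

For example, the value 0x0000000001 appears on the modeled bus as 0x0000000001000000.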
This example shows the data flow for the instruction:
RX = DM(EXTENDED-PRECISION NORMAL WORD X0 ADDRESS);

Other instructions with similar data flows for broadcast, extended-precision normal word, single-data transfers are:
UREG = PM(EP NORMAL WORD ADDRESS);
UREG = DM(EP NORMAL WORD ADDRESS);
PM(EP NORMAL WORD ADDRESS) = UREG;
DM(EP NORMAL WORD ADDRESS) = UREG;

Figure 5-24. Extended-Precision Normal Word Addressing of Single Data
Extended Precision Normal Word Addressing of Dual Data in SISD Mode

Figure 5-25 displays one possible SISD mode, dual data, 40-bit extended-precision normal word addressed access. For extended-precision normal word addressing, the processor treats each data bus as a 40-bit extended-precision normal word lane. The 40-bit values for the extended-precision normal word accesses transfer using the most significant 40 bits of the PM and DM data buses. The processor drives the lower 24 bits of the data buses with zeros.

Note that the accesses on both buses do not have to be the same word width. SISD mode dual-data accesses can handle any combination of short word, normal word, extended-precision normal word, or long word accesses. For more information, see "Mixed Word Width Addressing of Dual Data in SISD Mode" on page 5-82.

In Figure 5-25, the access targets PEx registers in a SISD mode operation. This case accesses WORD X0 in block 1 and WORD Y0 in block 0 with syntax that targets registers RX and RY in PEx. The example would target PEy registers if using the syntax SX or SY.
This example shows the data flow for the instruction:
RX = DM(EP NORMAL WORD X0 ADDR.), RY = PM(EP NORMAL WORD Y0 ADDR.);

Other instructions with similar data flows for SISD, extended-precision normal word, dual-data transfers are:
DREG = PM(EXT. PREC. NORMAL WORD ADDRESS), DREG = DM(EXT. PREC. NORMAL WORD ADDRESS);
PM(EXT. PREC. NORMAL WORD ADDRESS) = DREG, DM(EXT. PREC. NORMAL WORD ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-25. Extended-Precision Normal Word Addressing of Dual Data in SISD Mode
Extended-Precision Normal Word Addressing of Dual Data in SIMD Mode

Figure 5-26 displays one possible SIMD mode, dual data, 40-bit extended-precision normal word addressed access. For extended-precision normal word addressing, the processor treats each data bus as a 40-bit extended-precision normal word lane.

Because this word size approaches the limit of the data buses' capacity, this SIMD mode transfer only moves the explicitly addressed locations and restricts data bus usage. The explicitly addressed (named in the instruction) 40-bit values transferred over the DM bus must source or sink a PEx data register, and the explicitly addressed (named in the instruction) 40-bit values transferred over the PM bus must source or sink a PEy data register; there are no implicit transfers in this mode.

The 40-bit values for the extended-precision normal word accesses transfer using the most significant 40 bits of the PM and DM data buses. The processor drives the lower 24 bits of the data buses with zeros.

The accesses on both buses do not have to be the same word width. This special case of SIMD mode dual-data accesses can handle any combination of extended-precision normal word or long word accesses. For more information, see "Mixed Word Width Addressing of Dual Data in SIMD Mode" on page 5-84.

In Figure 5-26, the access targets PEx and PEy registers in a SIMD mode operation. This case accesses WORD X0 in block 1 with syntax that targets register RX in PEx and accesses WORD Y0 in block 0 with syntax that targets register SX in PEy.
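The bus-to-processing-element restriction above can be written as a small check. This is a hedged Python sketch with hypothetical function names. It assumes only the naming convention used in this section's examples, where mnemonics beginning with R (RX, RY, RA, RB) name PEx registers and mnemonics beginning with S (SX, SY, SA, SB) name PEy registers.

```python
def processing_element(register: str) -> str:
    """Map a register mnemonic from this section's examples to its element."""
    first = register.strip().upper()[:1]
    if first == "R":
        return "PEx"
    if first == "S":
        return "PEy"
    raise ValueError(f"unrecognized register mnemonic: {register!r}")


def wide_simd_dual_access_ok(dm_register: str, pm_register: str) -> bool:
    """Check the rule for extended-precision SIMD dual-data accesses:
    the DM bus must source or sink a PEx register, the PM bus a PEy register."""
    return (processing_element(dm_register) == "PEx"
            and processing_element(pm_register) == "PEy")
```

The instruction shown for Figure 5-26 (RX on the DM bus, SX on the PM bus) passes this check; swapping the two registers does not.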
This example shows the data flow for the instruction:
RX = DM(EP NORMAL WORD X0 ADDR.), SX = PM(EP NORMAL WORD Y0 ADDR.);

Other instructions with similar data flows for SIMD, extended-precision normal word, dual-data transfers are:
PEY DREG = PM(EP NORMAL WORD ADDRESS), PEX DREG = DM(EP NORMAL WORD ADDRESS);
PM(EP NORMAL WORD ADDRESS) = PEY DREG, DM(EP NORMAL WORD ADDRESS) = PEX DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-26. Extended-Precision Normal Word Addressing of Dual Data in SIMD Mode
Long Word Addressing of Single Data

Figure 5-27 displays one possible single data, long word addressed access. For long word addressing, the processor treats each data bus as a 64-bit long word lane. The 64-bit value for the long word access transfers using the full width of the PM or DM data bus.

In Figure 5-27, the access targets a PEx register in a SISD or SIMD mode operation; long word single-data accesses operate the same in SISD or SIMD mode. This case accesses WORD X0 with syntax that explicitly targets register RX and implicitly targets its neighbor register RY in PEx. The example would target PEy registers if using the syntax SX. For more information on how neighbor registers (listed in Table 5-7 on page 5-49) work, see "Long Word (64-Bit) Accesses" on page 5-48.
This example shows the data flow for the instruction:
RX = DM(LONG WORD X0 ADDRESS);

Other instructions with similar data flows for SISD or SIMD, long word, single-data transfers are:
UREG = PM(LONG WORD ADDRESS);
UREG = DM(LONG WORD ADDRESS);
PM(LONG WORD ADDRESS) = UREG;
DM(LONG WORD ADDRESS) = UREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-27. Long Word Addressing of Single Data
Long Word Addressing of Dual Data in SISD Mode

Figure 5-28 displays one possible SISD mode, dual data, long word addressed access. For long word addressing, the processor treats each data bus as a 64-bit long word lane. The 64-bit values for the long word accesses transfer using the full width of the PM or DM data bus.

In Figure 5-28, the access targets PEx registers in a SISD mode operation. This case accesses WORD X0 and WORD Y0 with syntax that explicitly targets registers RX and RA and implicitly targets their neighbor registers RY and RB in PEx. The example would target PEy registers if using the syntax SX and SA. For more information on how neighbor registers (listed in Table 5-7 on page 5-49) work, see "Long Word (64-Bit) Accesses" on page 5-48.

Programs must be careful not to explicitly target neighbor registers in this case. While the syntax lets programs target these registers, one of the explicit accesses then targets the other access's implicit target. The processor resolves this conflict by performing only the access with higher priority. For more information on the priority order of data register file accesses, see "Data Register File" on page 2-30.
This example shows the data flow for the instruction:
RX = DM(LONG WORD X0 ADDRESS), RA = PM(LONG WORD Y0 ADDRESS);

Other instructions with similar data flows for SISD, long word, dual-data transfers are:
DREG = PM(LONG WORD ADDRESS), DREG = DM(LONG WORD ADDRESS);
PM(LONG WORD ADDRESS) = DREG, DM(LONG WORD ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-28. Long Word Addressing of Dual Data in SISD Mode
Long Word Addressing of Dual Data in SIMD Mode

Figure 5-29 displays one possible SIMD mode, dual data, long word addressed access targeting internal memory space. For long word addressing, the processor treats each data bus as a 64-bit long word lane. The 64-bit values for the long word accesses transfer using the full width of the PM or DM data bus.

Because this word size approaches the limit of the data buses' capacity, this SIMD mode transfer only moves the explicitly addressed locations and restricts data bus usage. The explicitly addressed (named in the instruction) 64-bit values transferred over the DM bus must source or sink a PEx data register, and the explicitly addressed (named in the instruction) 64-bit values transferred over the PM bus must source or sink a PEy data register; there are no implicit transfers in this mode.

In Figure 5-29, the access targets PEx and PEy registers in a SIMD mode operation. This case accesses WORD X0 in block 1 with syntax that targets register RX and its neighbor register RY in PEx and accesses WORD Y0 in block 0 with syntax that targets register SX and its neighbor register SY in PEy. For more information on how neighbor registers (listed in Table 5-7 on page 5-49) work, see "Long Word (64-Bit) Accesses" on page 5-48.

The accesses on both buses do not have to be the same word width. This special case of SIMD mode dual-data accesses can handle any combination of extended-precision normal word or long word accesses. For more information, see "Mixed Word Width Addressing of Dual Data in SIMD Mode" on page 5-84.
This example shows the data flow for the instruction:
RX = DM(LONG WORD X0 ADDRESS), SX = PM(LONG WORD Y0 ADDRESS);

Other instructions with similar data flows for SIMD, long word, dual-data transfers are:
PEY DREG = PM(LONG WORD ADDRESS), PEX DREG = DM(LONG WORD ADDRESS);
PM(LONG WORD ADDRESS) = PEY DREG, DM(LONG WORD ADDRESS) = PEX DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-29. Long Word Addressing of Dual Data in SIMD Mode
Mixed Word Width Addressing of Dual Data in SISD Mode

Figure 5-30 displays an example of a mixed word width, dual data, SISD mode access. This example shows how the processor transfers a long word access on the DM bus and transfers a short word access on the PM bus.

The memory architecture permits mixing all other combinations of dual-data SISD mode short word, normal word, extended-precision normal word, and long word accesses. In case of conflicting dual access to the data register file, the processor only performs the access with higher priority. For more information on how the processor prioritizes accesses, see "Data Register File" on page 2-30.
This example shows the data flow for the instruction:
RX = DM(LONG WORD X0 ADDRESS), RA = PM(SHORT WORD Y0 ADDRESS);

Other instructions with similar data flows for SISD, mixed-word, dual-data transfers are:
DREG = PM(SHORT, NORMAL, EP NORMAL, LONG ADD), DREG = DM(SHORT, NORMAL, EP NORMAL, LONG ADD);
PM(SHORT, NORMAL, EP NORMAL, LONG ADD) = DREG, DM(SHORT, NORMAL, EP NORMAL, LONG ADD) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-30. Mixed Word Width Addressing of Dual Data in SISD Mode
Mixed Word Width Addressing of Dual Data in SIMD Mode

Figure 5-31 displays an example of a mixed word width, dual data, SIMD mode access. This example shows how the processor transfers a long word access on the DM bus and transfers an extended-precision normal word access on the PM bus.

The memory architecture permits mixing SIMD mode dual data short word and normal word accesses or extended-precision normal word and long word accesses. No other combinations of mixed word dual-data SIMD mode accesses are permissible.
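The permitted word-width combinations for dual-data accesses can be summarized as a lookup. The following Python sketch is illustrative only; the function name and the width labels ("short", "normal", "ep_normal", "long") are hypothetical. It encodes the rules as stated in this section: any combination in SISD mode, and in SIMD mode either short word with normal word or extended-precision normal word with long word.

```python
ALL_WIDTHS = {"short", "normal", "ep_normal", "long"}
SIMD_NARROW = {"short", "normal"}      # may be mixed with each other in SIMD
SIMD_WIDE = {"ep_normal", "long"}      # may be mixed with each other in SIMD


def dual_access_allowed(mode: str, dm_width: str, pm_width: str) -> bool:
    """Return True if the DM/PM word-width pairing is permitted in this mode."""
    if dm_width not in ALL_WIDTHS or pm_width not in ALL_WIDTHS:
        raise ValueError("unknown word width")
    if mode == "SISD":
        return True                    # any combination is permitted
    if mode == "SIMD":
        pair = {dm_width, pm_width}
        return pair <= SIMD_NARROW or pair <= SIMD_WIDE
    raise ValueError("mode must be 'SISD' or 'SIMD'")
```

Under these rules, the Figure 5-30 pairing (long word on DM, short word on PM) is accepted only in SISD mode, while the Figure 5-31 pairing (long word on DM, extended-precision normal word on PM) is accepted in both modes.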
This example shows the data flow for the instruction:
RX = DM(LONG WORD X0 ADDRESS), SX = PM(EP NORMAL WORD Y0 ADDRESS);

Other instructions with similar data flows for SIMD, mixed-word, dual-data transfers are:
DREG = PM(ADDRESS), DREG = DM(ADDRESS);
PM(ADDRESS) = DREG, DM(ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-31. Mixed Word Width Addressing of Dual Data in SIMD Mode
Broadcast Load Access

Figure 5-32 through Figure 5-39 provide examples of broadcast load accesses for single- and dual-data transfers. These examples show that the broadcast load's memory and register access is a hybrid of the corresponding non-broadcast SISD and SIMD mode accesses. The exceptions to this relation are broadcast load dual-data, extended-precision normal word and long word accesses. These broadcast accesses differ from their corresponding non-broadcast mode accesses.
This example shows the data flow for the instruction:
RX = DM(SHORT WORD X0 ADDRESS);

Other instructions with similar data flows for broadcast, short word, single-data transfers are:
UREG = PM(SHORT WORD ADDRESS);
UREG = DM(SHORT WORD ADDRESS);
PM(SHORT WORD ADDRESS) = UREG;
DM(SHORT WORD ADDRESS) = UREG;

Figure 5-32. Short Word Addressing of Single Data in Broadcast Load
This example shows the data flow for the instruction:
RX = DM(SHORT WORD X0 ADDRESS), RY = PM(SHORT WORD Y0 ADDRESS);

Other instructions with similar data flows for broadcast, short word, dual-data transfers are:
DREG = PM(SHORT WORD ADDRESS), DREG = DM(SHORT WORD ADDRESS);
PM(SHORT WORD ADDRESS) = DREG, DM(SHORT WORD ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-33. Short Word Addressing of Dual Data in Broadcast Load
This example shows the data flow for the instruction:
RX = DM(NORMAL WORD X0 ADDRESS);

Other instructions with similar data flows for broadcast, normal word, single-data transfers are:
UREG = PM(NORMAL WORD ADDRESS);
UREG = DM(NORMAL WORD ADDRESS);
PM(NORMAL WORD ADDRESS) = UREG;
DM(NORMAL WORD ADDRESS) = UREG;

Figure 5-34. 32-Bit Normal Word Addressing of Single Data in Broadcast Load
This example shows the data flow for the instruction:
RX = DM(NORMAL WORD X0 ADDRESS), RY = PM(NORMAL WORD Y0 ADDRESS);

Other instructions with similar data flows for broadcast, normal word, dual-data transfers are:
DREG = PM(NORMAL WORD ADDRESS), DREG = DM(NORMAL WORD ADDRESS);
PM(NORMAL WORD ADDRESS) = DREG, DM(NORMAL WORD ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-35. 32-Bit Normal Word Addressing of Dual Data in Broadcast Load
This example shows the data flow for the instruction:
RX = DM(EXTENDED-PRECISION NORMAL WORD X0 ADDRESS);

Other instructions with similar data flows for broadcast, extended-precision normal word, single-data transfers are:
UREG = PM(EP NORMAL WORD ADDRESS);
UREG = DM(EP NORMAL WORD ADDRESS);
PM(EP NORMAL WORD ADDRESS) = UREG;
DM(EP NORMAL WORD ADDRESS) = UREG;

Figure 5-36. Extended Precision Normal Word Addressing of Single Data in Broadcast Load
This example shows the data flow for the instruction:
RX = DM(EP NORMAL WORD X0 ADDR.), RY = PM(EP NORMAL WORD Y0 ADDR.);

Other instructions with similar data flows for broadcast, extended-precision normal word, dual-data transfers are:
DREG = PM(EP NORMAL WORD ADDRESS), DREG = DM(EP NORMAL WORD ADDRESS);
PM(EP NORMAL WORD ADDRESS) = DREG, DM(EP NORMAL WORD ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-37. Extended Precision Normal Word Addressing of Dual Data in Broadcast Load
This example shows the data flow for the instruction:
RX = DM(LONG WORD X0 ADDRESS);

Other instructions with similar data flows for broadcast, long word, single-data transfers are:
UREG = PM(LONG WORD ADDRESS);
UREG = DM(LONG WORD ADDRESS);
PM(LONG WORD ADDRESS) = UREG;
DM(LONG WORD ADDRESS) = UREG;

Figure 5-38. Long Word Addressing of Single Data in Broadcast Load
This example shows the data flow for the instruction:
RX = DM(LONG WORD X0 ADDRESS), RA = PM(LONG WORD Y0 ADDRESS);

Other instructions with similar data flows for broadcast, long word, dual-data transfers are:
DREG = PM(LONG WORD ADDRESS), DREG = DM(LONG WORD ADDRESS);
PM(LONG WORD ADDRESS) = DREG, DM(LONG WORD ADDRESS) = DREG;

Note: Direct addressing is not supported for dual-data accesses. Dual-data accesses can be accomplished by indirect addressing using the DAG registers.

Figure 5-39. Long Word Addressing of Dual Data in Broadcast Load
Table 5-12 illustrates operation when the previously written data still resides in the shadow write FIFO (for example, from a previous memory write instruction). A SIMD mode explicit access to normal word address 0x40001 results in an implicit access to normal word address 0x40000 if the reading of the data from 0x40001 occurs while the data is still in the shadow write FIFO. This access type results in an implicit access to the next sequential even address value.

Table 5-12. Data Resides In Shadow Write FIFO

Explicit Address (I0) | Explicit R0: R0=dm(I0,M0); | Explicit S0: S0=dm(I0,M0);
0x40001 | R0 = 32-bit word at 0x40001, S0 = 32-bit word at 0x40000 | R0 = 32-bit word at 0x40000, S0 = 32-bit word at 0x40001
To better demonstrate what results if the read data is in the shadow write FIFO versus internal memory, Table 5-13 shows the failing cases for a SIMD shadow aligned and non-aligned access when a SIMD read immediately follows a SIMD write.

Table 5-13. SIMD Write - SIMD Read Illegal Cases
Address of Write Data in Shadow Write FIFO | Immediate Read after Write | Result | Resultant Register Address Contents
0x50001 [1] | | | r0=(0x50002), s0=(0x50001)
 | | | r0=(0x50001), s0=(0x50002)
 | | | r0=(0x50002) [2], s0=(0x50003)
0x50002 [1] | | | r0=(0x50001), s0=(0x50002) [2]
 | | | r0=(0x50002), s0=(0x50003)
 | | | r0=(0x50003), s0=(0x50002)
0xA0002 [3] | | | r0=(0xA0004), s0=(0xA0002)
 | | | r0=(0xA0001), s0=(0xA0003)
 | | | r0=(0xA0002), s0=(0xA0004)
 | r0=dm(0xA0003) | Correct | r0=(0xA0003), s0=(0xA0005)
 | r0=dm(0xA0004) | Incorrect | r0=(0xA0004) [2], s0=(0xA0006)
0xA0003 [3] | r0=dm(0xA0001) | Incorrect | r0=(0xA0005), s0=(0xA0003)
 | r0=dm(0xA0002) | Correct | r0=(0xA0002), s0=(0xA0004)
 | r0=dm(0xA0003) | Correct | r0=(0xA0003), s0=(0xA0005) [2]
 | r0=dm(0xA0004) | Correct | r0=(0xA0004), s0=(0xA0006)
 | r0=dm(0xA0005) | Incorrect | r0=(0xA0005) [2], s0=(0xA0007)
0xA0004 [3] | r0=dm(0xA0002) | Incorrect | r0=(0xA0002), s0=(0xA0004) [2]
 | r0=dm(0xA0003) | Correct | r0=(0xA0003), s0=(0xA0005)
 | r0=dm(0xA0004) | Correct | r0=(0xA0004), s0=(0xA0006)
 | r0=dm(0xA0005) | Correct | r0=(0xA0005), s0=(0xA0007)
 | r0=dm(0xA0006) | Incorrect | r0=(0xA0006) [2], s0=(0xA0004)
0xA0005 [3] | r0=dm(0xA0003) | Incorrect | r0=(0xA0003), s0=(0xA0005) [2]
 | r0=dm(0xA0004) | Correct | r0=(0xA0004), s0=(0xA0006)
 | r0=dm(0xA0005) | Correct | r0=(0xA0005), s0=(0xA0007)
 | r0=dm(0xA0006) | Correct | r0=(0xA0006), s0=(0xA0008)
 | r0=dm(0xA0007) | Incorrect | r0=(0xA0007), s0=(0xA0005)
0x28001 | r0=dm(0x50000) | Correct | r0=(0x50000), s0=(0x50001)
 | r0=dm(0x50001) | Incorrect | r0=(0x50001), s0=(0x50002) [2]
 | r0=dm(0x50002) | Correct | r0=(0x50002), s0=(0x50003)
 | r0=dm(0x50003) | Incorrect | r0=(0x50003), s0=(0x50002)
 | r0=dm(0x50004) | Correct | r0=(0x50004), s0=(0x50005)
0x28001 | r0=dm(0xA0001) | Correct | r0=(0xA0001), s0=(0xA0003)
 | r0=dm(0xA0002) | Incorrect | r0=(0xA0002), s0=(0xA0004) [2]
 | r0=dm(0xA0003) | Incorrect | r0=(0xA0003), s0=(0xA0005) [2]
 | r0=dm(0xA0004) | Correct | r0=(0xA0004), s0=(0xA0006)
 | r0=dm(0xA0005) | Correct | r0=(0xA0005), s0=(0xA0007)
 | r0=dm(0xA0006) | Incorrect | r0=(0xA0006), s0=(0xA0004)
 | r0=dm(0xA0007) | Incorrect | r0=(0xA0007), s0=(0xA0005)
 | r0=dm(0xA0008) | Correct | r0=(0xA0008), s0=(0xA000A)
0x50002 | r0=dm(0x28000) | Correct | r0=(0x50000), r1=(0x50001)
 | r0=dm(0x28001) | Correct | r0=(0x50002), r1=(0x50003)
 | r0=dm(0x28002) | Correct | r0=(0x50004), r1=(0x50005)
0x50003 | r0=dm(0x28000) | Correct | r0=(0x50000), r1=(0x50001)
 | r0=dm(0x28001) | Incorrect | r0=(0x50004), r1=(0x50003)
 | r0=dm(0x28002) | Incorrect | r0=(0x50004) [2], r1=(0x50005)
 | r0=dm(0x28003) | Correct | r0=(0x50006), r1=(0x50007)
0x50002 | r0=dm(0xA0001) | Correct | r0=(0xA0001), s0=(0xA0003)
 | r0=dm(0xA0002) | Incorrect | r0=(0xA0002), s0=(0xA0004) [2]
 | r0=dm(0xA0003) | Incorrect | r0=(0xA0003), s0=(0xA0005) [2]
 | r0=dm(0xA0004) | Correct | r0=(0xA0004), s0=(0xA0006)
 | r0=dm(0xA0005) | Correct | r0=(0xA0005), s0=(0xA0007)
 | r0=dm(0xA0006) | Incorrect | r0=(0xA0006), s0=(0xA0004)
 | r0=dm(0xA0007) | Incorrect | r0=(0xA0007), s0=(0xA0005)
 | r0=dm(0xA0008) | Correct | r0=(0xA0008), s0=(0xA000A)
0xA0004 | r0=dm(0x28000) | Correct | r0=(0x50000), r1=(0x50001)
 | r0=dm(0x28001) | Correct | r0=(0x50002), r1=(0x50003)
 | r0=dm(0x28002) | Correct | r0=(0x50004), r1=(0x50005)
0xA0006 | r0=dm(0x28000) | Correct | r0=(0x50000), r1=(0x50001)
 | r0=dm(0x28001) | Incorrect | r0=(0x50002) [4], r1=(0x50003)
 | r0=dm(0x28002) | Incorrect | r0=(0x50004) [2], r1=(0x50005)
 | r0=dm(0x28003) | Correct | r0=(0x50006), r1=(0x50007)
0xA0004 | r0=dm(0x50000) | Correct | r0=(0x50000), s0=(0x50001)
 | r0=dm(0x50001) | Incorrect | r0=(0x50001), s0=(0x50002) [2]
 | r0=dm(0x50002) | Correct | r0=(0x50002), s0=(0x50003)
 | r0=dm(0x50003) | Incorrect | r0=(0x50003), s0=(0x50002) [4]
 | r0=dm(0x50004) | Correct | r0=(0x50004), s0=(0x50005)

[1] Normal word accesses
[2] Old data from memory is accessed instead of new data in Shadow Write FIFO
[3] Short word accesses
[4] PEx and PEy data is partly from shadow and partly from memory
If the newly written data resides in the shadow write FIFO, then for normal and short word SIMD accesses, a write access to an even address followed by a read access to the adjacent (higher or lower) odd address results in incorrect SIMD access operation. Similarly, a write access to an odd address followed by a read access to the adjacent (higher or lower) even address results in incorrect SIMD access operation.
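The adjacency rule in the preceding paragraph can be expressed as a one-line predicate. The Python sketch below is a simplified model with a hypothetical function name. It assumes that the written data is still in the shadow write FIFO and that both addresses use the same word size, and it encodes only the even/odd adjacency rule stated in the text, not every row of Table 5-13.

```python
def simd_read_after_write_hazard(write_addr: int, read_addr: int) -> bool:
    """Return True when a SIMD read immediately after a SIMD write hits the
    case described in the text: the read address is the directly adjacent
    (higher or lower) address, which always has the opposite parity."""
    return abs(write_addr - read_addr) == 1
```

A check of this kind could be used in a test harness or a code-review script to flag address pairs that need one of the workarounds described in this section.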
To prevent unexpected SIMD read results when a write is followed by a read from the same long word boundary addresses, two options are recommended. These two suggestions are independent of one another, and either can be used to work around the SIMD shadow write FIFO.

- Align all variables and arrays in memory to long word address boundaries using the .ALIGN assembler directive. Do not explicitly access odd normal word addresses or non-long word boundary aligned short word addresses in SIMD mode. Note that you cannot use the .ALIGN workaround for program-generated addresses that are odd. For example, this workaround cannot be used for indirect addressing using the index or pointer DAG registers.

OR

- Include two NOPs or non-memory access instructions to clear the shadow write FIFO.
The following guidelines provide an overview of how programs should interleave data in memory locations. For more information and examples, see the ADSP-21160 SHARC DSP Instruction Set Reference:

- Programs can use odd or even modify values (1, 2, 3, ...) to step through a buffer in single- or dual-data, SISD or broadcast load mode regardless of the data word size (long word, extended-precision normal word, normal word, or short word).
- Programs should use multiple-of-4 modify values (4, 8, 12, ...) to step through a buffer of short word data in single- or dual-data SIMD mode. Programs must step through a buffer twice, once for addressing even short word addresses and once for addressing odd short word addresses.
- Programs should use multiple-of-2 modify values (2, 4, 6, ...) to step through a buffer of normal word data in single- or dual-data SIMD mode.
- Programs can use odd or even modify values (1, 2, 3, ...) to step through a buffer of long word or extended-precision normal word data in single- or dual-data SIMD mode.
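The guidelines above reduce to a small rule table. The following Python sketch is a hedged illustration; the function name and the mode and word-size labels are hypothetical. It checks a proposed modify value against the recommendations as stated in the list and treats a zero modify value in the two restricted SIMD cases as not stepping through the buffer.

```python
def modify_value_recommended(mode: str, word_size: str, modify: int) -> bool:
    """Check a DAG modify value against the buffer-stepping guidelines.

    mode: "SISD", "BROADCAST", or "SIMD"
    word_size: "short", "normal", "ep_normal", or "long"
    """
    if mode == "SIMD" and word_size == "short":
        return modify != 0 and modify % 4 == 0   # 4, 8, 12, ...
    if mode == "SIMD" and word_size == "normal":
        return modify != 0 and modify % 2 == 0   # 2, 4, 6, ...
    return True                                   # odd or even values are fine
```

For short word data in SIMD mode, a program would then make two passes with a multiple-of-4 modify value, one starting at an even short word address and one at an odd address, as the second guideline describes.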
This automatic instruction packing is performed only when the program sequencer initiates an external access to fetch an instruction with one of four instruction packing modes enabled in the SYSCON register: 8- to 48-bit, 16- to 48-bit, 32- to 48-bit, or 48- to 48-bit packing. Note that the processor only supports program execution from external memory bank 0.

The default packing mode for the ADSP-21161 processor is 32- to 48-bit packing. Packed instruction execution for 8-, 16-, 32-, or 48-bit wide external memory is also supported and is controlled by the IPACK[1:0] bits of the SYSCON register. Table 5-14 summarizes the packing mode configurations controlled by the IPACK[1:0] bits.

A no-packing, 48-bit bus width mode is available on the processor; it assumes the EPD bus is 48 bits wide. This full instruction width execution from external memory is made possible by multiplexing 16 link port pins with DATA[15:0], which lets program execution run at full rate. These additional 16 data lines should only be enabled when the link ports are not used. Data lines DATA[15:8] multiplex with L1DAT[7:0], and data lines DATA[7:0] multiplex with L0DAT[7:0]. Set the IPACK[1:0] bits of the SYSCON register to 01 to enable the DATA[15:0] pins for a 48-bit wide external bus.

There are four boot modes and one no-boot mode available on the processor. In the no-boot mode, the processor fetches instructions using 32- to 48-bit packing. In a boot mode, the packing mode can be changed by writing the new execution packing mode to the IPACK bits before a fetch from external memory occurs. A host can write the new values into the processor, or the software loader kernel can change the values during booting.
[Figure (diagram not reproduced): alignment of transfers on DATA[47:16] and DATA[15:0]. Transfer types shown: PROM boot, 8-bit packed DMA data, 8-bit packed instruction execution, 16-bit packed DMA data, 16-bit packed instruction execution, float or fixed D31-D0 DMA data, 32-bit packed instructions, and 48-bit instruction fetch (no packing). The extra data lines DATA[15:0] are accessible only if the link ports are disabled; they are enabled by setting IPACK[1:0] in the SYSCON register to the no instruction packing mode.]
When writing to bits 30 and 31 (IPACK[1:0]) in the SYSCON register to enable a packed instruction mode, delay the instruction fetch from external memory by two instructions. This can be done by inserting two NOPs after the write to the SYSCON register or by following the execution sequence shown in this code segment:
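ext_isr_tabl_seg_dma10:
    jump int_codeaddr (db);     /* delayed branch: the next two instructions execute first */
    ustatx = 0x80000000;        /* change packing from 32-48 to 16-48 */
    dm(syscon) = ustatx;        /* write the new IPACK setting to SYSCON */
int_codeaddr:
    jump ext_codeaddr (db);     /* delayed branch to the code in external memory */
    ustatx = new_wait_value;    /* new value for the WAIT register */
    dm(WAIT) = ustatx;          /* write WAIT before the external fetch occurs */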
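The SYSCON value used in the code segment can be derived from the IPACK field position. The sketch below is a host-side Python model, not processor code. It relies on what this section states (IPACK[1:0] sits in SYSCON bits 31 and 30, 01 selects the 48-bit no-packing mode, and 0x80000000 selects 16- to 48-bit packing). The codes for 32- to 48-bit (00) and 8- to 48-bit (11) are assumptions inferred by elimination and should be checked against Table 5-14. All names are illustrative.

```python
IPACK_SHIFT = 30                      # IPACK[1:0] occupy SYSCON bits 31:30
IPACK_MASK = 0x3 << IPACK_SHIFT

IPACK_CODES = {
    "32-to-48": 0b00,   # assumed: default mode, field left at zero
    "48-to-48": 0b01,   # stated in the text: enables DATA[15:0], no packing
    "16-to-48": 0b10,   # matches the 0x80000000 written in the code segment
    "8-to-48": 0b11,    # assumed: the one remaining code
}


def set_ipack(syscon: int, mode: str) -> int:
    """Return a 32-bit SYSCON value with only the IPACK field replaced."""
    cleared = syscon & ~IPACK_MASK & 0xFFFFFFFF
    return cleared | (IPACK_CODES[mode] << IPACK_SHIFT)


def get_ipack(syscon: int) -> str:
    """Return the packing mode name encoded in a SYSCON value."""
    code = (syscon >> IPACK_SHIFT) & 0x3
    for name, value in IPACK_CODES.items():
        if value == code:
            return name
    raise ValueError("unreachable: all four codes are mapped")
```

With these assumptions, `set_ipack(0, "16-to-48")` yields 0x80000000, the same constant loaded into `ustatx` above. Preserving the other SYSCON bits, as `set_ipack` does, avoids disturbing unrelated settings when only the packing mode changes.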
The following tables show the addresses for instructions packed in two, three or six consecutive locations in external memory:
• 48- to 48-Bit External Instruction Packing on page 5-104
• 32- to 48-Bit External Instruction Packing on page 5-105
• 16- to 48-Bit External Instruction Packing on page 5-106
• 8- to 48-Bit External Instruction Packing on page 5-107

For more information on instruction packing in external memory, see the VisualDSP++ User's Guide for ADSP-21xxx Family DSPs.

Table 5-15. 48- to 48-Bit External Instruction Packing

ADDRESS     DATA[47:0]
0x200000    Instr0[47:0]
0x200001    Instr1[47:0]
0x200002    Instr2[47:0]
For 48- to 48-bit full instruction width packing, the processor stores one instruction in every 48-bit word memory location. In this packing mode, no address translation is performed by the program sequencer. Instructions are executed from SDRAM at the core clock rate. By enabling IPACK[1:0], the link port data pins L1DAT[7:0] and L0DAT[7:0] are activated as DATA[15:0].

Table 5-16. 32- to 48-Bit External Instruction Packing

ADDRESS     DATA[47:32] / DATA[31:16]
0x200000    Instr0[47:16]
0x200001    Instr0[15:0]
0x200002    Instr1[47:16]
0x200003    Instr1[15:0]
0x200004    .........
For 32- to 48-bit instruction packing, the processor stores an instruction in two consecutive memory locations. In this packing mode, the first 32 bits of the 48-bit instruction are stored in an even location and the lower 16 bits of the 48-bit opcode are stored in the adjacent odd location in memory. The program sequencer automatically generates the correct external addresses based on the IPACK bits in the SYSCON register. The program sequencer generates addresses in groups of two physical locations.
To generate a corresponding address in external memory for the second part of the 48-bit instruction, the processor increments the internal logical address of the previous access by 1.

Table 5-17. 16- to 48-Bit External Instruction Packing

ADDRESS     DATA[31:16]
0x200000    Instr0[47:32]
0x200001    Instr0[31:16]
0x200002    Instr0[15:0]
0x200003    Unused Memory Space
0x200004    Instr1[47:32]
0x200005    Instr1[31:16]
0x200006    Instr1[15:0]
0x200007    Unused Memory Space
Similarly, for 16- to 48-bit instruction packing, the first 16 bits are stored at an even address and the remaining 16-bit segments are stored in consecutive locations. The program sequencer generates addresses in groups of four physical locations. For the remaining accesses, the previous internal logical address is incremented by 1. However, this leaves an unused 16-bit location after every three valid 16-bit instruction segments in external memory. For example, the three 16-bit segments of the first instruction may be placed at 0x200000, 0x200001, and 0x200002. The 16-bit segments of the next instruction are then placed at 0x200004 through 0x200006, with 0x200003 and 0x200007 left unused, and so on.
For 8- to 48-bit instruction packing, the first 8 bits are stored at an even address and the remaining 8-bit segments are stored in consecutive locations. The program sequencer generates addresses in groups of eight physical locations. For the remaining accesses, the previous internal logical address is incremented by 1. However, this leaves two unused 8-bit locations after every six valid 8-bit instruction segments in external memory. For example, the six 8-bit segments of the first instruction may be placed at 0x200000 through 0x200005. The 8-bit segments of the next instruction are then placed at 0x200008 through 0x20000D, and so on.

In 32- to 48-bit packing mode, each instruction fetch from external memory translates into two accesses to successive locations. In 16- to 48-bit packing mode, each instruction fetch translates into three accesses to successive locations. In 8- to 48-bit packing mode, each instruction fetch translates into six accesses to successive locations.

The processor core speed for instruction execution is affected by the type of external memory (SDRAM or non-SDRAM) and the external memory width. For the packed execution modes of 32- to 48-bit, 16- to 48-bit, and 8- to 48-bit, with the SDCKR bit in the SDCTL register set (=1) and the program executing from SDRAM, the core instruction rate is 2, 3, or 6 times slower, respectively, than when executing from internal memory. When SDCKR=0, the core instruction rate is 4, 6, or 12 times slower. If the program is executing from SRAM or FLASH with a CLKIN-core clock ratio of 2:1, the core speed is reduced by the number of waitstates and a factor of 4, 6, or 12. The effect of external memory accesses on core speed is shown in Table 5-19.
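The SDRAM slowdown factors quoted above follow directly from the number of external accesses per instruction. The sketch below is a host-side Python model of those quoted numbers only; the names are illustrative, and the non-SDRAM case (which also depends on waitstates) and the 48- to 48-bit mode are deliberately left out because this section gives no formula for them.

```python
# External accesses needed to assemble one 48-bit instruction (packed modes only)
ACCESSES_PER_INSTRUCTION = {
    "32-to-48": 2,
    "16-to-48": 3,
    "8-to-48": 6,
}


def sdram_slowdown(mode: str, sdckr: int) -> int:
    """Core instruction-rate slowdown when executing packed code from SDRAM.

    sdckr is the SDCKR bit of SDCTL: 1 gives factors 2, 3, 6; 0 gives 4, 6, 12.
    """
    if sdckr not in (0, 1):
        raise ValueError("SDCKR is a single bit")
    accesses = ACCESSES_PER_INSTRUCTION[mode]
    return accesses if sdckr == 1 else 2 * accesses
```

For example, `sdram_slowdown("8-to-48", 0)` returns 12, which matches the worst case listed in the text.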
In summary, instruction accesses to external memory translate to one (full 48-bit data bus mode), two, three, or six accesses to successive locations, depending on the instruction packing mode selected in bits 30 and 31 of the SYSCON register. For 16- to 48-bit packing, one external address location (two bytes) is unused for every instruction. Similarly, for 8- to 48-bit packing, two external address locations (two bytes) are unused for every instruction. For 32- to 48-bit packing, every external address contains valid data. The next sections examine the addressing schemes and unused addresses for all three packing modes.
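The layout summarized above can be reproduced with a few lines of arithmetic. This host-side Python sketch (names are illustrative) returns the external addresses that hold the segments of the n-th instruction, using the group sizes and the 0x200000 base address that appear in Tables 5-15 through 5-17 and in the 8-bit example.

```python
# mode -> (external locations reserved per instruction, locations actually used)
PACKING_LAYOUT = {
    "48-to-48": (1, 1),
    "32-to-48": (2, 2),
    "16-to-48": (4, 3),   # one unused location per instruction
    "8-to-48": (8, 6),    # two unused locations per instruction
}


def segment_addresses(base: int, instr_index: int, mode: str) -> list[int]:
    """External addresses holding the segments of instruction number instr_index."""
    group, used = PACKING_LAYOUT[mode]
    first = base + instr_index * group       # start of this instruction's group
    return [first + i for i in range(used)]  # successive locations, MSBs first


def unused_addresses(base: int, instr_index: int, mode: str) -> list[int]:
    """External addresses in the instruction's group that carry no data."""
    group, used = PACKING_LAYOUT[mode]
    first = base + instr_index * group
    return [first + i for i in range(used, group)]
```

For the second instruction in 16- to 48-bit mode, `segment_addresses(0x200000, 1, "16-to-48")` gives 0x200004 through 0x200006 and `unused_addresses` gives 0x200007, in agreement with Table 5-17.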
1 Note that segmented internal address ranges allow continuous addresses in external memory for 32- to 48-bit packing.
Total Program Size (32- to 48-Bit Packing)

Total external memory available is 14 Mwords (non-SDRAM) and 62 Mwords (SDRAM). Given that one instruction takes two external memory locations, the external program memory is 7 Mwords non-SDRAM space and 31 Mwords SDRAM space. This scheme limits the size of the contiguous program segment (internal) to 1 Mword. There are seven of these segments in bank 0 non-SDRAM space and 30 segments in bank 0 SDRAM space. See Table 5-22 on page 5-112 for a comparison of total program sizes based on different packing modes.
Total Program Size (16- to 48-Bit Packing)

Total external memory available is 14 Mwords (non-SDRAM) and 62 Mwords (SDRAM). Given that one instruction takes four external memory locations, the external program memory is 3.5 Mwords non-SDRAM space and 15.5 Mwords SDRAM space. This scheme limits the size of the contiguous program segment (internal) to 0.5 Mwords. There are seven of these segments in bank 0 non-SDRAM space and 30 segments in bank 0 SDRAM space. See Table 5-23 on page 5-113 for a comparison of total program sizes based on different packing modes.
Total Program Size (8- to 48-Bit Packing)

Total external memory available is 14 Mwords (non-SDRAM) and 62 Mwords (SDRAM). Given that one instruction takes eight external memory locations, the external program memory is 1.75 Mwords non-SDRAM space and 7.75 Mwords SDRAM space. This scheme limits the size of the contiguous program segment (internal) to 0.25 Mwords. There are seven of these segments in bank 0 non-SDRAM space and 30 segments in bank 0 SDRAM space.
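The program sizes quoted in the three Total Program Size discussions all come from dividing the available external memory by the number of locations one instruction occupies. A minimal Python sketch of that arithmetic, with illustrative names and using only the figures given above:

```python
# External locations reserved per 48-bit instruction in each packed mode
LOCATIONS_PER_INSTRUCTION = {"32-to-48": 2, "16-to-48": 4, "8-to-48": 8}

# Total external memory available, in Mwords
EXTERNAL_MWORDS = {"non-SDRAM": 14, "SDRAM": 62}


def program_capacity_mwords(mode: str, memory: str) -> float:
    """External program memory, in Mwords of instructions, for a packing mode."""
    return EXTERNAL_MWORDS[memory] / LOCATIONS_PER_INSTRUCTION[mode]
```

For example, `program_capacity_mwords("16-to-48", "SDRAM")` evaluates to 15.5, the figure given for 16- to 48-bit packing.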
6 I/O PROCESSOR
The processor's I/O processor manages Direct Memory Accessing (DMA) of processor memory through the external, SPI, link, and serial ports. Each DMA operation transfers an entire block of data. By managing DMA, the I/O processor lets programs move data as a background task while using the processor core for other processor operations.

The I/O processor's architecture supports a number of DMA operations. These operations include the following transfer types:
• Internal memory ↔ external memory or external peripherals
• Internal memory ↔ internal memory of other processors
• Internal memory ↔ host processor
• Internal memory ↔ serial port I/O
• Internal memory ↔ link port I/O
• Internal memory ↔ SPI I/O
• External memory ↔ external peripherals

This chapter describes the I/O processor and how it controls external port, link port, SPI port, and serial port operations.

DMA transfers between internal memory and external memory, multiprocessor memory, or a host use the processor's external port. For these types of transfers, a program provides the DMA controller with the internal memory buffer size and address, the address modifier, and the direction of transfer. After setup, the DMA transfers begin when the program enables the channel and continue until the I/O processor transfers the entire buffer to or from processor memory.

Similarly, DMA transfers between internal memory and a link, serial, or SPI port have DMA parameters. When the I/O processor performs DMA between internal memory and one of these ports, the program sets up the parameters, and the I/O processor uses the port instead of the external bus.

The direction (receive or transmit) of the I/O port determines the direction of data transfer. When the port receives data, the I/O processor automatically transfers the data to internal memory. When the port needs to transmit a word, the I/O processor automatically fetches the data from internal memory.

The I/O processor also lets the processor system perform DMA transfers between an external device and external memory. This external-to-external transfer only uses the external port and I/O processor.

External devices can control external port DMA transfers in two ways. If the external device can handle bus mastership, the external device, as bus master, reads or writes the DMA buffers on the processor. External devices also can assert a DMA Request input (DMARx) to request service.

To further minimize loading on the processor core, the I/O processor supports chained DMA operations. When using chained DMA, a program initiates a DMA transfer to automatically set up and start the next DMA transfer after the current one completes.

External bus packing and unpacking of 16-, 32-, 48-, or 64-bit words in internal memory is performed during DMA transfers from either 8-, 16-, or 32-bit wide external memory.

Fourteen channels of DMA are available on the ADSP-21161: two channels are shared between the SPI interface and the link ports, eight channels are available via the serial ports, and four channels are available via the processor's external port for host processor, other ADSP-21161s, memory, or I/O transfers. Asynchronous off-chip peripherals can control two DMA channels using DMA Request/Grant lines (DMAR1-2, DMAG1-2). Other DMA features include interrupt generation upon completion of DMA transfers and DMA chaining for automatic linked DMA transfers.

For information on connecting external devices to the external port, link ports, SPI port, or serial ports, see External Port on page 7-1, Link Ports on page 9-1, Serial Peripheral Interface (SPI) on page 11-1 or Serial Ports on page 10-1.

Figure 6-1 shows the processor's I/O processor, related ports, and buses. Figure 6-5 on page 6-23 shows more detail on DMA channel data paths.
[Figure 6-1 (diagram not reproduced). Labels recoverable from the diagram: external port with Addr (24-bit) and Data (32-bit) paths; serial port buffer FIFOs (2 deep x 32 bits); link port buffer FIFOs (2 deep x 48 bits); SPI port buffer FIFOs (2 deep x 32 bits).]
The Data Buffer Registers column in Figure 6-2 shows the data buffer registers for each port. These registers include:
• External Port Buffer (EPBx). These 64-bit buffers for the external port have eight-position FIFOs for transmitting or receiving data when interfacing with a host or external devices such as memory and memory-mapped devices.
• Link Port Buffer (LBUFx). These buffers for the link ports have two-position FIFOs for transmitting or receiving DMA data when connected to another link port.
• Serial Port Receive Buffer (RXx). These receive buffers for the serial ports have two-position FIFOs for receiving data when connected to another serial device.
• Serial Port Transmit Buffer (TXx). These transmit buffers for the serial ports have two-position FIFOs for transmitting data when connected to another serial device.
• SPI Receive Buffer (SPIRX). This receive buffer for the SPI port has a two-position FIFO for receiving data when connected to another serial device.
• SPI Transmit Buffer (SPITX). This transmit buffer for the SPI port has a two-position FIFO for transmitting data when connected to another serial device.
[Figure 6-2 (diagram not reproduced). Register names recoverable from the diagram:
Interrupt registers: IRPTL, LIRPTL
Serial ports 3-0: SPCTL3-0; II3A-0A, II3B-0B, IM3A-0A, IM3B-0B, C3A-0A, C3B-0B, CP3A-0A, CP3B-0B, GP3A-0A, GP3B-0B; TX3A-0A, TX3B-0B, RX3A-0A, RX3B-0B
SPI port: SPICTL; IISRX, IISTX, IMSTX, IMSRX, CSRX, CSTX, GPSTX, GPSRX
Link ports: LCTL; IILB1-0, IMLB1-0, CLB1-0, CPLB1-0, GPLB1-0; LBUF1-0
DMA handshake pins: DMARx, DMAGx]
The Port, Buffer, and DMA Control Registers column in Figure 6-2 shows the control registers for the ports and DMA channels. These registers include:
• System Configuration register (SYSCON). This register configures packing, priority, and word order for the external port.
• Waitstate and Access Mode register (WAIT). This register configures handshake, idle cycle insertion, and waitstate insertion for external memory DMA accesses.
• External Port DMA Control registers (DMACx). These control registers for each external port DMA channel select the direction, format, handshake, and transfer mode, and enable chaining and DMA start.
• Link Port Control register (LCTL). This control register selects the direction, word width, and transfer rate, and enables chaining and DMA start. It also assigns link buffers to link ports and indicates link buffer packing and error status for link port operations.
• Serial Port Control registers (SPCTLx). These control registers for each port select the receive or transmit format, monitor FIFO status, enable chaining, and start DMA.
• SPI Port Control register (SPICTL). This control register configures and enables the SPI interface, selects the device as master or slave, and determines the data transfer and word size.
The DMA Parameter Registers column in Figure 6-2 shows the parameter registers for each DMA channel. These registers function similarly to data address generator registers and include:
• Internal Index registers (IIx). Index registers provide an internal memory address, acting as a pointer to the next internal memory DMA read or write location. These registers are 18 bits wide and are offset 0x40000 for internal addressing in normal word space.
• Internal Modify registers (IMx). Modify registers provide the signed increment by which the DMA controller post-modifies the corresponding internal memory index register after the DMA read or write. These registers are 16 bits wide.
• Count registers (Cx). Count registers indicate the number of words remaining to be transferred to or from internal memory on the corresponding DMA channel. These registers are 16 bits wide.
• Chain Pointer registers (CPx). Chain pointer registers hold the starting address of the Transfer Control Block (parameter register values) for the next DMA operation on the corresponding channel. These registers also control whether the I/O processor generates an interrupt when the current DMA process ends. These registers are 19 bits wide and are offset 0x40000 for internal addressing in normal word space.
• General Purpose registers (GPx). General purpose DMA registers hold an address or other value. These registers are 17 bits wide.
• External Index registers (EIEPx). Index registers provide an external memory address, acting as a pointer to the next external memory DMA read or write location. These registers only apply to external port EPBx DMA. These External Port DMA registers are 32 bits wide.
• External Modify registers (EMEPx). Modify registers provide the increment by which the DMA controller post-modifies the corresponding external memory index register after the DMA read or write. These registers only apply to external port EPBx DMA. These External Port DMA registers are 32 bits wide.
• External Count registers (ECEPx). External count registers indicate the number of words remaining to be transferred to or from external memory on the corresponding DMA channel. These registers only apply to external port EPBx DMA. These External Port DMA registers are 32 bits wide.

Figure 6-3 shows a block diagram of the I/O processor's address generator (DMA controller). Table 6-1 lists the parameter registers for each DMA channel. The parameter registers are uninitialized following a processor reset.
[Figure 6-3. DMA Address Generator (diagram not reproduced). Labels recoverable from the diagram: IIx index (address), IMx modifier, post-modify adders, multiplexers, Cx count, and working register.]

The I/O processor generates addresses for DMA channels much the same way that the Data Address Generators (DAGs) generate addresses for data memory accesses. Each channel has a set of parameter registers (shown in Figure 6-4), including an index register (IIx) and a modify register (IMx), that the I/O processor uses to address a data buffer in internal memory. The index register must be initialized with a starting address for the data buffer. As part of the DMA operation, the I/O processor outputs the address in the index register onto the processor's I/O address bus and applies the address to internal memory during each DMA cycle (a clock cycle in which a DMA transfer is taking place).
[Figure 6-4 (bit-field diagram not reproduced). The diagram shows the IIx, IMx, Cx, CPx, GPx, EIEPx, EMEPx, and ECEPx registers against bit positions 31 through 0. In CPx, the PCI bit is the Program-Controlled Interrupt bit: if this bit is set, the I/O processor generates a DMA interrupt on completion of a chained DMA. Reserved bits must always be set to zero when programming DMA parameter registers.]
All addresses in the index (IIx) registers are offset by a value matching the processor's first internal Normal word addressed RAM location before the I/O processor uses the addresses. For the ADSP-21161, this offset value is 0x0004 0000.

While DMA addresses must always be Normal word (32-bit) memory, the internal memory data transfer sizes may be 64, 48, or 32 bits. External memory data transfer sizes may be 32, 16, or 8 bits. The I/O processor can transfer Short word data (16-bit) using the packing capability of the external port, serial port, and SPI port DMA channels.

After transferring each data word to or from internal memory, the I/O processor adds the modify value to the index register to generate the address for the next DMA transfer and writes the modified index value to the index register. The modify value in the IMx register is a signed integer, which allows both increment and decrement modifies. The modify value IMx (which was fixed to 1 on the ADSP-21065L) can now have any positive or negative integer value because of SIMD mode.

If the I/O processor modifies the index register past the maximum 18-bit value to indicate an address out of internal memory, the index wraps around to zero. With the offset for the ADSP-21161, the wrap around address is 0x0004 0000.

Each DMA channel has a count register (Cx) that programs load with the number of words to be transferred. The I/O processor decrements the count register after each DMA transfer on that channel. When the count reaches zero, the I/O processor generates the interrupt for that DMA channel. For more information on DMA interrupts, see Using I/O Processor Status on page 6-121.

If a program loads the count (Cx) register with zero, the I/O processor does not disable DMA transfers on that channel. The I/O processor interprets the zero as a request for 2^16 (65,536) transfers. This count occurs because the I/O processor starts the first transfer before testing the count value. The only way to disable a DMA
channel is to clear its DMA enable bit. For more information, see External Port Channel Transfer Modes on page 6-46, Link Port Channel Transfer Modes on page 6-85, or Serial Port Channel Transfer Modes on page 6-99. Each DMA channel also has a chain pointer register (CPx) and a general-purpose register (GPx). Chained DMA sequences are a set of multiple DMA sequences, each autoinitializing the next in line. The location of the parameters for the next sequence comes from the CPx register. These parameters are called a Transfer Control Block (TCB), and they set up DMA parameter values for autoinitializing the next DMA sequence in the chain. Programs can use the GP register for any purpose, but usually programs store the address of the previous TCB in this register during chained DMA. For more information, see Chaining DMA Processes on page 6-25. The external port DMA channels each contain three additional parameter registers, the external index register (EIEPx), external modify register (EMEPx), and external count register (ECEPx). These three registers are not available for the serial port, SPI port and link port DMA channels. The I/O processor generates 32-bit external memory addresses using the EIEPx, EMEPx, and ECEPx registers, during DMA transfers between internal memory and external memory or devices. Programs must load the ECEPx register with the count of external bus transfers in the DMA. If the external port is using word packing, the ECEPx count differs from the number of words transferred in the DMA. Memory mapped devices can communicate with the I/O processor using an internal DMA request/grant handshake on an external port DMA channel. Each channel has a single request and a single grant. When a particular I/O port needs to perform transfers to or from internal memory, the channel asserts a request. The I/O processor prioritizes this request with all other valid DMA requests. The default channel priority is DMA
channel 0 as highest and DMA channel 13 as lowest. Table 6-1 lists the DMA channels in priority order. For more information, see Managing DMA Channel Priority on page 6-22.

When a channel becomes the highest priority requester, the I/O processor services the channel's request. In the next clock cycle, the I/O processor starts the DMA transfer. If a DMA channel is disabled, the I/O processor does not service requests for that channel, whether or not the channel has data to transfer.

The processor's 14 DMA channels are numbered as shown in Table 6-1. This table also shows the control, parameter, and data buffer registers that correspond to each channel.

Table 6-1. DMA Channel Registers: Controls, Parameters, and Buffers

DMA Chan#  Control Registers  Parameter Registers                                    Buffer Register  Description
0          SPCTL0             II0A, IM0A, C0A, CP0A, GP0A                            RX0A, TX0A       Serial Port 0 A Data
1          SPCTL0             II0B, IM0B, C0B, CP0B, GP0B                            RX0B, TX0B       Serial Port 0 B Data
2          SPCTL1             II1A, IM1A, C1A, CP1A, GP1A                            RX1A, TX1A       Serial Port 1 A Data
3          SPCTL1             II1B, IM1B, C1B, CP1B, GP1B                            RX1B, TX1B       Serial Port 1 B Data
4          SPCTL2             II2A, IM2A, C2A, CP2A, GP2A                            RX2A, TX2A       Serial Port 2 A Data
5          SPCTL2             II2B, IM2B, C2B, CP2B, GP2B                            RX2B, TX2B       Serial Port 2 B Data
6          SPCTL3             II3A, IM3A, C3A, CP3A, GP3A                            RX3A, TX3A       Serial Port 3 A Data
7          SPCTL3             II3B, IM3B, C3B, CP3B, GP3B                            RX3B, TX3B       Serial Port 3 B Data
8          LCTL, SPICTL (1)   IILB0, IMLB0, CLB0, CPLB0, GPLB0                       LBUF0            Link Buffer 0
                              IISRX, IMSRX, CSRX, GPSRX                              SPIRX            SPI Receive
9          LCTL, SPICTL (1)   IILB1, IMLB1, CLB1, CPLB1, GPLB1                       LBUF1            Link Buffer 1
                              IISTX, IMSTX, CSTX, GPSTX                              SPITX            SPI Transmit
10         DMAC10             IIEP0, IMEP0, CEP0, CPEP0, GPEP0, EIEP0, EMEP0, ECEP0  EPB0             External Port FIFO Buffer 0
11 (2)     DMAC11             IIEP1, IMEP1, CEP1, CPEP1, GPEP1, EIEP1, EMEP1, ECEP1  EPB1             External Port FIFO Buffer 1
12 (3)     DMAC12             IIEP2, IMEP2, CEP2, CPEP2, GPEP2, EIEP2, EMEP2, ECEP2  EPB2             External Port FIFO Buffer 2
13         DMAC13             IIEP3, IMEP3, CEP3, CPEP3, GPEP3, EIEP3, EMEP3, ECEP3  EPB3             External Port FIFO Buffer 3

(1) Link port and SPI DMA parameter register names correspond to the same IOP addresses since these peripherals share DMA channels 8 and 9. Since chaining is not supported for SPI DMA, a chain pointer register cannot be used for SPI DMA operation.
(2) The DMAR1 and DMAG1 pins are handshake controls for DMA channel 11.
(3) The DMAR2 and DMAG2 pins are handshake controls for DMA channel 12.
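The index, modify, and count behavior described for the IIx, IMx, and Cx registers can be modeled on a host to check a buffer setup before programming a channel. The following Python sketch is a simplified model under stated assumptions: it treats the 0x40000 offset as added to an 18-bit index, ignores chaining and external port registers, and uses illustrative names that do not correspond to any Analog Devices API.

```python
IOP_OFFSET = 0x40000     # first internal normal-word address on the ADSP-21161
INDEX_MASK = 0x3FFFF     # IIx is 18 bits wide
COUNT_MASK = 0xFFFF      # Cx is 16 bits wide


def dma_addresses(ii: int, im: int, c: int):
    """Yield the internal memory address used by each transfer of one sequence.

    ii: value written to IIx (only the low 18 bits are kept)
    im: signed modify value from IMx
    c:  value written to Cx; zero results in 2**16 transfers
    """
    index = ii & INDEX_MASK
    count = c & COUNT_MASK
    while True:
        yield IOP_OFFSET + index             # transfer happens before the count test
        index = (index + im) & INDEX_MASK    # post-modify, wrapping past 18 bits
        count = (count - 1) & COUNT_MASK     # decrement after each transfer
        if count == 0:
            break                            # sequence complete, interrupt point
```

For example, `list(dma_addresses(0x100, 2, 3))` steps through 0x40100, 0x40102, and 0x40104, and an index that runs past 0x3FFFF continues at 0x40000. Because the transfer precedes the count test, a count of zero produces 65,536 addresses.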
All of the I/O processor's registers are memory-mapped in the processor's internal memory, ranging from address 0x0000 0000 to 0x0000 01FF. For more information on these registers, see I/O Processor Registers on page A-47.

Because the I/O processor registers are memory-mapped, the processor and external processors (host or multiprocessors) have access to program DMA operations. A processor sets up a DMA channel by writing the transfer's parameters to the DMA parameter registers. After the IIx, IMx, and Cx registers (among others) are loaded with a starting source or destination address, an address modifier, and a word count, the processor is ready to start the DMA.

The external ports, link ports, SPI port, and serial ports each have a DMA enable bit (DEN, LxDEN, SPIEN, or SDEN) in their channel control register. Setting this bit for a DMA channel with configured DMA parameters starts the DMA on that channel. If the parameters configure the channel to receive, the I/O processor transfers data words received at the buffer to the destination in internal memory. If the parameters configure the channel to transmit, the I/O processor transfers a word automatically from the source memory to the channel's buffer register. These transfers continue until the I/O processor transfers the selected number of words as determined by the count parameter.

To start a new (non-chained) DMA sequence after the current one is finished, programs must disable the channel (clear its DEN bit); write new parameters to the IIx, IMx, and CEPx registers; then enable the channel (set its DEN bit). For chained DMA operations, this disable-enable process is not necessary. For more information, see Chaining DMA Processes on page 6-25.
Table 6-2. DMA Channel Allocation and Parameter Register Assignments

DMA Channel #  Data Buffer    Parameter Registers          Description
0              RX0A or TX0A   II0A, IM0A, C0A, CP0A, GP0A  Serial Port 0 A data
1              RX0B or TX0B   II0B, IM0B, C0B, CP0B, GP0B  Serial Port 0 B data
2              RX1A or TX1A   II1A, IM1A, C1A, CP1A, GP1A  Serial Port 1 A data
3              RX1B or TX1B   II1B, IM1B, C1B, CP1B, GP1B  Serial Port 1 B data
4              RX2A or TX2A   II2A, IM2A, C2A, CP2A, GP2A  Serial Port 2 A data
5              RX2B or TX2B   II2B, IM2B, C2B, CP2B, GP2B  Serial Port 2 B data
6              RX3A or TX3A   II3A, IM3A, C3A, CP3A, GP3A  Serial Port 3 A data
7              RX3B or TX3B   II3B, IM3B, C3B, CP3B, GP3B  Serial Port 3 B data

Table 6-2. DMA Channel Allocation and Parameter Register Assignments (Cont'd)

DMA Channel #  Data Buffer   Parameter Registers                                    IOP Address of DMA Parameter Registers
8              LBUF0/SPIRX   IILB0, IMLB0, CLB0, CPLB0, GPLB0                       0x30 to 0x34
                             IISRX, IMSRX, CSRX, GPSRX (no CPx)
9              LBUF1/SPITX   IILB1, IMLB1, CLB1, CPLB1, GPLB1                       0x38 to 0x3C
                             IISTX, IMSTX, CSTX, GPSTX (no CPx)
10             EPB0          IIEP0, IMEP0, CEP0, CPEP0, GPEP0, EIEP0, EMEP0, ECEP0  0x40 to 0x47
11 (1)         EPB1          IIEP1, IMEP1, CEP1, CPEP1, GPEP1, EIEP1, EMEP1, ECEP1  0x48 to 0x4F
12 (2)         EPB2          IIEP2, IMEP2, CEP2, CPEP2, GPEP2, EIEP2, EMEP2, ECEP2  0x50 to 0x57
13             EPB3          IIEP3, IMEP3, CEP3, CPEP3, GPEP3, EIEP3, EMEP3, ECEP3  0x58 to 0x5F

(1) DMAR1 and DMAG1 are handshake controls for DMA channel 11.
(2) DMAR2 and DMAG2 are handshake controls for DMA channel 12.
DMA channel 0 has the highest priority and DMA channel 13 has the lowest priority.
The DMA channel arbitration feature allows the link port or SPI channel group to rotate priority with the external port channels. This feature may be enabled by setting the PRROT bit in the SYSCON IOP register. The DMA controller can be programmed to use a rotating priority scheme for the four external port channels by setting the DCPR bit in the SYSCON register. The DMA controller can be programmed to use a rotating priority scheme for the two link port DMA channels (channels 8 and 9) by setting the LDCPR bit in the SYSCON register.

Each channel has a set of parameter registers (II, IM, C, CP, GP, and so on) that are used to set up DMA transfers. DMA parameter register assignments for the various channels are shown in Table 6-2. So that ADSP-21160 programs can run on the ADSP-21161 processor with no modifications, previously used mnemonics and the new mnemonics map to the same addresses whenever appropriate.
Booting Modes
The booting modes that are supported by the ADSP-21161 processor are given in Table 6-5.

Table 6-5. Booting Modes for ADSP-21161

EBOOT  LBOOT  BMS        Booting Mode
1      0      output     EPROM Boot (connect BMS to EPROM chip select) (1)
0      0      1 (input)  Host Boot (1)
0      1      1 (input)  Link Boot (2)
0      1      0 (input)  Serial Boot (SPI) (3)
0      0      0 (input)  No Booting (processor executes from the external memory)

(1) For the host and EPROM boots, DMA channel 10 (EPB0) is used.
(2) For the link boot, DMA channel 8 (LBUF0) is used.
(3) Serial boot (SPI) uses DMA channel 8 (it is mutually exclusive with the link ports).
Chaining is enabled, the CPx register address field is non-zero, and the current DMA sequence finishes. Again, TCB chain loading occurs.

A DMA sequence ends when one of the following occurs:
• The count register Cx decrements to zero (both CEPx and ECEPx for external port channels).
• Chaining is disabled and the channel's DEN bit transitions from high to low. If the DEN bit goes low (=0) and chaining is enabled, the channel enters chain insertion mode and the DMA sequence continues. For more information, see Inserting a TCB in an Active Chain on page 6-28.

When a program sets the DEN bit (=1) after a single DMA finishes, the DMA sequence continues from where it left off (for non-chained operations only). To start a new DMA sequence after the current one is finished, a program must first clear the DEN enable bit, write new parameters to the IIx, IMx, and Cx registers, then set the DEN bit to re-enable DMA. For chained DMA operations, these steps are not necessary. For more information, see Chaining DMA Processes on page 6-25.

If a DMA operation completes and the count register is rewritten before the DMA enable bit is cleared, the DMA transfer restarts at the new count.

Once a program starts a DMA process, the process is influenced by two external controls: DMA channel priority and DMA chaining. For more information, see Managing DMA Channel Priority on page 6-22 or Chaining DMA Processes on page 6-25.
[Figure 6-5 (diagram not reproduced). Labels recoverable from the diagram: internal memory; PMD and DMD buses; external port; I/O processor with internal and external DMA address generators, request and grant lines, and the external DMA prioritizer (DMAR requests, DMAG grants); external port FIFOs, EPB (8 deep); link port FIFOs, LBUF (2 deep); SPI port buffer FIFOs (2 deep); link ports, serial ports, SPI port, and other IOP registers. On asynchronous writes, the slave write FIFO is 2 deep. On synchronous writes, the slave write FIFO is 4 deep.]
For information on programming serial port priority modes, see Serial Port Channel Priority Modes on page 6-99. For information on programming SPI port priority modes, see SPI DMA Channel Priority on page 6-112. The SPI port does not support DMA chaining. The I/O processor determines which DMA channel has the highest priority internal DMA request during every cycle between each data transfer. Internal DMA channel arbitration differs from external bus arbitration. For more information on external bus arbitration, see Multiprocessor Bus Arbitration on page 7-93. Processor core accesses of I/O processor registers and TCB chain loading are subject to the same prioritization scheme as the DMA channels. Applying this scheme uniformly prevents I/O bus contention, because these accesses are also performed over the internal I/O bus. TCB chain loading has a higher priority than external port accesses and link port/SPI port DMA accesses. This TCB priority permits chained serial port DMA, even when the external port is attempting an access in every cycle. For more information, see Chaining DMA Processes on page 6-25. If a processor has the link ports enabled and active at the same time, the default priority scheme could hold off external port DMA channels for extended periods of time. Because this hold off could have a significant negative impact on external bus performance, the I/O processor permits rotating DMA channel priority between the link port channel group and external port channel group. For more information on using the PRROT bit to rotate priority between link ports and the external port, see Link Port Channel Priority Modes on page 6-83.
Bit 18 of the CPx register (shown in Figure 6-6) is the Program Controlled Interrupts (PCI) bit. If set, the PCI bit enables a DMA channel interrupt to occur at the completion of the current DMA sequence. The PCI bit only affects DMA channels that have chaining enabled (CHEN=1). Also, interrupt requests enabled by the PCI bit are maskable with the IMASK register. Because the PCI bit is not part of the memory address in the CPx register, programs must be careful when writing and reading addresses to and from the register. To prevent errors, programs should mask out the PCI bit (bit 18) when copying the address in CPx to another address register.
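A minimal sketch of the masking described above, assuming CPx values are handled as 32-bit integers on the C side. The macro and function names (CP_ADDR_MASK, CP_PCI_BIT, cp_address, cp_make) are illustrative and are not taken from any Analog Devices header.

```c
#include <assert.h>
#include <stdint.h>

#define CP_ADDR_MASK 0x3FFFFu     /* bits 17-0: 18-bit address field */
#define CP_PCI_BIT   (1u << 18)   /* bit 18: Program Controlled Interrupts */

/* Extract only the address field from a CPx value, dropping the PCI bit. */
static uint32_t cp_address(uint32_t cp)
{
    return cp & CP_ADDR_MASK;
}

/* Build a CPx value from an address and an explicit PCI choice (0 or 1). */
static uint32_t cp_make(uint32_t addr, int pci)
{
    uint32_t cp = addr & CP_ADDR_MASK;   /* clear bit 18 and everything above */
    assert(pci == 0 || pci == 1);
    if (pci)
        cp |= CP_PCI_BIT;                /* request an interrupt for this TCB */
    return cp;
}
```

Keeping the address and the PCI choice as separate arguments avoids the case where an address with bit 18 set turns the interrupt on by accident.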
[Figure 6-6. CPx Register; graphic not reproduced. The figure numbers bits 18 through 0 and marks bit 18 as the PCI bit (Program-Controlled Interrupt bit): if this bit is set, the I/O processor generates a DMA interrupt on completion of a chained DMA.]

During chained DMA, the channel's General Purpose (GP) register is a useful place to point to the last completed DMA sequence. This practice lets programs determine where the last full (or empty) data buffer is located.

Transfer Control Block (TCB) Chain Loading

During TCB chain loading, the I/O processor loads the DMA channel parameter registers with values retrieved from internal memory. The address in the CPx register points to the highest address of the TCB (containing the IIx or IIEPx parameter). The TCB values reside in consecutive memory locations.
Table 6-6 shows the TCB-to-register loading sequence for the external port, link port, and serial port DMA channels. The I/O processor reads each word of the TCB and loads it into the corresponding register. Programs must set up the TCB in memory in the order shown in Table 6-6, placing the IIx parameter at the address pointed to by the CPx register of the previous DMA operation of the chain.

Table 6-6. TCB Chain Loading Sequence

Address (1)               External Port   Link and Serial Ports
CPx + 0x0004 0000         IIEPx           IIx
CPx - 1 + 0x0004 0000     IMEPx           IMx
CPx - 2 + 0x0004 0000     CEPx            Cx
CPx - 3 + 0x0004 0000     CPEPx           CPx
CPx - 4 + 0x0004 0000     GPEPx           GPx
CPx - 5 + 0x0004 0000     EIEPx
CPx - 6 + 0x0004 0000     EMEPx
CPx - 7 + 0x0004 0000     ECEPx
CPx - 8 + 0x0004 0000

(1) An x denotes the DMA channel used. Link, SPI, and serial ports use the first five locations only.
A TCB chain load request is prioritized like all other DMA operations. The I/O processor latches a TCB loading request and holds it until the load request has the highest priority. If multiple chaining requests are present, the I/O processor services the TCB registers for the highest priority DMA channel first. A channel which is in the process of chain loading cannot be interrupted by a higher priority channel. For a list of DMA channels in priority order, see Table 6-1 on page 6-13. For more information on DMA priority, see Managing DMA Channel Priority on page 6-22.
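The TCB layout in Table 6-6 can be modeled on a host as a word array in which the IIx word sits at the highest index and the remaining words follow at descending indices. The sketch below covers the five-word link or serial port TCB only. The array index stands in for the CPx address field (the hardware adds the 0x0004 0000 offset itself), and the names dma_params, tcb_store, and tcb_load are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Word offsets below the address held in CPx, in load order (Table 6-6). */
enum { TCB_II = 0, TCB_IM = 1, TCB_C = 2, TCB_CP = 3, TCB_GP = 4, TCB_WORDS = 5 };

typedef struct {
    uint32_t ii, im, c, cp, gp;   /* modeled channel parameter registers */
} dma_params;

/* Store one TCB so that its II word sits at index `top` (the highest
 * address) and the other words follow at descending addresses. */
static void tcb_store(uint32_t *mem, uint32_t top, const dma_params *p)
{
    assert(top >= (uint32_t)(TCB_WORDS - 1));   /* TCB must fit below `top` */
    mem[top - TCB_II] = p->ii;
    mem[top - TCB_IM] = p->im;
    mem[top - TCB_C]  = p->c;
    mem[top - TCB_CP] = p->cp;
    mem[top - TCB_GP] = p->gp;
}

/* Emulate a chain load: read back the TCB whose II word is at index `top`. */
static dma_params tcb_load(const uint32_t *mem, uint32_t top)
{
    dma_params p;
    assert(top >= (uint32_t)(TCB_WORDS - 1));
    p.ii = mem[top - TCB_II];
    p.im = mem[top - TCB_IM];
    p.c  = mem[top - TCB_C];
    p.cp = mem[top - TCB_CP];
    p.gp = mem[top - TCB_GP];
    return p;
}
```

A zero in the cp field of the last TCB ends the chain in this model, matching the rule that chain loading occurs only when the chain pointer address is non-zero.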
Setting Up and Starting the Chain

To set up and initiate a chain of DMA operations, use the following steps:

1. Set up all TCBs in internal memory.
2. Write to the appropriate DMA control register, setting the DEN DMA enable bit to 1 and the CHEN chaining enable bit to 1.
3. Write the address containing the IIx register value of the first TCB to the CPx register, starting the chain.

The I/O processor responds by autoinitializing the channel's parameter registers with the first TCB and starting the first transfer. When the transfer finishes, the I/O processor begins the next TCB chain load if the current chain pointer address is non-zero. The CPx address points to the next TCB.

The address field of the CPx registers is only 18 bits wide. If a program writes a symbolic address to CPx and that address has bit 18 set, there may be a conflict with the PCI bit. Programs should clear the upper bits of the address, then set the PCI bit separately, if needed.

Inserting a TCB in an Active Chain

It is possible to insert a single DMA operation or another DMA chain within an active DMA chain. Programs may need to perform insertion when a high priority DMA requires service and cannot wait for the current chain to finish.
When DMA on a channel is disabled (DEN=0) and chaining on the channel is enabled (CHEN=1), the DMA channel is in chain insertion mode. This mode lets a program insert a new DMA or DMA chain within the current chain without affecting the current DMA transfer. Use the following sequence to insert a DMA subchain while another chain is active:

1. Enter chain insertion mode by setting CHEN=1 and DEN=0 in the channel's DMA control register. The DMA interrupt indicates when the current DMA sequence has completed.
2. Write the CPx register value into the CP position of the last TCB in the new chain.
3. Enter chained DMA mode by setting DEN=1 and CHEN=1.
4. Write the start address of the first TCB of the new chain into the CPx register.

Chain insertion mode operates the same as chained DMA mode (DEN=1, CHEN=1), except that when the current DMA transfer ends, automatic chaining is disabled and an interrupt request occurs. This interrupt request is independent of the PCI bit state.

Chain insertion should not be set up as an initial mode of operation. This mode should only be used to insert a DMA within an active DMA chaining operation.
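The four insertion steps can be sketched against a small host-side model. The struct, the function names, and the split into a "begin" call and a "finish" call are assumptions for illustration. In a real program the second call would run only after the DMA interrupt reports that the current sequence has completed. The DEN and CHEN positions are those given in this chapter for the external port DMACx registers.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_DEN  (1u << 0)   /* DMA enable */
#define DMA_CHEN (1u << 1)   /* chaining enable */

typedef struct {
    uint32_t ctl;   /* modeled DMA control register */
    uint32_t cp;    /* modeled CPx chain pointer register */
} dma_channel;

/* Step 1: enter chain insertion mode (CHEN=1, DEN=0). */
static void chain_insert_begin(dma_channel *ch)
{
    ch->ctl = (ch->ctl | DMA_CHEN) & ~DMA_DEN;
}

/* Steps 2-4: call once the DMA interrupt has signaled that the current
 * sequence is complete.
 * new_first   : address of the first TCB of the new chain
 * new_last_cp : location of the CP word in the last TCB of the new chain */
static void chain_insert_finish(dma_channel *ch, uint32_t new_first,
                                uint32_t *new_last_cp)
{
    /* the channel must still be in chain insertion mode */
    assert((ch->ctl & (DMA_DEN | DMA_CHEN)) == DMA_CHEN);
    *new_last_cp = ch->cp;            /* 2. new chain continues into the old one */
    ch->ctl |= DMA_DEN | DMA_CHEN;    /* 3. back to chained DMA mode */
    ch->cp = new_first;               /* 4. start the new chain */
}
```

Copying the current CPx value into the last TCB of the new chain is what lets the interrupted chain resume once the inserted work is done.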
bus master capability use these channels. Channels 10, 11, 12, and 13 are assigned to the EPB0, EPB1, EPB2, and EPB3 buffers respectively, and are controlled by the DMAC10, DMAC11, DMAC12, and DMAC13 DMA control registers. The ADSP-21161 processor supports a number of DMA modes for external port DMA. The following sections describe typical external port DMA processes:

Setting Up External Port DMA on page 6-68
Bootloading Through The External Port on page 6-70
Boot Memory DMA Mode on page 6-42
External Port Buffer Modes on page 6-42
External Port Channel Priority Modes on page 6-43
External Port Channel Transfer Modes on page 6-46
External Port Channel Handshake Modes on page 6-47
The following bits control external port I/O processor modes. Except for the FLSH bit, the control bits in the DMACx registers have a one cycle effect latency. The FLSH bit has a two cycle effect latency. Programs should not modify an active DMA channel's DMACx register other than to disable the channel by clearing the DEN bit. For information on verifying a channel's status with the DMASTAT register, see Using I/O Processor Status on page 6-121.

Some other bits in SYSCON, WAIT, and DMACx set up non-DMA external port features. For information on these features, see Setting External Port Modes on page 7-3.

Boot Select Override. SYSCON Bit 1 (BSO). This bit enables (if set, =1) or disables (if cleared, =0) access to Boot Memory Space. When BSO is set, the processor uses the BMS select line (instead of MS3-0) to perform DMA channel 10 accesses to external memory.

Host Bus Width. SYSCON Bits 5-4 (HBW). These bits select the host bus width as follows: 00=32-bit width, 01=16-bit width, 10=8-bit width (reset value).

Host Most Significant Word First Packing Select. SYSCON Bit 7 (HMSWF). This bit selects the word packing order for host accesses as most-significant-word first (if set, =1) or least-significant-word first (if cleared, =0).

Buffer Hang Disable. SYSCON Bit 16 (BHD). This bit controls whether the processor core proceeds (hang disabled if set, =1) or is held off (hang enabled if cleared, =0) when the core tries to read from an empty EPBx, RXx, LBUFx or SPIRX buffer or tries to write to a full EPBx, TXx, LBUFx or SPITX buffer.

External Port DMA Channel Priority Rotation Enable. SYSCON Bit 19 (DCPR). This bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation among external port DMA channels (channels 10-13).

Handshake and Idle for DMA Enable. WAIT Bit 30 (HIDMA). This bit enables (if set, =1) or disables (if cleared, =0) adding an idle cycle after every memory access for DMAs with handshaking (DMARx-DMAGx).
External Port DMA Enable. DMACx Bit 0 (DEN). This bit enables (if set, =1) or disables (if cleared, =0) DMA for the corresponding external port FIFO buffer (EPBx).

External Port DMA Chaining Enable. DMACx Bit 1 (CHEN). This bit enables (if set, =1) or disables (if cleared, =0) DMA chaining for the corresponding external port FIFO buffer (EPBx).

External Port Transmit/Receive Select. DMACx Bit 2 (TRAN). This bit selects the transfer direction for the corresponding external port FIFO buffer (EPBx). If set (=1), the port transmits data from internal memory. If cleared (=0), the port receives data from external memory.

External Port Data Type Select. DMACx Bit 5 (DTYPE). This bit selects the transfer data type (40/48-bit, 3-column if set, =1) (32/64-bit, 4-column if cleared, =0) for the corresponding external port FIFO buffer (EPBx).

External Port Packing Mode. DMACx Bits 8-6 (PMODE). These bits select the packing mode for the corresponding external port FIFO buffer (EPBx) as follows: 000=reserved, 001=16 external to 32/64 internal packing, 010=16 external to 48 internal packing, 011=32 external to 48 internal packing, 100=no packing, 101=8 external to 48 internal packing, 110=8 external to 32/64 internal packing, 111=reserved. During reset, the default is PMODE=101.

Most Significant Word First. DMACx Bit 9 (MSWF). When the buffer's PMODE is 001 or 010, this bit selects the packing order of 8-bit or 16-bit words (most significant first if set, =1) (least significant first if cleared, =0) for the corresponding external port FIFO buffer (EPBx).

Master Mode Enable. DMACx Bit 10 (MASTER). This bit enables (if set, =1) or disables (if cleared, =0) master mode for the corresponding external port FIFO buffer (EPBx).
Handshake Mode Enable. DMACx Bit 11 (HSHAKE). This bit enables (if set, =1) or disables (if cleared, =0) handshake mode for the corresponding external port FIFO buffer (EPBx).

External Handshake Mode Enable. DMACx Bit 13 (EXTERN). This bit enables (if set, =1) or disables (if cleared, =0) external handshake mode for the corresponding external port FIFO buffer (EPBx).

External Port Bus Priority. DMACx Bit 15 (PRIO). This bit selects the external bus access priority level (high if set, =1) (low if cleared, =0) for the corresponding external port FIFO buffer (EPBx).
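The DMACx bit positions listed above translate directly into C masks. The following sketch is an illustration only: the macro names are chosen here and are not taken from an Analog Devices header, and only bits whose positions are stated in the list above are included. The two helpers read and write the three-bit PMODE field.

```c
#include <assert.h>
#include <stdint.h>

/* External port DMACx control bits, positions as listed in this section. */
#define DMAC_DEN          (1u << 0)    /* DMA enable */
#define DMAC_CHEN         (1u << 1)    /* chaining enable */
#define DMAC_TRAN         (1u << 2)    /* 1 = transmit from internal memory */
#define DMAC_DTYPE        (1u << 5)    /* 1 = 40/48-bit, 3-column data */
#define DMAC_PMODE_SHIFT  6            /* PMODE occupies bits 8-6 */
#define DMAC_PMODE_MASK   (0x7u << DMAC_PMODE_SHIFT)
#define DMAC_MSWF         (1u << 9)    /* most significant word first */
#define DMAC_MASTER       (1u << 10)   /* master mode enable */
#define DMAC_HSHAKE       (1u << 11)   /* handshake mode enable */
#define DMAC_EXTERN       (1u << 13)   /* external handshake mode enable */
#define DMAC_PRIO         (1u << 15)   /* external bus priority */

/* Return a copy of `dmac` with its PMODE field replaced. */
static uint32_t dmac_set_pmode(uint32_t dmac, unsigned pmode)
{
    assert(pmode >= 1 && pmode <= 6);   /* 000 and 111 are reserved */
    return (dmac & ~DMAC_PMODE_MASK) | ((uint32_t)pmode << DMAC_PMODE_SHIFT);
}

/* Read the PMODE field (0-7) from a DMACx value. */
static unsigned dmac_get_pmode(uint32_t dmac)
{
    return (unsigned)((dmac & DMAC_PMODE_MASK) >> DMAC_PMODE_SHIFT);
}
```

For example, dmac_set_pmode(0, 5) yields the reset packing mode (PMODE=101, 8 external to 48 internal) with every other bit cleared.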
To flush (clear) an external port buffer, write 1 to the FLSH bit in the appropriate DMACx control register. The DMA for the channel must be disabled during the write operation. The FLSH bit is not latched internally and always reads as 0. Status can change in the following cycle. Do not enable and flush an external port buffer in the same cycle.

For DMA transfers between the processor's internal memory and external memory, the DMA controller must generate addresses in both memories. The external port DMA channels contain both EIEPx (External Index) and EMEPx (External Modify) registers to generate external addresses. The EIEPx register provides the external port address for the current DMA cycle. It is updated with the modifier value in EMEPx for the next external memory access.
[Figure 6-7. External Port Data Alignment; graphic not reproduced. The figure shows how data aligns on DATA47-16 and DATA15-0 for: 16-bit packed DMA data; 16-bit packed instruction execution; float or fixed 32-bit data (D31-D0); 32-bit packed DMA; 32-bit packed instruction execution; and 48-bit instruction fetch (no packing).]

The PMODE bits in the DMACx control registers determine the packing mode for internal bus words, while the HBW bits in the SYSCON register determine the packing mode for external bus words. Table 6-7 shows the packing modes of operation for PMODE[2:0], which correspond to bits 8, 7, and 6 in the DMACx register. During reset, the default value of PMODE in DMAC10 is 101 (8- to 48-bit packing for PROM or Host booting).
Each external port DMA control register contains a three-bit PS field that indicates the number of short words currently packed in the EPBx buffer. The PS status field behaves the same way during packing and unpacking operations. All packing functions are available for all types of DMA transfers. Table 6-8 shows the values of PS[2:0], which correspond to bits 23, 22, and 21 of the DMACx register.

Packing mode bit settings depend on whether the host access is processor-to-processor or processor-to-memory. To access another ADSP-21161 or memory, you must set the PMODE bits only (HBW bits have no effect) to pack and unpack individual data words for the following modes: master mode, paced master mode, and handshake mode DMA.
For host accesses, to pack and unpack individual data words, you must set both the PMODE bits in the appropriate DMACx control register and the HBW bits in the SYSCON register. Table 6-7 shows the packing mode bit settings for access to IOP, link port and external port buffers.

Table 6-8. External Port FIFO Buffer Packing Status (Read Only)

PS[2:0]   EPBx Packing Status
000       Packing complete
001       1st stage
010       2nd stage
011       3rd stage
100       5th stage of 8/48
For transfers to or from the EPBx data buffers, the packing mode is determined by the setting of the HBW bits of the SYSCON register and the PMODE bits in the DMACx control register of each external port buffer. The external port buffer can pack data in most significant word first (MSWF) order or in least significant word first (LSWF) order. Setting the MSWF bit to 1 in the DMACx control register selects MSW-first mode for both packing and unpacking operations. The MSWF bit has no effect when PMODE=111 or PMODE=000.

32-Bit Bus Downloading

The packing sequence for downloading processor instructions from a 32-bit bus (PMODE=011, HBW=00) takes three cycles for every two words, as shown in Table 6-9.
For host transfers to or from the EPBx buffers, you must set the HBW bits in the SYSCON register to correspond to the external bus width. Note that the processor transfers 32-bit data on data bus lines DATA[47-16]. To transfer an odd number of instruction words, you must write a dummy access to flush the packing buffer and remove the unused word. For 32- to 48-bit host packing, the processor ignores the HMSWF bit in the SYSCON register and the MSWF bit in the DMACx control register. For non-host accesses (for example, DMA master mode accesses to external memory) the processor uses the MSWF bit for packing and ignores the value of HMSWF in SYSCON.

16-Bit Bus Downloading

Table 6-10 and Table 6-11 show the packing sequence for downloading processor instructions from a 16-bit bus (PMODE=010, HBW=01). When interfacing to a host processor, the HMSWF bit determines whether the I/O processor packs to most significant 16-bit word first (=1) or least significant 16-bit word first (=0).

Table 6-10. Download Packing Sequence for 16-Bit Bus (MSW First)

Transfer   Data Bus Pins 31-16
First      Word 1; bits 47-32
Second     Word 1; bits 31-16
Third      Word 1; bits 15-0
Table 6-11. Download Packing Sequence for 16-Bit Bus (LSW First)

Transfer   Data Bus Pins
First      Word 1; bits 15-0
Second     Word 1; bits 31-16
Third      Word 1; bits 47-32

8-Bit Bus Downloading

The packing sequence for downloading processor instructions from an 8-bit host (PMODE=101, HBW=10) takes six cycles for each word, as shown in Table 6-12 and Table 6-13. The HMSWF bit in SYSCON determines whether the I/O processor packs the most significant (=1) or least significant 8-bit word first (=0).

Table 6-12. Download Packing Sequence From 8-Bit Bus (MSW First)

Transfer   Data Bus Pins 23-16
First      Word 1; bits 47-40
Second     Word 1; bits 39-32
Third      Word 1; bits 31-24
Fourth     Word 1; bits 23-16
Fifth      Word 1; bits 15-8
Sixth      Word 1; bits 7-0
Table 6-13. Download Packing Sequence From 8-Bit Bus (LSW First)

Transfer   Data Bus Pins 23-16
First      Word 1; bits 7-0
Second     Word 1; bits 15-8
Third      Word 1; bits 23-16
Fourth     Word 1; bits 31-24
Fifth      Word 1; bits 39-32
Sixth      Word 1; bits 47-40
[DMACx register bit diagram; graphic not reproduced. The figure numbers bits 31 through 0, shows each with a value of 0, and describes the following fields.]

PS: Ext. Port EPBx FIFO Buffer Packing Status (read-only). 000=packing complete, 001=1st stage pack/unpack, 010=2nd stage pack/unpack, 011=3rd stage, 100=5th stage of 8- to 48-bit packing, 101, 110, 111=reserved.
FS: Ext. Port FIFO Buffer Status (read-only). 00=buffer empty, 01=buffer not full, 10=buffer not empty, 11=buffer full.
INT32: Internal Memory 32-bit Transfers Select. 1=32-bit transfers/EPBx access width, 0=64-bit transfers/EPBx access width.
MAXBL: Maximum Burst Length Select. 00=burst disabled, 01=burst limit of 4, 10, 11=reserved.
PRIO: External Port Bus Priority Access. 1=DSP asserts PA~ for external bus access, 0=PA~ not asserted.
FLSH: Flush EPBx FIFO Buffers and Status. 1=flush EPBx.
EXTERN: External Handshake Mode Enable. 1=enable (external devices to external memory), 0=disable.
DEN: Ext. Port DMA Enable. 1=enable, 0=disable.
CHEN: Ext. Port DMA Chaining Enable. 1=enable, 0=disable.
TRAN: Ext. Port EPBx Transmit/Receive Select. 1=transmit data from internal memory, 0=receive data from external memory.
INTIO: Single Word Interrupts for EPBx FIFO Buffers. 1=enable single-word, non-DMA, interrupt-driven transfers; 0=disabled, FIFO fully enabled.
DTYPE: EPBx FIFO Buffer Data Type Select. 1=40/48-bit, 3-column data; 0=32/64-bit, 4-column data.
HSHAKE: EPBx DMA Handshake Mode Enable. 1=enable, 0=disable.
PMODE: Ext. Port EPBx FIFO Packing Mode. 000, 111=reserved; 001=16 ext-to-32/64 int; 010=16 ext-to-48 int; 011=32 ext-to-48 int; 100=no pack (32 ext-to-32/64 int); 101=8 ext-to-48 int; 110=8 ext-to-32/64 int.
MASTER: EPBx DMA Master Mode Enable. 1=enable, 0=disable.
MSWF: Most Significant Word First During Packing. 1=enable, MSW first; 0=disable, LSW first.
In addition to selecting the packing mode for external port processor transfers, programs must indicate the type of data in the transfer, using the Data Type (DTYPE) bit. For more information, see External Port Channel Transfer Modes on page 6-46.

The Buffer Hang Disable (BHD) bit lets the processor core proceed if the core tries to read from an empty EPBx, RXx, LBUFx or SPIRX buffer or tries to write to a full EPBx, TXx, LBUFx or SPITX buffer. The processor core still performs buffer accesses when buffer hang is disabled (BHD=1). If the processor core attempts to read from an empty receive buffer, the core gets a repeat of the last value that was in the buffer. If the processor core attempts to write to a full buffer, the core overwrites the last value that was written to the buffer. Because these buffers are not initialized at reset, a read from a buffer that has not been filled since the reset returns an undefined value.

If an external port buffer's INTIO bit is set and DMA for that channel is not enabled, the external port channel is in single-word, interrupt-driven transfer mode. For more information, see Using I/O Processor Status on page 6-121.
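The buffer behavior with buffer hang disabled (BHD=1) can be illustrated with a small software FIFO. This is a behavioral sketch under stated assumptions, not a description of the hardware implementation: the fifo struct and function names are hypothetical, and the depth of 8 is the EPB depth shown in the I/O processor block diagram.

```c
#include <assert.h>
#include <stdint.h>

#define FIFO_DEPTH 8u   /* EPB depth assumed for this sketch */

typedef struct {
    uint32_t slot[FIFO_DEPTH];
    unsigned head;        /* index of the oldest valid entry */
    unsigned count;       /* number of valid entries */
    uint32_t last_read;   /* last value handed to the core */
} fifo;

/* Core read with BHD=1: an empty buffer repeats the last value read. */
static uint32_t fifo_read_bhd(fifo *f)
{
    assert(f != 0);
    if (f->count > 0) {
        f->last_read = f->slot[f->head];
        f->head = (f->head + 1u) % FIFO_DEPTH;
        f->count--;
    }
    return f->last_read;
}

/* Core write with BHD=1: a full buffer has its newest entry overwritten. */
static void fifo_write_bhd(fifo *f, uint32_t v)
{
    unsigned tail;
    assert(f != 0);
    if (f->count < FIFO_DEPTH) {
        tail = (f->head + f->count) % FIFO_DEPTH;       /* next free slot */
        f->count++;
    } else {
        tail = (f->head + f->count - 1u) % FIFO_DEPTH;  /* most recent entry */
    }
    f->slot[tail] = v;
}
```

In this model a zero-initialized fifo returns 0 when read before any write. On the processor that first read is undefined, as noted above, so programs should not rely on it.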
When the DCPR bit is set, the priority levels rotate. High priority shifts to a new channel after each packed single-word transfer. The I/O processor services a single-word transfer then rotates priority to the next higher numbered channel. Rotation continues until the I/O processor services all four external port channels. Figure 6-9 illustrates this process as described in the following steps:

1. At reset, external port channels have the priority order, from high to low, of 10, 11, 12, and 13.
2. The external port performs a single transfer on channel 11.
3. The I/O processor rotates channel priority, changing it to 12, 13, 10, and 11 (because rotating priority is enabled for this example, DCPR=1).
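[Figure 6-9; graphic not reproduced. Step 2: channel 10 has the highest priority, followed by 11 and 12, and channel 13 has the lowest priority. Step 3: channel 12 has the highest priority, followed by 13 and 10, and channel 11 has the lowest priority. One transfer occurs on channel 11 (step 2), rotating channel 11's priority to the lowest priority slot (step 3).]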
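The rotation in steps 1 through 3 can be expressed as a short routine. It is a sketch of the ordering rule only. The function name ep_rotate and the array representation (element 0 is the highest priority) are assumptions for illustration.

```c
#include <assert.h>

#define EP_FIRST 10   /* first external port DMA channel */
#define EP_COUNT 4    /* channels 10, 11, 12, and 13 */

/* After a transfer on channel `served`, the next higher numbered channel
 * (wrapping from 13 back to 10) becomes the highest priority and `served`
 * becomes the lowest. order[0] holds the highest priority channel. */
static void ep_rotate(int order[EP_COUNT], int served)
{
    int i;
    assert(served >= EP_FIRST && served < EP_FIRST + EP_COUNT);
    for (i = 0; i < EP_COUNT; i++)
        order[i] = EP_FIRST + (served - EP_FIRST + 1 + i) % EP_COUNT;
}
```

Starting from the reset order 10, 11, 12, 13, a transfer on channel 11 gives 12, 13, 10, 11, which matches step 3.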
Even though the external port channel DMA priority can rotate, the interrupt priorities of all DMA channels are fixed.

When external port DMA channel priority is fixed (DCPR=0), channel 10 has the highest priority, and channel 13 has the lowest priority. However, programs can redefine this priority order by assigning one of the other channels the highest priority. To change the fixed priority sequence of the external port DMA channels, a program could use the following procedure:

1. Disable all external port DMA channels except the one which is to have lowest priority.
2. Select rotating priority.
3. Cause at least one transfer to occur on the enabled channel.
4. Disable rotating priority and re-enable all of the external port DMA channels.

After completing this procedure, the channel immediately after the selected channel has the highest fixed priority.

In systems where multiple processors are using the external bus, the PRIO bit raises the priority level for external port DMA transfers. When a channel's PRIO bit is set, the I/O processor asserts the Priority Access (PA) pin when that channel uses the external bus. The channel gets higher priority in bus arbitration, allowing the DMA to complete more quickly.

Programs can also rotate priority between external port and link port DMA channels. For more information, see Link Port Channel Priority Modes on page 6-83.
Because the external port is bidirectional, the I/O processor uses the Transmit select (TRAN) bit to determine the transfer direction (transmit or receive). Data flows from internal to external memory when in transmit mode. In transmit mode, the I/O processor fills the channel's EPBx buffer with data from internal memory when the channel's DEN bit is set.

The Data Type (DTYPE) bit determines how the DMA channel accesses columns of internal memory. If DTYPE is set, the data is 40- or 48-bit words, and the I/O processor makes 3-column internal memory accesses. If DTYPE is cleared, the data is 32- or 64-bit words, and the I/O processor makes 4-column internal memory accesses. For more information, see Memory Organization and Word Size on page 5-25. The DTYPE for the transfer overrides the Internal Memory Data Width (IMDWx) setting for the internal memory block.
Table 6-16. External Port DMA Handshake Modes: DMACx MASTER (M), HSHAKE (H), and EXTERN (E) Bits

EHM   DMA Mode of Operation

000   Slave Mode. The processor responds to the buffer's internal memory transfer activity based on the buffer status in the FS field, generating a DMA request whenever the buffer is not empty (on receive) or is not full (on transmit). During transmit (TRAN=1), the processor fills the EPBx buffer with data from internal memory when the program enables the buffer (DEN=1). For more information, see Slave Mode on page 6-55.

001   Master Mode. The processor attempts the internal memory DMA transfers indicated by the DMA counter (CEPx) based on the buffer status in the FS field, making transfers whenever the buffer is not empty (on receive) or is not full (on transmit). Systems using Master Mode should de-assert corresponding DMA request inputs, de-asserting DMAR1 if channel 11 is in master mode and de-asserting DMAR2 if channel 12 is in master mode. For more information, see Master Mode on page 6-50.

010   Handshake Mode. When in this mode, the processor generates a DMA request whenever the external device asserts the DMARx pin, then the processor asserts the DMAGx pin, transferring the data (and de-asserting DMAGx) when the external device de-asserts the DMARx pin. Note that this mode only applies to external port buffers EPB1 and EPB2 and DMA channels 11 and 12. For more information, see Handshake Mode on page 6-57.

011   Paced Master Mode. The processor attempts the internal memory DMA transfers indicated by the DMA counter (CEPx), making transfers based on external DMA request inputs. The processor generates a DMA request whenever the external device asserts the DMARx pin. The processor controls the data transfer using the RD or WR and ACK pins and by applying the selected number of waitstates. Note that this mode only applies to external port buffers EPB1 and EPB2 and DMA channels 11 and 12. For more information, see Paced Master Mode on page 6-54.
Table 6-16. External Port DMA Handshake Modes: DMACx MASTER (M), HSHAKE (H), and EXTERN (E) Bits (Cont'd)

EHM   DMA Mode of Operation

100   Reserved

101   Reserved

110   External-Handshake Mode. The processor responds to external memory DMA requests based on external DMA request inputs. This mode is identical to Handshake Mode, but applies to transfers between external memory and external devices. The processor generates a DMA request whenever the external device asserts the DMARx pin. The processor asserts the DMAGx pin, transferring the data (and de-asserting DMAGx) when the external device de-asserts the DMARx pin. Note that this mode only applies to external port buffers EPB1 and EPB2 and DMA channels 11 and 12. For more information, see External-Handshake Mode on page 6-66.

111   Reserved
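Table 6-16 amounts to a three-bit decode of the EXTERN, HSHAKE, and MASTER bits. The sketch below shows one way a program might classify a DMACx value. The enum, the function names, and the separate channel check are assumptions for illustration. The bit positions are those given earlier in this chapter for DMACx (MASTER bit 10, HSHAKE bit 11, EXTERN bit 13).

```c
#include <assert.h>
#include <stdint.h>

#define DMAC_MASTER (1u << 10)
#define DMAC_HSHAKE (1u << 11)
#define DMAC_EXTERN (1u << 13)

typedef enum {
    MODE_SLAVE, MODE_MASTER, MODE_HANDSHAKE,
    MODE_PACED_MASTER, MODE_EXT_HANDSHAKE, MODE_RESERVED
} ep_dma_mode;

/* Decode the E, H, M bits of a DMACx value per Table 6-16. */
static ep_dma_mode ep_mode(uint32_t dmac)
{
    unsigned e = (dmac & DMAC_EXTERN) != 0;
    unsigned h = (dmac & DMAC_HSHAKE) != 0;
    unsigned m = (dmac & DMAC_MASTER) != 0;

    switch ((e << 2) | (h << 1) | m) {
    case 0:  return MODE_SLAVE;          /* EHM = 000 */
    case 1:  return MODE_MASTER;         /* EHM = 001 */
    case 2:  return MODE_HANDSHAKE;      /* EHM = 010 */
    case 3:  return MODE_PACED_MASTER;   /* EHM = 011 */
    case 6:  return MODE_EXT_HANDSHAKE;  /* EHM = 110 */
    default: return MODE_RESERVED;       /* EHM = 100, 101, 111 */
    }
}

/* Modes paced by DMARx apply only to channels 11 and 12. */
static int ep_mode_allowed(ep_dma_mode mode, int channel)
{
    assert(channel >= 10 && channel <= 13);
    if (mode == MODE_RESERVED)
        return 0;
    if (mode == MODE_SLAVE || mode == MODE_MASTER)
        return 1;
    return channel == 11 || channel == 12;
}
```

A driver could call ep_mode_allowed before writing DMACx so that a handshake configuration is rejected for channels 10 and 13.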
For the handshake and external-handshake modes shown in Table 6-16, programs can insert an added idle cycle after every memory access. The Handshake and Idle for DMA (HIDMA) bit in the WAIT register enables this added cycle, which reduces bus contention from devices with slow three-state timing or long recovery times.

Because external port DMA transfers can go between processor internal memory and external memory, the I/O processor must generate addresses for both memory spaces. The external port DMA channels have additional parameter registers (EIEPx, EMEPx, ECEPx) for external memory access.

To support data packing options for external memory DMA transfers, the EIEPx and EMEPx registers can generate addresses at a different rate than the internal address registers (IIEPx and IMEPx). Figure 6-5 on page 6-23 shows that the I/O processor has separate address generators for internal and external addresses. For this reason, when packing is used for external memory DMA, the external count (ECEPx) register indicates the number of external port transfers, not the number of internal memory words being transferred.

The DMA mode and other factors determine the size of the DMA data transfer on the external port. These other factors include the EIEPx, EMEPx, and ECEPx parameters; the PMODE, DTYPE, and MAXBL values in DMACx; and the transfer capacity available in the EPBx data buffer employed in the transfer. The internal I/O processor bus transfer size varies with the IIEPx, IMEPx, and CEPx parameters, and the PMODE, DMA mode, DTYPE, and INT32 values in DMACx. The following sections describe these DMA modes and transfer sizes in more detail:

Master Mode on page 6-50
Paced Master Mode on page 6-54
Slave Mode on page 6-55
Handshake Mode on page 6-57
External-Handshake Mode on page 6-66

Master Mode

When the MASTER bit is set (=1) and the EXTERN and HSHAKE bits are cleared (=0) in the channel's DMACx register, the DMA channel is in master mode. A channel in this mode can independently initiate internal or external memory transfers. Master mode applies to all external port DMA channels: 10, 11, 12, and 13.

When interfacing to SDRAM memory, only master mode DMA can be used for external port DMA transfers between SDRAM and internal memory. DMARx and DMAGx pins cannot be used to pace or handshake DMA transfers using SDRAM interface pins.
To initiate a master mode DMA transfer, the processor sets up the channel's parameter registers and sets the channel's DMA enable (DEN) bit. A master mode DMA channel performing internal memory to external memory data transfer automatically performs enough transfers from internal memory to keep the EPBx buffer full. When the data transfer direction is external to internal, a master mode DMA channel also performs enough transfers from external memory to keep the EPBx buffer full. The I/O processor uses the EIEPx, EMEPx, and ECEPx registers to access external processor memory in master mode DMA.

External Transfer Controls In Master Mode. In master mode, the processor determines the size of the external transfer from the channel's PMODE bits and EIEPx, EMEPx, and ECEPx registers. Table 6-7 on page 6-36 shows the packing mode selected by the PMODE bits, and Table 6-17 shows the external transfer size in master mode that results from the combination of the PMODE bits.

Table 6-17. Master Mode External Transfer Size

Transfer Size   PMODE      EIEP   EMEP   ECEP                DTYPE   EPBx Depth
32-bit          011, 100   X      X      X                   X       >=1
16-bit          001, 010   X      X      # of 16-bit xfers   X       >=1
8-bit           110, 101   X      X      # of 8-bit xfers    X       >=1
32-bit External Transfers. The processor performs 32-bit transfers when PMODE=011 (32-bit external to 48-bit internal) or 100 (32-bit external to 32/64-bit internal). In PMODE=011 or 100, all data transfers occur across the upper word of the data bus (DATA47-16), as indicated in Figure 7-1 on page 7-2. This mode supports all values of EIEPx, EMEPx, and ECEPx. ECEPx contains the number of 32-bit words to transfer. There must be at least one 32-bit EPBx FIFO entry available to support the 32-bit external transfer.

16-bit External Transfers. The processor performs 16-bit transfers when PMODE=001 (16-bit external to 32/64-bit internal) or 010 (16-bit external to 48-bit internal). This mode supports all values of EIEPx, EMEPx, and ECEPx. ECEPx is programmed with the number of 16-bit words to transfer. There must be at least one 32-bit EPBx FIFO entry available to support the 16-bit external transfer. In PMODE=001 or 010, all data transfers occur across DATA31-16, as indicated in Figure 7-1 on page 7-2.

8-bit External Transfers. The processor performs 8-bit transfers when PMODE=110 (8-bit external to 32/64-bit internal) or 101 (8-bit external to 48-bit internal). This mode supports all values of EIEPx, EMEPx, and ECEPx. ECEPx is programmed with the number of 8-bit words to transfer. There must be at least one 32-bit EPBx FIFO entry available to support the 8-bit external transfer. In PMODE=110 or 101, all data transfers occur across DATA23-16, as indicated in Figure 7-1 on page 7-2.
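Because ECEPx counts external port transfers rather than internal words, its value follows from the packing ratio of the selected PMODE. The sketch below computes that count on a host. The function names are hypothetical, and "32/64-bit internal" is counted here in 32-bit words (an assumption made to keep the arithmetic uniform).

```c
#include <assert.h>
#include <stdint.h>

/* External transfer width in bits for each PMODE value (0 = reserved). */
static unsigned pmode_ext_bits(unsigned pmode)
{
    static const unsigned bits[8] = { 0, 16, 16, 32, 32, 8, 8, 0 };
    assert(pmode < 8);
    return bits[pmode];
}

/* Internal word width in bits for each PMODE value (0 = reserved).
 * 32/64-bit internal modes are counted in 32-bit words. */
static unsigned pmode_int_bits(unsigned pmode)
{
    static const unsigned bits[8] = { 0, 32, 48, 48, 32, 48, 32, 0 };
    assert(pmode < 8);
    return bits[pmode];
}

/* Number of external transfers needed to carry `words` internal words.
 * Returns 0 for a reserved PMODE or when the words do not fill a whole
 * number of external transfers. */
static uint32_t ecep_for_words(unsigned pmode, uint32_t words)
{
    unsigned ext = pmode_ext_bits(pmode);
    unsigned in  = pmode_int_bits(pmode);
    uint32_t total_bits;

    if (ext == 0)
        return 0;
    total_bits = words * in;
    if (total_bits % ext != 0)
        return 0;
    return total_bits / ext;
}
```

For PMODE=011, two 48-bit words need three 32-bit transfers, and a single word does not fill a whole number of transfers. This is the case in which the text calls for a dummy access to flush the packing buffer.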
Internal Address/Transfer Size Generation. In master mode, the processor determines the size of the internal transfer from the channels PMODE bits and IIEPx, IMEPx, and CEPx registers. Table 6-7 on page 6-36 shows the packing mode selected by the PMODE bits, and Table 6-18 shows the internal transfer size in master mode that results from the combination of the PMODE bits. Table 6-18. Master Mode Internal Transfer Size Determination
Transfer Size   PMODE           IIEPx              IMEPx
64-bit(1)       001, 100, 110   depends on IM(2)   -1 or 1
48-bit          010, 011, 101   X(3)               X
32-bit          001, 100, 110   X                  X

1 Including packed instructions.
2 If IMEPx is 1 (increment), IIEPx must be an even, 64-bit aligned Normal word address. If IMEPx is -1 (decrement), IIEPx must be an odd Normal word address.
3 X indicates any supported value.
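The alignment rule in the second footnote of Table 6-18 is easy to get wrong when setting up a channel by hand. The following Python sketch checks it; the function name iiep_valid_for_64bit is hypothetical, and the check covers only the IIEPx/IMEPx relationship stated in the table, not the other conditions for 64-bit transfers.

```python
def iiep_valid_for_64bit(iiep, imep):
    """Check the IIEPx/IMEPx pairing required for 64-bit internal transfers.

    iiep is a Normal word address (non-negative integer); imep is the
    internal modify value. Only a modify of 1 or -1 qualifies.
    """
    if imep == 1:
        return iiep % 2 == 0   # incrementing: even, 64-bit aligned address
    if imep == -1:
        return iiep % 2 == 1   # decrementing: odd address
    return False               # any other modify value rules out 64-bit
```

A setup routine could call this before programming the channel and fall back to planning for 32-bit transfers when it returns False.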
64-bit Internal Transfers. To enable internal 64-bit transfers and increment the internal IIEPx pointer, programs must set IIEPx to match the IMEPx selection as shown in Table 6-18. CEPx contains the number of 32-bit words to transfer, and should be set to an even number of 32-bit words. The processor decrements CEPx by 2 for each 64-bit transfer. For 64-bit transfers, PMODE must be set to 001 (16-bit external to 32/64-bit internal), 100 (32-bit external to 32/64-bit internal), or 110 (8-bit external to 32/64-bit internal). DTYPE and INT32 must be cleared. There must be at least two 32-bit EPBx FIFO entries available to support the 64-bit external transfer.

48-bit Internal Transfers. The processor can perform 48-bit internal transfers for DMA of packed or unpacked 48-bit instructions. Many applications can use internal 64-bit transfers for 48-bit instructions. This technique can provide greater throughput than 48-bit internal transfers. In either of the 48-bit internal transfer modes in Table 6-18 (PMODE=101 and DTYPE=1, or PMODE=010 or 011 and DTYPE=0), the processor accesses the memory using instruction alignment (3-column read or write) for the EPBx buffer. In this case, IIEPx points to 48-bit words, and CEPx counts the number of 48-bit internal transfers.
32-bit Internal Transfers. The processor performs 32-bit internal transfers according to the conditions in Table 6-18. Under these additional conditions, the processor performs 32-bit transfers instead of 64- or 48-bit transfers: PMODE=001 (16-bit external to 32-bit internal) or 100 (32-bit external to 32-bit internal), and IIEPx is not aligned to a 64-bit boundary, or IMEPx is < -1 or > 1, or CEPx is < 2, or EPBx depth < 2, or INT32=1, and DTYPE=0.

Paced Master Mode

When the MASTER and HSHAKE bits are set (=1) and the EXTERN bit is cleared (=0) in the channel's DMACx register, the DMA channel is in Paced Master mode. A channel in this mode can independently initiate internal or external memory transfers. Paced Master mode applies only to external port DMA channels 11 and 12.

In Paced Master mode, the processor has the same control for address generation and transfer size as in master mode. For more information, see Master Mode on page 6-50. The difference between these modes is that in Paced Master mode external transfers are controlled and initiated (paced) by the DMARx signal, as in handshake mode. For more information, see Handshake Mode on page 6-57.

The processor responds to the DMARx request only with the RD or WR strobes, depending on direction and data alignment. DMAGx is not asserted in Paced Master mode. This method lets the processor share the same buffer between the I/O processor and processor core without external gating. Paced Master mode accesses can be extended by the ACK input, by waitstates programmed in the WAIT register, and by holding the DMARx input low.
Slave Mode

When the MASTER, HSHAKE, and EXTERN bits in the channel's DMACx register are cleared (=0), the DMA channel is in slave mode. A channel in this mode cannot independently initiate external memory transfers.

To initiate a slave mode DMA transfer, an external device must read or write the channel's EPBx buffer. A slave mode DMA channel performing internal to external data transfer automatically performs enough transfers from internal memory to keep the EPBx buffer full. When the data transfer direction is external to internal, a slave mode DMA channel does not initiate any internal DMA transfers until the external device writes data to the channel's EPBx buffer.

Note that the I/O processor does not use the EIEPx, EMEPx, and ECEPx registers in slave mode DMA.

The following sequence describes a typical external to internal slave mode DMA operation where an external device transfers a block of data into the processor's internal memory:

1. The external device initializes the channel by writing the DMA channel's parameter registers (IIEPx, IMEPx, and CEPx) and DMACx control register.
2. The external device begins writing data to the EPBx buffer.
3. The EPBx buffer detects that data is present and asserts an internal DMA request to the I/O processor.
4. The I/O processor grants the request and performs the internal DMA transfer, emptying the EPBx buffer FIFO.

If the internal DMA transfer is held off, the external device can continue writing to the EPBx buffer because of its eight-deep FIFO. When the EPBx FIFO becomes full, the processor holds off the external device with the ACK signal (for synchronous accesses) or with the REDY signal (for asynchronous, host-driven accesses). This hold-off state continues until the I/O processor finishes the internal DMA transfer, freeing space in the EPBx buffer.

The following sequence describes a typical internal to external slave mode DMA operation where an external device transfers a block of data from the processor's internal memory:

1. The external device writes the DMA channel's parameter registers (IIEPx, IMEPx, and CEPx) and DMACx control register, initializing the channel and automatically asserting an internal DMA request to the I/O processor.
2. The I/O processor grants the request and performs the internal DMA transfer, filling the EPBx buffer's FIFO.
3. The external device begins reading data from the EPBx buffer.
4. The EPBx buffer detects that there is room in the buffer (it is now partially empty) and asserts another internal DMA request to the I/O processor, continuing the process.

If the internal DMA transfers cannot fill the EPBx FIFO buffer at the same rate as the external device empties it, the processor holds off the external device with the ACK signal (for synchronous accesses) or with the REDY signal (for asynchronous, host-driven accesses) until valid data can be transferred to the EPBx buffer.

The processor only deasserts the ACK (or REDY) signal when the EPBx FIFO buffer (or pad data buffer) is full during a write. The ACK (or REDY) signal remains asserted at the end of a completed block transfer if the EPBx buffer is not full. For reads, the processor deasserts the ACK (or REDY) signal for each read to handle the latency of the read versus posting the write to a buffer.
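The write-side hold-off behavior described above can be modeled in a few lines. This is a behavioral Python sketch only, assuming the eight-deep FIFO stated in the text; the class name EpbFifo and its methods are hypothetical, and signal timing is not modeled.

```python
from collections import deque


class EpbFifo:
    """Toy model of an EPBx buffer receiving slave mode writes."""

    DEPTH = 8  # eight-deep FIFO, per the text above

    def __init__(self):
        self.words = deque()

    @property
    def ack(self):
        """True while ACK (or REDY) stays asserted, i.e. the FIFO is not full."""
        return len(self.words) < self.DEPTH

    def external_write(self, word):
        """External device writes one word; returns False if it is held off."""
        if not self.ack:
            return False
        self.words.append(word)
        return True

    def internal_dma_transfer(self):
        """I/O processor moves the oldest word to internal memory.

        Returns the word, or None if the buffer is empty.
        """
        return self.words.popleft() if self.words else None
```

In this model the external device is held off on the ninth write until internal_dma_transfer() frees an entry, which mirrors the hold-off state described for a full FIFO.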
In slave mode, the processor determines the size of the transfer based on the setting of the channel's PMODE bits. Table 6-19 shows the transfer size in slave mode that results from the PMODE bits, and Table 6-7 on page 6-36 shows the packing mode selected by the PMODE bits.

Table 6-19. Slave Mode Transfer Size Determination
Transfer Size (external to internal)   PMODE   DTYPE
32-bit to 32/64-bit                    100     0
32-bit to 48-bit                       011     1
16-bit to 32/64-bit(1)                 001     0
16-bit to 48-bit(1)                    010     1
8-bit to 32/64-bit(2)                  110     0
8-bit to 48-bit(2)                     101     1

1 External device must be connected to DATA[31:16].
2 External device must be connected to DATA[23:16].
Handshake Mode

When the MASTER and EXTERN bits are cleared (=0) and the HSHAKE bit is set (=1) in the channel's DMACx register, the DMA channel is in handshake mode. A channel in this mode cannot independently initiate external memory transfers. Note that handshake mode only applies to DMA channels 11 and 12.

To initiate a handshake mode DMA transfer, an external device must assert an external DMA request, asserting DMAR1 for access to EPB1 or DMAR2 for access to EPB2. The buffers pass these requests to the I/O processor, which prioritizes them with other internal DMA requests. When the external DMA request has the highest priority, the I/O processor asserts an external DMA grant, asserting DMAG1 for EPB1 or DMAG2 for EPB2. The grant signals the external device to read or write the EPBx buffer.

A handshake mode DMA channel performing internal to external data transfer automatically performs enough transfers from internal memory to keep the EPBx buffer full. When the data transfer direction is external to internal, a handshake mode DMA channel does not initiate any internal DMA transfers until the external device writes data to the channel's EPBx buffer.

The I/O processor does not use the EIEPx or EMEPx registers in handshake mode DMA. It does use the ECEPx registers.

Other than the DMARx/DMAGx handshake, handshake mode DMA operations follow almost the same process as slave mode DMA operations. The exception is that in handshake mode DMAs from internal to external memory, the external device must load the channel's ECEPx register with the number of external bus transfers.

In handshake mode, the processor determines the size of the transfer from the channel's parameter registers and PMODE bits. Table 6-7 on page 6-36 shows the packing mode selected by the PMODE bits, and Table 6-20 shows the transfer size in handshake mode that results from the combination of the read and write signals and PMODE bits.

Table 6-20. Handshake Mode Transfer Size Determination
Transfer Size (external to internal)   PMODE   IIEPx   IMEPx   CEPx                 ECEPx               DTYPE
32-bit to 32/64-bit(1)                 100     X(4)    X       # of 32-bit words    # of 32-bit words   0
32-bit to 48-bit(2)                    011     X       X       # of 32-bit words    6/4 * CEPx          1
16-bit to 32/64-bit(2)                 001     X       X       # of 16-bit words    2 * CEPx            0
16-bit to 48-bit(2)                    010     X       X       # of 16-bit words    3 * CEPx            1
8-bit to 32/64-bit(3)                  110     X       X       # of 8-bit words     4 * CEPx            0
8-bit to 48-bit(2)                     101     X       X       # of 8-bit words     6 * CEPx            1

1 External device must be connected to the upper half of the data bus (DATA[47:16]).
2 External device must be connected to DATA[31:16].
3 External device must be connected to DATA[23:16].
4 X indicates any legal value.
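The ECEPx column of Table 6-20 is a fixed multiple of CEPx for each packing mode. The Python sketch below computes the external count an external device would load; the names ECEP_PER_CEP and ecep_from_cep are hypothetical, and the 1:1 ratio for PMODE=100 is an assumption drawn from both columns reading "# of 32-bit words".

```python
from fractions import Fraction

# ECEPx / CEPx ratio per PMODE, taken from the ECEPx column of Table 6-20.
# The 100 entry (1:1) is an assumption; the table gives no explicit factor.
ECEP_PER_CEP = {
    0b100: Fraction(1),
    0b011: Fraction(6, 4),
    0b001: Fraction(2),
    0b010: Fraction(3),
    0b110: Fraction(4),
    0b101: Fraction(6),
}


def ecep_from_cep(pmode, cep):
    """Return the external transfer count for an internal count CEPx."""
    count = ECEP_PER_CEP[pmode] * cep
    if count.denominator != 1:
        # e.g. PMODE=011 with an odd CEPx does not give a whole count
        raise ValueError("CEPx does not map to a whole number of external transfers")
    return int(count)
```

As a cross-check, ecep_from_cep(0b101, 0x100) gives 0x600, the same 256 words x 6 bytes/word figure used for EPROM booting elsewhere in this chapter.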
DMA transfers are supported at the full system CLKIN/CLKOUT rate of 50 MHz. However, full bandwidth at a 2:1 core clock (CCLK) to CLKIN/CLKOUT ratio is not possible. Nonsynchronous timing specifications limit throughput for three DMA handshake modes: paced master mode, handshake mode, and external-handshake mode. The rate at which the internal circuitry of the ADSP-21161 processor samples the DMARx signal prohibits maximum throughput at a CCLK to CLKIN/CLKOUT ratio of 2:1.

For handshake mode DMA, the processor does not assert the MS3-0 memory select lines (the address strobes). For information on DMARx/DMAGx handshake timing, see Figure 6-10.

CCLK to CLKIN ratios of 3:1 and 4:1 with CLKDBL=1, and CCLK to CLKIN ratios of 4:1, 6:1, and 8:1 with CLKDBL=0, support full speed throughput at the CLKIN frequency. If the maximum DMARx/DMAGx throughput at 50 MHz is needed, synchronize the assertions and deassertions of DMARx with respect to CLKOUT. Refer to the ADSP-21161N DSP Microcomputer Data Sheet for specific timing information.
[Figure 6-10 (DMARx/DMAGx handshake timing): timing diagram showing CLKIN, DMARx, DMAGx, and four data-valid intervals. Annotations: The DMAR rising edge allows the first DMAG to complete. DMAG has a wait state because DMAR remained asserted in the cycle prior to the DMAG assertion. The DMA device must place data in the buffer prior to the DMAG falling edge if there is no wait state. The DMA device need not provide data until this cycle if there is a wait state.]
The I/O processor uses the rising and falling edges of DMARx in the DMARx/DMAGx handshake as prompts for DMA operations. The falling edge of DMARx signals the I/O processor to begin a DMA access. The rising edge of DMARx signals the I/O processor to complete the DMA access.

The following sequence describes the process for requesting access to an EPBx buffer in handshake mode:

1. The external device asserts the buffer's DMARx signal, placing an external DMA request for access to the EPBx buffer.
2. The EPBx buffer detects the falling edge of the DMARx signal and passes the external DMA request to the I/O processor, synchronizing the DMA operation with the processor's system clock. To be recognized in a particular cycle, the DMARx low transition must meet the signal setup time from the processor data sheet. If the transition does not meet the setup time, the signal may not take effect until the following cycle.
3. The I/O processor prioritizes the external DMA request with other internal DMA requests. If the processor is not already bus master, the processor arbitrates for the external bus when the external DMA request has the highest priority, unless the EPBx buffer is blocked. If the EPBx buffer is full during a write or empty during a read, the buffer is blocked. The processor does not begin external bus arbitration until the I/O processor services the EPBx buffer, returning it to the unblocked state (empty for writing or full for reading).
4. The processor becomes bus master and asserts DMAGx. The processor keeps DMAGx asserted until the cycle after the external device deasserts DMARx. By holding DMARx asserted, the external device holds the processor until the external device is ready to proceed. If the external device does not need to extend the DMA grant cycle, the external device can deassert DMARx immediately (not waiting for DMAGx), provided the DMARx assertion time meets the timing requirements from the processor data sheet. The responding DMAGx in this case is a short pulse, and the processor only uses the external bus for one cycle.

The I/O processor has a three-cycle DMA pipeline and a seven-deep external request counter. The I/O processor's DMA pipeline is similar to the program sequencer's fetch-decode-execute instruction pipeline. The I/O processor processes the DMA pipeline in the following stages:

• It recognizes the DMA request and arbitrates internal DMA priority during the DMA fetch cycle.
• It generates the DMA address and arbitrates external bus access during the DMA decode cycle.
• It transfers DMA data during the DMA execute cycle.

Because the I/O processor has a three-cycle DMA pipeline, there is a minimum delay of three cycles before the processor asserts DMAGx. This delay is in addition to any delay from internal DMA arbitration, so the external device must not assume that the DMA grant can arrive within two cycles, even if higher priority DMA operations are disabled and the external bus is available for the transfer.

The I/O processor's external request counter increments each time the external device asserts DMARx and decrements each time the processor replies by asserting DMAGx. The external request counter records up to seven requests, so the external device can make up to seven requests before the first one has been serviced.
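The external request counter's bookkeeping can be sketched as follows. This Python model is illustrative only: the class name ExternalRequestCounter and the overrun flag are hypothetical, and the flag simply marks a request the seven-deep counter could not record rather than modeling what the hardware does in that case.

```python
class ExternalRequestCounter:
    """Toy model of the seven-deep external request counter."""

    MAX_PENDING = 7  # the counter records up to seven requests

    def __init__(self):
        self.pending = 0
        self.overrun = False  # hypothetical flag: a request was not recorded

    def dmar_asserted(self):
        """External device asserts DMARx: one more request is pending."""
        if self.pending >= self.MAX_PENDING:
            self.overrun = True
        else:
            self.pending += 1

    def dmag_asserted(self):
        """Processor asserts DMAGx: one pending request is serviced."""
        if self.pending == 0:
            raise RuntimeError("DMAGx is only asserted for recorded requests")
        self.pending -= 1
```

An external device driver can use the same counting scheme on its side to make sure it never has more than seven requests outstanding.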
If the processor cannot immediately service the DMA requests in the external request counter, the processor services the requests on a prioritized basis. The external DMA device is responsible for keeping track of requests, monitoring grants, and pipelining the data when operating at full speed.

If the external device makes more than seven DMARx requests without receiving a grant, the delayed grant causes unpredictable results. The processor only asserts DMAGx for the number of DMARx requests indicated by the external request counter. If the external device makes more requests than the count indicates, the processor's DMAGx assertions cannot match the number of external device requests. To clear this mismatch, programs can clear the buffer and the external request counter using the flush bit (FLSH) in the channel's DMACx register.

To prevent holding off the processor, the external device must service the processor's data requirements when the processor asserts the DMAGx grant signal. The external device should immediately supply data for writes to the processor or immediately accept data on reads from the processor. External interfaces can handle this I/O by placing the data in an external FIFO.

When performing DMA operations at the full CLKIN speed of the processor, the system may need a three-deep external FIFO to handle the latency between request and grant. Programs on the external device can optimize operation of this FIFO by issuing three requests rapidly and making the next requests conditional on when the processor issues a grant.

The external devices must follow the conditions in Figure 6-11 when enabling or disabling handshake mode for an external port DMA channel:

• The processor ignores the DMARx and DMAGx pins of a disabled DMA channel. When the channel transitions from disabled to enabled, the processor ignores internal DMARx assertions for up to two processor core clock cycles after the instruction that enables the channel in handshake mode.
• The external devices must keep DMARx deasserted (held high, not low or changing) during the instruction that enables DMA in handshake mode.
• Before using the channel for the first time, programs must flush the DMA channel by setting the FLSH bit in the DMACx control register. This action is not required during chain insertion.
• The processor deasserts DMAGx if a program disables the channel while DMARx and DMAGx are asserted (=0). This action clears the channel's active status bit, avoiding a potential deadlock condition.
[Figure 6-11. DMARx Delay After Enabling Handshake DMA: timing diagram showing CCLK (core clock), the executing instruction sequence, and DMARx, with the interval in which DMARx is ignored.]

Figure 6-11. DMARx Delay After Enabling Handshake DMA

ADSP-21161 processors in a multiprocessing cluster may share a DMAGx signal, because only the bus master drives DMAGx. On the bus slaves, DMAGx is three-stated. This state eliminates the need for external gating if more than one processor or the host needs to drive the DMA buffer. Systems may need a pullup resistor on this line if the host is not connected to the pin or does not drive it when it acquires the bus.

DMAGx has the same timing and transitions as the RD and WR strobes in asynchronous access mode. For more information, see Bus Arbitration Protocol on page 7-95. DMAGx responds to the SBTS and HBR signals in the same way as the read and write strobes.
[Figure 6-12. DMA Handshake Idle Cycle: timing diagram showing CLKIN, DMAR1, DMAG1, MS0, RD, WR, ADDR[23:0] (0x255000, 0x255001, 0x255002), and DATA[47:16] (three data-valid intervals).]

Figure 6-12. DMA Handshake Idle Cycle

... no more DMAR1 requests pending. Therefore, during the idle cycle between the second and third transfers, the MS0 line goes high. MS0 goes low again when the third data transfer occurs. Systems must be evaluated to determine whether the idle cycle during an external-handshake DMA with an activated MSx line has an adverse impact on the chip-selected memory devices or peripherals. The RD, WR, and DMAG strobes are inactive during the idle cycle, so the activated MSx lines should not affect interconnection to other devices as long as RD and WR remain inactive. Otherwise, an idle cycle insertion between DMA handshake transfers cannot be used.
External-Handshake Mode

External-handshake mode is identical to handshake mode, except that external-handshake mode transfers data between external memory and an external device. This section describes the differences between handshake mode and external-handshake mode. For more information, see Handshake Mode on page 6-57.

When the MASTER bit is cleared (=0) and the HSHAKE and EXTERN bits are set (=1) in the channel's DMACx register, the DMA channel is in external-handshake mode. A channel in this mode cannot independently initiate external memory transfers. Like handshake mode, external-handshake mode only applies to DMA channels 11 and 12.

Do not use external-handshake mode DMA on an external memory bank that has SDRAM mapped and connected to its MSx line.

To initiate an external-handshake mode DMA transfer, an external device must assert an external DMA request, asserting DMAR1 for access to DMA channel 11 or DMAR2 for access to DMA channel 12. The channels pass these requests to the I/O processor, which prioritizes them with other internal DMA requests. When the external DMA request has the highest priority, the I/O processor asserts an external DMA grant, asserting DMAG1 for channel 11 or DMAG2 for channel 12. The grant signals the external device to read or write the external bus.

An external-handshake mode DMA channel performing external to external data transfer automatically generates external memory addresses and strobes for transfers between external memory and the external device.

Unlike handshake mode, the I/O processor must use the EIEPx, EMEPx, and ECEPx registers in external-handshake mode DMA. Also unlike handshake mode, the data for DMA channels 11 and 12 does not pass through the EPB1 or EPB2 buffers.
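The bit combinations that select each external port DMA mode can be collected into one lookup. The Python sketch below does this with plain 0/1 flags rather than register bit positions; the function names are hypothetical, and the master mode entry (MASTER=1, HSHAKE=0, EXTERN=0) is an assumption, since master mode is defined outside this section.

```python
# (MASTER, HSHAKE, EXTERN) -> mode name, per the mode descriptions in this chapter.
DMA_MODES = {
    (1, 0, 0): "master",              # assumed; defined in the Master Mode section
    (1, 1, 0): "paced master",
    (0, 0, 0): "slave",
    (0, 1, 0): "handshake",
    (0, 1, 1): "external-handshake",
}

# Modes that rely on DMARx apply only to DMA channels 11 and 12.
DMAR_ONLY_CHANNELS = {11, 12}
DMAR_MODES = {"paced master", "handshake", "external-handshake"}


def dma_mode(master, hshake, extern):
    """Return the mode name for the three DMACx mode bits."""
    key = (int(bool(master)), int(bool(hshake)), int(bool(extern)))
    return DMA_MODES.get(key, "not described in this section")


def mode_allowed_on_channel(mode, channel):
    """True if the named mode may be used on the given DMA channel number."""
    if mode in DMAR_MODES:
        return channel in DMAR_ONLY_CHANNELS
    return True
```

For example, dma_mode(0, 1, 1) returns "external-handshake", and mode_allowed_on_channel("handshake", 10) is False because channel 10 has no DMARx input.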
During external-handshake mode transfers, the I/O processor generates external memory access cycles. DMARx and DMAGx operate the same as in handshake mode, but the processor also outputs addresses, MS3-0 memory selects, and RD and WR strobes, and responds to ACK. On external memory writes, the processor asserts DMAGx until the external device releases the ACK line or any of the processor waitstates expire. The external memory responds to the access as if the processor core were making the access. For more information, see External Port on page 7-1.

Because the I/O processor accesses external memory in external-handshake mode, programs must load the DMA channel's EIEPx, EMEPx, and ECEPx parameter registers and the DMAC10 or DMAC11 PMODE bits. These settings let the I/O processor generate the external memory addresses and word count.

External-handshake mode does not support chained DMA interrupts. Because no internal DMA transfers occur in external-handshake mode, the PCI bit in the channel's CPEPx register cannot disable the DMA interrupt. Programs must use the IMASK register to mask this interrupt.

In external-handshake mode, the processor does not perform packing. The processor does determine the size of the transfer from the channel's parameter registers and PMODE bits. Table 6-21 shows the transfer size in external-handshake mode that results from the combination of the read and write signals and PMODE bits. For 32-bit memory transfers to an external device, PMODE must be set to the no packing mode (=100) in the DMACx register.
1 External device must be connected to the upper half of the data bus (DATA[47:16]).
2 X indicates any legal value.
The following sequence describes a typical external to internal DMA operation where an external device transfers a block of data into the processor's internal memory:

1. The processor or host (depending on the mode) writes to the DMA channel's parameter registers (IIEPx, IMEPx, and CEPx) and the DMACx register, initializing the channel for receive (TRAN=0).
2. The processor or host (depending on the mode) sets the channel's DEN bit to 1, enabling the DMA process.
3. The external device begins writing data to the EPBx buffer through the external port.
4. The EPBx buffer detects that data is present and asserts an internal DMA request to the I/O processor.
5. The I/O processor grants the request and performs the internal DMA transfer, emptying the EPBx buffer FIFO.

The following sequence describes a typical internal to external DMA operation where an external device transfers a block of data from the processor's internal memory:

1. The processor or host (depending on the mode) writes the DMA channel's parameter registers (IIEPx, IMEPx, and CEPx) and the DMACx register, initializing the channel for transmit (TRAN=1).
2. The processor or host (depending on the mode) sets the channel's DEN bit to 1, enabling the DMA process. Because this is a transmit, setting DEN automatically asserts an internal DMA request to the I/O processor.
3. The I/O processor grants the request and performs the internal DMA transfer, filling the EPBx buffer's FIFO. The processor may signal the start of this transfer, depending on the mode.
4. The external device begins reading data from the EPBx buffer through the external port. The processor may signal the start of this transfer, depending on the mode.
5. The EPBx buffer detects that there is room in the buffer because it is now partially empty and asserts another internal DMA request to the I/O processor, continuing the process.
... boots from a host through the external port. For a list showing how to select different boot modes, see the Boot Memory Select pin description in Table 13-11 on page 13-72.

When using any of the power-up booting modes, address 0x0004 0004 should not contain a valid instruction, since it is not executed during the booting sequence. A NOP or IDLE instruction should be placed at this location.

In EPROM booting through the external port, an 8-bit wide boot EPROM must be connected to data bus pins 23-16 (DATA23-16). The lowest address pins of the processor should be connected to the EPROM's address lines. The EPROM's chip select should be connected to BMS, and its output enable should be connected to RD.

In a multiprocessor system, the BMS output is only driven by the ADSP-21161 bus master. This allows wire-ORing of multiple BMS signals for a single common boot EPROM. Systems can boot any number of ADSP-21161s from a single EPROM, using the same code for each processor or differing code for each.

During reset, the processor's ACK line is internally pulled high with a 20 kΩ equivalent resistor and is held high with an internal keeper latch. It is not necessary to use an external pullup resistor on the ACK line during booting or at any other time.

After the boot process loads 256 words into memory locations 0x4 0000 through 0x4 00FF, the processor begins executing instructions. Because most processor programs require more than 256 words of instructions and initialization data, the 256 words typically serve as a loading routine for the application. Analog Devices supplies loading routines (loader kernels) that can load entire programs. These routines come with the development tools. For more information on loader kernels, see the development tools documentation.
Host Processor Booting

When host booting mode is configured, the ADSP-21161 enters slave mode after reset and waits for the host to download the boot program. After reset, the ADSP-21161 processor goes into an idle state, identical to that caused by the IDLE instruction, with the program counter (PC) set to address 0x0004 0004. The parameter registers for external port DMA channel 10 are initialized as shown in Table 6-22.

Table 6-22. DMA Channel 10 Parameter Register Initialization for Host Booting
Parameter Register   Initialization Value
IIEP0                0x0004 0000
IMEP0                uninitialized (increment by 1 is automatic)
CEP0                 0x0100 (256 instruction words)
CPEP0                uninitialized
GPEP0                uninitialized
EIEP0                uninitialized
EMEP0                uninitialized
ECEP0                uninitialized
Table 6-22 shows how the DMA channel 10 parameter registers are initialized at reset for host booting. The count register (CEP0) is initialized to 0x0100 for transferring 256 words to internal memory. The DMAC10 control register is initialized to 0x0000 0161. The default value sets up external port transfers as follows:

DEN = 1, external port enabled
MSWF = 0, LSW first
PMODE = 101, 8- to 48-bit packing
DTYPE = 1, three-column data
The external port DMA Channel 10 (DMAC10) becomes active following reset; it is initialized to 0x0000 0161. This enables the external port DMA and selects DTYPE for instruction words. The packing mode bits (PMODE) in the DMACx register are set to 8- to 48-bit packing. The host bus width (HBW) and word order (HMSWF) bits must be programmed in the SYSCON register.

For each 48-bit word of boot image, an 8-bit host performs the following sequence of operations:

1. Assert HBR and CS.
2. Wait for HBG. After the host receives the host bus grant signal back from the ADSP-21161 processor, it can start downloading instructions, or it can change the reset initialization conditions of the ADSP-21161 processor by writing to any of the IOP control registers.
3. Write the six subwords to the external port buffer, EPB0. This buffer corresponds to DMA channel 10. The host must use data pins DATA23-16.
4. Deassert CS and HBR. The processor samples the inactive HBR and allows a host transition cycle. The processor can access the bus for external memory initialization.

For 16-bit and 32-bit host bus widths, the HBW bits in the SYSCON register must be modified. The host must use the data lines as follows:

• 16-bit host bus width = 3 subwords using data pins DATA31-16
• 32-bit host bus width = 2 subwords using data pins DATA47-16
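To illustrate how subwords combine into one 48-bit boot word, the Python sketch below packs six 8-bit or three 16-bit subwords. It assumes the first subword written is the least significant; the actual order depends on the HMSWF setting in SYSCON. The function name pack_48bit_word is hypothetical, and the 32-bit host case is left out because its two subwords do not tile 48 bits evenly.

```python
def pack_48bit_word(subwords, width):
    """Pack 8-bit or 16-bit subwords into one 48-bit word.

    Assumption: subwords[0] is the least significant subword.
    """
    if width not in (8, 16):
        raise ValueError("this sketch handles 8-bit and 16-bit hosts only")
    expected = 48 // width  # six subwords for 8-bit, three for 16-bit
    if len(subwords) != expected:
        raise ValueError(f"expected {expected} subwords, got {len(subwords)}")
    word = 0
    for position, subword in enumerate(subwords):
        if not 0 <= subword < (1 << width):
            raise ValueError("subword does not fit the host bus width")
        word |= subword << (position * width)
    return word
```

A host-side boot loader could use a routine like this in reverse to split each instruction word of the boot image into the subwords it writes to EPB0.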
EPROM Booting

When the EPROM boot mode is configured, the external port DMA Channel 10 (DMAC10) becomes active following reset; it is initialized to 0x0000 0561. This enables the external port DMA and selects DTYPE for instruction words. 8- to 48-bit packing is forced, with least-significant word first. The RBWS and RBAM fields of the WAIT register are initialized to perform asynchronous access and to generate seven wait states (eight cycles total) for the EPROM access in external memory space. Note that wait states defined for boot memory are applied to BMS-asserted accesses.

Table 6-23 shows how the DMA channel 10 parameter registers are initialized at reset for EPROM booting. The count register (CEP0) is initialized to 0x0100 for transferring 256 words to internal memory. The external count register (ECEP0), which is used when external addresses are generated by the DMA controller, is initialized to 0x0600 (that is, 0x0100 words with six bytes per word).

The DMAC10 control register is initialized to 0x0000 0561. The default value sets up external port transfers as follows:
DEN = 1, external port enabled
MSWF = 0, LSW first
PMODE = 101, 8- to 48-bit packing
DTYPE = 1, three-column data
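The reset values quoted for DMAC10 (0x0000 0161 for host booting and 0x0000 0561 for EPROM booting) can be decoded field by field. The Python sketch below uses bit positions inferred from those two values and from the field settings listed in this chapter (DEN at bit 0, TRAN at bit 2, DTYPE at bit 5, PMODE at bits 8-6, MASTER at bit 10); treat them as assumptions and confirm them against the register description before relying on them.

```python
# Bit positions inferred from the reset values and listings in this chapter;
# they are assumptions, not copied from a register map.
DEN = 1 << 0
TRAN = 1 << 2
DTYPE = 1 << 5
PMODE_SHIFT = 6
PMODE_MASK = 0b111 << PMODE_SHIFT
MASTER = 1 << 10


def decode_dmac(value):
    """Split a DMACx control word into the fields discussed in this section."""
    return {
        "DEN": bool(value & DEN),
        "TRAN": bool(value & TRAN),
        "DTYPE": bool(value & DTYPE),
        "PMODE": format((value & PMODE_MASK) >> PMODE_SHIFT, "03b"),
        "MASTER": bool(value & MASTER),
    }
```

Decoding 0x561 and 0x161 with this helper shows that the two boot modes differ only in the MASTER field, which matches EPROM booting using master mode DMA while host booting waits for the host.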
Table 6-23. DMA Channel 10 Parameter Register Initialization for EPROM Booting
Parameter Register   Initialization Value
IIEP0                0x0004 0000
IMEP0                uninitialized (increment by 1 is automatic)
CEP0                 0x0100 (256 instruction words)
CPEP0                uninitialized
GPEP0                uninitialized
EIEP0                0x0080 0000
EMEP0                uninitialized (increment by 1 is automatic)
ECEP0                0x0600 (256 words x 6 bytes/word)
At system start-up, when the processor's RESET input goes inactive, the following sequence occurs:

1. The processor goes into an idle state, identical to that caused by the IDLE instruction. The program counter (PC) is set to address 0x0004 0004.
2. The DMA parameter registers for channel 10 are initialized as shown in Table 6-23.
3. The BMS pin selects the boot EPROM.
4. 8-bit master mode DMA transfers from EPROM to internal memory begin on external port data bus lines 23-16.
5. The external address lines (ADDR23-0) start at 0x0080 0000 and increment after each access.
6. The RD strobe asserts as in a normal memory access with seven wait states (eight cycles).
The processor's DMA controller reads the 8-bit EPROM words, packs them into 48-bit instruction words, and transfers them to internal memory until 256 words have been loaded. The EPROM is automatically selected by the BMS pin; other memory select pins are disabled. The DMA external count register (ECEP0) decrements after each EPROM transfer. When ECEP0 reaches zero, the following wake-up sequence occurs:

1. The DMA transfers stop.
2. The External Port DMA Channel 10 interrupt (EP0I) is activated.
3. BMS is deactivated and normal external memory selects are activated.
4. The processor vectors to the EP0I interrupt vector at 0x0004 0050.

At this point the processor has completed its booting mode and is executing instructions normally. The first instruction at the EP0I interrupt vector location, address 0x0004 0050, should be an RTI (Return from Interrupt). This instruction returns execution to the reset routine at location 0x0004 0005, where normal program execution can resume. After reaching this point, a program can write a different service routine at the EP0I vector location 0x0004 0050.
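The address and count bookkeeping of the EPROM boot load can be traced with a short simulation. This Python sketch is a behavioral illustration only: the function name simulate_eprom_boot and the callable EPROM model are hypothetical, and it assumes the first byte read becomes the least significant byte of each 48-bit word (LSW first).

```python
BOOT_BASE = 0x40000      # first internal address loaded (IIEP0)
EPROM_BASE = 0x800000    # first external address read (EIEP0)
WORDS = 0x100            # CEP0: 256 instruction words
BYTES_PER_WORD = 6       # 8- to 48-bit packing


def simulate_eprom_boot(eprom):
    """Trace the boot load; eprom is a callable mapping address -> byte.

    Returns (memory, next_external_address), where memory maps internal
    addresses to packed 48-bit words.
    """
    ecep = WORDS * BYTES_PER_WORD  # ECEP0 starts at 0x600
    ext_addr = EPROM_BASE
    memory = {}
    for index in range(WORDS):
        word = 0
        for position in range(BYTES_PER_WORD):
            byte = eprom(ext_addr) & 0xFF
            word |= byte << (8 * position)  # assumed: first byte is least significant
            ext_addr += 1                   # external address increments per access
            ecep -= 1                       # ECEP0 decrements per EPROM transfer
        memory[BOOT_BASE + index] = word
    if ecep != 0:
        raise RuntimeError("external count should reach zero when loading ends")
    return memory, ext_addr
```

Running the simulation against a synthetic image (for example, one whose byte value equals the low byte of its address) shows 0x600 EPROM reads filling internal locations 0x4 0000 through 0x4 00FF.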
.VAR source[N]= 0x11111111, 0x22222222, 0x33333333, 0x44444444, 0x55555555, 0x66666666, 0x77777777, 0x88888888; .SECTION/DM .VAR dest[8]; /*___________start of DMA initialization routine___________________*/ .SECTION/PMpm_code; init_int_to_ext_memory_DMA: segsdram;
6-77
    /* Clear DMA Control Register */
    /* Write source address to IIEP0 */
    /* Write internal address modify to IMEP0 */
    /* Load internal DMA 10 Count Register */
    /* Write destination address to EIEP0 register */
    /* Write external address modify to EMEP0 */
    /* Load external DMA 10 Count Register */
    /* master mode, no packing mode [PMODE=100] */
    /* transmit data from int>ext, enable EP DMA */
    /* DMAC10=b#00000000000000000000010100000101; */
    ustat1 = 0x00000000;
    bit set ustat1 MASTER | PMODE4 | TRAN | DEN;
    dm(DMAC10)=ustat1;
    bit set imask EP0I;    /* Unmask external port buffer 0 DMA interrupt */
    rts;
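The binary value in the DMAC10 comment above can be checked against the four mode flags the listing sets. The Python sketch below assumes a bit assignment (DEN at bit 0, TRAN at bit 2, PMODE in bits 8-6, MASTER at bit 10) that is inferred from the comment's bit pattern rather than quoted from the register description, so treat the positions as an assumption.

```python
# Assumed bit positions, inferred from the b#...010100000101 comment.
DEN = 1 << 0                   # DMA enable
TRAN = 1 << 2                  # transmit (internal to external)
PMODE_NO_PACKING = 0b100 << 6  # PMODE=100, no packing
MASTER = 1 << 10               # master mode

dmac10_value = MASTER | PMODE_NO_PACKING | TRAN | DEN
print(f"{dmac10_value:032b}")
```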
.GLOBAL int_to_ext_memory_chainDMA;
.SECTION/DM dm_data;
.VAR source[N] = 0x11111111, 0x22222222, 0x33333333, 0x44444444,
                 0x55555555, 0x66666666, 0x77777777, 0x88888888;
.VAR tcb[8] = N,    /* ECx */
              1,    /* EMx */
              0,    /* EIx */
              0,    /* GPx */
              0,    /* CPx */
              N,    /* Cx */
              1,    /* IMx */
              0;    /* IIx */
segsdram;
/*________start of DMA initialization routine__________*/
.SECTION/PM pm_code;
int_to_ext_memory_chainDMA:
    r0=source; dm(tcb + 7) = r0;    /* Write Source1 address to II tcb_a */
    r0=dest; dm(tcb + 2) = r0;      /* Write Dest1 address to EI slot in tcb_a */
    r0=tcb + 7;
    r1= b#10000000000000000000;
    r0=r0 or r1;                    /* set PCI Bit */
    dm(tcb + 4) = r0;               /* Write tcb address to CP slot in tcb */
    r0=0; dm(DMAC10)=r0;            /* Clear DMA Control Register */
    /* dma enable, Chain enable, int>ext, master mode */
    r0=b#00000000000000000000010100000111;
    dm(DMAC10)=r0;
    r0=tcb + 7; dm(CPEP0) =r0;      /* Load CP register */
    bit set imask EP0I;
    rts;
The following bits control link port I/O processor modes. The control bits in the LCTL registers have a one-cycle effect latency. Programs should not modify an active DMA channel's bits in the LCTL register other than to disable the channel by clearing the LxDEN bit. For information on verifying a channel's status with the DMASTAT register, see Using I/O Processor Status on page 6-121.
Some other bits in LCTL set up non-DMA link port features. For information on these features, see Setting Link Port Modes on page 9-5.
Link Port DMA Channel Priority Rotation Enable. SYSCON Bit 20 (LDCPR). This bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation between link port DMA channels 8 and 9.
Link/External Port DMA Channel Priority Rotation Enable. SYSCON Bit 21 (PRROT). This bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation between link port DMA channels 8 and 9 and external port DMA channels 10 to 13.
Link Port assignment for LBUFx. LCTL Bits 9-0 and 23-22 correspond to link buffer 0. LCTL Bits 19-10 and 25-24 correspond to link buffer 1.
Link Buffer Enable. LCTL Bits 0 and 10 (LxEN). This bit enables (if set, =1) or disables (if cleared, =0) the corresponding link buffer (LBUFx).
Link Buffer DMA Enable. LCTL Bits 1 and 11 (LxDEN). This bit enables (if set, =1) or disables (if cleared, =0) DMA transfers for the corresponding link buffer (LBUFx).
Link Buffer DMA Chaining Enable. LCTL Bits 2 and 12 (LxCHEN). This bit enables (if set, =1) or disables (if cleared, =0) DMA chaining for the corresponding link buffer (LBUFx).
Link Buffer Transfer Direction. LCTL Bits 3 and 13 (LxTRAN). This bit selects the transfer direction (transmit if set, =1; receive if cleared, =0) for the corresponding link buffer (LBUFx).
Link Buffer Extended Word Size. LCTL Bits 4 and 14 (LxEXT). This bit selects the transfer extended word size (48-bit if set, =1; 32-bit if cleared, =0) for the corresponding link buffer (LBUFx). Programs must not change a buffer's LxEXT setting while the buffer is enabled.
When LDCPR is set (rotating priority), high priority shifts to a new channel after each single-word transfer. The following steps illustrate this process:
1. At reset, link port channels have priority order 8, 9 from high to low.
2. The link port performs a single transfer on channel 8.
3. The I/O processor rotates channel priority, moving high priority from channel 8 to channel 9.
Even though the link port channel DMA priority can rotate, the interrupt priorities of all DMA channels are fixed.
When a program uses fixed priority for the link port DMA channels, the I/O processor assigns the higher priority to channel 8 and the lower priority to channel 9. For a list of all channel assignments, see Table 6-1 on page 6-13.
Programs can change the fixed priority order, assigning a different channel to the highest priority. The following example shows how to change the fixed priority sequence of the link port DMA channels:
1. Disable all link port DMA channels except the one immediately above the channel that is to have highest priority.
2. Select rotating priority by setting the LDCPR bit.
3. Cause at least one transfer to occur on the enabled channel.
4. Disable rotating priority and re-enable all of the link port DMA channels. The channel immediately after the selected channel now has the highest fixed priority.
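The rotation rule can be pictured with a small software model. The Python sketch below is not a description of the hardware arbiter; it assumes only what the steps above state, namely that after a single-word transfer the channel following the one that transferred becomes the highest priority. The function name is hypothetical.

```python
def rotate_after_transfer(order, channel):
    """Return the priority order (high to low) after `channel` transfers.

    Model assumption: the channel that just transferred drops to the
    lowest priority and the channel listed after it becomes the highest.
    """
    i = order.index(channel)
    return order[i + 1:] + order[:i + 1]


# Reset order is 8, 9 (high to low); one transfer on channel 8 rotates it.
order_at_reset = [8, 9]
order_after_one = rotate_after_transfer(order_at_reset, 8)
print(order_after_one)
```

In this model, the four-step procedure for choosing a new fixed order amounts to calling `rotate_after_transfer` once with only the channel above the desired one enabled, then freezing the result by clearing LDCPR.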
Programs can also rotate priority between the link port and external port DMA channels. The DMA Channel Priority Rotation Enable (PRROT) bit enables (rotates if set, =1) or disables (fixed if cleared, =0) priority rotation between link port DMA channels 8 and 9 and external port DMA channels 10 to 13. Rotating priority distributes the link port and external port DMA channels' access to the I/O bus. When channel priority is rotating, the processor arbitrates I/O bus access between contending link port and external port DMA channels, forcing the channel types to take turns. When channel priority is fixed, any link port DMA channel always has priority over any external port DMA channel when contending for I/O bus access.
Because link ports are bidirectional, the I/O processor uses the Link Transmit select (LxTRAN) bit to determine the transfer direction (transmit or receive). Data flows from internal to external memory when in transmit mode. In transmit mode, the I/O processor fills the channel's LBUFx buffer when the channel's LxDEN bit is set. The Link Extended Word Size (LxEXT) bit determines how the DMA channel accesses columns of internal memory. If LxEXT is set, the data is 40- or 48-bit words, and the I/O processor makes 3-column internal memory accesses. If LxEXT is cleared, the data is 32-bit words, and the I/O processor makes 2-column internal memory accesses. For more information, see Memory Organization and Word Size on page 5-25. The LxEXT setting for the transfer overrides the Internal Memory Data Width (IMDWx) setting for the internal memory block.
3. The processor or host (depending on the mode) writes the DMA channel's parameter registers (IILBx, IMLBx, and CLBx) and LCTL control register, initializing the channel for receive (LxTRAN=0).
4. The processor or host (depending on the mode) sets (=1) the channel's LxDEN bit, enabling the DMA process.
5. The external device begins writing data to the LBUFx buffer through the link port.
6. The LBUFx buffer detects data is present and asserts an internal DMA request to the I/O processor.
7. The I/O processor grants the request and performs the internal DMA transfer, emptying the LBUFx buffer FIFO.
In general, the following sequence describes a typical internal to external DMA operation where an external device transfers a block of data from the processor's internal memory using a link port:
1. The processor or host (depending on the mode) assigns the DMA channel's link buffer to a link port using the channel's LABx bits in the LCTL register.
2. The processor or host (depending on the mode) enables the DMA channel's link buffer, setting the buffer's LxEN bit in the channel's LCTL register. The processor or host selects a word size (32- or 40/48-bits) using the LxEXT bit in the channel's LCTL register.
3. The processor or host (depending on the mode) writes the DMA channel's parameter registers (IILBx, IMLBx, and CLBx) and LCTL control register, initializing the channel for transmit (LxTRAN=1).
4. The processor or host (depending on the mode) sets (=1) the channel's LxDEN bit, enabling the DMA process. Because this is a transmit, setting LxDEN automatically asserts an internal DMA request to the I/O processor.
5. The I/O processor grants the request and performs the internal DMA transfer, filling the LBUFx buffer's FIFO.
6. The external device begins reading data from the LBUFx buffer (through the link port).
7. The LBUFx buffer detects that there is room in the buffer (it is now partially empty) and asserts another internal DMA request to the I/O processor, continuing the process.
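The LCTL settings used in these sequences can be pictured as a bit mask built from the positions given in the bit descriptions earlier in this section (LxEN at bits 0 and 10, LxDEN at 1 and 11, LxCHEN at 2 and 12, LxTRAN at 3 and 13, LxEXT at 4 and 14). The Python sketch below is a host-side illustration only; the function and argument names are hypothetical, and it leaves out the LABx port-assignment and clock-divisor fields.

```python
# Offsets of the DMA-related bits within each link buffer's LCTL field.
LXEN, LXDEN, LXCHEN, LXTRAN, LXEXT = 0, 1, 2, 3, 4
BUFFER_BASE = {0: 0, 1: 10}  # LBUF0 bits start at 0, LBUF1 bits at 10


def lctl_bits(buffer, *, enable, dma, transmit, chain=False, extended=False):
    """Return the LCTL mask for one link buffer (hypothetical helper)."""
    base = BUFFER_BASE[buffer]
    value = 0
    for flag, offset in ((enable, LXEN), (dma, LXDEN), (chain, LXCHEN),
                         (transmit, LXTRAN), (extended, LXEXT)):
        if flag:
            value |= 1 << (base + offset)
    return value


# LBUF0 transmits with DMA enabled; LBUF1 receives with DMA enabled.
lctl = (lctl_bits(0, enable=True, dma=True, transmit=True)
        | lctl_bits(1, enable=True, dma=True, transmit=False))
print(hex(lctl))
```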
The processor determines the booting mode at reset from the EBOOT, LBOOT, and BMS pin inputs. When EBOOT=0, LBOOT=1, and BMS=1, the processor boots through the link port. For a list showing how to select different boot modes, see the Boot Memory Select pin description in the table Booting Modes on page 13-72.
When using any of the power-up booting modes, address 0x0004 0004 should not contain a valid instruction since it is not executed during the booting sequence. A NOP or IDLE instruction should be placed at this location.
In link port booting, the processor gets boot data from another processor's link port or a 4-bit wide external device after system power-up. The external device must provide a clock signal to the link port assigned to link buffer 0. The clock can be any frequency, up to a maximum of the processor clock frequency. The clock's falling edges strobe the data into the link port. The most significant 4-bit nibble of the 48-bit instruction must be downloaded first.
Table 6-25 shows how the DMA channel 8 parameter registers are initialized at reset for link port booting. The count register (CLB0) is initialized to 0x0100 for transferring 256 words to internal memory. The LCTL register is overridden during link port booting to allow link buffer 0 to receive 48-bit data.
Table 6-25. DMA Channel 8 Parameter Register Initialization For Link Port Booting
Parameter Register    Initialization Value
IILB0                 0x0004 0000
IMLB0                 uninitialized (increment by 1 is automatic)
CLB0                  0x0100 (256 instruction words)
CPLB0                 uninitialized
GPLB0                 uninitialized
In systems where multiple processors are not connected by the parallel external bus, booting can be accomplished from a single source through the link ports. To simultaneously boot all of the processors, a parallel common connection should be made to link buffer 0 on each of the processors. If only a daisy chain connection exists between the processors' link ports, then each processor can boot the next one in turn. Link buffer 0 must always be used for booting.
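For a host that feeds the link port, the nibble ordering rule stated above (most significant 4-bit nibble of each 48-bit instruction first) can be sketched as follows. This Python helper is hypothetical and shows only the ordering of the twelve nibbles; it does not model the link port clock or handshake.

```python
def instruction_to_nibbles(word48):
    """Split a 48-bit instruction into 12 nibbles, most significant first."""
    if not 0 <= word48 < (1 << 48):
        raise ValueError("instruction word must fit in 48 bits")
    return [(word48 >> shift) & 0xF for shift in range(44, -4, -4)]


nibbles = instruction_to_nibbles(0x123456789ABC)
print([hex(n) for n in nibbles])
```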
    r0 = 0; dm(LCTL) = r0;
    ustat1=dm(LCTL);
    /* LCTL REGISTER-->LBUF1=TX, LBUF0=RX, 2x CLK RATE, LBUF 0 & 1 ENABLED,
       LBUF 0 & 1 -> PORT 0, DMA Enabled, DMA Chain Enabled */
    bit clr ustat1 L0TRAN | LAB0 | LAB1 | L0CLKD0 | L1CLKD0;
    bit set ustat1 L1TRAN | L1EN | L0EN | L0CLKD1 | L1CLKD1 | L0DEN | L1DEN | L0CHEN | L1CHEN;
    dm(LCTL)=ustat1;
    r1 = 0x0003FFFF;               /* CPX register mask */
    r0 = txtcb_source + 7;         /* Get DMA chaining int. mem. ptr with tx buf address */
    r0 = r1 AND r0;                /* Mask the pointer */
    r0 = BSET r0 BY 18;            /* Set the pci bit */
    dm(txtcb_source + 4) = r0;     /* Write DMA transmit block chain pointer to TCB buffer */
    dm(CPLB1) = r0;                /* Transmit blk chain ptr, init. LP1 DMA transfers */
    r0 = rxtcb_dest + 7;
    r0 = r1 AND r0;                /* Mask the pointer */
    r0 = BSET r0 BY 18;            /* Set the pci bit */
    dm(rxtcb_dest + 4) = r0;       /* Write DMA receive block chain pointer to TCB buffer */
    dm(CPLB0) = r0;                /* Receive block chain pointer, initiate LP0 DMA transfers */
wait:
    idle;
    jump wait;
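The chain pointer arithmetic in this listing (take the address of the last word of the TCB, mask it with 0x0003FFFF, then set bit 18 as the PCI bit) can be restated in a few lines of Python. This is a sketch of the listing's arithmetic only; the function name and the example address are hypothetical.

```python
CP_ADDRESS_MASK = 0x0003FFFF  # "CPX register mask" from the listing
PCI_BIT = 18                  # bit set by BSET r0 BY 18


def chain_pointer(tcb_base, interrupt_on_completion=True):
    """Return the value the listing writes to the CP slot and CPLBx."""
    pointer = (tcb_base + 7) & CP_ADDRESS_MASK  # address of the TCB's last word
    if interrupt_on_completion:
        pointer |= 1 << PCI_BIT
    return pointer


# Hypothetical TCB placed at internal address 0x0005 0000.
print(hex(chain_pointer(0x50000)))
```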
.var source[N]= 0X11111111, 0X22222222, 0X33333333, 0X44444444,
                0X55555555, 0X66666666, 0X77777777, 0X88888888;
.var dest[N];
.section/pm lp1i_svc;    /*Link Port 1 Vector from ldf file*/
    jump lpISR1; rti; rti; rti;
.section/pm lp0i_svc;    /*Link Port 0 Vector from ldf file*/
    jump lpISR0; rti; rti; rti;
/*_____________________Main Routine________________________*/
.section/pm seg_pmco;    /*Main code section from ldf file*/
start:
    r0 = 0; DM(LCTL) = r0;
    r0=source; dm(IILB0)=r0;                 /*Set DMA tx index to start of source buffer*/
    r0=dest; dm(IILB1)=r0;                   /*Set DMA rx index to start of destination buffer*/
    r0=@source; dm(CLB0)=r0; dm(CLB1)=r0;    /*Set DMA count to length of data buffers*/
    r0=1; dm(IMLB0)=r0; dm(IMLB1)=r0;        /*Set DMA modify (stride) to 1*/
    ustat1 = dm(SYSCON);
    bit clr ustat1 BHD;                      /*Disable Buffer Hang*/
    dm(SYSCON) = ustat1;
    imask = 0; lirptl = 0;
    /*Enable Global, Link Port and Link Port Buffer 1 interrupt*/
    bit set imask LPISUMI;
    bit set lirptl LP1MSK | LP0MSK;
    bit set mode1 IRPTEN | CBUFEN;
    ustat1=dm(LCTL);
    /*LCTL Register-->LBUF0=TX, LBUF1=RX, 1/4x CCLK RATE, LBUF 0 & 1 ENABLED,
      LBUF 0 & 1 -> PORT 0, Link buffer 0 & 1 DMA Enabled*/
    bit clr ustat1 L1TRAN | L0CLKD0 | L1CLKD0 | LAB0 | LAB1;
    bit set ustat1 L0TRAN | L1EN | L0EN | L0CLKD1 | L1CLKD1 | L0DEN | L1DEN;
    dm(LCTL)=ustat1;
wait:
    idle;
    jump wait;
lpISR0: rti;
lpISR1: rti;
16-bit to 32-bit Word Packing Enable. SPCTLx Bit 9 (PACK). This bit enables (if set, =1) or disables (if cleared, =0) 16- to 32-bit word packing.
Serial Port DMA Enable. SPCTLx Bit 18 (SDEN_A) and Bit 20 (SDEN_B). These bits enable (if set, =1) or disable (if cleared, =0) the serial port's A or B channel DMA.
Serial Port DMA Chaining Enable. SPCTLx Bit 19 (SCHEN_A) and Bit 21 (SCHEN_B). These bits enable (if set, =1) or disable (if cleared, =0) the serial port's A or B channel DMA chaining.
SPCTL0 (0x01c0), SPCTL1 (0x01e0), SPCTL2 (0x01d0), SPCTL3 (0x01f0). The figure shows a value of 0 for all bits (31-0).

Field      Description
DXS_A      DXA Data Buffer Status: 11 = full, 10 = partially full, 00 = empty
LAFS (LFS) Late FS: 0 = early FS, 1 = late FS
DERR_A     DXA Error Status (sticky): DDIR = 1, transmit underflow status; DDIR = 0, receive overflow status
SDEN_A     SPORT DMA enable A channel: 1 = enable, 0 = disable
DXS_B *    DXB Data Buffer Status: 11 = full, 10 = partially full, 00 = empty
SCHEN_A    DMA chaining enable A channel: 1 = enable, 0 = disable
DERR_B *   DXB Error Status (sticky)
SDEN_B     SPORT DMA enable B channel: 1 = enable, 0 = disable
DDIR **    Data Direction Control: 1 = Active Transmit Buffers TXnB/TXnA, 0 = Enable Receive Buffers RXnB/RXnA
SCHEN_B    DMA chaining enable B channel: 1 = enable, 0 = disable
SPEN_B     SPORT Enable B: 1 = enable, 0 = disable
FS_BOTH    1 = issue WS only if data is present in both Tx, 0 = issue WS if data is present in either Tx
DITFS      Data Independent tx FS (if DDIR = 1): 1 = data independent, 0 = data dependent
SPEN_A     SPORT Enable A: 1 = enable, 0 = disable
IFS        Internally generated FS: 1 = internal FS, 0 = external FS
DTYPE      Data type: 00 = right-justify, fill MSB with 0s; 01 = right-justify, sign extend MSB; 10 = compand mu-law; 11 = compand A-law
FSR        FS requirement: 1 = FS required, 0 = FS not required
SENDN      Endian word format: 0 = MSB first, 1 = LSB first
CKRE       Clock edge for data Frame Sync sampling or driving: 1 = rising edge, 0 = falling edge
SLEN       Serial Word Length - 1
OPMODE     SPORT Operation Mode: 0 = DSP serial mode/multichannel mode, 1 = I2S mode
PACK       16/32 packing: 1 = packing, 0 = no packing
ICLK       Internally generated SCLK: 1 = internal clock, 0 = external clock

* Status is Read-only
** Do not read/write from/to inactive RXn/TXn buffers

Figure 6-13. SPCTLx Register (DSP Serial Mode)

If the serial word length is 16 bits or smaller, the serial port can pack two of these words into the serial port buffer. The 16-bit to 32-bit word Packing Enable (PACK) bit can enable this packing because the I/O processor performs 32-bit transfers between the serial port buffers and processor memory.
In addition to selecting the endian, length, and packing modes for serial port processor transfers, programs must indicate the type of data in the transfer, using the Data Type (DTYPE) bit. For more information, see Serial Port Channel Transfer Modes on page 6-99.
Because serial port buffers are bidirectional, the I/O processor does not need an indicator to determine the transfer direction (transmit or receive). Data flows from internal to external devices using a transmit (TXx) buffer. When transmitting serial data as DMA, the I/O processor fills the channel's TXx buffer when the channel's SDEN bit is set.
5. The RXx buffer detects data is present and asserts an internal DMA request to the I/O processor.
6. The I/O processor grants the request and performs the internal DMA transfer, emptying the RXx buffer.
In general, the following sequence describes a typical internal to external DMA operation where an external device transfers a block of data from the processor's internal memory using a serial port:
1. The processor or host (depending on the mode) enables the DMA channel's serial port, setting the port's SPEN bit in the port's SPCTLx register. The processor or host selects a word size using the DTYPE in the port's SPCTLx register. The DDIR bit is set (=1) to enable the serial interface as a transmitter. The program activates the TX buffers, allowing data to transmit out of the SPORT A and B data pins.
2. The processor or host (depending on the mode) writes to the DMA channel's parameter registers (IIx, IMx, and Cx) and SPCTLx control register, initializing the channel for transmit.
3. The processor or host (depending on the mode) sets (=1) the channel's SDEN bit, enabling the DMA process. Because this is a transmit, setting SDEN_A or SDEN_B automatically asserts an internal DMA request to the I/O processor.
4. The I/O processor grants the request and performs the internal DMA transfer, filling the TXx buffer.
5. The external device begins reading data from the TXx buffer through the serial port.
6. The TXx buffer detects that there is room in the buffer because it is now partially empty and asserts another internal DMA request to the I/O processor, continuing the process.
When the serial port channel (A or B) is programmed as a transmitter, only the corresponding TXA and TXB buffers become active, while the receive buffers RXA and RXB remain inactive. Similarly, when SPORT channels A and B are programmed as receivers, only the corresponding RX0A and RX0B buffers are activated. When performing core-driven transfers, programs must write to the proper buffer depending on the direction setting in the SPCTL register (DDIR). For DMA-driven transfers, the serial port logic performs the data transfer between internal memory and the appropriate buffer depending on the DDIR bit setting.
If the inactive SPORT data buffers are read or written by the core while the port is enabled, the SPORT does not operate correctly. If, for example, the SPORT is programmed to be a transmitter and, at the same time, the core reads from the receive buffer of the same SPORT, the core hangs, just as it would if it were reading an empty buffer that was currently active. This locks up the core until the SPORT is reset.
The program must set the direction bit along with the serial port enable and DMA enable bits before initiating any operations on the SPORT data buffers. Operating on the inactive transmit or receive buffers while the SPORT is enabled can cause unpredictable results.
    ustat3=dm(SYSCON);
    bit clr ustat3 BHD;               /*Disable Buffer Hang*/
    dm(SYSCON)=ustat3;
    bit set imask SP0I | SP2I;        /*Unmask SPORT 0 & 2 Interrupts*/
    bit set mode1 CBUFEN | IRPTEN;    /*Enable Circ Buffers & Interrupts*/
    r0 = 0x00001000;                  /*Set the SPL bit in the SPxxMCTL register to enable loopback*/
    dm(SP02MCTL)=r0;
    r0 = 0x0;                         /*Externally generated clock and framesync*/
    dm(DIV0) = r0;
    r0 = 0x000c21f1;
    /*Set bits SPEN_A, SLEN0-4, FSR--enable the A channel, set the word length
      to 32 bits, require frame synch, and enable DMA and DMA Chaining.*/
    dm(SPCTL0)=r0;
    r0=0x00270004;
    /*TCLKDIV=[FCCLK(96Mhz)/2xFSCLK(19.2Mhz)]-1=0x0004*/
    /*TFSDIV=[FSCLK(9.6Mhz)/TFS(.24Mhz)]-1=0x0027*/
    dm(DIV2)=r0;
    r0=0x20c65f1;
    /*Set bits SPEN_A, SLEN0-4, ICLK, IFS, FSR, DDIR--enable the A channel,
      set the word length to 32 bits, generate internal framesynch and clock,
      require frame synch, set for transmit, and enable DMA and DMA Chaining.*/
    dm(SPCTL2)=r0;
    r1=0x0003FFFF;                    /*CPx register mask*/
    /*Get DMA chaining memory pntr containing tx buff address*/
    /*Mask the pointer*/
    /*Set the PCI bit*/
    /*Write DMA transmit block chain pntr to TCB buffer*/
    /*Transmit block chain pointer, init SP2 DMA transfers*/
    r0=rxtcb+7;
    r0=r1 AND r0;
    r0=BSET r0 by 18;
    dm(rxtcb+4)=r0;
    dm(CP0A)=r0;                      /*Initiate SP0 DMA transfers*/
wait:
    idle;
    jump wait;
IRQ: rti;
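The two SPCTLx constants in this listing can be cross-checked against their comments by decoding them in software. In the Python sketch below, the positions of PACK, SDEN_x, and SCHEN_x come from the bit descriptions in this section; the positions of SPEN_A, SLEN, ICLK, FSR, IFS, and DDIR are assumptions inferred by matching the constants to the listing's comments. The table and function names are hypothetical.

```python
# (least significant bit, width) per field; see the lead-in for which
# positions are stated in the text and which are inferred.
FIELDS = {
    "SPEN_A": (0, 1), "SLEN": (4, 5), "PACK": (9, 1), "ICLK": (10, 1),
    "FSR": (13, 1), "IFS": (14, 1), "SDEN_A": (18, 1), "SCHEN_A": (19, 1),
    "SDEN_B": (20, 1), "SCHEN_B": (21, 1), "DDIR": (25, 1),
}


def decode_spctl(value):
    """Return a dict of field values extracted from an SPCTLx word."""
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in FIELDS.items()}


rx = decode_spctl(0x000C21F1)  # value written to SPCTL0 (receiver)
tx = decode_spctl(0x020C65F1)  # value written to SPCTL2 (transmitter)
print(rx)
print(tx)
```

With this field map, both words decode to SLEN = 31 (a 32-bit word, since SLEN holds the length minus one) with DMA and chaining enabled on channel A, and only the SPCTL2 word has DDIR, ICLK, and IFS set.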
.section/dm seg_dmda;
.var source[N]= 0X11111111, 0X22222222, 0X33333333, 0X44444444,
                0X55555555, 0X66666666, 0X77777777, 0X88888888;
.var dest[N];
.section/pm sp0i_svc;
    jump IRQ; rti; rti; rti;
.section/pm sp2i_svc;
    jump IRQ; rti; rti; rti;
/*-----------------Main Routine----------------------------*/
.section/pm seg_pmco;
start:
    r0=source; dm(II2A)=r0;              /*Set DMA tx index to start of source buffer*/
    r0=dest; dm(II0A)=r0;                /*Set DMA rx index to start of dest buffer*/
    r0=@source; dm(C0A)=r0; dm(C2A)=r0;  /*Set DMA count to length of data buffers*/
    r0=1; dm(IM0A)=r0; dm(IM2A)=r0;      /*Set DMA modify (stride) to 1.*/
    ustat3=dm(SYSCON);
    bit clr ustat3 BHD;                  /*Disable Core Buffer Hang*/
    dm(SYSCON)=ustat3;
    bit set imask SP0I | SP2I;           /*Unmask Sport 0&2 interrupts*/
    r0 = 0x00001000;    /*Set the SPL bit in the SPxxMCTL register to enable loopback*/
    dm(SP02MCTL)=r0;
    r0 = 0x0;           /*Externally generated clock and framesync*/
    dm(DIV0) = r0;
    r0 = 0x000421f1;
    /*Set bits SPEN_A, SLEN=32, FSR--enable the A channel, set the word length
      to 32 bits, and require frame synch.*/
    dm(SPCTL0)=r0;
    r0=0x00270004;
    /*TCLKDIV=[FCCLK(96Mhz)/2xFSCLK(19.2Mhz)]-1=0x0004*/
    /*TFSDIV=[FSCLK(9.6Mhz)/TFS(.24Mhz)]-1=0x0027*/
    dm(DIV2)=r0;
    r0=0x20465f1;
    /*Set bits SPEN_A, SLEN=32, ICLK, IFS, FSR, DDIR--enable the A channel,
      set the word length to 32 bits, generate internal framesynch and clock,
      require frame synch, and set for transmit.*/
    dm(SPCTL2)=r0;
wait:
    idle;
    jump wait;
IRQ: rti;
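The DIV2 value in this listing follows from the two formulas in its comments. The Python sketch below reproduces that arithmetic for the stated frequencies (96 MHz core clock, 9.6 MHz serial clock, 0.24 MHz frame sync). Packing the frame sync divisor into the upper 16 bits and the clock divisor into the lower 16 bits is an assumption read off the constant 0x00270004; the function name is hypothetical.

```python
def sport_dividers(cclk_hz, sclk_hz, fs_hz):
    """Return (clock divisor, frame sync divisor, packed DIVx word)."""
    clkdiv = round(cclk_hz / (2 * sclk_hz)) - 1  # TCLKDIV = CCLK/(2 x SCLK) - 1
    fsdiv = round(sclk_hz / fs_hz) - 1           # TFSDIV = SCLK/FS - 1
    return clkdiv, fsdiv, (fsdiv << 16) | clkdiv


clkdiv, fsdiv, div_word = sport_dividers(96_000_000, 9_600_000, 240_000)
print(hex(clkdiv), hex(fsdiv), hex(div_word))
```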
The following bits in SPICTL set up DMA SPI port features:
SPI Port Enable. SPICTL Bit 0 (SPIEN). This bit enables (if set, =1) or disables (if cleared, =0) the SPI port.
Data Format. SPICTL Bit 6 (DF). This bit selects the data format. When set (=1), the MSB is sent/received first. When cleared (=0), the LSB is sent/received first.
SPI Word Length Select. SPICTL Bits 8-7 (WL). These bits select the word length. Word sizes can be 8-bit (WL = 00), 16-bit (WL = 01) or 32-bit (WL = 11).
Word Packing Enable. SPICTL Bit 28 (PACKEN). This bit enables (if set, =1) or disables (if cleared, =0) 8- to 32-bit packing. If this bit is set, the receiver packs the received bytes, whereas the transmitter unpacks the data before sending it. For more information on packing formats, see SPI Word Packing on page 11-24. This bit should be 1 only when the data word length is 8 bits (WL=00).
SPI Port Receive DMA Enable. SPICTL Bit 27 (RDMAEN). This bit enables (if set, =1) or disables (if cleared, =0) DMA transfers from the receive data buffer. At SPI boot this bit is set to 1 to enable the booting process through the SPI port.
SPI Port Transmit DMA Enable. SPICTL Bit 13 (TDMAEN). This bit enables (if set, =1) or disables (if cleared, =0) DMA transfers to the transmit data buffer. At SPI boot this bit is 0.
SPICTL (0xB4). Values shown in the figure: bits 31-16 = 0; bits 15-0 = 1000 0111 0100 0000.

Field      Description
GM         Fetch/Discard Incoming RXB data when RXB full: 0 = Discard incoming data, 1 = Overwrite with new data
FLS1       FLAG1 Slave Device Select: 1 = Enable, 0 = Disable
SENDLW     Send Zero/Repeat Byte When TXB Empty: 0 = Send zero, 1 = Repeat last data
FLS2       FLAG2 Slave Device Select: 1 = Enable, 0 = Disable
SGN        Sign Extend Data: 0 = no sign extend, 1 = sign extend
FLS3       FLAG3 Slave Device Select: 1 = Enable, 0 = Disable
PACKEN     8-bit Packing Enable: 0 = no packing, 1 = 8 to 32-bit packing
NSMLS      Non-Seamless operation: 0 = no delay, 1 = delay before next word starts
RDMAEN     Receive DMA Enable: 1 = Enable, 0 = Disable
OPD        Open Drain Output Enable for Data Pins: 0 = Normal, 1 = Open Drain
DCPH0      Deselect SPIDS in CPHASE = 0 (master mode only, NSMLS bit = 1): 0 = No SPI device select, 1 = Deselects slaves between successive transfers
DMISO      Disable MISO Pin (Broadcast): 0 = MISO Enabled, 1 = MISO Disabled
SPIEN      SPI System Enable: 1 = enable, 0 = disable
FLS0       FLAG0 Slave Device Select: 1 = Enable, 0 = Disable
PSSE       Programmable Slave Select Enable: 0 = Disable, 1 = Enable
SPRINT     SPI RX Buffer Interrupt Enable: 1 = enable SPI IRQ on RXB empty, 0 = disable
TDMAEN     Transmit DMA Enable: 1 = Enable, 0 = Disable
SPTINT     SPI TX Buffer Interrupt Enable: 1 = enable SPI IRQ on TXB not full, 0 = disable
BAUDR      Baud Rate: CCLK / (2**(2 + BR))
MS         Master/Slave Mode Bit: 0 = SPI slave device, 1 = SPI Master Device
WL         Word Length: 00 = 8 bits, 01 = 16 bits, 11 = 32 bits, 10 = RESERVED
CP         Clock polarity: 0 = SPICLK active high, low in idle state; 1 = SPICLK active low, high in idle state
DF         Data Format: 0 = LSB sent/received first, 1 = MSB sent/received first
CPHASE     Clock phase: 0 = SPICLK toggles at middle of 1st data bit, 1 = SPICLK toggles at beginning of 1st data bit

Figure 6-14. SPICTL Register

enabled, data in SPITX is automatically loaded into the transmit shift register. After a word is received completely in the receive shift register, it is automatically transferred to the SPIRX. The data in SPIRX is moved into internal memory by the DMA controller. All DMA transfers are 32-bit words. To disable the SPI port, clear the SPIEN bit in the SPICTL register,
which also clears the status of the buffers in the SPISTAT register. The bits in the SPI control register (SPICTL) are shown in Figure A-38 on page A-120.
If the SPI port is enabled without enabling DMA, the SPI port is either in single-word, interrupt-driven data transfer mode (if the corresponding interrupt enable bits in SPICTL are set) or in core-driven data transfer mode. In both cases software must perform the data transfers to and from the SPI data buffers. For more information on the different SPI transfer modes, see Master Mode Operation on page 11-25. For more information on transfer status, see Using I/O Processor Status on page 6-121.
The SPI allows independent settings for the three transfer format features: bit order, word length, and word packing. The SPI data format (DF) bit selects big endian words (MSB first, if set, =1) or little endian words (LSB first, if cleared, =0), so the port can exchange data with either kind of device.
The SPI Word Length (WL) bit field selects the transfer word length. Word sizes can be 8-bit (WL = 00), 16-bit (WL = 01) or 32-bit (WL = 11).
If the SPI word length is 8 bits, the SPI port can pack four of these words into the 32-bit SPI port data buffer. The 8-bit to 32-bit Word Packing Enable (PACKEN) bit can enable this packing because the I/O processor performs 32-bit transfers between the SPI port buffer and processor memory. If this bit is set, the transmitter unpacks the data before sending it, while the receiver packs the received bytes. For more information on packing formats, see SPI Word Packing on page 11-24. This bit should be 1 only when the data word length is 8 bits (WL = 00).
In general, the following sequence describes a typical internal to external DMA operation where an external device transfers a block of data from the processor's internal memory using the SPI port:
1. The processor or host (depending on the mode) enables the DMA channel's SPI port, setting the port's SPIEN bit in the port's SPICTL register. The processor or host selects a word size using the WL bits in the port's SPICTL register.
2. The processor or host (depending on the mode) writes the DMA channel's parameter registers (IISTx, IMSTx, and CSTx) and SPICTL control register, initializing the channel for transmit.
3. The processor or host (depending on the mode) sets the channel's TDMAEN bit to 1, enabling the DMA process. Because this is a transmit, setting TDMAEN automatically asserts an internal DMA request to the I/O processor.
4. The I/O processor grants the request and performs the internal DMA transfer, filling the SPITX buffer.
5. The external device begins reading data from the SPITX buffer through the SPI port.
6. The SPITX buffer detects that there is room in the buffer because it is now partially empty and asserts another internal DMA request to the I/O processor, continuing the process.
During the boot process the program loads 256 words into memory locations 0x40000 through 0x400FF. The processor subsequently begins executing instructions. Because most programs require more than 256 words of instructions and initialization data, the 256 words typically serve as a loading routine for the application. Analog Devices supplies loading routines (loader kernels) that load an entire program through the selected port. These routines come with the development tools. For more information on loader kernels, see the development tools documentation.
When the ADSP-21161 boots through the SPI port, the program sequencer automatically unmasks the DMA channel 8 interrupt and initializes the SPICTL register to 0x0A001F81 and the IMASK register to 0x00004003.
The processor determines the booting mode at reset from the EBOOT, LBOOT, and BMS pin inputs. When EBOOT=0, LBOOT=1, and BMS=0, the processor boots through the SPI port. For a list showing how to select different boot modes, see the Boot Memory Select pin description in the table Booting Modes on page 13-72.
When using any of the power-up booting modes, address 0x0004 0004 should not contain a valid instruction since it is not executed during the booting sequence. A NOP or IDLE instruction should be placed at this location.
In SPI port booting, the processor gets boot data from another processor's SPI port or another SPI-compatible device after system power-up.
Table 6-27 on page 6-115 shows how the DMA channel 8 parameter registers are initialized at reset for SPI port booting. The count register (CSRX) is initialized to 0x0180 (384 32-bit words) for transferring 256 48-bit instruction words to internal memory.
The SPI Control Register (SPICTL) is configured to 0x0A001F81 upon reset during SPI boot. The default value sets up SPI transfers as follows:
SPIEN = 1, SPI enabled
MS = 0, slave device
DF = 0, LSB first
WL = 11, 32-bit SPI receive shift register word length
BAUDR = 1111 (at 100 MHz, SPICLK = 763 Hz)
DMISO = 1, MISO disabled
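These defaults can be checked against the boot value 0x0A001F81 using the bit positions given for SPICTL in this section (SPIEN bit 0, DF bit 6, WL bits 8-7, TDMAEN bit 13, RDMAEN bit 27, PACKEN bit 28) and the baud rate expression CCLK / 2**(2 + BAUDR) from Figure 6-14. The Python sketch below is illustrative only; the helper names are hypothetical, and the BAUDR value is taken from the list above rather than decoded from the register word.

```python
BOOT_SPICTL = 0x0A001F81


def bit(value, position):
    """Return the single bit of `value` at `position`."""
    return (value >> position) & 1


spien = bit(BOOT_SPICTL, 0)
df = bit(BOOT_SPICTL, 6)
wl = (BOOT_SPICTL >> 7) & 0b11
tdmaen = bit(BOOT_SPICTL, 13)
rdmaen = bit(BOOT_SPICTL, 27)
packen = bit(BOOT_SPICTL, 28)


def spiclk_hz(cclk_hz, baudr):
    """SPICLK frequency from the Figure 6-14 expression CCLK / 2**(2 + BR)."""
    return cclk_hz / 2 ** (2 + baudr)


boot_spiclk = spiclk_hz(100_000_000, 0b1111)
print(spien, df, wl, tdmaen, rdmaen, packen, round(boot_spiclk))
```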
This configuration sets up the SPIRX register for 32-bit serial transfers. The SPIRX DMA channel 8 parameter registers are configured to transfer 0x180 32-bit words into internal memory normal word address space starting at 0x40000. Once the 32-bit DMA transfer completes, the data is accessed as 3-column, 48-bit instructions. The processor executes a 256 (0x100) word loader kernel upon completion of the 32-bit, 0x180 word DMA.
Note that 16-bit SPI hosts must shift two words into the 32-bit receive shift register before a DMA transfer to internal memory occurs. 8-bit SPI hosts must shift four words into the 32-bit receive shift register before a DMA transfer to internal memory occurs.
Table 6-27. DMA Channel 8 Parameter Register Initialization for SPI Port Booting
Parameter Register IISRX IMSRX CSRX GPSRX Initialization Value 0x0004 0000 uninitialized (increment by 1 is automatic) 0x0180 (256 instruction words) uninitialized
6-115
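The relationship between the CSRX count and the size of the loader kernel can be checked with a little arithmetic. The following Python sketch is illustrative only; the function and constant names are hypothetical and are not part of any Analog Devices tool. It assumes nothing beyond the word sizes stated above (32-bit DMA words, 48-bit instructions, and 8- or 16-bit SPI hosts).

```python
# Illustrative arithmetic only; names are hypothetical, not from ADI tools.
WORD_BITS = 32           # SPIRX word length selected by the boot default (WL = 11)
INSTRUCTION_BITS = 48    # width of one instruction word
BOOT_DMA_COUNT = 0x180   # CSRX value at reset for SPI booting


def instruction_words(dma_count, word_bits=WORD_BITS, instr_bits=INSTRUCTION_BITS):
    """Return how many whole instruction words a DMA of dma_count words carries."""
    total_bits = dma_count * word_bits
    if total_bits % instr_bits:
        raise ValueError("DMA block does not hold a whole number of instructions")
    return total_bits // instr_bits


def host_words_per_dma_word(host_bits, word_bits=WORD_BITS):
    """Return how many host-sized words fill one receive shift register word."""
    if host_bits <= 0 or word_bits % host_bits:
        raise ValueError("host width must divide the shift register width")
    return word_bits // host_bits


if __name__ == "__main__":
    print(instruction_words(BOOT_DMA_COUNT))   # 0x180 32-bit words as 48-bit words
    print(host_words_per_dma_word(16))         # shifts needed by a 16-bit host
    print(host_words_per_dma_word(8))          # shifts needed by an 8-bit host
```

The same check applies to any other block size: a count that does not divide evenly into 48-bit words raises an error rather than silently truncating.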
/* vector code for receive interrupt vector from ldf file */
.section/pm spiri_svc;
        nop;
        nop;
        jump finish;
        nop;

.var spi_tx_buf[size] = 0x11111111, 0x22222222, 0x33333333, 0x44444444,
                        0x55555555, 0x66666666, 0x77777777, 0x88888888,
                        0x99999999, 0xaaaaaaaa;
/* receive buffer */
.var spi_rx_buf[size];

.section/pm seg_pmco;
start:
        r0=spi_tx_buf;
        dm(IISTX)=r0;           /* configure index register for SPI transmit */
        r0=@spi_tx_buf;
        dm(CSTX)=r0;            /* configure count register for SPI transmit */
        r0=1;
        dm(IMSTX)=r0;           /* configure modify register for SPI transmit */
        r0=spi_rx_buf;
        dm(IISRX)=r0;           /* configure index register for SPI receive */
        r0=@spi_rx_buf;
        dm(CSRX)=r0;            /* configure count register for SPI receive */
        r0=1;
        dm(IMSRX)=r0;           /* configure modify register for SPI receive */

        ustat1 = dm(SYSCON);
        bit clr ustat1 BHD;     /* Clear Buffer Hang Disable in SYSCON */
        dm(SYSCON) = ustat1;
        bit set LIRPTL SPIRMSK; /* enable SPI RX interrupts */
        bit set MODE1 IRPTEN | CBUFEN;
        bit set IMASK LPISUMI;

        r0=0x00000000;
        dm(SPICTL)=r0;
        ustat1=dm(SPICTL);
        bit set ustat1 SPIEN | SPRINT | SPTINT | MS | CPHASE | DF | WL32
                       | BAUDR5 | SGN | GM | RDMAEN | TDMAEN;
        /* enable spi port, spitx and spirx interrupts, master device,
           spiclk toggles at beginning of first data transfer bit,
           MSB first format, 32 bit word length, baud rate, sign extend,
           get more new data even if receive buffer is full,
           enable transmit and receive dma */
        dm(SPICTL) = ustat1;    /* start transfer by configuring SPICTL */

wait:   idle;
        jump wait;
finish: rti;
.SECTION/pm seg_rth;
Reserved_1:     rti; nop; nop; nop;
Chip_Reset:     idle; jump start; nop; nop;

.SECTION/DM seg_dmda;
.var spi_tx_buf[size] = 0x11111111, 0x22222222, 0x33333333, 0x44444444,
                        0x55555555, 0x66666666, 0x77777777, 0x88888888,
                        0x99999999, 0xaaaaaaaa;
.var spi_rx_buf[size];

.SECTION/PM seg_pmco;
.GLOBAL SPI_register_init;
.GLOBAL SPI_lpbk_irq_test;

start:
        ustat1 = dm(SYSCON);
        bit clr ustat1 BHD;     /* Clear Buffer Hang Disable in SYSCON */
        dm(SYSCON) = ustat1;
        bit set mode1 CBUFEN;   /* set circular buffer enable */

SPIDMA_tx:
        r0=spi_tx_buf;  dm(IILB1)=r0;
        r0=@spi_tx_buf; dm(CLB1)=r0;
        r0=1;           dm(IMLB1)=r0;

SPIDMA_rx:
        r0=spi_rx_buf;  dm(IILB0)=r0;
        r0=@spi_rx_buf; dm(CLB0)=r0;
        r0=1;           dm(IMLB0)=r0;

        r0=0x00000000;  dm(SPICTL)=r0;  /* Initially clear SPI control reg. */
        ustat1=dm(SPICTL);
        bit set ustat1 SPIEN|SPRINT|SPTINT|MS|CPHASE|DF|WL32|BAUDR5|PSSE|DCPH0|SGN|GM|RDMAEN|TDMAEN;
        bit clr ustat1 CP|FLS0|FLS1|FLS2|FLS3|SMLS|DMISO|OPD|PACKEN|SENDLW;
        dm(SPICTL) = ustat1;

        bit set LIRPTL SPIRMSK | SPITMSK;       /* enable SPI TX & SPI RX interrupts */
        bit set MODE1 IRPTEN;                   /* Allow global interrupts */

wait:   jump start;
The DMA controller of the ADSP-21161 processor maintains the status information of the channels in a read-only register, DMASTAT. Bits 0-13 indicate which DMA channels are active; bits 16-29 indicate the chaining status of the channels. Bit definitions for the DMASTAT register are given in Table 6-28 and in Figure 6-15. Bit definitions for the SPISTAT register are given in Table A-29 on page A-115.
[Figure 6-15: DMASTAT register, address 0x37. Bits 29-16 hold the channel chaining status bits (DMA13CHST through DMA0CHST); bits 13-0 hold the channel active status bits (DMA13ST through DMA0ST).]

Channel    Buffer
0          RX0A/TX0A
1          RX0B/TX0B
2          RX1A/TX1A
3          RX1B/TX1B
4          RX2A/TX2A
5          RX2B/TX2B
6          RX3A/TX3A
7          RX3B/TX3B
8          LBUF0/SPIRX
9          LBUF1/SPITX
10         EPB0
11         EPB1
12         EPB2
13         EPB3

Channel Active Status: 1 = active (transferring data or waiting to transfer the current block, and not transferring a TCB); 0 = inactive (DMA transfer complete, or in TCB chain loading).

Channel Chaining Status: 1 = chaining is enabled and the channel is currently transferring a TCB or is pending to transfer a TCB; 0 = chaining disabled.

Status does not change on the master ADSP-21161 processor during external port DMA until the external portion is completed (for example, the EPBx buffers are emptied). In chain insertion mode (DEN=0, CHEN=1), channel chaining status never goes to 1. Therefore, test channel status to see if the channel is ready so that your program can rewrite the chain pointer (CPx) register.
Channel Active status: 1 = active, 0 = inactive
Channel Chaining status: 1 = chaining enabled/pending, 0 = chaining disabled
The I/O processor reports on DMA in progress, DMA complete, or DMA channel not ready status as follows:

All DMA channels can be active or inactive. If a channel is active, a DMA is in progress on that channel. The I/O processor indicates the active status by setting the channel's bit in the DMASTAT register.

When an unchained (single-block) DMA process reaches completion on any DMA channel, the I/O processor generates that DMA channel's interrupt. It does this by setting the DMA channel's interrupt latch bit in the IRPTL or LIRPTL register. The DMA process is complete when the count in CEPx=0 (for Slave mode and Handshake modes), when the count in ECEPx=0 (for External Handshake mode), or when the count in CEPx=0 and ECEPx=0 (for Master mode and Paced Master mode).

When a DMA process in a chained DMA sequence reaches completion (the count in Cx=0 or CEPx=0) on any DMA channel, the I/O processor generates an interrupt if the PCI bit in the channel's CPx register is set. The only exception is external-handshake mode.

The I/O processor also generates that DMA channel's interrupt when the last block in a chained DMA reaches completion, regardless of the PCI setting.

When a DMA channel's buffer is not being used for a DMA process, the I/O processor can generate an interrupt on single-word writes or reads of the buffer. This interrupt service differs slightly for each port. For more information on single-word interrupt-driven transfers, see External Port Status on page 6-127, Link Port Status on page 6-131, and Serial Port Status on page 6-135.

Using the DMA Channel Status Register (DMASTAT), programs can check which DMA channels are performing a DMA or chained DMA. For each channel, the I/O processor sets the channel's active status bit if DMA for that channel is enabled and a DMA sequence is in progress on that channel. The I/O processor sets the channel's chaining status bit if a chained DMA sequence is in progress or pending on that channel. There is a one-cycle latency between a change in DMA channel status and the status update in the DMASTAT register.

As an alternative to interrupt-driven DMA, programs can poll the DMASTAT register to determine when a single DMA sequence is done. To poll channel status, programs read DMASTAT. If both status bits for the channel are cleared, the DMA sequence has completed. If chaining is enabled on a DMA channel, programs should not use polling to determine channel status. Polling could provide inaccurate information in this case because the next DMA sequence might be under way by the time the polled status is returned.

During interrupt-driven DMA, programs use the interrupt mask bits in the IMASK and LIRPTL registers to selectively mask DMA channel interrupts that the I/O processor latches into the IRPTL and LIRPTL registers.
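The polling test described above (both status bits for the channel cleared) can be sketched as follows. This Python model is only an illustration of the decoding logic, not processor code. It assumes that channel n's active status bit is bit n and that its chaining status bit is bit 16+n; the text gives only the ranges 0-13 and 16-29, so check the DMASTAT bit definitions before relying on this mapping. All function names are hypothetical.

```python
# Host-side model of DMASTAT decoding; bit mapping is an assumption (see lead-in).
NUM_CHANNELS = 14     # DMA channels 0 through 13
CHAIN_SHIFT = 16      # chaining status bits occupy bits 16-29


def channel_flags(dmastat, channel):
    """Return (active, chaining) for one DMA channel from a DMASTAT value."""
    if not 0 <= channel < NUM_CHANNELS:
        raise ValueError("channel must be in the range 0-13")
    active = bool((dmastat >> channel) & 1)
    chaining = bool((dmastat >> (CHAIN_SHIFT + channel)) & 1)
    return active, chaining


def single_dma_done(dmastat, channel):
    """True when both status bits are clear, the completion test for unchained DMA."""
    active, chaining = channel_flags(dmastat, channel)
    return not active and not chaining


if __name__ == "__main__":
    sample = 1 << 8   # hypothetical reading: only channel 8 is active
    print(channel_flags(sample, 8), single_dma_done(sample, 8))
    print(channel_flags(sample, 9), single_dma_done(sample, 9))
```

As the text notes, this test is only meaningful for a single (unchained) DMA sequence; with chaining enabled, the next sequence may already be under way when the value is read.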
The I/O processor only generates a DMA complete interrupt when the channel's count register decrements to zero as a result of actual DMA transfers. Writing zero to a count register does not generate the interrupt.

A channel interrupt mask in IMASK and LIRPTL masks out DMA complete interrupts for a channel, but other types of interrupt masking are also available. These other types of interrupt masking include:

By clearing a channel's PCI bit during chained DMA, programs mask the DMA complete interrupt for a DMA process within a chained DMA sequence.

By masking the LPISUM interrupt, programs mask out the logical ORing of link port interrupt status.

By masking the LSRQ interrupt, programs mask out link port service requests to link ports that do not have an assigned link buffer.

These lower levels of interrupt masking let programs limit some of the conditions that can cause DMA channel interrupts.

Each DMA channel has its own interrupt. Although the external port and link port channel access priority can rotate, the interrupt priorities of all DMA channels are fixed.

In processor systems using I/O processor interrupts, an external device may need to change the processor's interrupt mask. This task presents a challenge because the IMASK register is not memory-mapped and is not directly accessible to external devices through the external port. To read or write IMASK through the external port, programs can set up an interrupt vector routine to handle this task. The VIRPT vector interrupt register may be used for this task.

The I/O processor can also generate non-DMA single-word interrupts for I/O port operations that do not use DMA. In this case, the I/O processor generates a DMA interrupt when data becomes available at the receive buffer or when the transmit buffer does not have new data to transmit. Generating DMA interrupts in this fashion lets programs implement interrupt-driven I/O under control of the processor core. Care is needed because multiple interrupts can occur if several I/O ports transmit or receive data in the same cycle.
Table A-24 on page A-80 and Figure 6-16 list all the bits in the DMACx register. For a description of the IOP registers, see the Registers appendix of this manual. The following bits influence external port buffer status:

Host Packing Status. SYSTAT bits 24-22 (HPS). These bits indicate the host's packing status.

External Port Packing Status. DMACx bits 23-21 (PS). These bits indicate the corresponding FIFO buffer's packing status. Table 6-29 shows the available bit settings.

Single-Word Interrupt Enable. DMACx bit 12 (INTIO). This bit enables (if set, =1) or disables (if cleared, =0) single-word, non-DMA, interrupt-driven transfers for the corresponding external port FIFO buffer (EPBx). To avoid spurious interrupts, programs must not change a buffer's INTIO setting while the buffer is enabled.

Flush DMA Buffers and Status. DMACx bit 14 (FLSH). This bit flushes (when set, =1) settings for the corresponding external port FIFO buffer (EPBx).

External Port FIFO Buffer Status. DMACx bits 17-16 (FS). These bits indicate the corresponding external port FIFO buffer's status. Table 6-30 shows the available settings.
[Figure 6-16: DMACx register status bits. DMAC10, DMAC11, DMAC12, and DMAC13 are at addresses 0x1c, 0x1d, 0x1e, and 0x1f.]

PS       Ext. Port EPBx FIFO Buffer Packing Status (read-only): 000 = packing complete, 001 = 1st stage pack/unpack, 010 = 2nd stage pack/unpack, 011 = 3rd stage, 100 = 5th stage of 8- to 48-bit packing, 101, 110, 111 = reserved
FS       Ext. Port FIFO Buffer Status (read-only): 00 = buffer empty, 01 = buffer-not-full, 10 = buffer-not-empty, 11 = buffer full
FLSH     Flush DMA Buffers and Status
INTIO    Single Word Interrupts for EPBx FIFO Buffers: 1 = enable single-word, non-DMA, interrupt-driven transfers; 0 = disabled, FIFO fully enabled
The HPS bits in the SYSTAT register and the PS bits in the DMACx registers indicate an external buffer's packing status. These bits are read-only, and the processor clears these bits when DEN is cleared (changes from 1 to 0).

Table 6-29. Processor (PS) and Host (HPS) Packing Status

PS or HPS    Packing Status
000          packing complete (6th stage of 8- to 48-bit packing, 4th stage of 8- to 32-bit packing, etc.)
001          1st stage
010          2nd stage
011          3rd stage
100          5th stage of 8- to 48-bit packing

The FS bits in the DMACx registers indicate an external buffer's FIFO status. These bits are read-only. The processor clears these bits when DEN is cleared (changes from 1 to 0).

Table 6-30. External Port Buffer FIFO Status

FS    FIFO Buffer Status
00    buffer empty
01    buffer-not-full
10    buffer-not-empty
11    buffer full
For transmit (TRAN=1), buffer-not-full means that the buffer has space for one normal word, and buffer-not-empty means that the buffer has space for two or more normal words. For receive (TRAN=0), buffer-not-full means that the buffer contains one normal word, and buffer-not-empty means that the buffer contains two or more normal words. Any type of full status (01, 10, or 11) in receive mode indicates that new (unread) data is in the buffer.

When a program sets (=1) the FLSH bit, the processor flushes the settings for the corresponding external port FIFO buffer (EPBx). Flushing these settings clears (=0) the FS and PS status bits, clears (=0) the FIFO buffer and DMA request counter, and clears any partially packed words. There is a two-cycle effect latency in completing the flush operation.

Programs must not set a buffer's FLSH bit during the same write that enables the buffer. Also, programs must not set a buffer's FLSH bit while the DMA channel is active. Programs should determine the channel's active status by reading the corresponding bit in the DMASTAT register.

Status does not change on the master processor during external port DMA until the external portion is completed (for example, the EPBx buffers are emptied). In chain insertion mode (DEN=0, CHEN=1), channel chaining status never goes to 1. Programs should test channel status to see if the channel is ready before rewriting the chain pointer (CPx).

The INTIO bit in the DMACx registers supports single-word interrupt-driven transfers for each corresponding external port buffer. These non-DMA transfers are available under the following conditions:

The external port DMA channel's DEN bit is cleared (DMA disabled).

The external port DMA channel's INTIO bit is set, enabling interrupt-driven I/O.

The external port DMA channel's buffer is not empty on an external read or not full on an external write.

Under these conditions, the I/O processor generates that DMA channel's interrupt on the single-word transfer to or from that channel's external port buffer.
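The DMACx status fields described above can be pulled apart with shifts and masks. The following Python sketch is a host-side illustration only, using the bit positions given in the text (PS in bits 23-21, FS in bits 17-16, FLSH in bit 14, INTIO in bit 12) and the encodings from Tables 6-29 and 6-30. The function and dictionary names are hypothetical.

```python
# Illustrative decoder for the DMACx status fields; names are hypothetical.
PS_NAMES = {
    0b000: "packing complete",
    0b001: "1st stage",
    0b010: "2nd stage",
    0b011: "3rd stage",
    0b100: "5th stage of 8- to 48-bit packing",
}
FS_NAMES = {
    0b00: "buffer empty",
    0b01: "buffer-not-full",
    0b10: "buffer-not-empty",
    0b11: "buffer full",
}


def decode_dmac(value):
    """Decode the PS, FS, FLSH, and INTIO fields of a DMACx register value."""
    ps = (value >> 21) & 0b111   # DMACx bits 23-21
    fs = (value >> 16) & 0b11    # DMACx bits 17-16
    return {
        "PS": PS_NAMES.get(ps, "reserved"),
        "FS": FS_NAMES[fs],
        "FLSH": bool((value >> 14) & 1),
        "INTIO": bool((value >> 12) & 1),
    }


if __name__ == "__main__":
    # Hypothetical register value: 2nd packing stage, FIFO full, INTIO set.
    sample = (0b010 << 21) | (0b11 << 16) | (1 << 12)
    print(decode_dmac(sample))
```

A decoder like this is mainly useful when inspecting register dumps; note that in receive mode any FS value other than 00 means unread data is present.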
[Figure: LSRQ register, address 0xD0.]

L0TRQ    Link Port 0 Transmit Request
L0RRQ    Link Port 0 Receive Request
L1TRQ    Link Port 1 Transmit Request
L1RRQ    Link Port 1 Receive Request
L0TM     Link Port 0 Transmit Mask
L0RM     Link Port 0 Receive Mask
L1TM     Link Port 1 Transmit Mask
L1RM     Link Port 1 Receive Mask

LRERR1         Receive Pack Error Status for Link Buffer 1: 1 = incomplete, 0 = complete
L1STAT[1:0]    Link Buffer 1 Status (read-only): 11 = full, 00 = empty, 10 = one word
LRERR0         Receive Pack Error Status for Link Buffer 0: 1 = incomplete, 0 = complete
L0STAT[1:0]    Link Buffer 0 Status (read-only): 11 = full, 00 = empty, 10 = one word
Figure 6-18. LCTL Register Status Bits

The LRERRx bits in the LCTL register indicate a link port buffer's receive packing status. When the buffer is ready to receive and pack a new word, the processor clears (=0) LRERRx. If this bit remains set (=1) after the buffer receives a word, a link transfer error (for example, a clock glitch) has occurred. These bits are read-only, and the processor clears these bits when LxEN is cleared (changes from 1 to 0). Table 6-31 shows the available settings.

Table 6-31. Link Port Buffer Receive Packing Status

LRERRx    Receive Packing Status
0         pack complete (reset value)
1         pack not complete

The LxSTATx bits in the LCTL register indicate a link buffer's FIFO status. When transmitting, these bits indicate when the buffer has space for more data. When receiving, these status bits indicate when the buffer contains new (unread) data. These bits are read-only. The processor clears these bits and empties the buffer when LxEN is cleared (changes from 1 to 0). Table 6-32 shows the available settings.

Table 6-32. Link Port Buffer FIFO Status

LxSTATx    FIFO Buffer Status
00         buffer empty
01         reserved
10         one word
11         buffer full
The LCTL register lets programs assign link buffers to link ports, using the LAB0 and LAB1 bits. Because this mapping allows link ports to be unassigned (no buffer), the I/O processor has an interrupt (LSRQI) to notify programs that an external device has made a read or write request on a disabled link port.

When an LSRQI interrupt is latched into the IRPTL register, programs use the transmit (LxTRQ) and receive (LxRRQ) request bits in the LSRQ register to determine which port has a request. The LSRQ register's bits indicate the following:

For a transmit request (LxTRQ=1), the LSRQI interrupt indicates that the link port (0 or 1) is disabled, but another processor has requested more data by setting the link port's acknowledge (LxACK=1).

For a receive request (LxRRQ=1), the LSRQI interrupt indicates that the link port is disabled, but another processor has requested to send data by setting the link port's clock (LxCLK=1).

To control sources of link port service requests, the I/O processor lets programs mask these service requests. The LSRQ register provides mask bits for transmit (LxTM) and receive (LxRM) link service requests.

The LxEN bits in the LCTL register support single-word interrupt-driven transfers for each corresponding link port buffer. These non-DMA transfers are available under the following conditions:

The link port DMA channel's LxDEN bit is cleared (DMA disabled).

The link port DMA channel's LxEN bit is set, enabling the link buffer.

The link port DMA channel's buffer is not empty on receive or not full on transmit.

Under these conditions, the I/O processor generates that DMA channel's interrupt on the single-word transfer to or from that channel's link port buffer.
the buffer. The bits may change state if the data is read or written by the processor core while the serial port is disabled. Table 6-33 shows the available settings.

Table 6-33. Serial Port Transmit and Receive Buffer FIFO Status

DXS_A or DXS_B    FIFO Buffer Status
00                buffer empty
01                reserved
10                partially full
11                buffer full
The DERR_A and DERR_B bits in the SPCTLx registers indicate a serial port transmit underflow or receive overflow in the buffer's FIFO. Status bits are read-only. Disabling the serial port (setting SPEN=0) clears the status bits and empties the buffer. These overflow and underflow bits are sticky; once set, they remain set regardless of buffer status until the serial port is disabled.

The SPEN bit in the SPCTLx register supports single-word interrupt-driven transfers for each corresponding serial port transmit or receive buffer. These non-DMA transfers are available under the following conditions:

The serial port DMA channel's SDEN bit is cleared (DMA disabled).

The serial port DMA channel's SPEN bit is set (enabling the serial port transmit or receive buffer).

The serial port DMA channel's buffer is not empty on receive or not full on transmit.

Under these conditions, the I/O processor generates that DMA channel's interrupt on the single-word transfer to or from that channel's serial port buffer.
the receive shift register. This bit is set if the interrupt is not serviced quickly enough and the contents of SPIRX are not transferred.

Receive Data Buffer Status (read-only). SPISTAT Bits 6-7 (RXS). These bits indicate the status of the SPI port receive buffer (SPIRX). If RXS=00, the buffer is empty. See Table 6-34 for available RXS bit settings.
[Figure: SPISTAT register, address 0xB5.]

RXS     SPIRX Data Buffer Status (read-only): 00 = SPIRX empty, 01 = SPIRX partially full, 10 = reserved, 11 = SPIRX full
SPIF    SPI Transmit Transfer Complete: 1 = transfer complete, 0 = active transfer
MME     Multimaster Error: 0 = no error, 1 = SPIDS~ asserted by slave
RBSY    Reception Error (Overflow): 1 = new data received with full RXB FIFO; SPI enters idle mode if master device
TXE     Transmission Error (Underflow): 1 = no new data in TX FIFO; SPI enters idle mode if master device
TXS     SPITX Data Buffer Status (read-only): 00 = SPITX empty, 01 = SPITX partially full, 10 = reserved, 11 = SPITX full
Figure 6-19. SPISTAT Register

The TXS and RXS bits in the SPISTAT register indicate the FIFO status of the SPI port transmit (SPITX) or receive (SPIRX) buffer. Disabling the SPI port (setting SPIEN=0) clears the status bits and empties the buffer. TXS and RXS may change state if the data is read or written by the processor core while the SPI port is disabled. Table 6-34 shows the available settings.

Table 6-34. SPI Port Transmit and Receive Buffer FIFO Status

TXS or RXS    FIFO Buffer Status
00            buffer empty
01            partially full
10            reserved
11            buffer full
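The two-bit FIFO status encodings are not the same across ports: the link and serial ports reserve code 01, while the SPI port reserves code 10. The following Python lookup is a small illustrative sketch that collects the encodings from Tables 6-32, 6-33, and 6-34 so the difference is easy to see. The dictionary, its keys, and the function name are hypothetical.

```python
# Illustrative lookup of the two-bit FIFO status codes; names are hypothetical.
FIFO_STATUS = {
    "link":   {0b00: "buffer empty", 0b01: "reserved",
               0b10: "one word", 0b11: "buffer full"},        # Table 6-32
    "serial": {0b00: "buffer empty", 0b01: "reserved",
               0b10: "partially full", 0b11: "buffer full"},  # Table 6-33
    "spi":    {0b00: "buffer empty", 0b01: "partially full",
               0b10: "reserved", 0b11: "buffer full"},        # Table 6-34
}


def fifo_status(port, code):
    """Return the documented meaning of a two-bit FIFO status code for a port."""
    if port not in FIFO_STATUS:
        raise ValueError("port must be 'link', 'serial', or 'spi'")
    return FIFO_STATUS[port][code & 0b11]


if __name__ == "__main__":
    for port in ("link", "serial", "spi"):
        print(port, [fifo_status(port, code) for code in range(4)])
```

Code that treats any nonzero value as "data present" works for all three ports, but code that tests for one specific intermediate value must use the right encoding for the port.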
The TXE and RBSY bits in the SPISTAT register indicate an SPI port transmit underflow or receive overflow in the buffer's FIFO. Status bits are read-only. Disabling the SPI port (setting SPIEN=0) clears the status bits and empties the buffer. These overflow and underflow bits are sticky; once set, they remain set regardless of buffer status until the SPI port is disabled.

Under these conditions, the I/O processor generates that DMA channel's transfer request over the IOD bus on the single-word transfer to the SPITX data buffer or from the SPIRX data buffer.
Each DMA transfer takes one clock cycle even when different DMA channels are being allowed access on sequential cycles; that is, there is no overall throughput loss in switching between channels. Thus, two link port DMA channels, each transferring one byte per cycle, would have one half the I/O transfer rate as one external port DMA channel transferring data to internal memory on every cycle. Any combination of link ports, serial ports, and external port transfers has the same maximum transfer rate.
Figure 6-20 shows an example DMA hardware interface. The following should be noted in this figure.

Because DMARx and DMAGx are tied together, only one of the processors may have DMA enabled at a time.

The DMA Write Grant signal can be the combination of WR and MS3-0 instead of DMAG2 if paced master mode is used.

The DMA Read Grant signal can be the combination of RD and MS3-0 instead of DMAG1 if paced master mode is used.

DMA transfers may be to either processor or to external memory (in external handshake mode).
[Figure 6-20: Example DMA hardware interface. Two ADSP-2116x processors (ID2-0 = 001 and ID2-0 = 010) share ADDR23-0, DATA47-16, RD, WR, ACK, MS3-0, HBR, HBG, BR1, and BR2 with external memory (ADDR, DATA, OE, WE, ACK, CS). 8-, 16-, or 32-bit latches sit on the DMA data path.]
Figure 6-21. DMAR and DMAG Timing

Figure 6-21 shows DMAR and DMAG timing. The following should be noted in this figure.

DMARx setup times relate to the use of the signal in that cycle by the processor. DMA requests may be asserted asynchronously to CLKIN.

DMAGx drives DATA47-16 if the processor is receiving. DMAGx latches DATA47-16 if the processor is transmitting.
When data is to be transferred from internal to external memory, the internal memory data is first placed in the external port's EPBx buffer by the DMA controller; the external memory access begins independently once the data is detected in the EPBx buffer. Likewise, for external-to-internal DMA, the internal DMA request is not made until the external memory data is in the EPBx buffer. In both cases, the external DMA address generator (the EIEPx and EMEPx parameter registers) maintains the external address until the data transfer is completed. The internal and external address generators of a DMA channel are decoupled and operate independently.

When EXTERN mode DMA transfers occur between an external device and external memory, no internal resources of the processor are utilized and internal DMA throughput is not affected.
System-Level Considerations
Slave mode DMA is useful in systems with a host processor because it allows the host to access any processor internal memory location indirectly through DMA, while limiting the address space the host must recognize to only the address space of the processor's I/O processor registers. Slave mode DMA is also useful for processor-to-processor DMA transfers.

Slave mode DMA has one drawback when interfacing to a slow host: the external bus is held up during the transfer (whether initiated by the processor or the host), and no other transactions can proceed. To overcome this, the handshake DMA mode may be used. In handshake mode, the host does not have to master the bus in order to make a DMA request, nor does the processor (in master mode) have to wait on the bus for the transfer to complete. Instead, the host asserts the DMARx pin. When the processor is ready to make the transfer, it can complete it in one bus cycle. For more information, see Handshake Mode on page 6-57.
7 EXTERNAL PORT
The ADSP-21161 processor's external port extends its address and data buses off-chip. Using these buses and external control lines, systems can interface the processor with external memory, 8-, 16- or 32-bit host processors, and other DSPs. Because many of the external port operations relate to external memory accessing or I/O processing, this chapter refers to the memory and I/O processor chapters (Memory on page 5-1 and I/O Processor on page 6-1) frequently.

This chapter describes connection and timing issues for the external port. The main sections of this chapter describe the interfaces that are available through the external port. These interfaces include:

External Memory Interface on page 7-3
Host Processor Interface on page 7-42
Multiprocessor (MP) Interface on page 7-87

Data alignment through the external port is identical for these interfaces. Figure 7-1 shows the external port's data alignment.
[Figure 7-1: External port data alignment across DATA47-16 and DATA15-0 for PROM boot, 8-bit packed DMA data, 8-bit packed instruction execution, 16-bit packed DMA data, 16-bit packed instruction execution, float or fixed D31-D0 (32-bit packed DMA), 32-bit packed instruction execution, and 48-bit instruction fetch (no packing).]

Extra data lines DATA[15-0] are only accessible if link ports are disabled. Enabled when IPACK = 1 in the SYSCON register.
Figure 7-2 shows how the buses and control signals extend off-chip, connecting to external memory. Table 7-1 defines the processor pins used for interfacing to external memory.

The processor's memory control signals permit direct connection to fast static RAM devices. Memory-mapped peripherals and slower memories can also connect to the processor using a user-defined combination of programmable waitstates and hardware acknowledge signals.

External memory can hold instructions and data. Packed instructions can be executed directly from 32-bit, 16-bit, or 8-bit wide external memories using the 32- to 48-bit, 16- to 48-bit, or 8- to 48-bit execution packing modes supported by the external port and program sequencer. The external port can also be configured to have a 48-bit wide external data bus for 48-bit non-packed execution of instructions when link ports are not used. The link port data lines are multiplexed with the data lines D0 to D15 and are enabled through control bits in the memory-mapped control register SYSCON.

Data packing of 32- to 48-bit, 16- to 48-bit, 8- to 48-bit, 32- to 32/64-bit, 16- to 32/64-bit, or 8- to 32/64-bit is supported for DMA transfers directly from 32-bit, 16-bit, or 8-bit wide external memories to and from 32-, 48-, or 64-bit internal memory. Figure 7-1 shows how the processor transfers different data word sizes over the external port.
[Figure 7-2: ADSP-21161 system. The processor's ADDR23-0, DATA47-16, RD, WR, ACK, MS3-0, BMS, and BRST signals connect to external memory (CS, ADDR, DATA). Optional connections are shown to SDRAM (RAS, CAS, DQM, SDWE, SDCLK[1-0], SDCKE, SDA10), to a DMA device (DMAR1-2, DMAG1-2), and to a host processor interface (CS, HBR, HBG, REDY, ADDR, DATA). Other pins shown include CLKOUT, BR1-6, PA, SBTS, FLAG11-0, TIMEXP, RPBA, ID2-0, the link port pins (LxCLK, LxACK, LxDAT7-0), the serial port pins (SCLKx, FSx, DxA, DxB), and the SPI pins (SPICLK, SPIDS, MOSI, MISO).]
Pin                   Type
ADDR23-0              I/O/T
BRST                  I/O/T
CLKOUT                O/T
LxDAT7-0 [DAT15-0]    I/O [I/O/T]
MS3-0                 I/O/T
WR                    I/O/T

I (Input), S (Synchronous), o/d (Open Drain), O (Output), A (Asynchronous), a/d (Active Drive), T (Three-state, when SBTS or HBR is asserted, or when the processor is a bus slave)
The MS3-0 outputs serve as chip selects for memories or other external devices, eliminating the need for external decoding logic. For more information, see Timing External Memory Accesses on page 7-13. The MS3-0 lines are decoded memory address lines that change at the same time as the other address lines. When no external memory access is occurring, the MS3-0 lines are inactive. Unlike previous SHARCs, strobe assertion for conditional instructions occurs only when the instruction condition code evaluates as true.
Boot Memory
Most often, the processor only asserts the BMS memory select line when the processor is reading from a boot EPROM. This line allows access to a separate external memory space for booting. Both ROM boot memory waitstates and the mode of the WAIT register are applied to BMS-selected accesses. The BMS output is only driven by the processor bus master. For more information on booting, see Bootloading Through The External Port on page 6-70 or Bootloading Through The Link Port on page 6-88. It is also possible to write to boot memory using BMS. For more information, see Using Boot Memory on page 5-35.

Idle Cycle

A bus idle cycle is an inactive bus cycle that the processor automatically generates to avoid data bus driver conflicts. Such a conflict can occur when a device with a long output disable time continues to drive after RD is deasserted while another device begins driving on the following cycle. Idle cycles are also required to provide time for a slave in one bank to three-state its ACK driver before the slave in the next bank enables its ACK driver in the synchronous access modes. Figure 7-3 shows idle cycle insertion between a synchronous read and a zero-wait, synchronous write in cycle 3.
[Timing diagram: CLKIN, ADDRESS23:0, MS3-0, RD, WR, BRST, DATA47:16, and ACK over cycles 1 through 5, showing a read operation, an idle cycle, and a write operation.]

Figure 7-3. Idle Cycle Example

All timing diagrams show the default data bus width, DATA47:16. When the full bus is enabled for 48-bit non-packed execution of instructions or transfers of data with the PX register, the data bus width is 48 bits, DATA47:0.
To avoid this data bus driver conflict, the processor generates an idle cycle in the following cases:

• On a transition from a read operation to a write operation in the same bank.
• On a transition from one bank or multiprocessor memory ID space to any other bank or multiprocessor slave ID space, independent of access mode.

Unlike previous SHARCs, the ADSP-21161 processor does not support idle cycle insertion on a page boundary crossing.

Data Hold Cycle

The data hold cycle is another configurable memory access feature for adding cycles, much like waitstates, as discussed in Setting Data Access Modes on page 5-32. A hold time cycle is an inactive bus cycle that the processor automatically generates at the end of a read or write to allow a longer hold time for address and data. The address, data (if a write), and bank select (if in banked external memory) remain unchanged and are driven for one cycle after the read or write strobes are deasserted. The processor inserts the data hold cycle only in asynchronous mode and only if the programmed waitstate code (EBxWS) is 010 through 111. Figure 7-4 demonstrates a hold time cycle appended to an asynchronous write access (EBxWS=011). The ADSP-21161 processor does not append an Idle cycle after a Hold cycle.

Figure 7-4. Hold Time Cycle Example

Multiprocessor Memory Space Waitstates and Acknowledge

Multiprocessor memory space uses only the synchronous transfer protocols, using the zero-waitstate access for writes and a minimum one-waitstate access for reads. Slave processors deassert ACK if more access time is required. DMA burst transfers are only defined for direct read access of a processor slave's external port buffers (EPBx). For more information, see Multiprocessor (MP) Interface on page 7-87. The ADSP-21161 processor does not support the MMSWS bit from previous SHARCs.
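The idle cycle and data hold cycle rules described earlier in this section can be summarized as two predicates. The following is a minimal C sketch for reasoning about bus timing, not processor code. The type bus_op_t and the functions idle_cycle_needed and hold_cycle_inserted are hypothetical names introduced for illustration, and the integer "space" values are an assumed encoding in which every external bank or multiprocessor ID space has its own number.

```c
#include <stdbool.h>

/* Hypothetical model of one external port access (illustration only). */
typedef enum { OP_READ, OP_WRITE } bus_op_t;

/*
 * Idle cycle rule: an idle cycle is generated on a read-to-write transition
 * in the same bank, and on any transition between banks or multiprocessor
 * ID spaces. 'space' is an assumed encoding: one distinct integer per bank
 * or multiprocessor ID space.
 */
bool idle_cycle_needed(int prev_space, bus_op_t prev_op,
                       int next_space, bus_op_t next_op)
{
    if (prev_space != next_space)
        return true;                       /* bank or ID space changes */
    return prev_op == OP_READ && next_op == OP_WRITE;
}

/*
 * Hold cycle rule: inserted only in asynchronous mode and only when the
 * 3-bit EBxWS waitstate code is 010 through 111 (2 through 7).
 */
bool hold_cycle_inserted(bool async_mode, unsigned ebxws_code)
{
    return async_mode && ebxws_code >= 2u && ebxws_code <= 7u;
}
```

For example, a read followed by a write to the same bank yields an idle cycle, while two consecutive reads of the same bank do not.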
CLKOUT with CLKDBL tied low can be used as a clock source to peripherals only in single processor systems.
The synchronous interface mode supports DMA burst transfers, which can significantly improve bus throughput for large, contiguous block transfers. The synchronous interface protocols are compatible with Synchronous Burst SRAMs (SBSRAMs) from a variety of vendors. In a multiprocessing system, the ADSP-21161 processor must be the bus master in order to access external memory.

When interfacing to synchronous external memories, CLKIN must be used to provide the clock source to the synchronous device.

Asynchronous Mode Interface Timing

Figure 7-5 shows typical timing for an asynchronous read or write of external memory. Here, the CLKIN clock signal indicates that the access occurs within a single CLKIN cycle. All timing for the master processor is derived synchronously from CLKIN. The asynchronous slave mode modifies the basic synchronous access to better support slaves whose timing is not derived from CLKIN.

Figure 7-6 shows timing relationships used by the asynchronous external access mode. In this mode:

• The strobes assert and deassert based on timing derived from an internal clock whose frequency is twice that of the core clock. (This differs from synchronous mode where the strobes assert from the same edge.) The trailing edge timing is derived from the rising edge of the internal version of CLKIN.
• The MSx memory select lines are held stable for the entire access. (This differs from the synchronous read or synchronous write, minimum 2-cycle, modes where the memory select lines are deasserted after the first cycle of the transfer that uses ACK.)
[Timing diagram: CLKIN, ADDRESS23-0 (read/write address), MS3-0, RD or WR, DATA47:16 for a write (write data), DATA47:16 for a read, ACK.]
Figure 7-5. External Memory Asynchronous Access Cycle

For read operations, DATA47:16 are sampled by the processor on the rising edge of RD. This differs from synchronous mode, where DATA47:16 are sampled by the internal version of CLKIN.

Asynchronous memories or memory mapped devices that require added waitstates through the deassertion of ACK must be configured for a minimum of one internal waitstate due to a potential lack of sufficient decode time for ACK delay from address/selects. Refer to ADSP-21161N DSP Microcomputer Data Sheet for timing specifications.

Figure 7-6. Asynchronous Access Timing Derivation

Asynchronous Mode Read, Bus Master

Processor bus master reads of external memory, in asynchronous mode, occur with the following sequence of events as shown in Figure 7-5.

1. The processor samples ACK synchronously. If ACK is asserted, the processor drives the read address and asserts a memory select signal (MS3-0) to indicate the selected bank. A memory select signal is not deasserted between successive accesses of the same memory bank. If ACK is sampled deasserted, the processor waits one CLKIN cycle to sample ACK again.
2. The processor asserts the read strobe.
3. The processor checks whether waitstates are needed. If so, the memory select and read strobe remain active for additional cycles. Waitstates are determined by a combination of the state of the external acknowledge signal (ACK) AND the internally programmed waitstate count.
4. The processor deasserts the read strobe in the cycle where no further waitstates are indicated. The data bus (DATA47:16) is sampled on the rising edge of the read strobe.
5. If a hold cycle is programmed for the accessed bank (via the EBxWS parameter of the WAIT register), the address bus and memory selects are held stable for an additional cycle. If initiating another read memory access to the same bank, the processor drives the address and memory select for that access in the next cycle.

Asynchronous Mode Write, Bus Master

Processor bus master writes to external memory, in asynchronous mode, occur with the following sequence of events as shown in Figure 7-5.

1. The processor samples ACK synchronously. If ACK is asserted, the processor drives the write address and asserts a memory select signal (MS3-0) to indicate the selected bank. A memory select signal is not deasserted between successive accesses of the same memory bank. The processor also drives the write data (DATA47:16). If ACK is sampled deasserted, the processor waits one CLKIN cycle to sample ACK again.
2. The processor asserts the write strobes.
3. The processor checks whether waitstates are needed. If so, the memory select and write strobe remain active for additional cycles. Waitstates are determined by a combination of the state of the external acknowledge signal (ACK) AND the internally programmed waitstate count.
4. The processor deasserts the write strobes near the end of the cycle where no further waitstates are indicated.
5. The processor three-states its data outputs, unless the next access is also a write to the same bank, or if a hold cycle is programmed for the accessed bank using the EBxWS parameter of the WAIT register. If a Hold cycle is inserted, the address bus, data bus, and memory selects are held stable for an additional cycle. If initiating another memory access to the same bank, the processor drives the address and memory select for the next access in the following cycle.

Synchronous Mode Interface Timing

Any slave addressed by a processor in a bank configured for synchronous transfer mode must use a clock with frequency and phase characteristics similar to CLKIN on the processor. The slave samples all inputs, and drives all outputs, on the rising edge of this clock. Except for zero-waitstate writes, the slave must assert ACK at least twice for each access: once to acknowledge the address/command (strobe assertion) and once (if not a burst) or more to acknowledge the data transfer. Due to insufficient decode time, the first ACK can be due to the keeper latch (internal pullup enabled for ID=00x) holding the assertion of ACK from the previous slave.

The following notes apply to all synchronous access modes:

A slave recognizes the start of a valid bus operation by synchronously sampling one or more of the strobes and ACK asserted. ACK assertion is by the previous bus slave, allowing a new bus access to launch.

For each of the non-burst, synchronous read/write accesses (except zero-waitstate writes), the master recognizes the end of the access as the cycle in which:

1. The slave samples or drives data in response to a valid operation driven by the master (read or write),
2. The slave asserts ACK to the master (except for zero-waitstate write operations), and
3. The number of waitstates for read or write access to that bank has occurred; asserting ACK does not terminate the wait count early. The program must select a number of waitstates that is consistent with the access time for the slave addressed by that external memory bank.

For the zero-waitstate writes, the access can only be extended beyond one clock cycle by deasserting ACK in the cycle of the transfer. This extension can occur on back-to-back writes in which ACK is deasserted due to full write buffer capacity from the previous write. Otherwise, slaves can asynchronously deassert ACK in the first cycle.

Deasserting ACK during the initial command phase does inhibit waitstate count and change of bus signals. After the first ACK assertion, deasserting ACK for the data phase does not inhibit waitstate counting.

Only one slave (or driver for ACK) should be allocated per external memory bank. More than one slave may introduce ACK drive contention.

The read/write strobes for an access do not assert until ACK is sampled asserted. This conditional strobe assertion delays the start of an access until ACK is asserted by the previous slave. This sampling is because the slave target of a single-cycle write operation may have deasserted ACK in the cycle (due to a previous write access), to stall further writes to that slave.

To provide a cycle for the previous slave to three-state its ACK driver before the next slave drives ACK, the next operation to a new bank must not launch on the bus.
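The end-of-access rule for non-burst synchronous accesses (the programmed waitstate count must have elapsed and ACK must be asserted, with ACK unable to end the count early) can be illustrated with a small C sketch. This is a simplified model under stated assumptions: it covers only the data phase that follows an acknowledged command, it takes the ACK levels as a bit mask supplied by the caller, and the function name sync_access_end_cycle is hypothetical.

```c
/*
 * ack_trace: bit i holds the ACK level sampled in data-phase cycle i + 1
 * (1 = asserted). This bit-mask encoding is an assumption of the sketch.
 * Returns the 1-based data-phase cycle in which the access ends, or 0 if
 * it has not ended within n_cycles (at most 32 cycles are examined).
 */
unsigned sync_access_end_cycle(unsigned waitstates, unsigned ack_trace,
                               unsigned n_cycles)
{
    for (unsigned cycle = 1; cycle <= n_cycles && cycle <= 32u; cycle++) {
        /* The waitstate count advances even while ACK is deasserted. */
        int wait_done = cycle >= waitstates;
        int ack = (int)((ack_trace >> (cycle - 1u)) & 1u);
        if (wait_done && ack)
            return cycle;   /* both conditions met: access completes */
    }
    return 0;
}
```

With one programmed waitstate and ACK deasserted for the first data-phase cycle, the access ends one cycle late. With three programmed waitstates, asserting ACK in the first cycle does not shorten the access.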
Write/read access stalls (no state change, other than internal waitstate counting) on the bus if ACK is deasserted in cycle(s) of data transfer.

The last read/write operation must be acknowledged via ACK before a transition to a new bus master (BTC), bank, or multiprocessor space slave occurs. The master always inserts an Idle cycle on this transition. No pipelining can occur across these boundaries.

Synchronous Mode Read, Bus Master

An example synchronous read cycle appears in Figure 7-7. Propagation delays are not shown in this timing diagram. Because a synchronous access requires a rising clock edge for the slave to sample the asserted signals of the master (and for the master to sample the signals of the slave), the minimum read access in the synchronous mode is two CLKIN cycles.

In synchronous access mode, the waitstate selection in the WAIT register (EBxWS) must be 001 or greater. EBxWS=000 is not supported in synchronous access mode.

This example demonstrates a minimum latency, one-waitstate, 32-bit (normal word) read from external memory. Bus master synchronous reads from external memory occur with the following sequence of events as shown in Figure 7-7:

1. (cycle 1) If ACK is sampled as asserted at the beginning of cycle 1, the processor drives the read address and asserts a memory select signal (MS3-0) to indicate the selected bank. The processor asserts the RD strobe. The read strobe is not deasserted between successive read accesses of the same memory bank.
2. (cycle 2) If ACK was sampled as deasserted at the beginning of the cycle (not shown), the MSx strobes would remain asserted. If ACK was sampled asserted, the MSx strobes would deassert. The slave must be capable of detecting that MSx was asserted in cycle 1 and
must retain this information internally. If ACK was deasserted by the previous slave (for a single-cycle write), deassertion of the MSx is delayed.
Figure 7-7. Typical Synchronous Read Timing

3. (cycle 2) The processor checks whether more than one waitstate is needed. If so, the read strobe remains active for additional cycle(s). Waitstates are determined by a combination of the state of the external acknowledge signal (ACK) AND the programmed waitstate count.
4. (end of cycle 2) The data bus (DATA47:16) is sampled on the rising edge of CLKIN.
5. (cycle 3) If initiating another read memory access to the same bank, the processor drives the address, memory select, and strobe for the next access.

Figure 7-8 shows back-to-back reads to the same bank with the second access stalled for one cycle by the slave deasserting ACK. This example assumes that EBxWS=001 for this bank, indicating one internal waitstate.

Figure 7-8. Two Synchronous Reads From Same Bank

Synchronous Write, Zero-Waitstate Mode

Figure 7-9 on page 7-24 shows typical synchronous write cycle timing. Propagation delays are not shown in this timing diagram.

Synchronous access requires a rising clock edge for the slave to sample the asserted signals of the master (and for the master to sample the signals of the slave). In the case of writes, the latency can be reduced to a single cycle if the slave always latches the bus signals on each clock cycle (it does not sample ACK). For example, the slave cannot sample the bus, decode that it is being addressed as a slave, and sample the write data of the bus in the following cycle. The slave samples the bus each cycle and decodes the sampled value to determine if that slave was addressed by the write operation. If the slave's write queue becomes full with that write, the slave deasserts ACK in the cycle after the write operation transferred on the bus. Any subsequent bus operation (read or write) stalls until ACK is sampled asserted, as shown in cycle 2 of Figure 7-9.

The example demonstrates a minimum latency, zero-waitstate, 32-bit write in cycle 1 followed by a write to the same bank. This write stalls because ACK is deasserted in cycle 2 in response to the write in cycle 1. The second access is a 32-bit write to external memory.

The zero-waitstate write mode provides the highest performance if the slave has sufficient write buffer storage. Systems should use this mode where the slave can always accept one write transfer (unless ACK is deasserted) and can generally accept more than one write. If the slave has only one store buffer, such that it always deasserts ACK after the first write, the one-waitstate write mode may be the better choice. The zero-waitstate write mode is targeted towards ASIC/FPGA designs, which implement multiple write buffers (including processor as a slave), and fully pipelined synchronous devices such as SBSRAMs. Slaves that do not support bursting protocols do not need to connect to the BRST signal.
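The relationship between a slave's write buffer depth and its ACK behavior in zero-waitstate write mode can be sketched in C as follows. This is an illustrative model for an ASIC/FPGA slave designer, not a description of any particular device. The function names and the notion of a simple occupancy count are assumptions made for the sketch.

```c
#include <stdbool.h>

/*
 * ACK level the slave drives in the cycle after it accepts one
 * zero-waitstate write.
 *   queue_depth : total number of write buffer entries in the slave
 *   used_before : entries already occupied before this write
 * The slave deasserts ACK (returns false) when this write fills the queue.
 */
bool ack_after_write(unsigned queue_depth, unsigned used_before)
{
    return used_before + 1u < queue_depth;
}

/*
 * A slave with a single store buffer deasserts ACK after every write, so
 * the one-waitstate write mode may be the better choice for it.
 */
bool one_waitstate_mode_preferred(unsigned queue_depth)
{
    return queue_depth <= 1u;
}
```

A slave with four buffer entries keeps ACK asserted for the first three back-to-back writes and deasserts it after the fourth, provided no entry drains in between.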
[Timing diagram: write #1 followed by write #2, starting in cycle 1. Signals shown: CLKIN, ADDRESS23-0, MS3-0, RD, WR, BRST, DATA47:16, ACK.]
Figure 7-9. Typical Synchronous Write Example

Bus master synchronous writes to external memory occur with the following sequence of events as shown in Figure 7-9:

1. (cycle 1 in Figure 7-9) If ACK is sampled asserted at the start of cycle 1, the processor bus master drives the write address and asserts a memory select signal (MS3-0) to indicate the selected bank. The processor asserts the WR strobe. The write strobe is not deasserted between successive write accesses of the same memory bank.
2. (cycle 1) The previous slave three-states ACK. Note that the previous slave could have driven ACK deasserted through cycle 1 if a write in the previous cycle caused its write queue to fill. Only one slave is supported per bank, and any bank transition has an IDLE cycle inserted to provide time for the slave to three-state ACK.
3. (cycle 2) The processor is initiating another write memory access to the same bank. It drives the address, memory select, and strobe for the next access.
4. (cycle 2) The slave, having decoded that it received a valid write operation in the previous cycle, detects that it cannot accept further bus operations until an element in the write queue becomes available, so it deasserts ACK.
5. (cycle 3) The processor samples ACK deasserted by the slave. It inserts waitstates until ACK is sampled asserted. The write ends in the cycle where ACK is asserted by the slave (end of cycle 3).

Figure 7-10 shows a zero-waitstate write, followed by a synchronous read from the same bank. The slave addressed by both accesses determines in cycle 2 that it has no more write capacity. It deasserts ACK in this cycle, in response to the write in cycle 1. In cycle 3, the slave determines that it is now addressed by the master to perform a read and asserts ACK to acknowledge the transfer. The slave asserts ACK in cycle 4 when read data is available to complete the data transfer. The memory select for the read access is held asserted by the master until cycle 4, because ACK was deasserted in cycle 2.

[Timing diagram: CLKIN, ADDRESS23-0 (write address, read address), MS3-0, RD, WR, BRST, DATA47:16 (write data, read data), ACK.]

Figure 7-10. Synchronous Write Followed by Synchronous Read Example

Synchronous Write, One Waitstate Mode

Because some synchronous slaves cannot support a free-running latch function to capture zero-wait bus writes, the processor also supports a minimum two-cycle (minimum one-waitstate) write access. This mode is set using the bank Access Mode bits (EBxAM). For more information on access modes, see Table A-20 on page A-66.

The one-waitstate, synchronous write access is shown in the second write of Figure 7-11. In this example, the first access is to a bank configured for asynchronous writes (cycle 1). In Figure 7-11, this condition is shown by the deassertion of the write strobe before the rising edge of CLKIN for cycle 2. In cycle 2, a bank transition occurs, and an idle cycle is inserted to allow the slaves to transition ownership of ACK. In cycle 3, the second write begins, to a new bank configured for one-waitstate write access. The address and data are held for a minimum of two cycles. Similar to the synchronous read, MSx deasserts in the second cycle of the write (cycle 4), and the waitstate counter decrements if ACK is sampled asserted. The access can be held off the bus by deasserting ACK in cycle 2, or extended by deasserting ACK in cycle 3 (unlikely for a synchronous slave) or cycle 4.

Synchronous Burst Mode Interface Timing

Synchronous burst mode provides improved performance on synchronous operations. The processor supports a DMA-mastered burst mode. If the addressed slave supports this burst transfer, after the one or more waitstates associated with access to the first 32-bit read data transfer,
contiguous data can transfer on each subsequent clock cycle, up to a maximum of four 32-bit transfers. Burst accesses support only 32-bit data transfers. Partial data bus width transfers are not supported. For burst transfers, the master drives the address of the first access on the bus during the entire burst transfer. The master does not increment the address for the slave. Because the maximum length of the burst transfer is four, slaves only need a 2-bit address incrementer to generate the offset address from the address driven by the master on the bus. Table 7-2 shows burst length determination as a function of initial address. If the DMA channel has sufficient data to transfer, it initiates a new burst transfer starting at ADDR1-0 = 00, 01, or 10 when it wins bus arbitration. Bursts always terminate when ADDR1-0=11.
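Because a burst always ends when ADDR1-0 reaches 11, the burst length and the per-transfer addresses that a slave must generate follow directly from the two low bits of the initial address. The C sketch below illustrates this arithmetic. It is derived from the rules stated above rather than copied from Table 7-2, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Number of 32-bit transfers from start_addr up to and including the
 * address whose two low bits are 11. A result of 1 corresponds to a start
 * at ADDR1-0 = 11, where no burst is initiated (single transfer only).
 */
unsigned burst_length(uint32_t start_addr)
{
    return 4u - (unsigned)(start_addr & 0x3u);
}

/*
 * Address of transfer k (k = 0 for the first transfer) as a slave would
 * form it: the master holds start_addr on the bus, and the slave advances
 * only the two low bits with a 2-bit incrementer. Valid for
 * k < burst_length(start_addr).
 */
uint32_t burst_transfer_addr(uint32_t start_addr, unsigned k)
{
    uint32_t low = (start_addr + k) & 0x3u;
    return (start_addr & ~(uint32_t)0x3u) | low;
}
```

For an initial address ending in 01, the burst length is three and the slave steps through low bits 01, 10, and 11.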
[Timing diagram: write #1 in cycle 1, idle in cycle 2, write #2 to a different bank in cycles 3 and 4. Signals shown: CLKIN, ADDRESS23-0, MS3-0, RD, WR, BRST, DATA47:16, ACK.]

Figure 7-11. Asynchronous Write Followed By Synchronous Write One-Waitstate Mode
An example of a synchronous burst read of length three appears in Figure 7-12. Here, the bank used in the transfer has two waitstates.
[Timing diagram for Figure 7-12; annotation: ADDRESS[1:0] = 01]
Burst Length Determination

The DMA arbitration logic reduces the initial access latency by bursting up to the maximum burst length of four when possible, assuming the channel is burst enabled. When a DMA channel wins internal I/O processor arbitration, the channel drives the internal buses as with a non-burst transfer. At the same time, the I/O processor detects whether it can perform a burst transfer, according to the following criteria:

1. The DMAC burst enable (MAXBL1-0) control bit field is set for that DMA channel.
2. The EM register is set to 0 or 1. A value of 0 does not increment EI. This is useful when bursting to or from a registered data port, buffer, or register, such as the EPBx FIFOs of another processor.
3. The EC register is greater than or equal to two (32-bit) words.
4. The EPBx FIFO for that channel has at least two 32-bit words to transfer for an external burst write or has at least two empty 32-bit elements to receive data for an external burst read.
5. The two least significant bits of the DMA channel external address are not set (ADDR1-0 does not equal 11).

Burst Stall Criteria

If the I/O processor determines that it can perform a burst transfer (according to the burst length criteria), the arbitration between the processor core and the I/O processor locks the effective arbitration grant to that DMA channel until:

1. The DMA channel external ADDR1-0 = 11. By disconnecting the burst on this boundary, a modulo4 (ADDR23-0) is effectively implemented, which is required by SBSRAMs, and other slaves with
limited address incrementing capability. For processor-based systems, slaves only need a 2-bit counter to support the address incrementing function of the burst.
2. Space in the EPB FIFO drops to less than two 32-bit elements (if an external bus read), or less than four valid 32-bit elements for external bus writes. This almost full or empty detection is required by the master logic to deassert BRST on the cycle before the end of the burst.
3. EC
4. HBR and SBTS are asserted on the external bus, indicating the deadlock resolution case in which the processor must three-state its outputs and switch into slave mode. For more information, see Deadlock Resolution on page 7-82. Assertion of either signal alone does not terminate the burst early. HBR assertion does not receive an HBG until the burst finishes. SBTS assertion causes the master to three-state outputs and insert waitstates.

If any of these conditions occur, normal arbitration between the processor core and I/O processor for the external bus occurs. If the same bursting channel wins arbitration again, a new burst is initiated, introducing at least one lost or dead cycle in the burst throughput for reads. When arbitration occurs, the DMA channel loses arbitration if any of the following conditions are detected:

1. Higher priority external request for the bus:
   a. HBR asserted
   b. BRx asserted and BMAX time out has occurred
   c. BRx asserted and PA asserted, but not by this master
2. Higher priority internal I/O processor requester:
   a. Processor core request (DAGs or program sequencer)
   b. A higher priority request from another DMA channel or direct read/write access causes this channel to lose arbitration. For more information, see I/O Processor on page 6-1.

Synchronous Burst Reads

External memory synchronous burst reads occur with the following sequence of events as shown in Figure 7-12:

1. (cycle 1 in Figure 7-12) If ACK is sampled asserted at the beginning of cycle 1, the processor drives the read address and asserts a memory select signal (MS3-0) to indicate the selected bank.
2. (cycle 1) The processor asserts the RD strobe to indicate a read request of the slave.
3. (cycle 2) As with the non-burst synchronous read, the processor deasserts the MSx output signal, asserts the BRST output signal, and enables waitstate counting if ACK is sampled asserted at the end of cycle 1.
4. (cycle 2) The processor checks whether more than one waitstate (two waitstates in this example) is needed. If so, BRST and the read strobe remain active for additional cycle(s).
5. (cycle 3) The slave samples BRST asserted, informing it that the master requests at least one more transfer after the current transfer is acknowledged via ACK by the slave.
6. (cycle 3) The programmed number of waitstates has been counted, and the slave is driving 32 bits of valid data and asserting the ACK signal. This ends the first access.
7. (cycle 4) The slave drives the next 32 bits of contiguous data and asserts ACK. If the slave needs more time to service any one transfer within the burst, it can deassert ACK to stall the bus transfer.
8. (cycle 4) The slave samples BRST asserted, informing it that the master requests at least one more 32-bit transfer.
9. (cycle 5) The master deasserts BRST to inform the slave that this is the last transfer of the burst. In this example, the master deasserts BRST due to the address modulo4 function. The two LSBs of the initial address = 01. The slave increments the address as 01->10->11. This is the maximum offset needed to support from the initial address.
10. (cycle 5) The slave drives valid data for the last transfer, and asserts ACK.
11. (cycle 6) If initiating another burst read memory access to the same bank, the processor asserts the address, memory select, and strobes for the next access. This introduces at least two dead cycles in the back-to-back burst throughput, because the initial waitstate count applies to the first access of the second burst.
12. (cycle 6) With BRST sampled deasserted, the slave concludes its service of the burst request by three-stating the DATA47:16 and ACK drivers.

As a master, the processor supports burst reads on each of the four external port DMA channels. Each channel has an independent burst enable control field (MAXBL1-0). As a slave, the processor supports read bursts from the EPBx buffers (with the EPBx read). For more information, see Multiprocessor (MP) Interface on page 7-87 and Host Processor Interface on page 7-42.
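The cycle numbering in the burst read sequence above can be expressed as a simple formula, which may help when estimating throughput. The C sketch below is an inference from the Figure 7-12 walk-through (address in cycle 1, two waitstates, data in cycles 3, 4, and 5). It assumes the slave never deasserts ACK during the burst, and the function name is hypothetical.

```c
/*
 * Bus cycle (1-based, address driven in cycle 1) in which transfer k of a
 * burst read completes, assuming ACK stays asserted throughout.
 *   waitstates : programmed waitstate count for the bank (1 or more)
 *   k          : transfer index within the burst, 0 for the first transfer
 * The first data transfer completes after the waitstates have elapsed, and
 * each following transfer takes one more cycle.
 */
unsigned burst_read_data_cycle(unsigned waitstates, unsigned k)
{
    return 1u + waitstates + k;
}
```

With two waitstates and a burst of three, the transfers complete in cycles 3, 4, and 5, so a following burst to the same bank can start no earlier than cycle 6.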
Because reads of the EPBx FIFO are destructive, the processor slave must deassert ACK on each transfer of the burst to guarantee that it samples the deasserted BRST input before performing the EPBx FIFO read. If your system design uses a similar destructive read data buffer, use caution when burst reads of the buffer are supported.

Synchronous Burst Writes

The processor can master burst read and write operations in the one-waitstate write access mode (EBxAM=10) if one or more DMA channels are configured appropriately.

The processor can master non-burst, zero-waitstate writes in every cycle. Burst write transfers are not supported in this access mode.

Synchronous external devices require at least one cycle of write access latency (for example, bus bridges, SDRAM controllers, and others). These devices may be able to optimize throughput for burst write operations, based on the contiguous, incrementing block transfer information conveyed by the burst protocol.

Burst accesses support only 32-bit data transfers; partial data bus width transfers are not supported.

An example of a synchronous burst write appears in Figure 7-13. Here, the bank used in the transfer has the one-waitstate mode, for the first write of the burst.

[Timing diagram for Figure 7-13; annotation: ADDRESS[1:0]=00]

Figure 7-13. External Memory Synchronous Burst Write Example

External memory synchronous burst writes occur with the following sequence of events as shown in Figure 7-13:

1. (cycle 1 in Figure 7-13) If ACK is sampled asserted at the start of cycle 1, the processor drives the write address and asserts a memory select signal (MS3-0) to indicate the selected bank. The processor also drives valid data in this cycle. The processor asserts the WR strobe to indicate a write command to the slave.
2. (cycle 2) The slave samples the write command and address. At this point, the slave does not see that a burst write is in progress; the access looks identical to a non-burst synchronous write. If the slave cannot accept the write command, it deasserts ACK in this cycle to stall the bus until it can. In this example, it has buffer capacity to accept all of the data of the burst, so ACK stays asserted.
3. (cycle 2) If ACK was sampled asserted at the start of the cycle, the processor asserts the BRST output signal and deasserts the MSx output signal.
4. (cycle 3) The processor samples ACK asserted by the slave at the start of the cycle. It increments the data bus to the second of four data transfers within the burst.
5. (cycle 3) The slave samples BRST asserted at the start of the cycle, informing it that the master is writing at least one more 32-bit transfer. The slave samples the second of four data transfers within the burst and asserts ACK.
6. (cycle 4) The processor samples ACK asserted by the slave at the start of the cycle. It increments the data bus to the third of four data transfers within the burst.
7. (cycle 4) The slave samples BRST asserted at the start of the cycle, informing it that the master is writing at least one more 32-bit transfer. The slave also samples the third of four data transfers within the burst, and asserts ACK. If the slave needs more time to service any one transfer within the burst, it can deassert ACK to stall the bus transfer.
8. (cycle 5) The processor samples ACK asserted by the slave at the start of the cycle. It increments the data bus to the last of four data transfers within the burst. The master deasserts BRST to inform the slave that this is the last transfer of the burst.
9. (cycle 5) The slave samples BRST asserted at the start of the cycle, informing it that the master is writing at least one more 32-bit transfer. The slave samples the fourth of four data transfers within the burst and asserts ACK.
10. (cycle 6) If initiating another write burst memory access to the same bank, the processor asserts the address, memory select, and strobes for the next access. This introduces at least one dead cycle in the back-to-back burst throughput, because the initial waitstate count applies to the first access of the second burst.
11. (cycle 6) With BRST sampled deasserted, the slave concludes its service of the burst request by three-stating the ACK driver.
As a master, the processor supports burst writes on each of the four external port DMA channels. Each channel has an independent burst enable control field (MAXBL1-0). As a slave, the ADSP-21161 processor does not support burst writes.

Bursting is enabled by setting MAXBL1-0 to 01 in the DMACx register. Enabling bursting can corrupt data transmitted during DMA master writes because the MAXBL bit setting is not ignored when the BRST signal is asserted. The ADSP-21161 only supports processor-to-processor single cycle writes. Therefore, no improvement in throughput performance is achieved by enabling bursting. To enable ADSP-21161 to ADSP-21161 DMA driven write transfers, set MAXBL1-0 to 00.
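The five criteria listed under Burst Length Determination can be collected into a single predicate, which is convenient when checking a planned DMA setup on paper or in a host-side test. The C sketch below is illustrative only. The function name and parameter names are hypothetical, and the caller is assumed to supply the register values; the sketch does not read any processor register.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when all five burst criteria hold.
 *   maxbl_enabled : MAXBL1-0 burst enable is set for the DMA channel
 *   em            : external modifier (must be 0 or 1)
 *   ec            : external count in 32-bit words (must be at least 2)
 *   fifo_words    : for a burst write, valid 32-bit words in the EPBx FIFO;
 *                   for a burst read, empty 32-bit elements in the FIFO
 *   ext_addr      : DMA channel external address
 */
bool burst_allowed(bool maxbl_enabled, uint32_t em, uint32_t ec,
                   unsigned fifo_words, uint32_t ext_addr)
{
    return maxbl_enabled
        && (em == 0u || em == 1u)
        && ec >= 2u
        && fifo_words >= 2u
        && (ext_addr & 0x3u) != 0x3u;   /* ADDR1-0 must not equal 11 */
}
```

A channel whose external address ends in 11, or whose remaining count is a single word, fails the check and transfers without bursting.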
CLKIN must be used as the clock source for SBSRAM. You cannot use an external crystal when interfacing with SBSRAM.
Do not use CLKOUT as the clock source for the SBSRAM device. Using an external crystal in conjunction with CLKDBL to generate a CLKOUT frequency is not supported. Negative hold times can result from the potential skew between CLKIN and CLKOUT.
The processor can support SBSRAMs on any of the four external memory banks. The processor supports SBSRAM single transfer reads and writes and SBSRAM burst read transfer operations. Burst write transfers are not supported, because the single-write feature of SBSRAMs achieves the same throughput level, with less complexity.

SBSRAM support is enabled by configuring the bank access mode (EBxAM) bits for synchronous, one-cycle writes and waitstate (EBxWS) bits for one waitstate (flow-through SBSRAMs) or two waitstates (fully pipelined SBSRAMs). For more information on programming access modes and waitstates, see the WAIT register bits in Table A-20 on page A-66.

If burst read transfer functionality is needed, one or more of the external port DMA channels must be configured appropriately. Because burst transfers are controlled at the DMA channel, the DMA sequence must make sure that the DMA burst transfer addresses a memory bank or slave that supports the read burst transfer.

Figure 7-14 and Table 7-3 show how the processor I/O should be connected to the SBSRAM I/O. Table 7-3 assumes a 512 Kbyte SBSRAM array consisting of one bank with a 3.3V, 32K x 32 device. The names of the SBSRAM signals may vary between vendors. Figure 7-14 is for illustrative purposes; actual system designs may differ and must be carefully analyzed to determine the actual system topology.

The SBSRAM devices are fully synchronous devices, except for the output enable. The processor issues commands and updates the SBSRAM address latches, as a controller, using the ADSC input of the SBSRAMs, rather than the ADSP processor input. Using the ADSC SBSRAM input enables single cycle writes and simplifies SBSRAM deselect operations.
[Block diagram: ADSP-21161 pins ADDR[23:0], MS0, BRST, RD, WR, CLKIN, and DATA[47:16] connected to a 32K x 32 SBSRAM with pins ADDR[15:0], CE1, ADSC, OE, GW, DATA[31:0], CLK, ADSP, CE, BWE, BW[4:1], LBO, ADV, CE2, and ZZ.]
Figure 7-14. SBSRAM System Interface Example

Table 7-3. ADSP-21161 to SBSRAM Signal Mapping

ADSP-21161    SBSRAM      Comment
CLKIN         CLK         Clock input of SBSRAM should be driven by CLKIN of the processor.
ADDR15-0      ADDR15-0    Address connection
MSx           CE          Chip Enable, active low
BRST          ADSC        Address Status Controller, active low
RD            OE          Asynchronous Output Enable of SBSRAM, active low
WR            GW          Global Write Enable of SBSRAM, active low
DATA47:16     DATA31-0    I/O of SBSRAM (High word of bus, odd address)
No connect    CE          Chip Enable, active high, always asserted (Vdd)
No connect    CE2         Second Chip Enable, always asserted (GND)
No connect    ADSP        Always Deasserted (Vdd)
No connect    ADV         Always Asserted (GND)
No connect    BWE         Byte Write Enable, always deasserted (Vdd)
No connect    BW4-1       Byte Write Selects, always deasserted (Vdd)
With the ADV (advance address) input of the SBSRAM held asserted, the device continuously attempts to burst. This input is ignored when ADSC is asserted. Because the BRST/ADSC signal is always low for a single access or the first access of a burst, the SBSRAM always updates its address latches correctly. For the subsequent transfers (up to three, after the initial access) of a read burst, the SBSRAM samples BRST/ADSC high. The asserted ADV correctly advances the internal address count of the SBSRAM.

The processor issues four types of bus operations to the SBSRAMs, as shown in Table 7-4.

Table 7-4. SBSRAM Partial Truth Table

SBSRAM Operation             CE1 (MSx)   ADSC (BRST)   ADV (1)   GW (WR)   OE (RD)   I/O
--------------------------   ---------   -----------   -------   -------   -------   ----
Read cycle, begin burst      L (2)       L             X         H         L         Data
Write cycle, begin burst     L           L             X         L         H         Hi-Z
Read cycle, continue burst   X           H             L         H         L         Data
Deselect cycle               H           L             X         X         X         Hi-Z

All other signal inputs are held static per Figure 7-14.
(1) ADV is statically held asserted (low).
(2) L = low, H = high, X = don't care, Hi-Z = three-stated, high-impedance output.
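The truth table lends itself to a table-driven model, for example in a board-level simulation or a test bench. The following C sketch is illustrative only: the type and function names (sbsram_row_t, sbsram_decode, and so on) are hypothetical and are not part of any Analog Devices software. It encodes the four rows of Table 7-4 and decodes a set of sampled pin levels into the matching bus operation.

```c
#include <assert.h>
#include <stdbool.h>

/* Logic levels used in Table 7-4 (X = don't care). */
typedef enum { LVL_L, LVL_H, LVL_X } level_t;

/* The four bus operations the processor issues to the SBSRAM. */
typedef enum {
    OP_READ_BEGIN_BURST,
    OP_WRITE_BEGIN_BURST,
    OP_READ_CONTINUE_BURST,
    OP_DESELECT,
    OP_COUNT               /* also used as the "no match" result */
} sbsram_op_t;

/* One row of the partial truth table (hypothetical layout). */
typedef struct {
    level_t ce1_msx;       /* CE1, driven by MSx   */
    level_t adsc_brst;     /* ADSC, driven by BRST */
    level_t adv;           /* ADV, statically low  */
    level_t gw_wr;         /* GW, driven by WR     */
    level_t oe_rd;         /* OE, driven by RD     */
    bool    drives_data;   /* true: Data, false: Hi-Z */
} sbsram_row_t;

static const sbsram_row_t sbsram_table[OP_COUNT] = {
    [OP_READ_BEGIN_BURST]    = { LVL_L, LVL_L, LVL_X, LVL_H, LVL_L, true  },
    [OP_WRITE_BEGIN_BURST]   = { LVL_L, LVL_L, LVL_X, LVL_L, LVL_H, false },
    [OP_READ_CONTINUE_BURST] = { LVL_X, LVL_H, LVL_L, LVL_H, LVL_L, true  },
    [OP_DESELECT]            = { LVL_H, LVL_L, LVL_X, LVL_X, LVL_X, false },
};

/* True when a sampled pin level satisfies a table entry. */
static bool level_matches(level_t want, bool pin_high)
{
    return want == LVL_X || (want == LVL_H) == pin_high;
}

/* Decode sampled pins (true = high). Rows are checked in table order;
 * OP_COUNT is returned when no row matches. */
static sbsram_op_t sbsram_decode(bool ce1, bool adsc, bool adv, bool gw, bool oe)
{
    for (int op = 0; op < OP_COUNT; ++op) {
        const sbsram_row_t *r = &sbsram_table[op];
        if (level_matches(r->ce1_msx, ce1) && level_matches(r->adsc_brst, adsc) &&
            level_matches(r->adv, adv) && level_matches(r->gw_wr, gw) &&
            level_matches(r->oe_rd, oe))
            return (sbsram_op_t)op;
    }
    return OP_COUNT;
}

/* True when the SBSRAM drives the data bus for the given operation. */
static bool sbsram_drives_data(sbsram_op_t op)
{
    assert(op < OP_COUNT);
    return sbsram_table[op].drives_data;
}
```

With ADV tied low as in Figure 7-14, a sample of CE1 low, ADSC high, GW high, and OE low decodes to the continue-burst read, which is the only row that depends on ADV.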
Single read or write transfers, and the first transfer of a burst read, use the read or write cycle and begin burst bus operation. Burst write transfers are not supported. The subsequent transfers (up to three) of a read burst use the read cycle and continue burst bus operation. The last cycle of any read access performs a deselect bus operation to ensure that the SBSRAM data buffers remain three-stated for accesses to other banks.

The write operations are achieved by configuring the appropriate bank of the processor to synchronous minimum one-cycle write mode. The synchronous read waitstate count should be programmed to one for flow-through SBSRAMs, or two for fully pipelined SBSRAMs.

SBSRAMs are not stalled, or suspended, by assertion of ACK in this configuration. Systems should not deassert ACK during any SBSRAM access. The processor has a weak pullup device on ACK; ACK does not need to be driven during an access to a slave which does not or cannot control ACK.

Figure 7-15 demonstrates a burst read of the flow-through SBSRAM, followed by a single write to the SBSRAM, and a single read of the SBSRAM. For burst operations, deasserting BRST is not required in the last cycle of the burst transfer. The processor's burst protocols also support ASIC/FPGA systems. The pipelined end-of-burst indicator may be useful in these systems.

The SBSRAM array size can be increased from the example by using higher density devices or implementing multiple banks of SBSRAM. Multiple banks are possible using the depth expansion feature of the SBSRAMs and the multiple memory select outputs of the processor.
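[Figure 7-15: timing diagram of a burst read (A0 through A3), a single write (B1), and a single read (C1) of a flow-through SBSRAM, including deselect and idle cycles. Signals shown: CLKIN, ADDRESS 23:0, MS0 (CE), BRST (ADSC), RD (OE), WR (GW), ACK, and DATA 47:16.]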
SBSRAM Restrictions
SBSRAM (or other synchronous peripherals such as bridge chips) is restricted to using the same external clock source that is provided to the CLKIN pin of the processor. Do not use CLKOUT as the clock source to the SBSRAM. The clock source connected to both CLKIN and the clock input of the SBSRAM must be provided by an external oscillator or other clock source. External crystals in conjunction with the internal clock generator (and CLKDBL) should not be used to generate a CLKOUT source for the SBSRAM.
[Figure 7-16. Example Processor to Host System Interface. The diagram shows the ADSP-21161 (ADDR23-0, DATA47:16, ID2-0, BR1-BR6, HBR, HBG, REDY, CS, RD, WR, ACK, MS3-0) connected to external memory and, through a system bus interface with an address comparator and buffers enabled by HBG, to the system address and data buses.]

monitor the processor and to set up DMA transfers. DMA transfers are controlled by the processor's I/O processor after they are set up by the host. In a multiprocessor system, the host can access the I/O processor registers of every ADSP-21161.

Data written to and read from the processor can be packed or unpacked into different word widths. The host bus width control bits (HBW) in the SYSCON register configure data packing and unpacking.
give up the bus to the host after it finishes the current bus operation. If the current operation is a burst transfer, the change in bus mastership interrupts the transfer on a modulo-4 boundary. The current bus master signals that it is transferring ownership of the bus by asserting HBG (low) when the current bus operation ends. The cycle in which control of the bus is transferred to the host is called a Host Transition Cycle (HTC).

Bus slaves respond to HBG assertion with or without the assertion of HBR. Therefore, erroneous assertions of HBG (glitching, etc.) can cause slave DSPs to believe that the host is the current bus master.

Figure 7-17 shows the timing for the host acquiring the bus. HBG is asserted while the bus master releases control of the bus and remains asserted until HBR is sampled deasserted by the ADSP-21161 processor. The cycle in which control of the bus is released by the bus master is called the processor's Bus Transition Cycle (BTC).

HBG freezes the multiprocessor bus arbitration during the time that the host has control of the bus. HBG may be used to enable the host's signal buffers, as shown in Figure 7-16 on page 7-43, Figure 7-24 on page 7-80, and Figure 7-25 on page 7-81. Arbitration is frozen because the current bus master continuously asserts its BRx. While HBG is asserted in a multiprocessor system, the DSPs continue to assert their BRx outputs, as in normal operation, but no BTC occurs. The current bus master keeps its BRx output asserted throughout the entire time the host controls the bus.

After HBR is asserted, and before HBG is given, HBG floats for 1 tCK (1 CLKIN cycle). To avoid erroneous grants, HBG should be pulled up with a 20 kΩ to 50 kΩ external resistor.

After the host gets control of the bus, the host can perform transfers with the processor or other system components. To initiate transfers, the host asserts (low) the CS and HBR inputs of the processor that it intends to access and performs the read or write.
The processor does not respond to CS until HBG is asserted.
The host may also communicate directly with system peripherals, such as SBSRAMs. These transfers occur using the protocol of the peripheral or using the external handshake mode of DMA channels 10 and 11 to control the memory or peripheral. With DMA handshaking, the host only needs to source or sink the data with the correct timing. Either of these solutions may require additional hardware support for the host. The host is responsible for driving the following signals during the HTC in which it gains control of the bus: ADDR23-0, RD, WR, and DMAGx (if used in the system). These signals must be driven by the host while the host is bus master. Also, the host must drive or weakly pull up or down the MS3-0, BRST, CLKIN, DMAG1, and DMAG2 signals as required. The bus master three-states these lines, letting the host use them. The processor with device ID=000 or 001 enables internal pullup devices on the MS3-0, RD, WR, DMAR1, DMAR2, DMAG1, and DMAG2 signals. The pullup provides a weak current source to hold these signals in the deasserted state when driven to that state. Excessive system noise can cause these weakly driven signals (MS3-0, RD, WR, DMAR1, DMAR2, DMAG1, and DMAG2) to be sampled asserted. The processor with device ID=000 or 001 enables its keeper latches on ADDR23-0 and DATA47-16, BRST, and CLKOUT, so these signals are weakly pulled to the last value driven on them if any of these signals remain undriven for multiple cycles. During read-modify-write operations, the host should keep HBR asserted to avoid temporary loss of bus mastership. HBR must remain asserted until after the host completes the last data transfer.
The following restrictions apply to bus acquisition by the host:

• If HBR is asserted while the processor is in reset, it does not respond with HBG until after reset and multiprocessor synchronization are completed.
• The host should keep HBR asserted until after the host completes its last data transfer and is ready to give up bus ownership.
• If SBTS is asserted after HBR, the processor enters slave mode and suspends any unfinished access to the external bus.
• In uniprocessor systems (with ID2-0=000), the host must assert CS in the same cycle as HBR to initiate an asynchronous access.

After the host finishes its task, it can relinquish control of the bus by deasserting HBR. The bus master responds by deasserting HBG in the cycle after sampling HBR deasserted. In the cycle following deassertion of HBG, the bus master assumes control of the bus and normal multiprocessor arbitration resumes.
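The acquire and release handshake between the host (HBR) and the bus master (HBG) can be pictured as a small state machine. The following C sketch is a simplified, cycle-approximate model written for illustration; the names (bus_model_t, bus_step) are hypothetical, the signals are modeled as "asserted" flags rather than as active-low pin levels, and the exact cycle alignment should be taken from the Data Sheet timing rather than from this model.

```c
#include <stdbool.h>

/* Who currently owns the external bus in this simplified model. */
typedef enum {
    BUS_MASTER_OWNS,   /* a processor is bus master            */
    BUS_HOST_OWNS,     /* host has been granted the bus (HBG) */
    BUS_RELEASING      /* HBG deasserted, master resumes next */
} bus_state_t;

typedef struct {
    bus_state_t state;
    bool hbg_asserted;        /* true = HBG asserted (low on the pin)     */
    int  cycles_left_in_op;   /* cycles until the current bus op finishes */
} bus_model_t;

/* Advance the model by one CLKIN cycle, sampling HBR (true = asserted). */
static void bus_step(bus_model_t *b, bool hbr_asserted)
{
    switch (b->state) {
    case BUS_MASTER_OWNS:
        if (b->cycles_left_in_op > 0) {
            /* The master first finishes its current bus operation. */
            b->cycles_left_in_op--;
        } else if (hbr_asserted) {
            /* Host Transition Cycle: the master grants the bus. */
            b->hbg_asserted = true;
            b->state = BUS_HOST_OWNS;
        }
        break;
    case BUS_HOST_OWNS:
        /* HBG stays asserted until HBR is sampled deasserted. */
        if (!hbr_asserted) {
            b->hbg_asserted = false;
            b->state = BUS_RELEASING;
        }
        break;
    case BUS_RELEASING:
        /* The master takes the bus back; arbitration resumes. */
        b->state = BUS_MASTER_OWNS;
        break;
    }
}
```

The model makes one point from the restrictions above visible: if the host drops HBR before its last transfer completes, the next step moves the bus away from the host, so HBR should be held for the whole transaction.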
Asynchronous Transfers
To initiate asynchronous transfers after acquiring control of the processor's external bus, the host must assert the CS input of the processor to be accessed. The host then drives the address of the I/O processor register to access. To simplify the hardware requirements for external interface logic, only the address bits shown in Table 7-6 need to be driven.
Setup and hold times for these address lines are specified in the Data Sheet. For a complete description of these address fields, see Multiprocessor Memory on page 5-19.
Table 7-6 applies to all asynchronous host access cases, including multiprocessor systems. Fewer address bits may need to be driven depending on the system. For example, in a uniprocessor system, the host does not need to drive the ADDR20 address pins.

Use the following guidelines when designing a system that uses asynchronous host accesses.

• A host can only access IOP register space on the ADSP-21161 processor. The ADSP-21161 processor now uses 9 address lines to access the IOP registers.
• The ADSP-21161 processor does not support the Instruction Word Transfer (IWT) function from previous SHARC DSPs. 48-bit instructions can be transferred by configuring the host packing mode to one of the 48-bit internal transfer modes.
Host accesses to non-existent IOP register addresses are not supported. These accesses result in a host bus grant (HBG) hang. Therefore, ensure that host accesses generate valid IOP register addresses.

When using asynchronous transfers and direct access to IOP register space, only the lower 9 bits, ADDR8-0, need be supplied by the host. The upper address bits can be configured as shown in Table 7-6.

Asynchronous write operations are latched at the I/O pads in a four-deep FIFO buffer; this buffer is called the slave write FIFO and appears in Figure 6-5 on page 6-23. This buffering allows previously written words to be re-synchronized while a new word is being written. For maximum host throughput and low and high pulse widths for WR and RD, refer to the ADSP-21161N DSP Microcomputer Data Sheet.

A host may write to several ADSP-21161s simultaneously (a broadcast write) by asserting each of their CS pins. Each processor accepts the write as if it were the only device being addressed. Because the REDY output is wire-ORed (if configured as an open-drain output), REDY only appears asserted when all selected DSPs are ready, unless REDY is actively pulled up. ACK is not active when CS is asserted.

To eliminate the need for a host to drive the multiprocessor address lines (ADDR20-17) in systems with only one processor (ID2-0=000), the processor with ID2-0=000 does not recognize synchronous accesses to these addresses. If the processor's ID2-0 is anything other than 000, the host must drive these address lines with 0000, or one of the ADDR23-21 address pins must be driven high to select an address in external memory.

To account for buffer delays when sampling the REDY signal, systems must make sure that REDY is properly re-synchronized by the host.
Figure 7-18 shows the timing of a host write cycle, including details of data packing and unpacking. This timing applies to the example host interface hardware shown in Figure 7-25 on page 7-81 and has the following sequence:

1. The host asserts the address. HBR and CS are decoded from the host bus interface address comparator and do not need to be supplied directly by the host. The selected processor deasserts REDY immediately.
2. The host asserts WR and drives data according to the timing requirements specified in the ADSP-21161N DSP Microcomputer Data Sheet.
3. The selected processor asserts REDY when it is ready to accept the data. This transition occurs after the current bus master has completed its current transfer and has asserted HBG. HBG enables the host interface buffers to drive onto the processor bus.
[Figure 7-18. Example Timing for Host Read and Write Cycles. Labels in the diagram: HBR, CS, HOST ADDRESS, ADDRESS VALID, HOST WRITE, ADDRESS SETUP, ACK, DRIVEN BY HOST. Annotations: the host three-states data before asserting RD; data is latched in the host on the RD rising edge; data from the host is latched into the DSP on the WR rising edge.]
4. The host deasserts WR when REDY is high and stops driving data.
5. The selected processor latches data on the rising edge of WR.

After the first word, the write sequence is:

1. The host asserts WR and drives data according to the timing requirements specified in the ADSP-21161N DSP Microcomputer Data Sheet.
2. The processor deasserts REDY if it is not ready to accept data.
3. The host deasserts WR when REDY is high and stops driving data.
4. The selected processor latches data on the rising edge of WR.

More than one processor may have its CS pin asserted at any one time during a write, but not during a read because of bus conflicts.

Figure 7-18 also shows the timing of a host read cycle. This timing applies to the bus interface hardware in Figure 7-25 on page 7-81 and has the following sequence:

1. The host asserts the address. HBR and the appropriate CS line are decoded by the host bus interface address comparator. The selected processor deasserts REDY immediately and asserts HBG.
2. The host asserts RD.
3. The selected processor drives data onto the bus and asserts REDY when the data is available.
4. The host latches the data and deasserts RD.
After the first word, the read sequence is:

1. The host asserts RD.
2. The selected processor deasserts REDY then asserts REDY, driving data when it becomes available.
3. The host deasserts RD when REDY is high and latches the data.
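From the host's point of view, each write after the first reduces to "assert WR, wait for REDY, deassert WR". The following C sketch models that loop against a stand-in slave so that the wait behavior can be exercised without hardware. It is an illustration under stated assumptions: slave_model_t, slave_redy, and host_write_word are hypothetical names, REDY is polled once per loop iteration, and the pin-level timing (setup, hold, pulse widths) from the Data Sheet is not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the selected processor (hypothetical model). */
typedef struct {
    uint32_t latched;      /* last data latched on a WR rising edge      */
    int      busy_cycles;  /* polls for which REDY remains deasserted    */
} slave_model_t;

/* REDY as sampled by the host; each poll consumes one busy cycle. */
static bool slave_redy(slave_model_t *s)
{
    if (s->busy_cycles > 0) {
        s->busy_cycles--;
        return false;
    }
    return true;
}

/* One host write:
 *   1. assert WR and drive data (implicit here),
 *   2. wait while the slave holds REDY deasserted,
 *   3. deassert WR; the slave latches data on the rising edge of WR.
 * Returns the number of polls that saw REDY low, or -1 on timeout
 * (in which case nothing is latched). */
static int host_write_word(slave_model_t *s, uint32_t data, int max_polls)
{
    int waited = 0;
    while (!slave_redy(s)) {
        if (++waited >= max_polls)
            return -1;     /* give up; a real host might flag an error */
    }
    s->latched = data;     /* rising edge of WR */
    return waited;
}
```

The timeout is a design choice of the sketch rather than a feature of the protocol; it guards against the hang described for accesses to non-existent IOP register addresses.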
Instruction Transfers

For 8-, 16- or 32-bit host interfaces, the ADSP-21161 can pack and unpack 48-bit instructions or 40-bit extended precision normal word data based on the host packing mode selected with the HBW bits in the SYSCON register.

Slave Write Latency

The processor handles asynchronous (from a host) and synchronous (from another processor) writes differently. This difference influences the latency for the writes.

When a bus slave receives data from an asynchronous write, the processor latches the data and address in a four-level FIFO buffer. For synchronous writes, this buffer is two levels deep. This buffer is called the slave write FIFO and appears in Figure 6-5 on page 6-23. In the following cycle, the slave write FIFO attempts to complete the write internally. This buffering lets the host (or bus master) perform writes at the full clock rate.

The slave's core cannot explicitly read the slave write FIFO. Also, the processor cannot determine the slave write FIFO's status.

Writes to the I/O processor registers from the slave write FIFO usually occur in the following one or two cycles or when any current DMA transfer is completed. The write takes more than two cycles only if a direct write in the previous cycle was held off by a full buffer.

If the slave write FIFO is full when a write is attempted, the processor deasserts ACK (or REDY) until the FIFO is not full. Unless higher priority on-chip DMA transfers are occurring, the slave write FIFO usually empties out within one cycle, creating a one-cycle write latency.

Slave reads are held off when there is data in the slave write FIFO; this prevents false data reads and out-of-sequence operations.
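The buffering rules above (four levels for asynchronous writes, two for synchronous writes, REDY or ACK deasserted while full, reads held off while any entry is pending) can be captured in a small software model. The following C sketch is illustrative only; the swf_ names are hypothetical, and the model ignores DMA priority and cycle timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SWF_ASYNC_DEPTH 4   /* asynchronous (host) writes        */
#define SWF_SYNC_DEPTH  2   /* synchronous (processor) writes    */

/* Simplified slave write FIFO holding address/data pairs. */
typedef struct {
    uint32_t addr[SWF_ASYNC_DEPTH];
    uint32_t data[SWF_ASYNC_DEPTH];
    int depth;   /* 4 or 2, depending on the access type */
    int count;   /* entries not yet written internally   */
    int head;    /* index of the oldest entry            */
} slave_write_fifo_t;

static void swf_init(slave_write_fifo_t *f, int depth)
{
    assert(depth == SWF_ASYNC_DEPTH || depth == SWF_SYNC_DEPTH);
    f->depth = depth;
    f->count = 0;
    f->head = 0;
}

/* REDY (or ACK) stays asserted while the FIFO has room. */
static bool swf_ready(const slave_write_fifo_t *f)
{
    return f->count < f->depth;
}

/* Slave reads are held off while any write is still buffered. */
static bool swf_read_allowed(const slave_write_fifo_t *f)
{
    return f->count == 0;
}

/* Latch one write; returns false when the FIFO is full. */
static bool swf_push(slave_write_fifo_t *f, uint32_t addr, uint32_t data)
{
    if (!swf_ready(f))
        return false;
    int tail = (f->head + f->count) % f->depth;
    f->addr[tail] = addr;
    f->data[tail] = data;
    f->count++;
    return true;
}

/* Complete the oldest write internally; returns false when empty. */
static bool swf_drain_one(slave_write_fifo_t *f, uint32_t *addr, uint32_t *data)
{
    if (f->count == 0)
        return false;
    *addr = f->addr[f->head];
    *data = f->data[f->head];
    f->head = (f->head + 1) % f->depth;
    f->count--;
    return true;
}
```

In this model a fifth back-to-back host write is refused until one entry drains, which corresponds to the processor holding REDY deasserted until the FIFO is no longer full.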
Slave Reads

When a read of an ADSP-21161 occurs, the address is latched on-chip by the I/O processor and REDY is deasserted asynchronously. When the data is available, the I/O processor drives the data and asserts REDY. I/O processor register reads have a maximum throughput of one access per every three CLKIN cycles.

As a slave, the processor supports burst read accesses, which improve throughput for I/O processor register reads of EPBx FIFOs only. Maximum throughput for synchronous burst direct read accesses is summarized in Table 7-7. For hosts, the processor does not support the synchronous burst protocol.

Table 7-7. Direct Read Latencies for a 1:2 Clock Ratio

Access Type                                         Latency (CLKIN cycles)
-------------------------------------------------   ----------------------
Single Read of I/O processor register               3
Burst Read of I/O processor registers (EPBx only)   3-2-2-2
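The latencies in Table 7-7 can be turned into a quick throughput estimate. The following C sketch assumes that bursts are at most four words long (3-2-2-2), that bursts and single reads run back to back with no idle cycles in between, and that a partial burst of k words costs 3 + 2(k - 1) cycles; these are simplifying assumptions for illustration, and the function names are hypothetical.

```c
/* CLKIN cycles for n single reads of an I/O processor register
 * (3 cycles per access, per Table 7-7). */
static unsigned single_read_cycles(unsigned n)
{
    return 3u * n;
}

/* CLKIN cycles for n EPBx reads using bursts of up to four words.
 * Each burst costs 3 cycles for the first word and 2 for each
 * following word (3-2-2-2). */
static unsigned burst_read_cycles(unsigned n)
{
    unsigned cycles = 0;
    while (n > 0) {
        unsigned len = n < 4u ? n : 4u;
        cycles += 3u + 2u * (len - 1u);
        n -= len;
    }
    return cycles;
}
```

Under these assumptions, four words cost 12 cycles as single reads and 9 cycles as one burst.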
Broadcast Writes
Broadcast writes allow simultaneous transmission of data to all of the ADSP-21161 processors in a multiprocessing system. The host processor can perform broadcast writes to the same I/O processor register on all of the slaves. Broadcast writes can be used to implement reflective semaphores in a multiprocessing system. For a broadcast write, the host processor must assert the CS input of all processors in the system and drive the address of the appropriate memory-mapped I/O processor register. Unlike previous SHARCs, the ADSP-21161 processor does not include a broadcast write memory space in its address space, and therefore processor-to-processor broadcast writes are not supported.
DMA Transfers
The host processor can also set up DMA transfers to and from the ADSP-21161. After the host gets control of the processor, the host can access the on-chip DMA control and parameter registers to set up an external port DMA operation. DMA is the most efficient way to transfer blocks of data. For DMA programming examples, see External Port DMA Example on page 6-77 and External Port Chained DMA Example on page 6-79.

• DMA Transfers to Internal Memory. The host can set up external port DMA channels to transfer data to and from internal memory.
• DMA Transfers to External Memory. The host can set up an external port DMA channel to transfer data directly to external memory using the DMA request and grant lines (DMARx, DMAGx). For more information, see Setting Up External Port DMA on page 6-68.

The host may also use the DMARx/DMAGx handshake signals for a DMA transfer as a bus slave, but may not use DMA as a bus master while HBR retains control of the bus.
The ADSP-21161 provides a glueless interface to 8-, 16-, and 32-bit hosts. Three differences between the ADSP-21161 and the ADSP-21160 are:

• Connection of 8-bit hosts (in addition to 16- or 32-bit hosts) is supported.
• There is limited direct access to IOP register space. A host processor and other ADSP-21161s in a multiprocessing configuration can only directly access the memory-mapped IOP registers of an ADSP-21161.
• A host can only use asynchronous access to ADSP-21161 registers (by using CS of the processor).

The lower nine bits of the 24-bit address bus are decoded to select an IOP register for any access into the ADSP-21161's internal memory.
Synchronous broadcast write is not supported by the ADSP-21161 because there is no broadcast memory space. However, the host can write to the same address on all the processors asynchronously by asserting CS for each processor simultaneously during a write, without any multiprocessor memory offset.

The host data bus is connected to the ADSP-21161 data bus LSB-aligned to the default 32-bit active data bus, DATA47-16. For example, data pin 0 (D0) of the host data bus connects to DATA16 of the ADSP-21161 data bus, data pin 1 (D1) of the host data bus connects to DATA17 of the ADSP-21161 data bus, and so on. Depending on the register access, the processor packs/unpacks data as 32 bits, 48 bits, or up to 64 bits.

A host can indirectly transfer data (via DMA) to and from internal memory by writing or reading to/from EPBx. To support this, several packing options are available. The newly defined Host Bus Width (HBW) bits 5 and 4 in the SYSCON register control the host data packing. They are described in Table 7-9 on page 7-65. Host Packing Status (HPS) bits 24-22 have also been redefined in SYSTAT. They are described in Table A-21 on page A-69.
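The LSB alignment to DATA16 is easy to get wrong when wiring or simulating a host port, so a short sketch may help. The following C functions are illustrative and their names are hypothetical. The first maps a host data pin number to the processor data pin it connects to; the second returns a mask of the processor data lines in use for an 8-, 16-, or 32-bit host, following the lanes named in this chapter (DATA23-16, DATA31-16, and DATA47-16), with bit n of the mask standing for DATAn.

```c
#include <assert.h>
#include <stdint.h>

/* Host data pin Dn connects to processor pin DATA(n + 16). */
static unsigned host_pin_to_data_pin(unsigned host_pin)
{
    assert(host_pin < 32u);
    return host_pin + 16u;
}

/* Mask of the processor data lines used by a host of the given
 * width; bit n of the result represents DATAn. */
static uint64_t host_lane_mask(unsigned host_bus_bits)
{
    assert(host_bus_bits == 8u || host_bus_bits == 16u || host_bus_bits == 32u);
    return ((((uint64_t)1) << host_bus_bits) - 1u) << 16;
}
```

For an 8-bit host the mask covers DATA23-16 only, which matches the statement that the processor drives the other lines to zeroes on reads.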
ter accessed. In most cases, when a host accesses IOP control/status registers, the processor defaults to internal data packing and unpacking to a 32-bit access (independent of the setting of the PMODE bits in the DMACx register). LBUFx buffer access is limited to 48-bit internal packing, ignoring the PMODE bits in DMACx. EPBx buffer access always depends on the PMODE, DTYPE, and INT32 bits in DMACx. The three host access cases are described in the following sections:

• IOP Register Host Accesses on page 7-62
• LINK Port Buffer Access on page 7-63
• EPBx Buffer Accesses on page 7-64

IOP Register Host Accesses

For accesses to all IOP registers except EPBx and LBUFx, the host data is always packed or unpacked to or from a 32-bit internal data word. In most cases, when accessing an IOP control or status register, or serial port and SPI data buffers (TXn/RXn, SPIRX/SPITX), the PMODE bits in the DMACx register are ignored. A fixed packing mode of 8-, 16- or 32-bit external to 32-bit internal is selected. This is because all IOP registers except LBUFx and EPBx are 32 bits wide.

Ensure that host accesses generate valid IOP register addresses. Host accesses to non-existent IOP register addresses are not supported, and can result in host bus grant (HBG) hang.

Host access of IOP control/status registers and SPORT/SPI data buffers (except EPBx and LBUFx) will pack or unpack to 32 bits internally, ignoring the value of PMODE in DMACx. The HBW bits in the SYSCON register are used as a reference to set the external packing mode.
For example, when interfacing the ADSP-21161 to an 8-bit microcontroller, the HBW bits are set in the SYSCON register to specify a host bus width of 8 bits. This results in an 8-bit external to 32-bit internal fixed data packing mode to an IOP register. Table 7-8 on page 7-60 shows the packing mode combinations.

LINK Port Buffer Access

The link buffers LBUF0 and LBUF1 can also be accessed by an external host processor, using direct reads and writes to IOP register space. However, there are differences in how data is accessed with the link buffers compared to other IOP control/status registers. When the host processor reads or writes to these buffers, the external packing data access width is also determined by the host bus width bits in the SYSCON register, while the internal packing mode is restricted to 48 bits.

Host accesses to the link port buffers pack or unpack to 48 bits internally, ignoring the value of PMODE in DMACx. The HBW bits in the SYSCON register are used to set the external packing mode.

In the case where a host processor reads or writes to the LBUF0 and LBUF1 link buffers, the PMODE bits in the DMACx external port DMA control register are ignored and are fixed to a special 48-bit internal packing mode. This fixed 48-bit internal packing mode is required because the ADSP-21161 link port buffers transmit and receive 48-bit words. Depending on the HBW bits in SYSCON, the appropriate external to 48-bit internal memory packing mode is selected. The available bit settings are shown in Table 7-8 on page 7-60.

It may be desirable in some applications for a host processor to transfer instruction opcodes to another SHARC indirectly via the directly connected SHARC's link port by reading or writing the opcode data to or from the LBUF0 and LBUF1 link buffers through the external port.
For example, with a 16-bit host, the packing mode internally defaults to 48-bit packed transfers such that the packing mode is 16-bit external to 48-bit internal packed data transfers.
EPBx Buffer Accesses

The external port buffers, EPB0, EPB1, EPB2, and EPB3, can also be accessed by an external host processor, using direct reads and writes to IOP register space. There are differences in how data is accessed with the EPBx buffers as compared with other IOP control/status registers. When the host processor reads or writes to external port buffers, the packing mode indicated by the PMODE bits in the corresponding DMACx register is selected.

Host accesses to the external port buffers pack or unpack according to the packing mode specified with the PMODE bits in DMACx. Depending on the HBW bits in SYSCON and PMODE in DMACx, the appropriate packing mode is selected as shown in Table 7-8 on page 7-60.

There is no direct write pending bit in SYSTAT (as in the ADSP-21160) since the ADSP-21161 does not have a direct write FIFO. However, the ADSP-21161 processor has two newly defined bits in SYSTAT for checking the status of the slave write FIFO. The following bits in the SYSTAT register affect host access:

• Synchronous Slave Write FIFO Data Pending. SYSTAT Bit 20 (SSWPD). Since a host cannot be synchronous, this bit is set for synchronous access by another ADSP-21161. The bit is set (=1) when a synchronous slave IOP register write is pending. The bit is cleared (=0) when the direct write is complete.
• Slave Write FIFO Data Pending. SYSTAT Bit 21 (SWPD). This status bit is set for any host or SHARC write access to an IOP register. If a host processor attempts to write data through the asynchronous protocol, this status bit is set. The bit is set (=1) when a slave (asynchronous or synchronous) write to an IOP register is pending. The bit is cleared (=0) when there is no slave write pending. The processor clears SWPD when the direct write is complete.
• Host Packing Order. SYSCON Bit 7 (HMSWF). This bit determines whether the I/O processor packs the most significant or least significant word first for 8-bit and 16-bit hosts. For 32- to 32/64 and 32- to 48-bit packing, the processor ignores the HMSWF bit in the SYSCON register and the MSWF bit in the DMACx register.

Host packing examples are shown below for host direct read/write access to IOP control/status registers, TXn/RXn, SPIRX/SPITX and LBUFx data buffers. The default internal packing is 32-bit for host accesses to IOP control/status registers and 48-bit for host accesses to LBUFx, ignoring PMODE bits in DMACx. If the HMSWF bit is set (=1), the packing and unpacking is most significant word first. If the HMSWF bit is cleared (=0), the packing and unpacking is least significant word first.

Table 7-9. Packing Sequence for 32-bit IOP Register Data

Transfer   Data Bus Pins 23-16          Data Bus Pins 31-16
           (8-bit bus, LSW first)       (16-bit bus, MSW first)
--------   --------------------------   -----------------------
First      Word 1; bits 7-0             Word 1; bits 31-16
Second     Word 1; bits 15-8            Word 1; bits 15-0
Third      Word 1; bits 23-16           -
Fourth     Word 1; bits 31-24           -
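The bit positions named above (SSWPD at SYSTAT bit 20, SWPD at SYSTAT bit 21, HPS at SYSTAT bits 24-22, HBW at SYSCON bits 5-4, and HMSWF at SYSCON bit 7) translate directly into masks. The following C sketch shows one way to decode register values that have already been read; the macro and function names are hypothetical, and the encoding of the HBW and HPS field values is not assumed here beyond extracting the raw field.

```c
#include <stdbool.h>
#include <stdint.h>

/* SYSCON fields (bit positions as given in the text). */
#define SYSCON_HBW_SHIFT  4u
#define SYSCON_HBW_MASK   (0x3u << SYSCON_HBW_SHIFT)   /* bits 5-4   */
#define SYSCON_HMSWF      (1u << 7)                    /* bit 7      */

/* SYSTAT fields. */
#define SYSTAT_SSWPD      (1u << 20)                   /* bit 20     */
#define SYSTAT_SWPD       (1u << 21)                   /* bit 21     */
#define SYSTAT_HPS_SHIFT  22u
#define SYSTAT_HPS_MASK   (0x7u << SYSTAT_HPS_SHIFT)   /* bits 24-22 */

/* Any slave write (asynchronous or synchronous) still pending? */
static bool slave_write_pending(uint32_t systat)
{
    return (systat & SYSTAT_SWPD) != 0u;
}

/* Synchronous slave write from another processor still pending? */
static bool sync_slave_write_pending(uint32_t systat)
{
    return (systat & SYSTAT_SSWPD) != 0u;
}

/* Raw host packing status field; non-zero while a packed word is
 * only partly transferred. */
static unsigned host_packing_status(uint32_t systat)
{
    return (systat & SYSTAT_HPS_MASK) >> SYSTAT_HPS_SHIFT;
}

/* True when host packing is most significant word first. */
static bool host_msw_first(uint32_t syscon)
{
    return (syscon & SYSCON_HMSWF) != 0u;
}

/* Raw host bus width field (see the HBW table for its encoding). */
static unsigned host_bus_width_field(uint32_t syscon)
{
    return (syscon & SYSCON_HBW_MASK) >> SYSCON_HBW_SHIFT;
}
```

A host driver might, for example, poll slave_write_pending() on a SYSTAT value it has read back before issuing a read that depends on the preceding write.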
Table 7-10. Packing Sequence for Accessing 48-bit LBUFx Data (Cont'd)

Transfer   Data Bus Pins 31-16        Data Bus Pins 23-16
           (16-bit bus, MSW first)    (8-bit bus, MSW first)
--------   ------------------------   ----------------------
Fourth     -                          LBUFx; bits 23-16
Fifth      -                          LBUFx; bits 15-8
Sixth      -                          LBUFx; bits 7-0
Table 7-11. Packing Sequence for Accessing 48-bit LBUFx Data From a 32-bit Bus (MSW First)

Transfer   Data Bus Pins 47-32     Data Bus Pins 31-16
--------   ---------------------   ---------------------
First      LBUFx 1; bits 47-32     LBUFx 1; bits 31-16
Second     LBUFx 2; bits 15-0      LBUFx 1; bits 15-0
Third      LBUFx 2; bits 47-32     LBUFx 2; bits 31-16
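Table 7-11 implies that two 48-bit words travel as three 32-bit transfers, with the second transfer carrying the low 16 bits of both words. The following C sketch spells out that layout. It is illustrative only: the function names are hypothetical, each 32-bit transfer is represented with data bus pins 47-32 in its upper half and pins 31-16 in its lower half, and the 48-bit words are held in the low 48 bits of a uint64_t.

```c
#include <stdint.h>

/* Split two 48-bit LBUFx words into the three 32-bit transfers of
 * Table 7-11 (upper 16 bits = pins 47-32, lower 16 = pins 31-16). */
static void lbuf_unpack_32(uint64_t w1, uint64_t w2, uint32_t t[3])
{
    t[0] = (uint32_t)(w1 >> 16);                          /* W1 bits 47-16        */
    t[1] = (uint32_t)(((w2 & 0xFFFFu) << 16) |            /* W2 bits 15-0 (upper) */
                      (w1 & 0xFFFFu));                    /* W1 bits 15-0 (lower) */
    t[2] = (uint32_t)(w2 >> 16);                          /* W2 bits 47-16        */
}

/* Rebuild the two 48-bit words from the three transfers. */
static void lbuf_pack_32(const uint32_t t[3], uint64_t *w1, uint64_t *w2)
{
    *w1 = ((uint64_t)t[0] << 16) | (t[1] & 0xFFFFu);
    *w2 = ((uint64_t)t[2] << 16) | (t[1] >> 16);
}
```

Because the second transfer is shared between the two words, a lone 48-bit word leaves the packing buffer half filled, which is why a dummy access is needed when an odd number of words is written.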
To write a single 48-bit word or an odd number of 48-bit words to LBUFx, write a dummy access to completely fill the packing buffer.

8- to 32-Bit Data Packing

The processor latches incoming data on pins DATA23-16 for 8- to 32-bit packing on an 8-bit host bus. Similarly, the processor drives outgoing data on DATA23-16 with the other lines equal to zeroes. The sequence of events for 32-bit packing and unpacking for writes and reads is shown in Figure 7-19 on page 7-71.

When a host reads a 32-bit word with 8-bit unpacking using the typical bus interface hardware shown in Figure 7-25 on page 7-81, the following sequence of events occurs:

• The host initiates a read cycle by driving an address, asserting CS, and asserting RD (low).
• The selected processor deasserts REDY, latches the address, and performs an internal read to get the data.
• When the processor has the data, it asserts REDY and drives the first 8-bit word.
• The host latches the data and deasserts RD (high).
• The host initiates another read access, driving the address of the data to be accessed then asserting RD.
• The processor transmits the second 8-bit word.
• The host initiates another read access, driving the address of the data to be accessed then asserting RD.
• The processor transmits the third 8-bit word.
• The host initiates another read access, driving the address of the data to be accessed then asserting RD.
• The processor transmits the final 8-bit word. 8- to 32-bit packing is complete.

When a host writes a 32-bit word with 8-bit packing using the typical bus interface hardware shown in Figure 7-25 on page 7-81, the following sequence of events occurs:

• The host initiates a write cycle by driving the write address, asserting CS, and asserting WR (low).
• The processor asserts REDY when it is ready to accept data.
• The host drives the address and the first 8-bit word and deasserts WR (high).
• The processor latches the first 8-bit word.
• The host drives the address and initiates another write cycle for the second 8-bit word by asserting WR.
• The processor latches the second 8-bit word.
• The host drives the address and initiates another write cycle for the third 8-bit word by asserting WR.
• The processor latches the third 8-bit word.
• The host drives the address and initiates another write cycle for the fourth 8-bit word by asserting WR.
• When the processor has accepted the fourth word, it performs an internal write to its memory-mapped I/O processor register.

If the processor's internal write has not completed by the time another host access occurs, the processor holds off that access with REDY.

The packing sequence for downloading 32-bit data from an 8-bit host bus takes four cycles for every word, as shown in Table 7-12. The endian format of the transfers is controlled by the HMSWF bit in the SYSCON register. If HMSWF=0, the least significant 8-bit word is packed first. If HMSWF=1, the most significant 8-bit word is packed first.

Table 7-12. 8- to 32-Bit Word Packing, HMSWF=1 (Host Bus <-> ADSP-21161)

Transfer          Data Bus Pins 23-16
---------------   -------------------
First transfer    Word1, bits 31-24
Second transfer   Word1, bits 23-16
Third transfer    Word1, bits 15-8
Fourth transfer   Word1, bits 7-0
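The effect of HMSWF on the write direction can be checked with a few lines of code. The following C sketch assembles a 32-bit word from four successive 8-bit host transfers, in the order given by Table 7-12 when HMSWF=1 and in the reverse byte significance when HMSWF=0. The function name is hypothetical, and the sketch models only the byte ordering, not the bus handshake.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack four 8-bit transfers (xfer[0] is the first transfer) into one
 * 32-bit word. hmswf = true packs the most significant byte first;
 * hmswf = false packs the least significant byte first. */
static uint32_t pack8_to_32(const uint8_t xfer[4], bool hmswf)
{
    uint32_t word = 0;
    for (int i = 0; i < 4; ++i) {
        int shift = hmswf ? 8 * (3 - i) : 8 * i;
        word |= (uint32_t)xfer[i] << shift;
    }
    return word;
}
```

Sending the same four bytes with the opposite HMSWF setting yields the byte-reversed word, so host software and the SYSCON setting must agree on the order.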
16- to 32-Bit Packing

For a 16-bit host bus, the processor latches incoming data on pins DATA31-16. Similarly, the processor drives outgoing data on DATA31-16 with the other lines equal to zeroes. The sequence of events for 32-bit packing and unpacking is different for writes and reads.

When a host reads a 32-bit word with 16-bit unpacking using the bus interface hardware shown in Figure 7-25 on page 7-81, the following sequence of events occurs as illustrated in Figure 7-23 on page 7-73:

• The host initiates a read cycle by driving an address, asserting CS, and asserting RD (low).
• The selected processor deasserts REDY, latches the address, and performs an internal read to get the data.
• When the processor has the data, it asserts REDY and drives the first 16-bit word.
• The host latches the data and deasserts RD (high).
• The host initiates another read access, driving the address of the data to be accessed then asserting RD.
• The processor transmits the second 16-bit word (16- to 32-bit packing is complete).

When a host writes a 32-bit word with 16-bit packing using the typical bus interface hardware shown in Figure 7-25 on page 7-81, the following sequence of events occurs as illustrated in Figure 7-23 on page 7-73:

• The host initiates a write cycle by driving the write address, asserting CS, and asserting WR (low).
• The processor asserts REDY when it is ready to accept data.
• The host drives the address and the first 16-bit word and deasserts WR (high).
• The processor latches the first 16-bit word.
• The host drives the address and initiates another write cycle for the second 16-bit word by asserting WR.
• When the processor has accepted the second word, it performs an internal write to its memory-mapped I/O processor register.

If the processor's internal write has not completed by the time another host access occurs and the four-deep asynchronous slave FIFO is full, the processor holds off that access with REDY.

The packing sequence for downloading or uploading instructions over a 16-bit host bus takes two cycles for every 32-bit word, as shown in Table 7-13. The endian format of the transfers is controlled by the HMSWF bit in the SYSCON register. If HMSWF=0, the least significant 16-bit word is packed first. If HMSWF=1, the most significant 16-bit word is packed first.

Table 7-13. 16- to 32-Bit Word Packing, HMSWF=1 (Host Bus <-> ADSP-21161)

Transfer          Data Bus Pins 31-16
---------------   -------------------
First transfer    Word1, bits 31-16
Second transfer   Word1, bits 15-0
[Timing diagrams, pages 7-71 through 7-73 (including Figure 7-22 on page 7-72 and Figure 7-23 on page 7-73): host write and read packing cycles showing WR, RD, REDY, and the data lines DATA[23:16] (8-bit host bus), DATA[31:16] (16-bit host bus), and DATA47-16 (32-bit host bus).]
If the processor is waiting for another 8- or 16-bit word from the host to complete the packed word, the HPS field in the SYSTAT register is non-zero. For more information, see Host Interface Status on page 7-76. Because there is only one packing buffer for the host interface, the host must complete each packed transfer before another is begun. For more information, see External Port Status on page 6-127.

48-Bit Instruction Packing

The host can also download and upload 48-bit instructions over its 8-, 16-, or 32-bit bus.

32- to 48-Bit Packing

The packing sequence for downloading instructions from a 32-bit host bus (HBW=00) takes three cycles for every two words, as illustrated in Table 7-14. Data (32-bit) is transferred on data bus lines 47-16 (DATA47-16). If an odd number of instruction words is transferred, the packing buffer must be flushed by a dummy access to remove the unused word. 40-bit extended precision data may be transferred using the 48-bit packing mode. For more information on memory allocation for different word widths, see Memory Organization and Word Size on page 5-25.

Table 7-14. 32- to 48-Bit Word Packing (Host Bus <-> ADSP-21161)

Transfer           Data Bus Lines 47-32    Data Bus Lines 31-16
First transfer     Word1, bits 47-32       Word1, bits 31-16
Second transfer    Word2, bits 15-0        Word1, bits 15-0
Third transfer     Word2, bits 47-32       Word2, bits 31-16
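To make the lane assignment in Table 7-14 concrete, the following Python sketch splits two 48-bit instruction words into the three 32-bit transfers and joins them again. The function names are hypothetical, and each transfer is represented as a plain 32-bit integer whose upper half stands for data bus lines 47-32 and whose lower half stands for lines 31-16.

```python
MASK16 = 0xFFFF


def split_two_instructions(word1, word2):
    """Return the three 32-bit transfers that carry two 48-bit words."""
    for w in (word1, word2):
        if not 0 <= w < (1 << 48):
            raise ValueError("instruction words must fit in 48 bits")
    w1_hi, w1_mid, w1_lo = (word1 >> 32) & MASK16, (word1 >> 16) & MASK16, word1 & MASK16
    w2_hi, w2_mid, w2_lo = (word2 >> 32) & MASK16, (word2 >> 16) & MASK16, word2 & MASK16
    # Each tuple is (lines 47-32, lines 31-16), one tuple per row of Table 7-14.
    lanes = [(w1_hi, w1_mid), (w2_lo, w1_lo), (w2_hi, w2_mid)]
    return [(hi << 16) | lo for hi, lo in lanes]


def join_two_instructions(transfers):
    """Rebuild the two 48-bit words from three 32-bit transfers."""
    t1, t2, t3 = transfers
    word1 = ((t1 >> 16) << 32) | ((t1 & MASK16) << 16) | (t2 & MASK16)
    word2 = ((t3 >> 16) << 32) | ((t3 & MASK16) << 16) | (t2 >> 16)
    return word1, word2


transfers = split_two_instructions(0x111122223333, 0xAAAABBBBCCCC)
print([hex(t) for t in transfers])
```

Note that the second transfer carries the low 16 bits of both words, which is why an odd number of instruction words leaves half of a transfer unused and requires the dummy access described above.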
When a host writes a 48-bit word with 32-bit packing using the typical bus interface hardware shown in Figure 7-25 on page 7-81, the sequence of events occurs as illustrated in Figure 7-23 on page 7-73.

16- to 48-Bit Packing

The packing sequence for downloading or uploading instructions over a 16-bit host bus takes three cycles for every 48-bit word, as shown in Table 7-15.

Table 7-15. 16- to 48-Bit Word Packing, HMSWF=1 (Host Bus <-> ADSP-21161)

Transfer           Data Bus Pins 31-16
First transfer     Word1, bits 47-32
Second transfer    Word1, bits 31-16
Third transfer     Word1, bits 15-0

When a host writes a 48-bit word with 16-bit packing using the typical bus interface hardware shown in Figure 7-25 on page 7-81, the sequence of events occurs as illustrated in Figure 7-22 on page 7-72.

8- to 48-Bit Packing

The packing sequence for downloading or uploading instructions over an 8-bit host bus takes six cycles for every 48-bit word, as shown in Table 7-16. The endian format of the transfers is controlled by the HMSWF bit in the SYSCON register. If HMSWF=0, the least significant word is packed first. If HMSWF=1, the most significant word is packed first.

When a host writes a 48-bit word with 8-bit packing using the typical bus interface hardware shown in Figure 7-25 on page 7-81, the sequence of events occurs as illustrated in Figure 7-23 on page 7-73.

Table 7-16. 8- to 48-Bit Word Packing, HMSWF=1 (Host Bus <-> ADSP-21161)

Transfer           Data Bus Pins 23-16
First transfer     Word1, bits 47-40
Second transfer    Word1, bits 39-32
Third transfer     Word1, bits 31-24
Fourth transfer    Word1, bits 23-16
Fifth transfer     Word1, bits 15-8
Sixth transfer     Word1, bits 7-0
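As a rough illustration of Tables 7-15 and 7-16, the Python sketch below splits one 48-bit instruction into the sequence of 16-bit or 8-bit host transfers. The function name is hypothetical. The tables only show HMSWF=1; the sketch assumes that HMSWF=0 simply reverses the transfer order, as the text states for the 8-bit case.

```python
def split_instruction(word, host_bits, hmswf=1):
    """Split a 48-bit word into host-bus-sized pieces in transfer order."""
    if host_bits not in (8, 16):
        raise ValueError("this sketch covers 8- and 16-bit host buses only")
    if not 0 <= word < (1 << 48):
        raise ValueError("instruction word must fit in 48 bits")
    count = 48 // host_bits          # 6 transfers for 8-bit, 3 for 16-bit
    mask = (1 << host_bits) - 1
    # Build the pieces most significant first, as in the HMSWF=1 tables.
    pieces = [(word >> (host_bits * i)) & mask for i in reversed(range(count))]
    return pieces if hmswf else pieces[::-1]


print([hex(p) for p in split_instruction(0x112233445566, 8)])
print([hex(p) for p in split_instruction(0x112233445566, 16)])
```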
Vector Interrupts. The host can issue a vector interrupt to the processor by writing the address of an interrupt service routine to the VIRPT register. When serviced, this high priority interrupt causes the processor to branch to the service routine at that address.

The MSGRx and VIRPT registers also support shared-bus multiprocessing through the external port. Because these registers may be shared resources within a single processor, conflicts may occur; your system software must prevent this. For further discussion of I/O processor register access conflicts, see I/O Processor Registers on page A-47.

Message Passing (MSGRx)

There are three possible software protocols that the host can use for communicating with the processor through the MSGRx message registers: vector-interrupt-driven, register handshake, and register write-back.

For the vector-interrupt-driven method, the host fills predetermined MSGRx registers with data and triggers a vector interrupt by writing the address of the service routine to VIRPT. The service routine should read the data from the MSGRx registers and then write 0 into VIRPT. This signals the host that the routine is complete. The service routine could also use one of the processor's FLAG11-0 pins to indicate completion.

For the register handshake method, four of the MSGRx registers are designated as follows: a receive register (R), a receive handshake register (RH), a transmit register (T), and a transmit handshake register (TH). To pass data to the ADSP-21161 processor, the host writes data into T and then writes a 1 into TH. When the ADSP-21161 sees a 1 in TH, it reads the data from T and then writes back a 0 into TH. When the host sees a 0 in TH, it knows that the transfer is complete. A similar sequence of events occurs when the ADSP-21161 passes data to the host through R and RH.
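The register handshake method described above can be sketched as a small software model. In the following Python example, the class and function names are hypothetical, and the two attributes merely stand in for whichever MSGRx registers a system designates as T and TH; real code would access the memory-mapped registers over the host bus.

```python
class MessageRegisters:
    """Hypothetical stand-in for the MSGRx registers designated T and TH."""

    def __init__(self):
        self.T = 0   # transmit data register
        self.TH = 0  # transmit handshake register


def host_send(regs, value):
    """Host side: write data into T, then write 1 into TH.

    Returns False if the previous transfer has not been consumed yet.
    """
    if regs.TH != 0:
        return False
    regs.T = value
    regs.TH = 1
    return True


def dsp_poll(regs):
    """Processor side: when TH is 1, read T and write 0 back into TH."""
    if regs.TH != 1:
        return None
    value = regs.T
    regs.TH = 0
    return value


regs = MessageRegisters()
host_send(regs, 0x1234)
print(hex(dsp_poll(regs)))
```

Because the handshake flag is separate from the data, this method can pass any value, including 0.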
The register write-back method is similar to register handshaking, but uses only the T and R data registers. The host writes data to T. When the ADSP-21161 sees a non-zero value in T, it retrieves it and writes back a 0 to T. A similar sequence occurs when the host is receiving data. This simpler method works well when the data to be passed does not include 0.

Host Vector Interrupts (VIRPT)

Vector interrupts are used for interprocessor commands between the host and an ADSP-21161 or between two ADSP-21161s. When the external processor writes an address to the ADSP-21161's VIRPT register, the write triggers a vector interrupt. For more information, see Multiprocessing Interrupts on page 3-49.

To use the ADSP-21161's vector interrupt feature, the host can perform the following sequence of actions:

1. Poll the processor's VIRPT register until the host reads a certain token value (for example, zero).
2. Write the vector interrupt service routine address to VIRPT.
3. When the service routine is finished, the processor writes the token back into VIRPT to indicate that it is finished and that another vector interrupt can be initiated.
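The three steps above can be modeled as follows. This Python sketch is a simplified illustration: the `Virpt` class stands in for the memory-mapped VIRPT register, the token value of zero follows the example in the text, and the service routine address used in the usage lines is made up.

```python
TOKEN = 0  # token value agreed between host and processor (zero, as in the text's example)


class Virpt:
    """Hypothetical stand-in for the VIRPT register."""

    def __init__(self):
        self.value = TOKEN


def host_issue_vector_interrupt(virpt, isr_address):
    """Steps 1 and 2: check for the token, then write the routine address.

    Returns False if the token is not present, meaning the host should
    poll again later.
    """
    if virpt.value != TOKEN:
        return False
    virpt.value = isr_address
    return True


def dsp_finish_service_routine(virpt):
    """Step 3: the processor writes the token back when the routine is done."""
    virpt.value = TOKEN


virpt = Virpt()
print(host_issue_vector_interrupt(virpt, 0x40100))  # hypothetical address
print(host_issue_vector_interrupt(virpt, 0x40200))  # refused until the token returns
dsp_finish_service_routine(virpt)
```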
Access to the Processor Bus: Slave Processor

Figure 7-24 shows an example of an interface to a system bus that isolates the local processor bus from the system bus. When the system is not accessing the DSPs, the local bus supports transfers between other local DSPs and local external memory or devices. When the system needs to access a processor, the system executes a read or write to the address range of the processor subsystem. The external address comparator detects a local access and asserts HBR and the appropriate CS line. The processor holds off the system bus with REDY until the processor is ready to accept the data. The HBG signal enables the system bus buffers. The direction of the data buffers is controlled by the read or write signals.

To avoid glitching the HBR line when addresses are changing, the address comparator may be qualified by an enable signal from the system or qualified by the system read or write signals. These methods cause HBR to be deasserted each time the system read or write is deasserted or the address is changed. Because these techniques deassert HBR with each access, the overhead of an HTC occurs as part of each access.

Access to the System Bus: Master Processor

Figure 7-25 shows a bidirectional system interface in which the processor subsystem can access the system bus by becoming a bus master. Before beginning the access, the processor first requests permission to become the bus master by generating the System Bus Request signal (SBR). A bus arbitration unit determines when to respond with the System Bus Grant signal (SBG). Here, each system bus master generates and responds to its own unique pair of signals.

The method a processor uses to arbitrate for the system bus depends on whether the access is from the processor core or the I/O processor. For more information, see Processor Core Access to System Bus on page 7-82 and DMA Access to System Bus on page 7-84.
[Figure 7-24 (page 7-80): slave processor interface to the system bus. ADSP-21161 #1 on the cluster bus with external memory, an address comparator driving HBR and CS, and system bus buffers enabled by HBG.]

[Figure 7-25 (page 7-81): bidirectional system bus interface. ADSP-21161 #1 on the cluster bus with external memory, an address comparator, the SBTS input, and System Bus Request and System Bus Grant signals, with the grant returned on a FLAG input.]
Processor Core Access to System Bus

The processor core may arbitrate for the system bus by setting a flag and waiting for SBG on another flag. This technique has the benefit of not stalling the local bus while waiting. If SBG is tied to an interrupt pin, the processor can continue processing while waiting.

Another method for the processor access is to attempt the access assuming that the system bus is available. The processor then either waits or aborts the access if the bus is not available. The processor begins the access to the system bus by asserting one of the memory select lines, MS3-0. This assertion also asserts SBR. If the system bus is not available (for example, SBG is deasserted), the processor should be held off with ACK. This approach is simple, but stalls the processor and the local bus when the system bus is accessed while it is busy. To overcome this stall, programs can use the Type 10 instruction:
IF condition JUMP(addr), ELSE compute, DM(addr)=dreg;
This instruction aborts the bus access if the condition (SBG) is not true and causes the program to branch to a try-again-later routine. This method works well if SBG is asserted most of the time. If the Type 10 instruction is not used, a deadlock condition can result if an access is attempted before the bus is granted.

The processor samples FLAG inputs at the CLKIN frequency except when CLKDBL is enabled. When CLKDBL is enabled, the processor samples FLAG inputs at the CLKOUT frequency. FLAG outputs must be held stable for at least one full CLKIN cycle.

Deadlock Resolution

When both the processor subsystem and the system try to access each other's bus in the same cycle, a deadlock may occur in which neither access can complete; ACK stays deasserted.
Normally, the master processor responds to an HBR request by asserting HBG after the completion of the current access. If the processor is accessing the system bus at the same time, HBG is not asserted, because this current access cannot complete. This condition results in a deadlock in which neither access can complete. The deadlock may be broken by asserting the Suspend Bus Three-state (SBTS) input for one or more cycles after the deadlock is detected, that is, when the system bus to local bus buffer is requested from both sides.

The combination of SBTS and HBR puts the master processor into slave mode and suspends the processor core's external access. This suspension lets the system access to the local bus proceed after the processor asserts HBG. The combination of HBR and SBTS should only be applied when there is a deadlock caused by a processor access to the system bus. SBTS should not be used when there is a local bus transfer, because the WR signal is asserted twice: once before SBTS is asserted and once after the access resumes. For processor-to-processor transfers on the local bus, this double assertion violates the slave timing requirements.

The following sequence of actions allows the host processor to suspend an ongoing processor access and gain access to its internal resources, provided that: 1) the access originates from the processor's core, not the DMA controller, 2) a DRAM page miss is not detected for that memory access, and 3) bus lock is not enabled.

1. After HBR is asserted, the host asserts SBTS for one or more cycles. If SBTS is asserted one or more cycles after HBR is recognized, HBG is guaranteed to be asserted in the next cycle. SBTS should be deasserted before HBR is deasserted.
2. The host drives the RD and WR strobes to their correct values after HBG is asserted. The host may then perform as many accesses as desired.
3. The host has full control of the bus and may access any of the processors or peripherals on the bus.
4. The host deasserts HBR. HBG is deasserted when the internal read buffer is empty.
5. One cycle after the processor deasserts HBG, it restarts its suspended access.

DMA Access to System Bus

The SBTS and HBR method of resolving a system bus deadlock, described in Deadlock Resolution on page 7-82, cannot be used for DMA transfers, because after a DMA word transfer has begun in the ADSP-21161, it must be completed (for example, it must receive the ACK signal). If SBTS and HBR are asserted during a DMA access, the HBG pin is not asserted until the access cycle has completed. If the single DMA access is not allowed to complete, a deadlock condition may result.

To prevent system bus deadlock when using DMA, programs must make sure that SBG has been asserted before the DMA sequence begins. If a higher priority access is needed, the DMA sequence may be held off (by asserting HBR) at any time after a word has been transferred. Systems must ensure that SBG is asserted before HBR is deasserted to prevent the possibility of another deadlock occurring. When the DMA sequence is complete, the DMA interrupt service routine should clear the external SBR flag.

Because the system bus is likely to be considerably slower than the local bus, performance on the local bus may be improved considerably by using handshake mode DMA. In this case, the SBG signal is tied to the DMA request line, DMARx. The local and system bus accesses are only initiated when the system bus is available. Using a FIFO in the system interface unit, to allow DMA data from the local bus to be posted, may also increase performance on the local bus when using a slow system bus.
Multiprocessing With Local Memory

Figure 7-26 shows how several subsystems may be connected together on a system bus for high throughput. The gate array implements bus arbitration when the system bus is accessed. The buffers isolate the local buses from the system bus.

The example system in Figure 7-26 works in the following way: A processor requests the system bus with SBR when it asserts the MS2 line. The gate array arbitrates between the SBR lines and then enables the highest priority group by asserting SBG, which is tied to FLAG0. The master processor may connect to system memory or to other processor groups. When the bus buffer is enabled, the read or write strobe enables should be asserted with a delay to allow the address to stabilize. To access a processor slave in another group, the master processor addresses that group's multiprocessor memory space. The gate array detects group multiprocessor memory space from three high-order address bits and asserts the HBR line for the selected group. When HBG is asserted, the gate array enables the slave's bus buffer. The high-order group address bits are cleared by the buffer to allow the group to decode the address as local multiprocessor memory space. The access is asynchronous because the CS line is asserted. The single waitstate option for the bus should be enabled. If two groups access each other in the same cycle, a deadlock may occur. The SBTS pin may be used to clear the deadlock.

[Figure 7-26. Subsystems on a System Bus: ADSP-21161 #1 and ADSP-21161 #2, each with local memory and a local bus buffer, connected through a gate array (SBR, SBG, MS2, FLAG0, ACK, REDY, CS, HBR, HBG, SBTS) to the system bus and system memory.]

ADSP-21161 to Microprocessor Interface

An ADSP-21161 without external memory may connect to a host microprocessor's bus. Depending on the microprocessor's I/O capabilities, the interface may not require any buffers. This type of connection assumes that the processor can execute its application from internal memory most of the time and only occasionally needs to request an external access.

The host microprocessor should always keep the HBR request asserted unless it sees BR1 asserted (for the BRx line of the processor with ID=001). The host can then deassert HBR to allow the processor to perform an external access when the host is ready to give up its bus. Usually, the host can read or write to the processor as needed. The host accesses the processor by asserting CS and handshaking with REDY. The HBG signal is not necessary in this system.
[Figure (page 7-88): multiprocessor system diagram. ADSP-21161 #1 through #4 share CLOCK (CLKIN), RESET, ADDR23-0, DATA47-16, and control lines, with ID2-0 set per processor, BR6-1 arbitration lines, an optional boot EPROM selected by BMS, external memory, and a host interface (CS, HBR, HBG, REDY, SBTS).]
Table 7-17 shows the external port signals for multiprocessor arbitration and communication.

Table 7-17. Signal for Cluster Multiprocessor Systems

Signal Type           Signals
Synchronization       CLKIN, RESET
Arbitration           BR6-1, PA (1)
Bused Information     ADDR23-0, DATA47-16
Master Controls       RD, WR, BRST
Slave Control         ACK
Host Interface (2)    HBR, HBG, CS, REDY, SBTS

(1) Optional, only needed if Priority Access function is used.
(2) Optional, only needed if Host Interface is used.
The I/O processor registers of the system's processors make up the multiprocessor memory space. Multiprocessor memory space is mapped into the unified address space of each processor. For more information, see the multiprocessor memory map in Figure 5-8 on page 5-20.

After a processor becomes the bus master, it can read and write to any of the slaves' I/O processor registers, including their external port FIFO data buffers. For example, the master processor may write to a slave's I/O processor registers to set up DMA transfers or to send a vector interrupt.

The ADSP-21161 processor only supports direct reads and writes to I/O processor registers. However, internal memory can be accessed indirectly through EPBx DMA transfers.
[Figure 7-28. Data Flow Multiprocessing: processors connected point to point through their link ports.]

The ADSP-21161 provides complete support for data flow multiprocessing applications, because the processor eliminates the need for interprocessor data FIFOs and external memory. The internal memory of the processor is usually large enough to contain both code and data for most applications using data-flow system topology. Data flow systems only require a number of processors and point-to-point signals connecting them. This design yields savings in complexity, board space, and system cost. For more information on connecting multiple processors using link ports, see Host Processor Access To Link Buffers on page 9-14.

Cluster Multiprocessing

Cluster multiprocessing works for applications where flexibility is required. This flexibility is needed when a system must be able to support a variety of different tasks, some of which may be running concurrently. The cluster multiprocessing configuration is shown in Figure 7-29. Also, the processor has an on-chip host interface that lets a cluster be interfaced to a host processor or another cluster.
[Figure 7-29. Cluster Multiprocessing: three ADSP-21161 processors, each with link ports and an external port, sharing a bus to bulk memory.]

Cluster multiprocessing systems include multiple ADSP-21161s connected to a parallel bus that supports interprocessor access of on-chip memory-mapped registers and access to shared global memory. In a typical cluster of processors, up to six processors and a host can arbitrate for the bus. The on-chip bus arbitration logic lets these processors share the common bus.

The ADSP-21161's features (such as large internal memory, link ports, and external port FIFOs) help eliminate the need for any extra hardware in the cluster multiprocessor configuration. External memory, both local and global, can frequently be eliminated in this type of system.

The ADSP-21161 supports fixed and rotating priority schemes. Other supported techniques include bus locking, timed release, DMA prioritization, and core processor access preemption of background DMA transfers. The on-chip arbitration logic limits transitions in bus mastership to at most one cycle of overhead. Bus requests are generated implicitly when a processor accesses an external address. Because each processor monitors all bus requests and applies the same priority logic to the requests, each can independently determine which processor is the next bus master.

After getting mastership of the bus, a processor can access external memory and the I/O processor registers of all other processors (slaves) in the system. A processor can directly transfer data to another processor or set up a DMA channel to transfer the data.

The processors are mapped into a common memory map that identifies the address space of each processor within the unified memory map of the system cluster. Also, each processor has a unique ID. The processors' I/O processor registers and external memory are all part of the unified address space.

The cluster configuration allows the processors to have a very fast node-to-node data transfer rate. Clusters also allow a simple, efficient software communication model. For example, all of the required setup operations for a DMA transfer can be accomplished by a single processor on one side of the transfer. The other processor is not interrupted until the DMA transfer is complete.

The ADSP-21161's internal memory facilitates I/O in multiprocessor systems.
The on-chip, dual-ported RAM supports full-speed inter-processor DMA transfers concurrent with dual accesses by the processor core. Because no cycles are stolen from the processor core, the processor's full performance is maintained during these accesses.

Link Port Data Transfers In A Cluster. A bottleneck exists within the cluster because only two processors can communicate over the shared bus during each cycle; other processors are held off until the bus is released. Because the processor can also perform point-to-point link port transfers within a cluster, systems can eliminate this bottleneck by setting up data communication through the link ports. Data links between processors can be dynamically set up and initiated over the common bus. Both link ports can operate simultaneously on each processor.

A disadvantage of the link ports is that individual transfers occur at a much lower rate than that of the shared parallel bus. Because the link port's 8-bit data path is smaller than the processor's native word size, the transfer of each word requires multiple clock cycles. Link ports may also require more software overhead and complexity because they must be set up on both sides of the transfers before they can occur.

SIMD Multiprocessing. For certain classes of applications such as radar imaging, a Single-Instruction Multiple-Data (SIMD) array of processors may be the most efficient topology to coordinate a large number of processors in a single system. The SIMD array of Figure 7-29 on page 7-91 consists of multiple processors connected in a two- or three-dimensional mesh. The data link ports provide nearest neighbor communications and through-routing of data. A single master processor provides the instruction stream that the array executes. Data flow in and out of the array can be managed through multiple serial port streams.
The processor accomplishes bus arbitration through the BR1-6, HBR, and HBG signals. BR1-6 arbitrate between multiple processors, and HBR/HBG pass control of the bus from the processor bus master to the host and back. The priority scheme for bus arbitration is determined by the setting of the RPBA pin. Table 7-18 defines the processor pins used in multiprocessing systems.

Table 7-18. Multiprocessing Pins

Signal: BR6-1
Type: I/O/S
Definition: Multiprocessing Bus Requests. Used by multiprocessing to arbitrate for bus mastership. A processor only drives its own BRx line (corresponding to the value of its ID2-0 inputs) and monitors all others. In a multiprocessor system with fewer than six processors, the unused BRx pins should be tied high; the processor's own BRx line must not be tied high or low because it is an output.

Signal: ID2-0
Definition: Multiprocessing ID. Determines which multiprocessing bus request (BR1 - BR6) is used by the ADSP-21161 processor. ID = 001 corresponds to BR1, ID = 010 corresponds to BR2, and so on. Use ID = 000 or ID = 001 in single-processor systems. These lines are a system configuration selection that should be hardwired or only changed at reset.

Signal: RPBA
Definition: Rotating Priority Bus Arbitration Select. When RPBA is high, rotating priority for multiprocessor bus arbitration is selected. When RPBA is low, fixed priority is selected. This signal is a system configuration selection which must be set to the same value on every processor. If the value of RPBA is changed during system operation, it must be changed in the same CLKIN cycle on every processor.

Signal: PA
Type: (a/d) I/O/S
Definition: Priority Access. The processor slave may assert the PA signal to interrupt background DMA transfers and gain access to the external bus. This signal is asserted when a slave processor's core requests the bus or if an external DMA channel requests the bus with the DMACx PRIO control bit set. The PA signal is an active drive output, which may be asserted (low) by one or more slaves. It is deasserted (high) by the master. A protocol is used to avoid driver contention.

I = Input, O = Output, S = Synchronous, A = Asynchronous, (o/d) = Open Drain, (a/d) = Active Drive
The ID2-0 pins provide a unique identity for each processor in a multiprocessing system. The first processor should be assigned ID=001, the second should be assigned ID=010, and so on. One of the processors must be assigned ID=001 in order for the bus synchronization scheme to function properly. The processor with ID=001 holds the external bus control lines stable during reset.

When the ID2-0 inputs of a processor are equal to 001, 010, 011, 100, 101, or 110, the processor configures itself for a multiprocessor system and maps its I/O processor registers into the multiprocessor memory space. ID=000 configures the processor for a single-processor system. ID=111 is reserved and should not be used.

A processor in a multiprocessor system can determine which processor is the current bus master by reading the CRBM2-0 bits of the SYSTAT register. These bits give the value of the ID2-0 inputs of the current bus master.

Conditional instructions can be written that depend upon whether the processor is the current bus master in a multiprocessor system. The assembly language mnemonic for this condition code is BM, and its complement is Not BM (not bus master). The BM condition indicates whether the processor is the current bus master. For more information, see Conditional Sequencing on page 3-53. To use the bus master condition, the condition code select (CSEL) field in the MODE1 register must be zero or the condition is always evaluated as false.

Bus Arbitration Protocol

The Bus Request (BR1-6) pins are connected between each processor in a multiprocessing system, with the number of BRx lines used equal to the number of processors in the system. Each processor drives the BRx pin that corresponds to its ID2-0 inputs and monitors all others. If fewer than six processors are used in the system, the unused BRx pins should be tied high.
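The ID2-0 rules described in this section can be summarized in a short checking routine. The following Python sketch is illustrative only; the function names and message strings are hypothetical, and the checks cover just the rules stated in the text (unique IDs, values 001 through 110 for multiprocessor systems, and one processor assigned ID=001).

```python
def describe_id(id_bits):
    """Describe how a processor configures itself for a given ID2-0 value."""
    if not 0 <= id_bits <= 0b111:
        raise ValueError("ID2-0 is a 3-bit value")
    if id_bits == 0b000:
        return "single-processor system"
    if id_bits == 0b111:
        return "reserved"
    # ID = 001 corresponds to BR1, ID = 010 to BR2, and so on.
    return f"multiprocessor system, drives BR{id_bits}"


def check_cluster_ids(ids):
    """Return a list of problems found in a proposed set of ID assignments."""
    problems = []
    if len(set(ids)) != len(ids):
        problems.append("IDs are not unique")
    if any(i not in range(0b001, 0b111) for i in ids):
        problems.append("multiprocessor IDs must be 001 through 110")
    if 0b001 not in ids:
        problems.append("one processor must have ID=001")
    return problems


print(describe_id(0b010))
print(check_cluster_ids([0b001, 0b010, 0b011]))
```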
When one of the slave processors needs to become bus master, it automatically initiates the bus arbitration process by asserting its BRx line at the beginning of the cycle. Later in the same cycle, the processor samples the value of the other BRx lines.

The cycle in which mastership of the bus is passed from one processor to another is called a Bus Transition Cycle (BTC). A bus transition cycle occurs when the current bus master's BRx pin is deasserted and one or more of the slaves' BRx pins is asserted. The bus master can retain bus mastership by keeping its BRx pin asserted. Also, the bus master does not always lose bus mastership when it deasserts its BRx line; another BRx line must be asserted by one or more of the slaves at the same time. In this case, when no other BRx is asserted, the master does not lose any bus cycles.

By observing all of the BRx lines, each processor can detect when a bus transition cycle occurs and which processor has become the new bus master. A bus transition cycle is the only time that bus mastership is transferred.

After conditions determine that a bus transition cycle is going to occur, every processor in the system evaluates the priority of the BRx lines asserted within that cycle. For a description of bus arbitration priority, see Bus Arbitration Priority (RPBA) on page 7-98. The processor with the highest priority request becomes the bus master on the following cycle, and all of the processors update their internal records to indicate which processor is the current bus master. This information can be read from the current bus master field, CRBM, of the SYSTAT register. Figure 7-30 on page 7-99 shows typical timing for bus arbitration.

The actual transfer of bus mastership is accomplished by the current bus master three-stating the external bus (DATA47-16, ADDR23-0, CLKOUT, RD, WR, BRST, MS3-0, HBG, DMAG2-1) at the end of the bus transition cycle and the new bus master driving these signals at the beginning of the next cycle.

Note: For a complete description of CLKOUT functionality, see Table 13-1 on page 13-4.
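The bus transition cycle condition stated above reduces to a simple predicate over the BRx lines. The Python sketch below is a logical model only: the function name is hypothetical, and the set of asserted requests is represented by processor ID numbers rather than by pin levels.

```python
def is_bus_transition_cycle(requests, master_id):
    """Return True if this cycle is a bus transition cycle (BTC).

    requests:  set of processor IDs whose BRx line is asserted this cycle.
    master_id: ID of the current bus master.
    """
    master_requesting = master_id in requests
    slave_requesting = any(r != master_id for r in requests)
    # A BTC needs the master's BRx deasserted AND at least one slave BRx asserted.
    return (not master_requesting) and slave_requesting


print(is_bus_transition_cycle({2}, master_id=1))     # master released, slave 2 asks
print(is_bus_transition_cycle({1, 2}, master_id=1))  # master keeps its BRx asserted
print(is_bus_transition_cycle(set(), master_id=1))   # nobody else asks
```

Because every processor evaluates the same lines with the same rule, each one reaches the same conclusion independently.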
The bus strobes (RD, WR) and MS3-0 are driven high (inactive) before three-stating occurs. ACK must be sampled high by the new master before it starts a new bus operation. For more information, see Figure 7-30.

During bus transition cycle delays, execution of external accesses is delayed. When one of the slave processors needs to perform an external read or write, it automatically initiates the bus arbitration process by asserting its BRx line. This read or write is delayed until the processor receives bus mastership. If the read or write was generated by the processor's core (not the I/O processor), program execution stops on that processor until the instruction is completed.

The following steps occur as a slave acquires bus mastership and performs an external read or write over the bus as shown in Figure 7-31 on page 7-100.

1. The slave determines that it is executing an instruction which requires an off-chip access. It asserts its BRx line at the beginning of the cycle. Extra cycles are generated by the core processor (or I/O processor) until the slave acquires bus mastership.
2. To acquire bus mastership, the slave waits for a bus transition cycle in which the current bus master deasserts its BRx line. If the slave has the highest priority request in the bus transition cycle, it becomes the bus master in the next cycle. If not, it continues waiting.
3. At the end of the bus transition cycle the current bus master releases the bus, and the new bus master starts driving.

During the CLKIN cycle in which the bus master deasserts its BRx output, it three-states its outputs in case another bus master wins arbitration and enables its drivers in the next CLKIN cycle. If the current bus master retains control of the bus in the next cycle, it enables its bus drivers, even if it has no bus operation to run.
The processor with ID=00x enables internal keeper latches, or pullup devices, on key signals, including the address and data buses, strobes, and ACK. These devices provide a weak current source or sink (approximately 20 kΩ impedance) to keep these signals from drifting near input receiver thresholds when all drivers are three-stated.

When the bus master stops using the bus, its BRx line is deasserted, allowing other processors to arbitrate for mastership if they need it. If no other processors are asserting their BRx line when the master deasserts its BRx, the master retains control of the bus and continues to drive the memory control signals until: 1) it needs to use the bus again, or 2) another processor asserts its BRx line.

While a slave waits to be a master for a DMA transfer, it asserts BRx. If that slave's core accesses the DMA address registers, BRx is deasserted during that access. See I/O Processor Registers Memory Map on page A-51.

Bus Arbitration Priority (RPBA)

To resolve competing bus requests, there are two available priority schemes: fixed and rotating. The RPBA pin selects the scheme. When RPBA is high, rotating priority bus arbitration is selected, and when RPBA is low, fixed priority is selected. The RPBA pin must be set to the same value on each processor in a multiprocessing system. If the value of RPBA is changed during system operation, it must be changed synchronously to CLKIN and must meet a setup time that lets all processors recognize the change in the same cycle. The priority scheme changes in that (same) cycle.
[Figure 7-30: timing diagram. Recoverable annotations: highest priority requester becomes bus master; MS and strobes driven inactive before three-state; BTC does not occur if no other BRx is asserted; minimum 2-cycle synchronous read, with the slave deasserting ACK in the second cycle if needed.]

Figure 7-30. Bus Request and Read/Write Timing

In the fixed priority scheme, the processor with the lowest ID number among the competing bus requests becomes the bus master. If, for example, the processor with ID=010 and the processor with ID=100 request the bus simultaneously, the processor with ID=010 becomes bus master in the following cycle. Each processor knows the ID of the other processors requesting the bus, because the ID corresponds to the BRx line being used for each processor.
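As a minimal sketch of the fixed priority rule, the winner of a bus transition cycle is simply the lowest ID among the requesters. The function name and the use of a Python set for the asserted BRx lines are assumptions for illustration.

```python
def fixed_priority_winner(requesting_ids):
    """Return the processor ID that wins under fixed priority.

    requesting_ids -- iterable of processor IDs whose BRx line is asserted
                      (hypothetical representation)
    Returns None when no processor is requesting the bus.
    """
    ids = list(requesting_ids)
    if not ids:
        return None
    # Fixed priority: the lowest ID among the competing requests wins.
    return min(ids)


# The example from the text: ID=010 and ID=100 request simultaneously.
print(fixed_priority_winner({0b010, 0b100}))  # 2, that is ID=010
```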
[Figure 7-31: timing diagram. Recoverable labels: ADSP-21161 #1 is bus master; BTC; CLKIN; bus requests BR1 and BR2; ADSP-21161 with ID=1; internal operation; external access; perform access; undriven.]

Figure 7-31. Bus Arbitration Timing

The rotating priority scheme gives roughly equal priority to each processor. When rotating priority is selected, the priority of each processor is reassigned after every transfer of bus mastership. Highest priority is rotated from processor to processor as if they were arranged in a circle: the processor located next to (one place down from) the current bus master is the one that receives highest priority. Table 7-19 shows an example of how rotating priority changes on a cycle-by-cycle basis.

Table 7-19. Rotating Priority Arbitration Example
Hardwired Processor IDs and Priority [1]

Cycle Number   ID1    ID2    ID3    ID4   ID5   ID6
1 [2]          M      1      2-BR   3     4     5
2              4      5-BR   M-BR   1     2     3
3              4      5-BR   M      1     2     3
4              5-BR   M      1      2     3     4-BR
5 [3]          1-BR   2      3      4     5     M

[1] The following symbols appear in these cells: 1-5 = assigned priority, M = bus mastership (in that cycle), BR = requesting bus mastership with BRx
[2] Initial priority assignments
[3] Final priority assignments
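The rotation in Table 7-19 can be replayed with a short Python sketch. It assumes six processors with IDs 1 to 6 and treats priority 1 as the highest; the function names and the per-cycle request sets are illustrative choices, not part of the hardware description.

```python
NUM_PROCESSORS = 6  # IDs 1..6, as in Table 7-19


def rotating_priorities(master, n=NUM_PROCESSORS):
    """Map each non-master ID to its priority (1 = highest).

    The processor one place after the master gets priority 1, the next
    one priority 2, and so on around the circle.
    """
    return {((master - 1 + k) % n) + 1: k for k in range(1, n)}


def next_master(master, master_br, slave_requests, n=NUM_PROCESSORS):
    """Return the bus master for the following cycle.

    master_br      -- True if the current master keeps its BRx asserted
    slave_requests -- set of slave IDs asserting BRx in this cycle
    """
    if master_br or not slave_requests:
        return master  # no bus transition cycle
    prio = rotating_priorities(master, n)
    return min(slave_requests, key=lambda pid: prio[pid])


# Replay of Table 7-19: (master keeps BRx?, slaves requesting) per cycle.
cycles = [
    (False, {3}),      # cycle 1: ID1 is master, ID3 requests
    (True, {2}),       # cycle 2: ID3 is master and still asserts BR3
    (False, {2}),      # cycle 3: ID3 releases, ID2 requests
    (False, {6, 1}),   # cycle 4: ID2 is master, ID6 and ID1 request
]
master = 1
history = [master]
for master_br, requests in cycles:
    master = next_master(master, master_br, requests)
    history.append(master)
print(history)  # [1, 3, 3, 2, 6]
```

In cycle 4 both ID6 (priority 4) and ID1 (priority 5) request the bus, so ID6 becomes master in cycle 5, matching the last row of the table.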
Bus Mastership Timeout

In either the fixed or rotating priority scheme, systems may need to limit how long a bus master can retain the bus. Systems can limit bus mastership by forcing the bus master to deassert its BRx line after a specified number of CLKIN cycles and giving the other processors a chance to acquire bus mastership. To set up a bus master timeout, a program must load the BMAX register (Figure 7-32) with the maximum number of CLKIN cycles (minus 2) for which the processor is allowed to retain bus mastership:

$$\mathrm{BMAX} = (\text{maximum number of bus mastership CLKIN cycles}) - 2$$

Internal processor clock cycles are a multiple of CLKIN cycles.
The minimum value for BMAX is 2, which lets the processor retain bus mastership for four CLKIN cycles. Setting BMAX=1 is not allowed. To disable the bus master timeout function, set BMAX=0.

Each time a processor acquires bus mastership, its BCNT register is loaded with the value in BMAX. BCNT is then decremented in every CLKIN cycle in which the master performs a read or write over the bus and any other (slave) processors are requesting the bus. Any time the bus master deasserts its BRx line, BCNT is reloaded from BMAX.

When BCNT decrements to zero, the bus master first completes its off-chip read/write and then deasserts its own BRx (any new off-chip accesses are delayed); this allows transfer of bus mastership. If the ACK signal is holding off an access when BCNT reaches zero, bus mastership is not relinquished until the access can complete. If BCNT reaches zero while a burst transfer is in progress, the bus master completes the burst transfer before deasserting its BRx output. If BCNT reaches zero while bus lock is active, the bus master does not deassert its BRx line until bus lock is removed. If HBR is being serviced, BCNT stops decrementing and continues only after HBR is deasserted. Bus lock is enabled by the BUSLK bit in the MODE2 register. For more information, see Bus Lock and Semaphores on page 7-110.
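The BMAX calculation and the BCNT countdown can be sketched as follows. This is a simplified model under stated assumptions: the function names are hypothetical, and the countdown ignores the reload on BRx deassertion as well as the ACK, burst, bus lock, and HBR cases described above.

```python
def bmax_for_cycles(max_clkin_cycles):
    """Return the BMAX value for a desired mastership limit in CLKIN cycles."""
    bmax = max_clkin_cycles - 2
    if bmax < 2:
        # BMAX=1 is not allowed, and BMAX=0 disables the timeout,
        # so the shortest programmable limit is four CLKIN cycles.
        raise ValueError("limit must be at least four CLKIN cycles")
    return bmax


def bcnt_timeout_cycle(bmax, contended):
    """Return the 1-based cycle in which BCNT reaches zero, or None.

    contended -- per-cycle booleans: True when the master performs a bus
                 access AND at least one slave is requesting the bus
    """
    if bmax == 0:
        return None  # timeout function disabled
    bcnt = bmax      # BCNT is loaded from BMAX when mastership is acquired
    for cycle, is_contended in enumerate(contended, start=1):
        if is_contended:
            bcnt -= 1
            if bcnt == 0:
                return cycle
    return None


print(bmax_for_cycles(16))                               # 14
print(bcnt_timeout_cycle(2, [True, False, True, True]))  # 3
```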
[Figure 7-32: BMAX register (0x18) bit diagram, bits 31-0.]
Priority Access

The Priority Access signal (PA) lets external bus accesses by a slave processor take priority over ongoing DMA transfers. Normally when external port DMA transfers are in progress, the slave processors cannot use the external bus until the DMA transfer is finished. By asserting its PA pin, the slave processor can acquire the bus without waiting for the DMA operation to complete. The PA signal can also be asserted by a slave with a high-priority DMA access pending on the external bus.

If the PA signal is not used in a multiprocessor system, the processor bus master does not give up the bus to another processor until: 1) a cycle in which it does not perform an external bus access or 2) a bus timeout. If a slave processor needs to send a high priority message or perform an important data transfer, it normally must wait until any DMA operation completes. Using the PA signal lets the slave perform its higher priority bus access with less delay.

Each of the DMACx registers has a PRIO bit that raises that DMA channel to a higher priority than all other internal DMA channels that do not have the PRIO bit set. Unless configured differently with the EBPR bit in the SYSCON register, this channel still has lower priority (internally) than the core. Programs should be careful to minimize the number of DMA channels enabled to high priority status in the multiprocessor system, because both core and (external) high priority DMA requests from slaves are arbitrated at the same priority level. For example, a slave core cannot arbitrate bus ownership away from a high priority DMA transfer, unless the bus timeout (BMAX function) occurs.

When PA is asserted, the current processor bus master deasserts its BRx output, and gives up the bus, provided:

1. Its core does not have an external access pending, and
2. None of its external bus DMA channels have pending high-priority bus requests.
All processor slaves also deassert their BRx outputs, if each slave meets the same provisions. The current bus master never asserts PA, because it already has control of the bus. If the current master detects a condition that would assert PA while it is bus master, it performs that high priority operation before giving up bus ownership.

In the CLKIN cycle after PA has been asserted, only the processor slaves with a pending high priority access have their bus requests asserted. Bus arbitration proceeds as usual with the highest priority device becoming the master when the previous bus master releases its BRx output. The new master samples all BRx inputs after gaining bus mastership, during the cycle that follows the BTC. If no other bus requests are asserted, the master is the only device driving PA, and the master deasserts and three-states PA in this cycle as shown in Figure 7-33.
[Figure 7-33: timing diagram over cycles 1-4 showing BR1-5, BR6, and PA. Recoverable annotations: all ADSP-21161s that do not have a core access pending remove their BRx; BTC; slaves cannot assert PA in the cycle after the BTC; the bus master samples all other BRx negated and negates PA.]
If the master samples other BRx inputs as asserted, multiple devices are driving PA, and the new bus master cannot deassert PA. The new bus master three-states its PA driver in this case. All processor slaves recognize the cycle following the BTC. They do not assert PA during this cycle, unless they were already driving their BR and PA outputs in the BTC. This behavior is demonstrated in Figure 7-34.
[Figure 7-34: timing diagram over cycles 1-4 showing BR1-6 and PA. Recoverable annotations: all ADSP-21161s that do not have a core access pending remove their BRx; BTC; the bus master samples other BRx asserted and three-states (only) PA; slaves continue to assert PA in this cycle.]
One of the processors in the system must be assigned ID=001 in order for the bus synchronization scheme to function properly. This processor also holds the external bus control lines stable during reset. Bus arbitration synchronization is disabled if the processor is in a single-processor system (ID=000).

To synchronize their bus arbitration logic and define the bus master after a system reset, the multiple processors obey the following rules:

• All processors except the one with ID=001 deassert their BRx line during reset. They keep their BRx deasserted for at least two cycles after reset and until their bus arbitration logic is synchronized [1].
• After reset, a processor considers itself synchronized when it detects a cycle in which only one BRx line is asserted. The processor identifies the bus master by recognizing which BRx is asserted and updates its internal record to indicate the current master.
• The processor with ID=001 asserts its BRx (BR1) during reset and for at least two cycles after reset. If no other BRx lines are asserted during these cycles, the processor with ID=001 drives the memory control signals to prevent them from glitching. Although it is asserting its BRx and driving the memory control signals during these cycles, this processor does not perform reads or writes over the bus.
• If the processor with ID=001 is synchronized by the end of the two cycles following reset, it becomes the bus master. If it is not synchronized at this time, it deasserts its BRx (BR1) and waits until it is synchronized.
• When a processor has synchronized itself, it sets the BSYN bit in the SYSTAT register.

[1] For a complete description of the functionality of the internal reset signal, RSTOUT, see Table 13-1 on page 13-4.
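The synchronization rule above (exactly one BRx line asserted in a cycle) can be expressed as a small check. The function name and the dictionary of BRx states are assumptions for illustration; the sketch does not model the two-cycle post-reset window or the HBG case.

```python
def detect_master_after_reset(br):
    """Return the bus master ID if this cycle allows synchronization.

    br -- dict mapping processor ID to True when its BRx is asserted
          (hypothetical representation)
    A processor synchronizes only in a cycle where exactly one BRx line
    is asserted; that line identifies the current bus master.
    """
    asserted = [pid for pid, on in sorted(br.items()) if on]
    if len(asserted) == 1:
        return asserted[0]
    return None  # zero or several requests: cannot synchronize yet


print(detect_master_after_reset({1: True, 2: False, 3: False}))  # 1
print(detect_master_after_reset({1: True, 2: True, 3: False}))   # None
```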
If one processor comes out of reset after the others have synchronized and started program execution, that processor may not be able to synchronize immediately (for example, if it detects more than one BRx line asserted). If the unsynchronized processor tries to execute an instruction with an off-chip read or write, it cannot assert its BRx line to request the bus, and execution is delayed until it can synchronize and correctly arbitrate for the bus. Synchronization cannot occur while HBG is asserted, because bus arbitration is suspended while the bus is controlled by a host. If HBR is asserted immediately after reset and no bus arbitration has taken place, the processor with ID=001 is considered to be the last bus master.

The processor with ID=001 maintains correct logic levels on the RD, WR, MS3-0, and HBG signals during reset. Because the 001 processor can be accidentally reset by an erroneous write to the soft reset bit (SRST) of the SYSCON register, it behaves in the following manner during reset:

• While it is in reset, the processor with ID=001 attempts to gain control of the bus by asserting BR1.
• While it is in reset, the processor with ID=001 drives the RD, WR, MS3-0, DMAG1, DMAG2, and HBG signals only if it determines that it has control of the bus. For the processor to decide it has control of the bus, two conditions must be true: 1) BR1 was asserted and no other BRx lines were asserted in the previous cycle, and 2) HBG was deasserted in the previous cycle.

The processor with ID=001 continues to drive the RD, WR, MS3-0, DMAG1, DMAG2, and HBG signals for two cycles after reset, as long as neither HBG nor any other BRx lines are asserted. At the end of the second cycle it assumes bus mastership (if it is synchronized), and normal bus arbitration begins in the following cycle. If it is not synchronized, it deasserts BR1, stops driving the memory control signals, and does not arbitrate for the bus until it becomes synchronized.
Although the bus synchronization scheme allows individual processors to be reset, the processor with ID=001 may fail to drive the memory control signals if it is in reset while any other processors are asserting their BRx line.

If the processor with ID=001 has asserted HBG while it is in reset, it is synchronized when RSTOUT is deasserted [1]. This lets the host start using the bus while the processors are still in reset. If a host processor attempts to reset the processor bus master (which is driving the HBG output), the host immediately loses control of the bus. During reset [2], the ACK line is pulled high internally by the processor bus master with a 20 kΩ equivalent resistor.

[1], [2] For a complete description of the functionality of the internal reset signal, RSTOUT, see Table 13-1 on page 13-4.
• Slave Reads. For more information, see Slave Reads on page 7-57.
• Shadow Write FIFO. For more information, see Slave Reads on page 7-57.
• Data Transfers Through the EPBx Buffers. For more information, see Data Transfers Through the EPBx Buffers on page 7-58.
• Interprocessor Messages & Vector Interrupts. For more information, see Interprocessor Messages and Vector Interrupts on page 7-76.

Instruction Transfers

Multiprocessor instruction transfers to or from the processor's internal memory should use 32-bit transfers for maximum performance. The 48-bit internal transfers use one of the slave EPBx FIFOs and the packing mode function (PMODE) of the DMA channel (32- to 48-bit). Maximum throughput is achieved by transferring packed instructions to or from internal memory, using DMA transfers with 32- to 48-bit packing.
Because both external memory and each processor's I/O processor registers are accessible by every other processor, semaphores can be located almost anywhere. Read-modify-write operations on semaphores can be performed if all of the processors obey two simple rules:

1. A processor must not write to a semaphore unless it is the bus master. This is especially important if the semaphore is located in the processor's own internal memory or I/O processor registers.
2. When attempting a read-modify-write operation on a semaphore, the processor must have bus mastership for the duration of the operation.

Both of these rules apply when a processor uses its bus lock feature, which retains its mastership of the bus and prevents the other processors from simultaneously accessing the semaphore.

Bus lock is requested by setting the BUSLK bit in the MODE2 register. When this happens, the processor initiates the bus arbitration process by asserting its BRx line. When it becomes bus master, it locks the bus by keeping its BRx line asserted even when it is not performing an external read or write. Host Bus Request (HBR) is also ignored during a bus lock. When the BUSLK bit is cleared, the processor gives up the bus by deasserting its BRx line.

While the BUSLK bit is set, the processor can determine if it has acquired bus mastership by executing a conditional instruction with the Bus Master (BM) or Not Bus Master (Not BM) condition codes, for example:
IF NOT BM JUMP(PC,0); /* Wait for bus mastership */
If it has become the bus master, the processor can proceed with the external read or write. If not, it can clear its BUSLK bit and try again later.
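The two semaphore rules and the bus lock behavior can be illustrated with a toy Python model. Everything here (the `SharedBus` class, its methods, and the encoding of the semaphore as 0 = free) is a hypothetical construction for the example; it does not model arbitration timing or pending slave writes.

```python
class SharedBus:
    """Toy model of a shared bus with a bus lock (hypothetical names)."""

    def __init__(self):
        self.master = None   # ID of the processor holding the bus
        self.locked = False  # True while that processor has BUSLK set
        self.semaphore = 0   # 0 = free, otherwise the owner's ID

    def request_lock(self, pid):
        """Model setting BUSLK; fails while another processor holds the lock."""
        if self.locked and self.master != pid:
            return False
        self.master, self.locked = pid, True
        return True

    def release_lock(self, pid):
        """Model clearing BUSLK, which gives up the bus."""
        if self.master == pid:
            self.locked = False

    def test_and_set(self, pid):
        """Read-modify-write of the semaphore; only legal under bus lock."""
        if not (self.locked and self.master == pid):
            raise RuntimeError("semaphore access requires bus mastership")
        if self.semaphore == 0:
            self.semaphore = pid
            return True
        return False


def acquire_semaphore(bus, pid):
    """Lock the bus, attempt the read-modify-write, then unlock."""
    if not bus.request_lock(pid):
        return False  # corresponds to clearing BUSLK and retrying later
    try:
        return bus.test_and_set(pid)
    finally:
        bus.release_lock(pid)


bus = SharedBus()
print(acquire_semaphore(bus, 1))  # True: processor 1 takes the semaphore
print(acquire_semaphore(bus, 2))  # False: semaphore already owned
```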
A read-modify-write operation is accomplished with the following steps:

1. Request bus lock by setting the BUSLK bit in MODE2.
2. Wait for bus mastership to be acquired.
3. Wait until the Slave Write Pending bit (SWPD) is zero.
4. Read the semaphore, test it, then write to it.

Locking the bus prevents other processors from writing to the semaphore while the read-modify-write is occurring. After bus mastership is acquired, check the SWPD bit's status in SYSTAT to ensure that a semaphore write by another processor is not pending. If the semaphore is reflective (located in one of the processor's I/O processor registers), the processor must write to it only when it has bus lock.

Multiprocessor Interface Status

The SYSTAT register provides status information for host and multiprocessor systems. Figure 7-35 shows the status bits in this register.
[Figure 7-35: SYSTAT register (0x03) bit diagram, bits 31-0. Bit positions are not recoverable; the field descriptions are:]

HPS     Host Packing Status. 000 = packing complete (6th stage of 8-to-48, 4th stage of 8-to-32, etc.); 001 = 1st stage pack/unpack; 010 = 2nd stage pack/unpack; 011 = 3rd stage pack/unpack; 100 = 5th stage of 8-to-48 bit packing; 101, 110, 111 = reserved
CRAT    CCLK-to-CLKIN ratio. Indicates the state of the CLKCFG[1:0] pins. Undefined at RESET.
SSWPD   Synchronous Slave Write FIFO Data Pending. 1 = synchronous slave IOP register write pending; 0 = no synchronous slave IOP register write pending
SWPD    Slave Write FIFO Data Pending, any data (synchronous or asynchronous). 1 = slave write pending to IOP register; 0 = no slave write pending to IOP register
VIPD    Vector Interrupt Pending. 1 = vector interrupt pending
IDC     ID Code. Displays the state of the ID[2:0] pins
HSTM    Host Bus Master. 1 = host bus master controls the external bus; 0 = no host bus master
BSYN    Bus synchronization status (set when the processor's bus arbitration logic is synchronized)
CRBM    Current ADSP-21161 Bus Master. ID of the processor that is bus master; CRBM=001 when ID=000
8 SDRAM INTERFACE
ADSP-21161 SHARC Processor Hardware Reference

The ADSP-21161 processor's synchronous DRAM (SDRAM) interface enables it to transfer data at either the core clock frequency or one-half the core clock frequency. The synchronous approach, coupled with the ability to transfer data at the core clock frequency, supports data transfer at a high throughput: up to 400 Mbytes/second for a 32-bit bus width, and 600 Mbytes/second for a 48-bit bus width. All inputs are sampled and all outputs are valid on the rising edge of the clock SDCLK.

The SDRAM's flexible interface allows you to connect SDRAMs to any one or more of the four external memory banks of the ADSP-21161 processor or to all four banks simultaneously.

The ADSP-21161 processor's SDRAM controller provides a glueless interface with standard SDRAMs. It supports:

• SDRAMs of 16 Mbits, 64 Mbits, 128 Mbits, and 256 Mbits in 4-bit, 8-bit, 16-bit, and 32-bit wide configurations
• Additional buffers between the ADSP-21161 processor and SDRAM
• Zero wait state, 100 Mwords/second with some access types
• Up to 254.68 Mwords [3x(64M) + 62.68M] of SDRAM in external memory
• SDRAM page sizes of 2048, 1024, 512, and 256 words
• A programmable refresh counter to coordinate between varying clock frequencies and the SDRAM's required refresh rate
• Buffering for multiple SDRAMs connected in parallel
• Shared SDRAM devices in a multiprocessing system
• A separate A10 pin that enables applications to precharge SDRAM before issuing a refresh command
• Connection to up to four external memory banks (0 to 3) of the ADSP-21161 processor
• Self-refresh, low-power mode
• Two power-up options

The following are definitions used throughout this chapter:

Bank Activate command. Activates the selected bank and latches in a new row address. It must be applied before a read or write command.

Burst length. Determines the number of words that the SDRAM inputs or outputs after detecting a write or read command, respectively. The processor supports burst length ONE mode only. During a burst length of one cycle, the ADSP-21161 processor SDRAM controller applies the command every cycle and keeps accessing the data. See also page size on page 8-3.

Burst type. Determines the order in which the SDRAM delivers or stores burst data after detecting a read or write command, respectively. The processor supports sequential accesses only.

CAS latency. The delay, in clock cycles, between when the SDRAM detects the read command and when it provides the data at its output pins. The speed grade of the device and the application's clock frequency determine the value of the CAS latency. The application must program the CAS latency value into the SDCTL register after power up.
CBR Automatic Refresh (CAS before RAS) mode. In this mode, the SDRAM drives its own refresh cycle with no external control input. At cycle end, all SDRAM banks are precharged (idle).

DQM Data I/O Mask function. This signal is asserted during a precharge command or when a burst stop command interrupts a burst write. When asserted during a write cycle, this signal interrupts and disables the write operation immediately.

SDCTL Register. IOP register that contains programmable SDRAM control and configuration parameters that support different vendors' timing and power-up sequence requirements.

Mode Register. The SDRAM's configuration register that contains user-defined parameters corresponding to the processor's SDCTL register. After initial power-up and before executing a read or write command, the application must program the Mode register.

Page Size. The size, in words, of the SDRAM's page. The processor supports 2048-, 1024-, 512-, and 256-word page sizes. Page size is a programmable option in the SDCTL register.

Precharge Command. Precharges an active bank.

SDRDIV Programmable Refresh Counter. An IOP register containing a refresh counter value. The clock supplied to the SDRAM can vary between 20 and 100 MHz. This counter enables applications to coordinate the CLK rate with the SDRAM's required refresh rate.

Self-Refresh. The SDRAM's internal timer initiates automatic refresh cycles periodically, without external control input. This command places the SDRAM device in a low-power mode. Self-refresh is a programmable option in the SDCTL register.
tRAS. Active Command time. Required delay between issuing an activate command and issuing a precharge command. A vendor-specific value. This option is programmable in the SDCTL register.

tRC. Bank Cycle time. The required delay between successive Bank Activate commands to the same bank. This vendor-specific value is defined as: tRC = tRP + tRAS. The processor fixes the value of this parameter, so it is a non-programmable option.

tRCD. RAS to CAS delay. The required delay between an ACT command and the start of the first read or write operation. This vendor-specific value is programmable in SDCTL.

tRP. Precharge time. Required delay between issuing a precharge command and issuing an activate command. This vendor-specific value is programmable in SDCTL.

Figure 8-1 shows the SDRAM controller's interface between the internal SHARC core and the external SDRAM device. (Note that in full instruction with no pack mode, the data bus extends to 48 bits, DATA47:00.) The ADSP-21161 processor normally generates an external memory address, which then asserts the corresponding MSx select, along with RD and WR strobes. These control signals are intercepted by the SDRAM controller. The memory access to SDRAM is based on the mapping of the addresses and memory selects. The configuration is programmed in the SDCTL register. The SDRAM controller can hold off the processor core or I/O processor with an internally connected acknowledge signal (ACK), as determined by refresh, nonsequential access, or page miss latency overhead.
The SDRAM controller provides a glueless interconnection between the SDRAM control, address, and data pins and the processor's internal Harvard architecture buses. The internal 32-bit address bus is multiplexed by the SDRAM controller to generate the corresponding chip select, row address, column address, and bank select signals to the SDRAM.
[Figure 8-1: block diagram. Recoverable labels: ADSP-21161 with 21161 core, SDRAM controller, MUX, and buffer; CCLK or 1/2 CCLK; controller signals RD, WR, RESET, ACK, SDCLK, SDCKE, RAS, CAS, SDWE, DQM, SDA10, and MSx; SDRAM (JEDEC) pins CLK, CKE, RAS, CAS, WE, DQM, A10, CS, BA0/BA1, A12:0, and DQ31:0; address lines A23:0 and A14/A13; data lines D47:16.]

Figure 8-1. SDRAM Controller Interface

Figure 8-2 shows a block diagram of the ADSP-21161 processor's SDRAM interface to four 8-bit SDRAMs. In this single processor example, the SDRAM interface connects to four 1M x 8 x 2 (2M x 8) SDRAM devices to use 2M of 32-bit words. The same address and control bus communicates to all four SDRAM devices. The following connections are made:
• SDCKE connects to the CKE pin of the SDRAM devices.
• The SDRAM clock connects to the CLK pins.
• SDWE connects to all WE pins.
• All CAS, RAS, and DQM signals are connected together between the processor and all of the SDRAM devices.

Notice that the data bus shows the processor's default bus width, DATA[47:16]. For full non-packed instruction execution mode, the data bus can be extended to DATA[47:0] with the use of available disabled link port data pins. The A[10] pin of each SDRAM device is connected to a separate SDA10 pin on the processor to allow the SDRAM controller to retain control of all SDRAMs for any non-SDRAM accesses during host bus requests.
[Figure 8-2: block diagram of the ADSP-21161 connected to four 2M x 8 SDRAM devices (SDRAM #1 to #4). Recoverable labels: processor pins SDWE, CAS, RAS, MSx, A[14], SDA10, A[9:0], SDCKE, SDCLK0, DQM, and DATA[47-16]; SDRAM pins WE, CAS, RAS, CS, A11[BS], A[10], A[9:0], CKE, CLK, DQM, and DQ[7:0]; data lanes DATA[23-16], DATA[31-24], DATA[39-32], and DATA[47-40].]
[SDCTL register (0x00B8) bit diagram, bits 31-0. The field descriptions are:]

SDTRCD   SDRAM tRCD specification, RAS-to-CAS delay (number of SDCLK cycles: 1 to 7 cycles)
SDBUF    Pipelining option with external register buffer. 1 = external SDRAM control/address buffer enabled; 0 = no buffer option
SDCKR    SDCLK-to-CCLK ratio. 0 = half the CCLK (core clock) frequency (1:2); 1 = CCLK core clock frequency (1:1)
SDEM0    External memory Bank 0 SDRAM enable
SDEM1    External memory Bank 1 SDRAM enable
SDEM2    External memory Bank 2 SDRAM enable
SDEM3    External memory Bank 3 SDRAM enable
SDBN     Number of SDRAM device memory banks. 0 = 2 banks; 1 = 4 banks
SDSRF    SDRAM self-refresh command enable
SDCL     SDRAM CAS latency specification. 01 = 1 cycle; 10 = 2 cycles; 11 = 3 cycles
DSDCTL   Disable SDCLK0 and control signals. 1 = disable SDCLK0, RAS, CAS, and SDCKE; 0 = activate SDCLK0, RAS, CAS, and SDCKE
DSDCK1   SDCLK1 disable. 1 = disable SDCLK1; 0 = SDCLK1 active
SDPSS    SDRAM power-up sequence
SDPGS    SDRAM page size. 00 = 256 words; 01 = 512 words; 10 = 1k words; 11 = 2k words
SDTRAS   SDRAM tRAS specification, active command delay (number of SDCLK cycles: 0 to 15 cycles)
SDPM     SDRAM power-up mode. 0 = precharge, 8 CBR refreshes, mode register set; 1 = precharge, mode register set, 8 CBR refreshes
SDTRP    SDRAM tRP specification, precharge delay (number of SDCLK cycles: 1 to 7 cycles)
The SDCTL register of the ADSP-21161 processor stores the configuration information of the SDRAM interface. Writing configuration parameters initiates commands to the SDRAM that take effect immediately.

Before starting the SDRAM power-up sequence, complete the following steps:

1. Write to the WAIT register to set the waitstates to zero (EBxWS=000) for each bank that has SDRAM mapped to it.
2. Set the SDRDIV register at initial power-up. In the SDRDIV register, a memory-mapped IOP register, configure the value for the SDRAM refresh counter.
3. Write all of the SDRAM configuration parameter values to the SDCTL register.

When the SDRAM controller is programmed with the register buffer option enabled, do not perform non-SDRAM write accesses to external memory until the power-up sequence is completed by the SDRAM controller. External memory non-SDRAM writes do not function correctly whenever the SDRAM controller is configured for the SDBUF=1 (register buffering) option and the power-up sequence has not yet been completed by the SDRAM controller. The MRS command that is applied by the SDRAM controller conflicts with the non-SDRAM write access started by either the core or DMA controller.

In the SDCTL register, set the parameter bits as follows:

• Set the SDRAM clock enables (DSDCTL and DSDCK1).
• Select the number of banks that the SDRAM contains (SDBN).
• Select the external memory banks configured for and connected to an SDRAM (SDEMx).
• Set the SDRAM buffering option (SDBUF).
• Select the CAS latency value (SDCL).
• Select the SDRAM page size (SDPGS).
• Select the SDRAM power-up mode (SDPM).
• Start the SDRAM power-up sequence (SDPSS).
• Start SDRAM self-refresh mode (SDSRF).
• Set the Active Command Delay (SDTRAS).
• Set the precharge delay (SDTRP).
• Set the RAS-to-CAS delay (SDTRCD).
• Set the SDCLK to Core Clock Ratio (SDCKR).

In systems where several SDRAM devices are connected in parallel, buffering may be required to meet overall system timing requirements. The ADSP-21161 processor supports the pipelining of the address and control signals to enable buffering between the ADSP-21161 processor and SDRAM. The pipeline bit (SDBUF) in the SDCTL register enables this mode. When this bit is set, the data for write accesses is delayed by one cycle, allowing the address and controls to be externally latched. In read accesses, data is sampled by the ADSP-21161 processor one cycle later.

To support the higher clock load requirements, two SDCLK pins are provided to eliminate the need for off-chip clock buffers. An option is provided in the SDCTL register (bits 2 and 3) to allow the SDRAM controller to three-state one or both of the SDCLK pins.

The SDCKR bit in the control register can be used to set the SDCLK to core clock ratio. The interface can run at full core clock frequency or at half the core clock frequency, depending upon the setting for this bit.
[SDRDIV register (0xB9) bit diagram, bits 15-0.]

$$\mathrm{SDRDIV} = \frac{f_{CCLK}}{\text{SDRAM refresh rate}} - CL - t_{RP} - 5$$

Figure 8-4. SDRDIV Register and Calculation

Where:

• f_CCLK = CLK_CFG x f_CLKOUT
• SDRAM refresh rate = number of refresh cycles divided by the refresh period
• CL = CAS latency programmed into the SDCTL register
• tRP = tRP specification programmed in the SDCTL register
• CLK_CFG = 2 for a 2:1 CCLK-to-CLKOUT clock ratio, 3 for a 3:1 ratio, and 4 for a 4:1 ratio

CCLK is defined as the internal core-clock frequency. CLKOUT is 1xCLKIN or 2xCLKIN, depending on whether CLKDBL is tied high or low during RESET. The signals SDCLK0 and SDCLK1 can operate at either 1xCCLK or 1/2 CCLK, as determined by the SDCKR bit in the SDCTL register.

For example, for an IBM SDRAM with:

• Refresh rate = 4096 cycles/64 ms
• CLKIN = 25 MHz
• CLKDBL = enabled
• CL = 2
• tRP = 2

The equation yields:

$$\mathrm{SDRDIV} = \frac{2 \times 50 \times 10^{6}}{\frac{1}{64 \times 10^{-3}} \times 4096} - 2 - 2 - 5 \approx 1554\ (\text{decimal}) = \text{0x612}$$
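The worked example above can be checked with a short Python calculation. This is a sketch under stated assumptions: the function name and parameters are hypothetical, the constant 5 is taken from the worked example, and rounding half up is assumed only because it reproduces the quoted value of 1554 (the exact result is 1553.5).

```python
from fractions import Fraction
import math


def sdrdiv(f_clkout_hz, clk_cfg, refresh_cycles, refresh_period_ms,
           cas_latency, t_rp):
    """Compute an SDRDIV value from the refresh counter equation.

    f_clkout_hz       -- CLKOUT frequency in Hz (integer)
    clk_cfg           -- CCLK-to-CLKOUT ratio (2, 3, or 4)
    refresh_cycles    -- refresh cycles required per refresh period
    refresh_period_ms -- refresh period in milliseconds (integer)
    """
    f_cclk = clk_cfg * f_clkout_hz
    # CCLK cycles available per refresh cycle, kept exact with Fraction.
    cclk_per_refresh = Fraction(f_cclk * refresh_period_ms,
                                1000 * refresh_cycles)
    exact = cclk_per_refresh - cas_latency - t_rp - 5
    # Assumption: round half up, which matches the documented example.
    return math.floor(exact + Fraction(1, 2))


# CLKIN = 25 MHz with CLKDBL enabled gives CLKOUT = 50 MHz; CLK_CFG = 2.
value = sdrdiv(50_000_000, 2, 4096, 64, cas_latency=2, t_rp=2)
print(value, hex(value))  # 1554 0x612
```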
To meet higher clock load requirements, the processor provides two SDRAM clock control pins, SDCLK0 and SDCLK1. These pins eliminate the need for off-chip clock buffers. The DSDCTL and DSDCK1 bits in the SDCTL register provide control for the SDRAM clock control pins.

The DSDCTL bit, if set (=1), places all of the SDRAM control pins (DQM, CAS, RAS, SDWE, and SDCKE) and the SDCLK0 pin in a high impedance state. The DSDCTL bit, if cleared (=0), enables SDCLK0 and all SDRAM control pins.

The DSDCK1 bit, if set (=1), places only the SDCLK1 pin in a high impedance state. The DSDCK1 bit, if cleared (=0), enables SDCLK1.

• If your system does not use SDRAM, set both DSDCTL and DSDCK1 to 1.
• If your system uses SDRAM, but the clock load is minimal, set DSDCTL to 0 and DSDCK1 to 1. This setting enables the SDCLK0 pin and all related SDRAM control pins, but disables the second clock pin SDCLK1.
• If your system uses SDRAM and has a heavy clock load, such as a system using registered buffers and eight 4-bit SDRAMs to get 32-bit data, set both DSDCTL and DSDCK1 to 0. This setting enables SDCLK0, SDCLK1, and all SDRAM control pins. In this configuration, SDCLK0 and SDCLK1 can each share half of the clock load.
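The three recommended settings can be summarized in a small helper. The function name and its arguments are hypothetical; it returns only the bit values and makes no assumption about where DSDCTL and DSDCK1 sit in the register.

```python
def sdram_clock_bits(uses_sdram, heavy_clock_load=False):
    """Return recommended DSDCTL and DSDCK1 values (1 = pin disabled)."""
    if not uses_sdram:
        # No SDRAM: disable both clock pins and the control pins.
        return {"DSDCTL": 1, "DSDCK1": 1}
    if heavy_clock_load:
        # Heavy load: enable SDCLK0, SDCLK1, and all control pins.
        return {"DSDCTL": 0, "DSDCK1": 0}
    # Minimal load: SDCLK0 and control pins enabled, SDCLK1 disabled.
    return {"DSDCTL": 0, "DSDCK1": 1}


print(sdram_clock_bits(uses_sdram=True))
```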
bits 16-19, No SDRAM enabled Bank 0 SDRAM Enable Bank 1 SDRAM Enable Bank 2 SDRAM Enable Bank 3 SDRAM Enable
To meet such timing requirements and enable intermediary buffering, the processor supports pipelining of SDRAM address and control signals. The pipeline bit SDBUF (bit 23) in the SDCTL register enables this mode:
SDBUF = 0 SDBUF = 1
When SDBUF is set (=1), the SDRAM controller delays the data in write accesses by one cycle, enabling the processor to latch the address and controls externally. In read accesses, the SDRAM controller samples data one cycle later. Figure 8-5 shows another single processor example in which the SDRAM interface connects to multiple banks of SDRAM to provide 512 M of SDRAM in a 4-bit I/O configuration. This configuration results in 16 M x 32-bit words. In this example, OxA and OxB output from the registered buffers are the same signal, but are buffered separately. In the registered buffers, a delay of one clock cycle occurs between the input (Ix) and its corresponding output (OxA or OxB).
Figure 8-5 (diagram not reproduced). The ADSP-21065L SDRAM control and address signals (RAS, CAS, SDWE, DQM, SDCKE, MS3, A[13:11], A[9:0], and SDA10) pass through registered buffers (inputs I0-I5, outputs O0A-O5A) that drive address buses Aa[13:0] and Ab[13:0] to eight 4M x 4 x 4 SDRAMs (#1 through #8). Each SDRAM supplies one 4-bit slice of DATA[31-0], from [3:0] through [31:28]. SDCLK0 and SDCLK1 supply the SDRAM clocks.
Page length depends on the I/O organization and column addressing of the SDRAM's internal banks. For example, a 16 Mbit SDRAM organized as 2 M x 4 I/O x 2 banks has a page size of 1024 words. The SDPGS bits (bits 12 and 13) in the SDCTL register select the SDRAM page length: 00 = 256 words, 01 = 512 words, 10 = 1024 words, and 11 = 2048 words.
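As an illustration of the SDPGS encoding, the sketch below inserts the two-bit page-length code into bits 12 and 13 of a register value. The helper name is hypothetical, and the code assumes bit 12 is the least significant bit of the field.

```python
SDPGS_SHIFT = 12  # SDPGS occupies bits 12 and 13 of SDCTL
SDPGS_CODES = {256: 0b00, 512: 0b01, 1024: 0b10, 2048: 0b11}


def set_page_size(sdctl, page_words):
    """Return sdctl with the SDPGS field set for the given page length."""
    if page_words not in SDPGS_CODES:
        raise ValueError("page length must be 256, 512, 1024, or 2048 words")
    cleared = sdctl & ~(0b11 << SDPGS_SHIFT)
    return cleared | (SDPGS_CODES[page_words] << SDPGS_SHIFT)


print(hex(set_page_size(0, 1024)))
```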
Initialize the SDRDIV register before the ADSP-21161 processor starts the SDRAM power-up sequence. After power up, make sure that the processor waits one cycle before writing the SDCTL register to issue another SDRAM command. For more details, see the SDRAM device documentation.
When SDSRF is set (=1), the processor's SDRAM controller issues an SREF command to the SDRAM device or devices, putting them into self-refresh mode immediately. For details, see Self Refresh Command (SREF) on page 8-39.
The SDTRAS bits (bits 4, 5, 6, and 7) in the SDCTL register select the tRAS value. For example:
SDTRAS=0001 SDTRAS=0010 SDTRAS=0111 SDTRAS=1111
and SDTRCD settings represent the number of core clock (CCLK) cycles.
= 2 cycles (SDTRCD=2)
1 When executing 48-bit packed instructions from 32-, 16-, or 8-bit SDRAM memories:
- Add one clock cycle to the throughput value or to the average access rate for 32-bit wide SDRAM
- Add three clock cycles to the throughput value or to the average access rate for 16-bit wide SDRAM
- Add six clock cycles to the throughput value or to the average access rate for 8-bit wide SDRAM
2 With SDRAM buffering enabled (SDBUF=1), replace any instance of (CL) with (CL + 1).
modify register or external address register is greater than a value of 1, then one full page can be written at full throughput, but reads increase the amount of processing time required.
Whenever a page miss happens, the SDRAM controller executes a PRE command followed by a bank activate command before executing a read/write command. For SDRAM reads, a latency (equal to CAS latency) exists from the start of the read command until data is available from the SDRAM. This latency always exists for the first read in a sequence of reads. Subsequent reads do not incur the latency if the addresses are sequential and uninterrupted.

A fresh access to SDRAM always aligns to the CLKIN rising edge, so interrupted accesses to SDRAM incur the overhead of additional cycles, depending on the CLK_CFG setting. For example, WRT-NOP-WRT-NOP-WRT has a 6-cycle overhead for CLK_CFG = 2:1 and SDCKR=1. Every write in the above sequence starts at the rising edge of CLKIN, and two core cycles transpire in every CLKIN. The last WRT completes in the first core cycle of the third CLKIN cycle (which is the ninth core cycle). If the three writes had been consecutive, the third write would be over by the third core cycle of the first CLKIN. As a result, the writes complete six core clock cycles later.

A programmable refresh counter is provided that can be used to set up a count, depending on the required refresh rate and the clock rate used. The refresh count is specified in SDRDIV, a memory-mapped IOP register. For more information on SDRDIV, see Setting the Refresh Counter Value (SDRDIV) on page 8-13.
Multiprocessing Operation
In a multiprocessing environment, the SDRAM is shared among two or more ADSP-21161 processors. SDRAM input signals (including clock) are always driven by the bus master. The slave processors track the commands that the master processor issues to the SDRAM. This tracking helps to synchronize the SDRAM refresh counters and to prevent needless refresh operations. A simplified multiprocessing system is shown in Figure 8-6.
Figure 8-6. Multiprocessing: Dual Processor System Example (diagram not reproduced: two ADSP-21161N processors, ID 1 and ID 2, each with an SDRAM controller, an external port (EP), and a BMSTR signal, share the SDRAM control, ADDR23:0, and DATA47:16 lines of one SDRAM)

When an ADSP-21161 processor receives the bus mastership, it executes a PRE command prior to the first access to SDRAM. This occurs only if the previous master had accessed the SDRAM. In the user application code, the SDCTL and SDRDIV registers of both ADSP-21161 processors must be initialized to the same value. If there is no SDRAM used in the system (as indicated in SDCTL), then the bus transition process is the same as in the ADSP-21160.
Accessing SDRAM
To access SDRAM, the SDRAM controller multiplexes the internal 32-bit non-multiplexed address into a row address, a column address, and a bank select address for the SDRAM device, as shown in Figure 8-7. Lower bits are mapped into the column, next bit/bits are mapped into the bank select, and remaining bits are mapped into the row. This mapping is based on the page size and the number of banks in SDRAM (entered into the SDCTL register).
Figure 8-7. Multiplexed 32-Bit SDRAM Address (diagram not reproduced: the address is divided, from the most significant bits down, into row address, SDRAM bank select, and column address fields)

Based on the values programmed in the SDCTL register for page size and number of SDRAM banks, the SDRAM controller maps bits as follows:
- the lower ADDR bits into the column address
- the next bit or bits into the bank select address
- the remaining higher order bits into the row address

The following tables show how the SDRAM controller maps the SDRAM address bits on the processor's internal address bus to its external address pins that connect to the SDRAM. The internal and external address bus pins in the tables are defined as follows:

EA = External address pins
IA = Internal address bus

For 16 M SDRAMs, A11 is the Bank Select pin. When using a 16 M SDRAM, connect the processor's A14 pin to the SDRAM's A11 pin.
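The column/bank/row split described above can be sketched as follows. This models only the bit partitioning of the internal address (for example, IA[7:0] column and IA[19:9] row for a 256-word page with two banks, which leaves IA[8] as the bank select); it does not model the mapping onto external pins. The function name is hypothetical, and power-of-two page sizes and bank counts are assumed.

```python
def split_sdram_address(addr, page_words, num_banks):
    """Split an internal address into (row, bank, column) fields."""
    for n in (page_words, num_banks):
        if n < 1 or n & (n - 1):
            raise ValueError("page size and bank count must be powers of two")
    col_bits = page_words.bit_length() - 1    # log2(page size)
    bank_bits = num_banks.bit_length() - 1    # log2(number of banks)
    column = addr & (page_words - 1)          # lowest bits
    bank = (addr >> col_bits) & (num_banks - 1)
    row = addr >> (col_bits + bank_bits)      # remaining upper bits
    return row, bank, column


addr = (0x123 << 9) | (1 << 8) | 0x45
print(split_sdram_address(addr, page_words=256, num_banks=2))
```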
Address Mapping for SDRAM

Table 8-3 through Table 8-7 provide information needed for interfacing to various SDRAMs.

Table 8-3. SDRAM Size = 16 Mbit

16 Mbit SDRAM (Page Size x No. of Banks) | Column Address (Page Access) | Bank Select | Row Address (Bank Activate)
256 x 2 | IA[7:0]=>EA[7:0] | | IA[19:9]=>EA[10:0]
512 x 2 | IA[8:0]=>EA[8:0] | | IA[20:10]=>EA[10:0]
1024 x 2 | IA[9:0]=>EA[9:0] | | IA[21:11]=>EA[10:0]
Table 8-7. Address Ranges for Various SDRAM Device Densities and Page Size Combinations

SDRAM Device Size | Organization | Page Size | Address Range
16 Mbit (1) | 1Mx16 | 256 | 0 - 0x000F FFFF (1 Mwords)
16 Mbit (1) | 2Mx8 | 512 | 0 - 0x001F FFFF (2 Mwords)
16 Mbit (1) | 4Mx4 | 1024 | 0 - 0x003F FFFF (4 Mwords)
64 Mbit | 2Mx32 | 512 | 0 - 0x001F FFFF (2 Mwords)
64 Mbit | 4Mx16 | 256 | 0 - 0x003F FFFF (4 Mwords)
64 Mbit | 8Mx8 | 512 | 0 - 0x007F FFFF (8 Mwords)
64 Mbit | 16Mx4 | 1024 | 0 - 0x00FF FFFF (16 Mwords)
128 Mbit (2) | 4Mx32 | 1024 | 0 - 0x003F FFFF (4 Mwords)
128 Mbit (2) | 8Mx16 | 512 | 0 - 0x007F FFFF (8 Mwords)
128 Mbit (2) | 16Mx8 | 1024 | 0 - 0x00FF FFFF (16 Mwords)
128 Mbit (2) | 32Mx4 | 2048 | 0 - 0x01FF FFFF (32 Mwords)
256 Mbit | 16Mx16 | 512 | 0 - 0x00FF FFFF (16 Mwords)
256 Mbit | 32Mx8 | 1024 | 0 - 0x01FF FFFF (32 Mwords)
256 Mbit | 64Mx4 | 2048 | 0 - 0x03FF FFFF (64 Mwords)

(1) 16M and 64M devices do not have a page size of 2048.
(2) 128M and 256M devices do not have a page size of 256.
and SDA10) when the host assumes control of the system bus (HBG is asserted). As a result, the single processor (or master processor in a multiprocessor system) can issue REF commands as required.
Figure (diagram not reproduced): the ADSP-21161 SDRAM controller, running at 1x or 1/2x CCLK, drives the SDCLK, SDCKE, RAS, CAS, and related control pins of a JEDEC SDRAM (CLK, CKE, RAS, CAS, WE, DQM, A10, CS, BA0/BA1, A12:0, DQ31:0). The address bus A23:0 and data bus D31:0 are shared with a host that uses the HBR, HBG, REDY, SBTS, and CS signals.
REF (refresh). Causes the SDRAM to enter refresh mode and generate all addresses internally.
SREF (self-refresh). Places the SDRAM in self-refresh mode, in which it controls its refresh operations internally.
MRS initializes the following SDRAM parameters:
- Burst length = 1, bits 2-0, hardwired to zero in ADSP-21161 processor
- Wrap type = sequential, bit 3, hardwired to zero in ADSP-21161 processor
- Ltmode = latency mode (CAS latency), bits 6-4, programmable in SDCTL
- Bits (14-7) always 0, hardwired in the ADSP-21161 processor

While executing the mode register set command, the SDRAM controller sets the unused address pins to zero. During the two clock cycles following MRS, the ADSP-21161 processor does not issue any other commands. The SDRAM pin state during the MRS command is shown in Table 8-10.

Table 8-9. Pin State During MRS Command

Pin: MSx, CAS, RAS, SDWE, SDCKE
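The mode register layout listed for MRS can be illustrated with a minimal sketch. It assumes the Ltmode field (bits 6-4) holds the CAS latency as a plain cycle count, which is the usual SDRAM convention but is not spelled out in this section; the function name is hypothetical.

```python
def mode_register_value(cas_latency):
    """Compose the mode register word described for the MRS command."""
    if cas_latency not in (1, 2, 3):
        raise ValueError("CAS latency must be 1, 2, or 3")
    burst_length_field = 0b000   # bits 2-0, hardwired to zero (burst length = 1)
    wrap_type_field = 0b0        # bit 3, hardwired to zero (sequential)
    # Bits 14-7 stay zero because nothing is ever shifted into them.
    return (cas_latency << 4) | (wrap_type_field << 3) | burst_length_field


print(hex(mode_register_value(2)))
```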
Read/Write Command
The SDRAM controller executes a Read/Write command if the next read/write data falls in the present (currently active) page. In general, a Read interrupts a previous Read when the next access is a nonsequential address but a page miss does not occur. When a page miss does occur, the SDRAM controller precharges and activates (PRE and ACT commands) the SDRAM before issuing a Read or Write command. If the internal refresh counter (SDRDIV) asserts a refresh request, any new access is delayed until a refresh command is executed.

Read Commands

For the Read command, the CAS, MSx and SDA10 are asserted low to enable the SDRAM to latch the column address. The start address is set according to the column address. The delay between Active and Read commands is determined by the tRCD parameter (see SDRAM Timing Specifications on page 8-8). Data is available after the tRCD and CAS latency requirements are met. The SDRAM read timing is shown in Figure 8-9 and the pin state during the Read command is shown in Table 8-11.
Figure 8-9. Read Timing Diagram (diagram not reproduced: over SDCLK cycles T0-T11, the command sequence is Pre, NOP, Act, NOP, Read, Read, Read, Read, followed by NOPs, with the tRP and tRCD intervals marked; data words A0-A3 are shown for CAS latency = 1, 2, and 3)

Table 8-11. Pin State During a Read Command

Pin | State
MSx | Low
CAS | Low
RAS | High
SDWE | High
SDCKE | High
SDA10 | Low
Write Commands

For the write command, CAS, MSx, SDWE, and SDA10 are asserted low to enable the SDRAM to latch the column address. Data is also asserted in the same cycle. The start address is set according to the column address. The write timing is shown in Figure 8-10.
Figure 8-10. Write Timing Diagram (diagram not reproduced: over 2xCLKIN cycles T0-T11, the command sequence is Pre, NOP, Act, NOP, Write A, Write B, NOP, NOP, Bstop, NOP, NOP, NOP, with the tRCD interval marked; the DQM signal is shown, and the data words are A0, B0, B1, and B2, followed by masked data)

The SDRAM pin state during the Write command is shown in Table 8-12 below:

Table 8-12. Pin State During Write Command

Pin | State
MSx | Low
CAS | Low
RAS | High
SDWE | Low
DMA Transfers

In cases where a DMA channel is performing reads from SDRAM, the SDRAM controller issues a read command if at least one location is available in the external port DMA buffer (EPBx) FIFO. Whenever the FIFO is full, a NOP command is issued.

In cases where a DMA channel is performing writes to SDRAM, the SDRAM controller issues a write command if at least one word is available in the EPBx buffer. Whenever no data is available to write, a NOP command is issued.
made to the SDRAM controller based on this refresh divisor value. The controller completes the present burst before servicing the refresh request. The master ADSP-21161 processor always performs the refresh command.

Understanding Multiprocessing Operation

In a multiprocessing environment, all ADSP-21161 processors share the SDRAM. While the ADSP-21161 processor bus master always drives SDRAM input signals (including the clock), the slave ADSP-21161 processors track the commands the master processor issues to the SDRAM. This tracking helps to synchronize the SDRAM refresh counters and to prevent needless refresh operations.

Whenever an ADSP-21161 processor needs to transfer the bus mastership to another ADSP-21161 processor, it transfers the bus only after meeting tRAS min - 1 cycles for the presently active row. If the refresh timer makes a refresh request during this process, the present bus master executes a refresh command (after executing a precharge command to the SDRAM). The current bus master continues to hold the bus for tRAS min - 1 cycles before giving up the bus to the new bus master.

If the REF request arrives from the refresh counter during a bus transition cycle, the new bus master immediately issues a REF command. The new bus master becomes aware of this request because the refresh counter is running on all ADSP-21161 processors. The reloading of the refresh counter occurs synchronously on all processors, as the slaves watch the external SDRAM control pins to see when the refresh command is executed by the master.

When a processor receives the bus mastership, it executes a PRE command prior to the first access to the SDRAM.

The current ADSP-21161 processor bus master retains mastership of the control pins of the SDRAM (RAS, CAS, SDWE, SDCKE, SDCLK, MSx, SDA10) when the host assumes control of the system bus (HBG is asserted). This enables the master ADSP-21161 processor to issue a REF command as required.
The SDRAM pin state during the REF command is shown in Table 8-13 below:

Table 8-13. Pin State During REF Command

Pin | State
MSx | Low
CAS | Low
RAS | Low
SDWE | High
SDCKE | High
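The pin-state tables for the Read, Write, and REF commands (Tables 8-11 through 8-13) can be expressed as a lookup, which is handy when checking a logic-analyzer capture. This is an illustrative sketch only: the dictionary and function names are hypothetical, 0 stands for Low and 1 for High, and only the pins listed in each table are compared.

```python
PIN_STATES = {
    "READ":  {"MSx": 0, "CAS": 0, "RAS": 1, "SDWE": 1, "SDCKE": 1, "SDA10": 0},
    "WRITE": {"MSx": 0, "CAS": 0, "RAS": 1, "SDWE": 0},
    "REF":   {"MSx": 0, "CAS": 0, "RAS": 0, "SDWE": 1, "SDCKE": 1},
}


def decode_command(pins):
    """Return the command whose tabulated pin states all match, else None."""
    for name, states in PIN_STATES.items():
        if all(pins.get(pin) == level for pin, level in states.items()):
            return name
    return None


sample = {"MSx": 0, "CAS": 0, "RAS": 0, "SDWE": 1, "SDCKE": 1, "SDA10": 0}
print(decode_command(sample))
```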
During entry into SREF, make sure that no SDRAM accesses are occurring and the SDRAM has stopped bursting data. The controller automatically asserts an SREF exit cycle if an SDRAM access occurs during the SREF period. After executing an SREF exit command, the controller waits for 2 + tRC cycles to execute a CBR (CAS before RAS) refresh cycle if the refresh counter has already expired. After the CBR refresh command, the SDRAM controller waits for tRC cycles before executing a bank activate command. The SDRAM pin state during the SREF command is shown in Table 8-14.
Programming Example
This section provides a programming example written for the ADSP-21161 processor. The example shown in Listing 8-1 demonstrates how to set up the SDRAM controller to work with the ADSP-21161 processor EZ-KIT Lite.

Listing 8-1. SDRAM Controller Setup for 21161 EZ-KIT Lite
/**************************************************************
 * Setup for the SDRAM Controller for 21161 EZ-KIT Lite
 * Assumes SDRAM part# Micron MT48LC16M16A1-7SE
 *
 * SDCLK=100MHz
 * tCK=8ns min @ CL=2
 * tRAS=50ns min
 * tRP=20ns min
 * tRCD=20 ns min
 * tREF=64ms/4K rows ->SDRDIV=(2(30MHz)-CL-tRP-4)64ms/4096=937cycles
 *
 * 3 SDRAMs by 16 bits wide total = 16Mbit x 48
 * Mapped to MS0 addresses 0x00200000-0x002fffff
 *
 * -> SDCL=1   [CAS Latency]
 * -> SDTRP=2  [precharge delay]
 * -> SDTRCD=2 [RAS-to-CAS delay]
 * -> SDTRAS=3 [active command delay]
 ***************************************************************/

init_21161_SDRAM_controller:
    ustat1=dm(WAIT);
    bit clr ustat1 0x000FFFFF;    // Clear MSx waitstate and mode
    dm(WAIT)=ustat1;

    ustat1=0x1000;                // Refresh rate
    dm(SDRDIV)=ustat1;

    ustat1=dm(SDCTL);             // Mask in SDRAM settings
    // SDCTL = 0x02214231;
    // SDCLKx = CCLK frequency, no SDRAM buffering option, 2 SDRAM banks
    // SDRAM mapped to bank 0 only, no self-refresh, page size 256 words
    // SDRAM powerup mode is precharge, 8 CBR refs, and then mode reg set cmd
    // tRCD = 2 cycles, tRP=2 cycles, tRAS=3 cycles, SDCL=1 cycle
    // SDCLK0, SDCLK1, RAS, CAS and SDCKE activated
    bit set ustat1 SDTRCD2|SDCKRx1|SDBN2|SDEM0|SDPSS|SDPGS256|SDTRP2|SDTRAS3|SDCL1;
    bit clr ustat1 SDBUF|SDEM3|SDEM2|SDEM1|SDSRF|SDPM|DSDCK1|DSDCTL;
    dm(SDCTL)=ustat1;
    rts;
9 LINK PORTS
The ADSP-21161 processor has two 8-bit wide link ports, which can connect to other processor or peripheral link ports. These bidirectional ports have eight data lines, an acknowledge line, and a clock line. Link ports can operate at frequencies up to the same speed as the processor's internal clock, letting each port transfer up to 8 bits of data per internal clock cycle. Link ports also have the following features:
- Operate independently and simultaneously.
- Pack data into 32- or 48-bit words; this data can be directly read by the processor or DMA-transferred to or from on-chip memory.
- Are accessible by the external host processor, using direct reads and writes.
- Have double-buffered transmit and receive data registers.
- Include programmable clock and acknowledge controls for link port transfers. Each link port has its own dedicated DMA channel.
- Provide high-speed, point-to-point data transfers to other processors, allowing differing types of interconnections between multiple DSPs.

ADSP-21161 processor link ports are logically (but not electrically) compatible with previous SHARC processor (ADSP-2106x family) link ports. For more information, see Link Data Path and Compatibility Modes on page 9-9.
Table 9-1 lists the pins associated with each link port. Each link port consists of eight data lines (LxDAT7-0), a link clock line (LxCLK), and a link acknowledge line (LxACK). The LxCLK line allows asynchronous data transfers and the LxACK line provides handshaking. When configured as a transmitter, the port drives both the data and LxCLK lines. When configured as a receiver, the port drives the LxACK line. Figure 9-1 shows link port connections.

Table 9-1. Link Port Pins

Link Port Pin(s) | Link Port Function
LxDAT7-0 | Link Port x Data
LxCLK | Link Port x Clock
LxACK | Link Port x Acknowledge
Figure 9-1. Link Port Pin Connections (diagram not reproduced: the LxDAT7-0, LxCLK, and LxACK pins of the transmitter connect to the same pins of the receiver; the data path is 8 bits wide)

The link port data pins (L0DAT7-0 and L1DAT7-0) are multiplexed internally with data lines DATA15-0. If link ports are used, you cannot execute full instruction width (48-bit) transfers. To perform 48-bit transfers, you must set the correct bits IPACK[1:0] in the SYSCON register and disable the link ports.
LBUF0 LBUF1
Cross-Bar Connection
LCTL register (address 0xCC) bit fields (register diagram not reproduced). At reset, bit 21 (LAB1) is set and all other bits are cleared.

Link buffer 0:
L0EN - Link Buffer 0 Enable; 1=enable, 0=disable
L0DEN - Link Buffer 0 DMA Enable; 1=enable DMA, 0=disable DMA
L0CHEN - Link Buffer 0 DMA Chaining Enable; 1=enable chaining, 0=disable chaining
L0TRAN - Link Buffer 0 Data Direction; 1=Transmit, 0=Receive
L0EXT - Link Buffer 0 Extended Word Size; 1=48-bit transfers, 0=32-bit transfers
L0CLKD[1:0] - CCLK Divide Ratio, LBUF0; 00=divide by 4, 01=divide by 1, 10=divide by 2, 11=divide by 3
L0PDRDE - Link Port 0 Pulldown Resistor Disable
L0DPWID - Link Buffer 0 Data Path Width; 1=8-bits, 0=4-bits

Link buffer 1:
L1EN - Link Buffer 1 Enable; 1=enable, 0=disable
L1DEN - Link Buffer 1 DMA Enable; 1=enable DMA, 0=disable DMA
L1CHEN - Link Buffer 1 DMA Chaining Enable; 1=enable chaining, 0=disable chaining
L1TRAN - Link Buffer 1 Data Direction; 1=Transmit, 0=Receive
L1EXT - Link Buffer 1 Extended Word Size; 1=48-bit transfers, 0=32-bit transfers
L1CLKD[1:0] - CCLK Divide Ratio, LBUF1; 00=divide by 4, 01=divide by 1, 10=divide by 2, 11=divide by 3
L1PDRDE - Link Port 1 Pulldown Resistor Disable
L1DPWID - Link Buffer 1 Data Path Width; 1=8-bits, 0=4-bits

Port assignment and status:
LAB0 - Link Port Assignment for LBUF0; 0=Link Port 0, 1=Link Port 1
LAB1 - Link Port Assignment for LBUF1; 0=Link Port 0, 1=Link Port 1
L0STAT[1:0] - Link Buffer 0 Status (Read-Only); 11=Full, 00=Empty, 10=one word
L1STAT[1:0] - Link Buffer 1 Status (Read-Only); 11=Full, 00=Empty, 10=one word
LRERR0 - Rcv. Pack Error Status for Link Buffer 0; 1=incomplete, 0=complete
LRERR1 - Rcv. Pack Error Status for Link Buffer 1; 1=incomplete, 0=complete
Link Port Clock Divisor. Bits 6-5 and 16-15 (LxCLKD). These bits select the transfer clock divisor for link buffer x (LBUF0 or LBUF1). The transfer clock equals the processor core clock divided by LxCLKD, where L0CLKD (bits 6-5) and L1CLKD (bits 16-15) are encoded as: 01=1, 10=2, 11=3, or 00=4.

Link Port Pulldown Resistor Disable. Bits 8 and 18 (LxPDRDE). This bit disables (if set, =1) or enables (if cleared, =0) the internal pulldown resistors on the LxCLK, LxACK, and LxDAT7-0 pins of the corresponding unassigned or disabled link port; this bit applies to the port, which is not necessarily the port assigned to link buffer x (LBUF0 or LBUF1). For revisions 0.3, 1.0 and 1.1, LxCLK, LxDAT7-0 and LxACK have a 50 kΩ internal pulldown resistor. For revisions 1.2 and greater, LxDAT7-0 has a 20 kΩ internal pulldown resistor. See Table 13-3 for a description of resistor values of the pins. Systems should not leave link port pins (LxCLK, LxACK, and LxDAT7-0) unconnected without clearing the corresponding LxPDRDE bit or applying an external pulldown. In systems where several DSPs share a link port, only one processor should have this bit cleared.

Link Port Data Path Width. Bits 9 and 19 (LxDPWID). This bit selects the link port data path width (8-bit if set, =1) (4-bit if cleared, =0) for the corresponding link buffer (LBUF0 or LBUF1). Systems using a 4-bit width should connect the lower link port data pins (LxDAT3-0) for data transfers and leave the upper pins (LxDAT7-4) unconnected. In the 4-bit mode, the processor applies pulldowns to the upper pins.

Link Port Assignments for LBUF0. Bit 20 (LAB0). This bit assigns link buffer 0 to link port 1 if set (=1) or link port 0 if cleared (=0).

Link Port Assignments for LBUF1. Bit 21 (LAB1). This bit assigns link buffer 1 to link port 1 if set (=1) or link port 0 if cleared (=0).
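The LxCLKD encoding is not a plain binary count (00 selects divide by 4), so a lookup is safer than shifting the divisor in directly. The sketch below uses the bit positions given above (bits 6-5 for LBUF0, bits 16-15 for LBUF1); the helper names are hypothetical.

```python
CLKD_CODES = {1: 0b01, 2: 0b10, 3: 0b11, 4: 0b00}
CLKD_SHIFT = {0: 5, 1: 15}   # lowest bit of L0CLKD and of L1CLKD


def set_link_clock_divisor(lctl, buffer_index, divisor):
    """Return lctl with the LxCLKD field of one link buffer replaced."""
    shift = CLKD_SHIFT[buffer_index]
    code = CLKD_CODES[divisor]
    return (lctl & ~(0b11 << shift)) | (code << shift)


def link_clock_hz(cclk_hz, divisor):
    """Transfer clock = core clock divided by the selected divisor."""
    return cclk_hz / divisor


lctl = set_link_clock_divisor(0, buffer_index=0, divisor=2)
print(hex(lctl), link_clock_hz(100e6, 2))
```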
Link Buffer Status. Bits 23-22 and 25-24 (LxSTAT). These bits identify the status of the corresponding link buffer as follows: 11=full, 00=empty, 10=one word.

Receive Packing Error Status. Bits 27 and 26 (LRERRx). This bit indicates whether the packed bits in the corresponding link buffer were received completely (=0), without error, or incompletely (=1).

If multiple link ports are bussed together and the link port pulldown resistor is enabled on all the processors, the line is heavily loaded. Ensure that only one processor has the pulldown enabled.

The processor's internal clock (CCLK) is the CLKIN frequency multiplied by a clock ratio (CLK_CFG1-0) and the CLKDBL pin (1:1 or 2:1 ratio). For more information, see the clock ratio pin description in Table 13-1 on page 13-4.

When link buffers are enabled or disabled, the I/O processor may generate unwanted interrupt service requests if Link Service Request (LSRQ) interrupts are unmasked. To avoid unwanted interrupts, programs should mask the LSRQ interrupts while enabling or disabling link buffers. For more information, see Using Link Port Interrupts on page 9-17.
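A status read of LCTL can be decoded along these lines. The sketch assumes that the first-listed bit range belongs to buffer 0 (L0STAT in bits 23-22, L1STAT in bits 25-24) and that LRERR0 is bit 26 and LRERR1 is bit 27; the text lists the bit numbers but not which buffer each belongs to, so treat that assignment, and the function name, as assumptions.

```python
STATUS_NAMES = {0b00: "empty", 0b10: "one word", 0b11: "full"}

# (buffer index, LxSTAT shift, LRERRx bit): the buffer assignment is assumed.
FIELDS = ((0, 22, 26), (1, 24, 27))


def decode_link_status(lctl):
    """Return per-buffer FIFO status and receive packing error flags."""
    result = {}
    for buffer_index, stat_shift, err_bit in FIELDS:
        code = (lctl >> stat_shift) & 0b11
        result[buffer_index] = {
            # Code 01 is not defined in the text, so report it as such.
            "status": STATUS_NAMES.get(code, "undefined"),
            "pack_error": bool((lctl >> err_bit) & 1),
        }
    return result


print(decode_link_status((0b11 << 22) | (0b10 << 24) | (1 << 27)))
```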
at the same speed or faster than the transmitter. Connecting to an ADSP-2106x may require that the ADSP-21161 processor be configured for 1/2 core clock rate operation. For more information, see Using Link Port Handshake Signals on page 9-10.
Figure 9-4. Link Port Handshake Timing (diagram not reproduced: LxCLK, LxACK, and LxDAT7-0 waveforms for one word, sent from byte 0 (MSBs) through byte 3 (32-bit) or byte 5 (48-bit) (LSBs)). The diagram carries these notes:
- LCLK stays high at byte 0 if LACK is sampled low on the previous LCLK rising edge. LCLK high indicates a stall.
- The transmitter samples LACK to determine whether to transmit the next word.
- The receiver will reassert LACK as soon as the link buffer is "not full".
- The receiver will accept the remaining bytes in the current word even if LACK is deasserted. The transmitter will not send the following word.

The receive buffer may fill if a higher priority DMA, core I/O processor register access, direct read, direct write or chain loading operation is occurring. LxACK may de-assert when it anticipates the buffer may fill. LxACK is reasserted by the receiver as soon as the internal DMA grant signal has occurred, freeing a buffer location.

Data is latched in the receive buffer on the falling edge of LxCLK. The receive operation is purely asynchronous and can occur at any frequency up to the processor clock frequency.

When a link port is not enabled, LxDAT7-0, LxCLK and LxACK are three-stated. When a link port is enabled to transmit, the data pins are driven with whatever data is in the output buffer, LxCLK is driven high and LxACK is three-stated. When a port is enabled to receive, the data pins and LxCLK are three-stated and LxACK is driven high.
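The pin behavior for the three port states described above can be tabulated in a few lines, which may help when reasoning about bus contention on shared link lines. This is a descriptive sketch, not a hardware model; the function name and the state strings are illustrative.

```python
def link_pin_states(enabled, transmit=False):
    """Return how LxDAT7-0, LxCLK, and LxACK are driven in each port state."""
    if not enabled:
        return {"LxDAT7-0": "three-stated", "LxCLK": "three-stated",
                "LxACK": "three-stated"}
    if transmit:
        return {"LxDAT7-0": "driven with output buffer data",
                "LxCLK": "driven high", "LxACK": "three-stated"}
    return {"LxDAT7-0": "three-stated", "LxCLK": "three-stated",
            "LxACK": "driven high"}


print(link_pin_states(enabled=True, transmit=True))
```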
To allow a transmitter and a receiver to be enabled (assigned and link buffer enabled) at different times, LxACK, LxCLK, and LxDAT7-0 may be held low with their internal pull-down resistors if LxPDRDE is cleared when the link port is disabled. LxDAT7-0 is kept at the previously driven value by internal keeper latches on the link port data lines if LxPDRDE is cleared when the link port is disabled. If the transmitter is enabled before the receiver, LxACK is low and the transmission is held off. If the receiver is enabled before the transmitter, LxCLK is held low by the pulldown and the receiver is held off. If many link ports are bused together, the system may need to enable only one of the internal resistors to pull down each bused pin, so the bused lines are not pulled down too strongly or too heavily loaded. Refer to Table 13-1 on page 13-4 for detailed pin descriptions and Table 13-3 on page 13-22 for more information on pull-down resistors.
LxACK, LxCLK, and LxDAT7-0 should not be left unconnected unless external pull-down resistors are used.
Full/empty status for the link buffer FIFOs is given by the LxSTAT bits of the LCTL register. This status is cleared for a link buffer when its LxEN enable bit is cleared in the LCTL register. During receiving, the external buffer is used to pack the receive link port data (most significant nibble or byte first) and pass it to the internal register before DMA-transferring it to internal memory. This buffer is a two-deep FIFO. If the processor's DMA controller does not service it before both locations are filled, the LxACK signal is de-asserted. The link buffer width may be selected to be either 32 or 48 bits. This selection is made individually for each buffer with the LxEXT bits in the LCTL register. For 40-bit extended precision data or 48-bit instruction transfers, the width must be set to 48 bits.
To support debugging buffer transfers, the processor has a Buffer Hang Disable (BHD) bit. When set (=1), this bit prevents the processor core from detecting a buffer-related stall condition, permitting debugging of this type of stall condition. For more information, see the BHD discussion on page 6-43.
Depending on the HBW (host bus width) bits in SYSCON, the appropriate 48-bit internal packing mode is selected. Table 6-7 on page 6-36 summarizes the packing mode bit settings for access to link port buffers. Host packing examples are shown below for host direct read/write access to LBUFx link port data buffers. When interfacing to a host processor, the HMSWF bit determines whether the I/O processor packs to most significant 16-bit word first (=1) or least significant 16-bit word first (=0). The packing mode defaults to 48-bit internal packing for host accesses to LBUFx, ignoring the PMODE value in DMACx.

Table 9-3. Packing Sequence for 16-Bit Bus (MSW First)

Transfer | Data Bus Pins 31-16
First | Word 1; bits 47-32
Second | Word 1; bits 31-16
Third | Word 1; bits 15-0
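The packing sequence for a 16-bit host bus can be sketched as splitting one 48-bit word into three 16-bit transfers. The MSW-first order follows Table 9-3; the LSW-first order (HMSWF=0) is taken to be the simple reverse, which is an assumption since this section does not tabulate it. Function names are hypothetical.

```python
def pack_48_to_16(word, msw_first=True):
    """Split a 48-bit word into three 16-bit transfers."""
    if not 0 <= word < (1 << 48):
        raise ValueError("word must fit in 48 bits")
    parts = [(word >> 32) & 0xFFFF, (word >> 16) & 0xFFFF, word & 0xFFFF]
    return parts if msw_first else parts[::-1]


def unpack_16_to_48(transfers, msw_first=True):
    """Rebuild the 48-bit word from three 16-bit transfers."""
    high, mid, low = transfers if msw_first else transfers[::-1]
    return (high << 32) | (mid << 16) | low


transfers = pack_48_to_16(0x123456789ABC)
print([hex(t) for t in transfers])
```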
To write a single 48-bit word or an odd number of 48-bit words to LBUFx, write a dummy access to completely fill the packing buffer, or write the HPFLSH bit in SYSCON to flush the partially filled packing buffer and remove the unused word. The HPFLSH bit clears the HPS bits in SYSTAT as well.
mation on link port interrupts, see Using Link Port Interrupts on page 9-17. For more information on link port DMA, see Link Port DMA on page 6-81.

The link ports share DMA channels 8 and 9 with the SPI transmit and receive buffers. Do not enable SPI and link port DMA simultaneously. SPI and link port DMA are mutually exclusive: when one of the peripherals is enabled, the other cannot use these channels.

In chained DMA operations, the processor automatically sets up another DMA transfer when the current DMA operation completes. The chain pointer registers (CPLB0 and CPLB1) point to the next set of buffer parameters stored in memory. The processor's DMA controller automatically downloads these buffer parameters to set up the next DMA sequence. For information on setting up DMA chaining, see Chaining DMA Processes on page 6-25.
mit/receive interrupt latching and masking functions. The IRPTL register controls a single global link port interrupt that latches the LPISUM bit. This bit indicates whether at least one of the two unmasked link port interrupts is latched. Refer to Figure 9-5 and Table A-10 on page A-34 for a complete bit description of the LIRPTL register. During reset, if a link port boot is enabled, the mask bit for LBUF0 (bit 16) is set (that is, the interrupt is unmasked). If an SPI boot is enabled, the mask bit for SPI receive (bit 18) is set.
LIRPTL register bit fields (register diagram not reproduced):

Interrupt mask pointers:
LP0MSKP - Link Buffer 0 DMA Interrupt Mask Pointer
LP1MSKP - Link Buffer 1 DMA Interrupt Mask Pointer
SPIRMSKP - SPI Receive DMA Interrupt Mask Pointer
SPITMSKP - SPI Transmit DMA Interrupt Mask Pointer

Interrupt masks:
LP0MSK - Link Buffer 0 DMA Interrupt Mask
LP1MSK - Link Buffer 1 DMA Interrupt Mask
SPIRMSK - SPI Receive DMA Interrupt Mask
SPITMSK - SPI Transmit DMA Interrupt Mask

Interrupt latches (interrupt vector address offset):
LP0I - Link Buffer 0 DMA Interrupt Latch (0x38)
LP1I - Link Buffer 1 DMA Interrupt Latch (0x3C)
SPIRI - SPI Receive DMA Interrupt Latch (0x40)
SPITI - SPI Transmit DMA Interrupt Latch (0x44)
One way programs can use this interrupt is to send additional control information at the end of a block transfer. Because the receive DMA buffer is empty when the DMA block has completed, the external bus master can send up to two additional words to the slave processor's buffer, which has space for the two words. When the slave's DMA completes, there is an interrupt. In the associated interrupt service routine, the buffer can be read in order to use these control words to determine the next course of action.
through a particular link port. Two processors can communicate without prior knowledge of the transfer direction, link port number, or exactly when the transfer is to occur. The LSRQ register is shown in Figure 9-6 and described in Table A-26 on page A-98.
LSRQ 0xD0
L1RRQ
Link Port 1 Receive Request
31 0
30 29 0 0
28 0
27 0
26 25 24 0 0 0
23 22 0 0
21 0
20 0
19 0
18 0
17 0
16 0
L0TRQ
Link Port 0 Transmit Request
L1TRQ
Link Port 1 Transmit Request
L0RRQ
Link Port 0 Receive Request
15 14 0 L1RM 0
13 0
12 0
11 0
10 0
9 0
8 0
7 0
6 0
5 0
4 0
3 0
2 0
1 0
0 0 L0TM
Link Port 0 Transmit Mask
L1TM
Link Port 1 Transmit Mask
L0RM
Link Port 0 Receive Mask
Figure 9-6. LSRQ Register In Figure 9-6, for transmit request status bits, LxTRQ=1 means LxACK=1, LxTM=1, and LxEN=0; for receive request status bits, LxRRQ=1 means LxCLK=1, LxRM=1, and LxEN= When LxACK or LxCLK is asserted externally, a link service request (LSR) is generated in a disabled (unassigned or assigned with buffer disabled) link port. LSRs are not generated for a link port that is disabled by loopback mode. Each LSR is gated by mask bits before being latched in the LSRQ register. The two possible receive LSRs and the two possible transmit LSRs are gated by mask bits and then ORed together to generate the link service request interrupt. The LSRQ interrupt request may be masked by
the LSRQI mask bit of the IMASK register. When the mask bit is set, the interrupt is allowed to pass into the interrupt priority encoder. A diagram of this logic appears in Figure 9-7.
[Figure 9-7: the LSR status bits (LxRRQ, LxTRQ) are gated by the LSR mask bits and then by the LSRQI bit of IMASK.]
Figure 9-7. Logic for Link Port Interrupts

The interrupt routine must read the LSRQ register to determine which link port to service and whether it is a transmit or receive request. LSR interrupts have a latency of two cycles. Note that the link service request interrupt is different from the link receive and transmit interrupt; this is also true in IMASK.

The 32-bit LSRQ register holds the masked link status of each link port and the corresponding interrupt mask bits. The link service request status of the port is set whenever the port is not enabled and one of LxACK or LxCLK is asserted high. The LSRQ status bits are read-only. Table A-26 on page A-98 shows the individual bits of the LSRQ register.

To determine which link port to service, programs can transfer LSRQ to a register Rx (in the register file) and use the leading 0s detect instruction: Rn=LEFTZ Rx. Here, Rn indicates which link port is active in order of priority.

If link service requests are in use, they should be masked out when the assigned link buffers are being enabled or disabled, or when the link port is being unassigned in LCTL. Otherwise, spurious service requests may be generated.
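The leading-zeros dispatch described above can be modelled on a host to check the decoding logic. The following is a minimal sketch in Python, not processor code. The bit positions given to the four request status bits are assumptions made for illustration (the authoritative layout is in Table A-26), and leftz and pending_request are hypothetical helper names. Returning 32 for an all-zero word is a modelling choice of this sketch.

```python
# Host-side model of "Rn = LEFTZ Rx" applied to a copy of LSRQ.
# ASSUMPTION: these status bit positions are illustrative only; see Table A-26.
ASSUMED_STATUS_BITS = {
    23: ("link port 1", "receive"),   # L1RRQ (assumed position)
    22: ("link port 1", "transmit"),  # L1TRQ (assumed position)
    21: ("link port 0", "receive"),   # L0RRQ (assumed position)
    20: ("link port 0", "transmit"),  # L0TRQ (assumed position)
}

def leftz(value):
    """Count leading zeros in a 32-bit word (32 when the word is zero)."""
    value &= 0xFFFFFFFF
    return 32 - value.bit_length()

def pending_request(lsrq):
    """Return (port, direction) for the highest-order pending request, or None."""
    status_mask = sum(1 << bit for bit in ASSUMED_STATUS_BITS)
    status = lsrq & status_mask          # discard the mask bits of LSRQ
    if status == 0:
        return None
    return ASSUMED_STATUS_BITS[31 - leftz(status)]
```

Masking off the non-status bits before counting keeps the mask field of LSRQ from being mistaken for a request.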
The need for masking is due to a delay before LxCLK or LxACK (if already asserted) signals are pulled (if pulldowns enabled) or driven externally (if pulldowns disabled) below logic threshold. During this delay, these signals are sampled asserted and generate an LSRQ. To avoid the possibility of spurious interrupts, programs should mask the LSRQ interrupt or the appropriate request bit in the LSRQ register and allow a delay before unmasking. Alternatively, programs can mask the LSRQ interrupt and poll the appropriate request status bit until it is cleared and then unmask the interrupt.
additional word in the link buffer and then reads the LRERR bit. The receiver may then clear the link buffer (LxEN=0) and transmit the appropriate message back to the transmitter on the same, or a different, link port.
.section/dm seg_dmda;
.var source[N]= 0X11111111, 0X22222222, 0X33333333, 0X44444444,
                0X55555555, 0X66666666, 0X77777777, 0X88888888;
.var dest[N];

.section/pm lp1i_svc;            /* Link Port 1 interrupt vector from ldf file */
    jump lpISR1; nop; nop; nop;

.section/pm lp0i_svc;            /* Link Port 0 interrupt vector from ldf file */
    jump lpISR1; nop; nop; nop;

/*_____________________Main Routine________________________*/
.section/pm seg_pmco;            /* Main code section from ldf file */
start:
    /* Set pointers for source and dest */
    B0=source; L0=@source;
    B1=dest; L1=@dest;

    /* Enable Global, Link Port, and Link Port Buffer 1 interrupts */
    bit set imask LPISUMI;
    bit set lirptl LP1MSK;
    bit set mode1 IRPTEN | CBUFEN;   /* Enable circular buffers */

    ustat1=dm(LCTL);
    /* LCTL REGISTER--LBUF1=TX, LBUF0=RX, 1/2x CCLK RATE,
       LBUF 0 & 1 ENABLED, LBUF 0 & 1 -> PORT 0 */
    bit clr ustat1 L0TRAN | LAB0 | LAB1 | L0CLKD0 | L1CLKD0;
    bit set ustat1 L1TRAN | L1EN | L0EN | L0CLKD1 | L1CLKD1;
    dm(LCTL)=ustat1;

wait:
    idle;
    jump wait;

lpISR1:                          /* Link Port Service Routine */
    R0=dm(I0,1);                 /* Get data for TX */
    dm(LBUF1)=R0;                /* Write data to LBUF1 */
    R1=dm(LBUF0);                /* Read data-core will hang here until data is received. */
    dm(I1,1)=R1;                 /* Store incoming data to dest buffer */
    rti;
.var source[N]= 0X11111111, 0X22222222, 0X33333333, 0X44444444,
                0X55555555, 0X66666666, 0X77777777, 0X88888888;
.var dest[N];

/*________________________Main Routine________________________*/
.section/pm seg_pmco;            /* Main code section from ldf file */
start:
    r0=0; DM(LCTL)=r0;           /* Clear LCTL register */

    /* Set up pointers for source and dest */
    B0=source; L0=@source;
    B1=dest; L1=@dest;

    ustat1=dm(LCTL);
    /* LCTL REGISTER-->LBUF1=TX, LBUF0=RX, 2x CLK RATE,
       LBUF 0 & 1 ENABLED, LBUF 0 & 1 -> PORT 0 */
    bit clr ustat1 L0TRAN | L0CLKD0 | L1CLKD0 | LAB0 | LAB1;
    bit set ustat1 L1TRAN | L1EN | L0EN | L0CLKD1 | L1CLKD1;
    dm(LCTL)=ustat1;

    lcntr=N, do transfer until lce;
        R0=dm(I0,1);             /* Test data to TX */
        dm(LBUF1)=R0;            /* Write data to LBUF1 */
        R1=dm(LBUF0);            /* Read data-core will hang here until data is received. */
transfer: dm(I1,1)=R1;           /* Store incoming data to dest buffer */

wait:
    idle;
    jump wait;
[Figure: token-passing flowchart for the original master and the original slave. Box text recovered from the figure:]
ORIGINAL MASTER: DMA TRANSFER COMPLETE, LBUF DISABLED, LSRQ INTERRUPT ENABLED
ORIGINAL SLAVE: DMA TRANSFER COMPLETE, LBUF DISABLED, LBUF RX NON-DMA ENABLED
LACK ASSERTION CAUSES LSRQ INTERRUPT
LBUF TX NON-DMA ENABLED
SEND TRW 4 TIMES TO FILL LBUF FIFOS ON BOTH SIDES
CHECK LCTL FOR SLAVE READ OF TRW BEFORE ACCEPTANCE TEST
CHECK LCTL TO SEE IF SLAVE ACCEPTED TOKEN BY EMPTYING FIFOS IN AN ALLOTTED TIME PERIOD
ACCEPT TOKEN BY EMPTYING LBUF FIFOS THROUGH 3 MORE READS WITHIN THE ALLOTTED TIME PERIOD
SETUP LBUF FOR RX NON-DMA TO ACCEPT DMA SIZE
SETUP LBUF FOR RX DMA AND DMA COMPLETE IRQ
DISABLE LBUF AND LSRQ INTERRUPT
POLL LSRQ STATUS FOR LINK PORT TRANSMIT REQUEST TO BE SURE THAT THE ORIGINAL MASTER IS NOW A SLAVE
LACK ASSERTION ASSURES THAT IT IS SAFE TO BEGIN TRANSMITTING
SETUP LBUF FOR TX NON-DMA TO SEND DMA SIZE
SETUP LBUF FOR TX DMA AND DMA COMPLETE INTERRUPT
To use the example, load the example code on both the original master and the original slave. The code is ID intelligent for multiprocessor systems: ID1 is the original master (transmitter) and ID2 is the original slave (receiver). The master transmits a buffer via DMA through link port 0 using LBUF1, and the slave receives through link port 0 using LBUF0. The slave then requests the token by generating an LSRQ interrupt in the disabled link port of the master (LPORT0). The master responds by sending the token release word and waiting to see if it is accepted. The slave checks that it is the token release word and accepts the token by emptying the master's link buffer FIFO within a predetermined amount of time. If the token is accepted, the slave becomes the master and transmits a buffer of data to the new slave. If the token is rejected, the master transmits a second buffer. When complete, the original master finishes by setting up LBUF0 to receive without DMA, and the original slave sets up LBUF1 to transmit without DMA.

The following is a list of the areas of concern when a program implements a software protocol scheme for token passing:

• The program must make sure that both link buffers are not enabled to transmit at the same time. If this occurs, data may be transmitted and lost because neither link port is driving LxACK. In the example, the LSRQ register status bits are polled to ensure that the master becomes the slave before the slave becomes the master, avoiding the two-transmitter conflict.
• The program must make sure that the link interrupt selection matches the application. If a status detection scheme using the status bits of the LSRQ register is to be used, note the following: if a link port that is configured to receive is disabled while LxACK is asserted, there is an RC delay before the 50 kΩ pulldown resistor (1) on LxACK (if enabled) can pull the value below logic threshold. If the appropriate request status bit is unmasked in the LSRQ register (in this instance), then an LSR is latched and the LSRQ interrupt, if enabled, may be serviced even though it is unintended.
• The program must make sure that synchronization is not disrupted by unrelated influences at critical sections where timing control loops are used to synchronize parallel code execution. Disabling nested interrupts is one technique to control this.

(1) LxACK has a 20 kΩ pulldown resistor for revisions 1.2 and higher.
The ADSP-21161 processor contains internal series resistance equivalent to 50 Ω on all I/O drivers except the CLKIN and XTAL pins. Therefore, for traces longer than six inches, external series resistors on control, link port data, clock, or frame sync pins are not required to dampen reflections from transmission line effects for point-to-point connections.
[Figure: link port connection example. Labels recovered from the figure: ADSP-21161; EXTERNAL PORT; EXTERNAL MEMORY; LINK PORT 0 (L0DAT7-0, L0CLK, L0ACK; 10 lines); LINK PORT 1; LINK BUS 0; LINK BUS 1; I/O DEVICE; DMA DEVICE; ADSP-21161 (LINK PORT 1); CLK.]
[Figure: link port topologies. Panel titles: EXPANDING CLUSTERS, RING TOPOLOGY.]
10 SERIAL PORTS
The ADSP-21161 processor has four independent, synchronous serial ports (SPORTs) that provide an I/O interface to a wide variety of peripheral devices: SPORT0, SPORT1, SPORT2, and SPORT3. Each serial port has its own set of control registers and data buffers. With a range of clock and frame synchronization options, the SPORTs allow a variety of serial communication protocols and provide a glueless hardware interface to many industry-standard data converters and CODECs.

Serial ports can operate at half the full clock rate of the processor, at a maximum data rate of n/2 Mbit/s, where n equals the processor core-clock frequency in MHz. Bidirectional (transmit or receive) functions provide greater flexibility for serial communications. Serial port data can be automatically transferred to and from on-chip memory using DMA block transfers. In addition to standard synchronous serial mode, each serial port offers a Time Division Multiplexed (TDM) multichannel mode and I2S mode.

Serial ports offer the following features and capabilities:

• Two bidirectional channels per serial port, configurable as either transmitters or receivers. Each serial port can be configured as two receivers or two transmitters, permitting two unidirectional streams into or out of the same serial port. This bidirectional functionality provides greater flexibility for serial communications. Two SPORTs can be combined to allow full-duplex, dual-stream communications.
• Double-buffered data. All serial data pins have programmable receive and transmit functions and thus have one transmit and one receive data buffer register and a bidirectional shift register associated with each serial data pin. Double-buffering provides additional time to service the SPORT.
• Compression/decompression: A-law and μ-law hardware companding on transmitted and received words.
• Internally generated serial clock and frame sync signals in a wide range of frequencies, or clock and frame sync input from an external source.
• Interrupt-driven, single-word transfers to and from on-chip memory controlled by the processor core.
• DMA transfers to and from on-chip memory. Each SPORT can automatically receive or transmit an entire block of data.
• Chaining of DMA operations for multiple data blocks.
• Three operation modes: standard DSP serial, I2S, and multichannel. In I2S mode, one or both channels on each SPORT can transmit or receive. Each channel either transmits or receives left and right channels. In standard DSP serial and I2S modes, when both A and B channels are used, they transmit or receive data simultaneously, sending or receiving bit 0 on the same edge of the serial clock, bit 1 on the next edge of the serial clock, and so on. In multichannel mode, SPORT0 or SPORT1 can receive A channel data, and SPORT2 or SPORT3 transmits A channel data selectively from up to 128 channels of a time-division-multiplexed serial bitstream. This mode is useful for T1 or H.100/H.110 interfaces. In multichannel mode, SPORT0 and SPORT2 work as a pair, and SPORT1 and SPORT3 work as a pair.
• Data words between 3 and 32 bits in length, either MSB-first or LSB-first. Words must be between 8 and 32 bits in length for I2S mode.
• 128-channel TDM in multichannel mode operation.

Receive comparison and 2-dimensional DMA are not supported in the ADSP-21161 processor.
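The data-rate and word-length limits listed above reduce to simple arithmetic. The following host-side Python sketch restates them; the function names are hypothetical and the 100 MHz core clock in the usage note is only an example value, not a specification.

```python
def max_sport_bit_rate_mbps(core_clock_mhz):
    """Maximum serial data rate: n/2 Mbit/s for a core clock of n MHz."""
    return core_clock_mhz / 2

def word_length_ok(bits, i2s_mode=False):
    """Check a serial word length: 3 to 32 bits, or 8 to 32 bits in I2S mode."""
    lowest = 8 if i2s_mode else 3
    return lowest <= bits <= 32
```

For example, max_sport_bit_rate_mbps(100) evaluates to 50.0, and word_length_ok(3, i2s_mode=True) is False because I2S words must be at least 8 bits long.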
[Figure: serial port signals. SPORT0: D0a, D0b, FS0, SCLK0. SPORT1: D1a, D1b, FS1, SCLK1. SPORT2: D2a, D2b, FS2, SCLK2. SPORT3: D3a, D3b, FS3, SCLK3.]
The A and B channel data pins on each SPORT cannot transmit and receive data simultaneously for full-duplex operation. Two SPORTs must be combined to achieve full-duplex operation. The DDIR bit in the SPCTLx register sets the direction of both the A and B channel pins, so the direction of the A and B channels on a particular SPORT must be the same.

Serial communications are synchronized to a clock signal. Every data bit must be accompanied by a clock pulse. Each serial port can generate or receive its own clock signal (SCLKx). Internally generated serial clock frequencies are configured in the DIVx registers. The A and B channel data pins shift data based on the rate of SCLKx.

In addition to the serial clock signal, data may be signaled by a frame synchronization signal. The framing signal can occur at the beginning of an individual word or at the beginning of a block of words. The configuration of frame sync signals depends upon the type of serial device connected to the processor. Each serial port can generate or receive its own frame sync signal (FSx) for transmitting or receiving data. Internally generated frame sync frequencies are configured in the DIVx registers. Both the A and B channel data pins shift data based on the corresponding FSx pin.

Figure 10-2 shows a block diagram of a serial port. The SCLKx and FSx signals are internally connected to all four A and B channel data buffers. The setting of the DDIR bit enables the data buffer path, which, once activated, shifts data in response to a frame sync at the rate of SCLKx. Your application program must use the correct serial port data buffers, according to the value of the DDIR bit. The DDIR bit enables the transmit data buffers for the transmission of A and B channel data, or it enables the receive data buffers for the reception of A and B channel data. Inactive data buffers are not used.

The DDIR bit in the SPCTLx register affects the operation of the transmit data path or the receive data path. The data path includes the data buffers and the shift registers. When DDIR = 0, the primary and secondary RXx data registers and receive shift registers are activated, and the transmit path is disabled. When DDIR = 1, the primary and secondary TXx data registers and transmit shift registers are activated, and the receive path is disabled.
[Figure 10-2: serial port block diagram. The 32-bit DM, PM, and I/O data buses connect to the data buffers. Transmit path (DDIR=1, TX enable): hardware companding (compression, SPORTs 2 and 3 only) and transmit shift register. Receive path (DDIR=0, RX enable): receive shift register and hardware companding (expansion, SPORTs 0 and 1 only). Signals: SCLKx, FSx, DDIR CTL, DxA (DxA_in, DxA_out), DxB (DxB_in, DxB_out).]
Figure 10-2. Serial Port Block Diagram

If the serial data pin is configured as a serial transmitter, the data to be transmitted is written to the TXxA/TXxB buffer. The data is (optionally) compressed in hardware on the primary A channel (SPORT2 and SPORT3 only), then automatically transferred to the transmit shift register. Companding is not supported on the secondary B channels, so the data is automatically transferred from the TXxB buffer to the shift register. The data in the shift register is then shifted out on the SPORT's Dxy pin, synchronous to the SCLKx clock. If framing signals are used, the FSx signal indicates the start of the serial word transmission. The Dxy pin is always driven (that is, not three-stated) if the serial port is enabled (SPEN_A or SPEN_B = 1 in the SPCTLx control register), unless it is in multichannel mode and an inactive time slot occurs.

When the SPORT is configured as a transmitter (DDIR=1), the TXxA and TXxB registers and the channel transmit shift registers respond to SCLKx and FSx for transmission of data. The receive RXxA and RXxB buffer registers and receive shift registers are inactive and do not respond to SCLKx and FSx signals.

Do not read from the inactive RXxA and RXxB registers if the SPORTs are configured as transmitters (DDIR bit = 1 in SPCTLx). Because the receive buffer status is always empty in this configuration, reading from the buffer causes the core to hang indefinitely.

If the serial data pin is configured as a serial receiver (DDIR=0), the receive portion of the SPORT shifts in data from the Dxy pin, synchronous to the SCLKx receive clock. If framing signals are used, the FSx signal indicates the beginning of the serial word being received. When an entire word is shifted in on the primary A channel, the data is (optionally) expanded (SPORT0 and SPORT1 only), then automatically transferred to the RXxA buffer. When an entire word is shifted in on the secondary channel, it is automatically transferred to the RXxB buffer (companding is not supported on the secondary B channels).
When the SPORT is configured as a receiver (DDIR=0), the RXxA and RXxB registers, along with the corresponding A and B channel receive shift registers, are activated and respond to SCLKx and FSx for reception of data. The transmit TXxA and TXxB buffer registers and the transmit A and B shift registers are inactive and do not respond to the SCLKx and FSx signals.

Do not write to the inactive TXxA and TXxB registers if the SPORTs are configured as receivers (DDIR bit = 0 in SPCTLx). If the core keeps writing to the inactive buffer, the transmit buffer status becomes full. Because data is never transmitted out of the deactivated transmit data buffers, the core then hangs indefinitely.

The SPORTs are not UARTs and cannot communicate with an RS-232 device or use any other asynchronous communications protocol. One way to implement RS-232 compatible communications with the ADSP-21161 processor is to use two of the FLAG pins as asynchronous data receive and transmit signals. For an example, see Chapter 11, "Software UART", in Digital Signal Processing Applications Using The ADSP-2100 Family, Volume 2.
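The rule that only the buffers selected by DDIR may be accessed can be written as a small guard. This is a host-side Python sketch for reasoning about driver code, not processor code; active_buffers and access_is_safe are hypothetical helpers, and the buffer names follow the TXxA/TXxB/RXxA/RXxB pattern used above.

```python
def active_buffers(sport, ddir):
    """Data buffers that are active on SPORT number `sport` for a DDIR setting.

    ddir = 1 selects the transmit buffers, ddir = 0 the receive buffers.
    """
    kind = "TX" if ddir == 1 else "RX"
    return {f"{kind}{sport}A", f"{kind}{sport}B"}

def access_is_safe(buffer_name, sport, ddir):
    """True when the core may access the buffer without risking a hang."""
    return buffer_name in active_buffers(sport, ddir)
```

A check such as access_is_safe("RX2A", 2, 1) returns False, which corresponds to the case where reading an inactive receive buffer would hang the core.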
SPORT Interrupts
Each serial port has a transmit DMA interrupt and a receive DMA interrupt. For each SPORT, both the A and B channel transmit or receive data buffers share the same interrupt vector. If a given SPORT is configured to transmit data, both the TXxA and TXxB data buffers use the interrupt vector when previous data has been transmitted. If the SPORT is configured to receive data, both the RXxA and RXxB data buffers use the interrupt vector when new data has been received. When serial port DMA is not enabled, interrupts occur based on the SPORT transmit or receive FIFO status. If the FIFO is empty on the transmit side, or the FIFO is full on the receive side, an interrupt is generated. The priority of the serial port interrupts is shown in Table 10-1.

Table 10-1. Priority of the Serial Port Interrupts

Interrupt Name (1)   Interrupt
SP0I                 SPORT0 DMA Channels 0 and 1 (Highest Priority)
SP1I                 SPORT1 DMA Channels 2 and 3
SP2I                 SPORT2 DMA Channels 4 and 5
SP3I                 SPORT3 DMA Channels 6 and 7 (Lowest Priority)

(1) The interrupt names are defined in the def21161.h file supplied with the ADSP-21xxx Development Software.

SPORT interrupts occur on the second system clock (CLKIN) after the last bit of the serial word is latched in or driven out.
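The non-DMA interrupt condition and the priority order of Table 10-1 can be summarized in a few lines. The sketch below is a host-side Python model under the assumptions stated in its comments; the function names and the string FIFO states are hypothetical, chosen only for illustration.

```python
# Interrupt names from Table 10-1, highest priority first.
SPORT_INTERRUPT_PRIORITY = ["SP0I", "SP1I", "SP2I", "SP3I"]

def non_dma_interrupt(ddir, fifo_state):
    """Model of the non-DMA condition described above.

    A transmitter (ddir = 1) raises the interrupt when its FIFO is "empty";
    a receiver (ddir = 0) raises it when its FIFO is "full".
    """
    return fifo_state == ("empty" if ddir == 1 else "full")

def highest_priority(pending):
    """Return the pending SPORT interrupt serviced first, or None."""
    for name in SPORT_INTERRUPT_PRIORITY:
        if name in pending:
            return name
    return None
```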
SPORT Reset
There are two ways to reset the serial ports: a software reset and a hardware reset. Each method has a different effect on the serial port. A software reset of the SPEN enable bit(s) disables the serial port(s) and aborts any ongoing operations. Status bits are also cleared. The serial ports are ready to start transmitting or receiving data two SCLK cycles after they are enabled in the SPCTLx control register. No serial clocks are lost from this point on. A hardware reset (RESET) disables the whole processor including the serial ports by clearing the SPCTLx control register. Any ongoing operations are aborted.
symbols are contained in the file def21161.h located in the INCLUDE directory of the ADSP-21xxx Development Software. The def21161.h file is shown in the registers appendix section "Register and Bit #Defines (def21161.h)" on page A-121. All control and status bits in the SPORT registers are active high unless otherwise noted.

Since the SPORT registers are memory-mapped, they cannot be written with data directly from memory. Instead, they must be written from (or read into) core registers, usually one of the general-purpose universal registers of the register file (R15-R0). The SPORT control registers can also be written or read by external devices (for example, another processor or a host processor) to set up a serial port DMA operation.

Table 10-2 provides a complete list of the SPORT registers, showing the memory-mapped IOP address and a brief description of each register.

Table 10-2. SPORT Registers
Register   IOP Address   Reset         Description
SPCTL0     0x1C0         0x0000 0000   SPORT0 serial control register
TX0A       0x1C1         None          SPORT0 transmit data buffer; A channel data
TX0B       0x1C2         None          SPORT0 transmit data buffer; B channel data
RX0A       0x1C3         None          SPORT0 receive data buffer; A channel data
RX0B       0x1C4         None          SPORT0 receive data buffer; B channel data
DIV0       0x1C5         None          SPORT0 divisor for transmit/receive SCLK0 and FS0
CNT0       0x1C6         None          SPORT0 count register
MR0CS0     0x1C7         None          SPORT0 multichannel receive select 0 (Channels 31-0)
MR0CCS0    0x1C8         None          SPORT0 multichannel receive compand select 0 (Channels 31-0)
MR0CS1     0x1C9         None          SPORT0 multichannel receive select 1 (Channels 63-32)
(bit 27) is equal to zero for the B channel. To test for the presence of any data in DXA/DXB, test whether DXS_A (bit 31) is equal to one for the A channel, or whether DXS_B (bit 28) is equal to one for the B channel.
There is one global control and status register for each SPORT pair (SPORT0 and SPORT2, SPORT1 and SPORT3) for multichannel operation: SP02MCTL and SP13MCTL. These registers define the number of channels, provide the status of the current channel, enable multichannel operation, and set the multichannel frame delay. Because the ADSP-21161 processor supports 128-channel TDM operation, the channel fields are seven bits wide and are stored in a separate register, SP02MCTL or SP13MCTL. The SPxyMCTL register is shown in Figure A-35 on page A-111.

The SPCTLx registers control the serial ports' operating modes for the I/O processor. Table 10-3 lists all the bits in SPCTLx.

Table 10-3. SPCTLx Control Bits Comparison in Three SPORT Modes of Operation
Multichannel RX = Multichannel Mode Receive Control Bits (SPORT0 and SPORT1)
Multichannel TX = Multichannel Mode Transmit Control Bits (SPORT2 and SPORT3)

Bit   I2S Mode   Standard DSP Serial Mode   Multichannel RX   Multichannel TX
0     SPEN_A     SPEN_A                     Reserved          Reserved
1     Reserved   DTYPE                      DTYPE             DTYPE
2     Reserved   DTYPE                      DTYPE             DTYPE
3     Reserved   SENDN                      SENDN             SENDN
4     SLEN0      SLEN0                      SLEN0             SLEN0
5     SLEN1      SLEN1                      SLEN1             SLEN1
6     SLEN2      SLEN2                      SLEN2             SLEN2
7     SLEN3      SLEN3                      SLEN3             SLEN3
8     SLEN4      SLEN4                      SLEN4             SLEN4
9     PACK       PACK                       PACK              PACK
10    MSTR       ICLK                       ICLK              Reserved
11    OPMODE     OPMODE                     OPMODE            OPMODE
12    Reserved   CKRE                       CKRE              CKRE
13    Reserved   FSR                        Reserved          Reserved
14    Reserved   IFS                        IRFS              Reserved
15    DITFS      DITFS                      Reserved          Reserved
16    L_FIRST    LFS                        LRFS              LTDV
17    Reserved   LAFS                       Reserved          Reserved
18    SDEN_A     SDEN_A                     SDEN_A            SDEN_A
19    SCHEN_A    SCHEN_A                    SCHEN_A           SCHEN_A
20    SDEN_B     SDEN_B                     Reserved          Reserved
21    SCHEN_B    SCHEN_B                    Reserved          Reserved
22    FS_BOTH    FS_BOTH                    Reserved          Reserved
23    Reserved   Reserved                   Reserved          Reserved
24    SPEN_B     SPEN_B                     Reserved          Reserved
25    DDIR       DDIR                       Reserved          Reserved
26    DERR_B     DERR_B                     Reserved          Reserved
27    DXS_B      DXS_B                      Reserved          Reserved
28    DXS_B      DXS_B                      Reserved          Reserved
29    DERR_A     DERR_A                     ROVF_A            TUVF_A
30    DXS_A      DXS_A                      RXS_A             TXS_A
31    DXS_A      DXS_A                      RXS_A             TXS_A
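The standard DSP serial mode column of Table 10-3 can be turned into bit masks to compose an SPCTLx value. The following Python sketch is a host-side illustration only. The constants are defined locally from the table positions rather than taken from def21161.h, spctl_dsp_serial is a hypothetical helper, and the particular combination of options is an example, not a recommended configuration.

```python
# Bit positions from the Standard DSP Serial Mode column of Table 10-3.
SPEN_A = 1 << 0       # SPORT enable, A channel
SENDN = 1 << 3        # 1 = LSB first
SLEN_SHIFT = 4        # SLEN occupies bits 8-4 and holds (word length - 1)
ICLK = 1 << 10        # 1 = internal clock
FSR = 1 << 13         # 1 = frame sync required
IFS = 1 << 14         # 1 = internal frame sync
DDIR = 1 << 25        # 1 = transmit, 0 = receive

def spctl_dsp_serial(word_bits, transmit, internal_clock=True,
                     internal_fs=True, lsb_first=False):
    """Compose an illustrative SPCTLx value for standard DSP serial mode."""
    if not 3 <= word_bits <= 32:
        raise ValueError("word length must be between 3 and 32 bits")
    value = SPEN_A | FSR | ((word_bits - 1) << SLEN_SHIFT)
    if internal_clock:
        value |= ICLK
    if internal_fs:
        value |= IFS
    if lsb_first:
        value |= SENDN
    if transmit:
        value |= DDIR
    return value
```

With these assumptions, a 16-bit transmitter with internal clock and frame sync yields 0x020064F1. OPMODE (bit 11) stays 0, which selects DSP serial/multichannel mode rather than I2S mode.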
[Figure: SPCTLx register bits for standard DSP serial mode; all bits are 0 at reset. Field labels recovered from the figure:]
LFS
LAFS: Late FS; 0=early FS, 1=late FS
DERR_A: DXA Error Status (sticky); DDIR=1, transmit underflow status; DDIR=0, receive overflow status
SDEN_A: SPORT DMA enable A channel; 1=enable, 0=disable
DXS_B*: DXB Data Buffer Status; 11=full, 10=partially full, 00=empty
SCHEN_A: DMA chaining enable A channel; 1=enable, 0=disable
DERR_B*: DXB Error Status (sticky)
SDEN_B: SPORT DMA enable B channel; 1=enable, 0=disable
DDIR**: Data Direction Control; 1=active transmit buffers TXnB/TXnA, 0=enable receive buffers RXnB/RXnA
SCHEN_B: DMA chaining enable B channel; 1=enable, 0=disable
SPEN_B: SPORT Enable B; 1=enable, 0=disable
FS_BOTH: 1=issue WS only if data is present in both Tx, 0=issue WS if data is present in either Tx
DITFS: Data Independent tx FS (if DDIR=1); 1=data independent, 0=data dependent
SPEN_A: SPORT Enable A; 1=enable, 0=disable
IFS: Internally generated FS; 1=internal FS, 0=external FS
DTYPE: Data type; 00=right-justify, fill MSB with 0s; 01=right-justify, sign extend MSB; 10=compand mu-law; 11=compand A-law
FSR: FS requirement; 1=FS required, 0=FS not required
SENDN: Endian word format; 0=MSB first, 1=LSB first
CKRE: Clock edge for data and frame sync sampling or driving; 1=rising edge, 0=falling edge
SLEN: Serial Word Length - 1
OPMODE: SPORT Operation Mode; 0=DSP serial mode/multichannel mode, 1=I2S mode
PACK: 16/32 packing; 1=packing, 0=no packing
ICLK
* Status is read-only
** Do not read/write from/to inactive RXn/TXn buffers
[Figure: SPCTLx register bits for I2S mode; all bits are 0 at reset. Field labels recovered from the figure:]
L_FIRST: Left or right I2S channel RX/TX first; 1=start left data first, 0=start right data first
DERR_A: DXA Error Status (sticky); DDIR=1, transmit underflow status; DDIR=0, receive overflow status
SDEN_A: SPORT transmit DMA enable A channel; 1=enable, 0=disable
DXS_B*: DXB Data Buffer Status; 11=full, 10=partially full, 00=empty
SCHEN_A
DERR_B*: DXB Error Status (sticky)
SDEN_B: SPORT transmit DMA enable B channel; 1=enable, 0=disable
DDIR**: Data Direction Control; 1=active transmit buffers TXnA/TXnB, 0=enable receive buffers RXnA/RXnB
SCHEN_B: DMA chaining enable B channel; 1=enable, 0=disable
SPEN_B: SPORT Enable B; 1=enable, 0=disable
FS_BOTH: 1=issue WS only if data is present in both Tx, 0=issue WS if data is present in either Tx
DITFS: Data Independent tx FS (if DDIR=1); 1=data independent, 0=data dependent
SPEN_A: SPORT Enable A; 1=enable, 0=disable
OPMODE: SPORT Operation Mode; 0=DSP serial mode/multichannel mode, 1=I2S mode
SLEN: Serial Word Length - 1
PACK: 16/32 packing; 1=packing, 0=no packing
MSTR: I2S serial and L/R clock master; 1=internal SCLK and WS, TX/RX is master; 0=external SCLK and WS, TX/RX is slave
* Status is read-only
** Do not read/write from/to inactive RXn/TXn buffers
[Figure 10-5: SPCTL receive control bits in multichannel mode; all bits are 0 at reset. Field labels recovered from the figure:]
RXS_A*: RXA Data Buffer Status; 11=full, 10=partially full, 00=empty
LRFS: Active Low Multichannel Receive FS0/FS1; 0=active high, 1=active low
SDEN_A: SPORT receive DMA enable A; 1=enable, 0=disable
ROVF_A*: RXA Overflow Status (sticky)
SCHEN_A: SPORT receive DMA chaining enable A; 1=enable, 0=disable
IRFS: Internally generated multichannel rx FS; 1=internal FS0/FS1, 0=external FS0/FS1
DTYPE: Data type; 00=right-justify, fill MSB with 0s; 01=right-justify, sign extend MSB; 10=compand mu-law; 11=compand A-law
CKRE: Active clock edge for data and frame sync sampling; 1=rising edge, 0=falling edge
SENDN: Endian word format; 0=MSB first, 1=LSB first
OPMODE: SPORT Operation Mode; 0=DSP serial mode/multichannel mode, 1=I2S mode
SLEN: Serial Word Length - 1
ICLK: Internally generated receive clock; 1=internal clock, 0=external clock
PACK: 16/32 packing; 1=packing, 0=no packing
* Status is read-only
Figure 10-5. SPCTL Receive Control Bits in Multichannel Mode for SPORT0 and SPORT1
[Figure 10-6: SPCTL transmit control bits in multichannel mode; all bits are 0 at reset. Field labels recovered from the figure:]
TXS_A*: TXA Data Buffer Status; 11=full, 10=partially full, 00=empty
LTDV: Active Low MC Transmit Data Valid; 0=active high TDV2/TDV3, 1=active low TDV2/TDV3
SDEN_A: SPORT transmit DMA enable A; 1=enable, 0=disable
TUVF_A*: TXA Underflow Status (sticky)
SCHEN_A: SPORT transmit DMA chaining enable A; 1=enable, 0=disable
Reserved**
OPMODE: SPORT Operation Mode; 0=DSP serial mode/multichannel mode, 1=I2S mode
DTYPE: Data type; x0=right-justify, fill MSB with 0s; x1=right-justify, sign extend MSB; 0x=compand mu-law; 1x=compand A-law
PACK: 16/32 packing; 1=packing, 0=no packing
SENDN: Endian word format; 0=MSB first, 1=LSB first
SLEN: Serial Word Length - 1
Figure 10-6. SPCTL Transmit Control Bits in Multichannel Mode for SPORT2 and SPORT3
[Figure 10-7: SPxyMCTL control bits; all bits are 0 at reset. Field labels recovered from the figure:]
CHNL: Current Channel (read-only)
SPL: SPORT Loopback (SPORT0 and SPORT2 only, or SPORT1 and SPORT3 only)
MCE: Multichannel enable; 1=enable, 0=disable
MFD: Multichannel Frame Delay
NCH: Number of Channels - 1
Figure 10-7. SPxyMCTL Control Bits for Multichannel Mode

The following bits control serial port modes and are part of the SPCTLx control registers. Other bits in the SPCTLx registers set up DMA and I/O processor related serial port features.

Current Channel Selected. SP02MCTL or SP13MCTL Bits 16-22 (CHNL). These read-only, sticky status bits identify the currently selected transmit channel slot (0 to 127). These bits apply to multichannel mode only.

Clock Rising Edge Select. SPCTLx Bit 12 (CKRE). This bit selects whether the serial port uses the rising edge (if set, =1) or falling edge (if cleared, =0) of the clock signal for sampling data and the frame sync. This bit applies to DSP standard serial and multichannel modes only.
Data Direction Control. SPCTLx Bit 25 (DDIR). This bit controls the data direction of the serial port channel A and B pins.
0 = SPORT is configured to receive on both channels A and B
1 = SPORT is configured to transmit on both channels A and B
When configured to receive, the RXxA and RXxB buffers are activated, while the receive shift registers are controlled by SCLKx and FSx. The TXxA and TXxB buffers are inactive. When configured to transmit, the TXxA and TXxB buffers are activated, while the transmit shift registers are controlled by SCLKx and FSx. The RXxA and RXxB buffers are inactive. This bit applies to all SPCTLx registers for I2S and DSP standard serial modes.

Reading from or writing to inactive buffers causes the core to hang until the SPORT is cleared. A hardware reset or host reset clears the SPORT.

Data Independent Transmit Frame Sync Select. SPCTLx Bit 15 (DITFS). This bit selects whether the serial port uses a data-independent transmit frame sync (sync at selected interval, if set to 1) or a data-dependent TFS (sync when data is in the transmit buffer, if cleared to 0) when DDIR=1. When DITFS = 0, a transmit FSx signal is generated only when new data is in the SPORT channel's transmit data buffer. Applications must also program the DIVx register. When DITFS = 1, a transmit FSx signal is generated regardless of the validity of the data present in the SPORT channel's transmit data buffer. The processor generates the transmit FSx signal at the frequency specified by the value loaded in the DIVx register. This bit applies to all SPCTLx registers in I2S and DSP standard serial modes, and to SPCTL2 and SPCTL3 register transmit control for multichannel mode.
DXS Data Buffer Status. SPCTLx Bits 31-30 (DXS_A) and Bits 28-27 (DXS_B). These read-only, sticky bits indicate the status of the serial port's data buffer as follows: 11 = buffer full, 00 = buffer empty, 10 = buffer partially full, 01 = reserved. These bits apply to I2S and DSP standard serial modes. When the SPORT is configured as a transmitter, these bits reflect transmit buffer status for the TXxA and TXxB registers. When the SPORT is configured as a receiver, these bits reflect receive buffer status for the RXxA and RXxB registers.

Data Buffer Error Status (sticky, read-only). SPCTLx Bits 29 and 26 (DERR_A and DERR_B). These bits indicate whether the serial transmit operation has underflowed (if set, =1 and DDIR=1) or a receive operation has overflowed (if set, =1 and DDIR=0) in the DXA and DXB data buffers. These bits apply to I2S and DSP standard serial modes.

When the SPORT is configured as a transmitter, these bits provide transmit underflow status and indicate whether the FSx signal (from an internal or external source) occurred while the transmit buffer was empty. The SPORTs transmit data whenever they detect an FSx signal.
0 = No FS signal occurred.
1 = FS signal occurred.

When the SPORT is configured as a receiver, these bits provide receive overflow status and indicate whether the channel has received new data while the receive buffer is full. New data overwrites existing data.
0 = No new data.
1 = New data.
Data Type Select. SPCTLx Bits 2-1 (DTYPE). These bits select the companding and MSB data type formatting of serial words loaded into the transmit and receive buffers. The transmit shift register does not zero-fill or sign-extend transmit data words. These bits apply to DSP standard serial and multichannel modes only.
For standard mode, selection of companding mode and MSB format are exclusive:
00 = Right justify; fill unused MSBs with 0s.
01 = Right justify; sign-extend into unused MSBs.
10 = Compand using μ-law. (Primary channels only)
11 = Compand using A-law. (Primary channels only)
For multichannel mode, selection of companding mode and MSB format are independent:
x0 = Right justify; fill unused MSBs with 0s.
x1 = Right justify; sign-extend into unused MSBs.
0x = Compand using μ-law.
1x = Compand using A-law.

Frame Sync Both Enable. SPCTLx Bit 22 (FS_BOTH). This bit applies when SPORT channels A and B are configured to transmit data. If set (=1), a word select is issued only when data is present in both transmit buffers, TXxA and TXxB. If cleared (=0), a word select is issued if data is present in either transmit buffer. This bit applies to I2S and DSP standard serial modes only.

Internal Transmit Clock Select. SPCTLx Bit 10 (ICLK). This bit selects the internal (if set, =1) or external (if cleared, =0) transmit or receive clock. This bit applies to DSP standard serial and multichannel modes for SPCTL0 and SPCTL1 registers. In these modes only, set this parameter separately for all four SPORTs, where each SPCTL register contains an ICLK bit.

Receive Multichannel Frame Sync Source. SPCTL0 and SPCTL1 Bit 14 (IRFS). This bit selects whether the serial port uses an internal clock generated frame sync (if set, =1) or an external source (if cleared, =0). This bit applies to multichannel mode only.

Internal Frame Sync Select. SPCTLx Bit 14 (IFS). This bit selects whether the serial port uses an internal clock generated frame sync (if set, =1) or an external source (if cleared, =0). This bit applies to DSP standard serial mode only.

Late Transmit Frame Sync Select. SPCTLx Bit 17 (LAFS). This bit selects when to generate the frame sync signal. If set (=1), it selects a late frame sync, asserted during the first bit of each data word. If cleared (=0), it selects an early frame sync, asserted during the serial clock cycle immediately preceding the first data bit. This bit applies to DSP standard serial mode only.

Left/Right Channel Transmit or Receive First. SPCTLx Bit 16 (L_FIRST). This bit selects the left channel first (if set, =1) or right channel first (if cleared, =0) for transmit or receive. This bit applies to I2S mode only.

Low Active Frame Sync Select. SPCTLx Bit 16 (LFS). This bit selects the logic level of the (transmit or receive) frame sync signals: active low if set (=1) or active high if cleared (=0). Active high (0) is the default. This bit applies to DSP standard serial mode only.

Active State Multichannel Receive Frame Sync Select. SPCTL0 and SPCTL1 Bit 16 (LRFS). This bit selects the logic level of the multichannel receive frame sync signals as active low (inverted) if set (=1) or active high if cleared (=0). Active high (0) is the default. This bit applies to multichannel mode only.
Active State Transmit Data Valid. SPCTL2 and SPCTL3 Bit 16 (LTDV). This bit selects the logic level of the transmit data valid signals (TDV2, TDV3) as active low (inverted) if set (=1) or active high if cleared (=0). These signals use the FS2 and FS3 pins, which are reconfigured as outputs during multichannel operation to indicate which timeslots have valid data to transmit. Active high (0) is the default. This bit applies to multichannel mode only.
Multichannel Mode Enable. SP02MCTL and SP13MCTL Bit 0 (MCE). One of two configuration bits that enable and disable multichannel mode on both the receive and transmit serial port channels. If MCE is cleared (=0), multichannel operation is disabled. If MCE is set (=1) and OPMODE is cleared (=0), multichannel operation is enabled. This bit applies to DSP standard serial and multichannel modes only.

Multichannel Frame Delay. SP02MCTL and SP13MCTL Bits 4-1 (MFD). These bits set the interval, in serial clock cycles, between the multichannel frame sync pulse and the first data bit. These bits provide support for different types of T1 interface devices. Valid values range from 0 to 15. Values of 1 to 15 correspond to the number of intervening serial clock cycles. A value of 0 corresponds to no delay: the multichannel frame sync pulse is concurrent with the first data bit. These bits apply to multichannel mode only.

SPORT Transmit or Receive Master Mode. SPCTLx Bit 10 (MSTR). This bit selects the clock and word-select source for transmitting or for receiving. If set (=1), the SPORT uses the internal clock and word-select source, and the transmitter or receiver is the master. If cleared (=0), the SPORT transmitter or receiver is a slave. This bit applies to I2S mode only.
Number of Multichannel Slots (minus one). SP02MCTL and SP13MCTL Bits 11-5 (NCH). These bits select the number of channel slots (maximum of 128) to use for multichannel operation. Valid values for the actual number of channel slots range from 1 to 128. These bits apply to multichannel mode only. Use the following formula to calculate the value for NCH: NCH = Actual number of channel slots - 1.

SPORT Operation Mode. SPCTLx Bit 11 (OPMODE). This bit enables (if set, =1) or disables (if cleared, =0) I2S mode. When this bit is set, the processor ignores the MCE bit. When this bit is cleared, the MCE bit determines whether the SPORT is in DSP serial mode (MCE = 0) or multichannel mode (MCE = 1).

16-bit to 32-bit Word Packing Enable. SPCTLx Bit 9 (PACK). This bit enables (if set, =1) or disables (if cleared, =0) 16- to 32-bit word packing. This bit applies to all operation modes.

Frame Sync Required Select. SPCTLx Bit 13 (FSR). This bit selects whether the serial port requires (if set, =1) or does not require (if cleared, =0) a transfer frame sync. When a frame sync is not required, a single frame sync signal is still needed to initiate communications; the frame sync is ignored after the first bit received. This bit applies to DSP standard serial mode only.

Receive Overflow Status (read-only, sticky). SPCTL0 and SPCTL1 Bit 29 (ROVF). This bit indicates whether the channel has received new data (if set, =1) or not (if cleared, =0) while the RXS_A buffer is full. New data overwrites existing data. This bit applies to multichannel mode only.

Receive Data Buffer Status Channel A (read-only). SPCTL0 and SPCTL1 Bits 31-30 (RXS_A). These bits indicate the status of the channel's receive buffer contents as follows: 00 = buffer empty, 01 = reserved, 10 = buffer partially full, 11 = buffer full. These bits apply to multichannel mode only.
Serial Port DMA Chaining Enable. SPCTLx Bits 19 and 21 (SCHEN_A and SCHEN_B). These bits enable (if set, =1) or disable (if cleared, =0) DMA chaining on serial port channels A and B. Bit 21 applies to I2S and DSP standard serial modes only, for secondary (B) SPORT channels.

Serial Port DMA Enable. SPCTLx Bits 18 and 20 (SDEN_A and SDEN_B). These bits enable (if set, =1) or disable (if cleared, =0) the serial port channel's DMA. Bit 20 applies to I2S and DSP standard serial modes only, for secondary (B) SPORT channels.

Serial Word Endian Select. SPCTLx Bit 3 (SENDN). This bit selects little endian words (LSB first, if set, =1) or big endian words (MSB first, if cleared, =0). This bit applies to DSP standard serial and multichannel modes only.

Serial Word Length Select. SPCTLx Bits 8-4 (SLEN). These bits select the word length in bits. Word sizes can be from 3 bits (SLEN = 2) to 32 bits (SLEN = 31). These bits apply to all operation modes. Use the following formula to calculate the value for SLEN: SLEN = Actual serial word length - 1. SLEN cannot equal 0 or 1.
Serial Port Enable. SPCTLx Bits 0 and 24 (SPEN_A and SPEN_B). These bits enable (if set, =1) or disable (if cleared, =0) the corresponding serial port channel A or B. Clearing a bit aborts any ongoing operation on that channel and clears the status bits. The SPORTs are ready to transmit or receive two cycles after enabling. These bits apply to I2S and DSP standard serial modes only.
SPORT Loopback Mode. SP02MCTL or SP13MCTL Bit 12 (SPL). This bit enables, if set (=1), or disables, if cleared (=0), the channel loopback mode. Loopback mode enables you to run internal tests and to debug applications. Loopback works only under the following SPORT configurations:
SPORT0 (configured as a receiver or transmitter) together with SPORT2 (configured as a transmitter or receiver). SPORT0 can only be paired with SPORT2, and the pair is controlled via the SPL bit in the SP02MCTL register.
SPORT1 (configured as a receiver or transmitter) together with SPORT3 (configured as a transmitter or receiver). SPORT1 can only be paired with SPORT3, and the pair is controlled via the SPL bit in the SP13MCTL register.
Either of the two paired SPORTs can be set up to transmit or receive, depending on their DDIR bit configurations. This bit applies to DSP standard serial and I2S modes only.

Transmit Underflow Status (sticky, read-only). SPCTL2 and SPCTL3 Bit 29 (TUVF_A). This bit indicates (if set, =1) that the multichannel FSx signal (from an internal or external source) occurred while the TXS buffer was empty. The SPORTs transmit data whenever they detect an FSx signal. If cleared (=0), no FSx signal occurred while the buffer was empty. This bit applies to multichannel mode only, when the SPORTs are configured as transmitters.

Transmit Data Buffer Status (sticky, read-only). SPCTL2 and SPCTL3 Bits 31-30 (TXS_A). These bits indicate the status of the serial port channel's transmit buffer as follows: 11 = buffer full, 00 = buffer empty, 10 = buffer partially full. These bits apply to multichannel mode only.
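The bit positions listed above can be collected into a small host-side helper for building or inspecting SPCTLx values. The following Python sketch is illustrative only: the names `SPCTL_FIELDS`, `pack_spctl`, and `unpack_spctl` are hypothetical, the table covers only the writable fields whose positions are stated in the descriptions above, and it uses the DSP standard serial mode names (bit 10 is MSTR and bit 16 is L_FIRST in I2S mode).

```python
# Hypothetical helper, not part of any Analog Devices tool.
# Each entry maps a field name to (least significant bit, width in bits),
# using the DSP standard serial mode bit names from the descriptions above.
SPCTL_FIELDS = {
    "SPEN_A": (0, 1), "DTYPE": (1, 2), "SENDN": (3, 1), "SLEN": (4, 5),
    "PACK": (9, 1), "ICLK": (10, 1), "OPMODE": (11, 1), "FSR": (13, 1),
    "IFS": (14, 1), "DITFS": (15, 1), "LFS": (16, 1), "LAFS": (17, 1),
    "SDEN_A": (18, 1), "SCHEN_A": (19, 1), "SDEN_B": (20, 1),
    "SCHEN_B": (21, 1), "FS_BOTH": (22, 1), "SPEN_B": (24, 1),
    "DDIR": (25, 1),
}


def pack_spctl(**fields):
    """Combine named field values into a 32-bit SPCTLx value."""
    value = 0
    for name, field_value in fields.items():
        lsb, width = SPCTL_FIELDS[name]
        if not 0 <= field_value < (1 << width):
            raise ValueError(f"{name}={field_value} does not fit in {width} bit(s)")
        value |= field_value << lsb
    return value


def unpack_spctl(value):
    """Split a 32-bit SPCTLx value into its named writable fields."""
    return {name: (value >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in SPCTL_FIELDS.items()}
```

For example, `pack_spctl(DDIR=1, DTYPE=2)` yields 0x2000004, a transmitter with μ-law companding selected and all other fields cleared. The read-only status bits (DXS, DERR) are deliberately left out of the table.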
Register Writes and Effect Latency

SPORT register writes are internally completed at the end of the same CLKIN cycle in which they occur. The newly written value to the SPORT register can be read back on the very next cycle. When a read of one of the SPCTLx control registers is immediately followed by a write to that register, the write may take two cycles to complete. After a write to a SPORT register, control and mode bit changes generally take effect in the second CLKIN cycle after the write is completed. The serial ports are ready to start transmitting or receiving two CLKIN cycles after they are enabled (in the SPCTLx control register). No serial clocks are lost from this point on.
The transmit buffers act like a two-location FIFO because they have a data register plus an output shift register, as shown in Figure 10-2 on page 10-5. Two 32-bit words may be stored in the transmit queue at any one time. When the transmit register is loaded and any previous word has been transmitted, the register contents are automatically loaded into the output shifter. An interrupt occurs when the output transmit shifter has been loaded, signifying that the transmit buffer is ready to accept the next word (that is, the transmit buffer is not full). This interrupt does not occur when serial port DMA is enabled or when the corresponding mask bit in the IMASK register is cleared.

In I2S and DSP standard serial modes, the DERR_A and DERR_B overflow/underflow status bits are set when an overflow or underflow occurs. In multichannel mode, the DERR_A bits are redefined due to the fixed-directional functionality of the SPCTLx registers. When the SPCTL0 and SPCTL1 registers are configured for multichannel mode, the receive overflow bit ROVF_A indicates when the A channel has received new data while the RXS_A buffer is full. Similarly, when the SPCTL2 and SPCTL3 registers are configured for multichannel mode, the transmit underflow bit TUVF_A indicates that a new frame sync signal (FS0/FS1) occurred while the TXS_A buffer was empty. In multichannel mode only, the DERR_A (bit 29) overflow/underflow status bit in the SPCTLx register becomes fixed as either the ROVF_A overflow status bit (SPORTs 0 and 1) or the TUVF_A underflow status bit (SPORTs 2 and 3).

When the SPORT is configured as a transmitter (DDIR = 1), a transmit underflow status bit is set in the serial port control register when a transmit frame sync occurs and no new data has been loaded into the transmit buffer. The TUVF_A/DERR_A status bit is sticky and is cleared only by disabling the serial port. When the SPORT is configured as a receiver (DDIR = 0), the receive buffers are activated.
The receive buffers act like a three-location FIFO because they have two data registers plus an input shift register. Two complete
32-bit words can be stored in the receive buffer while a third word is being shifted in. The third word overwrites the second if the first word has not been read out (by the processor core or the DMA controller). When this happens, the receive overflow status bit is set in the serial port control register. Almost three complete words can be received without the receive buffer being read before overflow occurs. The overflow status is generated on the last bit of the third word. The ROVF_A/DERR_A status bit is sticky and is cleared only by disabling the serial port. An interrupt is generated when the receive buffer has been loaded with a received word (that is, the receive buffer is not empty). When the corresponding bit in the IMASK register is set, this interrupt is unmasked.

If your program causes the core processor to attempt a read from an empty receive buffer or a write to a full transmit buffer, the access is delayed until the buffer is accessed by the external I/O device. This delay is called a core processor hang. If you do not know whether the core processor can access the receive or transmit buffer without a hang, read the buffer's status first (in SPCTLx) to determine whether the access can be made. To support debugging buffer transfers, the ADSP-21161 processor has a Buffer Hang Disable (BHD) bit. When set (=1), this bit prevents the processor core from detecting a buffer-related stall condition, permitting debugging of this type of stall condition. For more information, see the BHD discussion on page 6-43.

The status bits in SPCTLx are updated during reads and writes from the core processor even when the serial port is disabled. Disable the serial port when writing to the receive buffer or reading from the transmit buffer. When programming the serial port channel (A or B) as a transmitter, only the corresponding TXxA and TXxB buffers become active while the receive buffers RXxA and RXxB remain inactive.
Similarly, when SPORT channels A and B are programmed to receive, only the corresponding RXxA and RXxB buffers are activated. Do not attempt to read or write to inactive data buffers. If the ADSP-21161 processor operates on the inactive transmit or receive buffers while the SPORT is enabled, unpredictable results may occur.
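The recommendation to read the buffer status before an access can be sketched as a pair of checks on a value read back from SPCTLx. This Python sketch is a host-side illustration under stated assumptions: the function names are hypothetical, the status field positions are those given for DXS_A (bits 31-30) and DXS_B (bits 28-27), and only the 00 (empty) and 11 (full) encodings are used, so the result does not depend on which bit of each pair is the more significant.

```python
# Hypothetical helpers; field positions come from the DXS bit description.
DXS_A_SHIFT = 30      # DXS_A occupies SPCTLx bits 31-30
DXS_B_SHIFT = 27      # DXS_B occupies SPCTLx bits 28-27
BUFFER_EMPTY = 0b00
BUFFER_FULL = 0b11


def buffer_status(spctl, channel="A"):
    """Return the two-bit data buffer status for channel 'A' or 'B'."""
    shift = DXS_A_SHIFT if channel == "A" else DXS_B_SHIFT
    return (spctl >> shift) & 0b11


def can_read_receive_buffer(spctl, channel="A"):
    """A read hangs the core only if the receive buffer is empty."""
    return buffer_status(spctl, channel) != BUFFER_EMPTY


def can_write_transmit_buffer(spctl, channel="A"):
    """A write hangs the core only if the transmit buffer is full."""
    return buffer_status(spctl, channel) != BUFFER_FULL
```

The sketch does not check DDIR; a caller would also need to confirm that the buffer being accessed is the active one for the configured direction.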
Figure 10-8. DIVx Register

The bit field CLKDIV specifies how many times the processor's internal clock (CCLK) is divided to generate the transmit and receive clocks. The frame sync FS is considered a receive frame sync if the data pins are configured as receivers. Likewise, the frame sync FS is considered a transmit frame sync if the data pins are configured as transmitters. The divisor is a 16-bit value, allowing a wide range of serial clock rates. Use the following equation to calculate the serial clock frequency:

$$f_{SCLK} = \frac{f_{CCLK}}{2\,(CLKDIV + 1)}$$

The maximum serial clock frequency is equal to half the processor's internal clock (CCLK) frequency, which occurs when CLKDIV is set to zero. Use the following equation to determine the value of CLKDIV to use, given the CCLK frequency and desired serial clock frequency:

$$CLKDIV = \frac{f_{CCLK}}{2\,f_{SCLK}} - 1$$

The processor's internal clock (CCLK) rate is set by the clock ratio determined by the CLKDBL pin and the CLK_CFG[1-0] pins.

The bit field FSDIV specifies how many transmit or receive clock cycles are counted before generating an FS pulse (when the frame sync is internally generated). In this way, a frame sync can initiate periodic transfers. The counting of serial clock cycles applies to internally or externally generated serial clocks. The formula for the number of cycles between frame sync pulses is:

Number of serial clocks between frame syncs = FSDIV + 1

Use the following equation to determine the value of FSDIV, given the serial clock frequency and desired frame sync frequency:

$$FSDIV = \frac{f_{SCLK}}{f_{SFS}} - 1$$
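The divisor equations can be checked with a short calculation. The Python sketch below is a host-side illustration, not DSP code; the function names are hypothetical, the 100 MHz CCLK in the usage note is an assumed example value, and rounding to the nearest integer is a design choice of this sketch (a non-integer ratio means the exact rate cannot be produced).

```python
def clkdiv_for(cclk_hz, sclk_hz):
    """CLKDIV = f_CCLK / (2 * f_SCLK) - 1, rounded to the nearest integer."""
    clkdiv = round(cclk_hz / (2 * sclk_hz)) - 1
    if not 0 <= clkdiv <= 0xFFFF:          # CLKDIV is a 16-bit divisor
        raise ValueError(f"CLKDIV {clkdiv} is outside the 16-bit range")
    return clkdiv


def sclk_from(cclk_hz, clkdiv):
    """f_SCLK = f_CCLK / (2 * (CLKDIV + 1))."""
    return cclk_hz / (2 * (clkdiv + 1))


def fsdiv_for(sclk_hz, frame_sync_hz):
    """FSDIV = f_SCLK / f_SFS - 1, rounded to the nearest integer."""
    fsdiv = round(sclk_hz / frame_sync_hz) - 1
    if fsdiv < 0:
        raise ValueError("frame sync rate cannot exceed the serial clock rate")
    return fsdiv
```

With an assumed CCLK of 100 MHz, `clkdiv_for(100_000_000, 2_500_000)` gives 19, and a frame sync every 32 serial clocks (78.125 kHz) gives `fsdiv_for(2_500_000, 78_125)` equal to 31. Calling `sclk_from` with the returned CLKDIV shows the rate actually obtained after rounding.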
The frame sync is continuously active when FSDIV = 0. The value of FSDIV should not be less than the serial word length minus one (the value of the SLEN field in the serial port control register), as this may cause an external device to abort the current operation or cause other unpredictable results. If the serial port is not being used, the FSDIV divisor can be used as a counter for dividing an external clock or for generating a periodic pulse or periodic interrupt. The serial port must be enabled for this mode of operation to work.

Exercise caution when operating with externally generated transmit clocks near the frequency of half the processor's internal clock. There is a delay between when the clock arrives at the SCLKx pin and when data is output; this delay may limit the receiver's speed of operation. Refer to the data sheet for exact timing specifications. For reliable operation, use full-speed serial clocks only when receiving with an externally generated clock and externally generated frame sync (ICLK = 0, IFS = 0).

Externally generated late transmit frame syncs also experience a delay from when they arrive to when data is output. This can also limit the maximum serial clock speed. Refer to the ADSP-21161N DSP Microcomputer Data Sheet for exact timing specifications.
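The FSDIV constraint and the use of FSDIV as a periodic counter can be expressed as two small checks. This Python sketch is illustrative; the function names are hypothetical, and the period formula simply restates that FSDIV + 1 serial clocks elapse between frame syncs.

```python
def fsdiv_is_valid(fsdiv, word_length_bits):
    """FSDIV should not be less than SLEN (the word length minus one)."""
    slen = word_length_bits - 1
    return fsdiv >= slen


def frame_sync_period_s(sclk_hz, fsdiv):
    """Seconds between frame sync pulses: (FSDIV + 1) serial clock periods."""
    return (fsdiv + 1) / sclk_hz
```

For example, with an assumed 1 MHz serial clock, FSDIV = 999 produces one pulse every millisecond, which is the kind of periodic interrupt source described above.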
Word Length
Serial ports can process word lengths of 3 to 32 bits for serial and multichannel modes and 8 to 32 bits for I2S mode. Word length is configured using the 5-bit SLEN field in the SPCTLx control registers. The value of SLEN is given as follows:

SLEN = Serial word length - 1

Do not set the SLEN value to zero or one. Words smaller than 32 bits are right-justified in the receive and transmit buffers, residing in the least significant bit positions.

Although serial ports process word lengths of 3 to 32 bits, transmitting or receiving words smaller than 7 bits at half the full clock rate of the processor may cause incorrect operation when DMA chaining is enabled. Chaining disables the processor's internal I/O bus for several cycles while the new TCB parameters are being loaded. Receive data may be lost (that is, overwritten) during this period. Transmitting or receiving words smaller than five bits may cause incorrect operation when all the DMA channels are enabled with no DMA chaining.
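The word length limits translate into a simple range check before the SLEN field is computed. The Python sketch below is a host-side illustration; the mode keys and the function name are hypothetical labels chosen for this example.

```python
# Allowed word lengths per mode, as stated in the Word Length text above.
WORD_LENGTH_RANGE = {
    "dsp": (3, 32),           # DSP standard serial mode
    "multichannel": (3, 32),
    "i2s": (8, 32),
}


def slen_for(word_length_bits, mode="dsp"):
    """Return the SLEN field value (word length minus one) for a mode."""
    low, high = WORD_LENGTH_RANGE[mode]
    if not low <= word_length_bits <= high:
        raise ValueError(
            f"{word_length_bits}-bit words are not supported in {mode} mode")
    return word_length_bits - 1
```

Because the shortest allowed word is 3 bits, the check also guarantees that SLEN is never 0 or 1.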
Endian Format
Endian format determines whether serial words transmit MSB-first or LSB-first. Endian format is selected by the SENDN bit in the SPCTLx control registers. When SENDN = 0, serial words transmit (or receive) MSB-first. When SENDN = 1, serial words transmit (or receive) LSB-first.
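The effect of SENDN on the order of bits on the wire can be shown with a short model. This Python sketch is illustrative only; the function name is hypothetical and the model ignores clocking and frame sync timing.

```python
def serial_bit_order(word, word_length_bits, sendn):
    """List the bits of a word in the order they are shifted out.

    sendn = 0 selects MSB-first; sendn = 1 selects LSB-first.
    """
    lsb_first = [(word >> i) & 1 for i in range(word_length_bits)]
    return lsb_first if sendn else lsb_first[::-1]
```

For the 4-bit word 0b1100, SENDN = 0 shifts out 1, 1, 0, 0 and SENDN = 1 shifts out 0, 0, 1, 1.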
Table 10-4. DTYPE and Data Formatting (DSP Serial Mode) (Cont'd)

DTYPE   Data Formatting
10      Compand using μ-law (primary A channels only)
11      Compand using A-law (primary A channels only)
These formats are applied to serial data words loaded into the receive and transmit buffers. Transmit data words are not zero-filled or sign-extended, because only the significant bits are transmitted. For multichannel operation, the companding selection and MSB-fill selection are independent (Table 10-5).

Table 10-5. DTYPE and Data Formatting (Multichannel)

DTYPE   Data Formatting
x0      Right-justify, zero-fill unused MSBs
x1      Right-justify, sign-extend into unused MSBs
0x      Compand using μ-law (primary A channels only)
1x      Compand using A-law (primary A channels only)
Linear transfers occur if the channel is active and companding is not selected for that channel. Companded transfers occur if the channel is active and companding is selected for that channel. The multichannel compand select registers, MTzCCSx and MRzCCSx, specify the transmit and receive channels that are companded. Transmit or receive sign extension is selected by bit 0 of DTYPE in the SPCTLx register and is common to all transmit or receive channels. If bit 0 of DTYPE is set, sign extension occurs on selected channels that do not have companding selected. If this bit is not set, the word contains zeros in the MSBs.
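The two MSB-fill formats can be modeled as follows. This Python sketch illustrates how a received word shorter than 32 bits would appear in the receive buffer under each setting of DTYPE bit 0; the function name is hypothetical and the model covers linear (non-companded) transfers only.

```python
def format_received_word(raw, word_length_bits, sign_extend):
    """Right-justify a received word in 32 bits.

    sign_extend=False zero-fills the unused MSBs (DTYPE bit 0 cleared);
    sign_extend=True copies the word's MSB into them (DTYPE bit 0 set).
    """
    mask = (1 << word_length_bits) - 1
    value = raw & mask
    sign_bit = 1 << (word_length_bits - 1)
    if sign_extend and value & sign_bit:
        value |= 0xFFFFFFFF & ~mask
    return value
```

An 8-bit word of 0x80 is stored as 0x00000080 with zero fill and as 0xFFFFFF80 with sign extension.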
Companding
Companding (compressing/expanding) is the process of logarithmically encoding and decoding data to minimize the number of bits that must be sent. The serial ports support the two most widely used companding algorithms, A-law and μ-law, performed according to the CCITT G.711 specification. The type of companding can be selected independently for each SPORT. Companding is selected by the DTYPE field of the SPCTLx control register.

Companding is supported on the A channel only. SPORTs 2 and 3 primary channels are capable of compression, while SPORTs 0 and 1 primary channels are capable of expansion. When companding is enabled, the data in the RX0A and RX1A buffers is the right-justified, sign-extended expanded value of the eight received LSBs. A write to TX2A and TX3A compresses the 32-bit value to eight LSBs (zero-filled to the width of the transmit word) before it is transmitted. If the 32-bit value is greater than the 13-bit A-law or 14-bit μ-law maximum, it is automatically compressed to the maximum value.

Since the values in the transmit and receive buffers are actually companded in-place, the companding hardware can be used without transmitting (or receiving) any data, for example during testing or debugging. This operation requires one cycle of overhead, as described below. For companding to execute properly, program the SPORT registers prior to loading data values into the SPORT buffers.

To compand data in-place, without transmitting:
1. Enable companding in the DTYPE field (bits 2-1) and set the DDIR bit (bit 25) of the SPCTLx transmit control register.
2. Write a 32-bit data word to the transmit buffer. The companding is calculated in this cycle.
3. Wait one cycle. A NOP instruction can be used to do this; if a NOP is not inserted, the core is held off for one cycle anyway. This allows the serial port companding hardware to reload the transmit buffer with the companded value.
4. Read the 8-bit companded value from the transmit buffer.

The following is an example for companding data in-place.
    R0 = 0x2000004;
    DM(0x1f0) = R0;        // Set up SPCTL3
    NOP; NOP; NOP; NOP;
    R0 = 0x1234;
    DM(0x1f1) = R0;        // Write 0x1234 to TX3A
    NOP;
    R0 = DM(0x1f1);        // Read compressed value (0x8D) from TX3A
To expand data in-place, use the same sequence of operations with the receive buffer instead of the transmit buffer. When expanding data in this way, set the appropriate serial word length (SLEN) in the SPCTLx control register. With companding enabled, interfacing the serial port to a codec requires little additional programming effort. If companding is not selected, two formats are available for received data words of fewer than 32 bits: one that fills unused MSBs with zeros, and another that sign-extends the MSB into the unused bits.
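The compressed value 0x8D quoted in the in-place example can be reproduced with a software model of G.711 μ-law compression. The Python sketch below is an assumption-laden illustration, not a description of the SPORT hardware: it treats the input as a signed value with a 14-bit magnitude, applies the standard G.711 bias of 33 and clip level of 8158, and the function name is hypothetical.

```python
MU_LAW_BIAS = 33      # G.711 mu-law bias for a 14-bit magnitude
MU_LAW_CLIP = 8158    # largest magnitude before the bias is added


def mulaw_compress(sample):
    """Compress a signed linear sample to an 8-bit G.711 mu-law code."""
    sign = 0x80 if sample < 0 else 0x00
    # Values beyond the 14-bit maximum are clipped to the maximum.
    magnitude = min(abs(sample), MU_LAW_CLIP) + MU_LAW_BIAS
    # After biasing, the magnitude is 33..8191, so the segment is 0..7.
    segment = magnitude.bit_length() - 6
    mantissa = (magnitude >> (segment + 1)) & 0x0F
    # G.711 mu-law codes are transmitted with all bits inverted.
    return ~(sign | (segment << 4) | mantissa) & 0xFF
```

Under these assumptions `mulaw_compress(0x1234)` returns 0x8D, matching the value read back from TX3A in the example, and any input above the clip level returns the maximum positive code, 0x80.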
The serial clock can be independently generated internally or input from an external source. The ICLK bit of the SPCTLx control registers determines the clock source. When ICLK is set (=1), the clock signal is generated internally by the processor and the SCLKx pins are outputs. The clock frequency is determined by the value of the serial clock divisor (CLKDIV) in the DIVx registers. When ICLK is cleared (=0), the clock signal is accepted as an input on the SCLKx pins, and the serial clock divisors in the DIVx registers are ignored. The externally generated serial clock does not need to be synchronous with the system clock.
When FSR is cleared (=0), the corresponding frame sync signal is not required. A single frame sync is required to initiate communications but it is ignored after the first bit is transferred. Data words are then transferred continuously in what is referred to as an unframed mode. When DMA is enabled in a mode where frame syncs are not required, DMA requests may be held off by chaining or may not be serviced frequently enough to guarantee continuous unframed data flow. Figure 10-9 illustrates framed serial transfers.
[Figure 10-9: timing diagram showing SCLK with framed data and unframed data, bits B3 to B0]
When IFS is set (=1), the corresponding frame sync signal is generated internally by the processor, and the FSx pin is an output. The frequency of the frame sync signal is determined by the value of the frame sync divisor (FSDIV) in the DIVx register. When IFS is cleared (=0), the corresponding frame sync signal is accepted as an input on the FSx pins, and the frame sync divisors in the DIVx registers are ignored. All of the frame sync options are available whether the signal is generated internally or externally.
For receive/transmit data and frame syncs, setting CKRE to 1 in SPCTLx selects the rising edge of SCLKx. When CKRE is cleared (=0), the processor selects the falling edge. Note that data and frame sync signals change state on the clock edge that is not selected. For example, the transmit and receive functions of any two serial ports connected together should always select the same value for CKRE so internally generated signals are driven on one edge and received signals are sampled on the opposite edge.
Internally generated frame syncs remain asserted for the entire length of the data word in late framing mode. Externally generated frame syncs are only checked during the first bit. They do not need to be asserted after that time period. Figure 10-10 illustrates the two modes of frame signal timing.
[Figure 10-10: timing diagram showing SCLK and DATA (bits B3 to B0) for the two frame sync timing modes]
When DITFS is set (=1), the internally generated (transmit) frame sync is output at its programmed interval regardless of whether new data is available in the transmit buffer. Whatever data is present in the transmit buffer is retransmitted with each assertion of frame sync. Depending on the SPORT operating mode, the TUVF_A or DERR_A/DERR_B transmit underflow status bit is set when this occurs (that is, when old data is retransmitted). The TUVF_A or DERR_A/DERR_B status bit is also set if the transmit buffer does not have new data when an externally generated frame sync occurs. In this mode of operation, the first internally generated frame sync is delayed until data has been loaded into the transmit buffer. If the internally generated frame sync is used, a single write to the transmit data register is required to start the transfer.
SPORT Loopback
When the SPORT loopback bit (SPL) is set in the SP02MCTL or SP13MCTL control register, the serial port is configured in an internal loopback connection as follows: SPORT0 and SPORT2 work as a pair for internal loopback, SPORT1 and SPORT3 work as a pair for internal loopback. The loopback configuration allows the serial ports to be tested internally. When loopback is configured, the DxA, DxB, SCLKx, and FSx signals of the SPORT0 and SPORT1 are internally connected to the DyA, DyB, SCLKy, and FSy signals of SPORT2 and SPORT3 respectively where x = 0 or 1, and y = 2 or 3. In loopback mode, either of the two paired SPORTS can be a transmitter or receiver. One SPORT in the loopback pair must be configured as a transmitter, and the other must be configured as a receiver. For example, SPORT0 can be a transmitter and SPORT2 can be a receiver for internal loopback. Or, SPORT0 can be a receiver and SPORT2 can be the transmitter when setting up internal loopback. The processor ignores external
activity on the SCLKx, FSx, A and B channel data pins when the SPORT is configured as the receiver. This prevents contention with the internal loopback data transfer. Only transmit clock and transmit frame sync options may be used in loopback mode; programs must ensure that the serial port is set up correctly in the SPCTLx control registers. Multichannel mode is not allowed. Only standard DSP serial and I2S modes support internal loopback.
If the DDIR bit is set (=1), the SPORT becomes a transmitter and all the other control bits are defined accordingly. Similarly, if DDIR is cleared (=0), the SPORT becomes a receiver. Multichannel mode and companding are not supported in I2S mode.
I2S Mode
I2S is a three-wire serial bus standard protocol for transmission of two-channel (stereo) Pulse Code Modulation (PCM) digital audio data, in which each sample is sent MSB-first. Many of today's analog and digital audio front-end devices support the I2S protocol, including: audio D/A and A/D converters, PC multimedia audio controllers, digital audio transmitters and receivers that support serial digital audio transmission standards such as AES/EBU, S/PDIF, IEC958, CP-340, and CP-1201, digital audio signal processors, dedicated digital filter chips, and sample rate converters.

The I2S bus transmits audio data and control signals over separate lines. The data line carries two multiplexed data channels: the left channel and the right channel. In I2S mode, if both channels on a SPORT are set up to transmit, the SPORT transmit channels (TXxA and TXxB) transmit simultaneously, each transmitting left and right I2S channels. If both channels on a SPORT are set up to receive, the SPORT receive channels (RXxA and RXxB) receive simultaneously, each receiving left and right I2S channels. Data is transmitted MSB-first. Multichannel operation and companding are not supported in I2S mode.

Each SPORT transmit or receive channel has channel enable, DMA enable, and chaining enable bits in its SPCTLx control register. The FSx signal is used as the transmit and/or receive word select signal. DMA-driven or interrupt-driven data transfers can also be selected using bits in the SPCTLx register.
Setting Internal Serial Clock and Frame Sync Rates

The serial clock rate (CLKDIV value) for internal clocks can be set using a bit field in the DIVx register. For details, see Clock and Frame Sync Frequencies (DIV) on page 10-33.

I2S Control Bits

Several bits in the SPCTLx control register enable and configure I2S operation: operation mode (OPMODE), word length (SLEN), I2S channel transfer order (L_FIRST), frame sync (word select) generation (FS_BOTH), master mode enable (MSTR), DMA enable (SDEN), and DMA chaining enable (SCHEN).

Setting Word Length (SLEN)

SPORTs handle data words containing 8 to 32 bits in I2S mode. Set the bit length for transmit and receive data words. For details, see Word Length on page 10-36. The transmitter sends the MSB of the next word one clock cycle after the word select (FSx) signal changes. In I2S mode, load FSDIV with the same value as SLEN to transmit or receive words continuously. For example, for 8-bit data words (SLEN = 7), set FSDIV = 7.

Selecting Transmit Receive Channel Order (L_FIRST)

In master and slave modes, it is possible to configure the I2S channel that each SPORT channel transmits or receives first. By default, the SPORT channels transmit and receive on the right I2S channel first. The left and right I2S channels are time-duplexed data channels.
To select the channel order, set the L_FIRST bit (= 1) to transmit or receive on left channel first, or clear the L_FIRST bit (= 0) to transmit or receive on right channel first. Selecting the Frame Sync Options (FS_BOTH) The processor uses FSx as transmit or receive word select signals, depending on configured direction of the data pins. When the processor generates the transmit word select signal (based on the data in the transmit channels), set FS_BOTH (= 1) to generate the word select signal when both transmit channels contain data. Clear FS_BOTH (= 0) to generate word select signal if either transmit channel contains data. The word select signal changes one clock cycle before the MSB of the data word transmits, enabling the slave transmitter to derive synchronous timing of the serial data and enabling the receiver to store the previous data word and clear its input for the next one. When using both SPORT channels (DxA and DxB) as transmitters (FS_BOTH = 1) and MSTR = 1 and DITFS = 0, the processor generates a frame sync signal only when both transmit buffers contain data because both transmitters share the same CLKDIV and FS. For continuous transmission, both transmit buffers must contain new data. When using both SPORT channels as transmitters and MSTR = 1 and DITFS = 1, the processor generates a frame sync signal at the frequency set by FSDIV = x whether or not the transmit buffers contain new data. In this case, the processor ignores the FS_BOTH bit. The DMA controller or the application is responsible for filling the transmit buffers with data. Enabling SPORT Master Mode (MSTR) The SPORTs transmit and receive channels can be configured for master or slave mode. In master mode, the processor generates the word select and serial clock signals for the transmitter or receiver. In slave mode, an external source generates the word select and serial clock signals for the
transmitter or receiver. When MSTR is cleared (= 0), the processor uses an external word select and clock source, and the SPORT transmitter or receiver is a slave. When MSTR is set (= 1), the processor uses its internal clock as the word select and clock source, and the SPORT transmitter or receiver is the master.
Enabling SPORT DMA (SDEN)
DMA can be enabled or disabled independently on any of the SPORT transmit and receive channels. Set SDEN (= 1) to enable DMA and place the channel in DMA-driven data transfer mode. Clear SDEN (= 0) to disable DMA and place the channel in interrupt-driven data transfer mode.
Interrupt-Driven Data Transfer Mode
In this mode, both the A and B channels share a common interrupt vector, regardless of being configured as a transmitter or receiver. The SPORT generates an interrupt when the transmit buffer has a vacancy or the receive buffer has data. To determine the source of an interrupt, applications must check the TXSx or RXSx data buffer status bits, respectively.
DMA-Driven Data Transfer Mode
Each transmitter and receiver has its own DMA registers. For details, see Serial Port DMA on page 6-95. The same DMA channel drives the left and right I2S channels for the transmitter or the receiver. The software application must de-multiplex the left and right channel data received by the receive buffer, because the left and right data is interleaved in the DMA buffers. Channel A and B on each SPORT share a common interrupt vector. The DMA controller generates an interrupt at the end of DMA transfer only.
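Because left and right words share one DMA buffer, the application has to split them after reception. The following C fragment is a minimal sketch of that step, not a vendor routine. It assumes the buffer starts on the first word of a pair and that words strictly alternate; the function name `i2s_deinterleave` and the `l_first` parameter (mirroring the L_FIRST bit) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split an interleaved I2S receive buffer into left and right arrays.
 * l_first mirrors the L_FIRST bit: nonzero means the left word of each
 * pair arrives first (assumption: the buffer starts on a pair boundary).
 * Returns the number of stereo pairs written. */
size_t i2s_deinterleave(const uint32_t *dma_buf, size_t words, int l_first,
                        uint32_t *left, uint32_t *right)
{
    assert(words % 2 == 0);            /* expect whole left/right pairs */
    size_t pairs = words / 2;
    for (size_t i = 0; i < pairs; i++) {
        uint32_t first  = dma_buf[2 * i];
        uint32_t second = dma_buf[2 * i + 1];
        left[i]  = l_first ? first  : second;
        right[i] = l_first ? second : first;
    }
    return pairs;
}
```

With the default right-first order (`l_first` = 0), the first word of each pair is written to `right`.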
Figure 10-11 shows the relationship between FS (word select), serial clock, and I2S data. Timing for word select is the same as for frame sync. Note that this example uses early frame sync.
Figure 10-11. (Timing diagram: SCLK, FS/WS, and serial data, with left channel bits L3 to L1 followed by right channel bits R3 to R1.)
Multichannel Operation
The serial ports offer a multichannel mode of operation which allows the SPORT to communicate in a time-division-multiplexed (TDM) serial system. In multichannel communications, each data word of the serial bit stream occupies a separate channel. Each word belongs to the next consecutive channel. For example, a 24-word block of data contains one word for each of 24 channels. The serial port can automatically select words for particular channels while ignoring the others. Up to 128 channels are available for transmitting or receiving or both. SPORT0 and SPORT1 receive and SPORT2 and SPORT3 transmit data selectively from any of the 128 channels. Data companding and DMA transfers can also be used in multichannel mode on channel A. Channel B is not used in multichannel mode.
Although the four SPORTs are programmable for data direction in the standard mode of operation, their programmability is restricted for multichannel operations due to implementation and backward compatibility issues. See the configuration shown in Figure 10-12. The following points summarize these limitations:
1. The primary A channels of SPORT0 and SPORT1 are capable only of expansion, and the primary A channels of SPORT2 and SPORT3 are capable only of compression.
2. In multichannel mode, SPORT0 and SPORT2 work in pairs; SPORT0 is the receive channel, and SPORT2 is the transmit channel. The same is true for SPORT1 and SPORT3.
3. Receive comparison is not supported.
Figure 10-12. SPORT Multichannel Mode Pairings: SPORT0 and SPORT2, SPORT1 and SPORT3
In multichannel mode, the SCLK2 and SCLK3 pins are inputs and are internally connected to their corresponding SCLK0 and SCLK1 pins. It is not necessary to externally connect SCLK2 to SCLK0 or SCLK3 to SCLK1.
Figure 10-13 shows example timing for a multichannel transfer with SPORT pairing. The transfer has the following characteristics:
• Uses the TDM method in which serial data is sent or received on different channels sharing the same serial bus.
• FS2 and FS3 are used as transmit data valid signals for external logic. These signals are active only during transmit channels. In a SPORT0/SPORT2 multichannel mode pairing, FS2 is the transmit data valid signal. In a SPORT1/SPORT3 multichannel mode pairing, FS3 is the transmit data valid signal.
Figure 10-13. Multichannel Operation (timing diagram: data words WORD 0 to WORD 2, bits B3 to B0, and TDV2)
Frame Syncs in Multichannel Mode
All receiving and transmitting devices in a multichannel system must have the same timing reference. The FS0 or FS1 signal is used for this reference, indicating the start of a block (or frame) of multichannel data words.
When multichannel mode is enabled on a SPORT0/2 or SPORT1/3 pair, both the transmitter and receiver use the FS0/FS1 signal as a frame sync. This is true whether FS0 or FS1 is generated internally or externally. The FS0/FS1 signal synchronizes the channels and restarts each multichannel sequence. FS0/FS1 assertion occurs at the beginning of the channel 0 data word.
FS2 or FS3 is used as a transmit data valid signal, which is active during transmission of an enabled word. Because the serial port's D2A and D3A pins are three-stated when the time slot is not active, the FS2/FS3 signal specifies whether D2A/D3A is being driven by the ADSP-21161 processor. The processor drives FS2/FS3 in multichannel mode whether or not DITFS is cleared.
FS2 is renamed TDV2 and FS3 is renamed TDV3 in multichannel mode. These pins become outputs. Do not connect FS2 (TDV2) to FS0, or FS3 (TDV3) to FS1, in multichannel mode. Bus contention between the transmit data valid and multichannel frame sync pins will result.
After the TXxA transmit buffer is loaded, transmission begins and the FS2/FS3 signal is generated. When serial port DMA is used, this may happen several cycles after the multichannel transmission is enabled. If a deterministic start time is required, pre-load the transmit buffer.
Multichannel Control Bits in SPCTL
The SPCTLx control registers contain several bits that enable and configure multichannel operations. Multichannel mode is enabled by setting the MCE bit in the SP02MCTL or SP13MCTL control register:
• When MCE is set (= 1), multichannel operation is enabled.
• When MCE is cleared (= 0), all multichannel operations are disabled.
Multichannel operation is activated three cycles after MCE is set. Internally generated frame sync signals activate four cycles after MCE is set.
Setting the MCE bit enables multichannel operation for both receive and transmit sides of the SPORT0/2 or SPORT1/3 pair. A transmitting SPORT2 or SPORT3 must be in multichannel mode if the receiving SPORT0 or SPORT1 is in multichannel mode.
The number of channels used in multichannel operation is selected by the 7-bit NCH field in the SP02MCTL and SP13MCTL multichannel control registers. Set NCH to the actual number of channels minus one:
NCH = Number of Channels - 1
The 7-bit CHNL field in the SP02MCTL and SP13MCTL multichannel control registers indicates the channel that is currently selected during multichannel operation. This field is a read-only status indicator. CHNL(6:0) increments modulo NCH(6:0) as each channel is serviced.
The 4-bit MFD field in the SP02MCTL and SP13MCTL multichannel control registers specifies a delay between the frame sync pulse and the first data bit in multichannel mode. The value of MFD is the number of serial clock cycles of the delay. Multichannel frame delay allows the processor to work with different types of T1 interface devices.
A value of zero for MFD causes the frame sync to be concurrent with the first data bit. The maximum value allowed for MFD is 15. A new frame sync may occur before data from the last frame has been received, because blocks of data occur back to back.
Use a multichannel frame delay of at least one pulse when the processor is generating frame syncs for the multichannel system and the serial clock of the system is equal to CLKIN (the processor clock). If MFD is not set to at least one, the master processor in a multiprocessing system does not recognize the first frame sync after multichannel operation is enabled. All succeeding frame syncs are recognized normally.
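The NCH and MFD rules above lend themselves to a small host-side check before the values are written to hardware. The sketch below only encodes what the text states (NCH is the channel count minus one, MFD is at most 15 and at least 1 in the CLKIN case). The function names are hypothetical, and the bit positions of the fields within SP02MCTL/SP13MCTL are deliberately not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* NCH holds the channel count minus one (7-bit field, 1..128 channels). */
uint8_t mc_nch_field(unsigned num_channels)
{
    assert(num_channels >= 1 && num_channels <= 128);
    return (uint8_t)(num_channels - 1);
}

/* Returns 1 if an MFD value is acceptable, 0 otherwise.
 * MFD is a 4-bit delay in serial clock cycles (0..15). When the processor
 * generates the frame sync and the serial clock equals CLKIN, the text
 * asks for a delay of at least one. */
int mc_mfd_ok(unsigned mfd, int internal_fs, int sclk_equals_clkin)
{
    if (mfd > 15)
        return 0;
    if (internal_fs && sclk_equals_clkin && mfd < 1)
        return 0;
    return 1;
}
```

For the 24-channel example used earlier in this section, `mc_nch_field(24)` yields 23.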
Channel Selection Registers
Specific channels can be individually enabled or disabled to select the words that are received and transmitted during multichannel communications. Data words from the enabled channels are received or transmitted, while disabled channel words are ignored. Up to 128 channels are available for transmitting and up to 128 channels for receiving. The multichannel selection registers enable and disable individual channels. The registers for each serial port are shown in Table 10-7.
Table 10-7. Multichannel Selection Registers
Register Names                Function
MR0CS(0-3), MR1CS(0-3)        Multichannel Receive Select: specifies the active receive channels (4 x 32-bit registers for 128 channels)
MT2CS(0-3), MT3CS(0-3)        Multichannel Transmit Select: specifies the active transmit channels (4 x 32-bit registers for 128 channels)
MR0CCS(0-3), MR1CCS(0-3)      Multichannel Receive Compand Select: specifies which active receive channels (out of 128 channels) are companded
MT2CCS(0-3), MT3CCS(0-3)      Multichannel Transmit Compand Select: specifies which active transmit channels (out of 128 channels) are companded
Each of the four multichannel enable and compand select registers is 32 bits long. Together these registers provide channel selection for 128 (32 x 4 = 128) channels. Setting a bit enables that channel so that the serial port selects its word from the multiple-word block of data (for either receive or transmit). For example, setting bit 0 in MR0CS0 or MT2CS0 selects word 0, setting bit 12 selects word 12, and so on. Setting bit 0 in MR0CS1 or MT2CS1 selects word 32, setting bit 12 selects word 44, and so on.
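The mapping from a channel number to a select register and bit follows directly from the examples above: the register index is the channel divided by 32, and the bit is the remainder. The following sketch keeps a software image of one bank of four registers. The type `mc_select_t` and both function names are hypothetical, and writing the image to the actual MRxCSn/MTxCSn registers is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Software image of one bank of four 32-bit select registers,
 * for example MR0CS0..MR0CS3 (hypothetical type for illustration). */
typedef struct { uint32_t reg[4]; } mc_select_t;

/* Set the enable bit for a channel (0..127) in the image. */
void mc_enable_channel(mc_select_t *sel, unsigned channel)
{
    assert(channel < 128);
    sel->reg[channel / 32] |= (uint32_t)1 << (channel % 32);
}

/* Return 1 if the channel's bit is set in the image, 0 otherwise. */
int mc_channel_enabled(const mc_select_t *sel, unsigned channel)
{
    assert(channel < 128);
    return (int)((sel->reg[channel / 32] >> (channel % 32)) & 1u);
}
```

Enabling channel 44 sets bit 12 of the second register in the bank, matching the MR0CS1/MT2CS1 example in the text.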
Setting a particular bit to 1 in the MT2CS(0-3) or MT3CS(0-3) register causes SPORT2 or SPORT3 to transmit the word in that channel's position of the data stream. Clearing the bit in the MT2CS(0-3) or MT3CS(0-3) register causes SPORT2's D2A or SPORT3's D3A data transmit pin to three-state during the time slot of that channel.
Setting a particular bit to 1 in the MR0CS(0-3) or MR1CS(0-3) register causes the serial port to receive the word in that channel's position of the data stream; the received word is loaded into the receive buffer. Clearing the bit in the MR0CS(0-3)/MR1CS(0-3) register causes the serial port to ignore the data.
Companding may be selected on a per-channel basis. Setting a bit to 1 in any of the multichannel compand select registers specifies that the data be companded for that channel. A-law or μ-law companding can be selected using the DTYPE bits in the SPCTLx control registers. SPORT0 and SPORT1 expand selected incoming time slot data, while SPORT2 and SPORT3 compress selected outgoing time slot data.
Data-direction programmability is supported in standard DSP serial mode and I2S mode. The value of the DDIR bit in SPCTLx (0 = RX, 1 = TX) determines whether the receive or transmit register for the SPORT becomes active.
The SPORT DMA channels are assigned higher priority than all other DMA channels (for example, link ports and the external port) because of their relatively low service rate and their inability to hold off incoming data. Having higher priority causes the SPORT DMA transfers to be performed first when multiple DMA requests occur in the same cycle.
Although the DMA transfers are performed with 32-bit words, serial ports can handle word sizes from 3 to 32 bits (8 to 32 bits for I2S mode). If serial words are 16 bits or smaller, they can be packed into 32-bit words for each DMA transfer; this is configured by the PACK bit of the SPCTLx control registers. When serial port data packing is enabled (PACK = 1), the transmit and receive interrupts are generated for the 32-bit packed words, not for each 16-bit word.
The following sections present an overview of serial port DMA operations; additional details are covered in the DMA chapter of this manual.
• For information on SPORT DMA Channel Setup, see Setting Up Serial Port DMA on page 6-100.
• For information on SPORT DMA Parameter Registers, see Serial Port DMA on page 6-95.
• For information on SPORT DMA Chaining, see Chaining DMA Processes on page 6-25.
Setting Up DMA on SPORT Channels
Each SPORT DMA channel has an enable bit (SDEN) in its SPCTLx control register. When DMA is disabled for a particular channel, the SPORT generates an interrupt every time it receives a data word or whenever there is a vacancy in the transmit buffer. For more information, see Single-Word Transfers on page 10-65.
Each channel also has a DMA chaining enable bit (SCHEN) in its SPCTLx control register.
To set up a serial port DMA channel, write a set of memory buffer parameters to the SPORT DMA parameter registers shown in Table 10-9. Load the II, IM, and C registers with a starting address for the buffer, an address modifier, and a word count, respectively. These registers can be written from the core processor or from an external processor.
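To make the roles of the II, IM, and C parameters concrete, the sketch below models one DMA channel in software: each step uses the current index as the transfer address, adds the signed modifier, and decrements the count, reporting when the count reaches zero. This is an illustrative model only, not the controller's implementation; the struct `sport_dma_t`, the function `sport_dma_step`, and the example start address are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of one SPORT DMA channel's II/IM/C registers. */
typedef struct {
    uint32_t ii;   /* index: address used for the next word */
    int32_t  im;   /* modifier: signed step applied after each transfer */
    uint32_t c;    /* count: words remaining */
} sport_dma_t;

/* Model one transfer. Stores the address used in *addr and returns 1
 * when the count reaches zero (the point at which the channel's
 * interrupt would be raised), 0 otherwise. */
int sport_dma_step(sport_dma_t *ch, uint32_t *addr)
{
    assert(ch->c > 0);
    *addr = ch->ii;
    ch->ii += (uint32_t)ch->im;   /* unsigned wraparound handles negative steps */
    ch->c--;
    return ch->c == 0;
}
```

A negative modifier walks the buffer downward, which is why the modify value is treated as a signed integer.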
Y = A or B, and x = 0 - 3
Once serial port DMA is enabled, the processor's DMA controller automatically transfers received data words from the receive buffer to the buffer in internal memory. Likewise, when the serial port is ready to transmit data, the DMA controller automatically transfers a word from internal memory to the transmit buffer. The controller continues these transfers until the entire data buffer is received or transmitted. When the count register of an active DMA channel reaches zero (0), the SPORT generates the corresponding interrupt.
the address for the next DMA transfer. The modify value in the IM register is a signed integer, which provides capability for both incrementing and decrementing the buffer pointer.
Each DMA channel has a count register (CxA/CxB), which must be initialized with a word count that specifies the number of words to transfer. The count register decrements after each DMA transfer on the channel. When the word count reaches zero, the SPORT generates the interrupt for the channel and automatically disables the DMA channel.
Each SPORT DMA channel also has a chain pointer register (CPxA/CPxB) and a general-purpose register (GPxA/GPxB). The CPx register functions in chained DMA operations. The general-purpose registers can be used for any purpose. For more information on SPORT DMA chaining, see Serial Port DMA on page 6-95.
Table 10-10. SPORT DMA Parameter Registers Addresses
Register   Address       DMA Channel   SPORT Data Buffer
II0A       0x60          0             RX0A or TX0A
IM0A       0x61          0             RX0A or TX0A
C0A        0x62          0             RX0A or TX0A
CP0A       0x63          0             RX0A or TX0A
GP0A       0x64          0             RX0A or TX0A
Reserved   0x65-0x67
II0B       0x80          1             RX0B or TX0B
IM0B       0x81          1             RX0B or TX0B
C0B        0x82          1             RX0B or TX0B
CP0B       0x83          1             RX0B or TX0B
GP0B       0x84          1             RX0B or TX0B
Reserved   0x6D-0x6F
II1B       0x88          3             RX1B or TX1B
IM1B       0x89          3             RX1B or TX1B
C1B        0x8A          3             RX1B or TX1B
CP1B       0x8B          3             RX1B or TX1B
GP1B       0x8C          3             RX1B or TX1B
Reserved   0x8D-0x8F
II2A       0x70          4             RX2A or TX2A
IM2A       0x71          4             RX2A or TX2A
C2A        0x72          4             RX2A or TX2A
CP2A       0x73          4             RX2A or TX2A
GP2A       0x74          4             RX2A or TX2A
Reserved   0x75-0x77
II2B       0x90          5             RX2B or TX2B
IM2B       0x91          5             RX2B or TX2B
C2B        0x92          5             RX2B or TX2B
CP2B       0x93          5             RX2B or TX2B
GP2B       0x94          5             RX2B or TX2B
Reserved   0x7D-0x7F
II3B       0x98          7             RX3B or TX3B
IM3B       0x99          7             RX3B or TX3B
C3B        0x9A          7             RX3B or TX3B
CP3B       0x9B          7             RX3B or TX3B
GP3B       0x9C          7             RX3B or TX3B
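The entries listed in Table 10-10 follow a regular pattern: each SPORT occupies a block of eight addresses, A channels start at 0x60, B channels start at 0x80, and the five registers appear in the order II, IM, C, CP, GP. The sketch below encodes that pattern as observed from the listed rows. The function and enum names are hypothetical, and results for SPORT1 A and SPORT3 A (whose rows do not appear above) are an extrapolation that should be checked against the full table.

```c
#include <assert.h>
#include <stdint.h>

/* Register order within each block, as listed in Table 10-10. */
enum sport_dma_reg { REG_II = 0, REG_IM, REG_C, REG_CP, REG_GP };

/* Address of a SPORT DMA parameter register. sport is 0..3 and
 * chan_b is 0 for the A channel, nonzero for the B channel.
 * Pattern inferred from the listed rows (assumption). */
uint32_t sport_dma_reg_addr(unsigned sport, int chan_b, enum sport_dma_reg r)
{
    assert(sport <= 3);
    uint32_t base = (chan_b ? 0x80u : 0x60u) + 8u * sport;
    return base + (uint32_t)r;
}

/* DMA channel number: 0 for SPORT0 A, 1 for SPORT0 B, ... 7 for SPORT3 B. */
unsigned sport_dma_channel(unsigned sport, int chan_b)
{
    assert(sport <= 3);
    return 2u * sport + (chan_b ? 1u : 0u);
}
```

For example, the count register of SPORT2 channel A resolves to 0x72 on DMA channel 4, matching the C2A row.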
When the serial port channel (A or B) is programmed as a transmitter, only the corresponding transmit buffers (TXxA and TXxB) become active, while the receive buffers (RXxA and RXxB) remain inactive. Similarly, when SPORT channels A and B are programmed as receivers, only the corresponding RXxA and RXxB buffers are activated. When performing core-driven transfers, write to the buffer designated by the DDIR bit setting in the SPCTLx register. For DMA-driven transfers, the serial port logic moves data between internal memory and the appropriate buffer, depending on the DDIR bit setting.
If the core reads or writes an inactive SPORT data buffer while the port is enabled, the core hangs. For example, if a SPORT is programmed to be a transmitter and the core reads from the receive buffer of the same SPORT, the core hangs just as it would if it were reading an empty buffer that is currently active. The core remains locked up until the SPORT is reset.
Therefore, set the direction bit, the serial port enable bit, and DMA enable bits before initiating any operations on the SPORT data buffers. If the processor operates on the inactive transmit or receive buffers while the SPORT is enabled, it can cause unpredictable results.
SPORT DMA Chaining
In chained DMA operations, the processor's DMA controller automatically sets up another DMA transfer when the contents of the current buffer have been transmitted (or received). The chain pointer register (CPx) functions as a pointer to the next set of buffer parameters stored in memory. The DMA controller automatically downloads these buffer parameters to set up the next DMA sequence. For more information on SPORT DMA chaining, see Serial Port DMA on page 6-95.
DMA chaining occurs independently for the transmit and receive channels of each serial port. Each SPORT DMA channel has a chaining enable bit (SCHEN) that when set (= 1) enables DMA chaining and when cleared (= 0) disables DMA chaining. Writing all zeros to the address field of the chain pointer register (CPx) also disables chaining.
Single-Word Transfers
Individual data words may also be transmitted and received by the serial ports, with interrupts occurring as each 32-bit word is transmitted or received. When a serial port is enabled and DMA is disabled, the SPORT DMA interrupts are generated whenever a complete 32-bit word has been received in the receive buffer, or whenever the transmit buffer is not full. Single-word interrupts can be used to implement interrupt-driven I/O on the serial ports.
When the processor core's program reads a word from a serial port's receive buffer or writes a word to its transmit buffer, check the buffer's full/empty status to avoid hanging the processor core. (This can also happen to an external device, for example a host processor, when it is reading
or writing a serial port buffer.) The full/empty status can be read in the DXS bits of the SPCTLx register. Reading from an empty receive buffer or writing to a full transmit buffer causes the processor (or external device) to hang, waiting for the status to change.
To support debugging buffer transfers, the processor has a Buffer Hang Disable (BHD) bit. When set (= 1), this bit prevents the processor core from detecting a buffer-related stall condition, permitting debugging of this type of stall condition. For more information, see the BHD discussion on page 6-43.
Multiple interrupts can occur if both SPORTs transmit or receive data in the same cycle. Any interrupt can be masked in the IMASK register; if the interrupt is later enabled in IMASK, the corresponding interrupt latch bit in IRPTL must be cleared in case the interrupt has occurred in the meantime.
When serial port data packing is enabled (PACK = 1 in the SPCTLx control registers), the transmit and receive interrupts are generated for 32-bit packed words, not for each 16-bit word.
B0=source;
L0=@source;
B1=dest;
L1=@dest;
ustat3=dm(SYSCON);
bit clr ustat3 BHD;      /* Clear BHD: core buffer hang stays enabled */
dm(SYSCON)=ustat3;
bit set mode1 CBUFEN;    /* Enable circular buffers */
r0 = 0x00001000;         /* Set the SPL bit in the SPxxMCTL register to enable loopback */
dm(SP02MCTL)=r0;
r0 = 0x0;
dm(DIV0) = r0;           /* Externally generated clock and frame sync */
r0 = 0x000021f1;         /* Set bits SPEN_A, SLEN = 32, FSR: enable the A channel, set the word length to 32 bits, and require frame sync */
dm(SPCTL0)=r0;
r0=0x00270004;           /* TCLKDIV = [CCLK (96 MHz) / (2 x SCLK (9.6 MHz))] - 1 = 0x0004 */
                         /* TFSDIV = [SCLK (9.6 MHz) / TFS (0.24 MHz)] - 1 = 0x0027 */
dm(DIV2)=r0;
r0=0x20065f1;            /* Set bits SPEN_A, SLEN = 32, ICLK, IFS, FSR, DDIR: enable the A channel, set the word length to 32 bits, generate internal frame sync and clock, require frame sync, and set for transmit */
dm(SPCTL2)=r0;
lcntr = N, do send until LCE;
r1=dm(i0,1);             /* Test data to be transmitted */
dm(TX2A)=r1;             /* Send data to buffer */
r0=dm(RX0A);             /* Read data from buffer */
send: dm(i1,1)=r0;       /* Store data */
B0=source;
L0=@source;
B1=dest;
L1=@dest;
ustat3=dm(SYSCON);
bit clr ustat3 BHD;              /* Clear BHD: core buffer hang stays enabled */
dm(SYSCON)=ustat3;
bit set imask SP0I | SP2I;       /* Unmask SPORT0 and SPORT2 interrupts */
bit set mode1 CBUFEN | IRPTEN;   /* Enable circular buffers and global interrupts */
r0 = 0x00001000;                 /* Set the SPL bit in the SPxxMCTL register to enable loopback */
dm(SP02MCTL)=r0;
r0 = 0x0;
dm(DIV0) = r0;                   /* Externally generated clock and frame sync */
r0 = 0x000021f1;                 /* Set bits SPEN_A, SLEN = 32, FSR: enable the A channel, set the word length to 32 bits, and require frame sync */
dm(SPCTL0)=r0;
r0=0x00270004;                   /* TCLKDIV = [CCLK (96 MHz) / (2 x SCLK (9.6 MHz))] - 1 = 0x0004 */
                                 /* TFSDIV = [SCLK (9.6 MHz) / TFS (0.24 MHz)] - 1 = 0x0027 */
dm(DIV2)=r0;
r0=0x20065f1;                    /* Set bits SPEN_A, SLEN = 32, ICLK, IFS, FSR, DDIR: enable the A channel, set the word length to 32 bits, generate internal frame sync and clock, require frame sync, and set for transmit */
dm(SPCTL2)=r0;
wait: idle;
jump wait;
IRQ:                             /* Interrupt service routine */
r1=dm(i0,1);                     /* Test data to be transmitted */
dm(TX2A)=r1;                     /* Send data to buffer */
r0=dm(RX0A);                     /* Read data from buffer */
dm(i1,1)=r0;                     /* Store data */
rti;
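The divisor word 0x00270004 written to DIV2 in the listings can be reproduced from the two formulas given in the listing comments. The C sketch below performs that arithmetic on a host. It assumes, based only on the value used in the listing, that the frame sync divisor sits in the upper 16 bits and the clock divisor in the lower 16 bits; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* CLKDIV = CCLK / (2 * SCLK) - 1, per the listing comment. */
uint32_t sport_clkdiv(uint32_t cclk_hz, uint32_t sclk_hz)
{
    assert(sclk_hz > 0 && cclk_hz >= 2u * sclk_hz);
    return cclk_hz / (2u * sclk_hz) - 1u;
}

/* FSDIV = SCLK / FS - 1, per the listing comment. */
uint32_t sport_fsdiv(uint32_t sclk_hz, uint32_t fs_hz)
{
    assert(fs_hz > 0 && sclk_hz >= fs_hz);
    return sclk_hz / fs_hz - 1u;
}

/* Pack as in the listing value: FSDIV in the upper 16 bits and CLKDIV
 * in the lower 16 bits (field widths are an assumption). */
uint32_t sport_div_word(uint32_t clkdiv, uint32_t fsdiv)
{
    assert(clkdiv <= 0xFFFFu && fsdiv <= 0xFFFFu);
    return (fsdiv << 16) | clkdiv;
}
```

With a 96 MHz core clock, a 9.6 MHz serial clock, and a 0.24 MHz frame sync, the helpers give CLKDIV = 4 and FSDIV = 39 (0x27), which pack to the value used in the listings.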
The ADSP-21161 processor is equipped with a synchronous serial peripheral interface port that is compatible with the industry-standard Serial Peripheral Interface (SPI). The SPI port supports communication with a variety of peripheral devices including CODECs, data converters, sample rate converters, S/PDIF or AES/EBU digital audio transmitters and receivers, LCDs, shift registers, microcontrollers, and FPGA devices with SPI emulation.
The processor's SPI port provides the following features and capabilities:
• A simple four-wire interface consisting of two data pins, a device select pin, and a clock pin
• Full-duplex operation that allows the ADSP-21161 processor to transmit and receive data simultaneously on the same port
• Special data formats to accommodate little and big endian data, different word lengths, and packing modes
• Master and slave modes as well as multi-master mode in which the ADSP-21161 processor can be connected to up to four other SPI devices
• Open drain outputs to avoid data contention and to support multi-master scenarios
• Programmable baud rates, clock polarities, and phases
• Slave booting from a master SPI device
Functional Description
The SPI interface has two shift registers: the transmit shift register (TXSR) and the receive shift register (RXSR). TXSR serially transmits data and RXSR receives data synchronously with the SPI clock signal (SPICLK). Figure 11-1 provides a block diagram of the ADSP-21161 processor SPI interface. The data is shifted into or out of the shift registers on two separate pins: the Master In Slave Out (MISO) pin and the Master Out Slave In (MOSI) pin.
Figure 11-1. SPI Block Diagram (pins MOSI, MISO, SPICLK, SPIDS, and FLAGx; 32-bit RX and TX shift registers)
During data transfers one SPI device acts as the SPI master by controlling the data flow. It does this by generating the SPICLK and asserting the SPI device select signal (SPIDS). The SPI master receives data using the MISO pin and transmits using the MOSI pin. The other SPI device acts as the SPI
slave by receiving new data from the master into its receive shift register using the MOSI pin. It transmits requested data out of the transmit shift register using the MISO pin.
The SPI has two 2-deep FIFOs: the transmit data buffer (SPITX) and the receive data buffer (SPIRX). Data to be transmitted is written to SPITX and then automatically transferred into the transmit shift register. Once a full data word has been received in the receive shift register, the data is automatically transferred into SPIRX, from which it can be read.
Programmable FLAGx pins provide slave selection. These pins are connected to the SPIDS of the slave devices.
In a multi-master or multi-device ADSP-21161 processor environment in which multiple ADSP-21161 processors are connected via their SPI ports, all MOSI pins are connected together, all MISO pins are connected together, and the SPICLK pins are connected together as well. The FLAGx pins are connected to each of the slave SPI devices in the system via the SPIDS pins.
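The full-duplex exchange between the master and slave shift registers can be illustrated with a small software model. In the sketch below, each clock moves one bit from the master's TXSR to the slave's RXSR over MOSI and one bit from the slave's TXSR to the master's RXSR over MISO. It assumes MSB-first ordering and ignores clock polarity, phase, and the FIFOs; the type `spi_port_t` and the function `spi_transfer` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal model of one SPI port's shift registers. */
typedef struct {
    uint32_t txsr;   /* transmit shift register */
    uint32_t rxsr;   /* receive shift register */
} spi_port_t;

/* Clock `bits` bits between master and slave, MSB first (assumption).
 * After the call each side's rxsr holds the other side's txsr word. */
void spi_transfer(spi_port_t *master, spi_port_t *slave, unsigned bits)
{
    assert(bits >= 1 && bits <= 32);
    master->rxsr = 0;
    slave->rxsr = 0;
    for (unsigned i = 0; i < bits; i++) {
        unsigned pos = bits - 1 - i;
        uint32_t mosi = (master->txsr >> pos) & 1u;  /* master drives MOSI */
        uint32_t miso = (slave->txsr  >> pos) & 1u;  /* slave drives MISO */
        slave->rxsr  = (slave->rxsr  << 1) | mosi;
        master->rxsr = (master->rxsr << 1) | miso;
    }
}
```

The model shows why a master always receives a word whenever it sends one: both directions are clocked by the same SPICLK edges.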
SPICLK
The Serial Peripheral Interface Clock (SPICLK) signal is driven by the master device and controls the data transfer rate. It is an output signal if the device is configured as a master and an input signal if the device is configured as a slave. The master transmits data at a variety of baud rates. The SPICLK signal cycles once for each bit transmitted. The SPICLK signal is a gated clock that is active during data transfers, only for the length of the transferred word. SPICLK is configured with the BAUDR bits in the SPICTL register. The SPICLK clock rate (baud rate) can go as high as the core clock frequency divided by 8. The number of active clock edges is equal to the number of bits driven on the data lines. Slave devices ignore the serial clock if the slave select input SPIDS is driven inactive.
Figure 11-2. Master-Slave Interconnections
The SPICLK signal shifts out and shifts in the data driven on the MISO and MOSI lines. The data is shifted out on one clock edge and sampled on the opposite clock edge. To define the transfer format, clock polarity and clock phase relative to data can be programmed into the SPICTL control register.
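The BAUDR field description in the SPICTL register gives the SPICLK rate as CCLK / 2^(2 + BAUDR). The sketch below evaluates that expression for a given core clock. The function name is hypothetical, the 4-bit field width is taken from the bit listing for SPICTL, and the result truncates when the core clock is not an exact multiple of the divisor. Since the text above limits the rate to the core clock divided by 8, whether BAUDR = 0 is usable should be treated as unverified.

```c
#include <assert.h>
#include <stdint.h>

/* SPICLK rate in Hz for a given core clock and BAUDR value (0..15),
 * using rate = CCLK / 2^(2 + BAUDR). */
uint32_t spi_baud_hz(uint32_t cclk_hz, unsigned baudr)
{
    assert(baudr <= 15);
    return cclk_hz >> (2 + baudr);
}
```

With a 96 MHz core clock, BAUDR = 1 corresponds to the core clock divided by 8, that is, 12 MHz.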
SPIDS
The Serial Peripheral Interface Slave Device Select (SPIDS) signal is an active low signal used to enable an SPI port of an ADSP-21161 processor that is configured as a slave device. This input-only pin behaves like a chip select and is provided by the master device for the slave devices. For a master device, this signal can act as an error signal input in a multi-master environment. In multi-master mode, SPIDS can be asserted (driven low) to a master device to signal that another device is trying to be the master device. In this case, the ADSP-21161 processor's SPIDS signal is used as an input error signal from the slave device. If this signal is asserted low when the device is in master mode, it is considered a multi-master error. For a single-master, multiple-slave configuration in which FLAG0-3 are used as slave selects, SPIDS must be tied high to VDD. For ADSP-21161 processor to ADSP-21161 processor SPI interaction, any of the master processor's FLAG0-3 pins can be used to drive the SPIDS signal on the SPI slave device.
FLAG
The Flag (FLAGx) pins are general-purpose bidirectional I/O data pins. Each FLAG pin can be programmed as an input or output. For SPI, the FLAG3-0 pins are used to select slaves in a system that has multiple SPI devices. When the FLAG pins are used to select an SPI slave and the PSSE and FLS bits are enabled, SPI has higher priority than the core for use of the pins. If PSSE is set (= 1), all four flags become slave selects.
If a particular GPIO is programmed as an output, and the PSSE feature on that flag pin is enabled at the same time, the FLAG register bit is not reflected on the flag pin. However, if the pin is programmed as an input, the status of the pin is reflected in the FLAG register. The SPI state machine drives this pin for the slave SPI device and the status is updated in the FLAG register. When this pin is used to drive SPIDS while some other device is using it as GPIO, for example, the other device should not drive any data on this pin.
For related flag discussions, see the following sections:
• Automatic Slave Selection on page 11-26
• Core-Based Flag Pins on page 13-34
MOSI
The Master Out Slave In (MOSI) pin is one of the bidirectional I/O data pins. If the ADSP-21161 processor is configured as a master, the MOSI pin is a data transmit pin used to transmit output data. If the ADSP-21161 processor is configured as a slave, the MOSI pin is a data receive pin used to receive input data. In a system that has multiple SPI devices, data shifts out from the MOSI output pin of the master and into the MOSI input(s) of the slave(s).
MISO
The Master In Slave Out (MISO) pin is one of the bidirectional I/O data pins. If the ADSP-21161 processor is configured as a master, the MISO pin is a data receive pin used to receive input data. If the ADSP-21161 processor is configured as a slave, the MISO pin is a data transmit pin used to transmit output data. In a system that has multiple SPI devices, the data shifts out from the MISO output pin of the slave and into the MISO input pin of the master. Only one slave may transmit data at any given time. The user application code must ensure that when multiple devices are selected to transmit data from the master, only one slave will respond with data to be transmitted back to the master during the active transfer. The DMISO bit in the SPICTL register can be programmed to accomplish this. Figure 11-3 illustrates an example of an ADSP-21161 processor SPI interface where the processor is the SPI master. When using the SPI interface, the processor can be directed to alter the conversion resources, mute, modify the volume, and power down the AD1855 stereo DAC.
Figure 11-3. ADSP-21161 as SPI Master (ADSP-21161 master device with SPICLK, FLAG0, and MOSI; AD1855 stereo 96 kHz DAC with CCLK, CLATCH, and DATA)
Another SPI configuration example, shown in Figure 11-4, illustrates how the ADSP-21161 processor can be used as the SPI slave device. The 8-bit host microcontroller is the SPI master. The processor can be booted via its SPI interface to download user application code and data prior to runtime.
Figure 11-4. (ADSP-21161 as slave SPI device)
SPI Interrupts
The SPI port has two interrupts: a transmit interrupt and a receive interrupt. If DMA is enabled, a maskable interrupt occurs when the DMA block transfer has completed. If DMA is disabled, the core processor may read from the SPIRX data buffer or write to the SPITX data buffer. To enable an interrupt, program the SPIRX interrupt enable (SPRINT) or the SPITX interrupt enable (SPTINT) in the SPICTL register. The SPIRX and SPITX buffers are memory-mapped IOP registers. A maskable interrupt is generated when the receive buffer is not empty or the transmit buffer is not full.
In order for the SPI hardware to work properly, interrupts must always be enabled in the SPICTL register. If interrupts are not wanted or needed, they can be masked at a higher level in the LIRPTL register or the IMASK registers.
The transmit interrupt vector location (0x44) is used for both core-driven transmit interrupts and DMA-driven transmit interrupts. The receive interrupt vector location (0x40) is used for both core-driven receive interrupts and DMA-driven receive interrupts.
In order to use SPI interrupts, set the IRPTEN bit in the MODE1 register, unmask the LPISUMI bit in the IRPTL register, and unmask the SPIRMSK bit or SPITMSK bit in the LIRPTL register. See Interrupt Latch Register (IRPTL) on page A-27 for IRPTL register bit descriptions. See Link Port Interrupt Register (LIRPTL) on page A-34 for LIRPTL register bit descriptions.
For revisions 0.3, 1.0, and 1.1 silicon, the SPI transmit and receive FIFOs cannot be cleared by disabling the SPI port via SPICTL. To clear the SPI receive FIFO, the application program must execute up to two dummy core reads from the SPIRX register. The number of reads needed depends on the number of words in the FIFO as shown in the FIFO buffer status. To clear out the SPITX FIFOs, clear all the FLS bits and then poll the SPITX buffer status in the SPISTAT register. Note that when the FLS bits are not set, there are no slave devices selected. However, the data will still be driven on the appropriate data pin. This FIFO clear operation may be important if you need to reprogram the SPI port to communicate with a new slave, or to change from a master to a slave SPI device.
The default value of the SPICTL register at reset is 0x00000000. The value of the SPICTL register at slave boot is 0x0A001F81.
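As a worked example, the slave boot value 0x0A001F81 can be decoded into a few of its fields. The sketch below extracts only the fields whose bit positions appear in this chapter's bit listing for SPICTL: SPIEN at bit 0, MS at bit 3, WL at bits 8-7, and BAUDR at bits 12-9. Those positions are assumptions taken from that listing and should be confirmed against the register figure; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Subset of SPICTL fields (bit positions are assumptions, see lead-in). */
typedef struct {
    unsigned spien;   /* bit 0: SPI system enable */
    unsigned ms;      /* bit 3: 0 = slave, 1 = master */
    unsigned wl;      /* bits 8:7: word length code */
    unsigned baudr;   /* bits 12:9: baud rate field */
} spictl_fields_t;

/* Extract the modeled fields from a raw SPICTL value. */
spictl_fields_t spictl_decode(uint32_t v)
{
    spictl_fields_t f;
    f.spien = v & 1u;
    f.ms    = (v >> 3) & 1u;
    f.wl    = (v >> 7) & 3u;
    f.baudr = (v >> 9) & 0xFu;
    return f;
}

/* Word length in bits for a WL code: 00 = 8, 01 = 16, 11 = 32.
 * Returns 0 for the reserved code 10. */
unsigned spictl_word_bits(unsigned wl)
{
    static const unsigned bits[4] = { 8, 16, 0, 32 };
    assert(wl <= 3);
    return bits[wl];
}
```

Under these assumptions the slave boot value decodes to an enabled port in slave mode with 32-bit words, which is consistent with the processor booting as an SPI slave.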
SPICTL (0xB4) register fields:
SPIEN    SPI System Enable. 1=enable, 0=disable
SPRINT   SPI RX Buffer Interrupt Enable. 1=enable SPI IRQ on RXB not empty, 0=disable
SPTINT   SPI TX Buffer Interrupt Enable. 1=enable SPI IRQ on TXB not full, 0=disable
MS       Master/Slave Mode Bit. 0=SPI slave device, 1=SPI master device
CP       Clock Polarity. 0=SPICLK active high, low in idle state; 1=SPICLK active low, high in idle state
CPHASE   Clock Phase. 0=SPICLK toggles at middle of 1st data bit, 1=SPICLK toggles at beginning of 1st data bit
DF       Data Format. 0=LSB sent/received first, 1=MSB sent/received first
WL       Word Length. 00=8 bits, 01=16 bits, 11=32 bits, 10=reserved
BAUDR    Baud Rate. SPICLK = CCLK / 2^(2 + BAUDR)
TDMAEN   Transmit DMA Enable. 1=enable, 0=disable
PSSE     Programmable Slave Select Enable. 0=disable, 1=enable
FLS0     FLAG0 Slave Device Select. 1=enable, 0=disable
FLS1     FLAG1 Slave Device Select. 1=enable, 0=disable
FLS2     FLAG2 Slave Device Select. 1=enable, 0=disable
FLS3     FLAG3 Slave Device Select. 1=enable, 0=disable
NSMLS    Non-Seamless Operation. 0=no delay, 1=delay before next word starts
DCPH0    Deselect SPIDS in CPHASE=0 (master mode only, NSMLS bit=1). 0=slaves are not deselected between transfers, 1=deselects slaves between successive transfers
DMISO    Disable MISO Pin (Broadcast). 0=MISO enabled, 1=MISO disabled
OPD      Open Drain Output Enable for Data Pins. 0=normal, 1=open drain
RDMAEN   Receive DMA Enable. 1=enable, 0=disable
PACKEN   8-bit Packing Enable. 0=no packing, 1=8 to 32-bit packing
SGN      Sign Extend Data. 0=no sign extend, 1=sign extend
SENDLW   Send Zero/Repeat Byte When TXB Empty. 0=send zero, 1=repeat last data
GM       Fetch/Discard Incoming RXB Data When RXB Full. 0=discard incoming data, 1=overwrite with new data
SPICTL register bit positions:
Bit(s)   Name
3        MS
4        CP
7-8      WL
9-12     BAUDR
15-18    FLS
19       NSMLS
20       DCPH0
26       OPD
29       SGN
30       SENDLW
31       GM
The SPTINT and CPHASE bits are also described in this table.
Baud Rate Example
The BAUDR bits of the SPICTL register set the baud rate using the following formula:

$$f_{SPICLK} = \frac{\text{core clock}}{2^{(2 + BAUDR)}}$$

If the core clock is 100 MHz and the BAUDR bits are 0xD (13), the SPICLK frequency is determined as follows:

$$f_{SPICLK} = \frac{100\ \text{MHz}}{2^{(2 + 13)}} \approx 3052\ \text{Hz}$$
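The baud rate formula can be checked with a few lines of host-side Python. This is only an illustrative sketch: the function name `spiclk_hz` is hypothetical, and the 0 to 15 range check assumes that BAUDR is the 4-bit field at bits 9-12 of SPICTL.

```python
def spiclk_hz(core_clock_hz, baudr):
    """Return the SPICLK frequency in Hz for a given BAUDR field value.

    Implements f_SPICLK = core clock / 2^(2 + BAUDR).
    """
    if not 0 <= baudr <= 0xF:
        # Assumption: BAUDR occupies four bits (9-12), so it ranges over 0..15.
        raise ValueError("BAUDR must be in the range 0..15")
    return core_clock_hz / 2 ** (2 + baudr)


# Worked example from the text: 100 MHz core clock, BAUDR = 0xD (13).
example_hz = spiclk_hz(100e6, 0xD)
```

With these inputs, `example_hz` rounds to 3052 Hz, matching the worked example.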
Seamless Operation
The SPI port can transmit words seamlessly without delay by clearing (=0) the NSMLS bit in the SPICTL register. When seamless operation is disabled (NSMLS=1), there is a delay between word transfers from the SPI master. During this delay, the state machine disables and enables the slaves for DCPH0 = 1. The delay between words is 2.5 SPICLK cycles. Some slower slaves need time between data transfers to receive data and move new data for transmitting into the shift register. Set the NSMLS bit in the master device in order to create enough delay for the slave to perform data transfers.
SPISTAT (0xB5) register fields:
RXS    SPIRX Data Buffer Status (read-only). 00=SPIRX empty, 01=SPIRX partially full, 11=SPIRX full, 10=reserved
SPIF   SPI Transmit Transfer Complete. 1=transfer complete, 0=active transfer
MME    Multimaster Error. 0=no error, 1=SPIDS~ asserted by slave
RBSY   Reception Error (Overflow). 1=new data received with full RXB FIFO. SPI enters idle mode if master device
TXE    Transmission Error (Underflow). 1=no new data in TX FIFO. SPI enters idle mode if master device
TXS    SPITX Data Buffer Status (read-only). 00=SPITX empty, 01=TXB partially full, 11=SPITX full, 10=reserved
master or slave timing diagrams since the SPICLK, MISO, and MOSI pins are directly connected between the master and the slave. The MISO signal is the output from the slave (slave transmission), and the MOSI signal is the output from the master (master transmission). The SPICLK signal is generated by the master, and the SPIDS signal is the slave device select input to the slave from the master. The diagram represents an 8-bit transfer (WL=00) with MSB first (DF=1). Any combination of the WL and DF bits of the SPICTL register is allowed. For example, a 32-bit transfer with LSB first is also a possible configuration.
Figure 11-7. SPI Transfer Protocol for CPHASE = 0
[Timing diagram: data is shifted from MSB to LSB; * marks undefined data.]

The clock polarity and the clock phase must be identical for the master device and the slave device involved in the communication link. The transfer format from the master may be changed between transfers to adjust to various requirements of a slave device.
Figure 11-8. SPI Transfer Protocol for CPHASE = 1
[Timing diagram: data is shifted from MSB to LSB; * marks undefined data.]

Enable DCPH0 (bit 20) (=1) to make the slave select line, SPIDS, inactive (HIGH) between each serial transfer. This is controlled automatically by hardware logic. This feature is available in both CPHASE=0 and CPHASE=1. The standard SPI peripherals use this mode only in CPHASE=0. Clearing the DCPH0 bit (=0) keeps SPIDS active low throughout the entire data transfer for both CPHASE=0 and CPHASE=1.

Table 11-4. SPICLK Driving and Latching Edges for SPI Data Transfers
Phase   Polarity   Driving Edge of SPICLK   Latching Edge of SPICLK
0       0          Falling Edge             Rising Edge
0       1          Rising Edge              Falling Edge
1       0          Rising Edge              Falling Edge
1       1          Falling Edge             Rising Edge
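Table 11-4 follows a simple pattern: the driving edge is falling when phase and polarity are equal and rising otherwise, and the latching edge is always the opposite edge. The following Python sketch encodes that observation; the function name `spiclk_edges` is hypothetical and is used here only to illustrate the table.

```python
def spiclk_edges(cphase, cp):
    """Return (driving_edge, latching_edge) of SPICLK per Table 11-4.

    cphase is the clock phase bit and cp is the clock polarity bit.
    """
    if cphase not in (0, 1) or cp not in (0, 1):
        raise ValueError("cphase and cp must each be 0 or 1")
    if cphase == cp:
        return ("falling", "rising")
    return ("rising", "falling")


# Reproduce all four rows of the table as (phase, polarity) -> edges.
table_11_4 = {(ph, pol): spiclk_edges(ph, pol) for ph in (0, 1) for pol in (0, 1)}
```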
Interrupt and DMA Driven Transfers
For interrupt driven transfers, the SPTINT or SPRINT bit should be set in the SPICTL register. The interrupt routine in the user software is expected to perform the data transfer. For DMA driven transfers, the RDMAEN or TDMAEN bits must be set in the SPICTL register. The DMA controller does the data transfer automatically. An interrupt is generated at the end of the DMA transfer. For more information on interrupts, see SPI Interrupts on page 11-8.

Interrupts or DMA requests are automatically generated when the transmit buffer is partially empty or when the receive buffer is partially full. In the event that the SPITX and SPIRX interrupts are not serviced, or a higher priority DMA occurs, resulting in the transmit buffer becoming empty or the receive buffer becoming full, the SPI device will stall the SPI clock until all the data is read from the receive buffer or a piece of data is written to the transmit buffer.

Core Driven Transfers
For core driven SPI transfers, SPTINT and SPRINT are enabled in the SPI, and the corresponding interrupt masks SPIRMSK and SPITMSK are disabled in the LIRPTL register. The user software has to read from SPIRX or write to SPITX. If the transmit buffer becomes empty or the receive buffer becomes full, the SPI device will stall the SPI clock until all the data is read from the receive buffer or a piece of data is written to the transmit buffer.

Automatic Slave Selection
Multiple slaves are automatically controlled (selected and deselected) during the SPI transfer by enabling the PSSE bit in the SPICTL register. This bit locks all four flag pins (FLAG0, FLAG1, FLAG2 and FLAG3) as SPI slave selects. By writing to the FLS bits (bits 15-18) in the SPICTL register, the corresponding FLAG bits are programmed as outputs for slave selection.
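As an illustration of how the FLS bits map onto the four flag pins, the following Python sketch builds the FLS portion of an SPICTL value. It is a host-side model only. The helper name `fls_field` is hypothetical, and the mapping of FLS0 to bit 15 through FLS3 to bit 18 is an assumption based on the stated bit range.

```python
FLS_SHIFT = 15  # Assumption: FLS0 is bit 15 and FLS3 is bit 18 of SPICTL.


def fls_field(flags):
    """Return the SPICTL bits that select the slaves wired to the given FLAG pins.

    flags is an iterable of flag numbers (0..3), one per slave to select.
    """
    value = 0
    for flag in flags:
        if flag not in (0, 1, 2, 3):
            raise ValueError("flag numbers must be in the range 0..3")
        value |= 1 << (FLS_SHIFT + flag)
    return value


single_slave = fls_field([2])          # select only the slave on FLAG2
broadcast = fls_field([0, 1, 2, 3])    # select all four slaves at once
```

The PSSE bit must also be set for the flags to be driven as slave selects; its bit position is not given in this section, so it is left out of the sketch.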
To enable the different slaves, connect the slave SPIDS pins to the programmable flag pins FLAG0-3 of the master ADSP-21161. Since these flags are NOT open drain, slave select pins (FLAGS) cannot be shorted together in a multimaster environment. To control slave selects, external glue logic is required in a multimaster environment.

Enable the SPI port by setting the SPIEN bit in the SPICTL register. The master's flag pins are asserted low and the SPIDS signals of the slaves are asserted. Upon completion of the transfer, the FLAG pins are de-asserted, and slave selection is subsequently disabled.

During data transfers, if the SPI clock is stalled, the slaves are automatically deselected by de-asserting the flags in the master. Once data transmission becomes possible, the slaves are automatically selected again by asserting the flags in the master. When DCPH0 is set, the slaves are automatically deselected and selected again by de-asserting and asserting the flags in the master. This is done automatically in the SPI. There is a one cycle latency for a flag output to change after writing to the SPICTL register (when PSSE is set and the flag is enabled).

To use the PSSE feature, systems can have five SPI devices with the ADSP-21161 as the master. The PSSE is programmed for slave selection of the other four devices. The ADSP-21161 processor can broadcast to all four slaves at once or can write to individual slaves by appropriately programming the FLS bits.

User Controlled Slave Selection
The user can also control the slaves without enabling the PSSE bit in the SPI. The user can set or clear the I/O flags directly by writing a 1 or 0 into the FLAG register. The user can also emulate DCPH0 operation by setting or clearing the values in the FLAG register at the appropriate time.
When using this mode, the following sequence should be followed to ensure proper data transfer according to the SPI protocol.
1. Enable the SPI by writing into the SPICTL register.
2. Assert the required slave select by writing a zero into the appropriate bit in the FLAG register.
3. Load SPITX with the required data by enabling DMAs, interrupts, or by performing core writes to SPITX.
5. For interrupt driven core transfers to or from SPITX or SPIRX, enable bits SPTINT and SPRINT in the SPICTL register. An SPI interrupt occurs when SPITX is partially empty or when the receive buffer SPIRX is partially full.
6. For duplex DMA transfers, enable RDMAEN and TDMAEN in SPICTL. DMA requests are generated when SPITX is partially empty or when SPIRX is partially full. The DMA controller then transfers data between internal memory and the SPI data buffers.

Interrupts and DMA requests are automatically generated when the transmit buffer is partially empty or when the receive buffer is partially full. In case of DMA driven or core driven transfers, if the transmit buffer becomes empty or the receive buffer becomes full, the SPI device continues to operate based on the conditions of the SENDLW and GM bits.

If the SENDLW bit is cleared (=0) and the transmit buffer is empty, the device repeatedly transmits 0s out on the MISO pin. If the SENDLW bit is set (=1) and the transmit buffer is empty, the device continues to retransmit the last word that was written to SPITX and transmitted. Retransmission of the data in SPITX occurs after the transmit buffer becomes empty.

If the GM bit is set (=1) and the receive buffer is full, the device continues to receive new data from the MOSI pin, overwriting the previous (older) data in the SPIRX buffer. If the GM bit is cleared (=0) and the receive buffer is full, the incoming data from the shift register is discarded and the SPIRX register is not updated. The register ignores the new data and retains the old information.
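The SENDLW and GM rules can be summarized in a small behavioral model. The Python sketch below is not a description of the hardware implementation. The function names are hypothetical, the transmit buffer is modeled as a plain list, and the receive buffer is simplified to a single held word so that only the overwrite or discard decision is shown.

```python
def next_tx_word(tx_buffer, last_word, sendlw):
    """Return the word shifted out next.

    If the transmit buffer holds data, its oldest word is sent. If it is
    empty, SENDLW=0 sends zero and SENDLW=1 repeats the last word sent.
    """
    if tx_buffer:
        return tx_buffer.pop(0)
    return last_word if sendlw else 0


def rx_after_overflow(held_word, new_word, gm):
    """Return the word held in SPIRX after new data arrives while it is full.

    GM=1 overwrites the older data, GM=0 discards the incoming word.
    """
    return new_word if gm else held_word
```

For example, with an empty transmit buffer and a last word of 0xAB, the model sends 0xAB when SENDLW is set and 0 when it is cleared.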
When the device is an SPI master, the behavior of this bit depends on the mode of data transfer. For DMA or interrupt driven data transfers, the SPICLK stalls as soon as both the SPITX and the TXSR become empty. There is NO transmission error in this case. For core driven data transfers, the error bit is set as soon as both the SPITX and the TXSR become empty. The SPI continues to transmit the next data as specified by the SENDLW bit in the SPICTL register.
Do not perform a normal core write of SPITX during DMA operation. A normal core read of SPITX can be done at any time and does not interfere with, or initiate, SPI transfers. Do not perform a normal core read of SPIRX during DMA operation. A normal core write of SPIRX can be done at any time and does not interfere with, or initiate, SPI transfers. Interrupts are generated based on DMA events that are configured in the SPICTL register.
SPI Booting
The ADSP-21161 processor allows a host SPI device to boot the processor on power-up RESET de-assertion. To enable the SPI booting mode, the EBOOT and BMS pins must be tied low, and the LBOOT pin must be tied high. When the processor comes out of reset, it starts the SPI boot process. The SPI is configured as a slave upon power-up. Therefore, after reset, the SPI waits for SPIDS and SPICLK from the SPI host to download the boot program.

The default value of the SPICTL register when the processor is configured for SPI boot is 0x0A00 1F81. The SPI port is enabled as a slave to receive 32-bit words in LSB-first format. DMA is enabled to facilitate loading the boot kernel. The DMISO bit is also enabled to avoid contention in the MISO pin in systems where multiple slave devices are to be booted simultaneously.

DMA channel 8 is used when downloading the boot kernel information to the processor. At reset, the DMA parameter registers for DMA channel 8 are initialized to their required values. Table 11-5 lists the initial values for these registers.

The ADSP-21161 SPI booting mode supports boots from 8-, 16-, or 32-bit host SPI devices. In SPI boot mode, the data word size in the shift register defaults to 32 bits. Therefore, for 8- or 16-bit hosts, data words are packed into the shift register to generate 32-bit words, which can be shifted into internal program memory.
The host initiates the booting operation by activating SPICLK and asserting the SPIDS signal to the active low state. The 256-word, boot-strapped instruction loader kernel is loaded 32 bits at a time, via the 32-bit SPI receive shift register (RXSR). To properly upload 256 instructions (48-bit words), the SPI DMA initially loads a DMA count of 0x180 (384) 32-bit words, which is equivalent to 0x100 (256) 48-bit words. The relationship between the 32-bit words received into the SPIRX register and the instructions that need to be placed in internal memory is described in Figure 11-9.

After the first 256 words are loaded, the interrupt associated with the SPI receive is activated. The processor jumps to the location for SPIRI_svc (0x40040) and executes the code located there. Typically, the first instruction at the SPI receive interrupt vector (SPIRI) is an RTI instruction, in which case the processor jumps to location 0x40005 where normal program execution continues.

Because most applications require more than 256 words of instructions and initialization data, a loader and a 256-word loader kernel are supplied with the tools. Use these tools to create code that automatically loads the rest of the application code and then overwrites itself with application code and data. For more information on the loader, see the development tools documentation.

The boot loader kernel supplied with the tools loads a combination of instructions with DMA into scratch locations and then writes the instructions to internal memory using the core via the PX register. The 256-word, boot-strapped instruction loader kernel is loaded 32 bits at a time, via the 32-bit SPI receive shift register, using a normal-word addressing scheme with two-column memory addresses. Figure 11-9 shows how SPI data is packed in internal memory.
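The relationship between the instruction count and the DMA count is plain bit arithmetic: 256 instructions of 48 bits occupy the same number of bits as 384 words of 32 bits. A minimal Python check, with the hypothetical helper name `dma_count_32`, might look like this:

```python
def dma_count_32(instruction_count):
    """Return the number of 32-bit words needed to carry 48-bit instructions."""
    total_bits = instruction_count * 48
    if total_bits % 32 != 0:
        # An odd number of instructions does not fill a whole 32-bit word.
        raise ValueError("instruction count must be even")
    return total_bits // 32


kernel_dma_count = dma_count_32(256)  # the 256-word loader kernel
```

For the 256-word loader kernel this gives 384, which is the DMA count of 0x180 quoted in the text.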
Figure 11-9. SPI Data Packing
[Figure: 32-bit words arrive on MOSI in the 32-bit receive shift register and pass through SPIRX. DMA channel 8 writes transfers #1 through #384 to DM[0x40000] through DM[0x4017F], where they form the 48-bit instructions PM48[0x40000] through PM48[0x400FF].]

The SPI Control Register (SPICTL) is configured to 0x0A00 1F81 upon reset during SPI boot. SPI transfers occur with the following default bit settings:
SPIEN = 1, SPI enabled
MS = 0, slave device
DF = 0, LSB first
WL = 11, 32-bit SPI receive shift register word length
DMISO = 1, MISO disabled
RDMAEN = 1, Receive DMA enabled
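The boot value can be cross-checked against these settings by extracting individual fields. The Python sketch below decodes only the fields whose bit positions appear in the SPICTL register description (MS at bit 3, WL at bits 7-8, BAUDR at bits 9-12). Placing SPIEN at bit 0 is an assumption read from the register figure, and the helper name `field` is hypothetical.

```python
SPICTL_BOOT = 0x0A001F81  # SPICTL value at SPI slave boot, from the text


def field(value, low_bit, width):
    """Extract a bit field of the given width starting at low_bit."""
    return (value >> low_bit) & ((1 << width) - 1)


boot_fields = {
    "SPIEN": field(SPICTL_BOOT, 0, 1),   # assumption: SPIEN is bit 0
    "MS": field(SPICTL_BOOT, 3, 1),
    "WL": field(SPICTL_BOOT, 7, 2),
    "BAUDR": field(SPICTL_BOOT, 9, 4),
}
```

Under these assumptions the decode gives SPIEN = 1, MS = 0 and WL = 0b11, which agrees with the listed settings, and BAUDR = 0xF.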
The SPIRX DMA channel 8 parameter registers are configured to DMA in 0x180 32-bit words into internal memory normal-word address space starting at 0x40000. Once the 32-bit DMA completes, the data is then accessed as 3-column, 48-bit instructions. For example, the processor executes a 256 (0x100) word loader kernel upon completion of the 32-bit, 0x180 word DMA.

For 16-bit SPI hosts, two words are shifted into the 32-bit receive shift register (RXSR) before a DMA transfer to internal memory occurs. For 8-bit SPI hosts, four words are shifted into the 32-bit receive shift register before a DMA transfer to internal memory occurs.

By default, the booting SPI expects to receive words into SPIRX seamlessly. This means that bits are received continuously without breaks. For different SPI host sizes, the processor expects to receive instructions and data packed in an LSW format. Figure 11-10 shows a pair of instructions packed for SPI booting using a 32-, 16-, and an 8-bit host.
Figure 11-10. Instruction Packing for 32-, 16-, or 8-Bit SPI Host Booting
Host          Words (in transmit order)
32-bit host   33445566, CCDD1122, 7788AABB
16-bit host   5566, 3344, 1122, CCDD, AABB, 7788
8-bit host    66, 55, 44, 33, 22, 11, DD, CC, BB, AA, 88, 77

The following sections examine how data is packed into internal memory during SPI booting for SPI host word widths of 32, 16, or 8 bits.
[Figure: 32-bit SPI host packing. Each 32-bit SPI word N arrives on MOSI, passes through SPIRX, and is transferred by DMA channel 8 to internal memory starting at 0x40000.]
The following is an example of 48-bit instructions to be executed at PM addresses 0x40000 and 0x40001:
[0x40000] 1122 33445566
[0x40001] 7788 AABBCCDD

The 32-bit SPI host would need to pack (prearrange data) as follows:
SPI word 1 = 0x33445566
SPI word 2 = 0xCCDD1122
SPI word 3 = 0x7788AABB
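The packing in this example is equivalent to laying the 48-bit instructions end to end, least significant bits first, and slicing the result into 32-bit words. The following Python sketch shows one way a host-side tool could do this. The function name `pack_for_32bit_host` is hypothetical, and this is not the loader supplied with the development tools.

```python
def pack_for_32bit_host(instructions):
    """Pack 48-bit instructions into 32-bit SPI words, in transmit order."""
    stream = 0
    for index, instruction in enumerate(instructions):
        if not 0 <= instruction < (1 << 48):
            raise ValueError("each instruction must fit in 48 bits")
        # Later instructions occupy higher bit positions of the stream.
        stream |= instruction << (48 * index)
    total_bits = 48 * len(instructions)
    if total_bits % 32 != 0:
        raise ValueError("an even number of instructions is required")
    return [(stream >> shift) & 0xFFFFFFFF for shift in range(0, total_bits, 32)]


# The two instructions from the example at PM 0x40000 and 0x40001.
spi_words = pack_for_32bit_host([0x112233445566, 0x7788AABBCCDD])
```

Here `spi_words` comes out as 0x33445566, 0xCCDD1122, 0x7788AABB, the same three SPI words listed above.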
[Figure: 16-bit SPI host packing. Two 16-bit words, word N and word N + 1, arrive on MOSI and form one 32-bit word, which DMA channel 8 transfers to internal memory.]
The following is an example of 48-bit instructions to be executed at PM addresses 0x40000 and 0x40001:
[0x40000] 1122 33445566
[0x40001] 7788 AABBCCDD

The 16-bit SPI host would need to pack (prearrange data) as follows:
SPI word 1 = 0x5566
SPI word 2 = 0x3344
SPI word 3 = 0x1122
SPI word 4 = 0xCCDD
SPI word 5 = 0xAABB
SPI word 6 = 0x7788

The initial boot of the 256-word loader kernel requires a 16-bit host to transmit 768 16-bit words. One 32-bit word is created from two packed 16-bit words. The SPI DMA count value of 0x180 is equivalent to 384 words. Therefore, the total number of 16-bit words loaded is 768.
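For narrower hosts, each 32-bit word is simply sent as smaller pieces, least significant piece first. The Python sketch below splits 32-bit words for a host of a given width and confirms the word count. The name `split_for_host` is hypothetical and the code is an illustration, not part of the supplied tools.

```python
def split_for_host(words32, host_bits):
    """Split 32-bit SPI words into host-width words, least significant first."""
    if host_bits not in (8, 16, 32):
        raise ValueError("host width must be 8, 16, or 32 bits")
    mask = (1 << host_bits) - 1
    host_words = []
    for word in words32:
        for shift in range(0, 32, host_bits):
            host_words.append((word >> shift) & mask)
    return host_words


# The three 32-bit words that carry the two example instructions.
words16 = split_for_host([0x33445566, 0xCCDD1122, 0x7788AABB], 16)

# Number of 16-bit transfers needed for the 0x180-word kernel DMA.
kernel_words16 = len(split_for_host([0] * 0x180, 16))
```

The result `words16` matches the six SPI words listed above, and `kernel_words16` is 768. Calling the same helper with a width of 8 yields the byte order used by an 8-bit host.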
Figure 11-13. 8-Bit SPI Host Packing
[Figure: four 8-bit words arriving on MOSI form one 32-bit word, which DMA channel 8 transfers to internal memory.]

The following is an example of 48-bit instructions to be executed at PM addresses 0x40000 and 0x40001:
[0x40000] 1122 33445566
[0x40001] 7788 AABBCCDD
The 8-bit SPI host would need to pack (prearrange data) as follows:
SPI word 1 = 0x66
SPI word 2 = 0x55
SPI word 3 = 0x44
SPI word 4 = 0x33
SPI word 5 = 0x22
SPI word 6 = 0x11
SPI word 7 = 0xDD
SPI word 8 = 0xCC
SPI word 9 = 0xBB
SPI word 10 = 0xAA
SPI word 11 = 0x88
SPI word 12 = 0x77

The initial boot of the 256-word loader kernel requires an 8-bit host to transmit 1536 8-bit words. The SPI DMA count value of 0x180 is equal to 384 words. Since one 32-bit word is created from four packed 8-bit words, the total number of 8-bit words transmitted is 1536.

Multiprocessor SPI Port Booting
In systems where multiple ADSP-21161 processors are connected and configured for SPI booting, the master ADSP-21161 (or any SPI master device) can boot up to four processors configured as SPI slaves. The processor uses four programmable flags, FLAG0-3, as dedicated SPI device-select signals for the SPI slave devices. The FLS bits in the SPICTL register correspond to these flags. Figure 11-14 shows a single ADSP-21161 processor master with four slaves. The master processor selects each slave device using a dedicated FLAG pin. The master device communicates with one slave device at any given time, or it broadcasts data to multiple slaves by setting more than one FLS bit in SPICTL.
Figure 11-14. Single Master, Multiple Slaves Configuration, All ADSP-21161 Processors
[Figure: one ADSP-21161 master device connected to four slave devices through the SPICLK, MOSI, MISO, and SPIDS signals, with the master's FLAG0 through FLAG3 pins used as slave selects.]

The master ADSP-21161 processor can boot multiple slaves in the following ways:

The ADSP-21161 processor transmits to all four SPI devices at the same time in a broadcast mode. Broadcast the 256-word loader kernel and identical application code simultaneously to all slaves. If the master is an ADSP-21161 processor, enable the FLSx bits in the SPICTL register, and disable the MISO pins. Otherwise, the master asserts the SPIDS pins of all the slaves to transmit the data. The MISO pins are disabled by setting the DMISO bit in the four slave processors. This DMISO feature may be available in some microcontrollers. Therefore, it is possible to use the DMISO feature with any SPI devices that include this functionality.
Load the bootstrap kernel and processor instructions and data one at a time for each processor. In this case, enable only one FLSx bit at a time in the SPICTL register to drive the flag pin connected to a slave's device select. The master device will assert the SPIDS pin of the slave to load the data. This ensures that each processor boots one after the other.

It is also possible to use a combination of broadcast and individual processor booting to boot a multiprocessor system. SPI hosts can broadcast boot application code that will reside on several slaves and then complete the booting process by booting the individual slaves with slave-specific application code. In this situation, the host SPI device asserts the SPIDS pins of all slaves during the broadcast portion of the boot. The host then asserts the SPIDS pins of specific slaves. If the ADSP-21161 processor is the master, as is shown in Figure 11-14, the master enables the FLSx bit in the SPICTL register for the slave currently booting.

Figure 11-14 shows one ADSP-21161 processor as a master and four ADSP-21161 processors (or other SPI-compatible devices) as slaves:
#include <def21161.h>
#define size 10

// vector code for reset vector from ldf file
.section/pm seg_rth;
Chip_Reset:
    idle;
    jump start;
    nop;
    nop;

// vector code for receive interrupt vector from ldf file
.section/pm spiri_svc;
    nop;
    nop;
    jump receive;
    rti;

.section/dm seg_dmda;
.var spi_tx_buf[size] = 0x11111111, 0x22222222, 0x33333333, 0x44444444, 0x55555555,
                        0x66666666, 0x77777777, 0x88888888, 0x99999999, 0xaaaaaaaa;
.var spi_rx_buf[size];

.section/pm seg_pmco;
start:
    // Set pointers for source and dest, I0=B0 automatically
    b0=spi_tx_buf;              // 32-bit SPI datawords
    l0=@spi_tx_buf;
    m0=1;
    b1=spi_rx_buf;              // 32-bit SPI datawords
    l1=@spi_rx_buf;
    m1=1;
    // set circular buffer enable and allow global interrupts
    bit set MODE1 CBUFEN | IRPTEN;
    bit set LIRPTL SPIRMSK;     // enable SPI RX interrupts
    bit set IMASK LPISUMI;      // unmask spi interrupts
    r0=0x00000000;              // initially clear SPI control register
    dm(SPICTL)=r0;
    // prime SPITX register
    r0=dm(i0,m0);
    dm(SPITX)=r0;
    ustat1=dm(SPICTL);          // set up options for the SPI port
    bit set ustat1 SPIEN | SPRINT | SPTINT | MS | CPHASE | DF | WL32 | BAUDR5 | SGN | GM;
    /* Enable spi port, spitx and spirx interrupts, master device,
       spiclk toggles at beginning of first data transfer bit,
       MSB first format, 32 bit word length, baud rate, sign extend,
       and get more new data even if receive buffer is full */
    dm(SPICTL) = ustat1;        // start transfer by configuring SPICTL

// write value to internal memory buffer
// get new value to transmit from internal transmit buffer
#include <def21161.h>
#define size 10

// vector code for reset vector from ldf file
.section/pm seg_rth;
Chip_Reset:
    idle;
    jump start;
    nop;
    nop;

.section/dm seg_dmda;
.var spi_tx_buf[size] = 0x11111111, 0x22222222, 0x33333333, 0x44444444, 0x55555555,
                        0x66666666, 0x77777777, 0x88888888, 0x99999999, 0xaaaaaaaa;
.var spi_rx_buf[size];

.section/pm seg_pmco;
start:
    // set circular buffer enable and allow global interrupts
    bit set MODE1 IRPTEN | CBUFEN;
    b0=spi_tx_buf;              // 32-bit SPI datawords
    l0=@spi_tx_buf;
    m0=1;
    b1=spi_rx_buf;              // 32-bit SPI datawords
    l1=@spi_rx_buf;
    m1=1;
    r0=0x00000000;              // initially clear SPI control register
    dm(SPICTL)=r0;
    ustat1=dm(SPICTL);
    bit set ustat1 SPIEN | MS | DF | WL32 | BAUDR5 | SGN | GM;

    /* The SPI transmit buffer must be fed with the first two data words
       before enabling SPI if SPRINT/SPTINT will not be enabled for
       interrupt usage */
    r0=dm(i0,m0);
    dm(SPITX)=r0;               // write to TX buffer

    dm(SPICTL) = ustat1;

    lcntr = 0x8, do looping until lce;
        r0=dm(i0,m0);
        dm(SPITX)=r0;           // write to TX buffer
        // test receive buffer status to determine when it is ok to read from SPIRX
test:   ustat1=dm(SPISTAT);
        bit tst USTAT1 RXS0;
        if Not TF jump test;
        r0=dm(SPIRX);           // read from RX buffer
        dm(i1,m1)=r0;
looping: nop;

    r0=dm(SPIRX);               // read from RX buffer
    dm(i1,m1)=r0;
    r0=dm(SPIRX);               // read from RX buffer
    dm(i1,m1)=r0;
wait:
    jump wait;
    idle;
A boundary scan allows a system designer to test interconnections on a printed circuit board with minimal test-specific hardware. The scan is made possible by the ability to control and monitor each input and output pin on each chip through a set of serially scannable latches. Each input and output is connected to a latch, and the latches are connected as a long shift register so that data can be read from or written to them through a serial test access port (TAP). The ADSP-21161 processor contains a test access port compatible with the industry-standard IEEE 1149.1 (JTAG) specification. Only the IEEE 1149.1 features specific to the ADSP-21161 processor are described here. For more information, see the IEEE 1149.1 specification and the other documents listed in References on page 12-29.

The boundary scan allows a variety of functions to be performed on each input and output signal of the ADSP-21161 processor. Each input has a latch that monitors the value of the incoming signal and can also drive data into the chip in place of the incoming value. Similarly, each output has a latch that monitors the outgoing signal and can also drive the output in place of the outgoing value. For bidirectional pins, the combination of input and output functions is available.

Every latch associated with a pin is part of a single serial shift register path. Each latch is a master/slave type latch with the controlling clock provided externally. This clock (TCK) is asynchronous to the ADSP-21161 processor system clock (CLKIN).
The ADSP-21161 processor emulation features halt the processor at a pre-defined point to examine the state of the processor, execute arbitrary code, restore the original state, and continue execution.

The ADSP-21161 processor emulation features are a superset of the ADSP-21160 emulation features. All emulation features supported by previous SHARCs are supported on the ADSP-21161 processor, except the ICSA output signal and function. The set of features on which JTAG ICE designs rely are supported in an identical fashion on the ADSP-21161 processor. The ADSP-21161 processor can be used with the ADSP-2106x SHARC JTAG ICE hardware.

There are several changes/extensions to the base functionality of the ADSP-2106x emulation capability, which require changes in the JTAG ICE software for ADSP-21161 processor support. These extensions include:
1. The emulation breakpoint address start/end registers have moved from UREG space to IOP register space. This change did not affect the TSTEMU block directly, only the address decodes to gain access to it.
2. EMU64PX has been added to the IR decode space. This shift register provides access to the full 64-bit wide PX register of the ADSP-21161 processor.
3. A memory test shift register has been added to the IR decode space. This feature is for Analog Devices internal use ONLY.

Several on-chip facilities are directly accessed through the JTAG interface. These facilities are listed in Table 12-2 on page 12-4. Other emulation facilities are only indirectly accessible. To indirectly access the facilities that do not appear in Table 12-2 on page 12-4, scan the instruction which moves data of interest to/from the PX register, scan the PX data (if the instruction is a PX read), let the core execute the instruction, then scan the PX register out (if the instruction was a PX write).
The breakpoint start/end registers are mapped into the IOP register space of the ADSP-21161 processor. For specific addresses, see Register and Bit #Defines (def21161.h) on page A-121. The EMUN, EMUCLK, and EMUCLK2 registers occupy the same UREG address space as on the ADSP-2106x. These facilities are read-only by the ADSP-21161 processor core in normal operation.
A Boundary Scan Description Language (BSDL) file for the ADSP-21161 processor is available on the Analog Devices website. Set your browser to:
http://www.analog.com/techsupt/documents/bsdl
Refer to the IEEE 1149.1 JTAG specification for detailed information on the JTAG interface. Many sections of this appendix assume a working knowledge of the JTAG specification.
Instruction Register
The instruction register allows an instruction to be shifted into the processor. This instruction selects the test to be performed and/or the test data register to be accessed. The instruction register is 5 bits long with no parity bit. A value of 10000 binary is loaded (LSB nearest TDO) into the instruction register whenever the TAP reset state is entered.

Table 12-2 lists the binary code for each instruction. Bit 0 is nearest TDO and bit 4 is nearest TDI. No data registers are placed into test modes by any of the public instructions. The instructions affect the ADSP-21161 processor as defined in the 1149.1 specification. The optional instructions RUNBIST, IDCODE and USERCODE are not supported by the ADSP-21161 processor.

Table 12-2. JTAG Instruction Register Codes
43210   Register   Instruction   Type
11111   Bypass     BYPASS        Public
00000   Boundary   EXTEST        Public
10000   Boundary   SAMPLE        Public
01000   EMUPMD     EMULATION     Private
11000   Boundary   INTEST        Public
00100   EMUCTL     EMULATION     Private
Comment column entry: 48-bit scan length
The entry under Register is the serial scan path, either Boundary or Bypass in this case, enabled by the instruction. Figure 12-1 shows these register paths. The 1-bit Bypass register is fully defined in the 1149.1 specification. For more information on the Boundary register, see Boundary Register on page 12-17. No special values need be written into any register prior to selection of any instruction. As Table 12-2 shows, certain instructions are reserved for emulator use. For more information, see Table 12-7.
into the CAPTURE state and EMUPX is selected, EMUPX is updated with the most significant 48 bits of PX. The EMUPX register is used to transfer data between the emulator and the target system. The EMUPX register is provided for backwards compatibility with the SHARC ICE hardware and is 64 bits wide. To provide compatibility, only the most significant 48 bits of PX are mapped to EMUPX. 48-bit instructions and 40-bit extended precision data are always aligned to the most significant bit. When transferring 32-bit data to/from the PX register, PX2 must be specified as the source/destination to ensure that the 32 bits are aligned to the most significant bit.
[Table 12-3. EMUCTL register bit definitions. Surviving bit assignments: 3 SS, 4 SYSRST, 24-25 DA1MODE, 26-27 DA2MODE, 35 TMODE, 36 BHO, 37 MTST. Other bits named in the table: EIRQENA, BKSTOP, ENBRKOUT, EPSTOP, NEGPA1.]
TURE state. The emulator reads EMUSTAT to determine the state of the ADSP-21161 processor. None of the bits in this register can be written by the emulator. All bits are active high. Table 12-4 lists the EMUSTAT register bits.

Table 12-4. Emulation Status (EMUSTAT) Register Definition

Bit #   Name        Function (If bit=1...)
0       EMUSPACE    Indicates that the next instruction is to be fetched from the emulator.
1       EMUREADY    Indicates that the ADSP-21161 processor has finished executing the previous emulator instruction.
2       INIDLE      Indicates that the ADSP-21161 processor was in IDLE prior to the latest emulator interrupt.
3       COMHALT     Indicates a core access to a SPORT or a LINK is hung because of an external device.
4       EPHALT      Indicates a core access to a DMA buffer is hung because of the external port.
5-7                 Reserved
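As an illustration of how a host-side tool might interpret a captured EMUSTAT value, the following Python sketch decodes the bits listed in Table 12-4. The function name decode_emustat is hypothetical; only the bit positions and names come from the table.

```python
# Bit positions and names from Table 12-4; bits 5-7 are reserved and ignored here.
EMUSTAT_BITS = {
    0: "EMUSPACE",
    1: "EMUREADY",
    2: "INIDLE",
    3: "COMHALT",
    4: "EPHALT",
}


def decode_emustat(value):
    """Return the names of the status bits that are set (all bits are active high)."""
    return [name for bit, name in sorted(EMUSTAT_BITS.items()) if (value >> bit) & 1]
```

For instance, a captured value of binary 00011 decodes to EMUSPACE and EMUREADY.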
The ADSP-21161 processor contains nine sets of emulation breakpoint registers. Each set consists of a start and end register which describe an address range, with the start register setting the lower end of the address range. Each breakpoint set monitors a particular address bus. When a valid address is in the address range, a breakpoint signal is generated. The address range includes the start and end addresses.

The nine breakpoint sets are grouped into five types: instruction (IA), DM data (DA), PM data (PA), IO data (IO), and EP data (EP). The individual breakpoint signals in each type are ORed together to create five composite breakpoint signals. These composite signals can be optionally ANDed or ORed together to create the effective breakpoint event signal used to generate an emulator interrupt. The ANDBKP bit in the EMUCTL register selects the function used.

Each breakpoint type has an enable bit in the EMUCTL register. When set, these bits add the specified breakpoint type into the generation of the effective breakpoint signal. If cleared, the specified breakpoint type is not used in the generation of the effective breakpoint signal. This allows the user to trigger the effective breakpoint from a subset of the breakpoint types.

To provide further flexibility, each individual breakpoint can be programmed to trigger if the address is in range AND one of these conditions is met: READ access, WRITE access, ANY access, or NO access. The control bits for this feature are also located in EMUCTL. For more information, see PA1MODES bit description in Table 12-3 on page 12-8.

The address ranges of the emulation breakpoint registers are negated by setting the appropriate negation bits in the EMUCTL register. For more information, see NEGPA1 bit description in Table 12-3 on page 12-8. Each breakpoint can be disabled by setting the start address larger than the end address.
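The breakpoint logic described above can be summarized as a small behavioral model. The Python sketch below is not the hardware implementation, and its class and function names are hypothetical. It makes three assumptions that the text does not spell out: a breakpoint programmed for NO access never triggers, negation simply inverts the in-range test of an enabled breakpoint, and the effective signal is inactive when no breakpoint type is enabled.

```python
from dataclasses import dataclass


@dataclass
class Breakpoint:
    start: int
    end: int
    mode: str = "ANY"      # "READ", "WRITE", "ANY", or "NO"
    negate: bool = False   # models a range negation bit in EMUCTL (assumed behavior)

    def hit(self, addr, access):
        """access is "READ" or "WRITE"; the range includes start and end."""
        if self.start > self.end:
            return False                     # start > end disables the breakpoint
        in_range = self.start <= addr <= self.end
        if self.negate:
            in_range = not in_range          # assumption: negation inverts the range test
        if self.mode == "NO":
            return False                     # assumption: NO access never triggers
        return in_range and (self.mode == "ANY" or self.mode == access)


def effective_breakpoint(type_hits, enabled_types, and_bkp):
    """Combine composite signals (one bool per type: IA, DA, PA, IO, EP).

    enabled_types models the per-type enable bits; and_bkp models ANDBKP.
    """
    signals = [type_hits.get(name, False) for name in enabled_types]
    if not signals:
        return False                         # assumption: nothing enabled, no event
    return all(signals) if and_bkp else any(signals)
```

With IA and DA enabled and only IA active, the effective breakpoint fires when the composite signals are ORed and does not fire when they are ANDed.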
Four of the breakpoints monitor the instruction address. Two monitor the data memory address. One monitors the program memory data address, one monitors the I/O address bus and one monitors the EP address bus. The instruction address breakpoints monitor the address of the instruction being executed, not the address of the instruction being fetched. If the current execution is aborted, the breakpoint signal does not occur even if the address is in range. Data address breakpoints (DA and PA only) are also ignored during aborted instructions. The nine breakpoint sets appear in Table 12-6.

Table 12-6. PSx, DMx, IOx, and EPx (Breakpoint) Registers

Register   Function                        Group (1)
PSA1S      Instruction Address Start #1    IA
PSA1E      Instruction Address End #1      IA
PSA2S      Instruction Address Start #2    IA
PSA2E      Instruction Address End #2      IA
PSA3S      Instruction Address Start #3    IA
PSA3E      Instruction Address End #3      IA
PSA4S      Instruction Address Start #4    IA
PSA4E      Instruction Address End #4      IA
DMA1S      Data Address Start #1           DA
DMA1E      Data Address End #1             DA
DMA2S      Data Address Start #2           DA
DMA2E      Data Address End #2             DA
PMDAS      Program Data Address Start      PA
PMDAE      Program Data Address End        PA
IOAS       I/O Address Start               IO
IOAE       I/O Address End                 IO
EPAS       External Port Address Start     EP
EPAE       External Port Address End       EP

(1) Group IA = 24-bit addresses; Groups DA, PA, and EP = 32-bit addresses; Group IO = 19-bit addresses.
EMUN Register
The EMUN (Nth event counter) register is located in the I/O Processor register set. It can be read from normal space, but it can be written only when the ADSP-21161 processor is in emulation space. The Nth event counter allows an emulation breakpoint to occur on the Nth occurrence of the breakpoint event. This is accomplished by writing the desired Nth value to the EMUN register in UREG space. The counter decrements on each occurrence of the breakpoint event, asserting the interrupt when the counter is equal to zero and the hardware breakpoint event occurs.
the amount of time spent executing a particular section of code. The EMUCLK2 register extends the time EMUCLK can count by incrementing each time the EMUCLK value rolls over to zero. The combined emulation clock counter can count accurately for thousands of hours.
EMUIDLE Instruction
The EMUIDLE instruction places the ADSP-21161 processor in the idle state and triggers an emulator interrupt. This lets you use the EMUIDLE instruction as a software breakpoint. When EMUIDLE is executed, the emulation clock counter immediately halts.
Boundary Register
The Boundary register is 481 bits long. This section defines the latch type and function of each position in the scan path. The positions are numbered with 0 being the first bit output (closest to TDO) and 480 being the last (closest to TDI). The following are some notes about boundary registers:
• Scan position 0 (NC_0) is the end closest to TDO (scan in first).
• Scan position 480 (SPARE) is the end closest to TDI (scan in last).
• Output Enables:
  1 = Drive the associated signals during the EXTEST and INTEST instructions
  0 = Three-state the associated signals during the EXTEST and INTEST instructions
Private Instructions
Table 12-2 on page 12-4 lists the private instructions that are reserved for emulation and memory test. The ADSP-21161 processor JTAG ICE emulator uses the TAP and boundary scan as a way to access the processor in the target system. The JTAG ICE emulator requires a target board connector for access to the TAP. For more information, see Designing For JTAG Emulation on page 13-49.
References
IEEE Standard 1149.1-1990. Standard Test Access Port and Boundary-Scan Architecture. To order a copy, contact IEEE at 1-800-678-IEEE.
Maunder, C.M. and R. Tulloss. Test Access Ports and Boundary Scan Architectures. IEEE Computer Society Press, 1991.
Parker, Kenneth. The Boundary Scan Handbook. Kluwer Academic Press, 1992.
Bleeker, Harry, P. van den Eijnden, and F. de Jong. Boundary-Scan Test: A Practical Approach. Kluwer Academic Press, 1993.
Hewlett-Packard Co. HP Boundary-Scan Tutorial and BSDL Reference Guide. (HP part# E1017-90001.) 1992.
13 SYSTEM DESIGN
The ADSP-21161 processor supports many system design options. The options implemented in a system are influenced by cost, performance, and system requirements. This chapter provides the following system design information:
• Pin Descriptions on page 13-2
• Dual-Voltage Power-up Sequencing on page 13-41
• Designing For JTAG Emulation on page 13-49
• Conditioning Input Signals on page 13-60
• Designing For High Frequency Operation on page 13-62
• Booting Single and Multiple Processors on page 13-71

Other chapters also discuss system design issues. Some other locations for system design information include:
• Setting External Port Modes on page 7-3
• Setting Link Port Modes on page 9-5
• SPORT Operation Modes on page 10-47
• SPI Operation Modes on page 11-24

By following the guidelines described in this chapter, you can design the JTAG emulation interface for an Analog Devices target board. Development and testing of your application code and hardware can begin without debugging the debug port.

ADSP-21161 SHARC Processor Hardware Reference
Pin Descriptions
This section describes the pins of the ADSP-21161 processor and shows how these signals can be used in an ADSP-21161 processor system. All I/O pins except CLKIN and XTAL have an internal 50 kΩ resistor that is enabled during reset. Figure 13-1 illustrates how the pins are used in a single-processor system. Figure 7-29 on page 7-91 shows a system diagram illustrating pin connections in a multiprocessor cluster.
[Figure 13-1: pin usage in a single-processor system, showing the ADSP-21161 with its clock, control, address, and data connections to an optional boot EPROM, optional memory and peripherals, and optional SDRAM, together with the flag, link port, serial port, and SPI pins; image not reproduced.]
ADSP-21161 processor pin definitions are listed in Table 13-1. The following symbols appear in the Type column of Table 13-1:

A       Asynchronous
G       Ground
I       Input
O       Output
P       Power Supply
S       Synchronous
(a/d)   Active Drive
(o/d)   Open Drain
T       Three-State (when SBTS is asserted or the processor is bus slave)
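[Table 13-1, fragment. Only the pin names and types of the following rows survived extraction; the function descriptions are missing.]

Pin           Type
ADDR23-0      I/O/T
AGND          G
BR6-1         I/O/S
BMS           I/O/T
CAS           I/O/T
CLKIN         (type not recovered)
CLK_CFG1-0    (type not recovered)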
When the COPT bit is set, CLKOUT is driven by the master device. CLKOUT is three-stated during the bus transition cycle by the device giving up its bus master status. The new bus master then drives CLKOUT. During host accesses, the bus master that granted the bus to the host drives CLKOUT. A keeper latch on the processor's CLKOUT pin maintains the output at the level it was last driven. This latch is only enabled on processors with ID2-0=00x.
If CLKDBL is enabled, CLKOUT = 2 x CLKIN.
If CLKDBL is disabled, CLKOUT = 1 x CLKIN.
Note: CLKOUT is only controlled by the CLKDBL pin and operates at either 1xCLKIN or 2xCLKIN. For more information, see ADSP-21161 CLKOUT and CCLK Clock Generation Operation on page 13-27.

CS    I/A    Chip Select. Asserted by host processor to select the ADSP-21161 processor.
[Table 13-1, fragment. Only the pin names and types of the following rows survived extraction; the function descriptions are missing.]

Pin                     Type
DMAR2                   I/A
DMAG1                   O/T
DMAG2                   O/T
DQM                     O/T
DxB                     I/O
EBOOT                   (type not recovered)
EMU                     O (O/D)
FLAG11-0                I/O/A
FSx                     I/O
GND                     G
HBR                     I/A
LxDAT7-0 [DAT15-0]      I/O [I/O/T]
LBOOT                   (type not recovered)
MOSI                    I/O
MISO                    I/O
RAS                     I/O/T
RD                      I/O/T
REDY                    O (O/D)
RESET                   I/A
RPBA                    I/S

RSTOUT    O    Reset Out. When RSTOUT is asserted, this pin is used to indicate to the external logic that the core blocks are in reset. It is deasserted 4096 cycles after RESET is deasserted, allowing the PLL to stabilize and lock. For systems requiring a secondary reset for other devices needing to be simultaneously brought out of reset with the processor core reset, system designers can connect this pin to the reset pin of the other devices. This prevents other devices from driving data before the processor begins the booting process.
SDWE      I/O/T      SDRAM Write Enable. In conjunction with CAS, RAS, MSx, SDWE, SDCLKx, and sometimes SDA10, defines the operation for the SDRAM to perform.
SDCLK0    I/O/S/T    SDRAM Clock Output 0. Clock for SDRAM devices.
SDCLK1    O/S/T      SDRAM Clock Output 1. Additional clock for SDRAM devices. For systems with multiple SDRAM devices, handles the increased clock load requirements, eliminating the need for off-chip clock buffers. Either SDCLK1 or both SDCLKx pins can be three-stated.
SDCKE     I/O/T      SDRAM Clock Enable. Enables and disables the CLK signal. For details, see the data sheet supplied with your SDRAM device.
SDA10     O/T        SDRAM A10 Pin. Enables applications to refresh an SDRAM in parallel with non-SDRAM accesses or host accesses.
SBTS      I/S        Suspend Bus Three-State. External devices can assert SBTS (low) to place the external bus address, data, selects, and strobes in a high impedance state for the following cycle. If the ADSP-21161 processor attempts to access external memory while SBTS is asserted, the processor halts and the memory access is not completed until SBTS is deasserted. SBTS should only be used to recover from host processor/ADSP-21161 processor deadlock.
SCLKx     I/O        Transmit/Receive Serial Clock (Serial Ports 0, 1, 2, 3). Each SCLK pin has a 50 kΩ internal pull-up resistor. This signal can be either internally or externally generated.
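[Table 13-1, fragment. Only the pin names and types of the following rows survived extraction; the function descriptions are missing. A run of type codes (O, I, I/S, I/S, O) also survives without its pin names.]

Pin       Type
SPIDS     (type not recovered)
VDDINT    P
VDDEXT    P
WR        I/O/T
XTAL      (type not recovered)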
Inputs identified as synchronous (S) must meet timing requirements with respect to CLKIN (or with respect to TCK for TMS, TDI). Inputs identified as asynchronous (A) can be asserted asynchronously to CLKIN (or to TCK for TRST). Unused inputs should be tied or pulled to VDDEXT or GND, except for ADDR23-0, DATA47-16, FLAG11-0, and inputs that have internal pull-up or pull-down resistors (PA, ACK, BRST, CLKOUT, RD, WR, DMARx, DMAGx, DxA, DxB, SCLKx, LxDAT7-0, MISO, MOSI, SPICLK, LxCLK, LxACK, TMS, TRST and TDI); these pins can be left floating. Some of these pins have a logic-level hold circuit (only enabled on the ADSP-21161 processor with ID2-0=00x) that prevents the input from floating internally. See the pin list in Table 13-1.
The TRST input of the JTAG interface must be asserted (pulsed low) or held low after power-up for proper operation of the ADSP-21161 processor. Do not leave this pin unconnected.

Additional Notes:
• In single-processor systems, the processor owns the external bus during reset and does not perform bus arbitration to gain control of the bus.
• Operation of the RD and WR signals changes when CS is asserted by a host processor. For more information, see Asynchronous Transfers on page 7-48.
• Except during a Host Transition Cycle (HTC), the RD and WR strobes should not be deasserted (low-to-high transition) while ACK or REDY are deasserted (low); the processor hangs if this happens.
• In multiprocessor systems, the ACK signal is an input to the ADSP-21161 processor bus master and does not float when it is not being driven. It is not necessary to use an external pullup resistor on the ACK line during booting or at any other time. The ACK pin is pulled high internally with a 20 kΩ equivalent resistor and is activated under the following three conditions:
  1. When the processor is in reset (regardless of the hardwired ID pin configuration)
  2. After reset, in a single processor system (ID2-0 = 000)
  3. After reset, in a multiprocessor system, the processor having ID2-0 = 001

Figure 13-2 shows how different data word sizes are transferred over the external port.
[Figure 13-2: alignment of data word sizes on DATA47-16 and DATA15-0 for PROM boot, 8-bit packed DMA data, 8-bit packed instruction execution, 16-bit packed DMA data, 16-bit packed instruction execution, 32-bit float or fixed data (D31-D0), 32-bit packed instructions, and 48-bit instruction fetch (no packing); image not reproduced.]

The extra data lines DATA[15-0] are only accessible if the link ports are disabled. Enable these additional data lines by setting IPACK[1:0] = 01 in SYSCON.
To ensure recognition of an asynchronous input, it must be asserted for at least one full processor cycle plus setup and hold time, except for RESET, which must be asserted for at least four processor cycles. The minimum time prior to recognition (the setup and hold time) is specified in the ADSP-21161N SHARC DSP Microcomputer Data Sheet.
[The tables on the following pages did not survive extraction. The legible fragments are: "For ID = 0 or 1, driven only by processor bus master, otherwise three-stated"; "Bus master independent"; "JTAG interface"; "Serial ports, SPI and link port".]
Clock Derivation
The ADSP-21161 processor employs a phase-locked loop on-chip, to provide clocks that switch at higher frequencies than the system clock (CLKIN). The PLL-based clocking methodology employed on the processor influences the clock frequencies and behavior for the serial, link, SDRAM, SPI, and external ports; in addition to the processor core and internal memory. In each case, the processor PLL provides a de-skewed clock to the port logic and I/O pins.
For the external port, this clock is fed back to the PLL, such that the external port clock always switches at 1x or 2x the CLKIN frequency, depending on whether CLKDBL is enabled. The PLL provides internal clocks that switch at multiples of the CLKIN frequency for the internal memory, processor core, link and serial ports, and to modify the external port timing as required (for example, read/write strobes in asynchronous modes).

The ratio of processor core clock frequency and CLKIN/external port clock frequency is determined by the CLK_CFG1-0 pins and CLKDBL pin (as shown in Table 13-8 on page 13-29) during reset. The core clock ratio cannot be altered dynamically. The ADSP-21161 processor must be reset to alter the clock ratio.

The PLL provides a clock that switches at the processor core frequency to the serial and link ports. Each of the serial and link ports can be programmed to operate at clock frequencies derived from this clock. The four serial ports' transmit and receive clocks are divided down from the processor core clock frequency by setting the DIVx registers appropriately.

In addition to the PLL ratios selected by the CLK_CFG1-0 pins, the CLKDBL pin provides further clock ratio options. The (1x/2x CLKIN) rate set by the CLKDBL pin determines the rate of the PLL input clock and the rate at which the synchronous external port operates. With the combination of CLK_CFG[1:0] and CLKDBL, ratios of 2:1, 3:1, 4:1, 6:1, and 8:1 between the core and CLKIN are supported.

Timing Specifications
The ADSP-21161 processor's internal clock (a multiple of CLKIN) provides the clock signal for timing internal memory, processor core, link ports, serial ports, SPI, SDRAM, and external port (as required for read/write strobes in asynchronous access mode). During reset, program the ratio between the ADSP-21161 processor's internal clock frequency and external (CLKIN) clock frequency with the CLK_CFG1-0 and CLKDBL pins. Even though the internal clock is the clock source for the external port, it behaves as described in the Clock Rate Ratio chart (CLKDBL pin description). To determine switching frequencies for the serial and link ports, divide down the internal clock, using the programmable divider control of each port (DIVx for the serial ports and LxCLKD1-0 for the link ports). For the SPI port, the BAUDR bit in the SPICTL register controls the SPICLK baud rate based on the core clock frequency. Each of the two link port clock frequencies is determined by programming the LxCLKDx parameters in the LCTL registers. For more information, see Link Port Buffer Control Register (LCTL) on page A-92.

Note the following definitions of various clock periods that are a function of CLKIN and the appropriate ratio control. Figure 13-3 shows the clock derivation, which allows core-to-CLKIN ratios of 2:1, 3:1, 4:1, 6:1, and 8:1 with an external oscillator or crystal.
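[Figure 13-3: clock derivation block diagram. A crystal or clock oscillator drives CLKIN and XTAL; CLKDBL selects the PLL input clock (PLLICLK) and CLKOUT; CLK_CFG[1:0] selects the PLL ratio (2:1, 3:1, 4:1) that produces the core clock (CCLK); the external port (host, MMS, SRAM, SBSRAM) and the SDRAM clocks SDCLK[1:0] (x1, x1/2) are derived from these clocks; image not reproduced.]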
Table 13-4 and Table 13-5 provide various definitions of clock inputs, outputs and uses in an ADSP-21161 processor system.

Table 13-4. ADSP-21161 CLKOUT and CCLK Clock Generation Operation

Timing Requirements    Calculation     Description
CLKIN                  = 1/tCKIN       = Input Clock
CLKOUT                 = 1/tTCK        = Local Clock Out
PLLICLK                = 1/tPLLIN      = PLL Input Clock
CCLK                   = 1/tCCLK       = Core Clock

If CLKDBL is enabled (tied low at reset), then CLKOUT = PLLICLK = 2xCLKIN. Otherwise, CLKOUT = PLLICLK = CLKIN.
CCLK = Core Clock = PLLICLK x PLL Multiply Ratio (determined by CLK_CFG pins).

Table 13-5. Clock Relationships

Timing Requirements    Description (1)
tCK                    = CLKOUT Clock Period
tPLLICK                = PLL Input Clock
tCCLK                  = (Processor) Core Clock Period
tLCLK                  = Link Port Clock Period = (tCCLK) * LR
tSCLK                  = Serial Port Clock Period = (tCCLK) * SR
tSDK                   = SDRAM Clock Period = (tCCLK) * SDCKR
tSPICLK                = SPI Clock Period = (tCCLK) * SPIR

(1) where:
LR = link port-to-core clock ratio (1:1, 1:2, 1:3, or 1:4, determined by LxCLKD)
SR = serial port-to-core clock ratio (wide range, determined by CLKDIV)
SDCKR = SDRAM-to-core clock ratio (1:1 or 1:2, determined by SDCTL register)
SPIR = SPI-to-core clock ratio (wide range, determined by SPICTL register)
LCLK = Link Port Clock
SCLK = Serial Port Clock
SDK = SDRAM Clock
SPICLK = SPI Clock
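The period relationships in Table 13-5 are simple multiplications of the core clock period. The following Python sketch shows the arithmetic only; the function name derived_periods_ns is hypothetical, and each ratio argument is assumed to be expressed as the integer divisor N of a 1:N port-to-core ratio.

```python
def derived_periods_ns(t_cclk_ns, lr, sr, sdckr, spir):
    """Apply the Table 13-5 relationships to a core clock period given in ns.

    Each ratio argument is the divisor N of a 1:N port-to-core clock ratio
    (an assumption about how the ratios are expressed).
    """
    return {
        "tLCLK": t_cclk_ns * lr,       # link port clock period
        "tSCLK": t_cclk_ns * sr,       # serial port clock period
        "tSDK": t_cclk_ns * sdckr,     # SDRAM clock period
        "tSPICLK": t_cclk_ns * spir,   # SPI clock period
    }
```

With a 10 ns core clock period, a 1:2 link port ratio gives a 20 ns link port clock period, and a 1:2 SDRAM ratio gives a 20 ns SDRAM clock period.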
Table 13-6 describes clock ratio requirements. Table 13-7 shows an example clock derivation.

Table 13-6. Clock Ratios

Timing Requirements    Description
cRTO                   = Core:CLKOUT ratio (2:1, 3:1, or 4:1, determined by CLK_CFG)
lRTO                   = Lport:core clock ratio (1:1, 1:2, 1:3, or 1:4, determined by LxCLKD)
sRTO                   = Sport:core clock ratio (wide range, determined by xCLKDIV)
mum time period during reset before the RESET signal can be deasserted. For information on minimum clock setup, see the ADSP-21161N DSP Microcomputer Data Sheet. Table 13-8 describes the internal clock to CLKIN frequency ratios supported by the ADSP-21161 processor.

Table 13-8. Clock Rate Ratios

CLKDBL    CLK_CFG1    CLK_CFG0    Core Clock Ratio    CLKOUT Ratio
1         0           0           2:1                 1x
1         0           1           3:1                 1x
1         1           0           4:1                 1x
0         0           0           4:1                 2x
0         0           1           6:1                 2x
0         1           0           8:1                 2x
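Table 13-8 follows from two factors: the PLL multiply ratio selected by CLK_CFG1-0 and the 1x or 2x PLL input rate selected by CLKDBL. The Python sketch below reproduces the table from those two factors. The function names are hypothetical, and pin combinations not listed in the table are rejected rather than guessed.

```python
# PLL multiply ratio keyed by (CLK_CFG1, CLK_CFG0), taken from the rows of Table 13-8.
PLL_MULTIPLY = {(0, 0): 2, (0, 1): 3, (1, 0): 4}


def clock_ratios(clkdbl, clk_cfg1, clk_cfg0):
    """Return (core:CLKIN ratio, CLKOUT:CLKIN ratio) for the pin levels at reset."""
    if clkdbl not in (0, 1):
        raise ValueError("CLKDBL must be 0 or 1")
    try:
        multiply = PLL_MULTIPLY[(clk_cfg1, clk_cfg0)]
    except KeyError:
        raise ValueError("CLK_CFG combination not listed in Table 13-8")
    clkout_ratio = 2 if clkdbl == 0 else 1   # CLKDBL tied low enables clock doubling
    return multiply * clkout_ratio, clkout_ratio


def core_clock_mhz(clkin_mhz, clkdbl, clk_cfg1, clk_cfg0):
    """Core clock frequency for a given CLKIN frequency in MHz."""
    core_ratio, _ = clock_ratios(clkdbl, clk_cfg1, clk_cfg0)
    return clkin_mhz * core_ratio
```

For example, a 12.5 MHz CLKIN with CLKDBL low and CLK_CFG1-0 = 01 gives a 6:1 ratio and a 75 MHz core clock.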
When the internal clock generator is used with the XTAL pin and an external crystal, the crystal frequency cannot exceed 25 MHz. For all other external clock sources, the maximum CLKIN frequency is 50 MHz. Table 13-9 demonstrates the internal core clock switching frequency across a range of CLKIN frequencies. The minimum operational range for any given frequency is constrained by the operating range of the phase lock loop. Note that the goal in selecting a particular clock ratio for the application is to provide the highest internal frequency, given a CLKIN frequency.

If an external master clock is used, it should not be driving the CLKIN pin when the processor is not powered. The clock must be driven immediately after powerup; otherwise, internal gates stay in an undefined (hot) state and can draw excess current. After powerup, there should be sufficient time for the oscillator to start up, reach full amplitude and deliver a stable CLKIN signal to the processor before the reset is released. This may take 100 ms depending on the choice of crystal, operating frequency, loop gain and capacitor ratios. For details on timing, refer to the ADSP-21161N DSP Microcomputer Data Sheet.

After the external RESET signal is deasserted, the PLL starts operating. The rest of the chip is held in reset for 4096 CLKIN cycles after RESET is deasserted by an internal (or core) reset (RSTOUT) signal. This sequence allows the PLL to lock and stabilize.

Table 13-9. Selecting Core to CLKIN Ratio

                Typical Crystal and Clock Oscillator Inputs (MHz)
Clock Ratio     12.5     16.67    25       33.3     40       50
                Core CLK (MHz)
2:1             25       33.3     50       66.6     80       100
3:1             37.5     50       75       100      N/A      N/A
4:1             50       66.6     100      N/A      N/A      N/A
6:1             75       100      N/A      N/A      N/A      N/A
Reset Generators
It is important that an ADSP-21161 processor (or programmable device) have a reliable active RESET that is released once the power supplies and internal clock circuits have stabilized. The RESET signal should not only offer a suitable delay, but it should also have a clean monotonic edge. Analog Devices has a range of microprocessor supervisory ICs with different features. Features include one or more of the following:
• Powerup reset
• Optional manual reset input
• Power low monitor
• Back-up battery switching

Part number series for Analog Devices supervisory circuits are as follows:
• ADM69x
• ADM70x
• ADM80x
• ADM1232
• ADM181x
• ADM869x
A simple powerup reset circuit is shown in Figure 13-4, using the ADM809-RART reset generator. The ADM809 provides an active low RESET signal whenever the supply voltage is below 2.63 V. At powerup, a 240 ms active reset delay is generated to give the power supplies and oscillators time to stabilize.

[Figure 13-4: the ADM809-RART, powered from the +3.3 V VDDEXT supply, drives the RESET input of the ADSP-21161, which is powered from +1.8 V VDDINT and +3.3 V VDDEXT; image not reproduced.]

Figure 13-4. Simple Reset Generator

Another part, the ADM706TAR, provides power on RESET and optional manual RESET. It allows designers to create a more complete supervisory circuit that monitors the supply voltage. Monitoring the supply voltage allows the system to initiate an orderly shutdown in the event of power failure. The ADM706TAR also allows designers to create a watchdog timer that monitors for software failure. This part is available in an eight lead SOIC package. Figure 13-5 shows a typical application circuit using the ADM706TAR.
[Figure 13-5: typical application circuit in which the ADM706TAR (threshold Vt = +1.25 V) drives the RESET input of the ADSP-21161; image not reproduced.]
Flag Inputs
When a flag pin is programmed as an input, its value is stored in a bit in the FLAGS register. The bit is updated in each cycle with the input value from the pin. Flag inputs can be asynchronous to the processor clock, so there is a one-cycle delay before a change on the pin appears in FLAGS (if the rising edge of the input misses the setup requirement for that cycle). For more information, see Flag Value Register (FLAGS) on page A-37. A flag bit is read-only if the flag is configured as an input. Otherwise, the bit is readable and writable. The flag bit states are conditions that code can specify in conditional instructions.

Flag Outputs
When a flag is configured as an output, the value on the pin follows the value of the corresponding bit in the FLAGS register. A program can set or clear the flag bit to provide a signal to another processor or peripheral. The FLAG outputs transition on the rising edge of CLKIN. Because the processor core operates at at least twice the frequency of CLKIN, the programmer must hold the FLAG state stable for at least one full CLKIN period to ensure that the output changes state. Figure 13-6 describes the flag in/out process and the relationship between instruction execution and a flag pin when the processor core to bus clock ratio is set to 2:1. Note that at least two instructions execute each CLKIN cycle.
    BIT SET MODE2 FLG0;    /* 1st cycle: set FLAG0 to output in Mode2 */
    BIT CLR FLAGS FLG0;    /* clear FLAG0 */
    BIT SET FLAGS FLG0;    /* 1st cycle: set FLAG0 output high */
    NOP;                   /* 2nd cycle: FLAG register updated here */
                           /* A NOP indicates a NOP or another instruction not related to FLAG. */
    BIT CLR FLAGS FLG0;    /* 2nd cycle: clear FLAG0 output */
                           /* earliest assertion of FLAG0 output, depends on CLKOUT phase */
    BIT CLR MODE2 FLG0;    /* 3rd cycle: set FLAG0 back to input */
    NOP;                   /* 3rd cycle: */
    NOP;                   /* 4th cycle: earliest deassertion of FLAG0 output */
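[Figure 13-6: timing of the FLAGx pin relative to CLKOUT for the instruction sequence above, marking the third, fourth, and fifth CLKOUT cycles and the points where the output is enabled, the flag goes high, the flag goes low, and the output is valid; image not reproduced.]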
The ADSP-21161 processor has an additional eight IOP-based general-purpose programmable input/output flag pins, FLAG[11:4]. As outputs, these pins can signal peripheral devices; as inputs, these pins can provide the test for conditional branching. These pins correspond to the FLAG11-4 pins listed in the data sheet of the device. All FLAG pins are configured as inputs on reset.

When configuring IOFLAG register flag pins as outputs, do not set FLGx bits 0 to 7 in the same instruction cycle that the flag is configured as an output (setting the FLGxO bits 8 to 15 in the IOFLAG register). If your application requires that the flags be set after they are configured as outputs, two writes to the IOFLAG register are needed: one to configure the flag pin as an output, and another to set the flag pin high.

The functionality of the FLAG11-4 pins is similar to that of FLAG3-0, except that both the status and control information are included in one register, IOFLAG. The control and status bits for FLAG3-0 are in the MODE2 register and FLAGS register, respectively. Bits 7-0 of IOFLAG reflect the status of the FLAG pins, while bits 15-8 control the direction (input or output) of these flags. A value of 0 programs the flag as an input and a value of 1 programs it as an output.

Although you cannot execute bitwise operations such as BIT TST or BIT CLR on these flags directly in memory, you can execute these operations by first writing to a system register such as USTAT1 through USTAT4. Figure 13-7 shows the IOFLAG register.
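The two-write rule can be illustrated by computing the register values a program would write. The Python sketch below is a host-side illustration with hypothetical function names. It assumes that the value bit for FLAGn sits at bit n-4 and its direction bit FLGnO at bit n+4, which matches the "bits 0 to 7" and "bits 8 to 15" description above but should be checked against the IOFLAG register figure.

```python
def value_bit(flag):
    """Mask for the FLGn value bit, assuming FLAG4 is bit 0 and FLAG11 is bit 7."""
    if not 4 <= flag <= 11:
        raise ValueError("IOFLAG only covers FLAG4 through FLAG11")
    return 1 << (flag - 4)


def direction_bit(flag):
    """Mask for the FLGnO direction bit, assuming FLG4O is bit 8 and FLG11O is bit 15."""
    return value_bit(flag) << 8


def configure_then_set(flags):
    """Return the two IOFLAG values: first make the pins outputs, then drive them high."""
    direction = 0
    values = 0
    for flag in flags:
        direction |= direction_bit(flag)
        values |= value_bit(flag)
    return [direction, direction | values]
```

For FLAG9 through FLAG4, the first write carries only the direction bits and the second write adds the value bits, so the value bits are never set in the same write that configures the pins as outputs.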
IOFLAG (address 0x1B). All 32 bits are shown with a value of 0; bits 31-16 carry no flag function.

Bit    Name      Function
15     FLG11O    0=FLAG11 Input, 1=FLAG11 Output
14     FLG10O    0=FLAG10 Input, 1=FLAG10 Output
13     FLG9O     0=FLAG9 Input, 1=FLAG9 Output
12     FLG8O     0=FLAG8 Input, 1=FLAG8 Output
11     FLG7O     0=FLAG7 Input, 1=FLAG7 Output
10     FLG6O     0=FLAG6 Input, 1=FLAG6 Output
9      FLG5O     0=FLAG5 Input, 1=FLAG5 Output
8      FLG4O     0=FLAG4 Input, 1=FLAG4 Output
7      FLG11     FLAG11 Value
6      FLG10     FLAG10 Value
5      FLG9      FLAG9 Value
4      FLG8      FLAG8 Value
3      FLG7      FLAG7 Value
2      FLG6      FLAG6 Value
1      FLG5      FLAG5 Value
0      FLG4      FLAG4 Value (Low=0, High=1)

Figure 13-7. IOFLAG Register

Example #1: Configuring FLGx as Output Flags
The following example shows how to configure the flags as output flags, set the flag pins high and write the bits to the IOFLAG register:
    ustat2 = 0x00000000;
    bit set ustat2 FLG9O|FLG8O|FLG7O|FLG6O|FLG5O|FLG4O;
    dm(IOFLAG) = ustat2;
After writing to the register, the flags can be toggled with the bit tgl command:

    bit tgl ustat2 FLG9|FLG8|FLG7|FLG6|FLG5|FLG4;
    dm(IOFLAG) = ustat2;
Example #2: Configuring FLGx as Input Flags
The following example shows how to configure the flags as input flags, clear the flag pins, and write the modified flag settings to the IOFLAG register:

    ustat2 = 0x00000000;
    bit clr ustat2 FLG9O|FLG8O|FLG7O|FLG6O|FLG5O|FLG4O;
    dm(IOFLAG) = ustat2;
before that change is received external to the processor based on the rising edge of CLKOUT. The same cycle effect applies to the 3:1 and 4:1 clock ratios. For the 3:1 clock ratio, the processor requires up to three CCLK cycles before the change is received external to the processor based on the rising edge of CLKOUT. For the 4:1 clock ratio, the processor requires up to four CCLK cycles. Since a core stall does not occur when writing to or reading from FLAG pins synchronized to the slower ADSP-21161 processor system clock, NOP instructions are required. In this case, write extra NOPs to ensure overruns do not occur at the higher clock ratios. The ADSP-21161 processor samples FLAG inputs at the CLKIN frequency except when CLKDBL is enabled. When CLKDBL is enabled, the processor samples FLAG inputs at the CLKOUT frequency. FLAG outputs must be held stable for at least one full CLKIN cycle. Figure 13-8 shows the delay in setting (or toggling) the flag pins for clock modes 2:1, 3:1, and 4:1.
[Timing diagram: the DM(IOFLAG) = USTAT write shown against CLKOUT for the 2:1, 3:1, and 4:1 clock modes; image not reproduced.]

Figure 13-8. Delay in Setting Flag Pins for Clock Modes 2:1, 3:1 and 4:1
Example #3: Programming 2:1 Clock Ratio
The following example shows how to program an IOFLAG output with a 2:1 CCLK to CLKOUT ratio:
LCNTR = 100, DO flag_toggle UNTIL LCE;
    bit tgl ustat1 FLG4O;
flag_toggle: dm(IOFLAG) = ustat1;
Since a CLKOUT transition occurs every two CCLK instruction cycles, no additional NOP instructions are required.
Example #4: Programming 3:1 Clock Ratio
The following example shows how to set an IOFLAG output with a 3:1 CCLK to CLKOUT ratio:
LCNTR = 100, DO flag_toggle UNTIL LCE;
    bit tgl ustat1 FLG4O;
    dm(IOFLAG) = ustat1;
flag_toggle: nop;
Since a CLKOUT transition occurs every three CCLK instruction cycles, one NOP instruction is required to prevent flag output overrun.
Example #5: Programming 4:1 Clock Ratio
The following example shows how to set an IOFLAG output with a 4:1 CCLK to CLKOUT ratio:
LCNTR = 100, DO flag_toggle UNTIL LCE;
    bit tgl ustat1 FLG4O;
    dm(IOFLAG) = ustat1;
    nop;
flag_toggle: nop;
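The three examples follow a simple pattern: the toggle and the IOFLAG write take two instruction cycles, and each additional CCLK cycle per CLKOUT period is padded with one NOP. The Python sketch below only restates that pattern for the three documented ratios. The function name nops_for_ratio is hypothetical, and the rule is inferred from Examples #3 through #5 rather than taken from the data sheet.

```python
def nops_for_ratio(cclk_per_clkout: int) -> int:
    """Extra NOPs per flag write, as inferred from Examples #3 to #5."""
    if cclk_per_clkout not in (2, 3, 4):
        # Only the 2:1, 3:1, and 4:1 ratios are covered by the examples.
        raise ValueError("unsupported CCLK to CLKOUT ratio")
    # The 'bit tgl' and the IOFLAG write already occupy two CCLK cycles.
    return cclk_per_clkout - 2


for ratio in (2, 3, 4):
    print(f"{ratio}:1 ratio -> {nops_for_ratio(ratio)} NOP(s) in the loop body")
```

In each case the loop body ends up as long, in instructions, as the number of CCLK cycles per CLKOUT period.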
The ADSP-21161 I/O pads have a network of internal diodes to protect the processor from damage by electrostatic discharge. These protection diodes connect the 1.8 V core and 3.3 V I/O supplies internally. Figure 13-9 shows how a network of protection diodes isolates the internal supplies and provides ESD protection for the I/O pins.
During the power-up sequence of the processor, differences in the ramp-up rates and activation times of the two supplies can cause current to flow in the I/O ESD protection circuitry. When applying power separately to the VDDEXT or VDDINT pins, take precautions to prevent or limit the maximum current and duration conducted through the isolation diodes if the unpowered pins are at ground potential. Since the ESD protection diodes connect the 1.8 V core and 3.3 V I/O supplies internally, these diodes can be damaged any time the 1.8 V core supply voltage is present without the 3.3 V I/O supply.
Figure 13-9 (ESD protection diode network; labels: VDDEXT (3.3 V), VDDINT (1.8 V), ADSP-21161, I/O pin, input)
The ESD protection diodes connect the 1.8 V core and 3.3 V I/O supplies internally. Improper supply sequencing can cause damage to the ESD protection circuitry. If the 1.8 V supply is active for prolonged periods of time before the 3.3 V I/O supply is activated, there is a significant amount of loading on the I/O pins. Damage occurs because the I/O is powered from the 1.8 V supply through the ESD diodes.
To prevent this damage to the ESD diode protection circuitry, Analog Devices recommends including a bootstrap Schottky diode. The bootstrap Schottky diode connected between the 1.8 V and 3.3 V power supplies protects the ADSP-21161 from partially powering the 3.3 V supply. Including a Schottky diode shortens the delay between the supply ramps and thus prevents damage to the ESD diode protection circuitry. With this technique, if the 1.8 V rail rises ahead of the 3.3 V rail, the Schottky diode pulls the 3.3 V rail along with the 1.8 V rail.
For many power supply system designers, it may be easier to design the PLL clock gate workaround instead of shortening the VDDINT ramp time. Moving between revisions does not require any hardware modifications to gate the clock. As long as the tCLKVDD startup requirement is met, a reliable start-up reset of the PLL for revision 1.0/1.1 is assured. This requirement guarantees that the CLKIN source is present within 200 ms after the supplies are ramped. See the ADSP-21161N DSP Microcomputer Data Sheet for timing specifications. Holding off CLKIN up to a maximum of 200 ms is allowed.
Figure 13-10 shows a basic block diagram of the Schottky diode connected between the core and I/O voltage regulators and the processor. The anode of the diode must be connected to the 1.8 V supply. The diode must have a forward-bias voltage of 0.6 V or less and must have a current rating sufficient to supply the 3.3 V rail of the system. The Schottky diode is the method recommended by Analog Devices.
Figure 13-10. Dual 1.8 V/3.3 V Supplies With a Schottky Diode (labels: DC input source, VDDEXT, VDDINT, ADSP-21161)
For recommendations on managing power-up sequencing for the core and I/O dual voltage supply, refer to the Powerup Sequencing specifications in the ADSP-21161N SHARC DSP Microcomputer Data Sheet.
Figure (power-on reset circuit; labels: VDDINT, POR, PLL_RESET, PLL, CLKIN, ENA_CLK, internal core, RESET, processor reset. Note: the output POR pulse is generated when VDDINT is between 0 and 1.2 V.)
PLL_RESET is generated as an active high pulse from the point at which VDDINT begins to ramp up from 0 V. It is deactivated when VDDINT reaches 1.2 V. For revisions 1.0 and 1.1, VDDINT must ramp from 0 V to 1.8 V within 2 ms for the POR circuit to properly generate a PLL reset pulse.
Figure 13-12 shows three PLL reset-related input signals: the top one is VDDINT, and the bottom two are derived from VDDINT and are related to the POR circuit. The POR input tracks VDDINT up to 1.2 V before it drops down. This is used to generate the PLL reset pulse. As the input is rising to 1.2 V, the output of the POR generates the reset pulse for the PLL. After the POR input voltage reaches 1.2 V, the POR voltage drops off, which then deactivates the reset pulse connected to the PLL. The POR circuit is driven active low while VDDINT ramps from 0 V to 1.2 V.
If the system is powering down VDDINT and coming back up again, a few requirements must be met to properly generate a PLL reset pulse on the subsequent powerup. First, the POR circuit requires that the VDDINT voltage level is below 0.5 V. Second, the re-ramp from 0.5 V to 1.2 V must occur within 1 ms to guarantee that another PLL_RESET pulse is generated.
PLL CLKIN Enable Circuit
The 9-bit counter counts a certain number of CLKIN cycles before it allows the PLL to begin to lock to the incoming CLKIN frequency. This counter was added to allow the CLKIN source to amplify and oscillate at a stable fundamental frequency before the PLL begins to try to lock to the incoming frequency.
Figure 13-12. PLL Reset (waveforms: VDDINT, POR input, PLL_RESET (POR output); marked levels: 0.5 V and 1.2 V; marked times: 1 ms and 3 ms)
Because oscillator or crystal startup times can range from 5 to 10 ms, the internal 512-cycle counter in some startup cases does not allow the CLKIN oscillator source to reach its stable fundamental frequency before the PLL clock input is enabled. Some oscillators might have a slow frequency ramp-up time of 10 ms.
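To see why the 512-cycle counter can expire long before the oscillator has settled, it helps to convert the cycle count into time. The Python sketch below is a rough illustration only. The 30 MHz CLKIN value is an assumed example, the function name is hypothetical, and the calculation treats CLKIN as running at its nominal frequency from the first cycle, which a starting oscillator does not do.

```python
def counter_time_s(cycles: int, clkin_hz: float) -> float:
    """Time taken to count `cycles` CLKIN periods at a steady CLKIN frequency."""
    return cycles / clkin_hz


CLKIN_HZ = 30e6            # assumed example frequency, not a specification
COUNTER_CYCLES = 512       # 9-bit counter described in the text
OSC_STARTUP_S = 5e-3       # lower end of the 5 to 10 ms startup range

gate_time = counter_time_s(COUNTER_CYCLES, CLKIN_HZ)
print(f"counter expires after about {gate_time * 1e6:.1f} us")
print(f"oscillator startup is roughly {OSC_STARTUP_S / gate_time:.0f}x longer")
```

Under these assumptions the counter covers only a small fraction of the quoted oscillator startup time.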
The revision 1.0 and 1.1 PLL can fail to lock or fail to continue to run if the CLKIN frequency goes below 15 MHz for more than 20 µs. When using CLKDBL, the minimum CLKIN frequency cannot be less than 7.5 MHz. There are two ways in which the PLL can be reset for revisions 1.0 and 1.1:
1. Ensure that the VDDINT ramp rate time is met (< 2 ms) with a stable CLKIN frequency applied when the POR circuit is enabled. When using an external clock oscillator powered by the VDDEXT supply, bring up VDDEXT for a recommended 25 ms before enabling VDDINT. This allows the external CLKIN source to come up and stabilize before the VDDINT power supply is brought up. The VDDINT POR circuit then activates and generates a PLL reset pulse.
2. Hold off or gate the CLKIN source until the VDDINT/VDDEXT supplies are known to be stable. This negates the VDDINT ramp rate requirement if the VDDINT ramp time exceeds 2 ms. Holding CLKIN low or high until the supplies are stable also resets the internal PLL circuitry and allows the PLL to start reliably.
Once the processor is up and running, if you stop the CLKIN source, the PLL can lock up and not restart when CLKIN is reapplied. If there is a brown-out situation in your system, the watchdog circuit must power down the VDDINT supply to 0.5 V or lower and power it back up within 1.0 ms (to restart the POR circuit).
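The brown-out conditions can be expressed as a simple two-part check, for example in a bench script that evaluates captured supply waveforms. The Python sketch below is an illustration under stated assumptions: the function name por_will_rearm and its parameters are hypothetical, and the thresholds are the 0.5 V and 1 ms figures quoted in this section for revisions 1.0 and 1.1.

```python
def por_will_rearm(vddint_min_v: float, reramp_ms: float) -> bool:
    """Return True if a VDDINT dip meets both POR re-ramp conditions in the text."""
    dipped_low_enough = vddint_min_v < 0.5    # supply must fall below 0.5 V
    ramped_fast_enough = reramp_ms <= 1.0     # and ramp back up within 1 ms
    return dipped_low_enough and ramped_fast_enough


print(por_will_rearm(0.3, 0.8))   # deep dip, fast recovery
print(por_will_rearm(0.7, 0.8))   # supply never fell below 0.5 V
print(por_will_rearm(0.3, 2.0))   # recovery too slow
```

A dip that fails either condition should be treated as one that may not produce a new PLL reset pulse.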
Figure 13-13. Power On Reset Circuit Revisions 1.2 (labels: RESET, PLL_RESET, ENA_CLK, CLKIN, PLL, RSTOUT)
The PLL must lock to the CLKIN frequency (around 100 µs). Because the PLL resets on the rising edge of RESET, the PLL needs time to lock to CLKIN before the core can execute or begin the boot process. A delayed core reset has been added via the delay circuit: a 12-bit counter counts up to 4096 CLKIN cycles after RESET transitions from low to high. The delay circuit is activated at the same time the PLL is reset. A secondary RSTOUT pin (B15, which was previously an NC pin) has been added to give system designers the option of having the ADSP-21161 processor reset another device after the core is reset.
Note that, as in previous silicon revisions, CLKOUT is active during reset. During reset, the processor is in PLL BYPASS mode, and the CLKOUT frequency depends on the CLKDBL pin: if CLKDBL is high, the CLKOUT frequency is 1/4 of the CLKIN frequency; if CLKDBL is low, the CLKOUT frequency is 1/2 of the CLKIN frequency.
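The delayed core reset and the CLKOUT rate during reset both follow directly from the CLKIN frequency. The Python sketch below works through the arithmetic. The 30 MHz CLKIN value is an assumed example and the function names are hypothetical; the 4096-cycle count and the 1/4 and 1/2 divide ratios are the ones stated above.

```python
def core_reset_delay_s(clkin_hz: float, cycles: int = 4096) -> float:
    """Delay added by the 12-bit counter after RESET goes from low to high."""
    return cycles / clkin_hz


def clkout_during_reset_hz(clkin_hz: float, clkdbl_high: bool) -> float:
    """CLKOUT frequency in PLL BYPASS mode, per the CLKDBL rule in the text."""
    return clkin_hz / 4 if clkdbl_high else clkin_hz / 2


CLKIN_HZ = 30e6  # assumed example frequency, not a specification

print(f"core reset delay: {core_reset_delay_s(CLKIN_HZ) * 1e6:.1f} us")
print(f"CLKOUT, CLKDBL high: {clkout_during_reset_hz(CLKIN_HZ, True) / 1e6} MHz")
print(f"CLKOUT, CLKDBL low:  {clkout_during_reset_hz(CLKIN_HZ, False) / 1e6} MHz")
```

Because the delay scales inversely with CLKIN, a slower clock gives the PLL proportionally more time to lock before the core starts.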
The advantage of the delayed core reset is that the PLL can be reset any number of times without having to power-down the system. If there is a brown-out situation, the watchdog circuit only has to control the RESET pin to restart the PLL.
emulators JTAG signals, which are routed to one or more ADSP-21161 processor devices, or a combination of ADSP-21161 processor devices and other JTAG devices on the chain.
Figure 13-14. Emulator Interface for Analog Devices JTAG Processors
When the emulator is not connected to this header, jumpers should be placed across BTMS, BTCK, BTRST, and BTDI as shown in Figure 13-15. This holds the JTAG signals in the correct state to allow the ADSP-21161 processor to run freely. All the jumpers should be removed when connecting the emulator to the JTAG header. For a list of the state of each standard JTAG signal, refer to Table 13-10. Use the following legend: O=Output, I=Input, and NU=Not Used.
The ADSP-21161 processor CLKIN signal is the clock signal line (typically 30 MHz or greater) that connects an oscillator to all processors in multiple processor systems requiring synchronization. For synchronous operations to work correctly, the CLKIN signal on all the processors must be the same signal, and the skew between them must be minimal (use clock drivers or other means).
Figure 13-15. JTAG Target Board Connector With No Local Boundary Scan
Table 13-10. State of Standard JTAG Signals
Signal   Description              Emulator   ADSP-21161
TMS      Test Mode Select         O          I
TCK      Test Clock (10 MHz)      O          I
TRST     Test Reset               O          I
TDI      Test Data In             O          I
TDO      Test Data Out            I          O
EMU      Emulation Pin            I          O (Open Drain)
CLKIN    Processor Clock Input    NU         I
Note that the CLKIN signal is not used by the emulator and can cause noise problems if connected to the JTAG header. Legacy documents show it connected to pin 4 of the JTAG header. Pin 4 should be tied to ground on the 14-pin JTAG header (do not connect the JTAG header pin to the processor's CLKIN signal). If you have already connected it to the JTAG header pin and are experiencing noise from this signal, simply clip this pin on the 14-pin JTAG header.
The final connections between a single processor target and the emulation header (within 6 inches) are shown in Figure 13-16. A 4.7 kΩ pull-up resistor has been added on TCK, TDI, and TMS for increased noise resistance.
Figure 13-16. Single Connection to the JTAG Header
If a design uses more than one processor (or other JTAG device in the scan chain), or if the JTAG header is more than 6 inches from the processor, use a buffered connection scheme as shown in Figure 13-17 on page 13-55 (no local boundary scan mode shown). To keep signal skew to a minimum, be sure the buffers are all in the same physical package (typical chips have 6, 8, or 16 drivers). Using a buffer that includes series resistors, such as the 74ABT2244 family, can reduce ringing on the JTAG signal lines. For low voltage applications (3.3 V, 2.5 V, and 1.8 V I/O), the 74ALVT and 74AVC logic families are useful. Also, note the position of the pull-up resistor on EMU. This is required since the EMU line is an open drain signal.
If more than one processor (or JTAG device) is on the target (in the scan chain), you must buffer the JTAG header. This keeps the signals clean and avoids noise problems that occur with longer signal traces (ultimately resulting in reliable emulator operation).
Although the theoretical number of devices that can be supported (by the software) in one JTAG scan chain is large (50 devices or more), it is not recommended that you use more than eight physical devices in one scan chain. A physical device could, however, contain many JTAG devices, such as inside a multi-chip module. The recommendation of not more than eight physical devices is mostly due to the transmission line effects that appear in long signal traces, and is based on some field-collected empirical data. The best approach for large numbers of physical devices is to break the chain into several smaller independent chains, each with its own JTAG header and buffer. If this is not possible, at least add some jumpers that can reduce the number of devices in one chain for debug purposes, and pay special attention in the layout stage to transmission line effects.
Layout Requirements
All JTAG signals (TCK, TMS, TDI, TDO, EMU, TRST) should be treated as critical route signals. Specify a controlled impedance requirement for each route (the value depends on your circuit board, typically 50-75 Ω). It is also important to keep crosstalk and inductance to a minimum on these lines by using a good ground plane and by routing away from other high noise signals such as clock lines. Keep these routes as short and clean as possible, and keep the bused signals (TMS, TCK, TRST, EMU) as close to the same length as possible.
Figure 13-17. Multiple Connection to JTAG Header
The JTAG TAP relies on the state of the TMS line and the TCK clock signal. If these signals have glitches (due to ground bounce, crosstalk, etc.), unreliable emulator operation results. When experiencing emulator problems, look at these signals using a high-speed digital oscilloscope. These lines must be clean, and may require special termination schemes. If you are buffering the JTAG header (most applications do), you must provide signal termination appropriate for your target board (series, parallel, R/C, etc.).
Pod Specifications
This section contains design details on various emulator pod designs by the Analog Devices Tools product line. The emulator pod is the device that connects directly to the target board 14-pin JTAG header. See also Engineer-to-Engineer Notes EE-68.
Figure 13-20. 3.3V JTAG Pod Driver Logic
Parallel terminate the TMS, TCK, TRST, and TDI lines locally on your target board, if needed, since they are driven by the pod with sufficient current drive (32 mA). In order to use the terminators on the TDO line (CLKIN is not used), you MUST have a buffer on your target board JTAG header.
The ADSP-21161 processor is not capable of driving the parallel terminator load directly with TDO. Assuming the proper buffers are included, use the optional parallel terminators by placing a jumper on J2.
You can terminate the TMS, TCK, TRST, and TDI lines locally on your target board, if needed, as long as the terminators' current draw does not exceed the driver's maximum current supply (8 mA). In order to use the terminator on the TDO line, include a buffer on your target board JTAG header. The ADSP-21161 processor is not capable of driving a parallel terminator load (typically 50-75 Ω) directly with TDO. Assuming you have the proper buffers, you may use the optional parallel terminator by adding the appropriate resistors and placing a jumper on J2.
Filtering is implemented only on the link port data and clock inputs. This is possible because the link ports are self-synchronized: the clock and data are sent together. It is not the absolute delay but rather the relative delay between clock and data that determines performance margin. By filtering both LxCLK and LxDAT7-0 with identical circuits, the response to LxCLK glitches and reflections is reduced, but the relative delay is unaffected. The filter has the effect of ignoring a full strength pulse (a glitch) narrower than approximately 2 ns. Glitches that are not full strength can be somewhat wider. The link ports do not use glitch rejection circuits because they can be used with longer, series-terminated transmission lines where the reflections do not occur near the signal transitions.
3. For systems not needing CLKOUT as a clock source, CLKOUT may be used to identify the current bus master. This requires that the outputs not be tied together. If and when this debug feature is not needed, the CLKOUT output can be disabled by setting the COD bit in the SYSCON register. The bus master can be identified by checking the BMSTR pin.
Figure 13-22. Reducing Clock Jitter and Ring (labels: FREQUENCY 1, CLOCK, ADSP-21160, NO CONNECT)
Never share a clock buffer IC with a signal of a different clock frequency. This introduces excessive jitter.
Clock Distribution
There must be low clock skew between processors in a multiprocessor cluster when communicating synchronously on the external bus. The clock must be routed in a controlled-impedance transmission line that can be properly terminated at either the end of the line or the source.
Figure 13-23 illustrates end-of-line termination for the clock. End-of-line termination is not usually recommended unless the distance between the processors is extremely small, because devices that are at a different wire distance from each other receive a skewed clock. This is due to the propagation delay of a PCB transmission line, on which signals typically travel 5 to 6 inches per ns.
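The clock skew caused by unequal wire distances can be estimated from the 5 to 6 inches per ns figure. The Python sketch below is a back-of-the-envelope illustration only. The function name and the 3 inch example length are assumptions, and real boards should be checked against their actual stackup and propagation velocity.

```python
def clock_skew_ns(extra_trace_inches: float,
                  fast_in_per_ns: float = 6.0,
                  slow_in_per_ns: float = 5.0) -> tuple[float, float]:
    """Return (min, max) clock arrival difference in ns for extra trace length."""
    return (extra_trace_inches / fast_in_per_ns,
            extra_trace_inches / slow_in_per_ns)


low, high = clock_skew_ns(3.0)  # 3 in of additional trace, an assumed example
print(f"3 in of extra trace: {low:.2f} to {high:.2f} ns of skew")
```

For the 3 inch example this gives between 0.5 and 0.6 ns of skew between the two devices.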
Figure 13-23. End-Of-Line Termination for the Clock Caution (labels: CLOCK, +5 V, 50 Ω transmission line, 180 Ω, 1.4 V, 70 Ω, three ADSP-21160 devices)
Figure 13-24 illustrates source termination for the clock. Source termination allows delays in each path to be identical. Each device must be at the end of the transmission line because only there does the signal have a single transition. The traces must be routed so that the delay through each is matched to the others. Line impedance higher than 50 Ω may be used, but clock signal traces should be in the PCB layer closest to the ground plane to keep delays stable and crosstalk low. More than one device may be at the end of the line, but the wire length between them must be short and the impedance (capacitance) of these must be kept high. The matched inverters must be in the same IC and must be specified for a low skew (< 1 ns) with respect to each other. This skew should be as small as possible because it subtracts from the margin on most specifications.
Figure 13-24. Use Source Termination to Distribute the Clock (labels: CLOCK; octal inverter ACTQ240 (National Semiconductor), IDT49FCT805/A, or CY7C992; 40 Ω series resistors; 50 Ω transmission lines; three ADSP-21160 devices; buffer drive impedance = 10 Ω. Note: a separate buffer and transmission line is needed for each group of processors that are further than 4 inches from each other.)
Point-to-Point Connections
Unlike previous SHARC processors, the ADSP-21161 processor contains internal series resistance equivalent to 50 Ω on all drivers except the CLKIN and XTAL pins. Therefore, for traces longer than six inches, external series resistors on control, data, clock, or frame sync pins are not required to dampen reflections from transmission line effects for point-to-point connections. However, for more complex networks such as a star configuration, a series termination is still recommended. Figure 13-26 shows an internal resistance in the driver of 10 Ω. The additional 40 Ω series resistor at the driver pad results in a total resistance of 50 Ω. For more specific guidance on related issues, see the reference source in Recommended Reading on page 13-71 for suggestions on transmission line termination. Also, see the ADSP-21161N DSP Microcomputer Data Sheet for output driver rise and fall time data.
Figure 13-25. Source Termination for Point-to-Point Connectors (labels: two ADSP-21160 devices; driver impedance = 17 Ω on link port transmitter; 33 Ω series resistors; off; open circuit)
For link port operation at the full internal clock rate, it is important to maintain low skew between the data (LxDAT7-0) and clock (LxCLK). For full speed operation with a 100 MHz internal clock, a skew of less than 0.5 ns is required. Although the ADSP-21161 processor's serial ports may be operated at a slow rate, the output drivers still have fast edge rates.
Signal Integrity
The capacitive loading on high-speed signals should be reduced as much as possible. Loading of buses can be reduced by using a buffer for devices that operate with wait states, for example DRAMs. This reduces the capacitance on signals tied to the zero-wait-state devices, allowing these signals to switch faster and reducing noise-producing current spikes. Signal run length (inductance) should also be minimized to reduce ringing. Extra care should be taken with certain signals such as the read and write strobes (RD, WR) and acknowledge (ACK). In a multiprocessor cluster, each processor can drive the read or write strobes. In this case, some damping resistance should be put in the signal path if the line length is greater than 6 inches (Figure 13-26). This is at the expense of additional signal delay. The time budget for these signals should be carefully analyzed.
Figure 13-26 (labels: four ADSP-21160 devices, each with a 10 Ω series damping resistor)
Position the processors on both sides of the board to reduce area and distances if possible.
Design for lower transmission line impedances to reduce crosstalk and to allow better control of impedance and delay.
Use 3.3 V peripheral components and power supplies to help reduce transmission line problems, because the receiver switching voltage of 1.5 V is close to the middle of the voltage swing. In addition, ground bounce and noise coupling are reduced.
Experiment with the board and isolate crosstalk and noise issues from reflection issues. This can be done by driving a signal wire from a pulse generator and studying the reflections while other components and signals are passive.
Figure (bypass capacitor placement for the ADSP-21160. Case 1: bypass capacitors on the non-component (bottom) side of the board, beneath the DSP package. Case 2: bypass capacitors on the component (top) side of the board, around the DSP package.)
Oscilloscope Probes
When making high-speed measurements, be sure to use a bayonet type or similarly short (< 0.5 inch) ground clip, attached to the tip of the oscilloscope probe. The probe should be a low-capacitance active probe with 3 pF or less of loading. The use of a standard ground clip with 4 inches of ground lead causes ringing to be seen on the displayed trace and
makes the signal appear to have excessive overshoot and undershoot. A 1 GHz or better sampling oscilloscope is needed to see the signals accurately.
Recommended Reading
The text High-Speed Digital Design: A Handbook of Black Magic is recommended for further reading. This book is a technical reference that covers the problems encountered in state-of-the-art, high-frequency digital circuit design, and is an excellent source of information and practical ideas. Topics covered in the book include:
• High-Speed Properties of Logic Gates
• Measurement Techniques
• Transmission Lines
• Ground Planes and Layer Stacking
• Terminations and Vias
• Power Systems
• Ribbon Cables and Connectors
• Clock Distribution and Clock Oscillators
High-Speed Digital Design: A Handbook of Black Magic, Johnson and Graham, Prentice Hall, Inc., ISBN 0-13-395724-1.
cute instructions from external memory without booting, a No boot mode may also be configured. For information on the setup and DMA processes for booting a single processor, see Bootloading Through The External Port on page 6-70, Bootloading Through The Link Port on page 6-88, and Bootloading Through the SPI Port on page 6-113. Multiprocessor systems can be booted from a host processor, from external EPROM, through a link port, through an SPI port, or from external memory.
Table 13-11. Booting Modes
Booting Mode                                            EBOOT   LBOOT   BMS
EPROM (connect BMS to EPROM chip select)                1       0       Output
Host processor                                          0       0       1 (Input)
Serial boot via SPI                                     0       1       0 (Input)
Link port                                               0       1       1 (Input)
No booting (processor executes from external memory)    0       0       0 (Input)
Reserved                                                1       1       x (Input)
processors can boot either identical code or different code from the EPROM. If the processors load differing code, a jump table (based on processor ID) can be used to select the code for each processor.
Figure 13-28. Alternating Booting From an EPROM (labels: ADSP-21161 (S1), ADSP-21161 (S2), ADSP-21161 (S6); EBOOT, LBOOT, BMS; ADDR, DATA, RD, CONTROL; EPROM CS. Note: here, multiple SHARCs boot from the same EPROM. For this configuration, the loader routine uses a jump table. This table indicates the address of the image that loads into each processor. The processors can load the same image or individual images.)
Sequential Booting
The EBOOT pin of the ADSP-21161 processor with IDx=1 must be set high for EPROM booting. All other processors should be configured for host booting (EBOOT=0, LBOOT=0, and BMS=1), which leaves them in the idle state at startup and allows the processor with IDx=1 to become bus master and boot itself. Only the BMS pin of processor #1 is connected to the chip select of the EPROM. When processor #1 has finished booting, it can boot the remaining processors by writing to their external port DMA buffer 0 (EPB0) via multiprocessor memory space. An example system that uses this sequential technique appears in Figure 13-29.
Figure 13-29 (sequential booting example; labels: EBOOT, LBOOT, ADDR, DATA, RD, CS, EPROM)
data delay of one. Throughput is the maximum rate at which the operation is performed. Data delay and throughput are the same whether the access is from a host processor or from another ADSP-21161 processor.
Execution Stalls
The following events can cause an execution stall for the ADSP-21161 processor:
• 1 cycle on a program memory data access with instruction cache miss
• 2 cycles on non-delayed branches
• 2 cycles on normal interrupts
• 5 cycles on vector interrupts
• 1-2 cycles on short loops with small iterations
• n cycles on an IDLE instruction
DAG Stalls
1 cycle hold on register conflict
Memory Stalls
• 1 cycle on PM and DM bus access to the same block of internal memory
• n cycles if conflicting accesses to external memory
• n cycles if access to external memory (until I/O buffers are cleared out)
• n cycles if external access and ADSP-21161 processor does not control the external bus
• n cycles until external access is complete (for example, waitstates, idle cycles, and so on)
DMA Stalls
• 1 cycle if an access to a DMA parameter register or the DMASTAT register conflicts with DMA address generation. For example, a one-cycle stall occurs when writing to a DMA register while a register update is taking place. Similarly, a one-cycle stall occurs when reading from a DMA register while DMA chaining is taking place.
• n cycles if writing (or reading) to a DMA buffer when the buffer is full (or empty)
Core processor access to external memory
Synchronous access to slave's IOP registers
    Read (Transfer out)
    Write (Transfer in)
Slave mode DMA
    Read (Transfer out)
    Write (Transfer in)
Master mode DMA
    Transfer out
    Transfer in
Handshake mode DMA (see note 3)
    Read/Write (Transfer in/out)
External-Handshake mode DMA (see note 4)
    Read/Write (Transfer in/out)
0 2 3 3
1 The delay is between data in the IOP register and at the external port. For example, an IOP register is written in the second cycle after a write completes at the external port.
2 These transfer rates are limited by the speed of the read of the DMA FIFO buffer. When bursting is enabled, the first read requires three cycles. The maximum burst read throughput is 3-2-2-2.
3 The delay is between DMA and DMARx.
4 The delay is between DMARx and the external transfer.
Interrupts (IRQ2-0)
Multiprocessor bus requests (BR1-6)
Host bus request
SYSCON effect latency
Host packing status update in SYSTAT register
DMA packing status update in DMACx register
DMA chain initialization
Vector interrupt
Serial ports (see note 1)
Link ports
    1x CCLK speed
    1/2x CCLK speed
    1/3x CCLK speed
    1/4x CCLK speed
1 ADSP-21161 processor to ADSP-21161 processor transfers using 32-bit words. Link port throughput is decreased and cycle time increased when the link port clock divisor bits are set in the LCTL register.
The link port control register LCTL and the serial port control register SPCTLx share the same internal bus for reads and writes. Therefore, when a read of one of these registers is followed by a write, the write requires two processor cycles to complete.
A REGISTERS
The ADSP-21161 processor has general-purpose and dedicated registers in each of its functional blocks. The register reference information for each functional block includes bit definitions, initialization values, and memory-mapped addresses (for I/O processor registers). Information on each type of register is available at the following locations:
• Control and Status System Registers on page A-2
• Processing Element Registers on page A-23
• Program Sequencer Registers on page A-25
• Data Address Generator Registers on page A-46
• I/O Processor Registers on page A-47
When writing programs, it is often necessary to set, clear, or test bits in the processor's registers. While these bit operations can all be done by referring to the bit's location within a register or (for some operations) the register's address with a hexadecimal number, it is much easier to use symbols that correspond to the bit's or register's name. For convenience and consistency, Analog Devices provides a header file that provides these bit and register definitions. For more information, see Register and Bit #Defines (def21161.h) on page A-121.
Many registers have reserved bits. When writing to a register, programs may only clear (write zero to) the register's reserved bits.
The MODE1 register initialization value is 0x0000 0000 for revisions less than 1.0. For revisions greater than or equal to 1.0, the initialization value is 0x0100 0000 because circular buffering (CBUFEN) is enabled. MODE2_SHDW bits 31-25 are the processor ID and silicon revision number, so the initialization value varies with the processor's ID2-0 pin inputs and the silicon revision.
Table (MODE1 register bit definitions; descriptions not recoverable). Recoverable entries: BR0; SRCU; SRD1H; SRD1L; SRD2H; SRRFH; bits 9-8; bit 10; bit 11 NESTM; bit 14 SSE; bit 16 RND32; bits 18-17 CSEL; bits 20-19; bit 21; bits 31-25.
MODE1 register bit fields (bits 15-0 reset to 0):
CBUFEN   0=Disable circular buffering, 1=Enable circular buffering
RND32    0=Round floating-point data to 40 bits, 1=Round floating-point data to 32 bits
BDCST1   0=Disable I1 broadcast, 1=Enable I1 broadcast
CSEL     Condition code select, 00=Bus master condition
BDCST9   0=Disable I9 broadcast, 1=Enable I9 broadcast
PEYEN    0=Disable PEy (SISD mode), 1=Enable PEy (SIMD mode)
TRUNC    0=Floating-point round-to-nearest, 1=Floating-point truncation
SSE      0=Disable short word sign extension, 1=Enable short word sign extension
BR8      0=Disable I8 bit-reversing (DAG2), 1=Enable I8 bit-reversing (DAG2)
BR0      0=Disable I0 bit-reversing (DAG1), 1=Enable I0 bit-reversing (DAG1)
SRCU     0=Enable MR primary, 1=Enable MR alternate
ALUSAT   0=Disable ALU saturation, 1=Enable ALU saturation
SRD1H    0=Enable DAG1 7-4 primary, 1=Enable DAG1 7-4 alternate
SRD1L    0=Enable DAG1 3-0 primary, 1=Enable DAG1 3-0 alternate
IRPTEN   0=Disable interrupts, 1=Enable interrupts
NESTM    0=Disable interrupt nesting, 1=Enable interrupt nesting
SRD2H    0=Enable DAG2 15-12 primary, 1=Enable DAG2 15-12 alternate
SRD2L    0=Enable DAG2 11-8 primary, 1=Enable DAG2 11-8 alternate
SRRFH    0=Enable R15-R8 primary, 1=Enable R15-R8 alternate
SRRFL    0=Enable R7-R0 primary, 1=Enable R7-R0 alternate
2. When an IRQ2-0 timer expires or a VIRPT interrupt occurs. Example Before the PUSH STS instruction, MODE1 is set to 0x01202811. This MODE1 value corresponds to the following settings being enabled: Bit Reversing for I8 Secondary Registers for DAG2 (high) Interrupt Nesting, ALU Saturation Processor Element Y (SIMD) Circular Buffering The MMASK register (Figure A-2) is set to 0x0020 2001 indicating that you want to disable ALU Saturation, SIMD, and bit reversing for I8 after pushing the status stack. The value in MODE1 after PUSH STS is 0x0100 0810. The other settings that were previously in MODE1 remain the same. The only bits that are affected are those that are set both in MMASK and in MODE1. These bits are cleared after the status stack is pushed.
Note also that the reset value of MMASK is 0x0020 0000. If you do not make any changes to the MMASK register, the default setting automatically disables SIMD when servicing any of the hardware interrupts mentioned above, or during any push of the status stack.
Figure A-2. MMASK register bit fields (bits 31-0):
CBUFEN  0=Disable Circular Buffering, 1=Enable Circular Buffering
RND32  0=Round Floating-Point Data to 40 Bits, 1=Round Floating-Point Data to 32 Bits
BDCST1  0=Disable I1 Broadcast, 1=Enable I1 Broadcast
BDCST9  0=Disable I9 Broadcast, 1=Enable I9 Broadcast
CSEL  Condition Code Select, 00=Bus Master Condition
PEYEN  0=Disable PEy (SISD mode), 1=Enable PEy (SIMD mode)
TRUNC  0=Floating-Point Round-to-Nearest, 1=Floating-Point Truncation
SSE  0=Disable Short Word Sign Extension, 1=Enable Short Word Sign Extension
ALUSAT  0=Disable ALU Saturation, 1=Enable ALU Saturation
IRPTEN  0=Disable Interrupts, 1=Enable Interrupts
NESTM  0=Disable Interrupt Nesting, 1=Enable Interrupt Nesting
SRCU  0=Enable MR Primary, 1=Enable MR Alternate
SRD1H  0=Enable DAG1 7-4 Primary, 1=Enable DAG1 7-4 Alternate
SRD1L  0=Enable DAG1 3-0 Primary, 1=Enable DAG1 3-0 Alternate
SRD2H  0=Enable DAG2 15-12 Primary, 1=Enable DAG2 15-12 Alternate
SRD2L  0=Enable DAG2 11-8 Primary, 1=Enable DAG2 11-8 Alternate
SRRFH  0=Enable R15-R8 Primary, 1=Enable R15-R8 Alternate
SRRFL  0=Enable R7-R0 Primary, 1=Enable R7-R0 Alternate
BR8  0=Disable I8 Bit-Reversing (DAG2), 1=Enable I8 Bit-Reversing (DAG2)
BR0  0=Disable I0 Bit-Reversing (DAG1), 1=Enable I0 Bit-Reversing (DAG1)
14-7 15 16
20
IIRAE
MODE2 register bit fields (bits 31-0):
U64MAE  Unaligned 64-bit memory access enable
IIRAE  0=Disable detection of illegal IOP access, 1=Enable detection of illegal IOP access
CAFRZ  0=Cache Updates, 1=Cache Freeze (No Updates)
FLG3O  0=FLAG3 Input, 1=FLAG3 Output
FLG2O  0=FLAG2 Input, 1=FLAG2 Output
FLG1O  0=FLAG1 Input, 1=FLAG1 Output
FLG0O  0=FLAG0 Input, 1=FLAG0 Output
BUSLK  0=No External Bus Lock, 1=External Bus Lock
TIMEN  0=Disable Timer, 1=Enable Timer
CADIS  0=Enable Cache, 1=Disable Cache
IRQ2E  0=IRQ2 Level-Sensitive, 1=IRQ2 Edge-Sensitive
IRQ1E  0=IRQ1 Level-Sensitive, 1=IRQ1 Edge-Sensitive
IRQ0E  0=IRQ0 Level-Sensitive, 1=IRQ0 Edge-Sensitive
If a program loads the ASTATx register manually, there is a one-cycle effect latency before the new value in ASTATx can be used in a conditional instruction.

Table A-4. Arithmetic Status Registers (ASTATx/y) Bit Definitions

Bit(s)  Name  Definition
0  AZ  ALU Zero/Floating-Point Underflow. Indicates whether the last ALU operation's result was zero (if set, =1) or non-zero (if cleared, =0). The ALU updates AZ for all fixed-point and floating-point ALU operations. AZ can also indicate a floating-point underflow. During an ALU underflow (indicated by a set (=1) AUS bit in the STKYx/y register), the processor sets AZ if the floating-point result is smaller than can be represented in the output format.
1  AV  ALU Overflow. Indicates whether the last ALU operation's result overflowed (if set, =1) or did not overflow (if cleared, =0). The ALU updates AV for all fixed-point and floating-point ALU operations. For fixed-point results, the processor sets AV and the AOS bit in the STKYx/y register when the XOR of the two most significant bits is a 1. For floating-point results, the processor sets AV and the AVS bit in the STKYx/y register when the rounded result overflows (unbiased exponent > 127).
2  AN  ALU Negative. Indicates whether the last ALU operation's result was negative (if set, =1) or positive (if cleared, =0). The ALU updates AN for all fixed-point and floating-point ALU operations.
Multiplier Negative. Indicates whether the last multiplier operation's result was negative (if set, =1) or positive (if cleared, =0). The multiplier updates MN for all fixed-point and floating-point multiplier operations.
If the multiplier operation directs a fixed-point result to an MR register, the processor places the overflowed portion of the result in MR1 and MR2 for an integer result or places it in MR2 only for a fractional result.
If the multiplier operation directs a fixed-point, fractional result to an MR register, the processor places the underflowed portion of the result in MR0.
9  MI  Multiplier Floating-Point Invalid Operation. Indicates whether the last multiplier operation's input was invalid (if set, =1) or valid (if cleared, =0). The multiplier updates MI for floating-point multiplier operations. The processor sets MI and the MIS bit in the STKYx/y register if the multiplier operation:
- Receives a NAN input operand
- Receives an Infinity and Zero as input operands
10  AF  ALU Floating-Point Operation. Indicates whether the last ALU operation was floating-point (if set, =1) or fixed-point (if cleared, =0). The ALU updates AF for all fixed-point and floating-point ALU operations.
12
SZ
17-14 18
ASTATx/y register bit fields (bits 31-0):
CACC
BTF  Bit Test Flag for System Registers
SS  Shifter Input Sign
SZ  Shifter Zero
SV  Shifter Overflow
AZ  ALU Zero/Floating-Point Underflow
AV  ALU Overflow
AN  ALU Negative
AC  ALU Fixed-Point Carry
AS  ALU X Input Sign (for ABS and MANT)
AF  ALU Floating-Point Operation
AI  ALU Floating-Point Invalid Operation
MI  Multiplier Floating-Point Invalid Operation
MU  Multiplier Floating-Point Underflow
MV  Multiplier Overflow
MN  Multiplier Negative
only indicates status for PEy operations. Table A-5, Figure A-5, and Figure A-6 list the bits for both STKYx and STKYy and mark the bits that apply only to STKYx. STKY bits do not clear themselves after the condition they flag is no longer true. They remain sticky until cleared by the program.
The ADSP-21161 processor sets a STKY bit in response to a condition. For example, the processor sets the AUS bit in the STKY register when an ALU underflow sets AZ in the ASTAT register. The processor clears AZ if the next ALU operation does not cause an underflow. The AUS bit remains set until a program clears the STKY bit. Interrupt service routines should clear their interrupt's corresponding STKY bit so the processor can detect a re-occurrence of the condition. For example, an interrupt service routine for the floating-point underflow exception interrupt (FLTUI) would clear the AUS bit in the STKY register near the beginning of the routine.

Table A-5. Sticky Status Registers (STKYx/y) Bit Definitions
Bits 9-0 appear in both STKYx and STKYy; bits 26-17 appear in STKYx only.

Bit(s)  Name  Definition
0  AUS  ALU Floating-Point Underflow. A sticky indicator for the ALU AZ bit. For more information, see AZ on page A-13.
1  AVS  ALU Floating-Point Overflow. A sticky indicator for the ALU AV bit. For more information, see AV on page A-13.
2  AOS  ALU Fixed-Point Overflow. A sticky indicator for the ALU AV bit. For more information, see AV on page A-13.
4-3  Reserved
5  AIS  ALU Floating-Point Invalid Operation. A sticky indicator for the ALU AI bit. For more information, see AI on page A-14.
6  MOS  Multiplier Fixed-Point Overflow. A sticky indicator for the multiplier MV bit. For more information, see MV on page A-15.
7  MVS  Multiplier Floating-Point Overflow. A sticky indicator for the multiplier MV bit. For more information, see MV on page A-15.
8  MUS  Multiplier Floating-Point Underflow. A sticky indicator for the multiplier MU bit. For more information, see MU on page A-16.
9  MIS  Multiplier Floating-Point Invalid Operation. A sticky indicator for the multiplier MI bit. For more information, see MI on page A-16.
16-10  Reserved
17  CB7S  DAG1 Circular Buffer 7 Overflow. Indicates whether a circular buffer being addressed with DAG1 register I7 has overflowed (if set, =1) or has not overflowed (if cleared, =0). A circular buffer overflow occurs when a DAG circular buffering operation increments the I register past the end of the buffer.
18  CB15S  DAG2 Circular Buffer 15 Overflow. Indicates whether a circular buffer being addressed with DAG2 register I15 has overflowed (if set, =1) or has not overflowed (if cleared, =0). A circular buffer overflow occurs when a DAG circular buffering operation increments the I register past the end of the buffer.
19  IIRA  Illegal IOP Register Access. Indicates whether a core, host, or multiprocessor access to I/O processor registers has occurred (if set, =1) or has not occurred (if cleared, =0).
20  U64MA  Unaligned 64-Bit Memory Access. Indicates whether a Normal word access with the LW mnemonic addressing an uneven memory address has occurred (if set, =1) or has not occurred (if cleared, =0).
21  PCFL  PC Stack Full. Indicates whether the PC stack is full (if 1) or not full (if 0). Not a sticky bit; cleared by a Pop.
22  PCEM  PC Stack Empty. Indicates whether the PC stack is empty (if 1) or not empty (if 0). Not sticky; cleared by a Push.
23  SSOV  Status Stack Overflow. Indicates whether the status stack is overflowed (if 1) or not overflowed (if 0). A sticky bit.
24  SSEM  Status Stack Empty. Indicates whether the status stack is empty (if 1) or not empty (if 0). Not sticky; cleared by a Push.
25  LSOV  Loop Stack Overflow. Indicates whether the loop counter stack and loop stack are overflowed (if 1) or not overflowed (if 0). A sticky bit.
26  LSEM  Loop Stack Empty. Indicates whether the loop counter stack and loop stack are empty (if 1) or not empty (if 0). Not sticky; cleared by a Push.
31-27  Reserved
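The difference between a live status bit (AZ in ASTAT) and its sticky counterpart (AUS in STKY) can be illustrated with a small host-side model. This is a sketch under stated assumptions, not processor code: the class `StickyModel` and its method names are hypothetical, it models only the underflow case (not zero results), and it takes AUS to be bit 0 as listed in Table A-5.

```python
AUS = 1 << 0  # ALU floating-point underflow sticky bit (bit 0, Table A-5)


class StickyModel:
    """Hypothetical model: AZ follows the most recent ALU operation,
    while AUS stays set until the program clears it."""

    def __init__(self):
        self.az = False  # live status bit, updated on every ALU operation
        self.stky = 0    # sticky status word

    def alu_operation(self, underflow: bool) -> None:
        self.az = underflow      # AZ reflects only the last operation
        if underflow:
            self.stky |= AUS     # the sticky bit is set and left alone

    def clear_sticky(self, mask: int) -> None:
        self.stky &= ~mask       # only the program clears sticky bits


model = StickyModel()
model.alu_operation(underflow=True)    # AZ set, AUS set
model.alu_operation(underflow=False)   # AZ cleared, AUS still set
model.clear_sticky(AUS)                # e.g. near the start of an FLTUI routine
```

Clearing the bit early in the service routine is what lets a second underflow, occurring while the routine runs, be detected afterward.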
STKYx register bit fields (bits 31-0):
LSEM  Loop Stack Empty (read-only)
LSOV  Loop Stack Overflow (read-only)
SSEM  Status Stack Empty (read-only)
SSOV  Status Stack Overflow (read-only)
PCEM  PC Stack Empty (read-only). Not sticky, cleared by PUSH
PCFL  PC Stack Full (read-only). Not sticky, cleared by POP
U64MA  Unaligned 64-Bit Memory Access. 1=unaligned access has occurred, 0=no access occurred
IIRA  Illegal IOP Register Access. 1=illegal access occurred, 0=no illegal access
CB15S  DAG2 Circular Buffer 15 Overflow
CB7S  DAG1 Circular Buffer 7 Overflow
MIS  Multiplier Floating-Point Invalid Operation
MUS  Multiplier Floating-Point Underflow
MVS  Multiplier Floating-Point Overflow
MOS  Multiplier Fixed-Point Overflow
AIS  ALU Floating-Point Invalid Operation
AOS  ALU Fixed-Point Overflow
AVS  ALU Floating-Point Overflow
AUS  ALU Floating-Point Underflow

STKYy register bit fields (bits 31-0):
MIS  Multiplier Floating-Point Invalid Operation
MUS  Multiplier Floating-Point Underflow
MVS  Multiplier Floating-Point Overflow
MOS  Multiplier Fixed-Point Overflow
AIS  ALU Floating-Point Invalid Operation
AOS  ALU Fixed-Point Overflow
AVS  ALU Floating-Point Overflow
AUS  ALU Floating-Point Underflow
USTAT1, USTAT2, USTAT3, and USTAT4 registers: each register spans bits 31-0.
Binary Point
8 bits ZEROS
PX2
31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
PX1
FLAGS bits 0-3 are equal to the values of the FLAG0-3 input pins after reset; the flag pins are configured as inputs after reset.
RSTI
IICDI
SOVFI
Since the timer expired event (TCOUNT decrements to zero) generates two interrupts, TMZHI and TMZLI, programs should unmask the timer interrupt with the desired priority and leave the other one masked.
5  VIRPTI  Multiprocessor Vector Interrupt. Indicates whether a VIRPTI is latched and is pending (if set, =1) or no VIRPTI is pending (if cleared, =0). A VIRPTI occurs when one of the DSPs in a multiprocessor system writes an address (the vector) to the processor's VIRPT register.
6  IRQ2I  IRQ2 Hardware Interrupt. Indicates whether an IRQ2I is latched and is pending (if set, =1) or no IRQ2I is pending (if cleared, =0). An IRQ2I occurs when an external device asserts the IRQ2 pin.
7  IRQ1I  IRQ1 Hardware Interrupt. Indicates whether an IRQ1I is latched and is pending (if set, =1) or no IRQ1I is pending (if cleared, =0). An IRQ1I occurs when an external device asserts the IRQ1 pin.
8  IRQ0I  IRQ0 Hardware Interrupt. Indicates whether an IRQ0I is latched and is pending (if set, =1) or no IRQ0I is pending (if cleared, =0). An IRQ0I occurs when an external device asserts the IRQ0 pin.
9  Reserved
11
SP1I
12
SP2I
13
SP3I
14
LPISUMI
16  EP1I  External Port Buffer 1 DMA Interrupt. Indicates whether an EP1I is latched and is pending (if set, =1) or no EP1I is pending (if cleared, =0). For more information, see EP0I on page A-30.
17  EP2I  External Port Buffer 2 DMA Interrupt. Indicates whether an EP2I is latched and is pending (if set, =1) or no EP2I is pending (if cleared, =0). For more information, see EP0I on page A-30.
18  EP3I  External Port Buffer 3 DMA Interrupt. Indicates whether an EP3I is latched and is pending (if set, =1) or no EP3I is pending (if cleared, =0). For more information, see EP0I on page A-30.
19  LSRQI  Link Port Service Request Interrupt. Indicates whether an LSRQI is latched and is pending (if set, =1) or no LSRQI is pending (if cleared, =0). An LSRQI occurs when an external source accesses an unassigned link port or accesses an assigned link port that has its link buffer disabled.
20  CB7I  DAG1 Circular Buffer 7 Overflow Interrupt. Indicates whether a CB7I is latched and is pending (if set, =1) or no CB7I is pending (if cleared, =0). For more information, see CB7S on page A-20.
21  CB15I  DAG2 Circular Buffer 15 Overflow Interrupt. Indicates whether a CB15I is latched and is pending (if set, =1) or no CB15I is pending (if cleared, =0). For more information, see CB15S on page A-20.
22  TMZLI  Timer Expired (Low Priority) Interrupt. Indicates whether a TMZLI is latched and is pending (if set, =1) or no TMZLI is pending (if cleared, =0). For more information, see TMZHI on page A-28.
24
FLTOI
25
FLTUI
26
FLTII
27
SFT0I
28
SFT1I
29
SFT2I
30
SFT3I
31
registers. The bits in IMASK unmask (enable if set, =1) or mask (disable if cleared, =0) the interrupts that are latched in the IRPTL register. Except for RESET, all interrupts are maskable.
When IMASK masks an interrupt, the masking disables the processor's response to the interrupt. The IRPTL register still latches an interrupt even when masked, and the processor responds to that latched interrupt if it is later unmasked. Table A-10 and Figure A-10 provide bit definitions for the IMASK register.
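The latch-then-unmask behavior described above reduces to a bitwise AND of the two registers. The following host-side Python sketch models that relationship only; the function name `serviceable` is hypothetical, and VIRPTI is taken as bit 5 as listed in the IRPTL bit definitions.

```python
VIRPTI = 1 << 5  # multiprocessor vector interrupt, bit 5 of IRPTL/IMASK


def serviceable(irptl: int, imask: int) -> int:
    """Return the latched interrupts the processor may respond to:
    those latched in IRPTL whose IMASK bit is set (unmasked)."""
    return irptl & imask


irptl = 0
imask = 0            # VIRPTI masked

irptl |= VIRPTI      # the interrupt is latched even though it is masked
pending_now = serviceable(irptl, imask)    # 0: no response while masked

imask |= VIRPTI      # unmask later
pending_later = serviceable(irptl, imask)  # VIRPTI: the latched interrupt is now seen
```

The model omits priority resolution and the global interrupt enable; it shows only that masking hides a latched interrupt without discarding it.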
Interrupt bit fields (bits 31-0), with interrupt vector addresses where the figure gives them:
Emulator Interrupt (interrupt vector address 0x00)
RSTI  Reset (0x04)
IICDI  Illegal Input Condition Detected (0x08)
SOVFI
TMZHI
VIRPTI  Multiprocessor Vector Interrupt (0x14)
IRQ2I  IRQ2 Asserted (0x18)
IRQ1I  IRQ1 Asserted (0x1C)
IRQ0I  IRQ0 Asserted (0x20)
SP0I  SPORT0 DMA (0x28)
SP1I  SPORT1 DMA (0x2C)
SP2I  SPORT2 DMA (0x30)
SP3I  SPORT3 DMA (0x34)
LPISUMI  Link or SPI Buffer DMA Summary
EP0I  External Port Buffer 0 DMA (0x50)
EP1I
EP2I
EP3I
LSRQI  Link Port Service Request (0x60)
CB7I
CB15I
TMZLI
FIXI
FLTOI
FLTUI
FLTII
Note that LP0 is set irrespective of whether the link port is enabled in core mode or DMA mode.
1  LP1I  Link Port Buffer 1 DMA Interrupt. Indicates whether an LP1 interrupt is latched and is pending (if set, =1) or no LP1 interrupt is pending (if cleared, =0).
2  SPIRI  SPI Receive DMA Interrupt Latch. Indicates whether an SPIRI is latched and is pending (if set, =1) or no SPIRI is pending (if cleared, =0).
3  SPITI  SPI Transmit DMA Interrupt Latch. Indicates whether an SPITI is latched and is pending (if set, =1) or no SPITI is pending (if cleared, =0).
Table A-10. Link Port Interrupt Latch, Mask, and Mask Pointer Register (LIRPTL) Bit Definitions (Cont'd)

Bit(s)  Name  Definition
15-4  Reserved
16  LP0MSK  Link Buffer 0 DMA Interrupt Mask. This bit unmasks the LP0 interrupt (if set, =1) or masks the LP0 interrupt (if cleared, =0). For more information on how interrupt masking works, see Interrupt Latch Register (IRPTL) on page A-27.
17  LP1MSK  Link Buffer 1 DMA Interrupt Mask. This bit unmasks the LP1 interrupt (if set, =1) or masks the LP1 interrupt (if cleared, =0). For more information on how interrupt masking works, see Interrupt Latch Register (IRPTL) on page A-27.
18  SPIRMSK  SPI Receive DMA Interrupt Mask. This bit unmasks the SPIR interrupt (if set, =1) or masks the SPIR interrupt (if cleared, =0). For more information on how interrupt masking works, see Interrupt Latch Register (IRPTL) on page A-27.
19  SPITMSK  SPI Transmit DMA Interrupt Mask. This bit unmasks the SPIT interrupt (if set, =1) or masks the SPIT interrupt (if cleared, =0). For more information on how interrupt masking works, see Interrupt Latch Register (IRPTL) on page A-27.
23-20  Reserved
24  LP0MSKP  Link Buffer 0 DMA Interrupt Mask Pointer. When the ADSP-21161 processor is servicing another interrupt, this bit indicates whether the LP0 interrupt is unmasked (if set, =1) or the LP0 interrupt is masked (if cleared, =0). For more information on how interrupt mask pointers work, see Interrupt Mask Pointer Register (IMASKP) on page A-32.
25  LP1MSKP  Link Buffer 1 DMA Interrupt Mask Pointer. When the processor is servicing another interrupt, this bit indicates whether the LP1 interrupt is unmasked (if set, =1) or the LP1 interrupt is masked (if cleared, =0). For more information on how interrupt mask pointers work, see Interrupt Mask Pointer Register (IMASKP) on page A-32.
26  SPIRMSKP  SPI Receive DMA Interrupt Mask Pointer. When the processor is servicing another interrupt, this bit indicates whether the SPIR interrupt is unmasked (if set, =1) or the SPIR interrupt is masked (if cleared, =0). For more information on how interrupt mask pointers work, see Interrupt Mask Pointer Register (IMASKP) on page A-32.
27  SPITMSKP  SPI Transmit DMA Interrupt Mask Pointer. When the processor is servicing another interrupt, this bit indicates whether the SPIT interrupt is unmasked (if set, =1) or the SPIT interrupt is masked (if cleared, =0). For more information on how interrupt mask pointers work, see Interrupt Mask Pointer Register (IMASKP) on page A-32.
31-28  Reserved
LIRPTL register bit fields (bits 31-0):
SPITMSKP  SPI Transmit DMA Interrupt Mask Pointer
SPIRMSKP  SPI Receive DMA Interrupt Mask Pointer
LP1MSKP  Link Buffer 1 DMA Interrupt Mask Pointer
LP0MSKP  Link Buffer 0 DMA Interrupt Mask Pointer
SPITMSK  SPI Transmit DMA Interrupt Mask
SPIRMSK  SPI Receive DMA Interrupt Mask
LP1MSK  Link Buffer 1 DMA Interrupt Mask
LP0MSK  Link Buffer 0 DMA Interrupt Mask
SPITI  SPI Transmit DMA Interrupt Latch (interrupt vector address offset 0x44)
SPIRI  SPI Receive DMA Interrupt Latch (0x40)
LP1I  Link Buffer 1 DMA Interrupt Latch (0x3C)
LP0I  Link Buffer 0 DMA Interrupt Latch (0x38)
FLAGS register bit fields (bits 31-0):
FLG3  0=FLG3 pin cleared, 1=FLG3 pin set
FLG2  0=FLG2 pin cleared, 1=FLG2 pin set
FLG1  0=FLG1 pin cleared, 1=FLG1 pin set
FLG0  0=FLG0 pin cleared, 1=FLG0 pin set
on these flags. However, it is possible to execute these operations indirectly by writing to system registers such as USTAT1, USTAT2, USTAT3 or USTAT4. Table A-12. IOFLAG Register (IOFLAG) Bit Definitions
Bit(s)  Name  Definition
0  FLG4  FLAG4 Value. Indicates the state of the FLAG4 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
1  FLG5  FLAG5 Value. Indicates the state of the FLAG5 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
2  FLG6  FLAG6 Value. Indicates the state of the FLAG6 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
3  FLG7  FLAG7 Value. Indicates the state of the FLAG7 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
4  FLG8  FLAG8 Value. Indicates the state of the FLAG8 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
5  FLG9  FLAG9 Value. Indicates the state of the FLAG9 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
6  FLG10  FLAG10 Value. Indicates the state of the FLAG10 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
7  FLG11  FLAG11 Value. Indicates the state of the FLAG11 pin, whether the pin is high (if set, =1) or low (if cleared, =0).
8  FLG4O  FLAG4 Output Select. This bit selects the I/O direction for the FLAG4 pin; the flag is programmed as an output (if set, =1) or input (if cleared, =0).
9  FLG5O  FLAG5 Output Select. This bit selects the I/O direction for the FLAG5 pin; the flag is programmed as an output (if set, =1) or input (if cleared, =0).
10  FLG6O  FLAG6 Output Select. This bit selects the I/O direction for the FLAG6 pin; the flag is programmed as an output (if set, =1) or input (if cleared, =0).
11  FLG7O  FLAG7 Output Select. This bit selects the I/O direction for the FLAG7 pin; the flag is programmed as an output (if set, =1) or input (if cleared, =0).
12  FLG8O  FLAG8 Output Select. This bit selects the I/O direction for the FLAG8 pin; the flag is programmed as an output (if set, =1) or input (if cleared, =0).
13  FLG9O  FLAG9 Output Select. This bit selects the I/O direction for the FLAG9 pin; the flag is programmed as an output (if set, =1) or input (if cleared, =0).
14  FLG10O  FLAG10 Output Select. This bit selects the I/O direction for the FLAG10 pin; the flag is programmed as an output (if set, =1) or input (if cleared, =0).
15  FLG11O  FLAG11 Output Select. This bit selects the I/O direction for the FLAG11 pin; the flag is programmed as an output (if set, =1) or input (if cleared, =0).
31-16  Reserved
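The IOFLAG layout in Table A-12 is regular: the value bit for FLAGn sits at bit n-4 and the output-select bit at bit n+4, for n from 4 to 11. The Python sketch below derives both masks on a host machine; the helper name `ioflag_masks` is hypothetical and is not defined by any Analog Devices header.

```python
def ioflag_masks(flag: int):
    """Return (value_mask, output_select_mask) for FLAG4 through FLAG11,
    following the bit positions listed in Table A-12. Hypothetical helper."""
    if not 4 <= flag <= 11:
        raise ValueError("IOFLAG only covers FLAG4 through FLAG11")
    value_mask = 1 << (flag - 4)    # FLG4..FLG11 occupy bits 0..7
    output_mask = 1 << (flag + 4)   # FLG4O..FLG11O occupy bits 8..15
    return value_mask, output_mask


# Compose a register image that makes FLAG6 an output driven high.
value_mask, output_mask = ioflag_masks(6)
ioflag_image = output_mask | value_mask
print(f"IOFLAG image: 0x{ioflag_image:08X}")
```

On the processor itself, such an image would be moved through a core register such as USTAT1, as the surrounding text notes.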
IOFLAG (0x1B) register bit fields (bits 31-0):
FLG11O  0=FLAG11 Input, 1=FLAG11 Output
FLG10O  0=FLAG10 Input, 1=FLAG10 Output
FLG9O  0=FLAG9 Input, 1=FLAG9 Output
FLG8O  0=FLAG8 Input, 1=FLAG8 Output
FLG7O  0=FLAG7 Input, 1=FLAG7 Output
FLG6O  0=FLAG6 Input, 1=FLAG6 Output
FLG5O  0=FLAG5 Input, 1=FLAG5 Output
FLG4O  0=FLAG4 Input, 1=FLAG4 Output
FLG11  FLAG11 Value
FLG10  FLAG10 Value
FLG9  FLAG9 Value
FLG8  FLAG8 Value
FLG7  FLAG7 Value
FLG6  FLAG6 Value
FLG5  FLAG5 Value
FLG4  FLAG4 Value (Low=0, High=1)
As shown in Figure A-14, the address buses can handle 32-bit addresses, but the program sequencer only generates 24-bit addresses over the PM bus. Since the sequencer generates 24-bit addresses, sequencing is limited to the low 64 Mwords of the processor's 254 Mword memory map.

Figure A-14. The PM and DM address buses and the DAGs can handle 32-bit addresses; the program sequencer handles 24-bit addresses. Four fields in the address identify the type of memory being addressed:
S Field  Bits 19-17, System (Internal) Memory
M Field  Bit 20, Multiprocessor Memory
E Field  Bits 23-21, External Memory
V Field  Bits 27-24, External Banked Memory
Table A-13 describes the three fields that appear in Figure A-14. The content of the External (E), Multiprocessor (M), and System (S) fields in the address route the data or instruction access to the memory space. Table A-13. PM and DM Address Bus E, M, and S Fields
Bit Field  Description
E  External Address. Values in this field have the following meaning:
   all zeros  The address is in the IOP registers of another ADSP-21161 processor (M and S activated)
   non-zero  The address is in external memory; with the E bits active, remaining bits [20-0] are a valid address
M  Multiprocessor. Values in this field have the following meaning:
   non-zero  ID of another ADSP-21161 processor
   1  Write to IOP register of an ADSP-21161 processor. This field is only set for accesses between ADSP-21161 processors.
   0  Address in the processor's own internal memory
S  System. Values in this field have the following meaning:
   000  Address of an IOP register
   001  Address in Long Word Addressing space
   01x  Address in Normal Word Addressing space
   1xx  Address in Short Word Addressing space
V  Virtual. Values in this field have the following meaning:
   00  Depends on E, S1-0, and M bits; address corresponds to the local processor's internal or external (bank 0) memory or to a remote processor's IOP space
   01  External memory bank 1, local processor
   10  External memory bank 2, local processor
   11  External memory bank 3, local processor
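The S field encoding in Table A-13 can be decoded with a shift and a mask. The Python sketch below is a host-side illustration only: it assumes the S field occupies address bits 19-17 as stated for Figure A-14, it assumes the E and M fields are zero (an internal address), and the function names are hypothetical.

```python
def s_field(address: int) -> int:
    """Extract the System (S) field, assumed to be address bits 19-17."""
    return (address >> 17) & 0x7


def classify_s_field(address: int) -> str:
    """Map the S field to the space named in Table A-13.
    Assumes the E and M fields of the address are zero."""
    s = s_field(address)
    if s == 0b000:
        return "IOP register"
    if s == 0b001:
        return "Long Word Addressing space"
    if s in (0b010, 0b011):          # pattern 01x
        return "Normal Word Addressing space"
    return "Short Word Addressing space"  # pattern 1xx


for addr in (0x00000002, 0x00020000, 0x00040000, 0x00080000):
    print(f"0x{addr:08X}: {classify_s_field(addr)}")
```

Address 0x0000 0002 falls in the IOP register range because its S field is 000, which matches the WAIT register address given later in the memory map table.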
0000 through 0x0000 01FF of the memory map. The I/O registers control the following operations: external port DMA, link port DMA, serial port DMA, and SPI port DMA. I/O processor registers have a one-cycle effect latency (changes take effect on the second cycle after the change).

Since the I/O processor's registers are part of the processor's memory map, buses access these registers as locations in memory. While these registers act as memory-mapped locations, they are separate from the processor's internal memory and have different bus access. One bus can access one I/O processor register from one I/O processor register group at a time. Table A-16 lists the I/O processor register groups.

When there is contention among the buses for access to registers in the same I/O processor register group, the processor arbitrates register access as follows:
- External Port (EP) bus accesses (highest priority)
- Data Memory (DM) bus accesses
- Program Memory (PM) bus accesses
- I/O processor (I/O) bus accesses (lowest priority)

DMA parameter register or DMASTAT register conflicts. There is a one-cycle DMA stall if an access to a DMA parameter register or the DMASTAT register conflicts with DMA address generation. For example, a one-cycle stall occurs when writing to a DMA register while a register update is taking place. Similarly, a one-cycle stall occurs when reading from a DMA register while DMA chaining is taking place.

The bus with highest priority gets access to the I/O processor register group, and the other buses are held off from accessing that I/O processor register group until that access has been completed.
There is one exception to this access contention rule. The I/O bus and EP bus can simultaneously access the EP (External Port) group of registers, allowing DMA transfers to internal memory at full speed.

Table A-16. I/O Processor Register Groups

Register Group: System Control (SC) Registers
I/O Processor Registers In This Group: SYSCON, VIRPT, WAIT, SYSTAT, MSGR0, MSGR1, MSGR2, MSGR3, MSGR4, MSGR5, MSGR6, MSGR7, BMAX, BCNT, PC_SHDW, IOFLAG, MODE2_SHDW, DMASTAT

Register Group: DMA Address (DA) Registers
I/O Processor Registers In This Group: II0A, II0B, IM0A, IM0B, C0A, C0B, CP0A, CP0B, GP0A, GP0B, II1A, II1B, IM1A, IM1B, C1A, C1B, CP1A, CP1B, GP1A, GP1B, II2A, II2B, IM2A, IM2B, C2A, C2B, CP2A, CP2B, GP2A, GP2B, II3A, II3B, IM3A, IM3B, C3A, C3B, CP3A, CP3B, GP3A, GP3B, IILB0 (IISRX), IMLB0 (IMSRX), CLB0 (CSRX), GPLB0 (GPSRX), IILB1 (IISTX), IMLB1 (IMSTX), CLB1, GPLB0 (GPSTX), IIEP0, IMEP0, CEP0, CPEP0, GPEP0, EIEP0, EMEP0, ECEP0, IIEP1, IMEP1, CEP1, CPEP1, GPEP1, EIEP1, EMEP1, ECEP1, IIEP2, IMEP2, CEP2, CPEP2, GPEP2, EIEP2, EMEP2, ECEP2, IIEP3, IMEP3, CEP3, CPEP3, GPEP3, EIEP3, EMEP3, ECEP3, EI13, EM13, EC13

Register Group: External Port (EP) Registers
I/O Processor Registers In This Group: EPB0, EPB1, EPB2, EPB3, DMAC10, DMAC11, DMAC12, DMAC13
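The arbitration order and its one exception can be sketched as a small host-side model. This is an illustration of the rule as stated in the text, not a description of the hardware implementation; the function name `arbitrate`, the bus labels, and the group labels ("SC", "DA", "EP") used as dictionary values are assumptions chosen for the example.

```python
# Bus priority from the text: EP (highest), DM, PM, I/O (lowest).
PRIORITY = ["EP", "DM", "PM", "IO"]


def arbitrate(requests: dict) -> dict:
    """requests maps a bus name to the register group it wants this cycle.
    Returns a mapping of register group to the list of buses granted access.
    Hypothetical model of the stated rule only."""
    granted = {}
    for bus in PRIORITY:
        group = requests.get(bus)
        if group is None:
            continue
        winners = granted.setdefault(group, [])
        if not winners:
            winners.append(bus)   # highest-priority requester wins the group
        elif group == "EP" and bus == "IO" and winners == ["EP"]:
            winners.append(bus)   # exception: I/O and EP buses share the EP group
        # any other bus is held off until the access completes
    return granted


print(arbitrate({"DM": "SC", "PM": "SC"}))   # DM wins the SC group
print(arbitrate({"EP": "EP", "IO": "EP"}))   # both are granted the EP group
```

Requests to different register groups do not contend, so each group's winner is decided independently.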
Since the I/O processor registers are memory-mapped, the ADSP-21161 processor's architecture does not allow programs to directly transfer data between these registers and other memory locations, except as part of a DMA operation. To read or write I/O processor registers, programs must use the processor core registers. The following example code loads a value into the USTAT2 register and then transfers that value to the I/O processor's WAIT register.

    USTAT2 = 0x108421;    /* 1st instr. to be executed after reset */
    DM(WAIT) = USTAT2;    /* Set external memory waitstates to 0 */
The register names for I/O processor registers are not part of the processor's assembly syntax. To ease access to these registers, programs should use the #include command to incorporate a file containing the registers' symbolic names and addresses. An example #include file appears in the Register and Bit #Defines (def21161.h) on page A-121.

Table A-17. I/O Processor Registers Memory Map
Register Address  Register Name  Initialization After Reset  Register Group  Reference
0x000  SYSCON       0x0001 0020  SC  page A-60
0x001  VIRPT        0x0004 0014  SC  page A-63
0x002  WAIT         0x01ce 739c  SC  page A-65
0x003  SYSTAT       0x000n 0nn0  SC  page A-69
0x004  EPB0         ni           EP  page A-76
0x006  EPB1         ni           EP  page A-76
0x008  MSGR0        ni           SC  page A-77
0x009  MSGR1        ni           SC  page A-77
0x00A  MSGR2        ni           SC  page A-77
0x00B  MSGR3        ni           SC  page A-77
0x00C  MSGR4        ni           SC  page A-77
0x00D  MSGR5        ni           SC  page A-77
0x00E  MSGR6        ni           SC  page A-77
0x00F  MSGR7        ni           SC  page A-77
0x010  PC_SHDW      ni           SC  page A-77
0x011  MODE2_SHDW   0xnn00 0000  SC  page A-78
0x014  EPB2         ni           EP  page A-76
Notes: An ni in the Initialization column indicates that the register is Not Initialized. For information on Register Groups, see Table A-16 on page A-49.
Register Group Reference EP SC SC SC EP EP EP EP DA DA DA DA DA SC DA DA DA DA DA DA DA DA page A-76 page A-79 page A-79 page A-38 page A-80 page A-80 page A-80 page A-80 page A-90 -
ni1
ni1 ni ni ni ni ni ni ni ni ni ni ni ni ni ni ni ni ni ni
1
1 Initialization depends on the booting mode.
BSO
IIVT
3 5-4 6 7
8 9
11
ADREDY
15-12 16
18-17
EBPR
19
DCPR
20
LDCPR
21
PRROT
22
COD
29-24 31-30
SYSCON (0x0000) register bit fields (bits 31-0):
IPACK  External Packed Instruction Execution Mode. 00=32 to 48 packed instruction execution, 01=Full 48-bit instruction execution (no-packing mode), 10=16 to 48 packed instruction execution, 11=8 to 48 packed instruction execution
BHD  Buffer Hang Disable. 0=buffer hang enabled (default), 1=buffer hang disabled
COPT  CLKOUT Option. 0=CLKOUT controlled by COD bit, 1=CLKOUT driven by MMS master
COD (bit 22)  Clock Out Disable. 0=CLKOUT enabled, 1=CLKOUT disabled
PRROT (bit 21)  Link Port/External Port Rotating Priority. 1=rotating priority, 0=fixed priority between DMA channels 8/9 and 10/11/12/13
LDCPR (bit 20)  DMA rotating access priority, DMA channels 8 and 9. 1=rotating, 0=sequential
DCPR (bit 19)  DMA rotating access priority, DMA channels 10-13. 1=rotating, 0=sequential
EBPR (bits 18-17)  External Bus Priority. 00=even priority between core processor and IOP bus, 01=core processor priority, 10=I/O processor priority
ADREDY (bit 11)  Active Drive REDY. 0=open drain (o/d), 1=active drive (a/d)
IMDW1  Internal Memory Block 1 Data Width. 0=32-bit data, 1=40-bit data
IMDW0  Internal Memory Block 0 Data Width. 0=32-bit data, 1=40-bit data
HMSWF
HBW
IIVT  Internal Interrupt Vector Table (no boot mode)
BSO  Boot Select Override
SRST  Soft Reset
commands in multiple-processor systems. This interrupt occurs when an external processor (a host or another processor) writes an address to the VIRPT register, inserting a new vector address for VIRPT. Table A-19. Vector Interrupt Address Register (VIRPT) Bit Definitions
Bit(s)  Name  Definition
23-0  VIRPTA  Vector Interrupt Address. These bits contain the multiprocessor interrupt's vector (address). When an external processor loads an address into this register, the processor pushes the status stack and starts executing the routine at the vector address.
31-24  VIRPTD  Vector Interrupt (optional) Data. These bits contain optional data that the external processor may pass to the interrupt service routine.
VIRPT (0x01)
Reset value: 0x0004 0014
VIRPTD (bits 31-24)  Vector Interrupt (optional) Data (contains optional data from the external processor for the ISR)
VIRPTA (bits 23-0)   Vector Interrupt Address (contains the interrupt vector address loaded by the external processor)
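The VIRPT field layout can be illustrated with a minimal C sketch. It assumes only the bit positions given in Table A-19 (VIRPTA in bits 23-0, VIRPTD in bits 31-24); the helper names `virpt_pack`, `virpt_addr`, and `virpt_data` are hypothetical and are not part of any Analog Devices header.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout from Table A-19: VIRPTA = bits 23-0, VIRPTD = bits 31-24. */
#define VIRPT_ADDR_MASK  0x00FFFFFFu
#define VIRPT_DATA_SHIFT 24

/* Hypothetical helper: build the 32-bit word an external processor
   would write to VIRPT (vector address plus optional data byte). */
static uint32_t virpt_pack(uint32_t vector_addr, uint8_t data)
{
    assert((vector_addr & ~VIRPT_ADDR_MASK) == 0); /* must fit in 24 bits */
    return ((uint32_t)data << VIRPT_DATA_SHIFT) | vector_addr;
}

/* Hypothetical helpers: split a VIRPT value back into its two fields. */
static uint32_t virpt_addr(uint32_t virpt)
{
    return virpt & VIRPT_ADDR_MASK;
}

static uint8_t virpt_data(uint32_t virpt)
{
    return (uint8_t)(virpt >> VIRPT_DATA_SHIFT);
}
```

An interrupt service routine could call `virpt_data()` on the register value to recover whatever byte the external processor chose to pass along with the vector address.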
Registers
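Bit(s)  Name   Definition
1-0     EB0AM  External Bank 0 Access Mode. 00=asynchronous (uses both internal waitstates and external ACK), 01=synchronous (RD~ and WR~ change on CLKOUT's edge), minimum 2-cycle reads and 1-cycle writes (EB0WS=001), 10=synchronous (RD~ and WR~ change on CLKOUT's edge), minimum 2-cycle reads and 2-cycle writes (EB0WS=001), 11=reserved.
4-2     EB0WS  External Bank 0 Waitstates. These bit fields select the waitstates for external memory Bank 0 as follows:
               EBxWS  # of Waitstates  Hold Time Cycle?
               000    0                no
               001    1                no
               010    2                yes
               011    3                yes
               100    4                yes
               101    5                yes
               110    6                yes
               111    7                yes
               Note that hold time cycles apply to asynchronous mode only.
6-5     EB1AM  External Bank 1 Access Mode. (see EB0AM definition)
9-7     EB1WS  External Bank 1 Waitstates. (see EB0WS definition)
11-10   EB2AM  External Bank 2 Access Mode. (see EB0AM definition)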
Table A-20. External Memory Setup Register (WAIT) Bit Definitions (Contd)
Bit(s)  Name   Definition
14-12   EB2WS  External Bank 2 Waitstates. (see EB0WS definition)
16-15   EB3AM  External Bank 3 Access Mode. (see EB0AM definition)
19-17   EB3WS  External Bank 3 Waitstates. (see EB0WS definition)
21-20   RBAM   ROM Boot Access Mode. (see EB0AM definition)
24-22   RBWS   ROM Boot Waitstates. (see EB0WS definition)
29-25          Reserved
30      HIDMA  Handshake and Idle for DMA Enable. This bit enables (if set, =1) or disables (if cleared, =0) adding an idle cycle after every memory access for DMAs with handshaking (DMAR-DMAG). The added cycle reduces bus contention by accommodating devices with a slow three-state time. Also, the added cycle accommodates long write recovery time by de-asserting DMAG longer.
31             Reserved
WAIT (0x0002)
Reset value of bits 15-0: 0x739C
EB0AM  External Bank 0 Access Mode. 00=async, uses both internal waitstates and external ACK; 01=sync (RD~ and WR~ change on CLKOUT's edge), minimum 2-cycle reads, 1-cycle writes (EB0WS=001); 10=sync (RD~ and WR~ change on CLKOUT's edge), minimum 2-cycle reads, 2-cycle writes (EB0WS=001); 11=reserved
EB0WS  External Bank 0 Waitstates. 000=0 waitstates, no hold time cycle; 001=1 waitstate, no hold time cycle, minimum for sync; 010=2 waitstates, hold time cycle; 011=3 waitstates, hold time cycle; 100=4 waitstates, hold time cycle; 101=5 waitstates, hold time cycle; 110=6 waitstates, hold time cycle; 111=7 waitstates, hold time cycle (hold time cycles for async mode only)
EB1AM  External Bank 1 Access Mode
EB1WS  External Bank 1 Waitstates
EB2AM  External Bank 2 Access Mode
EB2WS  External Bank 2 Waitstates
EB3AM  External Bank 3 Access Mode
EB3WS  External Bank 3 Waitstates
RBAM   ROM Boot Access Mode
RBWS   ROM Boot Waitstates
HIDMA  Handshake and Idle for DMA Enable. 0=no idle cycle, 1=adds an idle cycle after every handshake DMA access (DMAG asserted longer; reduces bus contention for slower devices)
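The WAIT register repeats the same five-bit group (a 2-bit access mode followed by a 3-bit waitstate count) for banks 0 through 3 and for ROM boot. As a minimal sketch of that layout, assuming only the bit positions listed in Table A-20, the hypothetical helper `wait_set_bank` below updates one group; the enum and macro names are illustrative and do not come from a vendor header.

```c
#include <assert.h>
#include <stdint.h>

/* Access-mode encodings from the EB0AM definition (11 is reserved). */
enum wait_am {
    WAIT_AM_ASYNC        = 0, /* async, internal waitstates and ACK   */
    WAIT_AM_SYNC_1CYC_WR = 1, /* sync, min 2-cycle reads, 1-cycle writes */
    WAIT_AM_SYNC_2CYC_WR = 2  /* sync, min 2-cycle reads, 2-cycle writes */
};

/* HIDMA is bit 30 per Table A-20. */
#define WAIT_HIDMA (1u << 30)

/* Hypothetical helper. group 0-3 selects EBxAM/EBxWS for that bank,
   group 4 selects RBAM/RBWS. Each group is 5 bits wide, so the groups
   start at bits 0, 5, 10, 15, and 20. */
static uint32_t wait_set_bank(uint32_t wait, unsigned group,
                              enum wait_am am, unsigned ws)
{
    assert(group <= 4 && ws <= 7);
    unsigned shift = 5u * group;
    wait &= ~(0x1Fu << shift);                           /* clear AM + WS */
    wait |= (((uint32_t)ws << 2) | (uint32_t)am) << shift;
    return wait;
}
```

Setting all four banks to asynchronous mode with seven waitstates yields 0x739C in bits 15-0, which matches the reset pattern shown for the lower half of the register.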
BSYN
3-2 6-4
7 10-8 12-11 13
19 20
21
SWPD
SYSTAT (0x03)
HPS    Host Packing Status. 000=packing complete (6th stage of 8-to-48, 4th stage of 8-to-32, etc.), 001=1st stage pack/unpack, 010=2nd stage pack/unpack, 011=3rd stage pack/unpack, 100=5th stage of 8-to-48-bit packing, 101=110=111=reserved
CRAT   CCLK-to-CLKIN ratio. Indicates the state of the CLKCFG[1:0] pins. Undefined at RESET~
SSWPD  Synchronous Slave Write FIFO Data Pending. 1=sync slave IOP register write pending, 0=no sync slave IOP register write pending
SWPD   Slave Write FIFO Data Pending, any data (sync or async). 1=slave write pending to IOP register, 0=no slave write pending to IOP register
VIPD   Vector Interrupt Pending. 1=vector interrupt pending
IDC    ID Code. Displays the state of the ID[2:0] pins
HSTM   Host Bus Master. 1=host bus master controls external bus, 0=no host bus master
BSYN
CRBM   Current ADSP-21161 Bus Master. Status of the ID of the DSP that is bus master. CRBM=001 when ID=000
SDRDIV 0xB9
15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0
SDRDIV=
CL
RP
DSDCTL
3 7-4
DSDCK1 SDTRAS
10-8
SDTRP
171
SDEM1
181
SDEM2
191
SDEM3
21
SDCKR
22 23
26-24
SDTRCD
31-27 1
The CS pin of an SDRAM chip should be connected to the MSx pin of the ADSP-21161 processor for the memory bank to which you want to map the SDRAM device. All four memory banks can have SDRAM simultaneously.
SDCTL (0x00B8)
Reset value: 0x0000 0000
SDTRCD  SDRAM tRCD spec, RAS to CAS delay (number of SDCLK cycles: 1 to 7 cycles)
SDBUF   Pipelining option with external register buffer. 1=external SDRAM control/address buffer enabled, 0=no buffer option
SDEM0   External memory Bank 0 SDRAM enable
SDEM1   External memory Bank 1 SDRAM enable
SDEM2   External memory Bank 2 SDRAM enable
SDEM3   External memory Bank 3 SDRAM enable
SDCKR   SDCLK-to-CCLK ratio. 0=half CCLK (core clock) frequency (1:2), 1=CCLK (core clock) frequency (1:1)
SDBN    Number of SDRAM device memory banks. 0=2 banks, 1=4 banks
SDSRF   SDRAM self refresh command enable
SDCL    SDRAM CAS Latency spec. 01=1 cycle, 10=2 cycles, 11=3 cycles
DSDCTL  Disable SDCLK0 and control signals. 1=disable SDCLK0, RAS~, CAS~, and SDCLKE; 0=activate SDCLK0, RAS~, CAS~, and SDCLKE
DSDCK1  SDCLK1 Disable. 1=disable SDCLK1, 0=SDCLK1 active
SDPSS   SDRAM power-up sequence
SDPGS   SDRAM Page Size. 00=256 words, 01=512 words, 10=1k words, 11=2k words
SDTRAS  SDRAM tRAS spec, active command delay (number of SDCLK cycles: 0 to 15 cycles)
SDPM    SDRAM power-up mode. 0=precharge, 8 CBR refreshes, mode register set; 1=precharge, mode register set, 8 CBR refreshes
SDTRP   SDRAM tRP spec, precharge delay (number of SDCLK cycles: 1 to 7 cycles)
Normally, a DMA process automatically accesses the buffer register for memory transfer. Programs can also access these buffers as registers. However, programs must use the PX register to access the full width of the buffer. A PX register move can access the entire 64 bits of an external port buffer using the full width PX.
PC_SHDW (0x10)
MODE2_SHDW (0x11)
PID4-3            Processor Identification (Read Only)
PID2-0            Processor Identification (Read Only)
Silicon Revision  (Read Only) 00=Revision 0.3, 01=Revision 1.0/1.1, 10=Revision 1.2
BMAX (0x18)
Figure A-23. BMAX Register
For more information describing how BMAX and BCNT work, see Bus Mastership Timeout on page 7-101.
BCNT (0x19)
BCNT = number of CCLK cycles remaining for the DSP to retain bus mastership (decrements every cycle)
CHEN
Table A-24. External Port DMA Control Registers (DMACx) Bit Definitions (Contd)
Bit(s)  Name    Definition
2       TRAN    External Port Transmit/Receive Select. This bit selects the transfer direction (transmit if set, =1) (receive if cleared, =0) for the corresponding external port FIFO buffer (EPBx).
4-3             Reserved
5       DTYPE   External Port Data Type Select. This bit selects the transfer data type (40/48-bit, 3-column if set, =1) (32/64-bit, 4-column if cleared, =0) for the corresponding external port FIFO buffer (EPBx). Programs must not change a buffer's DTYPE setting while the buffer is enabled. The buffer's DTYPE setting overrides the internal memory block's setting (IMDWx) for Normal word width. Whether the buffer is set for 48- or 64-bit words, programs must index (IIx) the corresponding DMA channel with a Normal word address (always an even address for 64-bit).
8-6     PMODE   External Port Packing Mode. These bits select the packing mode for the corresponding external port FIFO buffer (EPBx) as follows: 001=16 external to 32/64 internal packing, 010=16 external to 48 internal packing, 011=32 external to 48 internal packing, 100=32 external to 32/64 internal packing (no pack), 101=8 external to 48 internal packing, 110=8 external to 32/64 internal packing, 000=111=reserved. Programs must not change a buffer's PMODE setting while the buffer is enabled. For host processor accesses through the external port, the buffer's PMODE setting must match the Host Bus Width (HBW) setting in the SYSCON register.
9       MSWF    Most Significant 16-bit Word First During Packing. When the buffer's PMODE is 001 or 010, this bit selects the packing order of 16-bit words (most significant first if set, =1) (least significant first if cleared, =0) for the corresponding external port FIFO buffer (EPBx). Programs must not change a buffer's MSWF setting while the buffer is enabled.
10      MASTER  Master Mode Enable. This bit enables (if set, =1) or disables (if cleared, =0) master mode for the corresponding external port FIFO buffer (EPBx). Programs must not change a buffer's MASTER setting while the buffer is enabled. The MASTER, HSHAKE, and EXTERN bits work together to select the external port buffer's mode.
11      HSHAKE  Handshake Mode Enable. This bit enables (if set, =1) or disables (if cleared, =0) handshake mode for the corresponding external port FIFO buffer (EPBx). Programs must not change a buffer's HSHAKE setting while the buffer is enabled. The MASTER, HSHAKE, and EXTERN bits work together to select the external port buffer's mode.
12      INTIO   Single-Word Interrupt Enable. This bit enables (if set, =1) or disables (if cleared, =0) single-word, non-DMA, interrupt-driven transfers for the corresponding external port FIFO buffer (EPBx). To avoid spurious interrupts, programs must not change a buffer's INTIO setting while the buffer is enabled.
13      EXTERN  External Handshake Mode Enable. This bit enables (if set, =1) or disables (if cleared, =0) external handshake mode for the corresponding external port FIFO buffer (EPBx). Programs must not change a buffer's EXTERN setting while the buffer is enabled. The MASTER, HSHAKE, and EXTERN bits work together to select the external port buffer's mode.
14      FLSH    Flush DMA Buffers & Status. This bit flushes (when set, =1) settings for the corresponding external port FIFO buffer (EPBx). Flushing these settings clears (=0) the FS and PS status bits, clears (=0) the FIFO buffer and DMA request counter, and clears (=0) any partially packed words. When a program sets (=1) FLSH, the processor flushes the settings and clears (=0) FLSH. There is a two-cycle effect latency in completing the flush operation. Programs must not set a buffer's FLSH during the same write that enables the buffer. Also, programs must not set a buffer's FLSH bit while the DMA channel is active. Programs should determine the channel's active status by reading the corresponding bit in the DMASTAT register.
15      PRIO    External Port Bus Priority. This bit selects the external bus access priority level (high if set, =1) (low if cleared, =0) for the corresponding external port FIFO buffer (EPBx). Programs must not change a buffer's PRIO setting while the buffer is enabled. When PRIO is set, the processor asserts the PA pin as part of external bus arbitration for DMA accesses using this buffer. The PRIO bit does not affect internal DMA priority arbitration.
17-16   FS      External Port FIFO Buffer Status. These bits indicate the corresponding FIFO buffer's status as 00=buffer empty, 01=buffer-not-full, 10=buffer-not-empty, 11=buffer full. For transmit (TRAN=1), buffer-not-full means that the buffer has space for one Normal word, and buffer-not-empty means that the buffer has space for two-or-more Normal words. For receive (TRAN=0), buffer-not-full means that the buffer contains one Normal word, and buffer-not-empty means that the buffer contains two-or-more Normal words. Any type of full status (01, 10, or 11) in receive mode indicates that new (unread) data is in the buffer. These bits are read-only. The processor clears these bits when DEN is cleared (changes from 1 to 0).
18      INT32   Internal Memory 32-bit Transfers Select. This bit selects the external bus access width (32-bit transfers only if set, =1) (64-bit transfers when possible if cleared, =0) for the corresponding external port FIFO buffer (EPBx). Programs must not change a buffer's INT32 setting while the buffer is enabled. Note that the buffer's DTYPE and the internal memory block's IMDWx setting (either can select 40/48-bit transfers) override a 32-bit transfers only (INT32=1) setting.
20-19   MAXBL   Maximum Burst Length Select. These bits select the maximum burst transfer length for the corresponding external port FIFO buffer (EPBx) as follows: 00=burst disabled, 01=burst limit of 4, 10=11=reserved. Processors may perform burst accesses to external memory banks only when the bank is configured for synchronous access (EBxAM field in WAIT register). For burst writes, the memory bank's EBxAM must be configured for the one-wait state write, synchronous access mode.
23-21   PS      External Port Packing Status. These bits indicate the corresponding FIFO buffer's packing status as 000=pack complete, 001=1st stage pack/unpack, 010=2nd stage multi-stage pack/unpack, 011=3rd stage, 100=5th stage of 8 to 48-bit packing, 101=110=111=reserved. These bits are read-only. The processor clears these bits when DEN is cleared (changes from 1 to 0).
31-24           Reserved
PS      Ext. Port EPBx FIFO Buffer Packing Status (read-only). 000=packing complete, 001=1st stage pack/unpack, 010=2nd stage pack/unpack, 011=3rd stage, 100=5th stage of 8 to 48-bit packing, 101=110=111=reserved
FS      Ext. Port FIFO Buffer Status (read-only). 00=buffer empty, 01=buffer-not-full, 10=buffer-not-empty, 11=buffer full
INT32   Internal Memory 32-bit Transfers Select. 1=32-bit transfers/EPBx access width, 0=64-bit transfers/EPBx access width
MAXBL   Maximum Burst Length Select. 00=burst disabled, 01=burst limit of 4, 10=11=reserved
PRIO    External Port Bus Priority Access. 1=DSP asserts PA~ for external bus access, 0=PA~ not asserted
FLSH    Flush EPBx FIFO Buffers & Status. 1=flush EPBx
EXTERN  External Handshake Mode Enable. 1=enable (external devices to external memory), 0=disable
DEN     Ext. Port DMA Enable. 1=enable, 0=disable
CHEN    Ext. Port DMA Chaining Enable. 1=enable, 0=disable
TRAN    Ext. Port EPBx Transmit/Receive Select. 1=transmit data from internal memory, 0=receive data from external memory
INTIO   Single Word Interrupts for EPBx FIFO Buffers. 1=enable single-word, non-DMA, interrupt-driven transfers; 0=disabled, FIFO fully enabled
DTYPE   EPBx FIFO Buffer Data Type Select. 1=40/48-bit, 3-column data; 0=32/64-bit, 4-column data
HSHAKE  EPBx DMA Handshake Mode Enable. 1=enable, 0=disable
PMODE   Ext. Port EPBx FIFO Packing Mode. 000, 111=reserved; 001=16 ext-to-32/64 int; 010=16 ext-to-48 int; 011=32 ext-to-48 int; 100=no pack (32 ext-to-32/64 int); 101=8 ext-to-48 int; 110=8 ext-to-32/64 int
MASTER  EPBx DMA Master Mode Enable. 1=enable, 0=disable
MSWF    Most Significant Word First During Packing. 1=enable, MSW first; 0=disable, LSW first
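The DMACx bit assignments lend themselves to a small set of C masks. The sketch below is illustrative only: the macro and function names are hypothetical, the positions of TRAN through PS follow Table A-24, and the positions of DEN (bit 0) and CHEN (bit 1) are an assumption based on the row order of that table. The check in `dmac_enable_word_ok` covers only three of the documented restrictions (reserved PMODE codes, reserved MAXBL codes, and setting FLSH in the same write that enables the buffer).

```c
#include <assert.h>
#include <stdint.h>

#define DMAC_DEN    (1u << 0)   /* assumed position, see lead-in */
#define DMAC_CHEN   (1u << 1)   /* assumed position, see lead-in */
#define DMAC_TRAN   (1u << 2)
#define DMAC_DTYPE  (1u << 5)
#define DMAC_MSWF   (1u << 9)
#define DMAC_MASTER (1u << 10)
#define DMAC_HSHAKE (1u << 11)
#define DMAC_INTIO  (1u << 12)
#define DMAC_EXTERN (1u << 13)
#define DMAC_FLSH   (1u << 14)
#define DMAC_PRIO   (1u << 15)
#define DMAC_INT32  (1u << 18)

/* PMODE encodings from Table A-24 (000 and 111 are reserved). */
enum dmac_pmode {
    PMODE_16_TO_32 = 1, PMODE_16_TO_48 = 2, PMODE_32_TO_48 = 3,
    PMODE_NONE     = 4, PMODE_8_TO_48  = 5, PMODE_8_TO_32  = 6
};

/* Place a PMODE code in bits 8-6. */
static uint32_t dmac_pmode(unsigned mode)
{
    assert(mode <= 7);
    return (uint32_t)mode << 6;
}

/* Read-only status fields: FS = bits 17-16, PS = bits 23-21. */
static unsigned dmac_fs(uint32_t dmac) { return (dmac >> 16) & 3u; }
static unsigned dmac_ps(uint32_t dmac) { return (dmac >> 21) & 7u; }

/* Returns 1 if a word intended to enable the buffer avoids reserved
   PMODE/MAXBL codes and does not set FLSH together with DEN. */
static int dmac_enable_word_ok(uint32_t dmac)
{
    unsigned pmode = (dmac >> 6) & 7u;
    unsigned maxbl = (dmac >> 19) & 3u;
    if (pmode == 0 || pmode == 7) return 0;
    if (maxbl > 1) return 0;
    if ((dmac & DMAC_FLSH) && (dmac & DMAC_DEN)) return 0;
    return 1;
}
```

For example, a master-mode transmit with no packing would combine `DMAC_DEN | DMAC_TRAN | DMAC_MASTER | dmac_pmode(PMODE_NONE)`; whether that combination suits a given system depends on the MASTER, HSHAKE, and EXTERN mode selection described in the table.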
DMA parameter registers: IIx, IMx, Cx, CPx, GPx, EIEPx, EMEPx, ECEPx
CPx, PCI Bit: Program-Controlled Interrupt Bit. If this bit is set, the I/O processor will generate a DMA interrupt on completion of a chained DMA.
(Reserved bits must always be set to zero when programming DMA parameter registers)
The value of EMEPx should be such that after being modified with EMEPx, the value of EIEPx does not fall outside the valid memory range. For more information, see I/O Processor on page 6-1. Only External Port DMA channels have EMEPx registers, because these channels exclusively address processor external memory.
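The range requirement on EIEPx and EMEPx can be checked before a transfer is started. The following is a minimal sketch under stated assumptions: the function name `ext_dma_range_ok` is hypothetical, the modifier is treated as a signed value added once per transferred word, and the valid range `[lo, hi]` is supplied by the caller because the valid external address range depends on the system. The check is deliberately conservative, since it also covers the value left in EIEPx after the final modify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* eiep: starting external index, emep: signed external modifier,
   ecep: external count (number of words), lo/hi: inclusive bounds of
   the valid external address range (system-specific assumption). */
static bool ext_dma_range_ok(uint32_t eiep, int32_t emep, uint32_t ecep,
                             uint32_t lo, uint32_t hi)
{
    assert(lo <= hi);
    /* 64-bit arithmetic avoids wraparound in the intermediate product. */
    int64_t first = (int64_t)eiep;
    int64_t last  = first + (int64_t)emep * (int64_t)ecep;
    int64_t min   = first < last ? first : last;
    int64_t max   = first < last ? last : first;
    /* The index moves monotonically, so checking both ends suffices. */
    return min >= (int64_t)lo && max <= (int64_t)hi;
}
```

Because the index changes by a constant step, only the first and last values need to be compared against the bounds; intermediate values always lie between them.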
Note that there is a single cycle of read latency between a change in a DMA channel's status and the update of its DMASTAT bit(s).
DMASTAT (0x37)
Each DMA channel n has a chaining status bit (DMAnCHST) and a channel status bit (DMAnST):
Channel  Buffer(s)    Chaining status  Status
0        RX0A/TX0A    DMA0CHST         DMA0ST
1        RX0B/TX0B    DMA1CHST         DMA1ST
2        RX1A/TX1A    DMA2CHST         DMA2ST
3        RX1B/TX1B    DMA3CHST         DMA3ST
4        RX2A/TX2A    DMA4CHST         DMA4ST
5        RX2B/TX2B    DMA5CHST         DMA5ST
6        RX3A/TX3A    DMA6CHST         DMA6ST
7        RX3B/TX3B    DMA7CHST         DMA7ST
8        LBUF0/SPIRX  DMA8CHST         DMA8ST
9        LBUF1/SPITX  DMA9CHST         DMA9ST
10       EPB0         DMA10CHST        DMA10ST
11       EPB1         DMA11CHST        DMA11ST
12       EPB2         DMA12CHST        DMA12ST
13       EPB3         DMA13CHST        DMA13ST
Channel active status (DMAnST): 1=active (transferring data or waiting to transfer the current block, and not transferring a TCB), 0=inactive (DMA transfer complete, or in TCB chain loading).
Channel chaining status (DMAnCHST): 1=chaining is enabled and currently transferring a TCB, or is pending to transfer a TCB; 0=chaining disabled.
Status does not change on the master ADSP-21161 processor during external port DMA until the external portion is completed (for example, the EPBx buffers are emptied). If in chain insertion mode (DEN=0, CHEN=1), then channel chaining status will never go to a 1. Therefore, test channel status to see if it is ready so that your program can rewrite the chain pointer (CPx) register.
LBUF0 (0xc0)
LBUF1 (0xc2)
mask Link Service Requests (LSRQ) before modifying the LCTL register. For more information, see Link Port Service Request & Mask Register (LSRQ) on page A-98.
Table A-25. Link Port Buffer Control Registers (LCTL) Bit Definitions
Bit(s)  Name     Definition
0       L0EN     Link Buffer Enable. This bit enables (if set, =1) or disables (if cleared, =0) link buffer 0 (LBUF0). When the processor disables the buffer (L0EN transitions from high to low), the processor clears the corresponding L0STAT and L0RERR bits.
1       L0DEN    Link Buffer DMA Enable. This bit enables (if set, =1) or disables (if cleared, =0) DMA transfers for link buffer 0 (LBUF0).
2       L0CHEN   Link Buffer DMA Chaining Enable. This bit enables (if set, =1) or disables (if cleared, =0) DMA chaining for link buffer 0 (LBUF0).
3       L0TRAN   Link Buffer Transfer Direction. This bit selects the transfer direction (transmit if set, =1) (receive if cleared, =0) for link buffer 0 (LBUF0).
4       L0EXT    Link Buffer Extended Word Size. This bit selects the transfer extended word size (48-bit if set, =1) (32-bit if cleared, =0) for link buffer 0 (LBUF0). Programs must not change a buffer's L0EXT setting while the buffer is enabled. The buffer's L0EXT setting overrides the internal memory block's setting (IMDWx) for Normal word width. Whether the buffer is set for 48- or 32-bit words, programs must index (IIx) the corresponding DMA channel with a Normal word address.
6-5     L0CLKD   Link Port Clock Divisor. These bits select the transfer clock divisor for link buffer 0 (LBUF0). The transfer clock equals the processor core clock divided by L0CLKD, where L0CLKD[6-5] is: 01=1, 10=2, 11=3, or 00=4.
7                Reserved
8       L0PDRDE  Link Port Pulldown Resistor Disable. This bit disables (if set, =1) or enables (if cleared, =0) the internal pulldown resistors on the L0CLK, L0ACK, and L0DAT7-0 pins of the corresponding unassigned or disabled link port for silicon revisions 0.3, 1.0, and 1.1, and on L0CLK and L0ACK for silicon revisions 1.2 and higher. This bit applies to the port, which is not necessarily the port assigned to link buffer 0 (LBUF0). For revisions 0.3, 1.0, and 1.1, systems should not leave link port pins (L0CLK, L0ACK, and L0DAT7-0) unconnected without clearing the corresponding L0PDRDE bit or applying an external pulldown. For silicon revisions 1.2 or higher, this applies to the L0CLK and L0ACK pins only. In systems where several processors share a link port, only one processor should have this bit cleared. For complete pin descriptions, see Table 13-1 on page 13-4.
9       L0DPWID  Link Port Data Path Width. This bit selects the link port data path width (8-bit if set, =1) (4-bit if cleared, =0) for link buffer 0 (LBUF0). Systems using a 4-bit width should connect the lower link port data pins (L0DAT3-0) for data transfers and leave the upper pins (L0DAT7-4) unconnected. In the 4-bit mode, the processor applies pulldowns to the upper pins.
10      L1EN     Link Buffer Enable. This bit enables (if set, =1) or disables (if cleared, =0) link buffer 1 (LBUF1). When the processor disables the buffer (L1EN transitions from high to low), the processor clears the corresponding L1STAT and L1RERR bits.
11      L1DEN    Link Buffer DMA Enable. This bit enables (if set, =1) or disables (if cleared, =0) DMA transfers for link buffer 1 (LBUF1).
12      L1CHEN   Link Buffer DMA Chaining Enable. This bit enables (if set, =1) or disables (if cleared, =0) DMA chaining for link buffer 1 (LBUF1).
13      L1TRAN   Link Buffer Transfer Direction. This bit selects the transfer direction (transmit if set, =1) (receive if cleared, =0) for link buffer 1 (LBUF1).
14      L1EXT    Link Buffer Extended Word Size. This bit selects the transfer extended word size (48-bit if set, =1) (32-bit if cleared, =0) for link buffer 1 (LBUF1). Programs must not change a buffer's L1EXT setting while the buffer is enabled. The buffer's L1EXT setting overrides the internal memory block's setting (IMDWx) for Normal word width. Whether the buffer is set for 48- or 32-bit words, programs must index (IIx) the corresponding DMA channel with a Normal word address.
16-15   L1CLKD   Link Port Clock Divisor. These bits select the transfer clock divisor for link buffer 1 (LBUF1). The transfer clock equals the processor core clock divided by L1CLKD, where L1CLKD[16-15] is: 01=1, 10=2, 11=3, or 00=4.
17               Reserved
18      L1PDRDE  Link Port Pulldown Resistor Disable. This bit disables (if set, =1) or enables (if cleared, =0) the internal pulldown resistors on the L1CLK, L1ACK, and L1DAT7-0 pins of the corresponding unassigned or disabled link port for silicon revisions 0.3, 1.0, and 1.1, and on L1CLK and L1ACK for silicon revisions 1.2 and higher. This bit applies to the port, which is not necessarily the port assigned to link buffer 1 (LBUF1). For revisions 0.3, 1.0, and 1.1, systems should not leave link port pins (L1CLK, L1ACK, and L1DAT7-0) unconnected without clearing the corresponding L1PDRDE bit or applying an external pulldown. For silicon revisions 1.2 or higher, this applies to the L1CLK and L1ACK pins only. In systems where several DSPs share a link port, only one processor should have this bit cleared. For complete pin descriptions, see Table 13-1 on page 13-4.
19      L1DPWID  Link Port Data Path Width. This bit selects the link port data path width (8-bit if set, =1) (4-bit if cleared, =0) for link buffer 1 (LBUF1). Systems using a 4-bit width should connect the lower link port data pins (L1DAT3-0) for data transfers and leave the upper pins (L1DAT7-4) unconnected. In the 4-bit mode, the processor applies pulldowns to the upper pins.
20      LAB0     Link Port Assignment for LBUF0. This bit assigns link buffer 0 to link port 1 if set (=1) or link port 0 if cleared (=0).
21      LAB1     Link Port Assignment for LBUF1. This bit assigns link buffer 1 to link port 1 if set (=1) or link port 0 if cleared (=0).
23-22   L0STAT   Link Buffer 0 Status. These bits identify the status of link buffer 0 as follows: 11=full, 00=empty, 10=one word.
25-24   L1STAT   Link Buffer 1 Status. These bits identify the status of link buffer 1 as follows: 11=full, 00=empty, 10=one word.
26      LRERR0   Receive Packing Error Status for Link Buffer 0. Indicates if the packed bits in link buffer 0 were received completely (=0), without error, or incompletely (=1).
27      LRERR1   Receive Packing Error Status for Link Buffer 1. Indicates if the packed bits in link buffer 1 were received completely (=0), without error, or incompletely (=1).
31-28            Reserved
LCTL (0xCC)
L0EN         Link Buffer 0 Enable. 1=enable, 0=disable
L0DEN        Link Buffer 0 DMA Enable. 1=enable DMA, 0=disable DMA
L0CHEN       Link Buffer 0 DMA Chaining Enable. 1=enable chaining, 0=disable chaining
L0TRAN       Link Buffer 0 Data Direction. 1=transmit, 0=receive
L0EXT        Link Buffer 0 Extended Word Size. 1=48-bit transfers, 0=32-bit transfers
L0CLKD[1:0]  CCLK Divide Ratio, LBUF0. 00=divide by 4, 01=divide by 1, 10=divide by 2, 11=divide by 3
L0PDRDE      Link Port 0 Pulldown Resistor Disable
L0DPWID      Link Buffer 0 Data Path Width. 1=8 bits, 0=4 bits
L1EN         Link Buffer 1 Enable. 1=enable, 0=disable
L1DEN        Link Buffer 1 DMA Enable. 1=enable DMA, 0=disable DMA
L1CHEN       Link Buffer 1 DMA Chaining Enable. 1=enable chaining, 0=disable chaining
L1TRAN       Link Buffer 1 Data Direction. 1=transmit, 0=receive
L1EXT        Link Buffer 1 Extended Word Size. 1=48-bit transfers, 0=32-bit transfers
L1CLKD[1:0]  CCLK Divide Ratio, LBUF1. 00=divide by 4, 01=divide by 1, 10=divide by 2, 11=divide by 3
L1PDRDE      Link Port 1 Pulldown Resistor Disable
L1DPWID      Link Buffer 1 Data Path Width. 1=8 bits, 0=4 bits
LAB0         Link Port Assignment for LBUF0. 0=Link Port 0, 1=Link Port 1
LAB1         Link Port Assignment for LBUF1. 0=Link Port 0, 1=Link Port 1
L0STAT[1:0]  Link Buffer 0 Status (read-only). 11=full, 00=empty, 10=one word
L1STAT[1:0]  Link Buffer 1 Status (read-only). 11=full, 00=empty, 10=one word
LRERR0       Receive Pack Error Status for Link Buffer 0. 1=incomplete, 0=complete
LRERR1
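The link port clock divisor encoding is not a plain binary count, because the code 00 selects divide-by-4. A minimal C sketch of the mapping follows, assuming the field positions given in Table A-25 (L0CLKD in bits 6-5, L1CLKD in bits 16-15); the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LCTL_L0CLKD_SHIFT 5   /* L0CLKD occupies bits 6-5   */
#define LCTL_L1CLKD_SHIFT 15  /* L1CLKD occupies bits 16-15 */

/* Field code to divisor: 01=1, 10=2, 11=3, 00=4. */
static unsigned lctl_clkd_to_divisor(unsigned field)
{
    assert(field <= 3);
    return field == 0 ? 4u : field;
}

/* Divisor to field code: 4 wraps around to the 00 encoding. */
static unsigned lctl_divisor_to_clkd(unsigned divisor)
{
    assert(divisor >= 1 && divisor <= 4);
    return divisor & 3u;
}

/* Core-clock divisor currently selected for link buffer 0 or 1. */
static unsigned lctl_link_divisor(uint32_t lctl, unsigned buf)
{
    assert(buf <= 1);
    unsigned shift = buf ? LCTL_L1CLKD_SHIFT : LCTL_L0CLKD_SHIFT;
    return lctl_clkd_to_divisor((lctl >> shift) & 3u);
}
```

With both divisor fields cleared, this mapping reports a divisor of 4 for each buffer, that is, a link clock of one quarter of the core clock.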
Table A-26. Link Port Service Request Register (LSRQ) Bit Definitions (Contd)
Bit(s)  Name   Definition
21      L0RRQ  Link Port 0 Receive Request Status (Read-Only). If set (=1), indicates that link port 0 is disabled, but L0CLK is set (indicating an external receive request).
22      L1TRQ  Link Port 1 Transmit Request Status (Read-Only). If set (=1), indicates that link port 1 is disabled, but L1ACK is set (indicating an external transmit request).
23      L1RRQ  Link Port 1 Receive Request Status (Read-Only). If set (=1), indicates that link port 1 is disabled, but L1CLK is set (indicating an external receive request).
31-24          Reserved
LSRQ (0xD0)
L0TRQ  Link Port 0 Transmit Request
L0RRQ  Link Port 0 Receive Request
L1TRQ  Link Port 1 Transmit Request
L1RRQ  Link Port 1 Receive Request
L0TM   Link Port 0 Transmit Mask
L0RM   Link Port 0 Receive Mask
L1TM   Link Port 1 Transmit Mask
L1RM   Link Port 1 Receive Mask
Serial Word Endian Select. This bit selects little endian words (LSB first, if set, =1) or big endian words (MSB first, if cleared, =0). Serial Word Length Select. These bits select the word length in bits. Word sizes can be from 3-bit (SLEN=2) to 32-bit (SLEN=31). 16-bit to 32-Bit Word Packing Enable. This bit enables (if set, =1) or disables (if cleared, =0) 16- to 32-bit word packing. Internal Transmit Clock Select. This bit selects the internal transmit clock (if set, =1) or external transmit clock (if cleared, =0). This bit applies to processor serial and multichannel modes for SPCTL0 and SPCTL1 registers. In I2S mode, this bit selects the word source and internal transmit clock (if set, =1) or external transmit clock (if cleared, =0) Sport Operation Mode. This bit selects the I2S mode if set (=1) or processor Serial mode/Multichannel mode if cleared (=0). Clock Rising Edge Select. This bit selects whether the serial port uses the rising edge (if set, =1) or falling edge (if cleared, =0) of the clock signal for sampling data and the frame sync. This bit is reserved when the SPORT is in I2S mode.
8-4
SLEN
9 10
PACK ICLK
A-101
Table A-27. Serial Port Control Registers (SPCTLx) Bit Definitions (Contd)
Table A-27. Serial Port Control Registers (SPCTLx) Bit Definitions (Contd)
Bit(s)  Name  Definition
13  FSR  Frame Sync Required Select. This bit selects whether the serial port requires (if set, =1) or does not require (if cleared, =0) a transfer frame sync. This bit is reserved when the SPORT is in I2S mode and multichannel mode.
14  IFS (IRFS)  Internal Frame Sync Select. This bit selects whether the serial port uses an internally generated FS (if set, =1) or an external FS (if cleared, =0). This bit is reserved when the SPORT is in I2S mode and multichannel transmit mode.
15  DITFS  Data Independent Transmit Frame Sync Select. When DDIR=1, this bit selects whether the serial port uses a data-independent transmit FS (sync at selected interval, if set, =1) or a data-dependent TFS (sync when data is in TX, if cleared, =0). This bit is reserved when the SPORT is in multichannel mode.
16  LFS (LRFS, LTDV)  Low Active Frame Sync Select. This bit selects an active low FS (if set, =1) or an active high FS (if cleared, =0).
17  LAFS  Late Transmit Frame Sync Select. This bit selects a late FS (FS during first bit, if set, =1) or an early FS (FS before first bit, if cleared, =0). This bit is reserved when the SPORT is in I2S mode and multichannel mode.
18  SDEN_A  Serial Port DMA Enable A. This bit enables (if set, =1) or disables (if cleared, =0) the serial port's channel A DMA.
19  SCHEN_A  Serial Port DMA Chaining Enable A. This bit enables (if set, =1) or disables (if cleared, =0) the serial port's channel A DMA chaining.
20  SDEN_B  Serial Port DMA Enable B. This bit enables (if set, =1) or disables (if cleared, =0) the serial port's channel B DMA. This bit is reserved when the SPORT is in multichannel mode.
21  SCHEN_B  Serial Port DMA Chaining Enable B. This bit enables (if set, =1) or disables (if cleared, =0) the serial port's channel B DMA chaining. This bit is reserved when the SPORT is in multichannel mode.
22  FS_BOTH  FS Both Enable. If set (=1), WS is issued only if data is present in both transmit buffers. If cleared (=0), WS is issued if data is present in either transmit buffer. This bit is reserved when the SPORT is in multichannel mode.
23  Reserved
24  SPEN_B  Serial Port Enable B. This bit enables (if set, =1) or disables (if cleared, =0) the corresponding serial port B channel. This bit is reserved when the SPORT is in multichannel mode.
25  DDIR  Data Direction Control. This bit activates transmit buffers TXnA or TXnB if set (=1) or enables receive buffers RXnA or RXnB if cleared (=0). This bit is reserved when the SPORT is in multichannel mode.
26  DERR_B  DXB Error Status (Sticky, Read-Only). This bit reports an error in the DXB data buffer: transmit underflow status when DDIR=1, receive overflow status when DDIR=0. This bit is reserved when the SPORT is in multichannel mode.
28-27  DXS_B  DXB Data Buffer Status (Read-Only). These bits indicate the status of the serial port's DXB data buffer as follows: 11=full, 00=empty, 10=partially full. These bits are reserved when the SPORT is in multichannel mode.
29  DERR_A (ROVF_A, TUVF_A)  DXA Error Status (Sticky, Read-Only). This bit reports an error in the DXA data buffer: transmit underflow status when DDIR=1, receive overflow status when DDIR=0.
31-30  DXS_A (RXS_A, TXS_A)  DXA Data Buffer Status (Read-Only). These bits indicate the status of the serial port's DXA data buffer as follows: 11=full, 00=empty, 10=partially full.
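The bit positions in Table A-27 translate directly into masks and shifts. The following is a minimal sketch, not part of any Analog Devices header: the `SPCTL_*` macro names, the `dxs_status` type, and the helper functions are hypothetical names chosen for illustration, and the value 01 of the status fields is not defined by the table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask names; the bit positions come from Table A-27. */
#define SPCTL_FSR     (UINT32_C(1) << 13)  /* frame sync required        */
#define SPCTL_IFS     (UINT32_C(1) << 14)  /* internally generated FS    */
#define SPCTL_DITFS   (UINT32_C(1) << 15)  /* data-independent TFS       */
#define SPCTL_LFS     (UINT32_C(1) << 16)  /* active low FS              */
#define SPCTL_LAFS    (UINT32_C(1) << 17)  /* late FS                    */
#define SPCTL_SDEN_A  (UINT32_C(1) << 18)  /* DMA enable, channel A      */
#define SPCTL_SCHEN_A (UINT32_C(1) << 19)  /* DMA chaining, channel A    */
#define SPCTL_DDIR    (UINT32_C(1) << 25)  /* 1=transmit, 0=receive      */

/* Two-bit buffer status encoding: 11=full, 00=empty, 10=partially full. */
typedef enum {
    DXS_EMPTY     = 0,
    DXS_UNDEFINED = 1,  /* 01 is not listed in the table (assumption) */
    DXS_PARTIAL   = 2,
    DXS_FULL      = 3
} dxs_status;

/* DXS_A occupies bits 31-30 of SPCTLx. */
static dxs_status spctl_dxs_a(uint32_t spctl)
{
    return (dxs_status)((spctl >> 30) & 0x3u);
}

/* DXS_B occupies bits 28-27 of SPCTLx. */
static dxs_status spctl_dxs_b(uint32_t spctl)
{
    return (dxs_status)((spctl >> 27) & 0x3u);
}

/* Return spctl with channel A DMA enabled and chaining set as requested. */
static uint32_t spctl_enable_dma_a(uint32_t spctl, int chaining)
{
    assert(chaining == 0 || chaining == 1);
    spctl |= SPCTL_SDEN_A;
    if (chaining)
        spctl |= SPCTL_SCHEN_A;
    else
        spctl &= ~SPCTL_SCHEN_A;
    return spctl;
}
```

The helpers operate on a copy of the register value, so they can be exercised on a host machine before the result is written to the SPCTLx register on the target.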
DSP Serial Mode
SPCTLx bit fields, bits 31-16:
LFS: Active low FS (1=active low, 0=active high)
LAFS: Late FS (0=early FS, 1=late FS)
DERR_A: DXA Error Status (sticky). DDIR=1, transmit underflow status; DDIR=0, receive overflow status
SDEN_A: SPORT DMA enable A channel (1=enable, 0=disable)
DXS_B*: DXB Data Buffer Status (11=full, 10=partially full, 00=empty)
SCHEN_A: DMA chaining enable A channel (1=enable, 0=disable)
DERR_B*: DXB Error Status (sticky)
SDEN_B: SPORT DMA enable B channel (1=enable, 0=disable)
DDIR**: Data Direction Control (1=Active Transmit Buffers TXnB/TXnA, 0=Enable Receive Buffers RXnB/RXnA)
SCHEN_B: DMA chaining enable B channel (1=enable, 0=disable)
SPEN_B: SPORT Enable B (1=enable, 0=disable)
* Status is read-only
** Do not read/write from/to inactive RXn/TXn buffers
SPCTLx bit fields, bits 15-0:
FS_BOTH: 1=issue WS only if data is present in both Tx, 0=issue WS if data is present in either Tx
DITFS: Data Independent tx FS (if DDIR=1) (1=data independent, 0=data dependent)
SPEN_A: SPORT Enable A (1=enable, 0=disable)
IFS: Internally generated FS (1=internal FS, 0=external FS)
DTYPE: Data type (00=right-justify, fill MSB with 0s; 01=right-justify, sign extend MSB; 10=compand mu-law; 11=compand A-law)
FSR: FS requirement (1=FS required, 0=FS not required)
SENDN: Endian word format (0=MSB first, 1=LSB first)
CKRE: Clock edge for data and frame sync sampling or driving (1=rising edge, 0=falling edge)
SLEN: Serial Word Length - 1
OPMODE: SPORT Operation Mode (0=DSP serial mode/multichannel mode, 1=I2S mode)
PACK: 16/32 packing (1=packing, 0=no packing)
ICLK: Internally generated clock (1=internal clock, 0=external clock)
I2S Mode
SPCTLx bit fields, bits 31-16:
L_FIRST: Left or Right I2S channel RX/TX first (1=start left data first, 0=start right data first)
DERR_A: DXA Error Status (sticky). DDIR=1, transmit underflow status; DDIR=0, receive overflow status
SDEN_A: SPORT Transmit DMA enable A channel (1=enable, 0=disable)
DXS_B*: DXB Data Buffer Status (11=full, 10=partially full, 00=empty)
SCHEN_A: DMA Chaining enable A channel (1=enable, 0=disable)
DERR_B*: DXB Error Status (sticky)
SDEN_B: SPORT transmit DMA enable B channel (1=enable, 0=disable)
DDIR**: Data Direction Control (1=Active Transmit Buffers TXnA/TXnB, 0=Enable Receive Buffers RXnA/RXnB)
SCHEN_B: DMA Chaining enable B channel (1=enable, 0=disable)
SPEN_B: SPORT Enable B (1=enable, 0=disable)
* Status is read-only
** Do not read/write from/to inactive RXn/TXn buffers
SPCTLx bit fields, bits 15-0:
FS_BOTH: 1=issue WS only if data is present in both Tx, 0=issue WS if data is present in either Tx
DITFS: Data Independent tx FS (if DDIR=1) (1=data independent, 0=data dependent)
SPEN_A: SPORT Enable A (1=enable, 0=disable)
OPMODE: SPORT Operation Mode (0=DSP serial mode/multichannel mode, 1=I2S mode)
SLEN: Serial Word Length - 1
PACK: 16/32 packing (1=packing, 0=no packing)
MSTR: I2S serial and L/R clock Master (1=internal SCLK and WS, TX/RX is master; 0=external SCLK and WS, TX/RX is slave)
Multichannel Mode, Receive Control Bits
SPCTLx bit fields, bits 31-16:
RXS_A*: RXA Data Buffer Status (11=full, 10=partially full, 00=empty)
LRFS: Active Low Multichannel Receive FS0/FS1 (0=active high, 1=active low)
SDEN_A: SPORT receive DMA enable A (1=enable, 0=disable)
ROVF_A*: RXA Overflow Status (sticky)
SCHEN_A: SPORT receive DMA chaining enable A (1=enable, 0=disable)
* Status is read-only
SPCTLx bit fields, bits 15-0:
IRFS: Internally Generated Multichannel rx FS (1=internal FS0/FS1, 0=external FS0/FS1)
DTYPE: Data type (00=right-justify, fill MSB with 0s; 01=right-justify, sign extend MSB; 10=compand mu-law; 11=compand A-law)
CKRE: Active clock edge for data and frame sync sampling (1=rising edge, 0=falling edge)
SENDN: Endian word format (0=MSB first, 1=LSB first)
OPMODE: SPORT Operation Mode (0=DSP serial mode/multichannel mode, 1=I2S mode)
SLEN: Serial Word Length - 1
ICLK: Internally generated Receive clock (1=internal clock, 0=external clock)
PACK: 16/32 packing (1=packing, 0=no packing)
Multichannel Mode, Transmit Control Bits
SPCTLx bit fields, bits 31-16:
TXS_A*: TXA Data Buffer Status (11=full, 10=partially full, 00=empty)
LTDV: Active Low MC Transmit Data Valid (0=active high TDV2/TDV3, 1=active low TDV2/TDV3)
SDEN_A: SPORT transmit DMA enable A (1=enable, 0=disable)
TUVF_A*: TXA Underflow Status (sticky)
SCHEN_A: SPORT transmit DMA chaining enable A (1=enable, 0=disable)
SPCTLx bit fields, bits 15-0:
Reserved**
OPMODE: SPORT Operation Mode (0=DSP serial mode/multichannel mode, 1=I2S mode)
DTYPE: Data type (x0=right-justify, fill MSB with 0s; x1=right-justify, sign extend MSB; 0x=compand mu-law; 1x=compand A-law)
PACK: 16/32 packing (1=packing, 0=no packing)
SENDN: Endian word format (0=MSB first, 1=LSB first)
SLEN: Serial Word Length - 1
SPORT Multichannel Control Registers (SPxyMCTL)
The addresses of these registers are: SP02MCTL (0x1DF), SP13MCTL (0x1FF). The SP02MCTL register is the multichannel control register for SPORTs 0 and 2. The SP13MCTL register is the multichannel control register for SPORTs 1 and 3. The reset value for these registers is undefined. These registers are described in Table A-28 and Figure A-35.
Table A-28. SPORT Multichannel Control Register Bit Definitions
Bit(s)  Name  Definition
0  MCE  Multichannel Mode Enable. Standard and multichannel modes only. Bit 0 in the SP02MCTL and SP13MCTL registers. One of two configuration bits that enable and disable multichannel mode on both the receive and transmit serial port channels. See also OPMODE. 0 = Disable multichannel operation. 1 = Enable multichannel operation if OPMODE=0.
4-1  MFD  Multichannel Frame Delay. These bits set the interval, in number of serial clock cycles, between the multichannel frame sync pulse and the first data bit. These bits provide support for different types of T1 interface devices. Valid values range from 0 to 15 in bits SP02MCTL[4:1] or SP13MCTL[4:1]. Values of 1 to 15 correspond to the number of intervening serial clock cycles. A value of 0 corresponds to no delay: the multichannel frame sync pulse is concurrent with the first data bit.
11-5  NCH  Number of Multichannel Slots (minus one). These bits select the number of channel slots (maximum of 128) to use for multichannel operation. Valid values for the actual number of channel slots range from 1 to 128. Use this formula to calculate the value for NCH: NCH = Actual number of channel slots - 1.
31-23
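The field layout in Table A-28 (MCE in bit 0, MFD in bits 4-1, NCH in bits 11-5) can be packed into a register value as follows. This is a minimal sketch under those stated bit positions; the function name `spmctl_pack` and its parameters are hypothetical and do not come from the header file. It leaves all other bits of the register at zero.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack the MCE, MFD, and NCH fields of SP02MCTL or SP13MCTL.
 *   mce   : 1 enables multichannel operation (takes effect if OPMODE=0)
 *   mfd   : frame delay in serial clock cycles, 0 to 15
 *   slots : actual number of channel slots, 1 to 128
 * NCH is stored as (slots - 1), as given in Table A-28.
 */
static uint32_t spmctl_pack(unsigned mce, unsigned mfd, unsigned slots)
{
    assert(mce <= 1u);
    assert(mfd <= 15u);
    assert(slots >= 1u && slots <= 128u);

    uint32_t nch = (uint32_t)slots - 1u;
    return (uint32_t)mce          /* bit 0     */
         | ((uint32_t)mfd << 1)   /* bits 4-1  */
         | (nch << 5);            /* bits 11-5 */
}
```

For example, a 24-slot frame with a one-cycle frame delay and multichannel mode enabled gives `spmctl_pack(1, 1, 24)`, which evaluates to 0x2E3.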
SP02MCTL and SP13MCTL bit fields:
CHNL: Current Channel (read-only)
SPL: SPORT Loopback (SP02MCTL: SPORT0 and SPORT2 only; SP13MCTL: SPORT1 and SPORT3 only)
MCE: Multichannel enable (1=enable, 0=disable)
MFD: Multichannel Frame Delay
NCH: Number of Channels - 1
Figure A-35. SP02MCTL and SP13MCTL Registers
SPORT Transmit Buffer Registers (TXx)
The addresses of the TXx registers are: TX0A (0x1C1), TX0B (0x1C2), TX1A (0x1E1), TX1B (0x1E2), TX2A (0x1D1), TX2B (0x1D2), TX3A (0x1F1), TX3B (0x1F2). The reset value for these registers is undefined. The 32-bit TXx registers hold the output data for serial port transmit operations. For more information on how transmit buffers work, see Transmit and Receive Data Buffers on page 10-30.
SPORT Receive Buffer Registers (RXx)
The addresses of the RXx registers are: RX0A (0x1C3), RX0B (0x1C4), RX1A (0x1E3), RX1B (0x1E4), RX2A (0x1D3), RX2B (0x1D4), RX3A (0x1F3), RX3B (0x1F4). The reset value for these registers is undefined. The 32-bit RXx registers hold the input data from serial port receive operations. For more information on how receive buffers work, see Transmit and Receive Data Buffers on page 10-30.
SPORT Divisor Registers (DIVx)
The DIVx registers (shown in Figure A-36) are at these addresses: DIV0 (0x1C5), DIV1 (0x1E5), DIV2 (0x1D5), DIV3 (0x1F5). The reset value for these registers is undefined. These registers contain two fields:
Bits 15-0 are CLKDIV. These bits select the Serial Clock Divisor for internally generated SCLK as follows:
\[ CLKDIV = \frac{f_{CCLK}}{2\,f_{SCLK}} - 1 \]
Bits 31-16 are FSDIV. These bits select the Frame Sync Divisor for internally generated TFS as follows:
\[ FSDIV = \frac{f_{SCLK}}{f_{SFS}} - 1 \]
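The two divisor formulas can be evaluated and combined into a DIVx register value as sketched below. The function names are hypothetical, frequencies are assumed to be given in hertz, and integer division truncates, so frequencies that do not divide evenly produce the next lower divisor rather than an exact clock. This is an illustration of the arithmetic only, not a vendor routine.

```c
#include <assert.h>
#include <stdint.h>

/* CLKDIV = f_CCLK / (2 * f_SCLK) - 1, for an internally generated SCLK. */
static uint32_t sport_clkdiv(uint32_t f_cclk_hz, uint32_t f_sclk_hz)
{
    assert(f_sclk_hz > 0u);
    assert(f_cclk_hz / 2u >= f_sclk_hz);  /* SCLK cannot exceed CCLK/2 */
    return f_cclk_hz / (2u * f_sclk_hz) - 1u;
}

/* FSDIV = f_SCLK / f_SFS - 1, for an internally generated frame sync. */
static uint32_t sport_fsdiv(uint32_t f_sclk_hz, uint32_t f_sfs_hz)
{
    assert(f_sfs_hz > 0u && f_sclk_hz >= f_sfs_hz);
    return f_sclk_hz / f_sfs_hz - 1u;
}

/* FSDIV goes in bits 31-16 of DIVx and CLKDIV in bits 15-0. */
static uint32_t sport_div_pack(uint32_t clkdiv, uint32_t fsdiv)
{
    assert(clkdiv <= 0xFFFFu && fsdiv <= 0xFFFFu);
    return (fsdiv << 16) | clkdiv;
}
```

With a 100 MHz core clock, a 2.5 MHz serial clock, and a 50 kHz frame sync, `sport_clkdiv` returns 19, `sport_fsdiv` returns 49, and `sport_div_pack(19, 49)` returns 0x00310013.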
SPORT Count Registers (CNTx)
The addresses of the CNTx registers are: CNT0 (0x1C6), CNT1 (0x1E6), CNT2 (0x1D6), CNT3 (0x1F6). The reset value for these registers is undefined. The CNTx registers provide status information for the internal clock and frame sync.
SPORT Transmit Select Registers (MT2CSx and MT3CSx)
The addresses of the MT2CSx and MT3CSx registers are: MT2CS0 (0x1D7), MT2CS1 (0x1D9), MT2CS2 (0x1DB), MT2CS3 (0x1DD), MT3CS0 (0x1F7), MT3CS1 (0x1F9), MT3CS2 (0x1FB), MT3CS3 (0x1FD). The reset value for these registers is undefined. Each bit, 31-0, set (=1) in one of the four MTxCSx registers corresponds to an active transmit channel, 127-0, on a multichannel mode serial port. When the MT2CSx and MT3CSx registers activate a channel, the serial port transmits the word in that channel's position of the data stream. When a channel's bit in the MTxCSx register is cleared (=0), the serial port's DT (data transmit) pin three-states during the channel's transmit time slot.
SPORT Transmit Compand Registers (MT2CCSx and MT3CCSx)
The addresses of the MT2CCSx and MT3CCSx registers are: MT2CCS0 (0x1D8), MT2CCS1 (0x1DA), MT2CCS2 (0x1DC), MT2CCS3 (0x1DE), MT3CCS0 (0x1F8), MT3CCS1 (0x1FA), MT3CCS2 (0x1FC), MT3CCS3 (0x1FE). The reset value for these registers is undefined. Each bit, 31-0, set (=1) in one of the four MTxCCSx registers corresponds to a companded transmit channel, 127-0, on a multichannel mode serial port. When the MTxCCSx register activates companding for a channel, the serial port applies the companding from the serial port's DTYPE selection to the transmitted word in that channel's position of the data stream. When a channel's bit in the MTxCCSx register is cleared (=0), the serial port does not compand the output during the channel's transmit time slot.
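Because four 32-bit select registers cover channels 127-0, a channel number maps to a register index and a bit mask by division and remainder. The sketch below assumes the ordering given in the header file comments (register 0 covers channels 31-0, register 1 covers 63-32, and so on); the type `mc_slot` and both function names are hypothetical. The same mapping applies to the select and the compand select registers.

```c
#include <assert.h>
#include <stdint.h>

/* Location of one channel within a group of four select registers. */
typedef struct {
    unsigned reg_index;  /* 0..3, e.g. MT2CS0..MT2CS3 */
    uint32_t bit_mask;   /* single bit within that register */
} mc_slot;

/* Map a multichannel channel number (0..127) to its register and bit. */
static mc_slot mc_channel_slot(unsigned channel)
{
    assert(channel < 128u);
    mc_slot s;
    s.reg_index = channel / 32u;
    s.bit_mask  = UINT32_C(1) << (channel % 32u);
    return s;
}

/* Mark a channel active in a software copy of the four select registers. */
static void mc_enable_channel(uint32_t select[4], unsigned channel)
{
    mc_slot s = mc_channel_slot(channel);
    select[s.reg_index] |= s.bit_mask;
}
```

A program would typically build the four values in memory this way and then write each one to the corresponding IOP register address.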
SPORT Receive Select Registers
The addresses of the MRxCSx registers are: MR0CS0 (0x1C7), MR0CS1 (0x1C9), MR0CS2 (0x1CB), MR0CS3 (0x1CD), MR1CS0 (0x1E7), MR1CS1 (0x1E9), MR1CS2 (0x1EB), MR1CS3 (0x1ED). The reset value for these registers is undefined. Each bit, 31-0, set (=1) in one of the four MRxCSx registers corresponds to an active receive channel, 127-0, on a multichannel mode serial port. When the MRxCSx register activates a channel, the serial port receives the word in that channel's position of the data stream and loads the word into the RXx buffer. When a channel's bit in the MRxCSx register is cleared (=0), the serial port ignores any input during the channel's receive time slot.
SPORT Receive Compand Registers
The addresses of these registers are: MR0CCS0 (0x1C8), MR0CCS1 (0x1CA), MR0CCS2 (0x1CC), MR0CCS3 (0x1CE), MR1CCS0 (0x1E8), MR1CCS1 (0x1EA), MR1CCS2 (0x1EC), MR1CCS3 (0x1EE). The reset value for these registers is undefined. Each bit, 31-0, set (=1) in one of the four MR0CCSx or MR1CCSx registers corresponds to a companded receive channel, 127-0, on a multichannel mode serial port. When one of these registers activates companding for a channel, the serial port applies the companding from the serial port's DTYPE selection to the received word in that channel's position of the data stream. When a channel's bit in the MR0CCSx or MR1CCSx register is cleared (=0), the serial port does not compand the input during the channel's receive time slot.
SPISTAT (0xB5) bit fields:
RXS (bits 7-6): SPIRX Data Buffer Status (read-only). 00=SPIRX empty, 01=SPIRX partially full, 11=SPIRX full, 10=Reserved
SPIF: SPI Transmit Transfer Complete (1=transfer complete, 0=active transfer)
MME: Multimaster Error (0=no error, 1=SPIDS~ asserted by slave)
RBSY: Reception Error (Overflow). 1=new data received with full RXB FIFO; SPI enters idle mode if master device
TXE: Transmission Error (Underflow). 1=no new data in TX FIFO; SPI enters idle mode if master device
TXS (bits 4-3): SPITX Data Buffer Status (read-only). 00=SPITX empty, 01=TXB partially full, 11=SPITX full, 10=Reserved
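The two-bit RXS and TXS status fields of SPISTAT can be decoded with a shift and mask. The sketch below assumes TXS occupies bits 4-3 and RXS occupies bits 7-6, as the bit table for this register indicates; the enum and function names are hypothetical. Note that the SPI encoding (01=partially full) differs from the SPORT buffer status encoding (10=partially full).

```c
#include <assert.h>
#include <stdint.h>

/* SPI buffer status encoding: 00=empty, 01=partially full, 11=full. */
typedef enum {
    SPI_BUF_EMPTY    = 0,
    SPI_BUF_PARTIAL  = 1,
    SPI_BUF_RESERVED = 2,
    SPI_BUF_FULL     = 3
} spi_buf_status;

/* Extract a two-bit field whose least significant bit is at position lsb. */
static unsigned spistat_field2(uint32_t spistat, unsigned lsb)
{
    assert(lsb <= 30u);
    return (unsigned)((spistat >> lsb) & 0x3u);
}

/* TXS: assumed to be bits 4-3 of SPISTAT. */
static spi_buf_status spistat_txs(uint32_t spistat)
{
    return (spi_buf_status)spistat_field2(spistat, 3u);
}

/* RXS: assumed to be bits 7-6 of SPISTAT. */
static spi_buf_status spistat_rxs(uint32_t spistat)
{
    return (spi_buf_status)spistat_field2(spistat, 6u);
}
```

A core-driven receive loop could poll `spistat_rxs` on the value read from SPISTAT and read SPIRX only when the result is not `SPI_BUF_EMPTY`.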
Bit(s)  Name  Definition
SPTINT
3  MS
4  CP
CPHASE
7-8  WL
Bits 14 to 24 are controlled during master mode.
14  PSSE  Programmable Slave Select Enable. This bit is used to program the controlled automatic generation of slave device select signals during SPI transfers. This bit enables (if set, =1) or disables (if cleared, =0) the programmable slave select mode. The slave selection is subsequently made using the FLS bits.
15-18  FLS  Flag Select. These bits select which flag pins are asserted when multiple slaves are used (0=Disable, 1=Enable) as follows: Bit 15 = FLAG0, Bit 16 = FLAG1, Bit 17 = FLAG2, Bit 18 = FLAG3. Note: Only FLAG0 to FLAG3 can be used this way.
19  NSMLS  Non-Seamless Operation. When set (=1), this bit indicates that after each word transfer there is a delay before the next word transfer starts. When cleared (=0), it indicates no delay before the next word starts, a seamless operation.
20  DCPH0  Deselect SPIDS in CPHASE = 0. When set (=1), this bit deselects the slaves between successive word transfers when CPHASE = 0. The slave is selected in master mode using PSSE functionality. This bit has no effect in slave mode for the SPI port. This functionality is valid only when NSMLS = 1 and CPHASE = 0. This bit is cleared (=0) when not in use.
26  OPD
27  RDMAEN
SPICTL (0xB4) bit fields, bits 31-16:
GM: Fetch/Discard Incoming RXB data when RXB full (0=Discard incoming data, 1=Overwrite with new data)
FLS1: FLAG1 Slave Device Select (1=Enable, 0=Disable)
SENDLW: Send Zero/Repeat Byte When TXB Empty (0=Send zero, 1=Repeat last data)
FLS2: FLAG2 Slave Device Select (1=Enable, 0=Disable)
SGN: Sign Extend Data (0=no sign extend, 1=sign extend)
FLS3: FLAG3 Slave Device Select (1=Enable, 0=Disable)
PACKEN: 8-bit Packing Enable (0=no packing, 1=8 to 32-bit packing)
NSMLS: Non-Seamless operation (0=no delay, 1=delay before next word starts)
RDMAEN: Receive DMA Enable (1=Enable, 0=Disable)
OPD: Open Drain Output Enable for Data Pins (0=Normal, 1=Open Drain)
DCPH0: Deselect SPIDS in CPHASE=0 (master mode only, NSMLS bit=1) (0=No SPI device select, 1=Deselects slaves between successive transfers)
DMISO: Disable MISO Pin (Broadcast) (0=MISO Enabled, 1=MISO Disabled)
SPICTL (0xB4) bit fields, bits 15-0:
SPIEN: SPI System Enable (1=enable, 0=disable)
FLS0: FLAG0 Slave Device Select (1=Enable, 0=Disable)
PSSE: Programmable Slave Select Enable (0=Disable, 1=Enable)
SPRINT: SPI RX Buffer Interrupt Enable (1=enable SPI IRQ on RXB empty, 0=disable)
TDMAEN: Transmit DMA Enable (1=Enable, 0=Disable)
SPTINT: SPI TX Buffer Interrupt Enable (1=enable SPI IRQ on TXB not full, 0=disable)
BAUDR: Baud Rate = CCLK / (2**(2 + BR))
MS: Master/Slave Mode Bit (0=SPI slave device, 1=SPI master device)
WL: Word Length (00=8 bits, 01=16 bits, 11=32 bits, 10=RESERVED)
CP: Clock polarity (0=SPICLK active high, low in idle state; 1=SPICLK active low, high in idle state)
DF: Data Format (0=LSB sent/received first, 1=MSB sent/received first)
CPHASE: Clock phase (0=SPICLK toggles at middle of 1st data bit, 1=SPICLK toggles at beginning of 1st data bit)
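The BAUDR relation, baud rate = CCLK / 2^(2 + BR), is a power-of-two division and can be computed with a shift. The sketch below is illustrative only: the function name `spi_baud_hz` is hypothetical, CCLK is assumed to be given in hertz, and the upper bound on BR here only guards the shift against overflow, since the width of the BAUDR field is not stated in this section.

```c
#include <assert.h>
#include <stdint.h>

/* SPI baud rate in hertz for a given core clock and BAUDR field value. */
static uint32_t spi_baud_hz(uint32_t f_cclk_hz, unsigned br)
{
    assert(br <= 29u);             /* keep the shift count below 32 */
    return f_cclk_hz >> (2u + br); /* CCLK / 2^(2 + BR) */
}
```

With a 100 MHz core clock, BR=0 gives 25 MHz and BR=3 gives 3.125 MHz.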
automatically loaded into the internal memory. For core or interrupt driven transfer, you can also use the RXS status bits in the SPISTAT register to determine if the receive buffer is full. Reading from an empty SPIRX buffer causes a core hang if the buffer hang disable bit is cleared in the SYSCON register.
   Here are some example uses:
      bit set mode1 BR0|IRPTEN|ALUSAT;

      ustat1=BSO|HPM01|HMSWF;
      DM(SYSCON)=ustat1;
----------------------------------------------------------------------------- */
#ifndef __DEF21161_H_
#define __DEF21161_H_

/*----------------------------------------------------------------------------*/
/*                    System Register bit definitions                         */
/*----------------------------------------------------------------------------*/

/* MODE1 and MMASK registers */
#define BR8      0x00000001 /* Bit 0: Bit-reverse for I8 */
#define BR0      0x00000002 /* Bit 1: Bit-reverse for I0 (uses DMS0- only) */
#define SRCU     0x00000004 /* Bit 2: Alt. register select for comp. units */
#define SRD1H    0x00000008 /* Bit 3: DAG1 alt. register select (7-4) */
#define SRD1L    0x00000010 /* Bit 4: DAG1 alt. register select (3-0) */
#define SRD2H    0x00000020 /* Bit 5: DAG2 alt. register select (15-12) */
#define SRD2L    0x00000040 /* Bit 6: DAG2 alt. register select (11-8) */
#define SRRFH    0x00000080 /* Bit 7: Register file alt. select for R(15-8) */
#define SRRFL    0x00000400 /* Bit 10: Register file alt. select for R(7-0) */
#define NESTM    0x00000800 /* Bit 11: Interrupt nesting enable */
#define IRPTEN   0x00001000 /* Bit 12: Global interrupt enable */
#define ALUSAT   0x00002000 /* Bit 13: Enable ALU fixed-pt. saturation */
#define SSE      0x00004000 /* Bit 14: Enable short word sign extension */
#define TRUNCATE 0x00008000 /* Bit 15: 1=fltg-pt. truncation 0=Rnd to nearest */
#define RND32    0x00010000 /* Bit 16: 1=32-bit fltg-pt. rounding 0=40-bit rnd */
#define CSEL     0x00060000 /* Bit 17-18: CSelect: Bus Mastership */
#define PEYEN    0x00200000 /* Bit 21: Processing Element Y enable */
#define SIMD     0x00200000 /* Bit 21: Enable SIMD Mode */
#define BDCST9   0x00400000 /* Bit 22: Load Broadcast for I9 */
#define BDCST1   0x00800000 /* Bit 23: Load Broadcast for I1 */
#define CBUFEN   0x01000000 /* Bit 24: Circular Buffer Enable */

/* MODE2 register */
#define IRQ0E    0x00000001 /* Bit 0: IRQ0- 1=edge sens. 0=level sens. */
#define IRQ1E    0x00000002 /* Bit 1: IRQ1- 1=edge sens. 0=level sens. */
#define IRQ2E    0x00000004 /* Bit 2: IRQ2- 1=edge sens. 0=level sens. */
#define CADIS    0x00000010 /* Bit 4: Cache disable */
#define TIMEN    0x00000020 /* Bit 5: Timer enable */
#define BUSLK    0x00000040 /* Bit 6: External bus lock */
#define FLG0O    0x00008000 /* Bit 15: FLAG0 1=output 0=input */
#define FLG1O    0x00010000 /* Bit 16: FLAG1 1=output 0=input */
#define FLG2O    0x00020000 /* Bit 17: FLAG2 1=output 0=input */
#define FLG3O    0x00040000 /* Bit 18: FLAG3 1=output 0=input */
#define CAFRZ    0x00080000 /* Bit 19: Cache freeze */
#define IIRAE    0x00100000 /* Bit 20: Illegal IOP Register Access Enable */
#define U64MAE   0x00200000 /* Bit 21: Unaligned 64-bit Memory Access Enable */
/* bits 31-30, 27-25 are Processor ID[4:0], read only, value: 0b01001
   bits 29-28 are silicon revision[1:0], read only, value: 0
   These bits (only) are routed to Mode2 Shadow register (IOP register 0x11) */

/* FLAGS register */
#define FLG0     0x00000001 /* Bit 0 */
#define FLG1     0x00000002 /* Bit 1 */
#define FLG2     0x00000004 /* Bit 2 */
#define FLG3     0x00000008 /* Bit 3 */
/* ASTATx and ASTATy registers */
#ifdef SUPPORT_DEPRECATED_USAGE
/* Several of these (AV, AC, MV, SV, SZ) are assembler-reserved keywords,
   so this style is now deprecated. If these are defined, the assembler-
   reserved keywords are still available in lowercase, e.g.,
   IF sz JUMP LABEL1. */
# define AZ    0x00000001 /* Bit 0: ALU result zero or fltg-pt. underflow */
# define AV    0x00000002 /* Bit 1: ALU overflow */
# define AN    0x00000004 /* Bit 2: ALU result negative */
# define AC    0x00000008 /* Bit 3: ALU fixed-pt. carry */
# define AS    0x00000010 /* Bit 4: ALU X input sign (ABS and MANT ops) */
# define AI    0x00000020 /* Bit 5: ALU fltg-pt. invalid operation */
# define MN    0x00000040 /* Bit 6: Multiplier result negative */
# define MV    0x00000080 /* Bit 7: Multiplier overflow */
# define MU    0x00000100 /* Bit 8: Multiplier fltg-pt. underflow */
# define MI    0x00000200 /* Bit 9: Multiplier fltg-pt. invalid operation */
# define AF    0x00000400 /* Bit 10: ALU fltg-pt. operation */
# define SV    0x00000800 /* Bit 11: Shifter overflow */
# define SZ    0x00001000 /* Bit 12: Shifter result zero */
# define SS    0x00002000 /* Bit 13: Shifter input sign */
# define BTF   0x00040000 /* Bit 18: Bit test flag for system registers */
# define CACC0 0x01000000 /* Bit 24: Compare Accumulation Bit 0 */
# define CACC1 0x02000000 /* Bit 25: Compare Accumulation Bit 1 */
# define CACC2 0x04000000 /* Bit 26: Compare Accumulation Bit 2 */
# define CACC3 0x08000000 /* Bit 27: Compare Accumulation Bit 3 */
# define CACC4 0x10000000 /* Bit 28: Compare Accumulation Bit 4 */
# define CACC5 0x20000000 /* Bit 29: Compare Accumulation Bit 5 */
# define CACC6 0x40000000 /* Bit 30: Compare Accumulation Bit 6 */
# define CACC7 0x80000000 /* Bit 31: Compare Accumulation Bit 7 */
#endif
#define ASTAT_AZ    0x00000001 /* Bit 0: ALU result zero or fltg-pt. u'flow */
#define ASTAT_AV    0x00000002 /* Bit 1: ALU overflow */
#define ASTAT_AN    0x00000004 /* Bit 2: ALU result negative */
#define ASTAT_AC    0x00000008 /* Bit 3: ALU fixed-pt. carry */
#define ASTAT_AS    0x00000010 /* Bit 4: ALU X input sign (ABS and MANT ops) */
#define ASTAT_AI    0x00000020 /* Bit 5: ALU fltg-pt. invalid operation */
#define ASTAT_MN    0x00000040 /* Bit 6: Multiplier result negative */
#define ASTAT_MV    0x00000080 /* Bit 7: Multiplier overflow */
#define ASTAT_MU    0x00000100 /* Bit 8: Multiplier fltg-pt. underflow */
#define ASTAT_MI    0x00000200 /* Bit 9: Multiplier fltg-pt. invalid op. */
#define ASTAT_AF    0x00000400 /* Bit 10: ALU fltg-pt. operation */
#define ASTAT_SV    0x00000800 /* Bit 11: Shifter overflow */
#define ASTAT_SZ    0x00001000 /* Bit 12: Shifter result zero */
#define ASTAT_SS    0x00002000 /* Bit 13: Shifter input sign */
#define ASTAT_BTF   0x00040000 /* Bit 18: Bit test flag for system registers */
#define ASTAT_CACC0 0x01000000 /* Bit 24: Compare Accumulation Bit 0 */
#define ASTAT_CACC1 0x02000000 /* Bit 25: Compare Accumulation Bit 1 */
#define ASTAT_CACC2 0x04000000 /* Bit 26: Compare Accumulation Bit 2 */
#define ASTAT_CACC3 0x08000000 /* Bit 27: Compare Accumulation Bit 3 */
#define ASTAT_CACC4 0x10000000 /* Bit 28: Compare Accumulation Bit 4 */
#define ASTAT_CACC5 0x20000000 /* Bit 29: Compare Accumulation Bit 5 */
#define ASTAT_CACC6 0x40000000 /* Bit 30: Compare Accumulation Bit 6 */
#define ASTAT_CACC7 0x80000000 /* Bit 31: Compare Accumulation Bit 7 */
/* STKYx and STKYy registers */
/* bits 0 to 9 in both STKYx and STKYy, bits 17 to 26 in STKYx only */
#define AUS   0x00000001 /* Bit 0: ALU fltg-pt. underflow */
#define AVS   0x00000002 /* Bit 1: ALU fltg-pt. overflow */
#define AOS   0x00000004 /* Bit 2: ALU fixed-pt. overflow */
#define AIS   0x00000020 /* Bit 5: ALU fltg-pt. invalid operation */
#define MOS   0x00000040 /* Bit 6: Multiplier fixed-pt. overflow */
#define MVS   0x00000080 /* Bit 7: Multiplier fltg-pt. overflow */
#define MUS   0x00000100 /* Bit 8: Multiplier fltg-pt. underflow */
#define MIS   0x00000200 /* Bit 9: Multiplier fltg-pt. invalid operation */
/* STKYx register *ONLY* */
#define CB7S  0x00020000 /* Bit 17: DAG1 circular buffer 7 overflow */
#define CB15S 0x00040000 /* Bit 18: DAG2 circular buffer 15 overflow */
#define IIRA  0x00080000 /* Bit 19: Illegal IOP Register Access */
#define U64MA 0x00100000 /* Bit 20: Unaligned 64-bit Memory Access */
#define PCFL  0x00200000 /* Bit 21: PC stack full */
#define PCEM  0x00400000 /* Bit 22: PC stack empty */
#define SSOV  0x00800000 /* Bit 23: Status stack overflow (MODE1 and ASTAT) */
0x01000000 /* Bit 24: Status stack empty */
0x02000000 /* Bit 25: Loop stack overflow */
0x04000000 /* Bit 26: Loop stack empty */
/* IRPTL and IMASK and IMASKP registers */
#define EMUI    0x00000001 /* Bit 0: Offset: 00: Emulator Interrupt */
#define RSTI    0x00000002 /* Bit 1: Offset: 04: Reset */
#define IICDI   0x00000004 /* Bit 2: Offset: 08: Illegal Input Condition Detected */
#define SOVFI   0x00000008 /* Bit 3: Offset: 0c: Stack overflow */
#define TMZHI   0x00000010 /* Bit 4: Offset: 10: Timer = 0 (high priority) */
#define VIRPTI  0x00000020 /* Bit 5: Offset: 14: Vector interrupt */
#define IRQ2I   0x00000040 /* Bit 6: Offset: 18: IRQ2- asserted */
#define IRQ1I   0x00000080 /* Bit 7: Offset: 1c: IRQ1- asserted */
#define IRQ0I   0x00000100 /* Bit 8: Offset: 20: IRQ0- asserted */
#define SP0I    0x00000400 /* Bit 10: Offset: 28: SPORT0 DMA channel */
#define SP1I    0x00000800 /* Bit 11: Offset: 2c: SPORT1 DMA channel */
#define SP2I    0x00001000 /* Bit 12: Offset: 30: SPORT2 DMA channel */
#define SP3I    0x00002000 /* Bit 13: Offset: 34: SPORT3 DMA channel */
#define LPISUMI 0x00004000 /* Bit 14: Offset: na: LPort Interrupt Summary */
#define EP0I    0x00008000 /* Bit 15: Offset: 50: External port channel 0 DMA */
#define EP1I    0x00010000 /* Bit 16: Offset: 54: External port channel 1 DMA */
#define EP2I    0x00020000 /* Bit 17: Offset: 58: External port channel 2 DMA */
#define EP3I    0x00040000 /* Bit 18: Offset: 5c: External port channel 3 DMA */
#define LSRQI   0x00080000 /* Bit 19: Offset: 60: Link service request */
#define CB7I    0x00100000 /* Bit 20: Offset: 64: Circ. buffer 7 overflow */
#define CB15I   0x00200000 /* Bit 21: Offset: 68: Circ. buffer 15 overflow */
#define TMZLI   0x00400000 /* Bit 22: Offset: 6c: Timer = 0 (low priority) */
#define FIXI    0x00800000 /* Bit 23: Offset: 70: Fixed-pt. overflow */
#define FLTOI   0x01000000 /* Bit 24: Offset: 74: fltg-pt. overflow */
#define FLTUI   0x02000000 /* Bit 25: Offset: 78: fltg-pt. underflow */
#define FLTII   0x04000000 /* Bit 26: Offset: 7c: fltg-pt. invalid */
#define SFT0I   0x08000000 /* Bit 27: Offset: 80: user software int 0 */
#define SFT1I   0x10000000 /* Bit 28: Offset: 84: user software int 1 */
#define SFT2I   0x20000000 /* Bit 29: Offset: 88: user software int 2 */
#define SFT3I   0x40000000 /* Bit 30: Offset: 8c: user software int 3 */

/* LIRPTL register */
#define LP0I     0x00000001 /* Bit 0: Offset: 38: Link port channel 0 DMA */
#define LP1I     0x00000002 /* Bit 1: Offset: 3C: Link port channel 1 DMA */
#define SPIRI    0x00000004 /* Bit 2: Offset: 40: SPI Receive DMA */
#define SPITI    0x00000008 /* Bit 3: Offset: 44: SPI Transmit DMA */
#define LP0MSK   0x00010000 /* Bit 16: Link port channel 0 Interrupt Mask */
#define LP1MSK   0x00020000 /* Bit 17: Link port channel 1 Interrupt Mask */
#define SPIRMSK  0x00040000 /* Bit 18: SPI Receive Interrupt Mask */
#define SPITMSK  0x00080000 /* Bit 19: SPI Transmit Interrupt Mask */
#define LP0MSKP  0x01000000 /* Bit 24: Link port channel 0 Interrupt Mask Pointer */
#define LP1MSKP  0x02000000 /* Bit 25: Link port channel 1 Interrupt Mask Pointer */
#define SPIRMSKP 0x04000000 /* Bit 26: SPI Receive Interrupt Mask Pointer */
#define SPITMSKP 0x08000000 /* Bit 27: SPI Transmit Interrupt Mask Pointer */

/* LSRQ register */
#define L0TM  0x00000010 /* Link Port 0 Transmit Mask */
#define L0RM  0x00000020 /* Link Port 0 Receive Mask */
#define L1TM  0x00000040 /* Link Port 1 Transmit Mask */
#define L1RM  0x00000080 /* Link Port 1 Receive Mask */
#define L0TRQ 0x00100000 /* Link Port 0 Transmit Request */
#define L1TRQ 0x00200000 /* Link Port 1 Transmit Request */
#define L0RRQ 0x00400000 /* Link Port 0 Receive Request */
#define L1RRQ 0x00800000 /* Link Port 1 Receive Request */

/*------------------------------------------------------------------------------*/
/*                                                                              */
/*                 I/O Processor Register Address Memory Map                    */
/*                                                                              */
/*------------------------------------------------------------------------------*/
#define SYSCON 0x00 /* System configuration register */
#define VIRPT  0x01 /* Vector interrupt register */
#define WAIT   0x02 /* External Port Wait register - renamed to EPCON */
#define EPCON  0x02 /* External Port configuration register */
#define SYSTAT 0x03 /* System status register */
/* the upper 32-bits of the 64-bit epbxs are only accessible as 64-bit reference */
#define EPB0   0x04 /* External port DMA buffer 0 */
#define EPB1   0x06 /* External port DMA buffer 1 */
#define MSGR0  0x08 /* Message register 0 */
#define MSGR1  0x09 /* Message register 1 */
#define MSGR2  0x0a /* Message register 2 */
#define MSGR3  0x0b /* Message register 3 */
#define MSGR4  0x0c /* Message register 4 */
#define MSGR5  0x0d /* Message register 5 */
#define MSGR6  0x0e /* Message register 6 */
#define MSGR7  0x0f /* Message register 7 */

/* IOP shadow registers of the core control regs */
#define PC_SHDW    0x10 /* PC IOP shadow register (PC[23-0]) */
#define MODE2_SHDW 0x11 /* Mode2 IOP shadow register (MODE2[31-25]) */
#define EPB2    0x14 /* External port DMA buffer 2 */
#define EPB3    0x16 /* External port DMA buffer 3 */
#define BMAX    0x18 /* Bus time-out maximum */
#define BCNT    0x19 /* Bus time-out counter */
#define DMAC10  0x1c /* EP DMA10 control register */
#define DMAC11  0x1d /* EP DMA11 control register */
#define DMAC12  0x1e /* EP DMA12 control register */
#define DMAC13  0x1f /* EP DMA13 control register */
#define DMASTAT 0x37 /* DMA channel status register */

/* SPI Registers IOP Register Addresses */
#define SPICTL  0xb4 /* Serial peripheral-compatible interface control register */
#define SPISTAT 0xb5 /* Serial peripheral-compatible interface status register */
#define SPIRX   0xb7 /* SPI receive data buffer */
#define SPITX   0xb6 /* SPI transmit data buffer */

/* IOFLAG Register Address */
#define IOFLAG  0x1b /* Address of programmable I/O flags 4-11 */

/* IOP registers for SDRAM controller. */
#define SDCTL   0xb8 /* SDRAM control reg. */
#define SDRDIV  0xb9 /* Refresh counter div reg. */

/* Link Port Registers */
#define LBUF0   0xc0 /* Link buffer 0 */
#define LBUF1   0xc2 /* Link buffer 1 */
#define LCTL    0xcc /* Link buffer control */
#define LSRQ    0xd0 /* Link service request and mask registers */
/* SPORT0 */
#define SPCTL0 0x1c0 /* SPORT0 serial port control register */
#define TX0A   0x1c1 /* SPORT0 transmit primary A channel data buffer */
#define TX0B   0x1c2 /* SPORT0 transmit secondary B channel data buffer */
#define RX0A   0x1c3 /* SPORT0 receive primary A channel data buffer */
#define RX0B   0x1c4 /* SPORT0 receive secondary B channel data buffer */
#define DIV0   0x1c5 /* SPORT0 divisor for transmit/receive SCLK0 and FS0 */
#define CNT0   0x1c6 /* SPORT0 count register */

/* SPORT2 */
#define SPCTL2 0x1d0 /* SPORT2 serial port control register */
#define TX2A   0x1d1 /* SPORT2 transmit primary A channel data buffer */
#define TX2B   0x1d2 /* SPORT2 transmit secondary B channel data buffer */
#define RX2A   0x1d3 /* SPORT2 receive primary A channel data buffer */
#define RX2B   0x1d4 /* SPORT2 receive secondary B channel data buffer */
#define DIV2   0x1d5 /* SPORT2 divisor for transmit/receive SCLK2 and FS2 */
#define CNT2   0x1d6 /* SPORT2 count register */

/* SPORT1 */
#define SPCTL1 0x1e0 /* SPORT1 serial port control register */
#define TX1A   0x1e1 /* SPORT1 transmit primary A channel data buffer */
#define TX1B   0x1e2 /* SPORT1 transmit secondary B channel data buffer */
#define RX1A   0x1e3 /* SPORT1 receive primary A channel data buffer */
#define RX1B   0x1e4 /* SPORT1 receive secondary B channel data buffer */
#define DIV1   0x1e5 /* SPORT1 divisor for transmit/receive SCLK1 and FS1 */
#define CNT1   0x1e6 /* SPORT1 count register */
/* SPORT3 */
#define SPCTL3   0x1f0 /* SPORT3 serial port control register */
#define TX3A     0x1f1 /* SPORT3 transmit primary A channel data buffer */
#define TX3B     0x1f2 /* SPORT3 transmit secondary B channel data buffer */
#define RX3A     0x1f3 /* SPORT3 receive primary A channel data buffer */
#define RX3B     0x1f4 /* SPORT3 receive secondary B channel data buffer */
#define DIV3     0x1f5 /* SPORT3 divisor for transmit/receive SCLK3 and FS3 */
#define CNT3     0x1f6 /* SPORT3 count register */

/* SPORT0 - MCM Receive (Works in pair with SPORT2) */
#define MR0CS0   0x1c7 /* SPORT0 multichannel rx select, channels 31 - 0 */
#define MR0CCS0  0x1c8 /* SPORT0 multichannel rx compand select, channels 31 - 0 */
#define MR0CS1   0x1c9 /* SPORT0 multichannel rx select, channels 63 - 32 */
#define MR0CCS1  0x1ca /* SPORT0 multichannel rx compand select, channels 63 - 32 */
#define MR0CS2   0x1cb /* SPORT0 multichannel rx select, channels 95 - 64 */
#define MR0CCS2  0x1cc /* SPORT0 multichannel rx compand select, channels 95 - 64 */
#define MR0CS3   0x1cd /* SPORT0 multichannel rx select, channels 127 - 96 */
#define MR0CCS3  0x1ce /* SPORT0 multichannel rx compand select, channels 127 - 96 */

/* SPORT2 - MCM Transmit (Works in pair with SPORT0) */
#define MT2CS0   0x1d7 /* SPORT2 multichannel tx select, channels 31 - 0 */
#define MT2CCS0  0x1d8 /* SPORT2 multichannel tx compand select, channels 31 - 0 */
#define MT2CS1   0x1d9 /* SPORT2 multichannel tx select, channels 63 - 32 */
#define MT2CCS1  0x1da /* SPORT2 multichannel tx compand select, channels 63 - 32 */
#define MT2CS2   0x1db /* SPORT2 multichannel tx select, channels 95 - 64 */
#define MT2CCS2  0x1dc /* SPORT2 multichannel tx compand select, channels 95 - 64 */
#define MT2CS3   0x1dd /* SPORT2 multichannel tx select, channels 127 - 96 */
#define MT2CCS3  0x1de /* SPORT2 multichannel tx compand select, channels 127 - 96 */
#define SP02MCTL 0x1df

/* SPORT1 - MCM Receive (Works in pair with SPORT3) */
#define MR1CS0   0x1e7 /* SPORT1 multichannel rx select, channels 31 - 0 */
#define MR1CCS0  0x1e8 /* SPORT1 multichannel rx compand select, channels 31 - 0 */
#define MR1CS1   0x1e9 /* SPORT1 multichannel rx select, channels 63 - 32 */
#define MR1CCS1  0x1ea /* SPORT1 multichannel rx compand select, channels 63 - 32 */
#define MR1CS2   0x1eb /* SPORT1 multichannel rx select, channels 95 - 64 */
#define MR1CCS2  0x1ec /* SPORT1 multichannel rx compand select, channels 95 - 64 */
#define MR1CS3   0x1ed /* SPORT1 multichannel rx select, channels 127 - 96 */
#define MR1CCS3  0x1ee /* SPORT1 multichannel rx compand select, channels 127 - 96 */

/* SPORT3 - MCM Transmit (Works in pair with SPORT1) */
#define MT3CS0   0x1f7 /* SPORT3 multichannel tx select, channels 31 - 0 */
#define MT3CCS0  0x1f8 /* SPORT3 multichannel tx compand select, channels 31 - 0 */
#define MT3CS1   0x1f9 /* SPORT3 multichannel tx select, channels 63 - 32 */
#define MT3CCS1  0x1fa /* SPORT3 multichannel tx compand select, channels 63 - 32 */
#define MT3CS2   0x1fb /* SPORT3 multichannel tx select, channels 95 - 64 */
#define MT3CCS2  0x1fc /* SPORT3 multichannel tx compand select, channels 95 - 64 */
#define MT3CS3   0x1fd /* SPORT3 multichannel tx select, channels 127 - 96 */
#define MT3CCS3  0x1fe /* SPORT3 multichannel tx compand select, channels 127 - 96 */
#define SP13MCTL
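
Each multichannel select register in the listing covers a block of 32 channels, and the select and compand-select registers alternate in the address map. The sketch below shows one way to turn a channel number into a register address and a bit mask. The helper names are hypothetical, and the bit ordering (bit n of a register selecting the n-th channel of its range) is an assumption that the listing does not state.

```c
#include <assert.h>
#include <stdint.h>

/* IOP address copied from the listing. */
#define MR0CS0 0x1c7 /* SPORT0 multichannel rx select, channels 31 - 0 */

/* Hypothetical helper: IOP address of the SPORT0 rx select register that
   covers `channel`. Select and compand-select registers alternate, so
   consecutive select registers sit two addresses apart. */
static unsigned mr0cs_address(unsigned channel)
{
    assert(channel < 128);
    return MR0CS0 + 2u * (channel / 32u);
}

/* Hypothetical helper: mask for `channel` inside its select register,
   ASSUMING bit n corresponds to the n-th channel of the register's range. */
static uint32_t mcm_channel_mask(unsigned channel)
{
    assert(channel < 128);
    return (uint32_t)1 << (channel % 32u);
}
```

For example, channel 33 would fall in MR0CS1 (0x1c9) under this mapping.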

/*------ DMA Parameter Register Assignments - New Naming Conventions -------*/
/* DMA Channel 0 - Serial Port 0, A channel data */
#define II0A 0x60 /* Internal DMA0 memory address */
#define IM0A 0x61 /* Internal DMA0 memory access modifier */
#define C0A  0x62 /* Contains number of DMA0 transfers remaining */
#define CP0A 0x63 /* Points to next DMA0 parameters */
#define GP0A 0x64 /* DMA0 General purpose */
/* DMA Channel 1 - Serial Port 0, B channel data */
#define II0B 0x80 /* Internal DMA1 memory address */
#define IM0B 0x81 /* Internal DMA1 memory access modifier */
#define C0B  0x82 /* Contains number of DMA1 transfers remaining */
#define CP0B 0x83 /* Points to next DMA1 parameters */
#define GP0B 0x84 /* DMA1 General purpose */
/* DMA Channel 2 - Serial Port 1, A channel data */
#define II1A 0x68 /* Internal DMA2 memory address */
#define IM1A 0x69 /* Internal DMA2 memory access modifier */
#define C1A  0x6a /* Contains number of DMA2 transfers remaining */
#define CP1A 0x6b /* Points to next DMA2 parameters */
#define GP1A 0x6c /* DMA2 General purpose */
/* DMA Channel 3 - Serial Port 1, B channel data */
#define II1B 0x88 /* Internal DMA3 memory address */
#define IM1B 0x89 /* Internal DMA3 memory access modifier */
#define C1B  0x8a /* Contains number of DMA3 transfers remaining */
#define CP1B 0x8b /* Points to next DMA3 parameters */
#define GP1B 0x8c /* DMA3 General purpose */
/* DMA Channel 4 - Serial Port 2, A channel data */
#define II2A 0x70 /* Internal DMA4 memory address */
#define IM2A 0x71 /* Internal DMA4 memory access modifier */
#define C2A  0x72 /* Contains number of DMA4 transfers remaining */
#define CP2A 0x73 /* Points to next DMA4 parameters */
#define GP2A 0x74 /* DMA4 General purpose */
/* DMA Channel 5 - Serial Port 2, B channel data */
#define II2B 0x90 /* Internal DMA5 memory address */
#define IM2B 0x91 /* Internal DMA5 memory access modifier */
#define C2B  0x92 /* Contains number of DMA5 transfers remaining */
#define CP2B 0x93 /* Points to next DMA5 parameters */
#define GP2B 0x94 /* DMA5 General purpose */
/* DMA Channel 6 - Serial Port 3, A channel data */
#define II3A 0x78 /* Internal DMA6 memory address */
#define IM3A 0x79 /* Internal DMA6 memory access modifier */
#define C3A  0x7a /* Contains number of DMA6 transfers remaining */
#define CP3A 0x7b /* Points to next DMA6 parameters */
#define GP3A 0x7c /* DMA6 General purpose */
/* DMA Channel 7 - Serial Port 3, B channel data */
#define II3B 0x98 /* Internal DMA7 memory address */
#define IM3B 0x99 /* Internal DMA7 memory access modifier */
#define C3B  0x9a /* Contains number of DMA7 transfers remaining */
#define CP3B 0x9b /* Points to next DMA7 parameters */
#define GP3B 0x9c /* DMA7 General purpose */
/* DMA Channel 8 - Link Buffer 0 (or SPI Receive) */
#define IILB0 0x30 /* Internal DMA8 memory address */
#define IMLB0 0x31 /* Internal DMA8 memory access modifier */
#define CLB0  0x32 /* Contains number of DMA8 transfers remaining */
#define CPLB0 0x33 /* Points to next DMA8 parameters */
#define GPLB0 0x34 /* DMA8 General purpose */
/* DMA Channel 8 - SPI Receive (or Link Buffer 0) - No DMA Chain Pointer reg */
#define IISRX 0x30 /* Internal DMA8 memory address */
#define IMSRX 0x31 /* Internal DMA8 memory access modifier */
#define CSRX  0x32 /* Contains number of DMA8 transfers remaining */
#define GPSRX 0x34 /* DMA8 General purpose */
/* DMA Channel 9 - Link Buffer 1 (or SPI Transmit) */
#define IILB1 0x38 /* Internal DMA9 memory address */
#define IMLB1 0x39 /* Internal DMA9 memory access modifier */
#define CLB1  0x3a /* Contains number of DMA9 transfers remaining */
#define CPLB1 0x3b /* Points to next DMA9 parameters */
#define GPLB1 0x3c /* DMA9 General purpose */
/* DMA Channel 9 - SPI Transmit (or Link Buffer 1) - No DMA Chain Pointer reg */
#define IISTX 0x38 /* Internal DMA9 memory address */
#define IMSTX 0x39 /* Internal DMA9 memory access modifier */
#define CSTX  0x3a /* Contains number of DMA9 transfers remaining */
#define GPSTX 0x3c /* DMA9 General purpose */
/* DMA Channel 10 - External Port FIFO Buffer 0 */
#define IIEP0 0x40 /* Internal DMA10 memory address */
#define IMEP0 0x41 /* Internal DMA10 memory access modifier */
#define CEP0  0x42 /* Contains number of DMA10 transfers remaining */
#define CPEP0 0x43 /* Points to next DMA10 parameters */
#define GPEP0 0x44 /* DMA10 General purpose */
#define EIEP0 0x45 /* External DMA10 address */
#define EMEP0 0x46 /* External DMA10 address modifier */
#define ECEP0 0x47 /* External DMA10 counter */
/* DMA Channel 11 - External Port FIFO Buffer 1 */
#define IIEP1 0x48 /* Internal DMA11 memory address */
#define IMEP1 0x49 /* Internal DMA11 memory access modifier */
#define CEP1  0x4a /* Contains number of DMA11 transfers remaining */
#define CPEP1 0x4b /* Points to next DMA11 parameters */
#define GPEP1 0x4c /* DMA11 General purpose */
#define EIEP1 0x4d /* External DMA11 address */
#define EMEP1 0x4e /* External DMA11 address modifier */
#define ECEP1 0x4f /* External DMA11 counter */
/* DMA Channel 12 - External Port FIFO Buffer 2 */
#define IIEP2 0x50 /* Internal DMA12 memory address */
#define IMEP2 0x51 /* Internal DMA12 memory access modifier */
#define CEP2  0x52 /* Contains number of DMA12 transfers remaining */
#define CPEP2 0x53 /* Points to next DMA12 parameters */
#define GPEP2 0x54 /* DMA12 General purpose */
#define EIEP2 0x55 /* External DMA12 address */
#define EMEP2 0x56 /* External DMA12 address modifier */
#define ECEP2 0x57 /* External DMA12 counter */
/* DMA Channel 13 - External Port FIFO Buffer 3 */
#define IIEP3 0x58 /* Internal DMA13 memory address */
#define IMEP3 0x59 /* Internal DMA13 memory access modifier */
#define CEP3  0x5a /* Contains number of DMA13 transfers remaining */
#define CPEP3 0x5b /* Points to next DMA13 parameters */
#define GPEP3 0x5c /* DMA13 General purpose */
#define EIEP3 0x5d /* External DMA13 address */
#define EMEP3 0x5e /* External DMA13 address modifier */
#define ECEP3 0x5f /* External DMA13 counter */
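
The external port parameter registers follow a regular layout: each EPBx channel occupies eight consecutive IOP addresses starting at IIEP0, in the order II, IM, C, CP, GP, EI, EM, EC. A minimal sketch of computing an address from that pattern follows; the enum and the helper `ep_dma_reg` are hypothetical names introduced only for illustration.

```c
#include <assert.h>

/* Base address copied from the listing. */
#define IIEP0 0x40 /* Internal DMA10 memory address */

/* Hypothetical names for the eight parameter slots of one EPBx channel,
   in the order they appear in the listing. */
enum dma_param {
    DMA_II = 0, /* internal memory address */
    DMA_IM = 1, /* internal memory access modifier */
    DMA_C  = 2, /* transfers remaining */
    DMA_CP = 3, /* chain pointer */
    DMA_GP = 4, /* general purpose */
    DMA_EI = 5, /* external address */
    DMA_EM = 6, /* external address modifier */
    DMA_EC = 7  /* external counter */
};

/* Hypothetical helper: IOP address of parameter `p` for external port
   FIFO buffer `buffer` (0..3, i.e. DMA channels 10..13). */
static unsigned ep_dma_reg(unsigned buffer, enum dma_param p)
{
    assert(buffer < 4);
    return IIEP0 + 8u * buffer + (unsigned)p;
}
```

With this layout, `ep_dma_reg(2, DMA_CP)` yields 0x53, which matches CPEP2 in the listing.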

/*---- DMA Parameter Register Assignments - Old Legacy ADSP-21160 Naming Conventions ---- */
/* NOTE: For backwards compatibility, the old DMA parameter register names used
   in the ADSP-21160 are retained. However, the naming conventions used for DMA
   channels of the ADSP-21160 do not necessarily correspond to the actual DMA
   channel priority assignment for the ADSP-21161.
   Ex) DMA Channel 4 IOP addresses on the ADSP-21160 are now DMA channel 8 on the ADSP-21161
       DMA Channel 5 IOP addresses on the ADSP-21160 are now DMA channel 9 on the ADSP-21161
   To avoid confusion, we recommend using the new IOP naming conventions for
   the DMA parameter registers as defined above. */
#define II0  0x60 /* Internal DMA0 memory address */
#define IM0  0x61 /* Internal DMA0 memory access modifier */
#define C0   0x62 /* Contains number of DMA0 transfers remaining */
#define CP0  0x63 /* Points to next DMA0 parameters */
#define GP0  0x64 /* DMA0 General purpose */
#define II1  0x68 /* Internal DMA1 memory address */
#define IM1  0x69 /* Internal DMA1 memory access modifier */
#define C1   0x6a /* Contains number of DMA1 transfers remaining */
#define CP1  0x6b /* Points to next DMA1 parameters */
#define GP1  0x6c /* DMA1 General purpose */
#define II2  0x70 /* Internal DMA2 memory address */
#define IM2  0x71 /* Internal DMA2 memory access modifier */
#define C2   0x72 /* Contains number of DMA2 transfers remaining */
#define CP2  0x73 /* Points to next DMA2 parameters */
#define GP2  0x74 /* DMA2 General purpose */
#define II3  0x78 /* Internal DMA3 memory address */
#define IM3  0x79 /* Internal DMA3 memory access modifier */
#define C3   0x7a /* Contains number of DMA3 transfers remaining */
#define CP3  0x7b /* Points to next DMA3 parameters */
#define GP3  0x7c /* DMA3 General purpose */
#define II6  0x80 /* Internal DMA6 memory address */
#define IM6  0x81 /* Internal DMA6 memory access modifier */
#define C6   0x82 /* Contains number of DMA6 transfers remaining */
#define CP6  0x83 /* Points to next DMA6 parameters */
#define GP6  0x84 /* DMA6 General purpose */
#define II7  0x88 /* Internal DMA7 memory address */
#define IM7  0x89 /* Internal DMA7 memory access modifier */
#define C7   0x8a /* Contains number of DMA7 transfers remaining */
#define CP7  0x8b /* Points to next DMA7 parameters */
#define GP7  0x8c /* DMA7 General purpose */
#define II8  0x90 /* Internal DMA8 memory address */
#define IM8  0x91 /* Internal DMA8 memory access modifier */
#define C8   0x92 /* Contains number of DMA8 transfers remaining */
#define CP8  0x93 /* Points to next DMA8 parameters */
#define GP8  0x94 /* DMA8 General purpose */
#define II9  0x98 /* Internal DMA9 memory address */
#define IM9  0x99 /* Internal DMA9 memory access modifier */
#define C9   0x9a /* Contains number of DMA9 transfers remaining */
#define CP9  0x9b /* Points to next DMA9 parameters */
#define GP9  0x9c /* DMA9 General purpose */
#define II4  0x30 /* Internal DMA4 memory address */
#define IM4  0x31 /* Internal DMA4 memory access modifier */
#define C4   0x32 /* Contains number of DMA4 transfers remaining */
#define CP4  0x33 /* Points to next DMA4 parameters */
#define GP4  0x34 /* DMA4 General purpose */
#define II5  0x38 /* Internal DMA5 memory address */
#define IM5  0x39 /* Internal DMA5 memory access modifier */
#define C5   0x3a /* Contains number of DMA5 transfers remaining */
#define CP5  0x3b /* Points to next DMA5 parameters */
#define GP5  0x3c /* DMA5 General purpose */
#define II10 0x40 /* Internal DMA10 memory address */
#define IM10 0x41 /* Internal DMA10 memory access modifier */
#define C10  0x42 /* Contains number of DMA10 transfers remaining */
#define CP10 0x43 /* Points to next DMA10 parameters */
#define GP10 0x44 /* DMA10 General purpose */
#define EI10 0x45 /* External DMA10 address */
#define EM10 0x46 /* External DMA10 address modifier */
#define EC10 0x47 /* External DMA10 counter */
#define II11 0x48 /* Internal DMA11 memory address */
#define IM11 0x49 /* Internal DMA11 memory access modifier */
#define C11  0x4a /* Contains number of DMA11 transfers remaining */
#define CP11 0x4b /* Points to next DMA11 parameters */
#define GP11 0x4c /* DMA11 General purpose */
#define EI11 0x4d /* External DMA11 address */
#define EM11 0x4e /* External DMA11 address modifier */
#define EC11 0x4f /* External DMA11 counter */
#define II12 0x50 /* Internal DMA12 memory address */
#define IM12 0x51 /* Internal DMA12 memory access modifier */
#define C12  0x52 /* Contains number of DMA12 transfers remaining */
#define CP12 0x53 /* Points to next DMA12 parameters */
#define GP12 0x54 /* DMA12 General purpose */
#define EI12 0x55 /* External DMA12 address */
#define EM12 0x56 /* External DMA12 address modifier */
#define EC12 0x57 /* External DMA12 counter */
#define II13 0x58 /* Internal DMA13 memory address */
#define IM13 0x59 /* Internal DMA13 memory access modifier */
#define C13  0x5a /* Contains number of DMA13 transfers remaining */
#define CP13 0x5b /* Points to next DMA13 parameters */
#define GP13 0x5c /* DMA13 General purpose */
#define EI13 0x5d /* External DMA13 address */
#define EM13 0x5e /* External DMA13 address modifier */
#define EC13 0x5f /* External DMA13 counter */

/* Emulation/Breakpoint Registers (remapped from UREG space) */
/* NOTES:
   - These registers are ONLY accessible by the core
   - It is *highly* recommended that these facilities be accessed only
     through the ADI emulator routines */
/* Core Emulation HWBP Registers */
#define PSA1S 0xa0 /* Instruction address start #1 */
#define PSA1E 0xa1 /* Instruction address end #1 */
#define PSA2S 0xa2 /* Instruction address start #2 */
#define PSA2E 0xa3 /* Instruction address end #2 */
#define PSA3S 0xa4 /* Instruction address start #3 */
#define PSA3E 0xa5 /* Instruction address end #3 */
#define PSA4S 0xa6 /* Instruction address start #4 */
#define PSA4E 0xa7 /* Instruction address end #4 */
#define PMDAS 0xa8 /* Program Data address start */
#define PMDAE 0xa9 /* Program Data address end */
#define DMA1S 0xaa /* Data address start #1 */
#define DMA1E 0xab /* Data address end #1 */
#define DMA2S 0xac /* Data address start #2 */
#define DMA2E 0xad /* Data address end #2 */
#define EMUN  0xae /* hwbp hit-count register */
/* IOP Emulation HWBP Bounds Registers */
#define IOAS  0xb0 /* IOA Upper Bounds */
#define IOAE  0xb1 /* IOA Lower Bounds */
#define EPAS  0xb2 /* EPA Upper Bounds */
#define EPAE  0xb3 /* EPA Lower Bounds */

/*----------------------------------------------------------------------------*/
/*                                                                            */
/*              IOP Control/Status Register Bit Definitions                   */
/*                                                                            */
/*----------------------------------------------------------------------------*/
/* SYSCON Register */
#define SRST   0x00000001 /* Soft Reset */
#define BSO    0x00000002 /* Boot Select Override */
#define IIVT   0x00000004 /* Internal Interrupt Vector Table */
#define IWT    0x00000008 /* Instruction word transfer (0 = data, 1 = inst) */
#define HBW32  0x00000000 /* Host bus width: 32 */
#define HBW16  0x00000010 /* Host bus width: 16 */
#define HBW8   0x00000020 /* Host bus width: 8 */
#define HMSWF  0x00000080 /* Host packing order (0 = LSW first, 1 = MSW) */
#define HPFLSH 0x00000100 /* Host pack flush */
#define IMDW0X 0x00000200 /* Internal memory block 0, extended data (40 bit) */
#define IMDW1X 0x00000400 /* Internal memory block 1, extended data (40 bit) */
#define ADREDY 0x00000800 /* Active Drive Ready */
#define BHD    0x00010000 /* Buffer Hang Disable */
#define EBPR00 0x00000000 /* External bus priority: Even */
#define EBPR01 0x00020000 /* External bus priority: Core has priority */
#define EBPR10 0x00040000 /* External bus priority: IO has priority */
/* Select rotating access priority on DMA10 - DMA13 */
/* Select rotating access priority on DMA8 - DMA9 */
/* Select rotating prio between LPort and EPort */
/* Clock Out Disable */
/* External instruction execution packing mode bit 0 */
/* External instruction execution packing mode bit 1 */

/* SYSTAT Register */
#define HSTM  0x00000001 /* Host is the Bus Master */
#define BSYN  0x00000002 /* Bus arbitration logic is synchronized */
#define CRBM  0x00000070 /* Current ADSP211xx Bus Master */
#define IDC   0x00000700 /* ADSP211xx ID Code */
#define VIPD  0x00002000 /* Vector interrupt pending (1 = pending) */
#define CRAT  0x00070000 /* CLK_CFG(3-0), Core:CLKIN clock ratio */
#define SSWPD 0x00100000 /* Sync slave write pending... SSWPD bit added for 21161 */
#define SWPD  0x00200000 /* Any (sync + Async) slave write pending */
#define HPS   0x01c00000 /* Host pack status... HPS modified for 21161 */
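
Several SYSTAT definitions (CRBM, IDC, CRAT, HPS) are multi-bit field masks rather than single flags, so reading them means masking and then shifting the field down to bit 0. The sketch below shows one generic way to do that from the mask alone. The helper `field_get` is a hypothetical name, and how the register value is read from the IOP is outside the scope of this example.

```c
#include <assert.h>
#include <stdint.h>

/* Field masks copied from the listing. */
#define CRBM 0x00000070 /* Current ADSP211xx Bus Master */
#define IDC  0x00000700 /* ADSP211xx ID Code */
#define CRAT 0x00070000 /* CLK_CFG(3-0), Core:CLKIN clock ratio */
#define HPS  0x01c00000 /* Host pack status */

/* Hypothetical helper: return the field selected by `mask`, right-aligned.
   (mask & (~mask + 1)) isolates the lowest set bit of the mask; dividing
   by that power of two is equivalent to shifting the field down to bit 0. */
static uint32_t field_get(uint32_t reg, uint32_t mask)
{
    assert(mask != 0);
    return (reg & mask) / (mask & (~mask + 1u));
}
```

For a register value of 0x00030350, `field_get(value, IDC)` gives 3 and `field_get(value, CRBM)` gives 5.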

/* MODE2_SHDW Register - IOP register address 0x11 */
/* bits 31-30, 27-25 are Processor ID[4:0], read only, value: 01010
   bits 29-28 are silicon revision[1:0], read only, value: 01
   These former MODE2 register bitfields (only) are now routed to the
   MODE2 Shadow register (IOP register 0x11). Bits 25-31 now reserved in MODE2. */
#define PID20 0x0E000000 /* PID[2:0] Processor Identification (read-only) */
#define SIREV 0x30000000 /* Silicon Revision (read-only) */
#define PID43 0xC0000000 /* PID[4:3] Processor Identification (read-only) */
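
Because the processor ID is split across two non-adjacent fields (PID[4:3] in bits 31-30 and PID[2:0] in bits 27-25), reassembling it takes two masks and two shifts. The following is a small sketch of that reassembly; the function names are hypothetical, and the test value 0x54000000 is simply the encoding of the PID (01010) and silicon revision (01) quoted in the comment above.

```c
#include <assert.h>
#include <stdint.h>

/* Field masks copied from the listing. */
#define PID20 0x0E000000u /* PID[2:0], bits 27-25 */
#define SIREV 0x30000000u /* Silicon revision, bits 29-28 */
#define PID43 0xC0000000u /* PID[4:3], bits 31-30 */

/* Hypothetical helper: rebuild the 5-bit processor ID. PID[4:3] moves from
   bits 31-30 to bits 4-3 (shift by 27); PID[2:0] moves from bits 27-25 to
   bits 2-0 (shift by 25). */
static unsigned mode2_shdw_pid(uint32_t mode2_shdw)
{
    unsigned pid = (unsigned)(((mode2_shdw & PID43) >> 27) |
                              ((mode2_shdw & PID20) >> 25));
    assert(pid < 32);
    return pid;
}

/* Hypothetical helper: extract the 2-bit silicon revision. */
static unsigned mode2_shdw_sirev(uint32_t mode2_shdw)
{
    return (unsigned)((mode2_shdw & SIREV) >> 28);
}
```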

/* WAIT Register */
/* generic WAIT bitfields */
#define EB0AM 0x00000003 /* External Bank 0 Access Mode */
#define EB0WS 0x0000001C /* External Bank 0 Waitstate Configuration */
#define EB1AM 0x00000060 /* External Bank 1 Access Mode */
#define EB1WS 0x00000380 /* External Bank 1 Waitstate Configuration */
#define EB2AM 0x00000C00 /* External Bank 2 Access Mode */
#define EB2WS 0x00007000 /* External Bank 2 Waitstate Configuration */
#define EB3AM 0x00018000 /* External Bank 3 Access Mode */
#define EB3WS 0x000E0000 /* External Bank 3 Waitstate Configuration */
#define RBAM  0x00300000 /* ROM Boot Access Mode */
#define RBWS  0x01C00000 /* ROM Boot Waitstate Configuration */
#define HIDMA 0x80000000 /* Single idle cycle for DMA handshake */
/* specific wait access mode settings */
#define EB0A0 0x00000000 /* Ext Bank 0 Async, internal AND external ACK */
#define EB0S1 0x00000001 /* Ext Bank 0 Sync, 2-cycle reads, 1-cycle writes */
#define EB0S2 0x00000002 /* Ext Bank 0 Sync, 2-cycle reads, 2-cycle writes */
#define EB1A0 0x00000000 /* Ext Bank 1 Async, internal AND external ACK */
#define EB1S1 0x00000020 /* Ext Bank 1 Sync, 2-cycle reads, 1-cycle writes */
#define EB1S2 0x00000040 /* Ext Bank 1 Sync, 2-cycle reads, 2-cycle writes */
#define EB2A0 0x00000000 /* Ext Bank 2 Async, internal AND external ACK */
#define EB2S1 0x00000400 /* Ext Bank 2 Sync, 2-cycle reads, 1-cycle writes */
#define EB2S2 0x00000800 /* Ext Bank 2 Sync, 2-cycle reads, 2-cycle writes */
#define EB3A0 0x00000000 /* Ext Bank 3 Async, internal AND external ACK */
#define EB3S1 0x00008000 /* Ext Bank 3 Sync, 2-cycle reads, 1-cycle writes */
#define EB3S2 0x00010000 /* Ext Bank 3 Sync, 2-cycle reads, 2-cycle writes */
#define RBWA0 0x00000000 /* ROM boot: Async, internal AND external ACK */
#define RBWS1 0x00100000 /* ROM boot: Sync, 2-cycle reads, 1-cycle writes */
#define RBWS2 0x00200000 /* ROM boot: Sync, 2-cycle reads, 2-cycle writes */
/* individual waitstate combinations */
#define EB0WS0 0x00000000 /* External Bank 0: 0 waitstates, no hold cycle */
#define EB0WS1 0x00000004 /* External Bank 0: 1 waitstate, no hold cycle */
#define EB0WS2 0x00000008 /* External Bank 0: 2 waitstates, hold cycle */
#define EB0WS3 0x0000000C /* External Bank 0: 3 waitstates, hold cycle */
#define EB0WS4 0x00000010 /* External Bank 0: 4 waitstates, hold cycle */
#define EB0WS5 0x00000014 /* External Bank 0: 5 waitstates, hold cycle */
#define EB0WS6 0x00000018 /* External Bank 0: 6 waitstates, hold cycle */
#define EB0WS7 0x0000001C /* External Bank 0: 7 waitstates, hold cycle */
#define EB1WS0 0x00000000 /* External Bank 1: 0 waitstates, no hold cycle */
#define EB1WS1 0x00000080 /* External Bank 1: 1 waitstate, no hold cycle */
#define EB1WS2 0x00000100 /* External Bank 1: 2 waitstates, hold cycle */
#define EB1WS3 0x00000180 /* External Bank 1: 3 waitstates, hold cycle */
#define EB1WS4 0x00000200 /* External Bank 1: 4 waitstates, hold cycle */
#define EB1WS5 0x00000280 /* External Bank 1: 5 waitstates, hold cycle */
#define EB1WS6 0x00000300 /* External Bank 1: 6 waitstates, hold cycle */
#define EB1WS7 0x00000380 /* External Bank 1: 7 waitstates, hold cycle */
#define EB2WS0 0x00000000 /* External Bank 2: 0 waitstates, no hold cycle */
#define EB2WS1 0x00001000 /* External Bank 2: 1 waitstate, no hold cycle */
#define EB2WS2 0x00002000 /* External Bank 2: 2 waitstates, hold cycle */
#define EB2WS3 0x00003000 /* External Bank 2: 3 waitstates, hold cycle */
#define EB2WS4 0x00004000 /* External Bank 2: 4 waitstates, hold cycle */
#define EB2WS5 0x00005000 /* External Bank 2: 5 waitstates, hold cycle */
#define EB2WS6 0x00006000 /* External Bank 2: 6 waitstates, hold cycle */
#define EB2WS7 0x00007000 /* External Bank 2: 7 waitstates, hold cycle */
#define EB3WS0 0x00000000 /* External Bank 3: 0 waitstates, no hold cycle */
#define EB3WS1 0x00020000 /* External Bank 3: 1 waitstate, no hold cycle */
#define EB3WS2 0x00040000 /* External Bank 3: 2 waitstates, hold cycle */
#define EB3WS3 0x00060000 /* External Bank 3: 3 waitstates, hold cycle */
#define EB3WS4 0x00080000 /* External Bank 3: 4 waitstates, hold cycle */
#define EB3WS5 0x000A0000 /* External Bank 3: 5 waitstates, hold cycle */
#define EB3WS6 0x000C0000 /* External Bank 3: 6 waitstates, hold cycle */
#define EB3WS7 0x000E0000 /* External Bank 3: 7 waitstates, hold cycle */
#define RBWST0 0x00000000 /* ROM boot wait state 0, no hold cycle */
#define RBWST1 0x00400000 /* ROM boot wait state 1, no hold cycle */
#define RBWST2 0x00800000 /* ROM boot wait state 2, hold cycle */
#define RBWST3 0x00C00000 /* ROM boot wait state 3, hold cycle */
#define RBWST4 0x01000000 /* ROM boot wait state 4, hold cycle */
#define RBWST5 0x01400000 /* ROM boot wait state 5, hold cycle */
#define RBWST6 0x01800000 /* ROM boot wait state 6, hold cycle */
#define RBWST7 0x01C00000 /* ROM boot wait state 7, hold cycle */
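
The WAIT definitions pair a field mask (for example EB1AM, EB1WS) with pre-shifted setting values (EB1S1, EB1WS3), so a bank can be reconfigured with a read-modify-write that clears the mask and ORs in the new settings. The sketch below illustrates this for external bank 1. The helper names are hypothetical, and the `wait` argument stands for a value already read from the WAIT register by whatever access method the system uses.

```c
#include <assert.h>
#include <stdint.h>

/* Masks and settings copied from the listing. */
#define EB1AM  0x00000060u /* External Bank 1 Access Mode */
#define EB1WS  0x00000380u /* External Bank 1 Waitstate Configuration */
#define EB1S1  0x00000020u /* Ext Bank 1 Sync, 2-cycle reads, 1-cycle writes */
#define EB1WS3 0x00000180u /* External Bank 1: 3 waitstates, hold cycle */

/* Hypothetical helper: replace the bank 1 access mode and waitstate fields
   of `wait`, leaving every other bit untouched. */
static uint32_t wait_set_bank1(uint32_t wait, uint32_t mode, uint32_t ws)
{
    assert((mode & ~EB1AM) == 0); /* mode must lie inside the EB1AM field */
    assert((ws & ~EB1WS) == 0);   /* ws must lie inside the EB1WS field */
    return (wait & ~(EB1AM | EB1WS)) | mode | ws;
}

/* Hypothetical helper: waitstate count for bank 1. The EB1WSn values in the
   listing are n << 7, so shifting the field down by 7 recovers n. */
static unsigned wait_bank1_waitstates(uint32_t wait)
{
    return (unsigned)((wait & EB1WS) >> 7);
}
```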

/* DMAC10, DMAC11, DMAC12, DMAC13 Register Bitfield Definitions */
#define DEN    0x00000001 /* External Port DMA Enable */
#define CHEN   0x00000002 /* External Port DMA Chaining Enable */
#define TRAN   0x00000004 /* External Port EPBx Transmit/Receive Select */
#define DTYPE  0x00000020 /* EPBx FIFO Buffer Data Type Select */
#define PMODE1 0x00000040 /* EPBx FIFO Pack Modes: 16-bit ext to 32/64-bit int packing */
#define PMODE2 0x00000080 /* 16-bit external to 48-bit internal packing */
#define PMODE3 0x000000C0 /* 32-bit external to 48-bit internal packing */
#define PMODE4 0x00000100 /* No Pack Mode-32-bit external to 32/64-bit internal packing */
#define PMODE5 0x00000140 /* 8-bit external to 48-bit internal packing */
#define PMODE6 0x00000180 /* 8-bit external to 32/64-bit internal packing */
#define MSWF   0x00000200 /* Most Significant Word First During Packing */
#define MASTER 0x00000400 /* EPBx DMA Master Mode Enable */
#define HSHAKE 0x00000800 /* EPBx DMA Handshake Mode Enable */
#define INTIO  0x00001000 /* Single Word Interrupts for EPBx FIFO Buffers */
#define EXTERN 0x00002000 /* External Handshake Mode Enable */
#define FLSH   0x00004000 /* Flush EPBx FIFO Buffers and Status */
#define PRIO   0x00008000 /* External Port Bus Priority Access */
#define FS     0x00030000 /* External Port FIFO Buffer Status (read-only) */
#define INT32  0x00040000 /* Internal Memory 32-bit Transfer Select */
#define MAXBL0 0x00000000 /* Maximum Burst Length Select Disabled */
#define MAXBL1 0x00080000 /* Bit 19 set; Maximum Burst Length Limit of 4 Enabled */
#define PS     0x00E00000 /* Ext. Port EPBx FIFO Buffer Pack Status (read-only) */

/* DMASTAT Register (read-only) */
#define DMA0ST    0x00000001 /* DMA channel 0 (RX0A/TX0A) Active Status */
#define DMA2ST    0x00000002 /* DMA channel 2 (RX1A/TX1A) Active Status */
#define DMA4ST    0x00000004 /* DMA channel 4 (RX2A/TX2A) Active Status */
#define DMA6ST    0x00000008 /* DMA channel 6 (RX3A/TX3A) Active Status */
#define DMA8ST    0x00000010 /* DMA channel 8 (LBUF0) Active Status */
#define DMA9ST    0x00000020 /* DMA channel 9 (LBUF1) Active Status */
#define DMA1ST    0x00000040 /* DMA channel 1 (RX0B/TX0B) Active Status */
#define DMA3ST    0x00000080 /* DMA channel 3 (RX1B/TX1B) Active Status */
#define DMA5ST    0x00000100 /* DMA channel 5 (RX2B/TX2B) Active Status */
#define DMA7ST    0x00000200 /* DMA channel 7 (RX3B/TX3B) Active Status */
#define DMA10ST   0x00000400 /* DMA channel 10 (EPB0) Active Status */
#define DMA11ST   0x00000800 /* DMA channel 11 (EPB1) Active Status */
#define DMA12ST   0x00001000 /* DMA channel 12 (EPB2) Active Status */
#define DMA13ST   0x00002000 /* DMA channel 13 (EPB3) Active Status */
#define DMA0CHST  0x00010000 /* DMA channel 0 (RX0A/TX0A) Chaining Status */
#define DMA2CHST  0x00020000 /* DMA channel 2 (RX1A/TX1A) Chaining Status */
#define DMA4CHST  0x00040000 /* DMA channel 4 (RX2A/TX2A) Chaining Status */
#define DMA6CHST  0x00080000 /* DMA channel 6 (RX3A/TX3A) Chaining Status */
#define DMA8CHST  0x00100000 /* DMA channel 8 (LBUF0) Chaining Status */
#define DMA9CHST  0x00200000 /* DMA channel 9 (LBUF1) Chaining Status */
#define DMA1CHST  0x00400000 /* DMA channel 1 (RX0B/TX0B) Chaining Status */
#define DMA3CHST  0x00800000 /* DMA channel 3 (RX1B/TX1B) Chaining Status */
#define DMA5CHST  0x01000000 /* DMA channel 5 (RX2B/TX2B) Chaining Status */
#define DMA7CHST  0x02000000 /* DMA channel 7 (RX3B/TX3B) Chaining Status */
#define DMA10CHST 0x04000000 /* DMA channel 10 (EPB0) Chaining Status */
#define DMA11CHST 0x08000000 /* DMA channel 11 (EPB1) Chaining Status */
#define DMA12CHST 0x10000000 /* DMA channel 12 (EPB2) Chaining Status */
#define DMA13CHST 0x20000000 /* DMA channel 13 (EPB3) Chaining Status */
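
Note that the DMASTAT bits are not in channel order: the A-channel SPORT DMAs (0, 2, 4, 6) come first, then channels 8 and 9, then the B channels (1, 3, 5, 7), and each chaining status bit sits 16 positions above the matching active status bit. A lookup table is one straightforward way to handle this ordering. The sketch below is illustrative only; the table and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit position of each channel's Active Status bit in DMASTAT, indexed by
   DMA channel number and derived from the DMAnST values in the listing
   (for example DMA1ST = 0x40 is bit 6, DMA5ST = 0x100 is bit 8). */
static const uint8_t dmastat_bit[14] = {
    0, 6, 1, 7, 2, 8, 3, 9, 4, 5, 10, 11, 12, 13
};

/* Hypothetical helper: nonzero if `channel` is reported active. */
static int dma_active(uint32_t dmastat, unsigned channel)
{
    assert(channel < 14);
    return (int)((dmastat >> dmastat_bit[channel]) & 1u);
}

/* Hypothetical helper: nonzero if `channel` reports chaining status.
   The DMAnCHST bits sit 16 positions above the DMAnST bits. */
static int dma_chaining(uint32_t dmastat, unsigned channel)
{
    assert(channel < 14);
    return (int)((dmastat >> (dmastat_bit[channel] + 16u)) & 1u);
}
```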

/* SDCTL - SDRAM Control Register bit definitions */
#define SDCL1      0x00000001 /* SDCL[1:0] - CAS Latency field */
#define SDCL2      0x00000002 /* (delay between RD cmd and data at o/p pins) */
#define SDCL3      0x00000003 /* configurable between 1 and 3 SDCLK cycles */
#define DSDCTL     0x00000004 /* disable SDCLK0, /RAS, /CAS & SDCKE pins */
#define DSDCK1     0x00000008 /* disable SDCLK1 pin */
#define SDTRAS0    0x00000000 /* SDTRAS[3:0] - tRAS spec (active command delay) */
#define SDTRAS1    0x00000010 /* (required delay between a Bank Activate */
#define SDTRAS2    0x00000020 /* command to a Precharge command) */
#define SDTRAS3    0x00000030 /* configurable between 0 to 15 SDCLK cycles */
#define SDTRAS4    0x00000040
#define SDTRAS5    0x00000050
#define SDTRAS6    0x00000060
#define SDTRAS7    0x00000070
#define SDTRAS8    0x00000080
#define SDTRAS9    0x00000090
#define SDTRAS10   0x000000a0
#define SDTRAS11   0x000000b0
#define SDTRAS12   0x000000c0
#define SDTRAS13   0x000000d0
#define SDTRAS14   0x000000e0
#define SDTRAS15   0x000000f0
#define SDTRP0     0x00000000 /* SDTRP[2:0] - tRP spec (precharge delay) */
#define SDTRP1     0x00000100 /* (required delay between a precharge command */
#define SDTRP2     0x00000200 /* to a Bank Activate command) */
#define SDTRP3     0x00000300 /* configurable between 1 to 7 cycles */
#define SDTRP4     0x00000400
#define SDTRP5     0x00000500
#define SDTRP6     0x00000600
#define SDTRP7     0x00000700
#define SDPM       0x00000800 /* SDRAM power-up mode bit */
#define SDPGS256   0x00000000 /* SDRAM Page Size - 256 words */
#define SDPGS512   0x00001000 /* SDRAM Page Size - 512 words */
#define SDPGS1024  0x00002000 /* SDRAM Page Size - 1024 words */
#define SDPGS2048  0x00003000 /* SDRAM Page Size - 2048 words */
#define SDPSS      0x00004000 /* SDRAM power-up sequence start command */
#define SDSRF      0x00008000 /* Self refresh command */
#define SDEM0      0x00010000 /* Memory Bank 0 SDRAM Enable */
#define SDEM1      0x00020000 /* Memory Bank 1 SDRAM Enable */
#define SDEM2      0x00040000 /* Memory Bank 2 SDRAM Enable */
#define SDEM3      0x00080000 /* Memory Bank 3 SDRAM Enable */
#define SDBN2      0x00000000 /* SDRAM contains 2 memory banks */
#define SDBN4      0x00100000 /* SDRAM contains 4 memory banks */
#define SDCKRx1    0x00200000 /* 1:1 (full) SDCLK-to-CCLK (core-clock) ratio */
#define SDCKR_DIV2 0x00000000 /* 1:2 (one-half) SDCLK-to-CCLK ratio */
#define SDBUF      0x00800000 /* Pipeline (reg. buf) option */
#define SDTRCD0    0x00000000 /* SDTRCD[2:0] - tRCD spec. (RAS-to-CAS delay) */
#define SDTRCD1    0x01000000 /* (required delay between a Bank Activate cmd */
#define SDTRCD2    0x02000000 /* and the start of the first RD or WR) */
#define SDTRCD3    0x03000000 /* configurable between 1 to 7 SDCLK cycles */
#define SDTRCD4    0x04000000
#define SDTRCD5    0x05000000
#define SDTRCD6    0x06000000
#define SDTRCD7    0x07000000
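
The SDCTL timing fields are plain binary counts placed at fixed bit offsets: the listed values put SDCL at bit 0, SDTRAS at bit 4, SDTRP at bit 8, and the tRCD values (0x01000000 through 0x07000000) at bit 24. As a rough sketch, a control word could be assembled from cycle counts as shown below. The function name is hypothetical, the cycle counts in the usage note are arbitrary illustration values rather than recommended settings, and real values must come from the SDRAM data sheet.

```c
#include <assert.h>
#include <stdint.h>

/* Single-bit options copied from the listing. */
#define SDEM0 0x00010000u /* Memory Bank 0 SDRAM Enable */
#define SDBN4 0x00100000u /* SDRAM contains 4 memory banks */

/* Hypothetical helper: pack the four timing fields of SDCTL.
   Field offsets are inferred from the values in the listing:
   SDCL at bit 0, SDTRAS at bit 4, SDTRP at bit 8, tRCD at bit 24. */
static uint32_t sdctl_timing(unsigned cas, unsigned tras,
                             unsigned trp, unsigned trcd)
{
    assert(cas >= 1 && cas <= 3); /* CAS latency: 1 to 3 SDCLK cycles */
    assert(tras <= 15);           /* 4-bit field */
    assert(trp <= 7);             /* 3-bit field */
    assert(trcd <= 7);            /* 3-bit field */
    return (uint32_t)cas |
           ((uint32_t)tras << 4) |
           ((uint32_t)trp << 8) |
           ((uint32_t)trcd << 24);
}
```

For instance, `sdctl_timing(2, 5, 2, 3) | SDEM0 | SDBN4` combines the timing fields with the bank 0 enable and four-bank options.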

/* IOFLAG - programmable I/O status macro definitions */
#define FLG4   0x00000001 /* FLAG4 value (Low = '0', High = '1') */
#define FLG5   0x00000002 /* FLAG5 value (Low = '0', High = '1') */
#define FLG6   0x00000004 /* FLAG6 value (Low = '0', High = '1') */
#define FLG7   0x00000008 /* FLAG7 value (Low = '0', High = '1') */
#define FLG8   0x00000010 /* FLAG8 value (Low = '0', High = '1') */
#define FLG9   0x00000020 /* FLAG9 value (Low = '0', High = '1') */
#define FLG10  0x00000040 /* FLAG10 value (Low = '0', High = '1') */
#define FLG11  0x00000080 /* FLAG11 value (Low = '0', High = '1') */
/* IOFLAG - programmable I/O control macro definitions */
#define FLG4O  0x00000100 /* FLAG4 control ('0' = flag input, '1' = flag output) */
#define FLG5O  0x00000200 /* FLAG5 control ('0' = flag input, '1' = flag output) */
#define FLG6O  0x00000400 /* FLAG6 control ('0' = flag input, '1' = flag output) */
#define FLG7O  0x00000800 /* FLAG7 control ('0' = flag input, '1' = flag output) */
#define FLG8O  0x00001000 /* FLAG8 control ('0' = flag input, '1' = flag output) */
#define FLG9O  0x00002000 /* FLAG9 control ('0' = flag input, '1' = flag output) */
#define FLG10O 0x00004000 /* FLAG10 control ('0' = flag input, '1' = flag output) */
#define FLG11O 0x00008000 /* FLAG11 control ('0' = flag input, '1' = flag output) */
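
The IOFLAG layout is regular: the value bit for FLAGn is bit n - 4 and the direction (control) bit is bit n + 4, for n from 4 to 11. A short sketch of computing these masks and of preparing a register value that drives a flag as an output follows. The helper names are hypothetical, and the functions only compute register values; they do not touch hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: value bit for FLAGn (FLG4 = bit 0 ... FLG11 = bit 7). */
static uint32_t ioflag_value_bit(unsigned flag)
{
    assert(flag >= 4 && flag <= 11);
    return (uint32_t)1 << (flag - 4u);
}

/* Hypothetical helper: control bit for FLAGn (FLG4O = bit 8 ... FLG11O = bit 15). */
static uint32_t ioflag_output_bit(unsigned flag)
{
    assert(flag >= 4 && flag <= 11);
    return (uint32_t)1 << (flag + 4u);
}

/* Hypothetical helper: return `ioflag` modified so that FLAGn is configured
   as an output and driven high (high != 0) or low (high == 0). */
static uint32_t ioflag_drive(uint32_t ioflag, unsigned flag, int high)
{
    ioflag |= ioflag_output_bit(flag);
    if (high)
        ioflag |= ioflag_value_bit(flag);
    else
        ioflag &= ~ioflag_value_bit(flag);
    return ioflag;
}
```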

/* SPICTL register */
#define SPIEN   0x00000001 /* SPI system enable */
#define SPRINT  0x00000002 /* SPIRX buffer interrupt enable */
#define SPTINT  0x00000004 /* SPITX buffer interrupt enable */
#define MS      0x00000008 /* Master/Slave Mode bit */
#define CP      0x00000010 /* SPICLK Polarity */
#define CPHASE  0x00000020 /* SPICLK Phase */
#define DF      0x00000040 /* Data Format */
#define WL8     0x00000000 /* SPI Word Length = 8 */
#define WL16    0x00000080 /* SPI Word Length = 16 */
#define WL32    0x00000180 /* SPI Word Length = 32 */
#define BAUDR1  0x00000200 /* BAUDRATE = CCLK / 2**(2 + 1) = CCLK/8 */
#define BAUDR2  0x00000400 /* BAUDRATE = CCLK / 2**(2 + 2) = CCLK/16 */
#define BAUDR3  0x00000600 /* BAUDRATE = CCLK / 2**(2 + 3) = CCLK/32 */
#define BAUDR4  0x00000800 /* BAUDRATE = CCLK / 2**(2 + 4) = CCLK/64 */
#define BAUDR5  0x00000A00 /* BAUDRATE = CCLK / 2**(2 + 5) = CCLK/128 */
#define BAUDR6  0x00000C00 /* BAUDRATE = CCLK / 2**(2 + 6) = CCLK/256 */
#define BAUDR7  0x00000E00 /* BAUDRATE = CCLK / 2**(2 + 7) = CCLK/512 */
#define BAUDR8  0x00001000 /* BAUDRATE = CCLK / 2**(2 + 8) = CCLK/1024 */
#define BAUDR9  0x00001200 /* BAUDRATE = CCLK / 2**(2 + 9) = CCLK/2048 */
#define BAUDR10 0x00001400 /* BAUDRATE = CCLK / 2**(2 + 10) = CCLK/4096 */
#define BAUDR11 0x00001600 /* BAUDRATE = CCLK / 2**(2 + 11) = CCLK/8192 */
#define BAUDR12 0x00001800 /* BAUDRATE = CCLK / 2**(2 + 12) = CCLK/16384 */
#define BAUDR13 0x00001A00 /* BAUDRATE = CCLK / 2**(2 + 13) = CCLK/32768 */
#define BAUDR14 0x00001C00 /* BAUDRATE = CCLK / 2**(2 + 14) = CCLK/65536 */
#define BAUDR15 0x00001E00 /* BAUDRATE = CCLK / 2**(2 + 15) = CCLK/131072 */
#define TDMAEN  0x00002000 /* SPITX transmit buffer DMA enable, DMA channel 9 */
#define PSSE    0x00004000 /* Programmable slave device select */
#define FLS0    0x00008000 /* FLAG0 slave device select enable */
#define FLS1    0x00010000 /* FLAG1 slave device select enable */
#define FLS2    0x00020000 /* FLAG2 slave device select enable */
#define FLS3    0x00040000 /* FLAG3 slave device select enable */
#define SMLS    0x00080000 /* Seamless operation */
#define NSMLS   0x00080000 /* Seamless operation */
#define DCPH0   0x00100000 /* Select or deselect SPIDS~ between transfers */
#define DMISO   0x02000000 /* Disable MISO Pin for Broadcast Mode */
#define OPD     0x04000000 /* Open drain output enable for data pins */
#define RDMAEN  0x08000000 /* SPIRX receive buffer DMA enable, DMA channel 8 */
/* 8-to-16 Bit Packing Enable */
/* Sign-extend SPIRX/SPITX data */
/* Send zero or repeat previous data when SPITX empty */
/* Send zero or repeat previous data when SPITX empty */
/* Retrieve or discard incoming data when SPIRX full */
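
The BAUDRn values are the setting n placed at bit 9 of SPICTL (BAUDR1 = 0x200, BAUDR15 = 0x1E00), and the comments give the clock divisor as 2**(2 + n). The sketch below computes both from n, which can be a convenient cross-check on the divisors quoted in the comments. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: SPICTL field value for baud setting n (1..15),
   matching the BAUDRn definitions in the listing (n << 9). */
static uint32_t spi_baudr_field(unsigned n)
{
    assert(n >= 1 && n <= 15);
    return (uint32_t)n << 9;
}

/* Hypothetical helper: CCLK divisor for baud setting n, following the
   2**(2 + n) formula given in the BAUDRn comments. */
static uint32_t spi_clock_divisor(unsigned n)
{
    assert(n >= 1 && n <= 15);
    return (uint32_t)1 << (2u + n);
}
```

For example, setting 6 corresponds to a field value of 0x00000C00 and, by the formula, a divisor of 256.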

/* SPISTAT register */
#define SPIF 0x00000001 /* SPI transmit or receive transfer complete (in pre 1.2 Si) */
#define SRS  0x00000001 /* SPI shift register status (in 1.2 Si and above) */
#define MME  0x00000002 /* Multimaster error */
#define TXE  0x00000004 /* SPITX transmission error (underflow) */
#define TXS0 0x00000008 /* TXS[0] - SPITX data buffer status */
#define TXS1 0x00000010 /* TXS[1] - SPITX data buffer status */
#define RBSY 0x00000020 /* SPIRX reception error (overflow) */
#define RXS0 0x00000040 /* RXS[0] - SPIRX data buffer status */
#define RXS1 0x00000080 /* RXS[1] - SPIRX data buffer status */
/* LCTL register - 0xcc */
#define L0EN     0x00000001 /* Link buffer 0 enable */
#define L0DEN    0x00000002 /* Link buffer 0 DMA enable */
#define L0CHEN   0x00000004 /* Link buffer 0 DMA chaining enable */
#define L0TRAN   0x00000008 /* Link buffer 0 data direction */
#define L0EXT    0x00000010 /* Link buffer 0 extended word size */
#define L0CLKD0  0x00000020 /* L0CLKD[0] Link buffer 0 CCLK divide ratio */
#define L0CLKD1  0x00000040 /* L0CLKD[1] Link buffer 0 CCLK divide ratio */
#define L0PDRDE  0x00000100 /* Link Port 0 pulldown resistor disable */
#define L0DPWID  0x00000200 /* Link buffer 0 data path width */
#define L1EN     0x00000400 /* Link buffer 1 enable */
#define L1DEN    0x00000800 /* Link buffer 1 DMA enable */
#define L1CHEN   0x00001000 /* Link buffer 1 DMA chaining enable */
#define L1TRAN   0x00002000 /* Link buffer 1 data direction */
#define L1EXT    0x00004000 /* Link buffer 1 extended word size */
#define L1CLKD0  0x00008000 /* L1CLKD[0] Link buffer 1 CCLK divide ratio */
#define L1CLKD1  0x00010000 /* L1CLKD[1] Link buffer 1 CCLK divide ratio */
#define L1PDRDE  0x00040000 /* Link Port 1 pulldown resistor disable */
#define L1DPWID  0x00080000 /* Link buffer 1 data path width */
#define A0LB     0x00100000 /* Link Port Assignment for LBUF0 - 2106x/21160 compatibility */
#define A1LB     0x00200000 /* Link Port Assignment for LBUF1 - 2106x/21160 compatibility */
#define LAB0     0x00100000 /* Link Port Assignment for LBUF0 - new naming conventions */
#define LAB1     0x00200000 /* Link Port Assignment for LBUF1 - new naming conventions */
#define L0STAT0  0x00400000 /* L0STAT[0] - link buffer 0 status (read-only) */
#define L0STAT1  0x00800000 /* L0STAT[1] - link buffer 0 status (read-only) */
#define L1STAT0  0x01000000 /* L1STAT[0] - link buffer 1 status (read-only) */
#define L1STAT1  0x02000000 /* L1STAT[1] - link buffer 1 status (read-only) */
#define LRERR0   0x04000000 /* Link Buffer 0 receive pack error status */
#define LRERR1   0x08000000 /* Link Buffer 1 receive pack error status */
/* SP02MCTL & SP13MCTL registers */
#define MCE      0x00000001 /* Multichannel Mode Enable */
#define MFD0     0x00000000 /* no frame delay, multichannel FS pulse in same SCLK cycle as first data bit */
#define MFD1     0x00000002 /* multichannel mode 1 cycle frame sync delay */
#define MFD2     0x00000004 /* multichannel mode 2 cycle frame sync delay */
#define MFD3     0x00000006 /* multichannel mode 3 cycle frame sync delay */
#define MFD4     0x00000008 /* multichannel mode 4 cycle frame sync delay */
#define MFD5     0x0000000A /* multichannel mode 5 cycle frame sync delay */
#define MFD6     0x0000000C /* multichannel mode 6 cycle frame sync delay */
#define MFD7     0x0000000E /* multichannel mode 7 cycle frame sync delay */
#define MFD8     0x00000010 /* multichannel mode 8 cycle frame sync delay */
#define MFD9     0x00000012 /* multichannel mode 9 cycle frame sync delay */
#define MFD10    0x00000014 /* multichannel mode 10 cycle frame sync delay */
#define MFD11    0x00000016 /* multichannel mode 11 cycle frame sync delay */
#define MFD12    0x00000018 /* multichannel mode 12 cycle frame sync delay */
#define MFD13    0x0000001A /* multichannel mode 13 cycle frame sync delay */
#define MFD14    0x0000001C /* multichannel mode 14 cycle frame sync delay */
#define NCH      0x00000FE0 /* Number of MCM channels - 1 */
0x00001000 /* SPORT 0&2 or SPORT 1&3 Internal Loopback Mode */
0x007F0000 /* Current Channel Status (read-only) */
/* SPCTL0, SPCTL1, SPCTL2 and SPCTL3 registers */
#define SPEN_A   0x00000001 /* SPORT enable primary A channel */
#define DTYPE0   0x00000000 /* right justify, fill unused MSBs with 0s */
#define DTYPE1   0x00000002 /* right justify, sign-extend into unused MSBs */
#define DTYPE2   0x00000004 /* compand using mu law */
#define DTYPE3   0x00000006 /* compand using a law */
#define SENDN    0x00000008 /* MSB or LSB first */
#define SLEN3    0x00000020 /* serial length 3 */
#define SLEN4    0x00000030 /* serial length 4 */
#define SLEN5    0x00000040 /* serial length 5 */
#define SLEN6    0x00000050 /* serial length 6 */
#define SLEN7    0x00000060 /* serial length 7 */
#define SLEN8    0x00000070 /* serial length 8 */
#define SLEN9    0x00000080 /* serial length 9 */
#define SLEN10   0x00000090 /* serial length 10 */
#define SLEN11   0x000000A0 /* serial length 11 */
#define SLEN12   0x000000B0 /* serial length 12 */
#define SLEN13   0x000000C0 /* serial length 13 */
#define SLEN14   0x000000D0 /* serial length 14 */
#define SLEN15   0x000000E0 /* serial length 15 */
#define SLEN16   0x000000F0 /* serial length 16 */
#define SLEN17   0x00000100 /* serial length 17 */
#define SLEN18   0x00000110 /* serial length 18 */
#define SLEN19   0x00000120 /* serial length 19 */
#define SLEN20   0x00000130 /* serial length 20 */
#define SLEN21   0x00000140 /* serial length 21 */
#define SLEN22   0x00000150 /* serial length 22 */
#define SLEN23   0x00000160 /* serial length 23 */
#define SLEN24   0x00000170 /* serial length 24 */
#define SLEN25   0x00000180 /* serial length 25 */
#define SLEN26   0x00000190 /* serial length 26 */
#define SLEN27   0x000001A0 /* serial length 27 */
#define SLEN28   0x000001B0 /* serial length 28 */
#define SLEN29   0x000001C0 /* serial length 29 */
#define SLEN30   0x000001D0 /* serial length 30 */
#define SLEN31   0x000001E0 /* serial length 31 */
#define SLEN32   0x000001F0 /* serial length 32 */
#define PACK     0x00000200 /* 16-to-32 data packing */
#define MSTR     0x00000400 /* I2S Mode only... TX/RX is master or slave */
#define ICLK     0x00000400 /* internally 1 or externally 0 generated transmit or receive SCLKx */
#define OPMODE   0x00000800 /* I2S mode enable ('1') or DSP Serial Mode/Multichannel mode ('0') */
#define CKRE     0x00001000 /* Clock edge for data and frame sync sampling (rx) or driving (tx) */
#define FSR      0x00002000 /* transmit or receive frame sync (FSx) required */
#define IFS      0x00004000 /* internally generated transmit or receive frame sync (FSx) */
#define IRFS     0x00004000 /* internally generated receive FS0 or FS1 in multichannel mode */
#define DITFS    0x00008000 /* (I2S/DSP serial mode only) Data Independent tx FSx when DDIR bit=1 */
#define LFS      0x00010000 /* Active Low transmit or receive frame sync (FSx) */
#define LRFS     0x00010000 /* SPORT0 and SPORT1 active low TDM frame sync FS0/FS1 in MC mode */
#define LTDV     0x00010000 /* (MC Mode only) SPORT2/SPORT3 tx data valid ena in TDM mode-TDV2/TDV3 alternate pin config */
#define LFIRST   0x00010000 /* (I2S Mode Only) transmit left channel first 1, or right channel first 0 */
#define LAFS     0x00020000 /* (DSP Serial Mode only) Late (vs early) frame sync FSx */
#define SDEN_A   0x00040000 /* SPORT TXnA/RXnA DMA enable primary A channel */
#define SCHEN_A  0x00080000 /* SPORT TXnA/RXnA DMA chaining enable primary A channel */
#define SDEN_B   0x00100000 /* SPORT TXnB/RXnB DMA enable primary B channel */
#define SCHEN_B  0x00200000 /* SPORT TXnB/RXnB DMA chaining enable primary B channel */
#define FS_BOTH  0x00400000 /* (DSP Serial & I2S modes only) Issue FSx only if data is in both TXnA & TXnB regs */
#define SPEN_B   0x01000000 /* SPORTx secondary B channel enable */
#define DDIR     0x02000000 /* SPORT data buffer data direction 1 = transmitter, 0 = receiver */
#define DERR_B   0x04000000 /* SPORTx secondary B overflow/underflow error status in DSP serial & I2S modes (read-only) */
#define DXS0_B   0x08000000 /* SPORTx secondary B data buffer status in DSP Serial & I2S modes (read-only) */
#define DXS1_B   0x10000000 /* SPORTx secondary B data buffer status in DSP Serial & I2S modes (read-only) */
#define DERR_A   0x20000000 /* SPORTx primary A over/underflow error status in DSP Serial & I2S modes (read-only) */
#define TUVF_A   0x20000000 /* SPORT2/SPORT3 TX2A/TX3A underflow status in MC mode (read-only, sticky) */
#define ROVF_A   0x20000000 /* SPORT0/SPORT1 RX0A/RX1A overflow status in MC mode (read-only, sticky) */
#define DXS0_A   0x40000000 /* SPORTx primary A data buffer status in DSP serial and I2S modes (read-only) */
#define DXS1_A   0x80000000 /* SPORTx primary A data buffer status in DSP serial and I2S modes (read-only) */
#define RXS0_A   0x40000000 /* SPORT0/SPORT1 RX0A/RX1A data buffer status in MC mode (read-only) */
#define RXS1_A   0x80000000 /* SPORT0/SPORT1 RX0A/RX1A data buffer status in MC mode (read-only) */
#define TXS0_A   0x40000000 /* SPORT2/SPORT3 TX2A/TX3A data buffer status in MC mode (read-only) */
#define TXS1_A   0x80000000 /* SPORT2/SPORT3 TX2A/TX3A data buffer status in MC mode (read-only) */
#endif
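The multi-bit fields in the listing follow simple shift patterns, which the sketch below makes explicit. It is an illustration only: the macro values are copied from the listing, while the helper names (spi_baudr_field, spi_cclk_divisor, sport_slen_field, sport_mfd_field, spi_example_config) are hypothetical and are not part of def21161.h. The field positions are inferred from the listed values, not taken from a register diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the listing; the u suffix is added for this sketch. */
#define SPIEN   0x00000001u /* SPI system enable */
#define MS      0x00000008u /* Master/Slave Mode bit */
#define WL16    0x00000080u /* SPI Word Length = 16 */
#define BAUDR3  0x00000600u /* BAUDRATE = CCLK / 2**(2 + 3) */
#define SLEN16  0x000000F0u /* serial length 16 */
#define MFD2    0x00000004u /* 2 cycle frame sync delay */

/* Hypothetical helper: BAUDRn equals n shifted left by 9 (inferred from values). */
static uint32_t spi_baudr_field(unsigned n)
{
    assert(n >= 1 && n <= 15);
    return (uint32_t)n << 9;
}

/* CCLK divisor implied by the comment formula BAUDRATE = CCLK / 2**(2 + n). */
static uint32_t spi_cclk_divisor(unsigned n)
{
    assert(n >= 1 && n <= 15);
    return (uint32_t)1 << (2 + n);
}

/* Hypothetical helper: SLENn equals (n - 1) shifted left by 4 (inferred). */
static uint32_t sport_slen_field(unsigned bits)
{
    assert(bits >= 3 && bits <= 32);
    return (uint32_t)(bits - 1) << 4;
}

/* Hypothetical helper: MFDn equals n shifted left by 1 (inferred). */
static uint32_t sport_mfd_field(unsigned cycles)
{
    assert(cycles <= 14);
    return (uint32_t)cycles << 1;
}

/* Example control word: flags are combined with bitwise OR. */
static uint32_t spi_example_config(void)
{
    return SPIEN | MS | WL16 | BAUDR3;
}
```

With these definitions, spi_baudr_field(3) reproduces BAUDR3, and spi_cclk_divisor(6) gives 256. Writing the resulting word to the actual register is outside the scope of this sketch.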
Table B-1 shows all ADSP-21161 processor interrupts, listed according to their bit position in the IRPTL, LIRPTL, and IMASK registers. For more information, see "Interrupt Latch Register (IRPTL)" on page A-27 and "Interrupt Mask Register (IMASK)" on page A-31.
Also shown is the address of the interrupt vector. Each vector is separated by four memory locations. The addresses in the vector table represent offsets from a base address. For an interrupt vector table in internal memory, the base address is 0x0004 0000. For an interrupt vector table in external memory, the base address is 0x0020 0000.
The interrupt name column in Table B-1 lists a mnemonic name for each interrupt as defined by the def21161.h file that comes with the software development tools. For more information, see "Register and Bit #Defines (def21161.h)" on page A-121.
Table B-1. Interrupt Vector Addresses
Register  IRPTL/IMASK, LIRPTL Bit#  Vector Address  Interrupt Name  Function
          0                                                         Emulator (read-only, non-maskable) HIGHEST PRIORITY
          1                                                         Reset (read-only, non-maskable)
          2                                                         Illegal Input Condition Detected
          3                                                         Status, loop, or mode stack overflow; or PC stack full
          4                                                         Timer=0 (high priority option)
IRPTL                               0x14            VIRPTI          Multiprocessor Vector Interrupt
IRPTL                               0x18            IRQ2I           IRQ2 asserted
IRPTL                               0x1C            IRQ1I           IRQ1 asserted
IRPTL                               0x20            IRQ0I           IRQ0 asserted
IRPTL                               0x24                            Reserved
IRPTL                               0x28            SP0I            SPORT0 DMA
IRPTL                               0x2C            SP1I            SPORT1 DMA
IRPTL                               0x30            SP2I            SPORT2 DMA
IRPTL                               0x34            SP3I            SPORT3 DMA
LIRPTL                              0x38            LP0I            Link Buffer 0 DMA Interrupt
LIRPTL                              0x3C            LP1I            Link Buffer 1 DMA Interrupt
LIRPTL                              0x40            SPIRI           SPI Receive DMA Interrupt
LIRPTL                              0x44            SPITI           SPI Transmit DMA Interrupt
LIRPTL                              0x48                            Reserved
LIRPTL                              0x4C                            Reserved
IRPTL                               0x50            EP0I            DMA Channel 10 - Ext. Port Buffer 0
IRPTL                               0x54            EP1I            DMA Channel 11 - Ext. Port Buffer 1
IRPTL                               0x58            EP2I            DMA Channel 12 - Ext. Port Buffer 2
IRPTL                               0x5C            EP3I            DMA Channel 13 - Ext. Port Buffer 3
IRPTL                               0x60            LSRQI           Link Port Service Request
IRPTL                               0x64            CB7I            Circular Buffer 7 overflow
IRPTL                               0x68            CB15I           Circular Buffer 15 overflow
IRPTL                               0x6C            TMZLI           Timer=0 (low priority option)
                                                                    Fixed-point overflow
                                                                    Floating-point overflow exception
                                                                    Floating-point underflow exception
                                                                    Floating-point invalid exception
                                                                    User software interrupt 0
                                                                    User software interrupt 1
                                                                    User software interrupt 2
                                                                    User software interrupt 3
                                                                    Reserved - lowest priority
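Because the table gives offsets, not absolute addresses, the location of a vector is the base address plus the offset. The sketch below shows that arithmetic under the two base addresses stated in the text. The macro and function names (IVT_BASE_INTERNAL, IVT_BASE_EXTERNAL, vector_address, vector_slot) are hypothetical, and the slot numbering assumes the table starts at offset 0x00.

```c
#include <assert.h>
#include <stdint.h>

/* Base addresses stated in the text; names are hypothetical. */
#define IVT_BASE_INTERNAL 0x00040000u
#define IVT_BASE_EXTERNAL 0x00200000u

/* Offsets taken from the vector address column of Table B-1. */
#define VIRPTI_OFFSET 0x14u
#define TMZLI_OFFSET  0x6Cu

/* Absolute vector address: base plus table offset. */
static uint32_t vector_address(uint32_t base, uint32_t offset)
{
    assert(offset % 4u == 0u); /* vectors are four locations apart */
    return base + offset;
}

/* Zero-based slot number, assuming the first vector sits at offset 0x00. */
static uint32_t vector_slot(uint32_t offset)
{
    assert(offset % 4u == 0u);
    return offset / 4u;
}
```

For example, vector_address(IVT_BASE_INTERNAL, VIRPTI_OFFSET) yields 0x00040014.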
C NUMERIC FORMATS
The ADSP-21161 processor supports the 32-bit single-precision floating-point data format defined in the IEEE Standard 754/854. In addition, the processor supports an extended-precision version of the same format with eight additional bits in the mantissa (40 bits total). The processor also supports 32-bit fixed-point formats, fractional and integer, which can be signed (twos-complement) or unsigned.
The unsigned exponent e can range between 1 ≤ e ≤ 254 for normal numbers in the single-precision format. This exponent is biased by +127 (254 ÷ 2). To calculate the true unbiased exponent, 127 must be subtracted from e.
[Figure C-1: bit 31 holds the sign s; bits 30-23 hold the exponent e7-e0; bits 22-0 hold the fraction f22-f0. The hidden bit and the binary point (1.f) sit immediately to the left of f22.]
Figure C-1. IEEE 32-Bit Single-Precision Floating-Point
The IEEE Standard also provides for several special data types in the single-precision floating-point format:
An exponent value of 255 (all ones) with a nonzero fraction is a Not-A-Number (NAN). NANs are usually used as flags for data flow control, for the values of uninitialized variables, and for the results of invalid operations such as 0 * ∞.
Infinity is represented as an exponent of 255 and a zero fraction. Note that because the fraction is signed, both positive and negative Infinity can be represented.
Zero is represented by a zero exponent and a zero fraction. As with Infinity, both positive Zero and negative Zero can be represented.
The IEEE single-precision floating-point data types supported by the processor and their interpretations are summarized in Table C-1.
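The field layout and the special cases described above can be sketched as a small classifier. This is an illustration under stated assumptions: the type and function names (sp_kind, sp_classify, sp_unbiased_exponent) are hypothetical, and a zero exponent with a nonzero fraction is reported as SPK_OTHER because the text above does not say how the processor treats that pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical classification of a 32-bit single-precision bit pattern. */
typedef enum { SPK_ZERO, SPK_NORMAL, SPK_INFINITY, SPK_NAN, SPK_OTHER } sp_kind;

static unsigned sp_sign(uint32_t w)     { return (unsigned)(w >> 31); }        /* bit 31 */
static unsigned sp_exponent(uint32_t w) { return (unsigned)(w >> 23) & 0xFFu; } /* bits 30-23 */
static uint32_t sp_fraction(uint32_t w) { return w & 0x007FFFFFu; }            /* bits 22-0 */

static sp_kind sp_classify(uint32_t w)
{
    unsigned e = sp_exponent(w);
    uint32_t f = sp_fraction(w);

    if (e == 255u)
        return (f != 0u) ? SPK_NAN : SPK_INFINITY;
    if (e == 0u)
        return (f == 0u) ? SPK_ZERO : SPK_OTHER; /* SPK_OTHER: not covered by the text */
    return SPK_NORMAL;                           /* 1 <= e <= 254 */
}

/* True exponent of a normal number: subtract the bias of 127. */
static int sp_unbiased_exponent(uint32_t w)
{
    assert(sp_classify(w) == SPK_NORMAL);
    return (int)sp_exponent(w) - 127;
}
```

For instance, the pattern 0x3F800000 (1.0) classifies as normal with an unbiased exponent of 0, and 0xFF800000 is negative Infinity.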
Extended-Precision Floating-Point
The extended-precision floating-point format is 40 bits wide, with the same 8-bit exponent as in the standard format but a 32-bit significand. This format is shown in Figure C-2. In all other respects, the extended floating-point format is the same as the IEEE standard format.
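[Figure C-2: 40-bit extended-precision floating-point format. Bit 39 holds the sign s; bits 38-31 hold the exponent e7-e0; bits 30-0 hold the fraction f30-f0. The hidden bit and the binary point (1.f) sit immediately to the left of f30.]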
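As a rough illustration of the 40-bit layout, the sketch below decodes a normal extended-precision word held in the low 40 bits of a 64-bit integer and widens a 32-bit single-precision pattern by appending eight zero fraction bits. The function names (ext40_decode, ext40_from_single, pow2) are hypothetical, only normal numbers are handled, and the result is returned as a host double purely for inspection.

```c
#include <assert.h>
#include <stdint.h>

/* Exact power of two for small integer exponents, without <math.h>. */
static double pow2(int n)
{
    double r = 1.0;
    for (; n > 0; --n) r *= 2.0;
    for (; n < 0; ++n) r *= 0.5;
    return r;
}

/* Decode a normal 40-bit word: s = bit 39, e = bits 38-31, f = bits 30-0. */
static double ext40_decode(uint64_t word)
{
    assert((word >> 40) == 0u);
    unsigned s = (unsigned)(word >> 39) & 1u;
    int      e = (int)((word >> 31) & 0xFFu);
    uint32_t f = (uint32_t)(word & 0x7FFFFFFFu);

    assert(e >= 1 && e <= 254);                      /* normal numbers only */
    double significand = 1.0 + (double)f / 2147483648.0; /* 1.f, f scaled by 2^-31 */
    double magnitude   = significand * pow2(e - 127);
    return s ? -magnitude : magnitude;
}

/* Widen a 32-bit pattern: same sign and exponent, eight extra fraction LSBs. */
static uint64_t ext40_from_single(uint32_t single)
{
    return (uint64_t)single << 8;
}
```

Under this layout, the single-precision pattern for 1.0 (0x3F800000) widens to 0x3F80000000, which decodes back to 1.0.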
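[Figure: 16-bit floating-point format. Bit 15 holds the sign s; bits 14-11 hold the exponent e3-e0; bits 10-0 hold the fraction f10-f0. The hidden bit and the binary point (1.f) sit immediately to the left of f10.]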
(including hidden 1) is right-shifted the appropriate amount. The packed result is a denormal which can be unpacked into a normal IEEE floating-point number. During the FPACK operation, an overflow sets the SV condition and non-overflow will clear it. During the FUNPACK operation, the SV condition is cleared. The SZ and SS conditions are cleared by both instructions.
Table C-2. FPACK Operations
Condition         Result
135 < exp         Largest magnitude representation.
120 < exp ≤ 135   Exponent is MSB of source exponent concatenated with the three LSBs of source exponent. The packed fraction is the rounded upper 11 bits of the source fraction.
                  Exponent=0. Packed fraction is the upper bits (source exponent - 110) of the source fraction prefixed by zeros and the hidden 1. The packed fraction is rounded.
                  Packed word is all zeros.
exp = source exponent; sign bit remains the same in all cases.
exp = 0
exp = source exponent sign bit remains the same in all cases
Fixed-Point Formats
The ADSP-21161 processor supports two 32-bit fixed-point formats: fractional and integer. In both formats, numbers can be signed (twos-complement) or unsigned. The four possible combinations are shown in Figure C-4. In the fractional format, there is an implied binary point to the left of the most significant magnitude bit. In integer format, the binary point is understood to be to the right of the LSB. Note that the sign bit is negatively weighted in a twos-complement format.
ALU outputs always have the same width and data format as the inputs. The multiplier, however, produces a 64-bit product from two 32-bit inputs. If both operands are unsigned integers, the result is a 64-bit unsigned integer. If both operands are unsigned fractions, the result is a 64-bit unsigned fraction. These formats are shown in Figure C-5.
If one operand is signed and the other unsigned, the result is signed. If both inputs are signed, the result is signed and automatically shifted left one bit. The LSB becomes zero and bit 62 moves into the sign bit position. Normally bit 63 and bit 62 are identical when both operands are signed. (The only exception is full-scale negative multiplied by itself.) Thus, the left shift normally removes a redundant sign bit, increasing the precision of the most significant product. Also, if the data format is fractional, a single-bit left shift renormalizes the MSP to a fractional format. The signed formats with and without left shifting are shown in Figure C-6.
The multiplier has an 80-bit accumulator to allow the accumulation of 64-bit products. For more information on the multiplier and accumulator, see "Multiply-Accumulator (Multiplier)" on page 2-15.
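The effect of the one-bit left shift on a signed product can be modeled on a host machine as follows. This is a sketch of the arithmetic described above, not of the multiplier hardware; the function names (mul_signed_shifted, mul_unsigned) are hypothetical. It assumes a twos-complement host where converting an out-of-range unsigned 64-bit value to int64_t wraps, which is implementation-defined before C23.

```c
#include <assert.h>
#include <stdint.h>

/* Signed x signed: 64-bit product shifted left one bit, so the LSB is zero
 * and bit 62 of the raw product moves into the sign position. */
static int64_t mul_signed_shifted(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    /* Shift in the unsigned domain to avoid undefined behavior on overflow. */
    return (int64_t)((uint64_t)product << 1);
}

/* Unsigned x unsigned: plain 64-bit product, no shift. */
static uint64_t mul_unsigned(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}
```

Read as signed fractions, 0x40000000 is 0.5, and mul_signed_shifted(0x40000000, 0x40000000) returns 0x2000000000000000, which is 0.25 with the binary point to the right of bit 63. The full-scale negative case (INT32_MIN times itself) returns INT64_MIN, which is the exception the text mentions: bits 63 and 62 of the raw product differ, so the shift changes the sign.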
[Figure C-4. 32-bit fixed-point formats. Signed integer: bit weights -2^31, 2^30, 2^29, ..., 2^2, 2^1, 2^0; bit 31 is the sign bit; binary point to the right of bit 0. Signed fractional: bit weights -2^0, 2^-1, 2^-2, ..., 2^-29, 2^-30, 2^-31; bit 31 is the sign bit; binary point to the right of bit 31. Unsigned integer: bit weights 2^31, 2^30, 2^29, ..., 2^2, 2^1, 2^0; binary point to the right of bit 0. Unsigned fractional: bit weights 2^-1, 2^-2, 2^-3, ..., 2^-30, 2^-31, 2^-32; binary point to the left of bit 31.]
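[Figure C-5. 64-bit unsigned fixed-point product. Unsigned integer: bit weights 2^63, 2^62, 2^61, ..., 2^1, 2^0; binary point to the right of bit 0. Unsigned fractional: bit weights 2^-1, 2^-2, ..., 2^-63, 2^-64; binary point to the left of bit 63.]
[Figure C-6. 64-bit signed fixed-point product: signed integer without left shift, and signed fractional with left shift (bit weights -2^0, 2^-1, 2^-2, ..., 2^-61, 2^-62, 2^-63; bit 63 is the sign bit).]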
G GLOSSARY
Autobuffering Unit (ABU). (See I/O processor and DMA)
Arithmetic Logic Unit (ALU). This part of a processing element performs arithmetic and logic operations on fixed-point and floating-point data.
Asynchronous transfers. Asynchronous host accesses of the processor. After acquiring control of the processor's external bus, the host must assert the CS pin of the processor it wants to access.
Auxiliary registers. (See Index Registers)
Base address. The starting address of a circular buffer to which the DAG wraps around. This address is stored in a DAG Bx register.
Base registers. A base (Bx) register is a Data Address Generator (DAG) register that sets up the starting address for a circular buffer.
Bit-reverse addressing. The Data Address Generator (DAG) provides a bit-reversed address during a data move without reversing the stored address.
Block repeat. (See Do/Until instructions in the ADSP-21160 DSP Instruction Set Reference)
Block size register. (See Length Registers)
Boot Memory Space. The processor supports an external boot EPROM mapped to external memory and selected with the BMS pin. The boot EPROM provides one of the methods for automatically loading a program into the internal memory of the processor after power-up or after a software reset.
Broadcast data moves. The Data Address Generator (DAG) performs dual data moves to complementary registers in each processing element to support SIMD mode.
Buffered serial port. (See Serial ports)
Burst transfers. Multi-cycle synchronous transfers that contain a packet of at least two 64-bit transfers. For a master, only a DMA channel can master a burst transaction. As a slave, the processor supports burst read transfers from internal memory or the EPBx data buffers.
Bus slave or slave mode. A processor can be a bus slave to another processor or to a host processor. The processor becomes a host bus slave when the HBG signal is returned.
Bus Transition Cycle (BTC). A cycle in which control of the external bus is passed from one processor to another (in a multiprocessor system).
Circular buffer addressing. The DAG uses the Ix, Mx and Lx register settings to constrain addressing to a range of addresses. This range contains data that the DAG steps through repeatedly, wrapping around to repeat stepping through the range of addresses in a circular pattern.
Cluster multiprocessing. This is a multiprocessing system architecture in which the processor uses its link ports and external port for inter-processor communication.
Companding (compressing/expanding). This is the process of logarithmically encoding and decoding data to minimize the number of bits that must be sent.
Conditional branches. These are JUMP or CALL/return instructions whose execution is based on testing an IF condition.
Conflict resolution ratio. Because the external port must arbitrate accesses over three internal buses to one external bus, there is a 3:1 conflict resolution ratio at the external port interface. This ratio, plus the 2:1 or greater clock ratio between the processor's internal clock and the external system clock, means that systems that fetch instructions or data through the external port must tolerate at least one cycle (and usually many additional cycles) of latency.
DAGEN, Data address generator. (See DAGs)
Data Address Generator (DAG). The Data Address Generators (DAGs) provide memory addresses when data is transferred between memory and registers.
Data flow multiprocessing. This is a multiprocessor system architecture in which the processor uses its link ports for inter-processor communication.
Data register file. This is the set of data registers that transfer data between the data buses and the computation units. These registers also provide local storage for operands and results.
Data registers (Dreg). These are registers in the PEx and PEy processing elements. These registers hold operands for multiplier, ALU, or shifter operations and are denoted as Rx when used for fixed-point operations or Fx when used for floating-point operations.
Deadlock resolution. When both the processor subsystem and the system try to access each other's bus in the same cycle, a deadlock may occur in which neither access can complete. Techniques for resolving deadlock vary with the interface: DRAM, host, or multiprocessor system.
Delayed branches. These are JUMP and CALL/return instructions with the delayed branch (DB) modifier. In delayed branches, no instruction cycles are lost in the pipeline, because the processor executes the two instructions after the branch while the pipeline fills with instructions from the new branch.
Direct branches. These are JUMP or CALL/return instructions that use an absolute (not changing at runtime) address, such as a program label, or use a PC-relative address.
Direct reads and writes. A direct access of the processor's internal memory or I/O processor registers by another processor or by a host processor.
DMA (Direct Memory Accessing). The processor's I/O processor supports DMA of data between processor memory and external memory, host, or peripherals through the external, link, and serial ports. Each DMA operation transfers an entire block of data.
DMA chaining. The processor supports chaining together multiple DMA sequences. In chained DMA, the I/O processor loads the next Transfer Control Block (DMA parameters) into the DMA parameter registers when the current DMA finishes and auto-initializes the next DMA sequence.
DMA parameter registers. These registers function similarly to data address generator registers, setting up a memory access process. These registers include Internal Index registers (IIx), Internal Modify registers (IMx), Count registers (Cx), Chain Pointer registers (CPx), General Purpose registers (GPx), External Index registers (EIEPx), External Modify registers (EMEPx), and External Count registers (ECEPx).
DMA TCB chain loading. This is the process that the I/O processor uses for loading the TCB of the next DMA sequence into the parameter registers during chained DMA.
DMACx control registers. The DMA control registers for the EPBx external port buffers: DMAC10, DMAC11, DMAC12, and DMAC13. These correspond respectively to EPB0, EPB1, EPB2, and EPB3.
Edge-sensitive interrupt. The processor detects this type of interrupt if the input signal is high (inactive) on one cycle and low (active) on the next cycle when sampled on the rising edge of CLKIN.
Endian format, little versus big. The processor uses big-endian format (moving data starting with the most significant bit and finishing with the least significant bit) in almost all instances. The two exceptions are bit order for data transfer through the serial port and word order for packing through the external port. For compatibility with little-endian (least-significant-first) peripherals, the processor supports both big- and little-endian bit order data transfers. Also, for compatibility with little-endian hosts, the processor supports both big- and little-endian word order data transfers.
Explicit Versus Implicit operations. In SIMD mode, identical instructions execute on the PEx and PEy computational units; the difference is the data. The data registers for PEy operations are identified (implicitly) from the PEx registers in the instruction. This implicit relation between PEx and PEy data registers corresponds to complementary register pairs.
External bus. The processor extends the following signals off-chip as an external bus: DATA47-16, ADDR23-0, RD, WR, MS3-0, BMS, CLKOUT, BRST, ACK, and SBTS.
External memory space. This space ranges from address 0x0200 0000 through 0x0CFF FFFF (Normal word) for Non-SDRAM and from address 0x0020 0000 through 0x0FFF FFFF (Normal word) for SDRAM. External memory space refers to the off-chip memory or memory mapped peripherals that are attached to the processor's external address (ADDR23-0) and data (DATA47-16) buses.
External port FIFO buffers (EPB0, EPB1, EPB2, and EPB3). The I/O processor registers used for external port DMA transfers and single-word data transfers (from other processors or from a host processor). These buffers are eight-deep FIFOs.
External port. This port extends the internal address and data buses off chip, providing the processor's interface to off-chip memory and peripherals.
Field deposit (Fdep) instructions. These shifter instructions take a group of bits from the input register (starting at the LSB of the 32-bit integer field) and deposit the bits as directed anywhere within the result register.
Field extract (Fext) instructions. These shifter instructions extract a group of bits as directed from anywhere within the input register and place them in the result register (aligned with the LSB of the 32-bit integer field).
Programmable Flag pins. These pins (FLGx) can be programmed as input or output pins using bit settings in the MODE2 register. The status of the flag pins is given in the FLAGS or IOFLAG register.
General-purpose input/output pins. (See Programmable Flag pins)
Flag update. The processor's update to status flags occurs at the end of the cycle in which the status is generated and is available on the next cycle.
Harvard architecture. A memory architecture that has separate buses for program and data storage. The two buses allow the processor to fetch a data word and an instruction simultaneously.
Hold time cycle. This is an inactive bus cycle that is automatically generated at the end of a read or write (depending on the external port access mode) to allow a longer hold time for address and data. The address (and data, if a write) remains unchanged and is driven for one cycle after the read or write strobes are deasserted.
Host transition cycle (HTC). A cycle in which control of the external bus is passed from the ADSP-21161 processor to the host processor. During this cycle the processor stops driving the RD, WR, ADDR23-0, MS3-0, CLKOUT, PA, and DMAGx signals, which must then be driven by the host.
I/O processor register. One of the control, status, or data buffer registers of the processor's on-chip I/O processor.
Idle cycle. This is an inactive bus cycle that is automatically generated (depending on the external port access mode) to avoid data bus driver conflicts. Such a conflict can occur when a device with a long output disable time continues to drive after RD is deasserted while another device begins driving on the following cycle.
IDLE. An instruction that causes the processor to cease operations, holding its current state until an interrupt occurs. Then, the processor services the interrupt and continues normal execution.
Index registers. An index register is a Data Address Generator (DAG) register that holds an address and acts as a pointer to memory.
Indirect branches. These are JUMP or CALL/return instructions that use a dynamic (changes at runtime) address that comes from the PM data address generator.
Interleaved data. To take advantage of the processor's data accesses to 4- and 3-column locations, programs must adjust the interleaving of data into (not necessarily sequential) memory locations to accommodate the memory access mode.
Internal memory space. This space ranges from address 0x0000 0000 through 0x0005 3FFF (Normal word). Internal memory space refers to the ADSP-21161 processor's on-chip SRAM and memory mapped registers.
Interrupts. Subroutines in which a runtime event (not an instruction) triggers the execution of the routine.
JTAG port. This port supports the IEEE standard 1149.1 Joint Test Action Group (JTAG) standard for system test. This standard defines a method for serially scanning the I/O status of each component in a system.
Jumps. Program flow transfers permanently to another part of program memory.
Link ports. The processor has two 8-bit wide link ports, which can connect to the link ports of other processors or peripherals. These bidirectional ports have eight data lines, an acknowledge, and a clock line.
Length registers. A length register is a Data Address Generator (DAG) register that sets up the range of addresses of a circular buffer.
Level-sensitive interrupts. This type of interrupt is detected if the signal input is low (active) when sampled on the rising edge of CLKIN.
Loops. One sequence of instructions executes several times with zero overhead.
McBSP, multichannel buffered serial port. (See Serial port)
MCM, multichannel mode. (See Multichannel mode)
Memory access modes. The processor supports Asynchronous and Synchronous modes for accessing external memory space. In asynchronous access mode, the processor's RD and WR strobes change before CLKIN's edge. In synchronous access mode, the processor's RD and WR strobes change on CLKIN's edge.
Memory blocks and banks. Memory is divided into blocks that are each associated with different data address generators. The processor's external memory space is divided into banks, which may be addressed by either data address generator.
Modified addressing. The DAG generates an address that is incremented by a value or a register.
Modify address. The Data Address Generator (DAG) increments the stored address without performing a data move.
Modify registers. A modify register is a Data Address Generator (DAG) register that provides the increment or step size by which an index register is pre- or post-modified during a register move.
Multichannel mode. In this mode, each data word of the serial bit stream occupies a separate channel.
Multifunction computations. Using the many parallel data paths within its computational units, the processor supports parallel execution of multiple computational instructions. These instructions complete in a single cycle, and they combine parallel operation of the multiplier and the ALU or dual ALU functions. The multiple operations perform the same as if they were in corresponding single-function computations.
Multiplier. This part of a processing element does floating-point and fixed-point multiplication and executes fixed-point multiply/add and multiply/subtract operations.
Multiprocessor memory space. The portion of the memory map that includes the I/O processor registers of the processors in a multiprocessing system. This address space is mapped into the unified address space of the ADSP-21161 processor.
Multiprocessor system. A system with multiple processors, with or without a host processor. The processors are connected by the external bus and/or link ports.
Multiprocessor vector interrupt. The vector interrupt (VIRPT) permits passing interprocessor commands in multiple-processor systems. One processor writes a vector address to another processor's VIRPT register. Writing the address initiates the vector interrupt on the processor that receives the write. The ADSP-21161 processor executes (vectors to) the interrupt service routine at that address.
Neighbor registers. In Long word addressed accesses, data is moved to or from two neighboring data registers. The least significant 32 bits move to or from the explicit (named) register in the neighbor register pair. In forced Long word accesses (Normal word address with LW mnemonic), the Normal word address is converted to Long word, placing the even Normal word location in the explicit register and the odd Normal word location in the other register in the neighbor pair.
PAGEN, Program address generation logic. (See the Program Sequencer chapter)
Peripherals. This refers to everything outside the processor core. The ADSP-21161 processor's peripherals include internal memory, external port, I/O processor, JTAG port, and any external devices that connect to the ADSP-21161.
Precision. The precision of a floating-point number depends on the number of bits after the binary point in the storage format for the number. The processor supports two high precision floating-point formats: 32-bit IEEE single-precision floating-point (which uses 8 bits for the exponent and 24 bits for the mantissa) and a 40-bit extended precision version of the IEEE format.
Post-modify addressing. The Data Address Generator (DAG) provides an address during a data move and auto-increments the stored address for the next move.
Pre-modify addressing. The Data Address Generator (DAG) provides a modified address during a data move without incrementing the stored address.
Register swaps. This special type of register-to-register move instruction uses the special swap operator, <->. A register-to-register swap occurs when registers in different processing elements exchange values.
Saturation (ALU saturation mode). In this mode, all positive fixed-point overflows return the maximum positive fixed-point number (0x7FFF FFFF), and all negative overflows return the maximum negative number (0x8000 0000).
Semaphore. This is a flag that can be read and written by any of the processors sharing the resource. Semaphores can be used in multiprocessor systems to allow the processors to share resources such as memory or I/O. The value of the semaphore tells the processor when it can access the resource. Semaphores are also useful for synchronizing the tasks being performed by different processors in a multiprocessing system.
Serial ports. The ADSP-21161 processor has four synchronous serial ports that provide an inexpensive interface to a wide variety of digital and mixed-signal peripheral devices.
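The Saturation entry above can be made concrete with a short sketch of a saturating 32-bit fixed-point add. It only models the clamping rule stated in that entry. The function name is hypothetical, the inputs are assumed to be signed Python integers already within the 32-bit range, and the result is returned as a 32-bit two's-complement bit pattern.

```python
MAX_POSITIVE = 0x7FFFFFFF  # largest positive 32-bit fixed-point value
MAX_NEGATIVE = 0x80000000  # bit pattern of the most negative 32-bit value


def saturating_add(a, b):
    """Add two signed 32-bit values, clamping on overflow."""
    total = a + b
    if total > 0x7FFFFFFF:
        return MAX_POSITIVE          # positive overflow saturates high
    if total < -0x80000000:
        return MAX_NEGATIVE          # negative overflow saturates low
    return total & 0xFFFFFFFF        # in range: return the bit pattern


print(hex(saturating_add(0x7FFFFFFF, 1)))    # 0x7fffffff
print(hex(saturating_add(-0x80000000, -1)))  # 0x80000000
```

Without saturation, the first sum would wrap around to the most negative value, which is usually far more damaging to a signal than clipping at full scale.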
SHARC. This is an acronym for Super Harvard Architecture. This architecture balances a high performance processor core with high performance buses (PM, DM, IO).
Shifter. This part of a processing element completes logical shifts, arithmetic shifts, bit manipulation, field deposit, and field extraction operations on 32-bit operands. Also, the Shifter can derive exponents.
SMUL, Saturation on multiplication. (See Multiplier Saturation modes)
SST, Saturation on store. (See Multiplier Saturation modes)
Subroutines. The processor temporarily interrupts sequential flow to execute instructions from another part of program memory.
Single-word data transfers. Reads and writes to the EPBx external port buffers, performed externally by the bus master (or host) or internally by the slave's core. These occur when DMA is disabled in the DMACx control register.
Synchronous transfers. Synchronous host accesses of the ADSP-21161 processor. When CS is not asserted, the host must act like another processor in a multiprocessor system, by generating an address in multiprocessor memory space, asserting PA and WR or RD, and driving out or latching in the data.
TADD, TDM address. (See the section Multichannel Mode)
TCB chain loading. The process in which the DMA controller downloads a Transfer Control Block from memory and autoinitializes the DMA parameter registers.
Time Division Multiplexed (TDM) mode. The serial ports support TDM or multichannel operations. In multichannel mode, each data word of the serial bit stream occupies a separate channel; each word belongs to the next consecutive channel so that, for example, a 24-word block of data contains one word for each of 24 channels.
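The Time Division Multiplexed (TDM) mode entry above says that consecutive words of the serial stream belong to consecutive channels. A minimal software sketch of that mapping follows. The function name and the list-based stream are illustrative assumptions and do not represent the serial port hardware or its registers.

```python
def demux_tdm(words, num_channels):
    """Distribute a flat word stream across channels in round-robin order."""
    channels = [[] for _ in range(num_channels)]
    for position, word in enumerate(words):
        # Each word belongs to the next consecutive channel.
        channels[position % num_channels].append(word)
    return channels


# Two 24-word blocks: every channel receives one word from each block.
block = list(range(24))
per_channel = demux_tdm(block + block, 24)
print(per_channel[0], per_channel[23])  # [0, 0] [23, 23]
```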
Transfer control block (TCB). A set of DMA parameter register values stored in memory that are downloaded by the DMA controller for chained DMA operations.
Tristate versus three-state. Analog Devices documentation uses the term three-state instead of tristate because Tristate is a trademarked term, which is owned by National Semiconductor.
Universal registers (Ureg). These are any processing element registers (data registers), any Data Address Generator (DAG) registers, any program sequencer registers, and any I/O processor registers.
Von Neumann architecture. This is the architecture used by most (non-DSP) microprocessors. This architecture uses a single address and data bus for memory access.
Waitstates. Waitstates are applied to each external memory access depending on the bank's external memory access mode (EBxAM). The External Bank Waitstate (EBxWS) field in the WAIT register sets the number of waitstates for each bank.
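The Transfer control block (TCB) entry above describes parameter sets stored in memory and loaded one after another for chained DMA. The following sketch models only that idea. The `Tcb` fields, the dictionary standing in for memory, and the use of a zero chain pointer to end the chain are all assumptions made for illustration, not the actual TCB layout.

```python
from dataclasses import dataclass


@dataclass
class Tcb:
    index: int          # starting address of the transfer (hypothetical field)
    modifier: int       # address step between words (hypothetical field)
    count: int          # number of words to transfer (hypothetical field)
    chain_pointer: int  # address of the next TCB; 0 ends the chain (assumption)


def run_chain(tcb_memory, first_tcb_address):
    """Return the addresses touched while walking a chain of TCBs."""
    touched = []
    address = first_tcb_address
    while address != 0:
        tcb = tcb_memory[address]    # "download" the next parameter set
        for step in range(tcb.count):
            touched.append(tcb.index + step * tcb.modifier)
        address = tcb.chain_pointer  # follow the chain to the next TCB
    return touched


tcb_memory = {
    0x100: Tcb(index=0x2000, modifier=1, count=3, chain_pointer=0x200),
    0x200: Tcb(index=0x3000, modifier=2, count=2, chain_pointer=0),
}
chain_addresses = run_chain(tcb_memory, 0x100)
print([hex(a) for a in chain_addresses])
# ['0x2000', '0x2001', '0x2002', '0x3000', '0x3002']
```

Because each parameter set names its successor, a sequence of transfers can be prepared in memory once and then run without the core reprogramming the channel between blocks.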
INDEX
Numerics
16-bit floating-point format, C-4 32-bit data (See normal word) 40-bit extended-precision floating-point format, C-3
A
Abs function, 2-9 Absolute address, 3-14, G-3 Acknowledge (ACK) pin, 5-42, 6-54, 6-56, 7-6, 7-40, 7-89, 13-4, 13-16 Acknowledge controls, 1-12 Active drive REDY (ADREDY) bit, 7-44, 7-51, A-61 Active low versus active high frame syncs, 10-43 Add instruction, 2-1, 2-9, 2-36 Address bus (ADDR) pin, 5-21, 7-6, 7-38, 7-49, 7-89, 7-96, 8-26, 8-29, 13-4, 13-16 Address buses, 1-2 Address fields, A-42 PM and DM, A-43 Address fields for asynchronous host accesses, 5-21, 7-49 Addressing (See post-modify, pre-modify, modify, bit-reverse, or circular buffer) Storing top-of-loop addresses, A-41 Addressing, DSP external memory registers, A-90
Alternate registers (See secondary registers) ALU carry (AC) bit, 2-11, 3-54, A-14 ALU fixed-point overflow (AOS) bit, 2-11, A-19 ALU floating-point (AF) bit, 2-11, A-16 ALU floating-point invalid (AI) bit, 2-11, A-14 ALU floating-point invalid status (AIS) bit, 2-11, A-19 ALU floating-point overflow status (AVS) bit, 2-11, A-19 ALU floating-point underflow status (AUS) bit, 2-11, A-19 ALU negative (AN) bit, 2-11, A-13 ALU overflow (AV) bit, 2-11, 3-54, A-13 ALU saturation (ALUSAT) bit, 2-4, 2-10, A-4 ALU x-input sign (AS) bit, 2-11, A-14 ALU zero (AZ) bit, 2-11, A-13 AND, logical, 2-9 And breakpoints (ANDBKP) bit, 12-11 Arithmetic logic unit (ALU), 1-6, 2-1, 2-9 Instructions, 2-9, 2-12 Interrupts, 3-43 Operations, 2-9 Saturation, 2-10 Status, 2-4, 2-8, 2-10, 2-11, 2-20, 3-43 Arithmetic operations, 1-3, 2-9, 2-10 Arithmetic shifts, 2-1, G-11 Arithmetic status (ASTATx/y) registers, 2-8, A-13 Assembly language, 2-2
Asymmetric data moves, 2-39 Asynchronous access mode, 5-43, 7-6, 7-12, 7-13, 7-44, 7-50, 7-94, 13-16, G-8 Direct write, 7-56 For all external memory banks, A-65 Interface timing, 7-14 Read (bus master), 7-15 Read/Write (bus slave), 7-14 Slave write FIFO, 7-50 Starting a transfer, 7-48 Timing derivation, 7-17 Transfers, 7-42, 7-48 Write (bus master), 7-17 Asynchronous transfers, G-1 Average instructions, 2-9, 2-36
B
Background registers (See secondary registers) Bank activate (ACT) command, 8-2, 8-20, 8-32 Barrel-shifter (See shifter) Base (Bx) registers, 4-2, 4-16, A-47, G-1 Bidirectional functions, 10-1 Binary log (floating-point operation), 2-9 Bit (bit manipulation) instruction, 3-5 Bit manipulation, 2-1, 2-23, G-11 Bit-reverse addressing, 4-1, 4-4, 4-8, A-3, G-1 Bit-reverse addressing (BRx) bits, 4-4, 4-8, A-3 Bit-reverse (Bitrev) instruction, 4-8, 4-17, 4-23 Bit test flag (BTF) bit, 3-54, A-17 BIT TST instruction, 2-8
Bit XOR instruction, 3-54 Booting, 1-14, 5-23, 5-35, 5-37, 13-71 16-bit SPI host boot, 11-39 32-bit SPI host boot, 11-38 8-bit SPI host boot, 11-41 Another DSP, 7-108 External port booting, 6-70 From a 16-bit SPI host, 11-39 From a 32-bit SPI host, 11-38 From a 8-bit SPI host, 11-41 From an EPROM, 6-71, 13-73, 13-74, 13-75 From the link port, 6-88, 6-89, 6-113, 6-114 Mode selection, 6-70, 6-89, 6-114 Multiple DSPs, 13-73 Multiprocessor booting from external memory, 13-75 Multiprocessor EPROM booting, 13-73 Multiprocessor host booting, 13-73 Multiprocessor link port booting, 13-75 Multiprocessor SPI booting, 11-42 Sequential booting, 13-74 Single and multiple processors, 13-71 SPI, 11-34 Boot memory select (BMS) pin, 5-35, 6-42, 6-70, 6-89, 6-114, 7-10, 13-5, 13-12 Boot select override (BSO) bit, 5-32, 5-35, 6-31, 6-42, A-60 Boundary scan, 12-1, 12-29 Branch Conditional, 3-15 Delayed, 3-15 to 3-18 Direct, 3-14, G-3 Indirect, 3-14 Branches and sequencing, 3-13 Branching execution, 3-13 Direct and indirect branches, 3-14 Immediate branches, 3-16 Breakpoint output (BRKOUT) pin, 12-8
Breakpoint status shift (BRKSTAT) register, 12-12 Breakpoint status (STATx) bit, 12-13 Breakpoint stop (BKSTOP) bit, 12-8 Breakpoint triggering mode (xMODE) bit, 12-10 Broadcast load, 4-1, 4-2, 4-3, 4-5, 5-51, A-5, G-2 Broadcast load enable (BDCSTx) bits, 4-2, 4-4, 4-5, 5-40, A-5 Broadcast writes, 7-50, 7-57 BSDL file, 12-4 BSDL Reference Guide, 12-29 Buffer hang disable (BHD) bit, 6-31, 6-43, 12-11, A-61 Buffer hang override (BHO) bit, 12-11 Buffer overflow, circular, 4-9, 4-12, 4-15 Buffers Link port, 6-83, 6-133 Reading from an empty buffer, A-61 SDRAM buffering, 8-16 Buffer status, 6-129, 6-136, 6-139 Built-in self-test operation (BIST), 12-28 Burst length, 8-22 Burst length, setting, A-84 Bursts Sequential bursts, external port, 7-58 Burst transfer (BRST) pin, 7-6, 7-38, 7-89, 7-96, 13-6, 13-16 Burst transfers, 7-26, 7-29, G-2 Buses, 1-2, 1-10 Accessing the DSP bus, 7-79 Acquiring the bus, 7-45 addressing operations, 5-7 Arbitration, 7-93, 7-95, 7-96 arbitration, 5-7 Bus contention, A-48 Bus lock, 7-83, 7-92, 7-110 Bus master, 7-44, 7-79, 7-97 Timeout, 7-101
Buses (continued) Bus slave, 7-79, 7-97 Bus slave defined, G-2 Bus synchronization, 7-105, 7-108 Conflict resolution ratio, 7-3 Data access types, 5-47 Deadlock, 7-44, 7-54, 7-82 DSP bus, 7-79 Enhancements, 1-17 Multiprocessor bus arbitration, 7-93 Priority, 5-39 Processor core, 1-17 Bus exchange (See program memory bus exchange (PX) register) Bus lock and semaphores, 7-110 Bus lock (BUSLK) bit, 7-102, 7-111, A-10 Bus master, current (CRBMx) bit, A-69 Bus master (Bm) condition, 3-55, 7-95, 7-111 Bus master count (BCNT) register, 7-102, A-79 Bus master max time-out (BMAX) register, 7-101, A-79 Bus master output (BMSTR) pin, 13-5 Bus master select (CSEL) bit, A-5 Bus Request, multiprocessor (BRx) pins, 7-89, 7-94, 7-111, 13-5 Bus synchronized (BSYN) bit, 7-106, A-69 Bus Transition Cycle (BTC), 7-20, 7-45, 7-96, G-2 BYPASS instruction, 12-4
C
Cache disable (CADIS) bit, 3-11, A-10 Cache efficiency, 3-11 Cache freeze (CAFRZ) bit, 3-11, A-11 Cache hit/miss (See cache efficiency) CALL instructions, 3-13
Capacitors Bypass, 13-69 Decoupling capacitors, 13-69 Loading, 13-67 CAPTURE state, 12-7 CAS before RAS transaction (CBR), 8-3, 8-37, 8-39 CAS latency, 8-2, 8-8, 8-17, 8-24 CAS-to-RAS delay (SDTRCD), 8-4, 8-21 Chained DMA External port, 6-46 Link ports, 6-85 Serial ports, 6-99 Chained DMA enable, external port (CHEN) bit, 6-32, A-80 Chained DMA enable (SCHEN_A and SCHEN_B) bit, serial port, A-102 Chained DMA sequences, 6-25 Chain insertion mode, 6-29, 6-130 Chain pointer (CPx) registers, 6-7, 6-12, 6-25, A-88 Chip select (CS) pin, 7-42, 7-44, 7-79, 7-89, 13-8, A-75 Circular buffer addressing, 1-8, 4-2, 4-4, 4-12, A-6, G-2 Registers, 4-15 Setup, 4-13 SIMD and Long word accesses, 4-17 Wrap around, 4-15 Circular buffer addressing enable (CBUFEN) bit, 4-2, 4-4, 4-14, A-6 Circular buffering, length and base registers, A-47 Circular buffer x overflow interrupt (CBxI) bit, A-30 Circular buffer x overflow status (CBxS) bit, A-20 Clear, bit, 2-23 Clear interrupt (CI) Jump instruction, 3-14
Clip function, 2-9 CLKOUT disable (COD) bit, A-61 Clock and frame sync frequencies (DIV), 10-33 Clock cycles delays, 8-2, A-73 Clock derivation, 13-24 Clock distribution, 13-63 Clock divisor (CLKDIV) bits, A-112 Clock double (CLKDBL) pin, 13-6, 13-25, 13-26 Clock input (CLKIN) pin, 7-38, 7-89, 10-8, 12-1, 13-16, 13-24, 13-28 Clock output (CLKOUT) pin, 7-7, 7-96, 13-8, 13-62 Clock ratio, 13-28 Clock ratio configuration (CLK_CFGx) pins, 13-25, A-70 Clock relationships, 13-27 Clock rising edge (CKRE) bit, 10-21, A-101 Clocks CLKOUT and CCLK clock generation, 13-27 Coordinating the SDRAM CLK rate, 8-3 Core clock and system clock relationship to CLKIN, 13-27 Core clock ratio, 8-12, 13-28 Determining switching frequencies, 13-26 Determining the period, 13-28 Jitter, 13-63 Programming clock ratio example, 13-40 SPICLK, 11-3 System clock CLKIN, 10-8 Clock signal options, 10-40 Cluster multiprocessing, G-2 CODECs, 10-1 Code select (CSEL) bit, 3-56, 7-95, A-5
Commands, SDRAM Active command tRAS, 8-4 Bank activate (ACT), 8-2, 8-20, 8-21, 8-32, A-73 CBR automatic refresh, 8-3 Mode register set (MRS), 8-19 Precharge, 8-20, 8-29, 8-33, 8-37, A-74 Refresh, 8-29, 8-30, 8-37 Self refresh (SREF), 8-3, 8-20, 8-39 Companding (compressing/expanding), 1-14, 10-2, 10-39, G-2 Compare accumulation (CACCx) bits, 2-11, A-17 Compare function, 2-9 Complementary conditions, 3-59 Complementary registers, 2-40, G-5 COM port, McBSP (See link ports) Computational mode, 2-42 Computational mode, setting, 2-4 Computational status, using, 2-8 Computational units (See processing elements) Conditional Branches, 3-15, 3-59, G-2 Complementary conditions, 3-59 Compute operations, 3-58 Conditions list, 3-54 Execution summary, 3-58 Instructions, 3-3, 3-53 Sequencing, 3-53 SIMD mode and conditionals, 3-57 Condition codes, 3-54 Conditioning input signals, 13-60 Configuration register, 8-3 Configuring and enabling the SPI system, 11-9 Configuring frame sync signals, 10-4 Conflict resolution ratio, G-2 Context switch, 1-9, 2-32
Core clock ratio, 8-12 Core hang, causes, A-121 Core-memory halt (COMHALT) bit, 12-12 Core-to-CLKIN ratio (CRAT) bit, A-70 Count (Cx) registers, 6-7, 6-11, A-87 Counter-based loops, 3-26, 3-27 (See also Non-counter-based loops) Crosstalk, 13-68 Current Bus Master (CRBMx) bits, 7-95, 7-96 Current loop counter (CURLCNTR) register, 3-31, A-45
D
Data, fixed- and floating-point, 2-1, G-1 Data access conflicts, 5-7 Dual-data accesses, 5-5 Dual-data access restrictions, 5-5 Options, 5-52 (See also data moves) Settings, 5-32 Data Address Generators (DAGs), 1-8, 4-1, 5-8, 5-40, 8-22, G-3 Data alignment, 4-19 Data move restrictions, 4-21 Data moves, 4-18 Enhancements, 1-17 Features, 1-5 Instructions, 4-23 Operations, 4-9 Setting Modes, 4-2 SIMD mode, 4-18 Status, 4-8 Data addressing mode, 2-42 Data alignment, 5-10, 5-25, 5-48, 7-1 Data alignment, link data, 9-12 Data buffers, 6-13
Data bus (DATA) pins, 7-8, 7-38, 7-89, 13-9, 13-16 Data (Dreg) registers, G-3 Data fetch, external port, 7-3, G-3 Data file registers, listed, A-23 Data flow, 1-3, 2-1 Data flow multiprocessing, G-3 Data format, 2-2 Extended-precision normal word, 40-bit floating-point, 2-5 External data, 6-46 Link data, 6-86 Normal word, 32-bit fixed-point, 2-6 Normal word, 32-bit floating-point, 2-4 Serial data, 6-99 Short word, 16-bit floating-point, 2-6 Data hold cycle, 7-12 Data independent transmit frame sync (DITFS) bit, 10-22 Data I/O mask DQM (data I/O mask) pin, 8-3, 8-29 Data memory data select (DMDSEL) bit, 12-11 Data Memory (DM) bus, 1-2 Data moves, 1-10 Conditional, 3-59 Moves to/from PX, 5-14 Data packing, 1-12, 7-56, 7-58, 7-59, 7-110, 10-37 Data registers, 1-6, 2-30, 2-42, G-3 Data registers, secondary hi/lo (SRRFH/L) bits, 2-33 Data transfers, using EPBx buffers, 7-58 Data type, external port (DTYPE) bit, 6-32, 6-43, 6-46, A-81 Data type, serial port (DTYPE) bit, 6-96, 6-109, 10-24, A-101 Data type and formatting (multichannel and non-multichannel), 10-37, 10-38 Data types, 5-47
Deadlock resolution, 7-82, G-3 Deadlock (See bus deadlock) Debugging, tools, 13-49 Decode address (DADDR) register, 3-5, A-44 Decode address register, 3-2 Decode cycle, 3-7 Delayed branch (DB) instruction, 3-15 to 3-18, 3-19 (DB) Jump or Call instruction, 3-17, G-3 limitations, 3-19 Denormal operands, 2-5 Deposit bit field, 2-23 Divisor (DIVx) register, serial port, 10-4 DMA Bus slave versus bus master, 7-59 Defined, G-4 DSP DMA Access To System Bus, 7-84 External port, 7-58 Interrupt-driven DMA, 6-125 Serial port, 6-108 SPI, 11-32 SPI master mode, 11-32 SPI slave mode, 11-33 Transfers, 7-58 DMA Address (DA) Registers, listed, A-49 DMA block transfers, 10-59 DMA channel Buffer registers, listed, 6-13 Interrupt priorities, 6-126 Latency, 6-125 Parameter registers, listed, 6-13 Priority, 6-12, 6-22, 6-24, 6-44, 6-83, 6-99, 6-112 Status, 6-124 DMA channel priority rotation, external port (DCPR) bit, 6-31, 6-43, A-61 DMA channel status (DMASTAT) register, 6-125, A-90
DMA control (DMACx) registers, 6-6, 6-30, 7-103, A-80, G-4 DMA controller, 1-2, 1-15 Enhancements, 1-18, 1-19 Priority pathways, 6-22 DMA data 16-bit external transfers, 6-52 32-bit external transfers, 6-51 32-bit internal transfers, 6-54 48-bit internal transfers, 6-53 64-bit internal transfers, 6-53 DMA enable, external port (DEN) bit, 6-15, 6-32, A-80 DMA external request counter, 6-61 DMA grant (DMAGx) pins, 6-57, 6-67, 7-59, 7-96, 13-9, 13-16 DMA hardware handshake, 6-59, 6-63, A-65 DMA hardware interface, 6-140 DMA hold off, 6-56, 6-62 DMA internal request & grant paths, 6-22 DMA parameter registers, defined, G-4 DMA pipeline, 6-61 DMA request (DMARx) pins, 6-54 to 6-67, 7-59, 13-9, 13-16 DMA sequences Chaining sequences, 6-25 Chain insertion, 6-28 Chain set up and start, 6-28 Sequence complete interrupt, 6-126 Sequence end, 6-21 TCB loading, 6-26, G-4 DMA slave, interrupts, 9-19 DMA targets External memory, 6-49 Internal memory, 6-139 DMx register, 12-13, 12-15 DO UNTIL instruction, 3-24 (See also loops)
DSP Architecture overview, 1-5 Design advantages, 1-1 DSP serial mode, 10-59 Dual add and subtract, 2-36 Dual-data accesses, 5-52 Dual processing element moves (See broadcast load) .D unit (See DAGs or ALU)
E
Edge-sensitive interrupts, 3-40, A-10, G-4 Effect latency (See latency) E field, address, A-42 EMULATION instruction, 12-4 Emulation (JTAG), 1-2 Emulation status EMU pin, 13-10, 13-54 Emulator 48-bit PX shift (EMUPX) register, 12-6 Emulator 64-bit PX shift (EMU64PX) register, 12-7 Emulator clock2 (EMUCLK2) register, 12-3, 12-16 Emulator clock (EMUCLK) register, 12-16 Emulator control shift (EMUCTL) register, 12-8 Emulator enable (EMUENA) bit, 12-8 Emulator idle (EMUIDLE) instruction, 12-17 Emulator interface illustrated for ADI JTAG processors, 13-50 Emulator Interrupt (EMUI) bit, A-27 Emulator interrupt enable (EIRQENA) bit, 12-8 Emulator Nth event counter (EMUN) register, 12-3, 12-16 Emulator PC shift (EMUPC) register, 12-7 Emulator PM data shift (EMUPMD) register, 12-5
Emulator pod, connection, 13-56 Emulator ready (EMUREADY) bit, 12-12 Emulator shift (EMUPC) register, 12-7 Emulator space (EMUSPACE) bit, 12-12 Emulator status shift (EMUSTAT) register, 12-11 Enable breakpoint (ENBx) bit, 12-10 Enable (BRKOUT) pin, 12-8 Endian format, 1-14, 10-36, G-4 End-of-loop, 3-25 EPROM booting, 6-70, 6-71 EPROM boot select (EBOOT) pin, 13-10 Equals (EQ) condition, 3-54 Examples bit reverse addressing, 4-8 Cache inefficient code, 3-12 Clock derivation, 13-28 Configuring flags, 13-37 Direct branch, 3-14 DO UNTIL loop, 3-23 Dual processor system example, 8-25 External port DMA programming example, 6-76 Interrupt service routine, 3-48 Long word moves, 5-49 Programming clock ratio, 13-40 PX register transfers, 5-10 to 5-15 Rotating Priority Arbitration, 7-101 SDRAM programming examples, 8-40 Serial port programming examples, 10-67 Single and dual data access, 5-52 SPI programming examples, 11-44 Token passing, 9-27 Examples, timing Framed vs. unframed data, 10-45 Host acquisition of bus, 7-46 Link port handshake, 9-10 Normal vs. alternate framing, 10-45 Serial port multichannel transfer, 10-52
Examples, timing (continued) Serial port word select, 10-52 Typical synchronous write, 7-23 Execute cycle, 3-7 Execution stalls, bus transition, 7-97 Explicit versus implicit operations, G-5 Exponent derivation, 2-1, G-11 Extended precision normal word, 5-25, 5-50 Data access, 5-70 Data storage, 5-2 Mixed data access, 5-50 SIMD mode access, 5-74 External bank access mode (EBxAM) bits, 5-34, 5-42, A-66 External bank x waitstates (EBxWS) bits, 5-35, 5-45, 7-20, A-66, G-12 External bus arbitration, 6-24 External bus priority (EBPRx) bits, 5-33, A-61 External handshake mode, 6-49, 6-66 DMA exceptions, 6-66 (EXTERN) bit, 6-33, 6-47 to 6-66, A-82 program control (PCI) interrupt, 6-67 External instruction execution packing modes, 5-102 External memory, 1-18, 5-16, 5-22, G-5 Access modes, 5-42, G-8 Access timing, 7-13 Addressing registers, A-90 Banks, 7-9 Interface, 7-3, 7-6 External memory addresses, A-43 External memory DMA count (ECEPx) registers, 6-8, 6-12, 6-55, 6-66, A-90 External memory DMA index (EIEPx) registers, 6-7, 6-12, 6-55, 6-66, A-89 External memory DMA modifier (EMEPx) registers, 6-8, 6-12, 6-55, 6-66, A-89
External port, 1-2, 1-12, 5-7, 7-1, G-5 Buffer modes, 6-42 Buffer status, 6-129 Conflict resolution, 7-3, G-3 Data packing, 1-12 DMA channel priority modes, 6-43 DMA channel priority swap, 6-24 DMA channel transfer modes, 6-46 DMA handshake modes, 6-47 DMA programming examples, 6-76 DMA setup, 6-68 Enhancements, 1-18 Latency, G-2 Modes, 6-30, 7-3 Packing status, 6-129 Selecting the external port buffers mode, A-82 Setting External Port Modes, 7-3 Single-word transfers, 7-58 Status, 6-127 Termination values, 13-16 External port address (EPAx) register, 12-16 External port boot (EBOOT) pin, 6-70, 6-89, 6-114 External port buffer (EPBx) register data transfers, 7-58 External port buffer (EPBx) registers, 6-4, A-61, A-76 External port buffer x DMA interrupt (EPxI) bit, A-30 External port bus priority (PRIO) bit, 6-33 External port DMA Channels, 6-49 DMA hardware interface, 6-140 DMA setup, 6-68 Modes, 6-46, 6-47 External port DMA channel priority rotation (DCPR) bit, A-61 External port (EP) registers, listed, A-49
External port FIFO buffers, 6-130, G-5 External port halt (EPHALT) bit, 12-12 External port-link port rotating DMA channel priority (PRROT) bit, A-61 External port packing mode (PMODE) bits, 6-32 External port stop (EPSTOP) bit, 12-9 EXTEST instruction, 12-4 Extract bit field, 2-23 Extract exponent, 2-23
F
False always (FOREVER) Do/Until condition, 3-56 FAX for information, 1-21 Fetch address (FADDR) register, 3-2, A-44 Fetch cycle, 3-7 Fetched address, 3-2 Field deposition/extraction, 2-1, G-11 FIFO buffer status, external port (FS) bit, 6-128, A-84 File Transfer Protocol (FTP) site, 1-21 Fixed-point ALU instructions, 2-12 Data, 2-1, G-1 Multiplier instructions, 2-21, 2-36 Operands, 2-10, A-14 Operations, 2-31 Saturation values, 2-19 Fixed-point overflow interrupt (FIXI) bit, 3-43, A-31 Fixed priority, 6-22, 6-44, 6-84, 6-99, 6-112 Flag input (FLAGx_IN) conditions, 3-55 Flag input/output (FLAGx) pins, 7-82, 10-7, 13-10, 13-16, 13-34, 13-39 Flag input/output (FLGx) bits, 13-34, A-10, A-37 Flag input/output select (FLGxO) bits, A-10
Flag input/output value (FLAGS) register, 13-34, A-37 Flag pins, configuration example, 13-37 Flag update, 2-12, 2-20, 2-27, 2-46, 3-43, 4-9, 5-46, 7-82, 13-39, G-6 Floating-point ALU instructions, 2-14 Data, 2-1, 2-7, G-1 Data format (RND32) bit, 2-4 Invalid operation (FLTII) interrupt, 3-43 Multiplier instructions, 2-21 Operations, 2-31, 2-36 Floating-point invalid interrupt (FLTII) bit, A-31 Floating-point overflow interrupt (FLTOI) bit, 3-43, A-31 Floating-point underflow interrupt (FLTUI) bit, 3-43, A-13, A-31 Flow-through SBSRAM (See SBSRAM) Flush DMA buffers/status (FLSH) bit, 6-30, 6-128, A-83 Format conversion, 2-9 Format packing (Fpack/Funpack) instructions, 2-6 Fractional Data, 2-6, 2-7 Input(s), 2-22 Results, 2-17, C-6 Framed versus unframed data, 10-42 Frame sync early versus late, 10-44 (FSx) pins, 10-4, 13-10 internal vs. external, 10-42 options, 10-41 rates, setting, 10-49 required (FSR) bit, A-102 signals, configuration, 10-4 Full-duplex operation, specifications, 10-4 Functions, ABS (absolute value), 2-9
G
General-purpose (GPx) registers, 6-7, 6-12, 6-26, A-89 Global interrupt enable (IRPTEN) bit, A-4 Greater or Equals (GE) condition, 3-54 Greater Than (GT) condition, 3-54 Ground plane, 13-68
H
Handshake and idle for DMA enable (HIDMA) bit, 6-31, A-67 Handshake mode, 6-48, 6-57, 6-144 DMA, A-65 Enable/disable transition, 6-62 Operation, 6-60 Register handshake/write-back messaging, 7-77 Transfer Size, 6-58, 6-68 Handshake mode (HSHAKE) bit, 6-33 to 6-66, A-82 Handshaking External port, 7-4, 7-42, 7-87 External port DMA, 6-47 Link port, 9-2, 9-10 Harvard architecture, 5-4, G-6 Hold off DSP, bus transition, 7-97 DSP, during DMA, 6-62 External device, during DMA, 6-56 SBSRAM, 7-40 Hold time, inputs, 13-19 Hold time cycle, 5-45, 7-12, G-6 Host bus acknowledge (REDY) pin, 13-13 Host Bus Grant (HBG) pin, 7-42, 7-44, 7-79, 7-89, 7-96, 7-107, 13-11 Host bus master (HSTM) bit, A-69 Host bus request HBR pin, 7-42, 7-44, 7-111, 8-29, 13-10 Host bus width (HBW) bit, A-60
Host interface, 1-13 Access to link buffers, 9-14 Booting, 6-70 Deadlock resolution (See bus deadlock) Deadlock Resolution With SBTS, 7-54 Enhancements, 1-18 signals, 7-44 Status, 7-76 Uniprocessor, 7-50 Host most significant word first packing (HMSWF) bit, 6-31, A-60 Host packing mode (HPM) bits, 6-31, 6-42, 7-43 Host packing status (HPS) bit, 6-127, A-70 Host Processor Interface, 7-42 Host Transfer Timing, 7-51 Host Transition Cycle (HTC), 7-45, 13-17, G-6 Hysteresis on Reset (RESET) pin, 13-61
I
I2S control bits, 10-49 I2S mode, 10-48, 10-59 I2S support, 1-14 IDCODE instruction, unsupported, 12-4 Identification, processor (PIDx) bit, A-78 Identification code (IDC) bit, A-69 Identification (ID2-0) pin, 7-94, 13-11, 13-16 Idle cycle, 7-10, 7-12, G-6 IDLE instruction, 3-1, 3-48 IDLE instruction, defined, G-6 Idle mode (INIDLE) bit, 12-12 IEEE 1149.1 JTAG specification, 1-16, G-7 IEEE 1149.1 JTAG standard, 13-56 IEEE 754/854 floating-point data format, 2-4, C-1
IEEE floating-point number conversion, 2-6 Illegal input condition detected (IICD) bit, 5-41, A-27 Illegal IOP register access (IIRA) bit, A-20 Illegal I/O processor register access enable (IIRAE) bit, 5-34, 5-41, A-11 Immediate branch, 3-16 Implicit operations, 5-41 Broadcast load, 4-5 Complementary registers, 2-40 Long Word (LW) accesses, 5-48 Neighbor registers, 5-49 SIMD mode, 2-40 In circuit signal analyzer (ICSA) function, 12-11, 12-17 INCLUDE directory, 10-10 Increment instruction, 2-9 Index (Ix) registers, 4-2, 4-15, A-47, G-7 Indirect addressing, 1-8 Indirect branch, 3-15, G-7 Inductance (run length), 13-67 Infinity, round-to, 2-5 Input filtering, link port, 13-60 Input/Output (IO) bus, 1-2 Input setup and hold time, 13-19 Input signal conditioning, 13-60 Input Synchronization Delay, 13-33 Instruction External memory fetch, 7-3, 7-56, G-3 Moves, 7-56 Transfers, 7-56, 7-110 Instruction (bit), 3-5 Instruction cache, 1-9, 3-9, 5-5 Instruction dispatch/decode (See program sequencer) Instruction Execution Mode, external packed (IPACK) bit, A-62 Instruction pipeline, 3-2, 3-7 Instruction register, 12-4
Instructions ADD, 2-1, 2-9, 2-36 AVE, 2-9, 2-36 BIT CLR, 2-23 BIT TST, 2-8 Computational, 2-1 Conditional, 2-8, 2-42, 7-10 conditional, 2-44 Decrement, 2-9 delayed branch (DB), 3-19 FDEP, 2-25 Multiplier, 2-15, 2-20 Instruction set Changes, 1-20 Enhancements, 1-20 Instruction word Data access, 5-50 Storage, 5-2 Word Rotations, 5-25 Instruction Word Transfer (IWT) bit, 7-49 Integer Input(s), 2-22 Results, 2-16, C-6 Integer data, 2-6 Interleaved data, 5-100, G-7 Internal address bus (IA), 8-26 Internal Buses, 1-10 Internal interrupt vector table (IIVT) bit, 5-32, 5-37, A-60 Internal I/O bus arbitration (request & grant), 6-22 Internal memory, 5-2, 5-16, 5-18, 5-24, G-7 Internal memory 32-bit transfers (INT32) bit, A-84 Internal memory data width (IMDWx) bits, 5-12, 5-32, 5-37, 5-47, 6-86, A-60
Internal memory DMA Count (Cx) registers, A-87 Internal memory DMA index (IIx) registers, 6-7, 6-9, A-87 Internal memory DMA modifier (IMx) registers, 6-7, 6-9, A-87 Internal serial clock (ICLK) bit, 10-24 Internal timer, 8-3 Internal transmit frame Sync (ITFS) bit, 10-25 Interprocessor Messages and Vector Interrupts, 7-76 Interrupt and Timer Pins, 13-33 Interrupt controller, 3-2 Interrupt-driven I/O, external port (INTIO) bit, 6-43, 6-128, 6-130, A-82 Interrupt-driven transfers External port, 6-130 Link port, 6-134 Serial port, 6-136 Interrupt enable, global (IRPTEN) bit, 3-41, A-4 Interrupting IDLE, 3-48 Interrupt input (IRQ2-0) pins, 13-11, 13-33 Interrupt input x interrupt (IRQxI) bit, A-28 Interrupt latch (IRPTL) register, A-27 Interrupt latency, 3-36 Cache miss, 3-36 Delayed branch, 3-36 IRQx and multiprocessor vector standard, 3-38 Single-cycle instruction, 3-36 Writes to IRPTL, 3-35 Interrupt mask (IMASK) register, 3-41, A-31 Interrupt mask/mask pointer, link port (LIRPTL) register, 3-42, 3-46, A-34
Interrupt mask pointer (IMASKP) register, 3-45, A-32 Interrupt nesting enable (NESTM) bit, 3-45 Interrupts, 1-9, 2-8, 3-1, 3-34, 4-9, 5-41, 5-42, 5-46, A-28, G-7 Arithmetic, 3-43 Clear interrupt (CI) Jump, 3-48 Conditions for generating interrupts, 10-60 Data Address Generators (DAGs), 4-14 Delayed branch, 3-18 DMA interrupts, 6-125 to 6-130 DMA slave, 9-19 Hold off, 3-39 Idle instructions, 3-48 Inputs (IRQ2-0), 3-34 Interrupt sensitivity, 3-40, A-10, G-7 Interrupt vector table, 5-37, B-1 IRPTL write timing, 3-35 Latch status for, A-27 Latency (See interrupt latency) Link port, spurious interrupts, 9-9, 9-22 Link ports, 9-17 Masking and latching, 3-41, 3-42, 6-126, 9-19 Multiprocessing, 3-49 Nested interrupts, 3-45, A-4 Non-maskable RSTI, A-60 PC stack full, 3-53 Program control (PCI) interrupts, 6-67 Response, 3-34 Re-using, 3-47 Sensitivity, interrupts, A-10 Software, 3-35 Spurious, link port, 9-9, 9-22 Timer, 3-51 Vector interrupts, 7-76, G-9 VIRPT, 3-45 Interrupts and sequencing, 3-34
Interrupt vector, sharing, 10-7 Interrupt x edge/level sensitivity (IRQxE) bits, 3-40, A-10 Interval timer, 3-50 INTEST instruction, 12-4 IO architecture, 1-19 IOFLAG value register, A-38 I/O interface to peripheral devices, 10-1 I/O interrupt conditions, 6-124 IOP addresses for SPI registers, 11-9 I/O processor, 1-2, 1-14, 5-16, 6-1, 6-9, 6-81, 6-96 DMA channel priority, 6-22 External port modes, 6-29 Link port modes, 6-81 Registers, G-6 Serial port modes, 6-95 Shadow registers, 7-55 Status, 6-121 I/O processor registers, listed, A-48 IOP Shadow Registers, 7-55 I/O stop (IOSTOP) bit, 12-9 IR decode space, 12-2
J
Joint Electronic Device Engineering Council (JEDEC), 8-9 JTAG boundary register, 12-18 data output (TDO) pin, 13-15 emulation, designing for, 13-49 emulator references, 13-56 in circuit emulator (ICE), 12-3 instruction register codes, 12-4 interface, access to features, 12-2 interface pins, 13-41 logic, 12-3 pod connector, illustrated, 13-58 port, 1-2, 1-16, 12-1, 12-3, 13-49
JTAG (continued) port, defined, G-7 references, additional documents, 13-56 scan chain, restrictions, 13-54 signals, 13-54 signals, listed, 13-52 specification, IEEE 1149.1, 12-1, 12-3, 12-4, 12-29 test access port (TAP), 12-3, 13-49, 13-55 test clock (TCK) pin, 13-15 test data input (TDI) pin, 13-15 test-emulation port, 12-1 to 12-29 test mode select (TMS) pin, 13-15 test reset (TRST) pin, 13-16 JUMP instructions, 3-1, 3-13, G-7 Clear interrupt (CI), 3-14, 3-48 Loop abort (LA), 3-14, 3-24 Pops status stack with (CI), 3-45
L
Latch, characteristics, 12-1 Latch status for interrupts, A-27 Latchup, 13-60 Late frame sync (LAFS) bit, 10-25 Latency, 3-5, 3-11, 3-36, 6-30, 6-96, 6-108, 6-125, G-3 Direct read, 7-57 DMA status, A-91 Input Synchronization, 13-33 Instruction fetch, external memory, 7-3 I/O processor registers, A-48 Link ports, 9-13 Shadow registers, 7-55 Slave write FIFO, 7-56 Synchronous write, 7-22 System registers, 3-5 Vector interrupt, 3-38 Least significant bits (LSB), 3-9
LEFTO operation, A-17 LEFTZ operation, A-17 Length (Lx) registers, 4-2, 4-16, A-47, G-7 Less or Equals (LE) condition, 3-54 Less than (LT) condition, 3-54 Level-sensitive interrupts, 3-40, A-10, G-7 Line run length (inductance), 13-67 Line termination, link port, 9-30 Link buffer assignment (LARx) bits, 6-82 Link buffer DMA chaining enable (LxCHEN) bit, 6-82, 6-85, 9-7, A-93 Link buffer DMA enable (LxDEN) bit, 6-15, 6-82, 6-85, 9-7, 9-13, A-93 Link buffer enable (LxEN) bit, 6-82, 6-83, 9-7, A-93, A-94 Link buffer extended word size (LxEXT) bit, 6-83, 9-7, 9-13, A-92, A-93, A-95 Link buffer receive packing error status (LRERRx) bits, 9-22, A-96 Link buffer status (LxSTATx) bits, 9-13, A-96 Link buffer-to-port connections, 9-3 Link buffer transmit/receive (LxTRAN) bit, 6-82, 6-86, 9-7, A-93, A-94 Link buffer x DMA interrupt mask (LPxMSK) bit, A-35 Link buffer x DMA interrupt mask pointer (LPxMSKP) bit, A-35 Link (LSP) registers, listed, A-50 Link port, 1-2, 1-15, 1-19, 9-1, 9-10, G-7 Booting, 6-85, 6-88, 6-89, 6-113, 6-114 Buffers, 6-83, 6-133, 9-3 Data transfers, cluster, 7-93 Designing for link ports, 9-30 DMA, 6-85, 6-86, 6-112, 9-4, 9-16 Enhancements, 1-19 Handshake timing, 9-10 Identifying the one to service, 9-21 Interrupt-driven transfers, 6-134
Link port (continued) Interrupts, 9-17 to 9-22 Line termination, 9-30 Priority modes, 6-83 Status, 6-131, 6-133 Throughput, 9-31 Token passing, 9-27 Transmission errors, 9-22 Link port acknowledge (LxACK) pins, 9-2 to 9-12, 13-12, 13-16 Link port assignments (LABx) bits, A-96 Link port boot (LBOOT) pin, 6-70, 6-89, 6-114, 13-12 Link port buffer (LBUFx) registers, 6-4, 9-3, 9-12, A-92 Link port buffer x DMA interrupt (LPxI) bit, A-34 Link port clock divisor (LxCLKD) bit, A-93, A-95 Link port clock divisor (LxCLKDx) bits, 9-8, 9-10, 13-26, 13-28 Link port clock (LxCLK) pins, 9-2, 9-10, 9-12, 13-11, 13-16 Link port control (LCTL) register, 6-6, 6-81, 9-9, 9-13, A-92, A-93 Link port data (LxDAT7-0) pins, 7-8, 9-2, 9-3, 9-12, 13-11, 13-16 Link port data path width (LxDPWID) bit, 9-8, 9-10, A-94, A-96 Link port DMA channel priority rotation (LDCPR) bit, 6-82, 6-83, A-61 Link port DMA interrupts, latch and mask bits, 3-43 Link port-external port rotating DMA channel priority (PRROT) bit, A-61 Link port input filter circuits, 13-60 Link port interrupt DMA summary interrupt (LPISUMI) bit, 3-43, 6-126, A-29, A-34
Link port interrupt (LIRPTL) register, 3-46, A-34 Link port pulldown resistor, caution when enabled, 9-9 Link port pulldown resistor disable/enable (LxPDRDE) bit, 9-8, 9-12, A-94, A-95 Link port receive mask (LxRM) bits, 6-131, A-98 Link port receive request status (LxRRQ) bits, 6-131, A-99 Link port service request interrupt (LSRQI) bit, 6-126, 6-133, 9-9, 9-17, A-30 Link port service request (LSRQ) register, 6-134, 9-19, A-93, A-98 Link port transmit mask (LxTM) bits, 6-131, A-98 Link port transmit request status (LxTRQ) bits, 6-131, A-98 Logical operations, 2-9 Logical shifts, 2-1, G-11 Long word, 5-25, 5-48, 5-50 Data access, 5-10, 5-48, G-9 Data moves, 5-48 Data storage, 5-2 SIMD mode, 5-80 Single Data, 5-76 SISD Mode, 5-78 Loop, 3-1, 3-22, G-8 Address stack, 3-5, 3-29 Conditional loops, 3-23 Counter stack, 3-30, 3-31 End restrictions, 3-25 Status, 3-30 Termination, 3-3, 3-24, 3-30, 3-31, 3-54 Loop abort (LA) Jump, 3-14, 3-24 Loop address stack, 3-29 Loop address stack (LADDR) register, A-45 Loopback mode, 9-3
Loop counter expired (LCE) condition, 3-22, 3-56 Loop counter (LCNTR) register, 3-31, 3-32, A-45 Loop counter stack, 3-30 Loop counter stack, access, A-45 Loops and sequencing, 3-22 Loop stack empty (LSEM) bit, 3-31, A-21 Loop stack overflow (LSOV) bit, 3-31, A-21 Low active transmit frame sync (LFS, LTFS and LTDV) bit, 10-25, A-102 L unit (See ALU)
M
Mantissa (floating-point operation), 2-9 Masking interrupts, 3-41 Masking interrupts, link port, 9-19 Master In Slave Out (MISO) pin, 11-6 Master mode, 6-48, 6-50 16-bit external transfers, 6-52 32-bit external transfers, 6-51 32-bit internal transfers, 6-54 48-bit internal transfers, 6-53 64-bit internal transfers, 6-53 Controls, 6-51 Internal address/transfer size generation, 6-52 SPI, 11-25 Transfer Size, 6-52 Master mode enable (MASTER) bit, 6-32, 6-47 to 6-66, A-82 Master Out Slave In (MOSI) pin, 11-6, 11-22 Maximum burst length (MAXBL) bit, A-84 Max/Min function, 2-9
Memory, 1-2, 1-11, 5-1, 5-8, 5-16, 5-24, G-7 Access priority, 5-5, 5-7, 5-39, 5-82 Access types, 5-40, 5-46, G-8 Access word size, 5-47 Addressing external memory, A-90 Asynchronous interface, 5-43, G-8 Banked external memory, 7-9 Banks, 5-2, 5-22, 5-38 Banks of memory, 7-9, G-8 Blocks, 5-2 to 5-8, 5-18, 5-37 Blocks, defined, G-8 Booting, 5-23, 5-35 Boot memory, 5-23, 7-10 Boot memory, defined, G-1 Columns of memory, 5-8 Data types, 5-47 Enhancements, 1-18 Internal memory addresses, A-43 Mixing 32-bit and 48-bit words, 5-26 Mixing 40/48-bit and 16/32/64-bit data, 5-24, 5-31 Mixing word width SIMD mode, 5-84 SISD mode, 5-82 Multiprocessor, 5-19 Synchronous interface, 5-43, G-8 Transition from 32-bit/48-bit data, 5-30 Unbanked memory, 5-22 Memory map restrictions, A-42 Memory mapped devices, 6-12 Memory mapped registers, 5-16, A-47, A-51 Memory read RD pin, 6-54, 7-9, 7-38, 7-89, 7-96, 13-13, 13-16
Memory select (MSx) pins, 5-22, 5-39, 7-8, 7-9 to 7-96, 7-97, 7-107, 8-4, 8-8, 8-32, 13-12, A-75 Memory test (MTST) bit, 12-11 Memory test shift (MEMTST) register, 12-2, 12-13 Memory transfers, 5-53 16-bit (Short word), 5-54 32-bit (Normal word), 5-62 40-bit (extended-precision normal word), 5-70 64-bit (Long word), 5-76 Message (MSGRx) registers, 7-76, 7-77, A-77 M field, address, A-42 Microprocessor interface, 7-85 μ-law companding Mnemonics (See instructions) Mode control 1 (MODE1) register, A-3 Mode control 2 (MODE2) register, A-10 Mode control 2 shadow (MODE2_SHDW) register, A-78 Mode mask (MMASK) register, 3-44, 4-14, A-8 Mode register Defined for SDRAM, 8-3 Mode register set (MRS) command, 8-19 Modes, multichannel, 10-2 Modified addressing, 4-10, G-8 Modify address, 4-1, G-8 Modify instruction, 4-14, 4-17 Modify (Mx) registers, 4-2, 4-15, A-47, G-8 Modulo addressing, 1-8 Most significant word first, packing (MSWF) bit, 6-32, 6-42, A-81 Multichannel buffered serial port, McBSP (See serial ports)
Multichannel mode, 10-2, G-11 Multichannel receive channel select (MRCSx) registers, A-114 Multichannel selection registers, 10-57 Multichannel transmit compand select (MTCCSx) registers, A-113 Multifunction computations, 2-34, G-8 Multi-master error (MME) bit, 11-30 Multiple DSP connection to JTAG header illustrated, 13-54 Multiple DSP systems, 13-51 Multiplier, 1-6, 2-1, G-9 Clear operation, 2-18 Input modifiers, 2-21 Instructions, 2-15, 2-20 Operations, 2-15, 2-19 Result (MRF/B) registers, 2-15, 2-16 Rounding, 2-18 Saturation, 2-18 Status, 2-8, 2-19, 2-20 Multiplier fixed-point overflow status (MOS) bit, 2-20, A-19 Multiplier floating-point invalid (MI) bit, 2-19, A-16 Multiplier floating-point invalid status (MIS) bit, 2-20, A-20 Multiplier floating-point overflow status (MVS) bit, 2-20, A-19 Multiplier floating-point underflow (MU) bit, 2-19, A-16 Multiplier floating-point underflow status (MUS) bit, 2-20, A-20 Multiplier negative (MN) bit, 2-19, A-14 Multiplier overflow (MV) bit, 2-19, 3-55, A-15 Multiplier results (MRFx and MRBx) registers, listed, A-24 Multiplier signed (MS) bit, 3-55 Multiply-accumulator (See multiplier)
Multiprocessing Booting, 6-71, 7-108 Bus arbitration, 7-93 Cluster Multiprocessing, 7-90, 7-91 Data flow multiprocessing, 7-90 Direct read and write, 7-109 DSP Interface, 7-87 Interface, 1-14, 1-19 Interface Status, 7-112 Interrupts, 3-49 Local memory, 7-85 Memory, 5-16, 7-3, 7-12, A-43, G-9 Multiprocessing pins, 7-94 Multiprocessing System Architectures, 7-90 SDRAM Dual processor system example, 8-24 SIMD processing, 7-93 SPI booting, 11-42 System, G-9 System architectures, 7-90 Vector interrupt, 3-49, G-9 Multiprocessing operation, SDRAM, 8-24, 8-38 M unit (See multiplier)
N
Nearest, round-to, 2-5 Negate breakpoint (NEGx) bit, 12-9 Nested interrupt routines, 3-3 Nesting Multiple interrupts (NESTM) bit, A-4 No boot mode (NOBOOT) bit, 12-11 Non-counter-based loops, 3-27, 3-28 (See also counter-based loops) NOP command, 8-22, 8-37
Normal word, 5-25, 5-50 Accesses with LW, G-9 Data access, 5-50 Data storage, 5-2 Mixing 32-bit data and 48-bit instructions, 5-25 Multiprocessor memory, 5-21 SIMD mode, 5-64, 5-68 SISD mode, 5-62, 5-66 Not, Logical, 2-9 Not-a-number (NAN), 2-5 Not Equal (NE), 3-54
O
Open drain drivers Support, 1-15 Operands, 2-5, 2-9, 2-15, 2-23, 2-30, G-3 Operands and results Storage for, A-23 Optimizing cache usage, 3-11 Optimizing DMA throughput, 6-139 Or, Logical, 2-9 Overflow (See ALU, multiplier, or shifter)
P
Paced master mode, 6-48, 6-54 Packed instruction mode (IPACK) bit, A-62 Packing 16- to 32-bit, 7-69 16- to 48-bit, 7-75 32- to 64-bit from host, 7-66 40- to 48-bit from host, 7-74 8- to 48-bit, 7-68, 7-76 Data, 1-12, 6-12, 6-42, 6-49, 6-67, 7-56 External port status, 6-129
Packing (continued) Host data packing, 7-59 Link port status, 6-133 Packing mode combinations, 5-102, 7-60 SPI word packing, 11-37 Packing 16-bit to 32-bit Words (PACK) bit, 6-97, 6-98, 6-111, A-101 Packing enable (PACKEN) bit, SPI port, 11-14, A-119 Packing mode (PMODE) bits, 5-36, 6-42, 7-110, A-81 Packing status, external port (PS) bits, 6-127, A-85 Parallel assembly code (See Multifunction computation or SIMD operations) Parallel operations, 2-34, G-8 Pass function, 2-9 PCB transmission line, 13-64 PC stack pointer (PCSTKP) register, 3-20 Peripherals, 1-2, 1-11, 5-7, 7-4, 7-9, 9-1, 9-30, G-7, G-9 connecting to link ports, 9-31 I/O interface to, 10-1 Pin, reset states, 13-22 Pin connections, SDRAM, 8-7 Pin descriptions, 13-2 Pin states at reset, 13-19, 13-22 Pipelined SBSRAMs (See SBSRAM), 7-40 Pipelining with the SDBUF bit, A-75 Plane, ground, 13-68 PLL-based clocking, 13-24 PLL ratios, 13-25 Pod logic DSP 2.5V pod logic, 13-58 Pod logic, DSP 2.5V pod logic, 13-57, 13-59
Pop Loop counter stack, 3-31 Program counter (PC) stack, 3-13 Status stack, 3-45 Porting from previous SHARCs Assembly syntax, 2-31 Booting, 6-70, 6-88 Bus lock, 7-54 Circular Buffer Enable (CBUFEN) bit, 4-4, 4-14 Conditional instructions, 7-10 Instruction Word Transfer (IWT) bit, 7-49 Link ports, 9-1, 9-9 Multiprocessor Memory Space Waitstates (MMSWS) bit, 7-13 Paged DRAM boundary, 7-12 Performance, 2-39 Symbol changes, 1-20 Port rotate rotating DMA channel priority, link-external ports (PRROT) bit, 6-24, 6-82, 6-85, A-61 Post-modify addressing, 1-8, 4-1, 4-10, 4-23, G-10 Power sequence, JTAG emulator, 13-56 Power supply, analog (AVDD) pin, 13-5 Power supply, analog return (AGND) pin, 13-5 Power supply, core (VDDINT) pin, 13-16 Power supply, ground (GND) pin, 13-10 Power supply, I/O (VDDEXT) pin, 13-16 Power-up options, SDRAM, 8-19 Precharge command, 8-29, 8-32, 8-33, 8-37, A-74 Precharge command, defined, 8-3 Precision, 1-5, 2-4, 2-5, 2-6, G-10 Pre-modify addressing, 1-8, 4-1, 4-10, 4-23, G-10 Primary registers, 1-9, 2-30
Priority Access, 7-103 Fixed and rotating, 7-92 Rotating priority, 7-92 Rotating Priority Arbitration Example, 7-101 Priority, DMA requests (See also DMA channel priority, Rotating priority, and Fixed priority), 6-43 Priority, external port-bus (PRIO) bit, 7-103, A-83 Priority access (PA) pin, 7-89, 7-94, 7-103, 13-13, 13-16, A-83 Priority bus arbitration select, rotating (RPBA) pin, 13-13 Probes, oscilloscope, 13-70 Processing elements, 1-1, 1-6, 1-7, 2-1, 2-31 Processing element Y enable (PEYEN) bit, SIMD mode, 2-4, 2-38, 4-3, 4-6, 4-18, 5-34, 5-39, A-5 Processor clock frequency, 10-1 Processor core, 1-5 Access to link buffers, 9-13 Buses, 1-10, 1-17 Enhancements, 1-17 Program control interrupt (PCI) bit, 6-25, 6-26, 6-67, 6-126 Program counter (PC) register, 3-2, A-41 Program counter (PC) relative address, 3-14, G-3 Program counter (PC) stack, 3-52 Program counter (PC) stack empty (PCEM) bit, 3-53 Program counter (PC) stack full (SOVFI) interrupt, 3-53 Program counter shadow (PC_SHDW) register, A-77 Program counter stack empty (PCEM) bit, 3-53, A-20
Program counter stack full (PCFL) bit, 3-53, A-20 Program counter stack (PCSTK) register, 3-5, A-44 Program counter stack pointer (PCSTKP) register, 3-5, 3-53, A-44 Program fetch (See program sequencer) Program flow, 3-8 Program memory address (PMDAx) register, 12-15 Program memory bus exchange (PX) register, 1-11, 5-10, 5-38, A-25, A-77 Program Memory (PM) bus, 1-2 Program sequence address (PSAx) register, 12-15 Program sequencer, 3-1 to 3-66 Control, 1-7 Latency, 3-5 PSx, DMx, IOx, & EPx (Breakpoint) register, 12-13, 12-15 Pull-down resistors, link port, 9-12 Push Loop counter stack, 3-32 Program counter (PC) stack, 3-13 Status stack, 3-44
R
RAS-to-CAS delay, 8-4, 8-12 Read command, SDRAM, 8-34 Read command, SDRAM, pin state during, 8-35 Read commands, SDRAM, 8-34 Read (RD) pin, 6-54, 7-9, 7-38, 7-89, 7-96, 13-13, 13-16 Reads Direct read latencies, 7-57 Direct reads, 7-109 Direct reads & writes, 7-57 Slave, 7-55
Ready-Host Acknowledge (REDY) pin, 6-56, 7-42, 7-44, 7-51, 7-79, 7-89, A-61 Receive clock (RCLKx) pins, 13-16 Receive data buffer status (RXS) bits, 6-138 Receive data (RXx) registers, 6-4 Receive overflow status (ROVF) bit, 6-137 Reciprocal function, 2-9 Refresh command (REF), 8-2, 8-29, 8-30, 8-37, 8-39 Refresh cycle, 8-3, A-74 Register codes JTAG instruction, 12-4 Registers, A-1 to A-121 Boundary, 12-17 data file registers, listed, A-23 Data (R0-R15, S0-S15) registers, A-23 Decode address, 3-2 files, 2-30, 10-10, G-3 files, (See also data register files), 2-30 groups (I/O Processor), A-49 I/O processor registers, listed, A-48 JTAG boundary, 12-18 latency (See latency) load broadcasting (See broadcast load) Memory mapped, A-47, A-51 Neighbor, 5-48, 5-49, 5-76, 5-78, 5-80 Universal (Ureg) registers, 2-40 write file precedence, 2-30 Registers, complementary (See Complementary registers) Registers, neighbor (See neighbor registers) Register-to-register Moves, 2-45, 5-11 Swaps, 2-44, G-10 Transfers, 2-43 Register writes and effect latency, 10-30 Reset interrupt (RSTI) bit, A-27 Reset out (RSTOUT) pin, 13-14
Reset (RESET) pin, 7-89, 13-13, 13-29, 13-61 Input hysteresis, 13-61 Pin states at reset, 13-19 Resistors, Pull-up/down, 13-16, 13-22, 13-54 restrictions on ending loops, 3-25 on short loops, 3-26 restrictions, delayed branch, 3-19 Restrictions on short loops, 3-26 Results (MRF/MRB) registers, 2-32 Return (RTI/RTS) instructions, 3-13, 3-35 ROM boot access mode (RBAM) bit, A-67 ROM boot waitstates (RBWS) bit, A-67 Rotate bits, 2-23 Rotate (See swap operator) Rotating priority, 6-22, 6-24, 6-44, 6-84 Rotating Priority Bus Arbitration (RPBA) pin, 7-94, 7-98 Rounded output, 2-22 Rounding 32-bit data (RND32) bit, A-5 Rounding mode, 2-4, 2-7, A-5 RS-232 device, restrictions, 10-7 RUNBIST instruction, 12-4
S
SAMPLE instruction, 12-4 Sampling edge for data and frame syncs, 10-43 Saturation (ALU saturation mode), G-10 Saturation maximum values, 2-19 SBSRAM DSP pins, 7-37 Hold off, 7-40 Partial Truth table, 7-39 Signal mapping figure, 7-37 Support, 7-39 Using External SBSRAM, 7-36 Scale (floating-point operation), 2-9
SDRAM Accessing, 8-25 Block diagram, 8-24 Calculating the refresh counter, 8-13 Configuring, 8-10 Controller commands, 8-31 controller interface, illustrated, 8-5 Controller standard operation, 8-22 Device densities and page size combinations, 8-28 DMA transfers, 8-37 Dual processor system example, 8-24 page size (SDPGS), 8-18 Page sizes supported, 8-10 Pin connections, 8-7 Powering up after reset, 8-30 Selecting the active command delay, 8-20 Specifications, 8-1 Timing specifications, 8-8 SDRAM A10 (SDA10) pin, 8-8, 13-14 SDRAM address mapping 128 Mbit, 8-28 256 Mbit, 8-28 64 Mbit, 8-27 SDRAM bank cycle time tRTP, 8-4 SDRAM buffer (SDBUF) bit, 8-16, 8-17, A-75 SDRAM burst length, 8-2 SDRAM CAS latency (SDCL) bit, A-73 SDRAM clock enable (SDCKE) pin, 8-8, 8-32, 13-14 SDRAM clock ratio (SDCKR) bit, 8-12, A-75 SDRAM clock (SDCLK) pin, 8-1, 8-6, 8-8, 8-15, 13-14 SDRAM column address select CAS pin, 8-7, 8-32, 13-6 SDRAM controller, 8-22
SDRAM controller (SD) registers, listed, A-50 SDRAM control (SDCTL) register, 8-2 to 8-39, A-73 SDRAM control (SDCTL) register, bit definitions, A-73 SDRAM data mask pin (DQM), 8-7, 13-9 SDRAM device memory bank (SDBN) bit, 8-15, A-75 SDRAM external address (EA) pin, 8-26 SDRAM external memory bank 0 enable (SDEMx) bit, 8-16, 8-37, A-74 SDRAM external memory bank (SDBS), 8-16 SDRAM interface, 1-13, 8-29 SDRAM interface, storing configuration data, 8-11 SDRAM latency mode, 8-33 SDRAM page length, specifying, 8-19 SDRAM page size, defined, 8-3 SDRAM page size (SDPGS) bit, 8-18, 8-19, A-74 SDRAM Parallel refresh command, 8-29 SDRAM power up mode (SDPM) bit, 8-19, 8-32, A-74 SDRAM power up sequence (SDPSS) bit, 8-19, 8-32, A-74 SDRAM refresh counters, 8-24, 8-38 SDRAM refresh counter value (SDRDIV) register, 8-3, 8-13, 8-34, A-72 SDRAM row address select (RAS) pin, 8-8, 8-32, 13-13 SDRAM SDCLK0 disable (DSDCTL) bit, 8-15, A-73 SDRAM SDCLK1 disable (DSDCK1) bit, A-73 SDRAM Self-refresh mode Entering and exiting, 8-31 SDRAM self refresh (SDSRF) bit, 8-20, 8-31, 8-39, A-74
SDRAM tras (SDTRAS) bit, 8-21, A-73 SDRAM trcd (SDTRCD) bit, A-75 SDRAM trp (SDTRP) bit, 8-21, A-73 SDRAM write enable SDWE pin, 8-8, 8-32, 13-14 Secondary processing element, 2-37 Secondary registers, 1-9, 2-32, 2-43, 4-4, 4-6, A-4 Secondary registers for computational units (SRCU) bit, 2-33, A-3 Secondary registers for DAGs (SRDxH/L) bits, 4-4, A-3, A-4 Secondary registers for register file (SRRFH/L) bit, A-4 Selecting the frame sync options (FS_BOTH), 10-50 Selecting the I2S transmit and receive channel order (L_FIRST), 10-49 Self refresh command (SREF), 8-3, 8-20, 8-39 Semaphores, 7-57, 7-110, G-10 Sensing interrupts, 3-40 Serial clock (SCLKx) pins, 10-4, 13-14 Serial peripheral interface (See SPI) Serial port, 10-1 to 10-67 Data types, 10-37 Disabling the serial port(s), 10-8 Enabling DMA (SDEN), 10-51 Enabling I2S mode (OPMODE, MCE), 10-49 Enabling master mode (MSTR), 10-50 Interrupts, 10-60 timing example, word select timing in I2S mode, 10-52 Word formats, 10-35 Serial port block diagram, 10-3 Serial port chained DMA enable (SCHEN) bit, 6-97, 6-99, 10-28, A-102 Serial port clock, internal clock (ICLK) bit, A-101
Serial port clock, internal clock (MSTR) bit, I2S mode only, A-101 Serial port connections, 10-3 Serial port control registers and data buffers, 10-9 Serial port control (SPCTLx) registers, 6-6, 6-96, 10-14, 10-15, A-100, A-101 Serial port count (CNTx) registers, A-113 Serial port current channel selected (CHNL), A-110 Serial port data buffer status (DXS_A) bit, A-104 Serial port data direction control (DDIR) bit, A-103 Serial port data independent transmit frame sync (DITFS) bit, A-102 Serial port divisor (DIVx) registers, 10-4, A-112 Serial port DMA chaining, 10-65 Serial port DMA channels, 10-59 Serial port DMA enable (SDEN) bit, 6-15, 6-97, 6-99, 6-109, 10-28, A-102 Serial port DMA interrupt (SPxI) bit, A-29 Serial port DMA parameter registers, 10-61 Serial port DXA data buffer status (DXS_A) bit, 13-10 Serial port DXA error status (DERR_A) bit, A-104 Serial port DXB data buffer status (DXS_B) bit, 13-10, A-104 Serial port DXB error status (DERR_B) bit, A-103 Serial port enable (SPEN) bit, 6-96, 6-109, 6-136, 10-28, A-101 Serial port frame sync (IFS or IRFS) bit, internal, A-102 Serial port FS both enable (FS_BOTH) bit, A-103 Serial port interrupts, 10-7 Serial port interrupts, priority of, 10-8
Serial port interrupt (SPxI) bit, 10-8, A-29 Serial port late frame sync (LAFS) bit, A-102 Serial port loopback, 10-46 Serial port loopback mode (SPL) bit, A-110 Serial Port (LSP) registers, listed, A-50 Serial port multichannel frame delay (MFD) bit, A-109 Serial port multichannel mode enable (MCE) bit, A-109 Serial port multichannel mode pairings SPORT0 and SPORT2, SPORT1 and SPORT3, 10-52 Serial port number of multichannel slots (NCH) bit, A-109 Serial port operation mode (OPMODE), A-101 Serial port operation modes, 10-14, 10-47 Serial port pin/line terminations, 10-66 Serial port receive compand registers (MRxCCSx), A-114 Serial port receive control (SRCTLx) registers, 6-6 Serial port receive data status (RXS_A) bit, A-104 Serial port receive select registers (MRxCSx), A-114 Serial port receive underflow status (ROVF_A) bit, A-104 Serial port registers, listed, 10-10 Serial port reset, 10-8 Serial ports Features, 10-1 Moving data between SPORTS and memory, 10-58 Named, 10-1 Serial port (SPORT), 1-14, G-10 Buffers, 6-97, 6-136, 6-139 DMA, 6-99, 6-100, 6-112 Interrupt-driven transfers, 6-136
Serial port (SPORT) (continued) Multichannel operation, G-11 Priority modes, 6-99 Status, 6-135 Transfer modes, 6-99 Serial port transmit buffer (TXx) registers, A-111 Serial port transmit compand registers (MT2CCSx and MT3CCSx), A-113 Serial port transmit data status (TXS_A) bit, A-104 Serial port transmit select registers (MT2CSx and MT3CSx), A-113 Serial port transmit underflow status (TUVF_A) bit, A-104 Serial port Word length, 10-36 Serial scan path, 12-5 Serial scan paths, 12-5 Serial shift register (EMUPX), 12-6 Serial test access port (TAP), 12-1 Serial word endian (SENDN) bit, 6-96, 6-97, 10-28, A-101 Serial word length (SLEN) bits, 6-96, 6-97, 6-109, 6-111, 10-28, 10-49, A-101 Set, bit, 2-23 Setting serial port modes, 10-9 Setup time, inputs, 13-19 S field, address, A-42 Shadow write FIFO, 5-23, 7-58 SHARC, G-11 Background information, 1-16 (See also Porting from previous SHARCs) SHARC ICE hardware, compatibility, 12-7 Shift bits, 2-23 Shifter, 1-6, 2-1, 2-23, G-11 Instructions, 2-28 Operations, 2-23, 2-27 Status flags, 2-27 Shifter input sign (SS) bit, A-17 Shifter operations, A-17
Shifter overflow (SV) bit, 3-55, A-17 Shifter zero (SZ) bit, 3-55, A-17 Short (16-bit data) sign extend (SSE) bit, 2-4, 5-51, A-5 Short word, 5-25, 5-51 Data access, 5-51 Data storage, 5-2 SIMD mode, 5-56, 5-60 SISD mode, 5-54, 5-58 Signal For Cluster Multiprocessor Systems, 7-89 Signal integrity, 13-66 Signal skew, minimizing, 13-54 Signed data, 2-6 Signed input, 2-22 Sign extension, A-5 Silicon revision number, A-78 SIMD mode, 3-54, 5-51, A-5 Complementary registers, 2-40 Computational operations, 2-43 Defined, 2-37 Implicit operations, 2-40 Multiprocessing, 7-93 Status flags, 2-46 Single serial shift register path, 12-1 Single-step (SS) bit, 12-8 Single-word transfers, 10-65, G-11 SISD mode, 5-51 Defined, 1-7 Unidirectional register transfer, 2-45 Slave direct reads and writes (See direct read and direct write) Slave mode, 6-48, 6-55, 6-144 Operation, 6-55 SPI, 11-28 Transfer size, 6-57 Slave reads and writes, 7-55 Slave write FIFO, 7-50 Slave write FIFO data pending (SSWPD) bit, synchronous, A-70
Slave write latency, 7-56 Slave write pending (SWPD) bit, 7-112, A-70 Software interrupt x, user (SFTxI) bit, A-31 Software reset (SRST) bit, A-60 Software reset (SYSRST) bit, 12-8 Specifications, timing, 13-25 SPI Block diagram, 11-2 Boot loader kernel, 11-35 Configuring and enabling, 11-9 Data word formats, 11-21 Disabling the SPI system, 11-30 DMA, 11-32 Error signals and flags, 11-29 Examples, programming, 11-44 Features, 11-1 Functional description, 11-2 hang in receive data buffer, A-121 Interface, enabling, 11-9 IOP registers, 11-9 Master mode, 11-25 Master mode DMA operation, 11-32 Master mode operation, 11-25 Slave mode, 11-28 Slave mode DMA operation, 11-28, 11-33 system, configuring and enabling, A-117 Transfer formats, 11-15, 11-21 SPI baud rate (BAUDR) bit, 11-12, A-118 SPI clock phase (CPHASE) bit, 11-12, A-117 SPI clock polarity (CP) bit, 11-12, A-117 SPI clock rate (SPICLK) pin, 11-3, 11-22, 13-15 SPI control (SPICTL) register, 11-4, 11-9, 11-21, 11-25, 11-42, A-117 SPI data fetch (GM) bit, 11-14, A-119 SPI data format (DF) bit, 11-12, A-117
SPI device select SPIDS pins, 11-4 to 11-33, 13-15 SPI enable (SPIEN) bit, 11-12, A-117 SPI flag select (FLS) bit, 11-13, A-118 SPI interrupt (LIRPTL) register, A-34 SPI master in slave out (MISO) pin, 13-12 SPI master out slave in (MOSI) pin, 13-12 SPI master select (MS) bit, 11-12, A-117 SPI MISO pin disable (DMISO) bit, 11-13, A-119 SPI multimaster error (MME) bit, 11-17, A-115 SPI open drain output enable (OPD), 11-13, A-119 SPI packing enable (PACKEN) bit, 11-14, A-119 SPI Port (LSP) registers, listed, A-50 SPI programmable slave select enable (PSSE) bit, 11-13, A-118 SPI receive data buffer (SPIRX), 11-9, 11-20, A-120 SPI receive DMA enable (RDMAEN) bit, 11-14, A-119 SPI receive DMA interrupt latch (SPIRI) bit, A-34 SPI receive DMA interrupt mask pointer (SPIRMSKP) bit, A-36 SPI receive DMA interrupt mask (SPIRMSK) bit, A-35 SPI reception error (RBSY) bit, 11-19, 11-31, A-116 SPIRX interrupt enable (SPRINT) bit, 11-12, A-117 SPI seamless operation (SMLS) bit, 11-13, A-118 SPI selection of SPIDS (DCPH0), 11-13, A-118 SPI send zero (SENDZ) bit, 11-14, A-119 SPI shift register, 11-21, 11-25 SPI sign extend (SGN) bit, 11-14, A-119
SPI status (SPISTAT) register, 11-9, 11-15, 11-30, A-115, A-121 SPI transfer complete (SPIF) bit, 11-17, A-115 SPI transmission error (TXE) bit, 11-30 SPI transmit buffer (SPITX) register, 11-9, 11-20, A-121 SPI transmit DMA enable (TDMAEN) bit, 11-12, A-118 SPI transmit DMA interrupt latch (SPITI) bit, A-34 SPI transmit DMA Interrupt mask pointer (SPITMSKP) bit, A-36 SPI transmit DMA interrupt mask (SPITMSK) bit, A-35 SPITX interrupt enable (SPTINT) bit, 11-12, A-117 SPI word length (WL) bit, 11-12, A-117 SRAM (memory), 1-2 SREF command, pin state during, 8-40 Stacking status during interrupts, 3-44 Stack overflow/full interrupt (SOVFI) bit, A-27 Stacks and sequencing, 3-52 Status, 5-46 Host interface, 7-76 Link port, 9-13 Status registers, 3-3 Status stack, 3-44 Pop, 3-45 Push, 3-44 Status stack empty (SSEM) bit, 3-44, A-20 Status stack overflow (SSOV) bit, 3-44, A-20 Sticky status (STKYx/y) registers, 2-8, 2-20, A-16, A-18, A-19 Subroutines, 3-1, G-11 Subtract/add, 2-9 Subtract instructions, 2-36 Subtract/multiply, 2-1, G-9
Subtract with borrow, 2-9 S unit (See shifter) Suspend Bus Three-state (SBTS) pin, 7-44, 7-54, 7-83, 7-89, 13-14 Swap register operator, 2-44, G-10 Switching frequencies, determining, 13-26 Synchronous access mode, 5-43, 7-6, 7-11, 7-12, 7-14, 7-44, 7-94, 13-16, G-8 Burst Interface timing, 7-26 Burst Length determination, 7-29 Burst Mode Interface Timing, 7-26 Burst read, external port buffers, 7-58 Burst Reads, bus master, 7-31 Burst Read/Write, bus slave, 7-26 Burst Stall Criteria, 7-29 Burst Writes, bus master, 7-33 example of synchronous write followed by synchronous read, 7-25 Interface timing, 7-18 Read, bus master, 7-20 Read/Write, bus slave, 7-18 Synchronous Mode Interface Timing, 7-18 Write, One Waitstate Mode, 7-25 Write, Zero-Waitstate Mode, 7-22 Synchronous Burst Static RAM (See SBSRAM) Synchronous transfers, G-11 System, multiprocessor system diagram, 7-87 System bus, processor core access to system bus, 7-82 System bus interfacing, 7-78 System configuration (SYSCON) register, 6-6, A-60 System control (SC) registers, listed, A-49
System design Considerations for flags, 13-38 Designing for high frequency operation, 13-62 Designing for JTAG emulation, 13-49 Determining clock period, 13-28 Layout requirements, 13-54 layout requirements for routing signals, 13-54 Pod specifications, 13-56 Point-to-point connections, 13-65 Recommendations and suggestions, 13-68 System (Sreg) registers, A-2 System (Sreg) registers, program sequencer, A-26 System status (SYSTAT) register, 7-76, A-69
T
TAP pin, 12-3 Target board connector, 13-50 Target board connector, for emulator probe, 13-50 TCB chain loading, 6-24, 6-25, 6-26, G-11 Termination, end-of-line termination restrictions, 13-64 Termination codes (See condition codes and loop termination) Termination values, link port, 9-30 Test, bit, 2-23 Test access port (TAP) (See JTAG port) Test clock (TCK) pin, 12-3, 13-41 Test data input (TDI) pin, 12-3, 13-41 Test Data Output (TDO) pin, 13-41 Test flag (TF) condition, 3-54, 3-55 Test logic reset (TRST) pin, 12-3, 13-16, 13-41 Test mode select (TMS) pin, 12-3, 13-16, 13-41
Test mode (TMODE) bit, 12-11 Time-Division-Multiplexed (TDM) mode, 1-14, 10-1, G-11 Timed release bus mastership, 7-92 Timeout, bus mastership, 7-101 Timer, 1-9, 3-50, 8-3 Timer and sequencing, 3-50 Timer count (TCOUNT) register, 3-50, A-46 Timer enable (TIMEN) bit, 3-50, A-10 Timer expired high priority (TMZHI) bit, 3-51, A-28 Timer expired low priority (TMZLI) bit, 3-51, A-30 Timer expired (TIMEXP) pin, 13-15, 13-33 Timer period (TPERIOD) register, 3-50, A-46 Timing External Memory Accesses, 7-13 External port, 7-1 Link port handshake, 9-10 SDRAM, 8-8 Specifications, System design, 13-25 Toggle, bit, 2-23 Token passing Link ports, 9-27 Token passing, link ports, 9-27 Top-of-loop address, 3-23 Top-of-PC stack, 3-53 Transfer control block (TCB), 6-12, 6-26, G-12 Transmit and receive data buffers (TXA/B, RXA/B), 10-30 Transmit data status (TXS) bit, 10-23 Transmit data (TXx) registers, 6-4 Transmit frame synch divisor (TFSDIV) bit, A-112 Transmit frame sync required (TFSR) bit, 10-27
Transmit/receive DMA, external port (TRAN) bit, 6-32, A-81 Transmit underflow status (TUVF) bit, 6-137, 10-23 Tristate versus three-state, G-12 True always (TRUE) if condition, 3-56 Truncate, rounding (TRUNC) bit, 2-4, A-5 Twos-complement data, 2-6, 2-10 Type, data (See data types)
U
Unaligned 64-bit memory access (U64MA) bit, 5-34, 5-41, A-11, A-20 Underflow exception, 2-5 Underflow (See multiplier) Unified address space, 1-12 Universal (Ureg) registers, 1-10, 2-40, 5-10, A-26, A-46, G-12 Control and status, A-2 Data Address Generator, A-46 Processing element, A-23 Program Sequencer, A-26 Unpacked data, 6-42, 7-56 Unsigned data, 2-6 Unsigned input, 2-22 Unsupported instructions, IPCODE, 12-4 UPDATE state, 12-5 USERCODE instruction, unsupported, 12-4 User-defined status (USTATx) registers, A-22 Using the cache, 3-11
V
Values, saturation maximum, 2-19 Vector interrupt, 1-13, G-9 Vector interrupt, multiprocessor (VIRPTI) bit, A-28
Vector interrupt address (VIRPTA) bit, A-64 Vector interrupt address (VIRPT) register, 3-49, 6-126, 7-77, 7-78, A-63 Vector interrupt data optional (VIRPTD) bit, A-64 Vector interrupt pending (VIPD) bit, A-69 Vector interrupts, 7-77 Host, 7-78 Interprocessor, 7-76 Von Neumann architecture, 5-4, G-12
W
Waitstates, 1-12, 5-42, 5-45, 7-12, 7-19, G-12 Waitstates and access mode (WAIT) register, 6-6, 8-16, A-65, A-66 Web site, 1-21
Word rotations, 5-25 Wrap around, buffer, 4-9, 4-12, 4-15 Write commands, SDRAM, 8-34, 8-36 Writes Direct reads & writes, 7-110 direct write, asynchronous interface, 7-56 Slave, 7-55 Write (WR) pin, 6-54, 7-9, 7-38, 7-89, 7-96, 13-16
X
Xor, Logical, 2-9
Z
Zero, round-to, 2-5