DSP INTEGRATED CIRCUITS
Academic Press, 1999
CONTENTS
PREFACE i
DSP INTEGRATED CIRCUITS
1.1 INTRODUCTION 1
1.2 DIGITAL SIGNAL PROCESSING 2
1.3 STANDARD DIGITAL SIGNAL PROCESSORS 3
1.4 APPLICATION-SPECIFIC ICs FOR DSP 4
1.4.1 ASIC DIGITAL SIGNAL PROCESSORS 5
1.4.2 DIRECT MAPPING TECHNIQUES 6
1.5 DSP SYSTEMS 7
1.5.1 FACETS 8
1.6 DSP SYSTEM DESIGN 11
1.6.1 SPECIFICATION AND DESIGN PROBLEM CAPTURE 12
1.6.2 PARTITIONING TECHNIQUES 13
1.6.3 DESIGN TRANSFORMATIONS 18
1.6.4 COMPLEXITY ISSUES 19
1.6.5 THE DIVIDE-AND-CONQUER APPROACH 21
1.6.6 VHDL 22
1.7 INTEGRATED CIRCUIT DESIGN 26
1.7.1 SYSTEM DESIGN METHODOLOGY 27
1.7.2 TECHNICAL FEASIBILITY 27
1.7.3 SYSTEM PARTITIONING 29VLSI CIRCUIT TECHNOLOGIES
2.1 INTRODUCTION 33
2.2 MOS TRANSISTORS 33
2.2.1 A SIMPLE TRANSISTOR MODEL 35
2.3 MOS LOGIC 38
2.3.1 nMOS LOGIC 39
2.3.2 CMOS LOGIC CIRCUITS 41
2.3.3 PROPAGATION DELAY IN CMOS CIRCUITS 42
2.3.4 POWER DISSIPATION IN CMOS CIRCUITS 46
2.3.5 PRECHARGE-EVALUATION LOGIC 47
2.3.6 PROCESS VARIATIONS 48
2.3.7 TEMPERATURE AND VOLTAGE EFFECTS 48
2.4 VLSI PROCESS TECHNOLOGIES 50
2.4.1 BULK CMOS TECHNOLOGY 50
2.4.2 SILICON-ON-INSULATION - SOI 51
2.4.3 BIPOLAR TECHNOLOGIES - TTL 52
2.4.4 BIPOLAR TECHNOLOGIES - ECL 52
2.4.5 BIPOLARCMOS TECHNOLOGIES - BiCMOS 53
2.4.6 GaAs BASED TECHNOLOGIES 54
2.5 TRENDS IN CMOS TECHNOLOGIES 55DIGITAL SIGNAL PROCESSING
3.1 INTRODUCTION 59
3.2 DIGITAL SIGNAL PROCESSING 60
3.2.1 SENSITIVITY 60
3.2.2 ROBUSTNESS 61
3.2.3 INTEGRATED CIRCUITS 61
3.3 SIGNALS 62
3.4 THE FOURIER TRANSFORM 63
3.5 THE z-TRANSFORM 64
3.6 SAMPLING OF ANALOG SIGNALS 66
3.7 SELECTION OF SAMPLE FREQUENCY 68
3.8 SIGNAL PROCESSING SYSTEMS 69
3.8.1 LINEAR SYSTEMS 69
3.8.2 SI - SHIFT-INVARIANT SYSTEMS 70
3.8.3 LSI - LINEAR SHIFT-INVARIANT SYSTEMS 70
3.8.4 CAUSAL SYSTEMS 72
3.8.5 STABLE LSI SYSTEMS 72
3.9 DIFFERENCE EQUATIONS 72
3.10 FREQUENCY RESPONSE 73
3.10.1 MAGNITUDE FUNCTION 74
3.10.2 ATTENUATION FUNCTION 75
3.10.3 PHASE FUNCTION 76
3.10.4 GROUP DELAY FUNCTION 77
3.11 TRANSFER FUNCTIONS 78
3.12 SIGNAL-FLOW GRAPHS 79
3.13 FILTER STRUCTURES 80
3.14 ADAPTIVE DSP ALGORITHMS 82
3.14.1 LMS - LEAST MEAN SQUARE FILTERS 83
3.14.2 RLS - RECURSIVE LEAST SQUARE LATTICE FILTERS 85
3.15 DFT - THE DISCRETE FOURIER 86
3.16 FFT - THE FAST FOURIER TRANSFORM ALGORITHM 88
3.16.1 CT-FFT - THE COOLEY-TUKEY FFT 88
3.16.2 ST-FFT - THE SANDE-TUKEY FFT 94
3.16.3 WINOGRAD's FAST ALGORITHM 96
3.16.4 IFFT - THE INVERSE FFT 97
3.17 FFT PROCESSOR - CASE STUDY 1 97
3.17.1 SPECIFICATION 98
3.17.2 SYSTEM DESIGN PHASE 98
3.18 IMAGE CODING 99
3.19 DISCRETE COSINE TRANSFORMS 100
3.19.1 EDCT - EVEN COSINE TRANSFORM 101
3.19.2 ODCT - ODD COSINE TRANSFORM 102
3.19.3 SDCT - SYMMETRIC COSINE TRANSFORM 103
3.19.4 MSDCT - MODIFIED SYMMETRIC COSINE TRANSFORM 103
3.19.5 FAST COSINE TRANSFORMS 106
3.20 DCT PROCESSOR - CASE STUDY 2 107
3.20.1 SPECIFICATION 108
3.20.2 SYSTEM DESIGN PHASE 108DIGITAL FILTERS
4.1 INTRODUCTION 117
4.2 FIR FILTERS 117
4.2.1 LINEAR-PHASE FIR FILTERS 118
4.2.2 DESIGN OF LINEAR-PHASE FIR FILTERS 119
4.2.3 HALF-BAND FIR FILTERS 122
4.3 FIR FILTER STRUCTURES 125
4.3.1 DIRECT FORM 126
4.3.2 TRANSPOSED DIRECT FORM 126
4.3.3 LINEAR-PHASE STRUCTURE 127
4.3.4 COMPLEMENTARY FIR STRUCTURES 128
4.3.5 MISCELLANEOUS FIR STRUCTURES 129
4.4 FIR CHIPS 129
4.5 IIR FILTERS 131
4.6 SPECIFICATION OF IIR FILTERS 131
4.7 DIRECT DESIGN IN THE z-PLANE 134
4.8 MAPPING OF ANALOG TRANSFER FUNCTIONS 134
4.8.1 FILTER ORDER 135
4.9 MAPPING OF ANALOG FILTER STRUCTURES 141
4.10 WAVE DIGITAL FILTERS 142
4.11 REFERENCE FILTERS 142
4.12 WAVE DESCRIPTIONS 143
4.13 TRANSMISSION LINES 144
4.14 TRANSMISSION LINE FILTERS 147
4.15 WAVE-FLOW BUILDING BLOCKS 149
4.15.1 CIRCUIT ELEMENTS 149
4.15.2 INTERCONNECTION NETWORKS 151
4.16 DESIGN OF WAVE DIGITAL FILTERS 154
4.16.1 FELDTKELLER's EQUATION 156
4.16.2 SENSITIVITY 158
4.17 LADDER WAVE DIGITAL FILTERS 158
4.18 LATTICE WAVE DIGITAL FILTERS 159
4.19 BIRECIPROCAL LATTICE WAVE DIGITAL FILTERS 167
4.20 MULTIRATE SYSTEMS 171
4.21 INTERPOLATION WITH AN INTEGER FACTOR L 171
4.21.1 INTERPOLATION USING FIR FILTERS 174
4.21.2 INTERPOLATION USING WAVE DIGITAL FILTERS 177
4.22 DECIMATION WITH A FACTOR M 180
4.22.1 HSP43220 180
4.23 SAMPLING RATE CHANGE WITH A RATIO L/M 182
4.24 MULTIRATE FILTERS 182
4.25 INTERPOLATOR - CASE STUDY 3 183FINITE WORD LENGTH EFFECTS
5.1 INTRODUCTION 193
5.2 PARASITIC OSCILLATIONS 194
5.2.1 ZERO-INPUT OSCILLATIONS 195
5.2.2 OVERFLOW OSCILLATIONS 197
5.2.3 PERIODIC INPUT OSCILLATIONS 199
5.2.4 NON-OBSERVABLE OSCILLATIONS 200
5.2.5 PARASITIC OSCILLATIONS IN ALGORITHMS USING FLOATING-POINT ARITHMETIC 201
5.3 STABILITY 201
5.4 QUANTIZATION IN WDFs 201
5.5 SCALING OF SIGNAL LEVELS 204
5.5.1 SAFE SCALING 205
5.5.2 FFT SCALING 207
5.5.3 Lp-NORMS 208
5.5.4 SCALING OF WIDE-BAND SIGNALS 210
5.5.5 SCALING OF NARROW BAND SIGNALS 213
5.6 ROUND-OFF NOISE 215
5.6.1 FFT ROUND-OFF NOISE 218
5.6.2 ERROR SPECTRUM SHAPING 220
5.7 MEASURING ROUND-OFF NOISE 221
5.8 COEFFICIENT SENSITIVITY 223
5.8.1 COEFFICIENT WORD LENGTH 224
5.9 SENSITIVITY AND NOISE 224
5.10 INTERPOLATOR, Cont. 226
5.11 FFT PROCESSOR, Cont. 226
5.12 DCT PROCESSOR, Cont. 227DSP ALGORITHMS
6.1 INTRODUCTION 233
6.2 DSP SYSTEMS 233
6.2.1 DSP ALGORITHMS 234
6.2.2 ARITHMETIC OPERATIONS 236
6.3 PRECEDENCE GRAPHS 237
6.3.1 PARALLELISM IN ALGORITHMS 237
6.3.2 LATENCY 238
6.3.3 SEQUENTIALLY COMPUTABLE ALGORITHMS 241
6.3.4 FULLY SPECIFIED SIGNAL-FLOW GRAPHS 242
6.4 SFGs IN PRECEDENCE FORM 242
6.5 DIFFERENCE EQUATIONS 247
6.6 COMPUTATION GRAPHS 251
6.6.1 CRITICAL PATH 251
6.6.2 EQUALIZING DELAY 252
6.6.3 SHIMMING DELAY 252
6.6.4 MAXIMUM SAMPLE RATE 253
6.7 EQUIVALENCE TRANSFORMATIONS 255
6.7.1 ESSENTIALLY EQUIVALENT NETWORKS 256
6.7.2 TIMING OF SIGNAL-FLOW GRAPHS 257
6.7.3 MINIMIZING THE AMOUNT OF SHIMMING DELAY 259
6.7.4 MAXIMALLY FAST CRITICAL LOOPS 260
6.8 INTERLEAVING AND PIPELINING 262
6.8.1 INTERLEAVING 262
6.8.2 PIPELINING 263
6.8.3 FUNCTIONAL AND STRUCTURAL PIPELINES 267
6.8.4 PIPELINE INTERLEAVING 268
6.9 ALGORITHM TRANSFORMATIONS 269
6.9.1 BLOCK PROCESSING 269
6.9.2 CLUSTERED LOOK-AHEAD PIPELINING 272
6.9.3 SCATTERED LOOK-AHEAD PIPELINING 275
6.9.4 SYNTHESIS OF FAST FILTER STRUCTURES 276
6.10 INTERPOLATOR, Cont. 277DSP SYSTEM DESIGN
7.1 INTRODUCTION 285
7.2 A DIRECT MAPPING TECHNIQUE 286
7.3 FFT PROCESSOR, Cont. 289
7.3.1 FIRST DESIGN ITERATION 289
7.3.2 SECOND DESIGN ITERATION 291
7.3.3 THIRD DESIGN ITERATION 299
7.4 SCHEDULING 300
7.5 SCHEDULING FORMULATIONS 302
7.5.1 SINGLE INTERVAL 302
7.5.2 BLOCK SCHEDULING FORMULATION 306
7.5.3 LOOP-FOLDING 306
7.5.4 CYCLIC SCHEDULING FORMULATION 307
7.5.5 OVERFLOW AND QUANTIZATION 314
7.5.6 SCHEDULING OF LATTICE WAVE DIGITAL FILTERS 318
7.6 SCHEDULING ALGORITHMS 321
7.6.1 ASAP AND ALAP SCHEDULING 322
7.6.2 EARLIEST DEADLINE AND SLACK TIME SCHEDULING 323
7.6.3 LINEAR PROGRAMMING 323
7.6.4 CRITICAL PATH LIST SCHEDULING 323
7.6.5 FORCE-DIRECTED SCHEDULING 324
7.6.6 CYCLO-STATIC SCHEDULING 326
7.6.7 MAXIMUM SPANNING TREE METHOD 328
7.6.8 SIMULATED ANNEALING 329
7.7 FFT PROCESSOR, Cont. 331
7.7.1 SCHEDULING OF THE INNER LOOPS 333
7.7.2 INPUT AND OUTPUT PROCESSES 335
7.8 RESOURCE ALLOCATION 336
7.8.1 CLIQUE PARTITIONING 337
7.9 RESOURCE ASSIGNMENT 339
7.9.1 THE LEFT-EDGE ALGORITHM 339
7.10 INTERPOLATOR, Cont. 341
7.10.1 PROCESSOR ASSIGNMENT 343
7.10.2 MEMORY ASSIGNMENT 344
7.10.3 MEMORY CELL ASSIGNMENT 347
7.11 FFT PROCESSOR, Cont. 349
7.11.1 MEMORY ASSIGNMENT 349
7.11.2 BUTTERFLY PROCESSOR ASSIGNMENT 352
7.11.3 INPUT AND OUTPUT PROCESS ASSIGNMENT 355
7.12 DCT PROCESSOR, Cont. 355DSPARCHITECTURES
8.1 INTRODUCTION 365
8.2 DSP SYSTEM ARCHITECTURES 365
8.3 STANDARD DSP ARCHITECTURES 367
8.3.1 HARVARD ARCHITECTURE 368
8.3.2 TMS32010 369
8.3.3 TMS320C25 AND TMS320C50 370
8.3.4 TMS320C30 370
8.3.5 TMS320C40 371
8.3.6 MOTOROLA DSP56001 AND DSP56002 371
8.3.7 MOTOROLA DSP96001 and DSP96002 373
8.4 IDEAL DSP ARCHITECTURES 374
8.4.1 PROCESSING ELEMENTS 375
8.4.2 STORAGE ELEMENTS 375
8.4.3 INTERCONNECTION NETWORKS 376
8.4.4 CONTROL 376
8.4.5 SYNCHRONOUS AND ASYNCHRONOUS SYSTEMS 377
8.4.6 SELF-TIMED SYSTEMS 377
8.4.7 AUTONOMOUS BIT-SERIAL PEs 378
8.5 MULTIPROCESSORS AND MULTI-COMPUTERS 379
8.6 MESSAGE-BASED ARCHITECTURES 380
8.6.1 INTERCONNECTION TOPOLOGIES 381
8.7 SYSTOLIC ARRAYS 383
8.8 WAVE FRONT ARRAYS 385
8.8.1 DATAWAVE 386
8.9 SHARED-MEMORY ARCHITECTURES 388
8.9.1 MEMORY BANDWIDTH BOTTLENECK 389
8.9.2 REDUCING THE MEMORY CYCLE TIME 389
8.9.3 REDUCING COMMUNICATIONS 390
8.9.4 LARGE BASIC OPERATIONS 392SYNTHESIS OF DSP ARCHITECTURES
9.1 INTRODUCTION 395
9.2 MAPPING OF DSP ALGORITHMS ONTO HARDWARE 396
9.2.1 DESIGN STRATEGY 397
9.3 UNIPROCESSOR ARCHITECTURES 397
9.4 ISOMORPHIC MAPPING OF SFGs 403
9.4.1 CATHEDRAL I 403
9.5 IMPLEMENTATIONS BASED ON 405
9.5.1 VECTOR-MULTIPLIER BASED IMPLEMENTATIONS 406
9.5.2 NUMERICALLY EQUIVALENT IMPLEMENTATION 407
9.5.3 NUMERICALLY EQUIVALENT IMPLEMENTATIONS OF WDFs 410
9.6 SHARED-MEMORY ARCHITECTURES WITH BIT-SERIAL PEs 413
9.6.1 MINIMIZING THE COST 414
9.6.2 UNIFORM MEMORY ACCESS RATE 414
9.6.3 FAST BIT-SERIAL MEMORIES 416
9.6.4 BALANCING THE ARCHITECTURE 416
9.6.5 MODE OF OPERATION 417
9.6.6 CONTROL 418
9.7 BUILDING LARGE DSP SYSTEMS 419
9.8 INTERPOLATOR, Cont. 422
9.9 FFT PROCESSOR, Cont. 423
9.9.1 SELECTING THE INTERCONNECTION NETWORK 423
9.9.2 RE-PARTITIONING THE FFT 426
9.9.3 THE FINAL FFT ARCHITECTURE 432
9.10 DCT PROCESSOR, Cont. 435
9.11 SIC - SINGLE INSTRUCTION COMPUTER 436
9.11.1 PARTITIONING OF LARGE DSP SYSTEMS 437
9.11.2 IMPLEMENTATION OF VARIOUS SIC ITEMS 438DIGITAL SYSTEMS
10.1 INTRODUCTION 447
10.2 COMBINATIONAL NETWORKS 448
10.3 SEQUENTIAL NETWORKS 449
10.4 STORAGE ELEMENTS 450
10.4.1 STATIC STORAGE ELEMENTS 451
10.4.2 DYNAMIC STORAGE ELEMENTS 453
10.4.3 METASTABILITY 454
10.5 CLOCKING OF SYNCHRONOUS SYSTEMS 454
10.5.1 SINGLE-PHASE CLOCK 455
10.5.3 TWO-PHASE CLOCK 458
10.5.4 CLOCK SKEW 460
10.6 ASYNCHRONOUS SYSTEMS 460
10.7 FINITE STATE MACHINES - FSM 463
10.7.1 LOOK-AHEAD FSM 464
10.7.2 CONCURRENT BLOCK PROCESSING 466PROCESSING ELEMENTS
11.1 INTRODUCTION 471
11.2 CONVENTIONAL NUMBER SYSTEMS 471
11.2.1 SIGNED-MAGNITUDE REPRESENTATION 473
11.2.2 COMPLEMENT REPRESENTATIONS 473
11.2.3 ONE's-COMPLEMENT REPRESENTATION 475
11.2.4 TWO's-COMPLEMENT REPRESENTATION 476
11.2.5 BINARY OFFSET REPRESENTATION 478
11.3 REDUNDANT NUMBER SYSTEMS 478
11.3.1 SIGNED DIGIT CODE 479
11.3.2 CANONIC SIGNED DIGIT CODE 480
11.4 RESIDUE NUMBER SYSTEMS 482
11.5 BIT-PARALLEL ARITHMETIC 483
11.5.1 ADDITION AND SUBTRACTION 484
11.5.2 BIT-PARALLEL MULTIPLICATION 487
11.5.3 SHIFT-AND-ADD MULTIPLICATION 488
11.5.4 BOOTH's ALGORITHM 489
11.5.5 TREE-BASED MULTIPLIERS 490
11.5.6 ARRAY MULTIPLIERS 491
11.5.7 LOOKUP TABLE TECHNIQUES 493
11.6 BIT-SERIAL ARITHMETIC 494
11.6.1 BIT-SERIAL ADDITION AND SUBTRACTION 494
11.6.2 BIT-SERIAL MULTIPLICATION 495
11.6.3 SERIAL/PARALLEL MULTIPLIER 495
11.6.4 TRANSPOSED SERIAL/PARALLEL MULTIPLIER 498
11.6.5 S/P MULTIPLIER-ACCUMULATOR 499
11.7 BIT-SERIAL TWO-PORT ADAPTOR 499
11.8 S/P MULTIPLIERS WITH FIXED COEFFICIENTS 502
11.8.1 S/P MULTIPLIERS WITH CSDC COEFFICIENTS 504
11.9 MINIMUM NUMBER OF BASIC OPERATIONS 505
11.9.1 MULTIPLICATION WITH FIXED COEFFICIENT 506
11.9.2 MULTIPLE-CONSTANT MULTIPLICATIONS 508
11.10 BIT-SERIAL SQUARERS 510
11.10.1 SIMPLE SQUARER 510
11.10.2 IMPROVED SQUARER 512
11.11 SERIAL/SERIAL MULTIPLIERS 514
11.12 DIGIT-SERIAL ARITHMETIC 516
11.13 THE CORDIC ALGORITHM 516
11.14 DISTRIBUTED ARITHMETIC 517
11.14.1 DISTRIBUTED ARITHMETIC 518
11.14.2 PARALLEL IMPLEMENTATION OF DISTRIBUTED ARITHMETIC 522
11.15 THE BASIC SHIFT-ACCUMULATOR 522
11.16 REDUCING THE MEMORY SIZE 525
11.16.1 MEMORY PARTITIONING 525
11.16.2 MEMORY CODING 525
11.17 COMPLEX MULTIPLIERS 527
11.18 IMPROVED SHIFT-ACCUMULATOR 529
11.18.1 COMPLEX MULTIPLIER USING TWO-PHASE LOGIC 530
11.18.2 COMPLEX MULTIPLIER USING TSPC LOGIC 530
11.19 FFT PROCESSOR, Cont. 531
11.19.1 TWIDDLE FACTOR PE 533
11.19.2 CONTROL PEs 535
11.19.3 ADDRESS PEs 535
11.19.4 BASE INDEX GENERATOR 536
11.19.5 RAM ADDRESS PEs 538
11.20 DCT PROCESSOR, Cont. 538INTEGRATED CIRCUIT DESIGN
12.1 INTRODUCTION 547
12.2 LAYOUT OF VLSI CIRCUITS 548
12.2.1 FLOOR-PLANNING AND PLACEMENT 548
12.2.2 FLOOR-PLANS 549
12.2.3 GLOBAL ROUTING 550
12.2.4 DETAILED ROUTING 551
12.2.5 COMPACTION BY ZONE REFINING 553
12.3 LAYOUT STYLES 553
12.3.1 THE STANDARD-CELL DESIGN APPROACH 554
12.3.2 THE GATE ARRAY DESIGN APPROACH 555
12.3.3 THE SEA-OF-GATES DESIGN APPROACH 557
12.3.4 THE UNCONSTRAINED-CELL DESIGN APPROACH 558
12.3.5 THE UNCONSTRAINED DESIGN APPROACH 561
12.4 FFT PROCESSOR, Cont. 562
12.5 DCT PROCESSOR, Cont. 564
12.7 ECONOMIC ASPECTS 568
12.7.1 YIELD 569INDEX 573