PDSP16510A Technical Document

PDSP16510

PDSP16510A

Stand Alone FFT Processor

Supersedes version in December 1993 Digital Video & DSP IC Handbook, HB3923-1

DS3475 - 4.4 May 1996

The PDSP16510 performs Forward or Inverse Fast Fourier Transforms on complex or real data sets containing up to 1024 points. Data and coefficients are each represented by 16 bits, with block floating point arithmetic for increased dynamic range.

An internal RAM is provided which can hold up to 1024 complex data points. This removes the memory transfer bottleneck, inherent in building block solutions. Its organisation allows the PDSP16510 to simultaneously input new data, transform data stored in the RAM, and to output previous results. No external buffering is needed for transforms containing up to 256 points, and the PDSP16510 can be directly connected to an A/D converter to perform continuous transforms. The user can choose to overlap data blocks by either 0%, 50%, or 75%. Inputs and outputs are synchronous to the 40MHz system clock used for internal operations.

A 1024 point complex transform can be completed in some 98µs, which is equivalent to throughput rates of 450 million operations per second. Multiple devices can be connected in parallel in order to increase the sampling rate up to the 40MHz system clock. Six devices are needed to give the maximum performance with 1024 point transforms.
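As a rough check on the six-device figure, the sketch below assumes that in the 1024 point mode each device loads, transforms and dumps a block in sequence, with load and dump each moving one point per 40MHz system clock. That sequencing is an assumption made here for illustration, and the function name `devices_needed` is hypothetical; the 98µs and 40MHz figures are the ones quoted above.

```python
import math

SYSTEM_CLOCK_HZ = 40e6     # system clock quoted in the text
POINTS = 1024              # transform size
TRANSFORM_TIME_S = 98e-6   # "some 98µs" quoted in the text


def devices_needed(points, clock_hz, transform_s):
    """Estimate devices required to sustain sampling at the system clock rate.

    Assumes load, transform and dump do not overlap within one device,
    and that load and dump each move one point per system clock.
    """
    load_s = points / clock_hz
    dump_s = points / clock_hz
    block_period_s = load_s + transform_s + dump_s
    per_device_rate_hz = points / block_period_s
    count = math.ceil(clock_hz / per_device_rate_hz)
    return count, per_device_rate_hz


count, rate = devices_needed(POINTS, SYSTEM_CLOCK_HZ, TRANSFORM_TIME_S)
print(f"per-device rate ~ {rate / 1e6:.2f} MHz, devices needed: {count}")
```

Under these assumptions one device sustains a little under 7MHz, so six devices are needed to reach 40MHz, which agrees with the figure in the text.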

Either a Hamming or a Blackman-Harris window operator

can be internally applied to the incoming real or complex data.

The latter gives 67dB side lobe attenuation. The operator

values are calculated internally and do not require an external

ROM nor do they incur any time penalty.
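For readers who want to model the window in software, a minimal sketch follows. The source does not list the operator values the device generates; the coefficients below are the commonly quoted textbook values for the Hamming window and for the 3 term Blackman-Harris window with roughly 67dB side lobes, and are an assumption here. The function names are hypothetical.

```python
import math

# Generalised cosine window:
#   w[n] = a0 - a1*cos(2*pi*n/N) + a2*cos(4*pi*n/N)
# Coefficients are textbook values, not taken from the datasheet.
HAMMING = (0.54, 0.46, 0.0)
BLACKMAN_HARRIS_3 = (0.42323, 0.49755, 0.07922)


def cosine_window(n_points, coeffs):
    """Return the window values for a block of n_points samples."""
    a0, a1, a2 = coeffs
    return [
        a0
        - a1 * math.cos(2 * math.pi * n / n_points)
        + a2 * math.cos(4 * math.pi * n / n_points)
        for n in range(n_points)
    ]


def apply_window(samples, coeffs):
    """Multiply real or complex samples by the window, point by point."""
    window = cosine_window(len(samples), coeffs)
    return [s * w for s, w in zip(samples, window)]


window = cosine_window(256, BLACKMAN_HARRIS_3)
print(round(window[0], 5), round(window[128], 5))
```

Because the same real-valued window multiplies each sample, `apply_window` works unchanged for real only and complex inputs.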

The device outputs the real and imaginary components of the frequency bins. These can be directly connected to the PDSP16330 in order to produce magnitude and phase values from the complex data.

Fig. 1. Block Diagram (data input, 3 term window operator, coefficient ROM, workspace RAM, four data paths, output buffer, result output)

FEATURES

Completely self contained FFT Processor
Internal RAM supports up to 1024 complex points
16 bit data and coefficients plus block floating point for increased dynamic range
450 MIP operation gives 98 microsecond transformation times for 1024 points
Up to 40MHz sampling rates with multiple devices
Internal window operator gives 67dB side lobe attenuation and needs no external ROM
84 pin PGA or 132 pin surface mount package

ASSOCIATED PRODUCTS

PDSP16540 Bucket Buffer
PDSP16330 Pythagoras Processor
PDSP16256 Programmable FIR Filter
PDSP16350 I/Q Splitter / NCO

Fig. 2. Typical 256 Point Real Only System Performing Continuous Transforms (labels shown: analog input, A/D, sample clock, configuration word on AUX15:0, reset on DEF, INEN, DIS, DOS, CLK, D15:0, DEN, DAV, S3:0 scale value, R15:0 and I15:0, PDSP16330 with X and Y inputs, magnitude and phase outputs)

Pin Out for 84 PGA Package (AC84) - bottom view (rows A to N, columns 1 to 13; signals D15:0, AUX15:0, R15:0, I15:0, S3:0, DIS, DOS, DEF, DAV, DEN, INEN, SCLK, LFLG, VDD and GND)

Pin Out for 132 Leaded Chip Carrier (GC132) (pins 1 to 132; signals as for the AC84 package plus DISAB, VDD and GND)

SIGNAL    TYPE  DESCRIPTION
D15:0     I     Data input during real only mode. The real component in complex data mode.
AUX15:0   I     When DEF is active AUX15:0 are used to define the operating mode as defined in Table 3. When DEF is in-active AUX15:0 either provide the 16 bit imaginary component of complex input data, or a second set of real only inputs.
R15:0     O     These pins output the real component of the transformed data when DAV and DEN are active. Otherwise they are high impedance.
I15:0     O     These pins output the imaginary component of the transformed data when DAV and DEN are active. Otherwise they are high impedance.
DEF       I     The high going edge of DEF is used to internally latch the contents of AUX15:0, which then define the operating mode. In the simplest system DEF is a power on reset. When DEF is low the internal control logic is reset.
SCLK      I     System clock used for internal computations.
S3:0      O     These pins indicate the number of shifts towards the binary point which have occurred as the result of the conditional scaling logic. When the data path right shift is restricted to 2 places per pass, state 15 is used to indicate an overflow and only a total of 14 shifts is possible.
LFLG      O     This flag indicates that data is being loaded into the device. It goes active in response to an INEN input, and may be programmed to go in-active after the complete, one quarter, or one half a data block has been loaded.
INEN      I     The use of this input is mode dependent. It is either used as an active low, load enabling, signal for the DIS strobe, or it is used to initiate a new block load operation.
DIS       I     The rising edge of this input is used to load data into the device.
DOS       I     The rising edge of this input is used to dump data from the device. In most applications it may be tied to the DIS input, even if the output rate must be higher than the input rate because of overlapped data blocks. The DIS input is then internally divided down.
DAV       O     An active low signal that indicates that a transform is complete. Transformed data will then be output in normal sequential order using DOS. It may be optionally programmed to be delayed by 24 DOS strobes to match the delay through a PDSP16330.
DEN       I     This input is used to enable the data dump operation when DAV has gone active. If it is tied low the device will automatically dump data when DAV goes active. Otherwise the device will wait for the enabling signal to go low before the dump operation commences.
DISAB     I     Only available in the 132 pin GC package. When high the block floating logic is disabled.
VDD       P     +5V pins
GND       P     Ground pins

NOTE. All references to DEF, INEN, DAV, and DEN within the text do not contain the bar designator, signifying an active low signal. This is considered to be implied by the signal name and is not meant to imply a change in the signal function.

FUNCTIONAL OPERATION

The PDSP16510 performs decimation in time, radix 4, forward or inverse Fast Fourier Transforms. Data is loaded into an internal workspace RAM in normal sequential order, processed, and then dumped in the correct order. With real only input data the processing time can approximately be halved for a given transform size. Two real inputs then replace a single complex input, and are processed in parallel.

Either a Blackman-Harris or a Hamming window can be generated internally, and applied to the incoming real or complex data with no time penalty. No external ROM is needed to support these windows. The Blackman-Harris window gives improved dynamic range over the Hamming window when two closely spaced frequencies are to be detected, and one is of smaller magnitude than the other. It does, however, reduce the actual frequency resolution, and the Hamming window may then be preferable.

Data in and out of the device is represented by 16 bit real and imaginary components, with 16 bit sine and cosine values contained in an internal ROM. Conditional scaling, coupled with word growth through the butterfly data path, gives increased dynamic range. Transforms can be computed with sample sizes of either 256 or 1024 data points. The 256 point option can alternatively be used to simultaneously execute either four 64 point transforms, or sixteen 16 point transforms. The 16 point mode can only be used with a rectangular window, and no overlapping of data blocks is possible.
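The transform sizes above map onto radix 4 passes in a simple way, which the sketch below works through. It is an estimate only: it uses the statement made later in this document (in the discussion of Fig. 3) that the complete device finishes one radix-4 butterfly every 3 system clock cycles, and it ignores synchronisation and any other overhead. The function names are hypothetical, and Table 4 remains the reference for transform times.

```python
def radix4_passes(points):
    """Number of radix 4 passes for a transform of `points` (a power of 4)."""
    passes = 0
    remaining = points
    while remaining > 1:
        if remaining % 4:
            raise ValueError("points must be a power of 4")
        remaining //= 4
        passes += 1
    return passes


def estimated_cycles(points, cycles_per_butterfly=3):
    """Butterfly cycles only: passes x (points / 4) butterflies x cycles each."""
    return radix4_passes(points) * (points // 4) * cycles_per_butterfly


for size in (16, 64, 256, 1024):
    cycles = estimated_cycles(size)
    print(size, radix4_passes(size), cycles, f"{cycles / 40e6 * 1e6:.1f} us at 40MHz")
```

For 1024 points this gives 5 passes and 3840 cycles, about 96µs at 40MHz, which is close to the "some 98µs" quoted in the overview.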

The device can be configured either to perform continuous transforms in a real time application, or as a slave processor to a more general purpose signal processing system. In the continuous mode, with transform sizes of 256 points or less, it contains three internal control units which simultaneously allow new data to be loaded, present data to be transformed, and previous results to be dumped. Additional external input/output buffering is not needed. The internal input buffer also allows data blocks to be overlapped by either 50% or 75%, apart from the mode with no overlaps.

When 1024 point transforms are to be calculated, without loss of incoming data during the transform time, it is necessary to use an input buffer. This requirement is satisfied by a single PDSP16540 support device.

In any of the real or complex modes it is possible to obtain higher performance by connecting devices in parallel. It is then possible to increase the sampling rate to that of the system clock used for internal operations.

The mode of operation of the device is controlled by 16 bits in a control register. These are loaded through the AUX15:0 port when a control signal DEF is active low. This port is also used to provide the imaginary component of complex input data, and, if complex transforms are to be performed, an external tristate buffer will be needed to isolate the control information. This should only be enabled when DEF is active. DEF is also used to initialise the internal circuitry, and can be a simple power on reset if control parameters need not be subsequently changed.

DATA PRECISION

During each pass of a radix-4 fast Fourier transform it is possible for either component of a particular result to grow by a factor of up to four in the first pass, and 5.242 in subsequent passes. This is between two and three bits in each pass and the data path must allow for this word growth to avoid any possibility of overflow. At the end of the data path the word is again reduced to 16 bits by discarding least significant bits. Any un-necessary word growth to prevent overflow thus results in loss of arithmetic precision, and has a detrimental effect on the dynamic range achievable.
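The two growth figures can be reproduced with a short calculation. The reasoning used here is an interpretation and is not stated in the source: in the first pass all four butterfly inputs are simply added, giving a factor of 4, while in later passes three of the four inputs are rotated by a twiddle factor, and one component of a rotated full scale complex value can reach √2 times full scale, giving 1 + 3√2.

```python
import math

first_pass_growth = 4.0                     # four inputs added, no rotation
later_pass_growth = 1 + 3 * math.sqrt(2)    # one direct input plus three rotated inputs

first_pass_bits = math.log2(first_pass_growth)
later_pass_bits = math.log2(later_pass_growth)

print(f"first pass:  x{first_pass_growth:.3f} = {first_pass_bits:.2f} bits")
print(f"later passes: x{later_pass_growth:.3f} = {later_pass_bits:.2f} bits")
```

The later-pass value agrees with the 5.242 quoted above and lies between two and three bits, as the text says.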

In practice these large word growths only occur when bipolar complex square waves are transformed, and even then will not occur on every pass. The PDSP16510 compromises by allowing a 2 bit word growth during the butterfly calculation in the first pass. This is equivalent to ignoring the most significant bit of the 19 bit final result, which is assumed to be an extra sign bit, and then selecting the next 16 bits for storage. In subsequent passes a Control Register Bit allows the user to continue to select these 16 bits, or instead to use the 16 most significant bits. The latter option is equivalent to a 3 bit word growth. The 2 or 3 bit word growth option applies to ALL subsequent passes and is not a per pass option.

Fig. 3 One of Four Data Paths (labels shown: shift left until largest point has one sign bit, sin/cos ROM, multiplier, first, second and third adders each with a 19 bit result, register files, and a final select of bits 18 - 3 or 17 - 2 marked CR BIT3)

If the 2 bit option is selected there is a possibility of overflow occurring in one of the passes. The prediction of overflow is mathematically difficult, and only occurs with specific complex square waves. Scaling down the inputs cannot be guaranteed to prevent overflow because of the block floating point shifting scheme, which is discussed later. Overflow can NEVER occur if the 3 bit option is chosen, but at the expense of worse dynamic range.
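The choice between the two options amounts to picking a different 16 bit field from the 19 bit result. The sketch below models that selection on a signed Python integer; the bit ranges (18 to 3 and 17 to 2) follow the description above, while the function name and the way overflow is reported are assumptions made for illustration.

```python
def select_16_bits(result_19, three_bit_growth):
    """Pick 16 bits from a signed 19 bit adder result.

    Returns (stored_value, overflow). With 3 bit growth the 16 most
    significant bits (18..3) are kept and overflow cannot occur. With
    2 bit growth bit 18 is assumed to be an extra sign bit and bits
    17..2 are kept; overflow is flagged when that assumption fails.
    """
    if not -(1 << 18) <= result_19 < (1 << 18):
        raise ValueError("result does not fit in 19 bits")
    if three_bit_growth:
        return result_19 >> 3, False
    overflow = not -(1 << 17) <= result_19 < (1 << 17)
    kept = (result_19 >> 2) & 0xFFFF
    if kept & 0x8000:          # reinterpret the field as a signed 16 bit word
        kept -= 1 << 16
    return kept, overflow


print(select_16_bits(131071, False))
print(select_16_bits(131072, False)[1])
print(select_16_bits(131072, True))
```

The 3 bit option discards one more least significant bit on every result, which is where the loss of dynamic range comes from.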

When overflow does occur a flag is raised which can be read by the user ( see later discussion on scale tag bits ), and the results ignored. In addition all frequency bins are forced to zero to prevent any erroneous system response.

Even with only 2 bit word growth poor dynamic range will be obtained if the data is simply reduced to 16 bits, and becomes worse when the incoming data does not fully occupy all the bits in the word. These problems are overcome in the PDSP16510, however, by a block floating point scheme which compensates for any unnecessary word growth.
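As a sketch of the block floating point scheme, using the rules this section gives for it: the sign bits of the largest result are counted, the whole block is shifted left once per extra sign bit (at most four shifts before a pass), and the total is reported on S3:0, with state 15 reserved for overflow in the 2 bit word growth option. The data layout (a flat list of signed 16 bit components) and all function names are assumptions made for illustration.

```python
WORD_BITS = 16
MAX_SHIFTS_PER_PASS = 4   # from the text: up to four shifts before each pass


def sign_bits(value, width=WORD_BITS):
    """Count leading bits equal to the sign bit of a two's complement word."""
    word = value & ((1 << width) - 1)
    sign = word >> (width - 1)
    count = 0
    for bit in range(width - 1, -1, -1):
        if (word >> bit) & 1 != sign:
            break
        count += 1
    return count


def normalise_block(components, max_shift=MAX_SHIFTS_PER_PASS):
    """Shift every component left by the extra sign bits of the largest one."""
    extra = min(sign_bits(c) for c in components) - 1
    shift = min(extra, max_shift)
    return [c << shift for c in components], shift


def decode_scale_tag(tag, two_bit_growth):
    """Interpret S3:0 and return (left_shifts, overflow)."""
    if two_bit_growth and tag == 15:
        return None, True
    return tag, False


block = [100, -50, -3, 7]   # real and imaginary components of a small block
shifted, shift = normalise_block(block)
print(shifted, shift)
```

Because the largest component here has eight extra sign bits, a single pass can only remove four of them; the remainder would be taken up before the following pass.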

During each pass the number of sign bits in the largest result is recorded. Before the next pass, data is shifted left [multiplied by 2], once for every extra sign bit in this recorded sample. At least one component in the block then fully occupies the 16 bit word, and maximum data accuracy is preserved. Up to four shifts are possible before every pass after the first, with a total of fifteen for the complete transform. At the end of the transform the number of left shifts that have occurred is indicated on S3:0. Lack of pins prevents a separate output being available to indicate that overflow has occurred in the 2 bit word growth option. For this reason the maximum number of compensating left shifts in this mode is restricted to 14. State 15 is then used to indicate that overflow has occurred.

The first step in the butterfly calculation multiplies 16 bit data values with 16 bit sine/cosine values, to give 18 bit results. This increased word length preserves accuracy through the following adder network, and has been shown through simulations to be an optimum size for transform sizes up to 1024 points. This is particularly true when the input data is restricted to below 16 bits, as is necessary with practical A/D converters with very high sampling rates. The bottom bit of this 18 bit word is forced to logical one and as such is a compromise between truncation and true rounding. It gives a lower noise floor in the outputs compared to simple truncation.

To prevent any possibility of overflow during the butterfly calculation the word length is allowed to grow by one bit through each of the three adders. The least significant bit is always discarded in the first two adders. Sixteen bits are then chosen from the final adder in the manner discussed earlier, and the number of sign bits in the largest result is recorded for use in the following pass.

Fig. 3 shows one of the four internal data paths which can compute a radix-4 butterfly in twelve system clock cycles. This equates to completing the butterfly in 3 cycles for the complete device.

DATA TRANSFERS

The data transfer mechanism to and from the internal RAM has been designed for use in a wide variety of applications. The provision of an independent input strobe (DIS) allows data to be loaded without the need for additional external buffering. An independent output strobe (DOS) is also provided. DIS and DOS can thus be tied together, this being particularly useful when the device is performing the inverse transform back to the time domain. Transfer of data occurs internally from DIS to SCLK, so although they can be of different frequencies, they must be synchronous to each other. In the same way transfer of data also occurs from SCLK to DOS, so while DOS can also be independent of SCLK it must also be synchronous to it. Inputs and outputs are both supported by flag and enabling signals which allow transfers to be properly co-ordinated with the internal transform operation.

In many applications the DIS and DOS inputs can be tied together and fed by the sampling clock. If the output rate must be higher than the input rate, as with multiple devices supporting overlapped data samples, both strobes can still be connected together. The clock supplied should then be twice or four times the sampling clock, and an internal divider can be used to provide the correctly reduced input rate. The provision of a separate DOS pin does, however, allow the output rate to be different to the input rate, and therefore faster than strictly needed. Further output processing at higher rates is then possible if this is advantageous to system requirements.

The internal workspace is double buffered when 256 point transforms are to be performed. A separate output buffer is also provided. These resources, together with separate input and output buses, allow new data to be loaded and old results to be dumped, whilst the present transform is being computed. Additional, external, input buffering is not needed to prevent loss of incoming data whilst a transform is being performed.

Fig. 4. RAM Organization with 256 Data Points (labels shown: input data load, workspace A, workspace B, load in last pass, transform, FFT data path, O/P buffer)

Fig. 5. RAM Organization with 1024 Point Transforms (labels shown: input data load, transform workspace, FFT data path, output)

Fig. 6. 1024 Point Transforms with I/P Buffer (labels shown: sample clock, power on reset, 510 parameters, system clock, PDSP16540 Bucket Buffer with WEN, WS, RES, RS, MD5:0, real and imag outputs, PDSP16510 with D, AUX, R, I and DAV)

When block overlapping is required, internally stored data will be re-used, and a proportionally smaller number of new samples need be loaded. Note that the internal window operator still functions correctly since it is actually applied during the first pass, and not whilst data is being loaded. The internal RAM organisation is shown in Fig. 4. It should be noted that the amount of overlap between I/O transfers and transforms is completely under the control of the system, since an input enable signal (INEN) and an output enable (DEN) can be used to initiate transfers.
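The effect of block overlapping on the number of new samples per transform can be illustrated with a few lines of arithmetic. The sketch assumes a single device re-using internally stored data, as described above for transforms of 256 points or less; the function names are hypothetical.

```python
def new_samples_per_transform(block_size, overlap_percent):
    """New samples that must be loaded before each transform can start."""
    if overlap_percent not in (0, 50, 75):
        raise ValueError("the device supports 0%, 50% or 75% overlap")
    return block_size * (100 - overlap_percent) // 100


def output_to_input_ratio(overlap_percent):
    """Output points produced per new input sample."""
    return 100 // (100 - overlap_percent)


for overlap in (0, 50, 75):
    print(overlap, new_samples_per_transform(256, overlap), output_to_input_ratio(overlap))
```

The ratios of 2 and 4 correspond to the clock of twice or four times the sampling rate mentioned above for tied DIS and DOS strobes.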

In the 1024 point mode there is insufficient workspace for input and output buffering in addition to working memory. The device is then configured in a mode with separate load, transform and dump operations. The internal arrangement is shown in Fig. 5. The support of an external input buffer is needed if incoming samples are not to be lost whilst a transform is in progress. This is loaded at the sample clock rate and transferred to the FFT processor as quickly as possible. In this mode the PDSP16510 always expects to receive 1024 words, regardless of the amount of block overlapping. Data stored internally cannot be re-used when block overlapping is required, and data from the external buffer must be re-read as necessary.

Fig. 6 illustrates a typical 1024 point system with an input buffer which supports complex input data. The input buffer can be provided by a PDSP16540 Bucket Buffer without the need for any external control logic. It supplies RAM for 1024 x 32 complex words, and allows transfers to the FFT Processor at the full system clock rate. The PDSP16540 also supports the standard 50% and 75% data block overlapping, but in addition allows the user to define the amount of overlap to within 32 words.

If no incoming data is to remain un-processed, the user must ensure that the time taken to acquire sufficient data to instigate a new transform is greater than or equal to the transformation time itself. The latter can be calculated from Table 4, once the system clock rate has been defined. When 1024 point transforms are performed, both the time to read data from the input buffer, and also the time to dump data, must be included in the calculation to determine the minimum time in which data can be loaded into the external buffer.

Input timing diagram (DIS, data in valid, INEN, LFLG with 50% overlap, and INEN in an edge activated system), showing T_SD, T_HD, T_SA, T_HA, T_SI, T_HI, T_FH, T_FL and T_ED.

Table 1. Advanced Timing Information with Continuous Inputs. (16510A, A0, B0, C0)

Characteristic                                       Symbol  Min  Max  Units
Data In set up Time                                  T_SD    10        ns
Data In Hold Time                                    T_HD    0         ns
INEN active going set up                             T_SA    8         ns
INEN active Hold Time                                T_HA    0         ns
INEN in-active Hold Time to ensure no load           T_HI    2         ns
INEN in-active going set up for no load operation    T_SI    8         ns
Delay to LFLG going active ( 30 pf load )            T_FH              ns
Delay to LFLG going in-active ( 30 pf load )         T_FL              ns
Min time to INEN low in edge mode                    T_ED              ns

The values 10 and 15 are also listed for the last three rows; their row and column positions are not recoverable.

The peak transfer rate is limited by the characteristics of the I/O circuits, but can be greater than the sampling rate which is determined by the transform time. When load and dump operations are not concurrent with transform operations ( as in the 1024 point modes ), then the maximum I/O rate is equal to the system clock rate, Ø. When other transform sizes are specified, the sampling rate, S, is reduced by a factor F. This is defined below where Ø is in MHz and L is the system clock low time in nanoseconds:

S = FØ, where F = 4 / (6+0.001ØL)

F is typically 0.66 and applies to all transforms except for those of 1024 points, even if INEN is driven such that concurrent operations do not actually occur (Note also that S must be synchronous to SCLK). If this causes a system limitation in a single device application, then the device can be configured for pseudo, Mode 2, multiple device operation. Separate load, transform, and then dump operations will then always occur, but DEN must be low when a transform is complete or DAV will never go active. See the section on multiple device operation.
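The sampling rate limit S = FØ is easy to evaluate for a given clock. The sketch below simply codes the formula given above; the clock low time used in the example is an illustrative value chosen here, not a figure from the source, and the function names are hypothetical.

```python
def sampling_rate_factor(clock_mhz, clock_low_ns):
    """F = 4 / (6 + 0.001 * clock * low_time), clock in MHz, low time in ns."""
    return 4.0 / (6.0 + 0.001 * clock_mhz * clock_low_ns)


def max_sampling_rate_mhz(clock_mhz, clock_low_ns):
    """S = F * clock, the peak rate when I/O is concurrent with transforms."""
    return sampling_rate_factor(clock_mhz, clock_low_ns) * clock_mhz


clock_mhz = 40.0       # system clock from the text
clock_low_ns = 10.0    # assumed low time, for illustration only
f = sampling_rate_factor(clock_mhz, clock_low_ns)
s = max_sampling_rate_mhz(clock_mhz, clock_low_ns)
print(f"F = {f:.3f}, S = {s:.1f} MHz")
```

F approaches its upper bound of 4/6 as the clock low time shrinks, which is consistent with the typical value of 0.66 quoted above.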

LOADING DATA

Data loading is controlled by three signals; DIS an input strobe, INEN a load enable, and LFLG an output flag. Detailed timing information is given in Table 1. Once sufficient data has been acquired, a transform will automatically commence. This is normally after a complete block has been loaded, except when a single device is performing overlapped transforms of 256 points or less. With 75% overlapping, transforms will commence after 25% of a new block has been loaded, and with 50% overlapping transforms commence after 50% of the data has been loaded. The remainder of the block is provided by data already stored in the internal RAM.

The data strobe is used to load data into the internal workspace RAM, and data must meet the specified set up and hold times with respect to its rising edge. DIS can be a continuous input since the device only loads data when an input enabling signal is active.

The way in which the INEN signal controls data loading is dependent on whether a single or multiple device system is to be implemented, and the status of Control Register Bit 12.

When Bit 12 is set in a SINGLE device system the INEN signal is simply used as an enable for the DIS strobes. When INEN is low, and provided the relevant set up and hold times have been satisfied, data will be loaded with the rising edge of the DIS strobe. If no gaps occur within the incoming data, INEN can be tied permanently low, provided that the sampling rate has been chosen such that transforms are completed before a new block of data is loaded. For transforms of less than 1024 points, data will then be continually processed without any loss of information. In the 1024 point modes the device will cease loading data when 1024 samples have been loaded, and even if INEN remains low no more data will be accepted until the previous results have been dumped.

In a multiple device system an edge is ALWAYS needed to commence a load operation, and Bit 12 has a different purpose. The edge is provided by INEN going low. Loading will cease when a complete block (or group of blocks with multiple concurrent transforms) of data has been loaded, even if INEN remains low. INEN must go high at some point after the minimum hold time has been satisfied, and then return low AFTER ALL DATA HAS BEEN LOADED, before a new load operation can commence. Low going edges which occur before all data has been loaded will be ignored.

The INEN edge mode is actually provided for the correct operation of multiple device systems, but if Bit 12 in the Control Register is reset in the SINGLE device mode, the edge activated operation will still be possible. With all but 256 point complex transforms, the single device edge mode of operation is identical to that of a multiple device system. With 256 point transforms, and their concurrent derivatives, the location of the low going edge in the data stream is dependent on the amount of block overlapping. The low going edge transition must be provided after 64 samples have been loaded with 75% overlapping, and after 128 samples have been loaded with 50% overlapping. With no overlapping the edge must be provided after 256 samples have been loaded.

In a single device system with Bit 12 set, INEN can be taken high to inhibit the load operation when gaps occur in the data stream. In the INEN edge activated mode gaps in the data stream can only be accommodated if the DIS clock is externally inhibited. Taking INEN high will not inhibit the loading of data in this mode.

With gaps in the data stream the peak sampling rates can be higher than continuous sampling rates. When data loading is not coincident with transform operations the peak rate can equal that of the system clock, otherwise it is reduced by the factor, F, defined earlier.

An internal synchronisation interval is necessary between the last sample being loaded with the DIS strobe and transforms being started with the system clock. This can be up to twelve system clock periods when data transfers and transforms are overlapped. The transform times given later in Table 4 are maximum values, and include these twelve periods.

When Control Register Bit 12 is set in any multiple device mode, the DEF high going edge will also initiate a load operation after it has been internally synchronised to the rising DIS edge. If the first device in a multiple device system is programmed in this manner, the transform sequence will automatically start when DEF goes in-active. The other devices need the INEN edge as usual, and must have Bit 12 reset. A fuller explanation of the use of Bit 12 in a multiple device mode is given in the section on I/O In Multiple Device Systems. Note that the use of Bit 12 in a single device system (Control Register Bits 10:9 = 00) is completely different to its use in a multiple device mode.

The LFLG output goes active in response to the DIS rising edge used to load the first data sample, and indicates that a load operation is occurring. In an edge activated system the LFLG output will go high as the result of the first high going DIS edge after INEN has gone low. In the simple INEN enabling mode, internal logic counts the number of valid inputs and detects when the programmed block length has been reached. LFLG then goes low and will go high again in response to the next valid DIS strobe. LFLG will go low when DEF is active and will go high in response to the first INEN enabled DIS edge after DEF has gone in-active.

The active going LFLG edge does not normally have any system significance, but in the block overlapping modes the in-active going edge will occur when 50% or 75% of the data has been loaded. By driving the INEN input on one device with the LFLG output from a previous device, this edge can be used to partition data between several devices in a multiple device system. It can also be used to provide an address marker for a user defined input buffer, when executing 1024 point transforms with a single device. It is not needed, however, when the input buffer is provided by the PDSP16540.

DUMPING DATA

Data output is controlled by an output strobe [DOS], a dump enable signal [DEN], and a Data Available signal [DAV]. The DAV signal is used to indicate that the internal output buffer contains transformed data, and the DEN input is used to control the outputting of that data. The output buffer within the device is clocked by the DOS input, and must be primed with a number of DOS strobes (see "user notes - stopping DOS") once a transform is complete in order to transfer data to the output pins. DAV will not go active until this priming has occurred.

timing is given in Table 2. It should be noted that the DOS input

MUST be continually present before DAV goes active. If this

is not the case the DAV output will not go active at the correct

time, and the internal output circuitry will not be primed. Once

DAV is active, however, it is possible for DOS to be irregular,

and DEN can be used to inhibit the action of the output strobe

as discussed previously. For the correct operation of the

device the user must ensure that DOS becomes continuous

and DEN remains low once DAV goes in-active.

When continuously transforming data such that new

outputs are internally available before the previous block has

been completely dumped, then DAV would normally stay

active and give no indication that one block dump had been

finished and another block started. Additional internal circuitry

is, however, provided to ensure that DAV goes inactive for one

DOS high time, thus supplying an inter block marker.

The state of the DEN input at the end of a transform is

used to control the transition of the active going edge of the

DAV output with respect to the DOS strobes. The latter are

then used to transfer data from the device to the next system

component. If the DEN input is tied low in a single device

system, the active going DAV transition will be internally

synchronised to the rising edge of a DOS clock. If DEN is not

tied low it must be guaranteed to be low at the end of the

internal transform operation for this synchronization to occur.

Since there is no external indication of this event, the user

must take care to only allow DEN to go high whilst DAV is

active, if this DAV synchronous mode is needed.

ASYNCHRONOUS DAV MODE

SYNCHRONIZED DAV OPERATION

If DEN is not active in a single device when the transform

is complete, then the device will wait for DEN to go active

before any data is dumped. This mode is suitable for applica-

tions in which output processing is under the control of a

remote host, such as a general purpose digital signal proces-

sor. The DAV output will then go active as soon as the output

buffer is full, and will not be synchronised to the DOS edge. In

such systems the DOS strobe may not necessarily be present

at this time. Table 3 gives the relevant timing information.

In this host controlled dump mode the PDSP16510 waits

for the host to activate the DEN input after DAV has gone

active. DEN then functions as an enable for the host produced

data strobes on the DOS pin. DEN may either stay active for

the complete transfer, or may be used to enable each DOS

In the DAV synchronised mode the first rising edge of the

DOS clock, after DAV has gone active, must be used to

transfer the first transformed sample from the output pins to

the next system component. It should be noted that the output

buffer will have been primed before the active DAV transition,

since DOS must be a continuous clock, and there is then no

delay before the first output becomes valid. The DAV output

can be used as a clock enable for this next device, and

transfers will continue in normal sequential order until the

required data has been dumped. DAV will then go inactive in

response to the last DOS edge which was used to transfer

data to the next device.

This mode of automatically dumping data when it is ready

finds applications in real time data flow systems, and detailed

[Timing diagram: DOS (strobes 1 to N), DATA O/P (O/P 1, O/P 2), S3:0 (Scale Tag Value), and DAV waveforms, annotated with TDD, TLZ, TDH, THZ, TVD, and TVI.]

16510A,A0,B0,C0

Characteristic

Symbol

Min

Max

Units

ns

Output Enable Time

Output Disable Time

T_LZ

T_HZ

T_DD

T_DH

T_VD

T_VI

15

ns

Data Delay Time ( 30 pf load )

Data Hold Time

2

1

DAV active Delay Time ( 30 pf load )

DAV inactive Delay Time ( 30 pf load )

10

Table 2. Output Timing with DEN tied low. ( Advanced Data )


input. When DEN and DOS are both active an internal read operation occurs, and an address generator is incremented. DAV goes inactive in response to the DOS edge needed to read the last output, unless Bit 15 in the Control Register is set. In that case DAV goes inactive when the next INEN edge is received, for reasons given later.

used to drive the INEN input. This then initializes a new load

operation only when the previous dump has been completed.

Results are transferred from the device with the rising edge of the DOS strobe when DEN is active. This is consistent with using the device in a data flow architecture, as is commonly employed in data processing systems. In a typical microprocessor based system, however, data is normally expected to become valid before the end of the data strobe produced by the processor. It is thus necessary for the user to provide a 'dummy' data strobe in order to transfer data to the outputs, which can then be read by the host during the next data strobe. In addition, further 'dummy' strobes are needed each time DAV goes active in order to prime the output circuitry. The actual output sequence is given in Table 3 for a single device system and is described more fully in "User Notes - stopping DOS".

In host controlled systems the time to dump data could be

longer than the transform time. The dump time in such a

system will dictate the maximum sampling rate that can be

used without the loss of incoming data. In the 1024 point

mode, when the loss of data is not important, the PDSP16510

is designed to not accept new data until the previous results

have been dumped. Such a system needs no input buffer, and

INEN can be permanently tied low if the edge activated mode

is not in use. If the loss of data is to be avoided an input buffer

is needed and the host must have received all the results

before a new block of data has been loaded into the buffer.

For 256 point transforms, with host controlled dumping,

it is still possible to overlap load and dump operations. The

maximum dump times, however, must be less than the load

times to avoid data corruption. If the dump overruns, previously converted outputs will actually be corrupted, rather than inputs simply not being used.

If the loss of incoming data is not important, the device

can be forced to do separate load, transform, and then dump

operations. The corruption of results will then never occur, no

matter what dump time is taken. This can be achieved by

ensuring that INEN is not active between loading a block of

data and completing the dump of the results from that data.

The same ends can be achieved if the INEN edge activated

mode ( Bit 12 reset ) is used, and the inverted DAV edge is

GENERAL DUMP CONSIDERATIONS

The tri-state drivers on the output buses are only enabled

when both DAV and DEN are active. When DEN is tied

permanently low the output bus will start to become valid from

the DOS edge which also generates the DAV output. The next

DOS edge can then be used to transfer the first output to the

next device. When DEN is driven low in response to the DAV

output, the outputs start to become valid when DEN goes low.

The Scale Tag outputs become valid at the same time as data,

and when enabled will continue to indicate the correct value

until all frequency bins have been dumped. If at any time

during the dump operation DEN goes inactive, then both the

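[Timing diagram for host controlled output: DAV, DEN, DOS (with dummy strobes, markers (1) to (4)), DATA O/P (undefined, then O/P 1, O/P 2 ... O/P N), and S3:0 (Scale Tag Value, otherwise undefined), annotated with TVI, TPS, TPW, TPH, TOH, TLZ, TDD, and THZ.]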

In this zone SCLK and DOS requirements have to be met - See "User Notes - stopping DOS"

16510A,A0,B0,C0

Characteristic

Symbol

Min

Max

Units

DEN Set Up Time

T_PS

T_PW

T_PH

T_VI

10

5

ns

Host Strobe Width

DEN Hold Time

DAV in-active going Delay ( 30 pf load )

Output Enable Time ( see Fig 13 )

Output Data Delay Time ( 30 pf load )

Output Disable Time ( see Fig 13 )

Read Cycle Time

10

15

10

T_LZ

T_DD

T_HZ

T_RC

T_OH

25

2

Old Data Hold Time

Table 3. Host Controlled Output Timing. ( Advanced Data )


[Block diagram labels for Figs. 7 and 8: HOST SYSTEM, PDSP16510, PDSP 16540 BUCKET BUFFER, PARAMETERS, POWER ON RESET, AUX, DIN, O/P, S3:0, DAV, REAL ONLY, SAMPLE CLOCK, SYSTEM CLOCK.]

Fig. 7. Host Controlled System

Fig 8. 1024 Point Real Transforms

data and scale tag outputs will go high impedance after the

delay shown in Table 3.

The host loads a block of data into the PDSP16510, using

DIS enabled by INEN, which is then automatically trans-

formed. The DAV output provides a flag indicating that the

transform is complete, and results are then read by the host

using DOS enabled by DEN. A new set of inputs is not

normally loaded until the previous results are complete. If,

however, 1024 point transforms are not to be performed,

loading new data could coincide with dumping previous re-

sults. This, however, would require a host system with sepa-

rate input and output buses, and which also allowed coinci-

dent transfers. As discussed previously, transferring results

must take no longer than loading new data to prevent corrup-

tion of the outputs.

Valid transformed data is actually available within the device from DAV going active until INEN again goes active, and a new set of data is loaded. The output tri-state drivers, however, normally go high impedance when DAV goes inactive once a dump operation has been completed. In order to support systems in which it may be necessary to read the transformed data more than once, a Control Register Bit is provided which keeps the DAV output active until a further INEN edge is received. The user must then keep track of how many outputs have been dumped before INEN is generated to start a new load operation.

The DAV output can be delayed by an amount equivalent

to the pipeline delay through the PDSP16330. This option is

invoked by setting a control bit, and allows DAV to indicate that

polar data is available at the output of the PDSP16330. When

the option is used the tri-state outputs will be enabled when

data is actually available and DEN is active, and not when DAV

eventually goes active.

In the system illustrated by Figure 7, the host also controls

the mode of operation of the FFT processor. The DEF signal

is produced from an address decode, and the control parame-

ters are loaded from the host bus by connecting the AUX

inputs to the data outputs.

Two Control Register Bits allow a range of dump size

options to be supported. In some applications the results of

interest may only lie in the lower 25 or 50% of the frequency

bins, the sampling rate having been chosen to prevent

aliasing, and the transform size having been selected to give

the required frequency resolution. In other systems it is only

necessary to output the second half of a given sized transform.

This is useful when filtering is to be performed in the frequency

domain using Overlap/Discard Fast Convolutions. With this

method FIR filters with N taps can be implemented in the

frequency domain using 50% overlapped transforms on 2N

samples. After multiplication in the frequency domain with the

required frequency response, the inverse transform is per-

formed and the first half of each output is discarded. Since only

half the results are dumped, the dump clock need not be twice

the rate of the clock used to load data.
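The Overlap/Discard procedure described above can be sketched in software. This is only an illustration of the arithmetic, not of the device interface: the naive `dft` helper stands in for the hardware transform, and the function names, the zero-filled history before the first sample, and the small block sizes are all assumptions made for the example.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) DFT; stands in for the hardware transform in this sketch."""
    n = len(x)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        result.append(acc / n if inverse else acc)
    return result

def overlap_discard_fir(signal, taps):
    """Filter `signal` with an N tap FIR using 50% overlapped 2N point blocks."""
    n = len(taps)
    size = 2 * n
    response = dft(list(taps) + [0.0] * n)        # frequency response on 2N points
    pad = (-len(signal)) % n                      # round the input up to whole blocks
    data = [0.0] * n + list(signal) + [0.0] * pad # assumed: zero history before sample 0
    out = []
    for start in range(0, len(data) - size + 1, n):   # step of N samples = 50% overlap
        spectrum = dft(data[start:start + size])
        product = [a * b for a, b in zip(spectrum, response)]
        block = dft(product, inverse=True)
        out.extend(v.real for v in block[n:])     # discard the first half of each output
    return out[:len(signal)]
```

Only the second half of each inverse transform is kept, which is why a dump option that outputs just that half is sufficient for this use.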

REAL ONLY TRANSFORMS WITH A SINGLE DEVICE

In the simplest case real transforms can, of course, be

computed by forcing zero levels on the imaginary input pins.

The device can, however, be configured to internally perform

two simultaneous real transforms instead of a single complex

transform. The block floating point logic will then use data from

both blocks when it determines the number of shifts to be

applied. This dual transform technique is used to increase the

maximum permissible sampling rates, but since an additional

data pass is required in order to un-scramble the transformed

data, the actual performance is not quite double that possible

with a complex transform of the same size. The 4 x 64 point

complex mode becomes an 8 x 64 real mode, but the change

from 16 x 16 complex transforms to 32 x 16 real transforms is

not supported.

When a real transform is performed the algorithm pro-

duces complex results for each of the incoming data blocks,

but each result only represents the first half of the frequency

domain data. This does not cause any loss of information

since the two halves are mirror images of each other. As with

complex transforms, it is necessary for a different system

configuration to be used when 1024 point transforms are

required. These are considered later, and the following only

applies to 256 or 64 point transforms.
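For readers unfamiliar with the idea of obtaining two real transforms from one complex transform, the following is a minimal sketch of the textbook technique. It is an assumption that the device's extra un-scrambling pass is equivalent to this; the document does not describe the internal algorithm. The function names and the naive `dft` helper are hypothetical.

```python
import cmath

def dft(x):
    """Naive O(n^2) forward DFT used only for illustration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def two_real_transforms(a, b):
    """Transform real blocks a and b with one complex DFT.

    Block a is placed on the real part and block b on the imaginary part.
    Only the first half of each spectrum is returned, since the second half
    of a real block's spectrum mirrors the first.
    """
    n = len(a)
    z = dft([complex(p, q) for p, q in zip(a, b)])
    fa, fb = [], []
    for k in range(n // 2):
        mirrored = z[(n - k) % n].conjugate()
        fa.append((z[k] + mirrored) / 2)      # spectrum of the real-input block
        fb.append((z[k] - mirrored) / 2j)     # spectrum of the imaginary-input block
    return fa, fb
```

The separation step touches every bin once, which is consistent with the text's remark that an additional data pass keeps the gain slightly below a factor of two.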

FULL CO-PROCESSOR OPERATION

A single device can be configured as a co-processor to a

host system in which both the loading and dumping of data is

under the control of the host. Such a system is shown in Figure

7, in which DEN is a host provided enable for host read

operations, and INEN is an enable for host write operations.

DIS and DOS are host data strobes.


In a single device system, performing non overlapped

transforms on data from a SINGLE source, only the Real input

pins are used, and the Imaginary inputs are redundant except

when configuring the device. By setting Control Register Bits

8:6 to 101, however, it is possible for a single device to accept

data from two independent sources using the real and imagi-

nary inputs. Maximum sampling rates will then only be half

those possible when a single source is used, if no incoming

data is to remain un-processed. With two sources a transform

must be completed in the time to load parallel blocks, other-

wise incoming data will be lost. With one source a transform

need not be finished until two data blocks have been acquired.

In this dual input mode results from data on the real inputs

always precede those from the imaginary inputs.

Configuration          Clock Periods

COMPLEX
  16 X 16 PT              420
  4 X 64 PT               624
  256 PT                  816
  1024 PT                3907

REAL
  8 X 64 PT               816
  2 X 256 PT             1032
  2 X 1024 PT            4699

Table 4. Computation Times in Clock Periods
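The clock counts in Table 4 convert to elapsed time by multiplying by the system clock period. The sketch below assumes the 40 MHz system clock quoted in this document and the configuration/value pairings as read from Table 4; the dictionary keys and function name are illustrative only.

```python
# Clock periods per transform, as read from Table 4.
TABLE4_CLOCKS = {
    "16x16 complex": 420,
    "4x64 complex": 624,
    "256 complex": 816,
    "1024 complex": 3907,
    "8x64 real": 816,
    "2x256 real": 1032,
    "2x1024 real": 4699,
}

def transform_time_us(clocks, sclk_hz=40e6):
    """Return the transform time in microseconds for a given clock count."""
    return clocks / sclk_hz * 1e6

def all_transform_times_us(sclk_hz=40e6):
    """Map every Table 4 configuration to its transform time in microseconds."""
    return {name: transform_time_us(c, sclk_hz) for name, c in TABLE4_CLOCKS.items()}
```

For the 1024 point complex case this gives about 97.7 microseconds, in line with the "some 98µs" quoted in the introduction.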

If block overlapping is needed, it is always necessary to load

pairs of data blocks simultaneously, using both the real and

imaginary inputs. With dual sources of data this presents no

problem, and Control Bits 8:6 should be set to 110 or 111 for

the relevant amount of overlapping. If data is from a single

source an external FIFO is needed to provide a simple delay

for a block of data. Decodes 001 through 100 from Control Bits

8:6 must be used to select the required overlap.

Thus if block overlapping is not needed Control Register Bits

8:6 should be set to 101.

This fast transfer mode is supported by a special option

on the PDSP16540 Bucket Buffer. It will acquire two 1024

point non overlapping blocks using the sampling clock, and

then transfer the results to the FFT processor at the full system

clock rate. Figure 8 shows the system arrangement. It does

not support block overlapping.

The output of the FIFO must provide data for the real

inputs. Continuous inputs can still be accepted, and each

block will initially occur on the imaginary inputs, and then occur

again on the real inputs as an output from the FIFO. The data

output sequence will consist of the results from a pair of inputs,

followed by the results obtained after the required overlap.

Thus with 50% overlapping the sequence is 1 & 2 followed by

1.5 & 2.5 followed by 3 & 4 followed by 3.5 & 4.5 etc., where

1 2 3 4 are the sequential inputs to the external FIFO, 1.5 is the

overlap between 1 & 2, and 2.5 is the overlap between 2 & 3.
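The output ordering described above can be enumerated with a short helper. This is a sketch only: the function name is hypothetical, block labels are expressed as fractional positions (1.5 meaning the block that starts half way through input block 1), and the extension to 75% overlap is an extrapolation from the 50% example rather than something the text states.

```python
def overlap_output_sequence(num_pairs, overlap=0.5):
    """List the (first, second) block labels dumped for each transform pair."""
    step = 1.0 - overlap            # advance between successive transforms
    shifts = round(1.0 / step)      # transforms produced per pair of fresh blocks
    out = []
    for pair in range(num_pairs):
        base = 1 + 2 * pair         # fresh pairs start at blocks 1, 3, 5, ...
        for j in range(shifts):
            out.append((base + j * step, base + 1 + j * step))
    return out
```

With the default 50% overlap, two pairs give (1, 2), (1.5, 2.5), (3, 4), (3.5, 4.5), matching the sequence quoted in the text.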

When eight simultaneous 64 point transforms are per-

formed, the sampling rates given in Table 5 assume that data

is from a common source. The data outputs will be in the

correct sequence from 1 to 8, corresponding to inputs 1

through 8 in normal order from a single source. When data is

from two sources the sampling rates will be halved, and the

output sequence will be 1A 1B 2A 2B 3A 3B 4A 4B, where A

and B are the dual simultaneous sources on the real and

imaginary inputs respectively. If data block overlapping is

used in either of the above cases, the eight outputs will be

followed by results from the same basic eight blocks but time

displaced to give the required overlap. If more than two

sources are to be handled the user must provide appropriate

buffering and multiplexing, and the sampling rates must be

proportionally reduced.

With 1024 point transforms all block overlaps are handled

by the buffer logic, and not by the internal RAM, but the device

must still be programmed to expect the required overlap if the

external buffer makes use of the in-active LFLG edge to mark

the overlap point. To achieve the performance given in Table

5 with 50% overlaps, the buffer must provide sufficient storage

for at least 2.5 data blocks. With 75% overlaps it must provide

storage for 2.75 blocks. This extra storage allows transfers

between devices to be only needed when a complete new

block has been acquired for 50% overlaps, and when half a

new block has been acquired for 75% overlaps. If storage is

restricted to two data blocks, only half the sampling rates given

will be possible. Transfers between devices must then occur

when a half or a quarter of a new block has been acquired.

Since the minimum time between transfers must be no less

than the transform time itself, the sampling rates must be

proportionally reduced to prevent loss of data.

SINGLE DEVICE SAMPLING RATES

In a single device system the maximum sampling rate is

dependent on the transform size, the data overlap, and

whether real or complex data is applied. Table 4 gives the

times taken to complete the transforms for the various block

sizes, which include an allowance for synchronisation be-

tween the DIS strobe and the system clock. If continuous data

is to be transformed, the time to acquire a new block of data

(or partial block with overlapping) must be at least equal to

these transform times. Load and dump times must also be

added in the 1024 modes. For non continuous transforms the

peak rate is limited by the system clock rate and the factor, F,

When two 1024 point transforms are performed with a

single device, on data from a single source, the input buffer

must be arranged to acquire two blocks before initialising a

transfer to the device. In order to improve the maximum

sampling rates possible, data should be read simultaneously

from each half of the buffer, and loaded into the real and

imaginary inputs. This halves the transfer time from the buffer

to the device, but requires the device to expect dual inputs.

Configuration         0%      50%     75%

16 X 16 COMPLEX      23.9      -
4 X 64 COMPLEX       16.1     8.0     4.0
256 COMPLEX          12.3     6.1     3.0
1024 COMPLEX          6.8     3.4     1.7
8 X 64 REAL          24.6    12.3     6.1
2 X 256 REAL         19.5     9.7     4.3
2 X 1024 REAL        12.1     6.0     3.0

Table 5 :

Guide to MAX Sampling rates (in MHz) possible from a single device system.

SCLK is 40 MHz. Where sampling rate is asynchronous to SCLK, a PDSP16540 (or similar) is assumed on the input.
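The single device sampling rate relationships given in this section, nS >= PK + 4W (scaled by 1/2 or 1/4 with overlap), (n - 4)S >= PK when DIS and DOS share one period, and 1024S > 1024B + PK + D for the 1024 point modes, can be evaluated with a few lines of code. This is a sketch under stated assumptions: the function names are hypothetical, and the 1024 point example assumes that all 1024 results plus four priming strobes are dumped at the system clock period.

```python
def min_dis_period(n, p_clocks, k, w, new_fraction=1.0):
    """Smallest DIS period S with new_fraction * n * S >= P*K + 4*W.

    new_fraction is 1.0, 0.5 or 0.25 for 0%, 50% or 75% overlap.
    """
    return (p_clocks * k + 4 * w) / (new_fraction * n)

def min_period_common_strobe(n, p_clocks, k):
    """Smallest S when DIS and DOS share one period, no overlap: (n - 4)*S >= P*K."""
    return p_clocks * k / (n - 4)

def min_period_1024(p_clocks, k, b, d, new_fraction=1.0):
    """Bound on S for 1024 point modes: new_fraction * 1024 * S > 1024*B + P*K + D."""
    return (1024 * b + p_clocks * k + d) / (new_fraction * 1024)

K = 25e-9  # 40 MHz system clock period

# 256 point complex, DIS and DOS tied together, no overlap.
rate_256_hz = 1.0 / min_period_common_strobe(256, 816, K)

# 1024 point complex, 40 MHz load and dump, 1024 results plus 4 priming strobes (assumed).
rate_1024_hz = 1.0 / min_period_1024(3907, K, b=K, d=1028 * K)
```

With the Table 4 clock counts these evaluate to roughly 12.35 MHz and 6.87 MHz, close to the 12.3 and 6.8 entries in Table 5.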


given previously.

The time taken to dump the transformed data must be no more than the load time, if continuous inputs are to be supported and I/O operations are concurrent with transforms. With block overlapping the dump time must be reduced to the time taken to load the partial block. This dump time must include four extra DOS strobes needed to prime the output circuitry when a transform is complete. These, in effect, can be added to the transform time such that with concurrent I/O and 0%, 50%, or 75% overlapping;

nS or (nS)/2 or (nS)/4 must be greater than or equal to PK + 4W

where n is the transform size, S is the input DIS period, P is the number of clock periods given in Table 4, K is the system clock period, and W is the DOS period which can be less than S if necessary. Note also that S must be synchronous to SCLK, and if an asynchronous ratio is required then a PDSP16540 input buffer should be used.

When DIS and DOS are produced from a common source the minimum allowable sampling period must be increased to allow for the extra dumping time. Thus when DIS and DOS have equal periods and, for example, there is no overlapping;

(n - 4)S must be greater than or equal to PK

The maximum sampling rates given in Table 5 allow for the extra dumping time.

The load and dump operations are not concurrent with transforms in the 1024 point modes, and an external input buffer will be needed if loss of incoming data is to be avoided. This is loaded at the sampling rate and then data is transferred to the PDSP16510 at a user defined rate. The time taken to load this external buffer must be at least equal to the sum of the time to transfer data in and out of the FFT processor and the transform time itself. When data blocks are overlapped by 50% or 75%, no more than one half or one quarter of the block, respectively, must have been loaded in the same time. In the 1024 point modes the dump time can be any user defined value, and need not be increased to allow for block overlapping. The dump time, however, does directly affect the maximum sampling rates which can be accommodated without loss of incoming data.

The maximum sampling rates for 1024 point transforms at any load and dump rate can be calculated from the following relationship:

1024S or 512S or 256S > 1024B + PK + D

for 0%, 50%, or 75% overlapping respectively. S, P, and K were defined above. B is the clock period in which data is read from the input buffer and loaded into the device, D is the total dump time allowing for the four extra DOS periods. The periods of the load and dump clocks cannot be less than the system clock period. The maximum sampling rates given in Table 5 assume that a 40 MHz I/O rate is used, and that all results are dumped.

MULTIPLE DEVICE SYSTEMS

In real time applications several devices may be used in

parallel in order to increase the sampling rate, but not to

increase the transform size. When all outputs are commoned

together, and feed a single output processor, then the data

dump time must always be less than or equal to the time taken

to load the data block ( or 50% or 25% of the time with block

overlapping ). In most configurations with block overlapping

the dump rate requirements will limit the maximum input rate,

if only one output processor is provided. This can be avoided

if the system provides separate output processors for every

device. The system clock used for internal calculations then

ultimately imposes a limit on the maximum sampling rate

possible.

A multiple device system performing complex transforms

with a single output processor is shown in Figure 9. The INEN/

LFLG signals are used to co-ordinate the segmentation of

data between devices. The in-active going edge of LFLG

instigates the load procedure in the next device, and, since

this edge can be programmed to occur either 25%, 50%, or

100% through the load operation, it can cause the next device

to commence loading before the previous one has finished. In

this manner data block overlapping is achieved. When mul-

tiple concurrent transforms are performed ( for example 4 x 64

or 8 x 64 ) two LFLG transitions are sometimes needed to

support block overlapping. This is fully explained in the section

on Mode 1 sampling rates.

[Figure 9 diagram labels: Configuration Parameters, Power on Reset, Complex Data Input, Input Clock, Output Clock; PDSP16510 devices with REAL, IMAG, O/P, and S outputs; PDSP16330 with MAG', PHASE, SCALE TAG, DATA AVAIL', and CLK.]

Fig 9. Multiple Device Configuration

In any of the multiple device modes an INEN edge transition is needed to start a new load procedure when the previous one has finished. When the LFLG output from the last device is fed back to the INEN input of the first device, continuous transforms will be executed. This continuous sequence can be started by the rising edge of DEF if Control Register Bit 12 is set in the first device (see section on Loading Data). This bit must not be set in the other devices. Since all devices are supplied from a common input bus and have a common source of control parameters, this Bit 12 inversion is best mechanized with an Exclusive OR gate in the AUX12 input line of the first device. The input can then be inverted when DEF is active but otherwise not be affected. Once the first device has been started with the DEF edge, the sequence will continue automatically using the LFLG/INEN connection between devices.
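The Exclusive OR arrangement on the AUX12 line can be modelled as a one-line truth function. This is a logical sketch only: `def_active` is a hypothetical flag that is true while DEF is asserted, whatever its electrical polarity, and the function name is illustrative.

```python
def aux12_first_device(aux12_common, def_active):
    """Value seen on AUX12 by the first device, behind an XOR gate.

    aux12_common is the level driven on the shared bus; def_active is True
    while DEF is asserted. The gate inverts the line only during DEF.
    """
    return aux12_common ^ def_active
```

With Bit 12 left clear in the common control parameters, only the first device sees it set while DEF is active, and its imaginary input data passes unchanged at all other times.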

In many applications data is transformed continuously after power on, and the concept of a first data sample does not exist. If, however, the opposite is true, the first data sample must be present on the input pins such that it can be loaded with the second rising DIS edge after DEF has gone inactive. The data must meet the set up and hold times given in Table 1, and DEF itself must meet the parameters normally met by the INEN rising edge. The latter requirement is necessary to avoid a possible one DIS cycle variance, due to the internal DEF synchronization logic. If the position of the first data sample is not important, it is not necessary for DEF to have any set up specification.

Without the feedback from the last device, the first device

would wait for another externally supplied initialising pulse. In

such a system with N devices in parallel, then N continuous

transforms must be executed before the first device can wait

for a new INEN input.

When only one output processor is provided the data

outputs from all devices are connected together, and internal

logic will enable the tri-state outputs when a device is ready to

output data i.e. DAV goes active. When data blocks are

overlapped it is possible that the output rate requirements will

limit the input sampling rate (see section on Multiple Device

Sampling Rates). Additional output processors will remove

this restriction, and the correct choice of multiple device

operating mode will optimise the sampling rates that can be

achieved with a given number of devices.

The synchronisation intervals, necessary to co-ordinate

input and output operations with the transform operation, lead,

in effect, to some uncertainty in the time needed to complete

a transform. Thus a particular device in a multiple device

system can effectively complete a transform in less system

clock periods than another device in the same system. To

prevent one device turning on its output bus before the

previous one has finished, it is either necessary to use a faster

output rate than would otherwise be required, or to use the

inverted DAV output from one device to drive the DEN input of

the next. The latter option allows DIS and DOS to be con-

nected together, and ensures that the second device will not

output data until the first device has finished.

This method of driving the DEN input from the inverted DAV output from a previous device requires a change to the single device DAV and DEN operation. If DEN is active at the end of a transform in a multiple device system, the DAV output will go active when the output circuit has been primed by the DOS strobes. This operation is identical to that provided for a single device system, and is transparent to the user as long as DEN and DOS are active. If DEN is not active, however, the DAV output will not asynchronously go active as happens in a single device system. Instead DAV will only go active when DEN eventually goes active. Since DEN is the inverted DAV output from a previous device, it is thus never possible for two devices to be actively outputting data. The DAV active going edge remains synchronised to the DOS strobe since the DEN input will only go active when a previous DAV goes inactive. A further change to the output circuitry ensures that the output buffer is primed even though DEN is not active. The first word, however, only progresses as far as the final output latch. The output bus is not enabled, and address increments do not occur, until DEN is finally received. This modification to the internal control logic ensures that the output buffer does not impose unnecessary gaps between consecutive transforms. These gaps would, in turn, force the required DOS frequency to be greater than the DIS frequency (or greater than twice or four times the frequency with 50% and 75% overlaps).

[Timing diagram: DEF, DIS / DOS, and INTERNAL START, with INEN, LFLG, and DAV for devices A, B, and C, showing LOAD A1, TRANSFORM A1, DUMP A1, LOAD B1, TRANSFORM B1, DUMP B1, LOAD C1, TRANSFORM C1, DUMP C1, and LOAD A2.]

Fig 10. Three Device System with Separate Load, Transform, and Dump Operations

The system illustrated by Figure 9 produces a common DAV output by OR'ing together all the individual, active low, DAV outputs. This is not guaranteed to give an indication when one transform has finished, and the next one has started, since it may simply glitch as one DAV goes inactive and the next one goes active after some delay. This glitch will not cause system problems since it occurs at a point clear of the high going edge of the DOS strobe. To provide a marker for the end of a transform each inactive going DAV edge should set its own latch, which is then reset by a subsequent DOS edge. The output of the latches can then be OR'd together if necessary.

100% of the block has been loaded. When multiple transforms

are performed concurrently (for example 4 x 64) a LFLG

transition occurs at the relevant point whilst the first block in

the group is being loaded. LFLG then goes high again and

returns low at the overlap point in the last block. This double

LFLG transition allows two devices to support 50% block

overlapping, since the first transition from the first device can

be used to initiate the load procedure in the second device.

The second transition from the second device then initiates a

new load procedure in the first device. The additional edges

from each device have no effect since they occur when the

device they are driving is already doing a load operation.

In such a two device system supporting 50% overlaps the

inverted DAV from the first device must drive the DEN input of

the second device. The data dumping time is then shared

equally between both devices. The second device only out-

puts data when the first has finished, but both dumps must be

finished in the time taken to load the group of blocks if only one

output processor is provided. Without the DAV/DEN connec-

tion one device would only have had the time needed to load

half of one sub block in which to dump its data.

In a similar manner four devices will handle 75% overlaps

when concurrent multiple transforms are to be computed. The

second, third, and fourth devices make use of the first transi-

tion, and ignore the second. The first device uses the second

transition from the last device, and ignores the first. With the

DAV/DEN connection each device will have one quarter of the

load time to dump its data when a single output processor is

provided .

More than two devices will provide increased perform-

ance for multiple transforms with 50% overlapping, and more

than four devices will increase the performance with 75%

overlapping. External logic is then needed to ensure that each

device only uses the correct LFLG transition. Any device

should only use the negative LFLG transition from a previous

device if its own LFLG is low, and the LFLG output from the

previous device plus one is low.

Three multiple device operating modes are actually pro-

vided, and are selected with Control Register Bits 10:9. The

choice of a particular mode is application dependent, and will

effect the maximum sampling rate achievable with a given

number of devices.

MULTIPLE DEVICE SAMPLING RATES

MODE 1. (BITS 10:9 = 01)

In this mode transfers in and out of the device are concurrent

with transform operations. This mode must not be used for

1024 point transforms due to internal memory size restric-

tions. When real transforms are performed in this mode, only

the real data input is used, regardless of the amount of block

overlapping.

The increase in performance is directly related to the

number of devices provided, but the input and output rates are

limited to FØ where F and Ø are as defined previously. Within

this restriction the theoretical performance is given by;

MODE 2 (BITS 10:9 = 10)

NnS > PK+4W, or 0.5NnS > PK+4W, or 0.25NnS > PK+4W

This mode is suitable for all transform sizes, since separate

load, transform, and then dump operations occur. More de-

vices than required by Mode 1 are necessary to achieve a

given sampling rate, but the input and output rates can be any

value up to the full system clock rate with the A grade part. As

with Mode 1, additional output processors are needed to

avoid the sampling rate restriction imposed by block overlap-

ping.

for 0%, 50%, or 75% overlapping. N is the number of devices,

n is the transform size, S is the DIS strobe period, P is the

number of system clock periods given in Table 4, K is the

system clock period, and W is the DOS strobe period. Note

that DIS should be synchronous to SCLK, and also that DOS

should be synchronous to SCLK.

If an output processor is provided for every device, two

devices with 50% block overlapping or four devices with 75%

block overlapping will give the same sampling rates as a single

device with no overlapping. If only one output processor is

provided, the two or four times increase needed in the output

rate over the input rate, usually imposes a limit on the input

rate, since the output rate is limited to a factor, F, of SCLK.

In this operating mode the DIS and DOS strobes can

often be tied together, since a faster DOS strobe gives no

improvement in the sampling rates possible. This remains true

even when the output rate must be twice or four times the input

rate due to block overlapping. Options can then be used which

internally divide the DIS strobe by two or four, and thus allow

the input to be driven by the faster DOS strobe.

The number of devices, N, needed to achieve a given

sample rate can be derived from the following formula:

NnS > nS + PK + D for no overlapping

NnS > 2 X [nS + PK + D] for 50% overlapping

NnS > 4 X [nS + PK + D] for 75% overlapping

N is the number of devices, n is the transform size, S is the DIS

strobe period, P is the number of system clock periods given

in Table 4, K is the system clock period, and D is the total dump

time including 4 extra DOS periods as discussed previously.

The DIS and DOS periods are any value defined by the user,

down to the system clock period with the A grade part. Note

that DIS should be synchronous to SCLK, and also DOS

In this mode the LFLG goes in-active after 25%, 50%, or


should be synchronous to SCLK.
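As a rough illustration of the Mode 2 inequalities, NnS > m[nS + PK + D] with m = 1, 2, or 4, the Python sketch below finds the smallest N that satisfies them. The function name, the nanosecond units, and the example figures are assumptions made for illustration only: P = 3920 is simply 98µs divided by a 25ns clock rather than a value read from Table 4, and D is assumed to be (n + 4) DOS periods.

```python
def min_devices_mode2(n, s_ns, p_clocks, k_ns, d_ns, overlap=0):
    """Smallest N with N*n*S > m*(n*S + P*K + D), m = 1, 2 or 4."""
    multiplier = {0: 1, 50: 2, 75: 4}[overlap]
    load_time = n * s_ns                      # nS, time to load one block
    required = multiplier * (load_time + p_clocks * k_ns + d_ns)
    # The inequality is strict, so take the next integer above the ratio.
    return required // load_time + 1


# Hypothetical example: 1024 points, DIS = DOS = SCLK = 40MHz (25ns period).
n, s, k, w = 1024, 25, 25, 25
p = 3920                 # assumed transform time in SCLK periods (about 98us)
d = (n + 4) * w          # assumed dump time: n results plus 4 extra DOS periods

for overlap in (0, 50, 75):
    print(overlap, min_devices_mode2(n, s, p, k, d, overlap))
```

With these assumed figures the no-overlap case gives six devices, and the 50% and 75% cases give twelve and twenty-four.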

be a simple power on reset if the operating mode is fixed once

power is supplied. The AUX pins are also used to provide the

imaginary component of the complex input data. Thus, if

complex inputs are needed, the mode definition must be

implemented through a tri-state buffer which is only enabled

when DEF is active. The imaginary input data must be

disabled during this time.

Table 6 lists the functionality of each of the bits in the

mode control register, and further explanations are as fol-

lows:-

In this mode increasing the output clock frequency will

allow a greater continuous input rate. The provision of

separate DIS and DOS pins allows this to be mechanized, and

the DOS frequency can be increased to that of the system

clock used internally. When the sum of the dump time

(including four extra DOS periods for output priming ) plus 12

system clock periods (the transform time variation caused by

input synchronization) is less than the load time, one device

will be guaranteed to have finished dumping before the next

one starts. The inverted DAV to DEN connection between

devices is then not needed, and all DEN inputs can be

grounded.
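The condition for grounding all DEN inputs can be checked numerically. The sketch below is only an illustration: the function name and the example strobe periods are hypothetical, and the dump time is assumed to be the number of output points plus four priming DOS periods.

```python
def den_can_be_grounded(load_ns, dump_ns, sclk_ns):
    """True when dump time plus 12 SCLK periods is less than the load time."""
    return dump_ns + 12 * sclk_ns < load_ns


# Hypothetical figures: 256 points, DIS period 100ns, DOS and SCLK period 25ns.
points, dis_ns, dos_ns, sclk_ns = 256, 100, 25, 25
load_ns = points * dis_ns              # time to load one block
dump_ns = (points + 4) * dos_ns        # assumed: 4 extra DOS periods for priming

print(den_can_be_grounded(load_ns, dump_ns, sclk_ns))
```

If DIS were run at the same 25ns period as DOS in this example, the load time would be shorter than the dump time and the inverted DAV to DEN connection would still be needed.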

The LFLG transitions occur at the same times as Mode 1,

except that the double transition does not occur with multiple

concurrent transforms. Fig. 10 illustrates a timing sequence

with three devices. Real transforms still only use the real

inputs regardless of the amount of block overlapping.

BITS 2:0

These bits define one of 7 options for the sample size and

type of data. In the 1024 point options the device will assume

the non concurrent operating mode, regardless of whether a

single or multiple device system is specified. The internal

control logic will then ensure that data is loaded, transformed,

and dumped in sequential operations.

For other data set sizes, loading, transforming, and

dumping, can all occur simultaneously with a single device;

the actual overlap will be dependent on the relative occur-

rences of the INEN input. Only in Mode 1 can concurrent

operations be done with multiple devices.

MODE 3 (BITS 10:9 = 11)

Multiple device Mode 3 is provided in order to improve the

performance when block overlapping is needed, and separate

output processors are provided. In this mode transfers in and

out of the device are never concurrent with transform opera-

tions. The device will actually load extra data such that the

required data to perform two overlapped transforms is stored

internally. The amount of internal RAM prohibits the use of this

mode when performing overlapped 1024 point transforms.

LFLG will go in-active after a normal data block has been

loaded, regardless of the overlap selected. The device, how-

ever, continues to load more data. Thus, for example, in the 4

x 64 mode, five 64 point blocks will be loaded. This technique

allows each device in the system to complete two or four

overlapped transforms (depending on the amount of overlap)

before any new data is needed. When doing a straightforward

256 point transform the device will load 256 + 128 data points.

The full benefits are only obtained if more than one output

processor is provided, but an extra processor is not always

necessary for every device. Sampling rates up to the system

clock rate are possible. The equations defining the sampling

rates become:

BIT 3

This bit determines the number of right shifts built into the

data path. In either condition only two right shifts occur during

the first pass. If the bit is reset, three shifts occur in subsequent

passes and the block floating point scheme allows up to fifteen

compensating left shifts. If it is set, two shifts occur in every

pass and overflow is possible. This is indicated by reducing

the number of compensating left shifts to fourteen, and using

scale tag value fifteen to indicate that overflow has occurred.

BITS 5:4

These bits define the choice of window operator. If other

windows are needed they must be applied externally. The

fourth option is used to specify the inverse transform, which

does not require the use of a window operator. When 16 x 16

complex transforms are specified by Bits 2:0, only the rectan-

gular window can be used. The use of any of the other options

will cause the device to enter an internal test mode.

BITS 8:6

(N - 1)L > 2PK + 2D for 50% overlaps

(N - 1)L > 4PK + 4D for 75% overlaps

These bits define 0%, 50%, or 75% data block overlap-

ping, and the division factor on the DIS input. Overlapping

must not be specified with 16 x 16 complex transforms.

Two decodes allow the DIS input to be divided by two or four,

when 50% and 75% overlapping is respectively needed.

These options allow the DOS and DIS input pins to be still

supplied from a common source, even though the output rate

must be faster than the input rate. The frequency of this source

would be dictated by the output rate requirement, with the

input rate internally reduced by the correct amount.

Special decodes are provided to support real only trans-

forms from dual sources, using both the real and auxiliary

inputs. When data is from a single source, and no overlaps are

needed, only the real input should be used. If 50% or 75%

overlaps are needed from a single source of real data, the

device always expects blocks to be simultaneously loaded. An

external FIFO is then needed to supply data to the real inputs

after a delay of one block. Each block is thus loaded twice,

where L is the time needed to load a normal block of data but

not including the extra data, P is the number of system clock

periods given in Table 4, K is the system clock period, and D

is the total dump time including 4 extra DOS periods. As

before, both DIS and DOS must be synchronous to SCLK.
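The Mode 3 inequalities, (N - 1)L > 2PK + 2D for 50% overlaps and (N - 1)L > 4PK + 4D for 75% overlaps, can be solved for N in the same way. This is a minimal sketch: the function name is hypothetical, and the values of P and D in the example are placeholders, not figures from Table 4.

```python
def min_devices_mode3(load_ns, p_clocks, k_ns, dump_ns, overlap):
    """Smallest N with (N - 1)*L > m*(P*K + D), m = 2 (50%) or 4 (75%)."""
    multiplier = {50: 2, 75: 4}[overlap]
    required = multiplier * (p_clocks * k_ns + dump_ns)
    # (N - 1) must strictly exceed required / L, hence the "+ 2".
    return required // load_ns + 2


# Placeholder figures: L = 256 samples at 25ns, P = 800 clocks of 25ns,
# D = 6500ns. None of these are taken from the data sheet tables.
load_ns = 256 * 25
print(min_devices_mode3(load_ns, 800, 25, 6500, 50))
print(min_devices_mode3(load_ns, 800, 25, 6500, 75))
```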

When real transforms are to be performed on single

sourced data, an external FIFO is needed to provide pairs of

data blocks. These are loaded simultaneously into the real

and imaginary inputs. See the section on real transforms.

OPERATING MODES

The operating mode of the PDSP16510 is determined by

the condition of 16 bits in an internal Control Register. The

status of these bits is defined by the inputs present on the

AUX15:0 pins when the DEF input is active. The DEF input can


firstly through the Auxiliary inputs and then through the Real

inputs.

BITS 10:9

These bits define a single device system, or one of three
multiple device possibilities. The choice between the first and
second multiple device mode is dependent on the transform
size and the sampling rate needed. The third mode should
only be used when overlapped multiple transforms with less
than 1024 points are to be performed simultaneously. It
changes the LFLG logic and allows sampling rates up to the
system clock rate to be achieved with multiple output processors.

BIT 11

When this bit is set the PDSP16510 will not generate DAV
until 24 DOS clocks after data was actually valid. In this case
the output tri-state drivers will be enabled at the correct time,
even though the DAV signal was not externally valid. Host
controlled dumping should not be used.

BIT 12

When this bit is set in the single device mode, the INEN
input is a simple load enable signal. When it is reset an INEN
edge is needed at the end of a load sequence before a new
one can commence.

When it is reset in a multiple device mode it has no
action, but when it is set it will cause the DEF high going edge
to also initiate a load operation.

BITS 14:13

These bits allow four dump size options to be provided.
Individual frequency bins are not accessible.

BIT 15

Under normal circumstances DAV would be expected to
go invalid when a transform has been dumped. In some
applications, however, it may be necessary to read the outputs
more than once. When this bit is set, DAV will remain valid until
the next INEN input, and will indicate that the transformed data
still remains in the internal buffer. As soon as the next INEN is
received the transformed data will be overwritten. Whilst DAV
remains active the output tri-states will be enabled.

BITS   DECODE   OPTION

2:0    000      16 x 16 COMPLEX
       001      4 x 64 COMPLEX
       010      256 COMPLEX
       011      1024 COMPLEX
       100      8 x 64 REAL
       101      2 x 256 REAL
       110      2 x 1024 REAL
       111      NOT USED

3      0        SHIFT 3 PLACES AFTER PASS 1
       1        ALWAYS SHIFT 2 PLACES

5:4    00       RECTANGULAR
       01       HAMMING WINDOW
       10       BLACKMAN-HARRIS
       11       INVERSE TRANSFORM

WINDOW OPERATORS

Since only a finite segment of a signal can be observed and

processed at any one time, it is impossible to obtain pure

spectral lines. Discontinuities are introduced at the bounda-

ries of the observation interval which lead to spectral leakage.

Windows are weighting functions applied to the data in order

to reduce these discontinuities at the boundaries.

In the time domain the signal has to be observed through

a finite window as a matter of course. This is in fact equivalent

to multiplying the signal with a set of uniform weights i.e. a

rectangular window operator. In the frequency domain the

spectrum of the data will be the spectrum of this weighting

function shifted to the sinusoidal frequencies of the compo-

nents in the data.

The rectangular window has a Fourier Transform which is

a SINC(X) function. This has sidelobes which are only 13dB

down from the main lobe. This severely limits the dynamic

range of the system since a second sinusoid in close proximity

would have its main lobe swamped by this side lobe. This

would occur if its amplitude was a mere 13dB down from the

first sinusoid.

Window operators are thus mathematically constructed

to cancel these sidelobes as far as possible. Unfortunately this

is normally done at the expense of making the main lobe

spread over more frequency bins. This reduces the ability of

the system to resolve two frequencies, and can only be

overcome by using more data samples. This may not always

be possible because of other system constraints.

8:6    000      NO OVERLAP
       001      50% OVERLAP
       010      50% OVERLAP AND DIS ÷ 2
       011      75% OVERLAP
       100      75% OVERLAP AND DIS ÷ 4
       101      DUAL SOURCE, NO OVERLAP
       110      DUAL SOURCE, 50% OVERLAP
       111      DUAL SOURCE, 75% OVERLAP

10:9   00       SINGLE DEVICE
       01       N DEVICES, CONCURRENT I/O
       10       N DEVICES, LOAD-TRANS-DUMP
       11       SPECIAL MULTIPLE TRANSFORM

11     0        DAV NOT DELAYED
       1        24 CLK DAV DELAY

12     0        INEN EDGE ACTIVATED
       1        INEN IS SIMPLE ENABLE

14:13 00

O/P FIRST QUARTER

O/P FIRST HALF

O/P LAST HALF

01

10

11

O/P ALL RESULTS

A common rule of thumb defines the resolution of an FFT

system as half the full width of the mainlobe. The width of the

mainlobe for a rectangular window is two frequency bins; for

the Hamming window it is four bins; for the Blackman-Harris

15     0        NORMAL DAV
       1        KEEP DAV ACTIVE TILL INEN

Table 6. Mode Control Bit Allocations
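The bit allocations can be combined into the 16 bit word presented on AUX15:0 while DEF is active. The Python sketch below only illustrates the packing: the function and parameter names are hypothetical, the field positions are those described for Bits 2:0 through 15, and the dump size field (Bits 14:13) is passed as a raw two bit value.

```python
def build_control_word(size, two_shifts=0, window=0, overlap=0, mode=0,
                       dav_delay=0, inen_enable=0, dump=0, hold_dav=0):
    """Pack the mode control fields into a 16 bit integer."""
    fields = [
        # (value, lowest bit position, field width)
        (size, 0, 3),          # Bits 2:0   transform size and data type
        (two_shifts, 3, 1),    # Bit 3      scaling option
        (window, 4, 2),        # Bits 5:4   window operator or inverse
        (overlap, 6, 3),       # Bits 8:6   overlap and DIS division
        (mode, 9, 2),          # Bits 10:9  single or multiple device mode
        (dav_delay, 11, 1),    # Bit 11     delayed DAV
        (inen_enable, 12, 1),  # Bit 12     INEN behaviour
        (dump, 13, 2),         # Bits 14:13 dump size option (raw code)
        (hold_dav, 15, 1),     # Bit 15     keep DAV active until INEN
    ]
    word = 0
    for value, shift, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit in its width")
        word |= value << shift
    return word


# 1024 point complex transform, Blackman-Harris window, single device.
print(f"{build_control_word(size=0b011, window=0b10):#06x}")

# 256 point complex, Hamming window, 50% overlap, load-transform-dump mode.
print(f"{build_control_word(size=0b010, window=0b01, overlap=0b001, mode=0b10):#06x}")
```

The range check matters for the 16 x 16 complex option, where any window code other than rectangular would place the device in its internal test mode; a wrapper could reject that combination explicitly.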


trated in Table 7. The results are obtained from the reference

quoted, which should be consulted for a full mathematical

treatment. The significance of each parameter is outlined

below :


Highest Side Lobe Level


The inherent rectangular window has sidelobes which

are only 13dB down from the mainlobe. These severely limit

the dynamic range. The object of the window is to improve this

situation with better side lobe attenuation.


Mid-Point Loss


In line with the filter concept it is possible to conceive of

an additional processing loss for a tone of frequency mid-way

between two bins. This is defined as the ratio of the coherent

gains of two tones, one at the mid-point and one at the sample

point. It is expressed in dB in Table 8.

Fig. 11. External Window Generator
[diagram labels: sample clock, counter, window PROM, PDSP16116 complex multiplier, PDSP16510]

Overall loss

window it is six bins.

An overall figure for the reduction in signal to noise ratio

can be obtained by adding the mid-point loss to the reciprocal

of the equivalent noise power bandwidth in dB. It is a measure

of the ability of the window to detect single tones in broadband

noise. The variance between windows is less than 1dB.
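The mid-point and overall loss figures can be estimated numerically for any window built from cosine terms. The sketch below is an illustration under stated assumptions: the function names are hypothetical, a 1024 point window is assumed, the mid-point loss is taken as the response to a tone half a bin off centre relative to an on-bin tone, and the equivalent noise bandwidth is computed as N times the sum of squares divided by the square of the sum.

```python
import cmath
import math


def cosine_window(a, b, c, size):
    """w[n] = a - b*cos(2*pi*n/N) + c*cos(4*pi*n/N) for n = 0..N-1."""
    return [a - b * math.cos(2 * math.pi * n / size)
            + c * math.cos(4 * math.pi * n / size) for n in range(size)]


def window_losses(w):
    """Return (mid-point loss, overall loss) in dB for window samples w."""
    size = len(w)
    total = sum(w)                     # coherent gain for an on-bin tone
    # Coherent gain for a tone exactly half a bin away from the bin centre.
    mid = abs(sum(x * cmath.exp(-1j * math.pi * n / size)
                  for n, x in enumerate(w)))
    mid_point_loss_db = -20 * math.log10(mid / total)
    enbw_bins = size * sum(x * x for x in w) / total ** 2
    overall_loss_db = mid_point_loss_db + 10 * math.log10(enbw_bins)
    return mid_point_loss_db, overall_loss_db


for name, coeffs in [("Rectangular", (1.0, 0.0, 0.0)),
                     ("Hamming", (0.54, 0.46, 0.0)),
                     ("Blackman-Harris", (0.42323, 0.49755, 0.07922))]:
    mid_db, overall_db = window_losses(cosine_window(*coeffs, 1024))
    print(f"{name}: mid-point {mid_db:.2f} dB, overall {overall_db:.2f} dB")
```

Under these assumptions the overall loss comes out near 3.9dB, 3.1dB, and 3.5dB for the three windows, which agrees with the statement that the variation between windows is less than 1dB.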

The latter two windows are actually supported by the

PDSP16510. These are constructed on the fly as needed, and

take the general form:

A - B cos(x) + C cos(2x), where x = 2πn/N, n = 0 to N-1

For Hamming, A = 0.54, B = 0.46, C = 0

For Blackman-Harris, A = 0.42323, B = 0.49755,C=0.07922
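For reference, the general form can be evaluated directly in software, for example when checking results against an external window generator. This is a minimal floating point sketch; the function name is hypothetical and it does not model the 16 bit representation used inside the device.

```python
import math


def window_coefficients(a, b, c, size):
    """Return w[n] = A - B*cos(x) + C*cos(2x), x = 2*pi*n/N, n = 0..N-1."""
    coefficients = []
    for n in range(size):
        x = 2 * math.pi * n / size
        coefficients.append(a - b * math.cos(x) + c * math.cos(2 * x))
    return coefficients


HAMMING = (0.54, 0.46, 0.0)
BLACKMAN_HARRIS = (0.42323, 0.49755, 0.07922)

hamming = window_coefficients(*HAMMING, 1024)
blackman_harris = window_coefficients(*BLACKMAN_HARRIS, 1024)

# Both windows are smallest at the block edge and reach 1.0 at mid-block.
print(hamming[0], hamming[512])
print(blackman_harris[0], blackman_harris[512])
```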

6.0dB Bandwidth

This figure, expressed in bin widths, represents the ability

of the window to resolve two tones and should be as close to

unity as possible. As the highest sidelobe level is reduced, this

parameter tends to get worse, and a compromise must be

used when choosing a window.

These windows can be applied to any of the transform

size options, except the 16 x 16 complex variant. When the

latter is specified the rectangular window option MUST be

selected, or the device will be configured in an internal test

mode.

If other operators are required these must be applied

externally. This can be conveniently achieved with either a

PDSP16112 or a PDSP16116, both of which are complex

multipliers but with different accuracies. Fig. 11 shows how

either one can be configured to perform two separate multipli-

cations with one input common to both. This arrangement is

necessary to perform the window function on complex inputs.

Important features of the windows generated by

PDSP16510, and other commonly used windows, are illus-

Overlap Correlation

In many practical systems the squared magnitudes of

successive transforms are averaged to reduce the variance of

the measurements. If, however, a windowed FFT is applied to

non overlapping partitions of the sequence, data near the

boundaries will be ignored since the window exhibits small

values at those points. To avoid this loss partitions are usually

overlapped by 50% or 75%, which might, at first sight, remove

the need to average successive transforms. If non-windowed

Window                      Highest     Mid-Point   Overall    6dB         Overlap Correlation
Operator                    Side Lobe   Loss dB     Loss dB    Bandwidth   75%        50%

Rectangular                 -13         3.92        3.92       1.21        75         50
Hamming                     -43         1.78        3.1        1.81        70.7       23.5
Dolph-Chebyshev [C = 3.5]   -70         1.25        3.35       2.17        60.2       11.9
Kaiser-Bessel [C = 3]       -69         1.02        3.55       2.39        53.9       7.4
Blackman                    -58         1.1         3.47       2.35        56.7       9
Blackman-Harris [3 term]    -67         1.13        3.45       1.81        57.2       9.6

Table 7. Window Performance (from The use of Windows for Harmonic Analysis. F J Harris. Proc IEEE Vol 66. Jan 1978)


Figures given for the dynamic range of a system must be

carefully interpreted, since there is no exact definition of the

measurement. Three different ways of measuring dynamic

range have been investigated using 1024 point transforms.

The ‘best’ dynamic range figures will be obtained with

single tone measurements, and these results are often quoted

to indicate the need for greater bit accuracies. The measure

is the ratio of a full scale sinusoid to the average noise level

and the results will be essentially independent of the window

operator. The results given by the PDSP16510 are compared

to various other configurations in the first column of Table 8.

With this method the dynamic range is bound to improve as

more bits are used to represent the data. Theoretically 6 dB of

dynamic range will be obtained for every bit representing the

input data, if the internal arithmetic accuracy gives no degra-

dation in performance. In practice this improvement has no

significance since the incoming waveforms will be much more

complex than a single sinusoid.
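The 6dB per bit figure follows from 20 log10(2), since each extra bit doubles the representable amplitude range. The short sketch below shows the arithmetic only; the function name is hypothetical and it ignores quantisation noise shaping and arithmetic effects.

```python
import math


def ideal_dynamic_range_db(bits):
    """Amplitude range of a word of the given width: 20*log10(2**bits)."""
    return 20 * math.log10(2 ** bits)


for bits in (1, 16, 24):
    print(bits, round(ideal_dynamic_range_db(bits), 2))
```

These are upper bounds for a single tone; the measured figures in Table 8 are lower because of the effects described in this section.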

An alternative method of determining dynamic range is

with a slot noise test. White noise is passed through a narrow-

band notch filter, several frequency bins wide, and the FFT

computed. There is no noise in the filtered slot at the input to

the FFT, but there is noise in the frequency bins corresponding

to the width of the notch. Dynamic range is measured as the

difference in dB of the average signal power and the average

noise power and can be considered to give more useful

results. Comparative results from various configurations are

also given in the second column of Table 8. The performance

with 24 bit data is seen to be little better than that obtained with

the PDSP16510. This can be attributed to the scaling scheme,

word growth, and rounding method used within the device.

When two nearby tones are to be capable of detection,

the window operator will dictate the performance of the

system. The final column in Table 8 illustrates the results

obtained using two sinusoids of different amplitudes, with the

larger one residing mid-way between two frequency bins, and

the smaller 5.5 bins away. The two frequencies are five bins

apart to avoid the effects of the mainlobe widths. The dB

figures given are the difference in amplitude between the two

signals when the smaller one is still just detectable as a

separate peak from the larger one.

Arithmetic Accuracy                      Max Tone     Slot Noise   2 Tones with
                                         WRT Noise    Test         Freq Spread

16 bit, unconditional scaling            60           44           45
24 bit arithmetic with unconditional     88           74           67
scaling, 16 bit inputs
16 bit inputs with PDSP16510 block FP    61           65           63
Full 32 bit floating point with          93           82           67
16 bit inputs

Table 8. Comparative Dynamic Range Measurements

transforms are overlapped by 75% or 50%, then 75% or 50%

of the data will be correlated. When windows are applied,

however, the data common to both transforms will be operated

upon by different portions of the window waveform. The

difference in these portions will dictate the amount of correla-

tion between overlapped data. At 50% overlap Table 7 shows

that with all windows the data is virtually independent, and

successive averaging would still be needed. At 75% overlap

figures are obtained which are closer to the 75% correlation

obtained with no window.

Examination of Table 7 shows that the Blackman-Harris

window gives performance very similar to that of the Kaiser-

Bessel and Dolph-Chebyshev windows. The latter two win-

dows can not be computed as they are needed since they are

mathematically too complicated. The values are normally pre-

computed and stored in a ROM; this would need to contain 1M

bits to match the accuracy of the rest of the system.

Use of the Hamming window gives worse dynamic range

than the more complex windows, but it has less effect on the

overlap correlation and it has a smaller main lobe width.

SPECTRAL PERFORMANCE

There are two important parameters in the measurement

of spectral response: resolution and dynamic range. Resolu-

tion defines how closely two sinusoids can be spaced in

frequency and still be identified; dynamic range defines how

great the difference in the amplitudes of the sinusoids may be

and yet the smaller one still identified. Resolution is deter-

mined by the observation time [i.e. the width of the frequency

bin] and the window operator that is used. Dynamic range is

also determined by the window operator, but in a hardware

implementation it is also influenced by the number of bits used

to represent the data throughout the calculation.

The hardware effects include the accuracy of the A/D

converter, the number of bits representing the window opera-

tor and the twiddle factors, and the way the growth in word

length is handled as the FFT calculation proceeds. The

obvious way to overcome these limitations is to use floating

point arithmetic; but in real life the accuracy of the A/D

converter is fixed and the sample size is limited. Floating point

arithmetic is thus an overkill solution for the majority of

applications. This is especially true for transform sizes up to

1024 points, which is the intended application area.

This technique illustrates the performance of the window,

since the amount by which sidelobe structure of the larger

signal swamps the mainlobe of the smaller signal will deter-

mine if the smaller is detected. The theoretical attenuation of

the highest sidelobe levels, with respect to the mainlobe, for

the window options provided by the PDSP16510 have been

given in Table 7, and represent the dynamic range that can be

obtained if arithmetic effects are ignored. The results in the

final column in Table 8 are the practical results given by the

device, and as with the slot noise test indicate that the

arithmetic scheme used by the PDSP16510 is equivalent to

using 24 bit data. The Blackman Harris window was used in all

cases.


USER NOTES - STOPPING DOS

(1) GENERAL DESCRIPTION

The transform is calculated internally fully synchronous to
SCLK. However, as all outputs are referenced to DOS, a
transfer has to be made between the two clocks. In addition,
some dummy DOS strobes are needed to operate the internal
control logic, and to advance data from the internal RAMs to
the output pins.

The simplest configuration for the device is to have
DOS running continuously and for DEN to be permanently
active. When this happens the user will just be aware of data
appearing on the output pins on the same DOS cycle when
DAV goes active. However, there are many situations where
either DOS is not continuously running, or DEN is not
permanently active. To help explain how to operate the device
in these situations, the internal operation of the output circuits
must be described. For those who are not going to be
interrupting DOS, the remainder of this section can be
ignored.

(2) INTERNAL RAM - GENERAL DESCRIPTION.

For single device operation of transforms less than 1024
points, the internal RAM is shared between three separate
operations which enable the device to output old transformed
results, calculate the current transform, and input new data
ready for the next transform. All these operations, along with
the internal control logic, are controlled by a 12-cycle state
machine. The RAM operations are:

(a) 2 cycles in every 12 are dedicated to reading new
    information in the input buffer and writing it to the RAM.

(b) 2 cycles in every 12 are dedicated to reading the
    contents of the RAM and advancing that data to the
    output buffer.

(c) 8 cycles in every 12 are dedicated to the read and write
    operations of the transform currently being calculated.

(3) SEQUENCE OF EVENTS

The sequence of events relating to the output control and
data flow is as follows :

(3.1) An SCLK rising edge :

    (a) An internal flag is raised to indicate that the
        transform has finished and data is available to be
        dumped. Data will be present in the internal RAM,
        and the output address generator will be at the
        correct address. Access to the RAM at this
        moment, however, has not been made.

    (b) If at this moment the device is programmed to be a
        single device, and DEN is inactive, then DAV will be
        made active, ie without the presence of DOS. If
        DEN is active at this point, or the device is
        programmed in any multiple device mode, then
        DAV will remain inactive.

(3.2) Accessing the RAM at this point

    At this moment, when DAV has been made active
    before data appears on the output pins, data is not yet
    in the output buffer. Internally the precise SCLK cycle
    at which the RAMs are read and written to the output
    buffers now has to be waited for. This cycle, as
    described above, occurs 2 in every 12 SCLK cycles, so
    at worst case 6 SCLK cycles have to elapse until data
    is guaranteed to be in the output buffer.

    If the DOS rate is similar to the SCLK rate, and the user
    has been immediately applying DOS pulses (on
    seeing DAV go active) hoping to get data off the chip,
    then this will not actually happen.

    The next internal flag raised is the one which indicates
    that the output data has been successfully read from
    the RAMs and is now in the output buffer.

(3.3) The next DOS rising edge (regardless of DEN status)

    The flag indicating that the RAMs have been read is
    transferred to circuitry operating on DOS. The output
    enable signal, DEN, does not have to be present at this
    point.

(3.4) The next DEN-Enabled DOS rising edge (ie the 1st one
      of this sequence)

    The output state machine receives its first edge.

(3.5) The next DEN-Enabled DOS rising edge (ie the 2nd)

    Internal output address generators start to count
    (ready for fetching the next set of output data).

(3.6) The next DEN-Enabled DOS rising edge (ie the 3rd)

    An enable signal is raised for the final data latch in the
    output buffer.

(3.7) The next DEN-Enabled DOS rising edge (ie the 4th)

    (a) The final data latch in the output buffer clocks
        through new data and presents it to the output
        pads.

    (b) The output pads come out of high impedance.

    (c) If DAV was previously inactive, it is now made
        active.


(4) OUTPUT SCENARIOS

Considering the above sequence, therefore, some single
device situations can now be explained :

(4.1) DOS is continuously present, but DEN is inactive
      (Transform size less than 1024)

    In this case, when the transform is complete, as the
    device is programmed as a single device and DEN is
    inactive, DAV will be made active. Even though DOS
    is running, the status of DAV at this point does not rely
    on it.

    The user can now monitor the status of DAV, and after
    at least 6 SCLK cycles can initiate some further action,
    eg by external control force DEN active at some later
    time when the rest of the system is ready to accept the
    transformed data. Independently of this external
    control, the next DOS pulse will start to operate the
    sequence of events as described above (ie point No.
    3.3). When DEN is eventually made active, the
    remainder of the above sequence (points Nos 3.4 to
    3.7) is executed, with 4 DEN-Enabled DOS pulses
    needed before data is observed on the output pins.

    If however the user immediately forces DEN active
    upon monitoring DAV go active and waiting for the
    required 6 SCLK cycles, then 5 DOS pulses would
    have to be issued. The first of these 5 would start the
    sequence of events as described above (3.3), and the
    fact that it is enabled by DEN would be irrelevant. The
    required DEN enabled pulses in this situation would be
    the 2nd, 3rd, 4th and 5th pulses supplied.

(4.2) DOS is not running, and DEN is inactive. (Transform
      sizes less than 1024)

    In this situation, again as the device is programmed to
    be a single device and DEN is inactive at the point
    where the transform is complete, DAV will be made
    active regardless of the state of DOS. The user can
    now monitor this event on DAV and after waiting a
    further 6 SCLK cycles, use it to switch on DOS and to
    make DEN active.

    DOS can now be switched on for at least one pulse (but
    may be more), and the sequence of events as
    described earlier (from point No 3.3) will start. DEN can
    then be made active, whereby a further 4 DEN-
    Enabled DOS pulses will be required before data is
    seen on the output pins. This is the situation shown in
    Table 3.

    Alternatively, DEN and DOS could be made to operate
    on the same cycle. In this case data will appear on the
    output pins on the 5th DOS pulse (the first would not
    actually require the presence of DEN, but the 2nd, 3rd,
    4th and 5th would).

(4.3) 1024 point transforms, single device mode.

    In the case of 1024 point transforms, the internal RAM
    is no longer operated in the manner described in
    section 2. The RAM is instead totally dedicated to one
    operation at a time. Thus data for a transform will be
    loaded, and all 12 out of 12 SCLK cycles will be
    available for the transfer of input data to the RAMs.
    During the transform no transfers from the input to the
    RAM or from the RAM to the output are possible. This
    is why DIS and DOS can be equal to SCLK for 1024
    point transforms.

    If 1024 point transforms are being performed and the
    device is programmed as a single device, then
    "asynchronous" operation of DAV is possible as
    described earlier for transform sizes less than 1024
    points. If DEN is inactive at the time the transform has
    finished calculating, then DAV will be made to go active
    regardless of the state of DOS. Although 6 SCLK
    cycles do not have to be waited for as in section 3.2, a
    transition has to be made from the transform
    controlling the internal RAM to the output circuits
    controlling it. This operation plus the time taken to
    advance data from the RAMs to the output buffer takes
    exactly 4 SCLK cycles.

    Hence the sequence of events is exactly as described
    in section 3, except that section 3.2 should read 4
    SCLK cycles rather than 6. The analysis of sections 4.1
    and 4.2 is also true if the 6 SCLK cycle time is
    substituted with 4 SCLK cycles.

(5) DUMMY DOS STROBES AFTER DEF

In addition to the dummy DOS strobes needed prior to
dumping data, it is necessary to provide at least 4 DOS strobes
after DEF has gone inactive, but before DAV goes active.
These initialise the internal address counters and do not rely
on DEN also being active. They are needed every time DEF
has been used to change the operating mode.
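The pulse counts quoted in scenarios 4.1 and 4.2 can be reproduced with a simplified behavioural model of steps 3.3 to 3.7. This is a sketch under stated assumptions, not a description of the internal logic: the function name is hypothetical, and the RAM read wait (6 SCLK cycles, or 4 for 1024 points) is assumed to have elapsed before the first DOS pulse is applied.

```python
def dos_pulse_with_data(den_per_pulse):
    """Return the 1-based DOS pulse on which data reaches the output pins.

    den_per_pulse[i] is True when DEN is active on DOS rising edge i + 1.
    Returns None if the sequence ends before data appears.
    """
    enabled_edges = 0
    for pulse, den in enumerate(den_per_pulse, start=1):
        if pulse == 1:
            continue                  # step 3.3: flag transfer, DEN ignored
        if den:
            enabled_edges += 1        # steps 3.4 to 3.7 need DEN active
            if enabled_edges == 4:
                return pulse
    return None


# DEN and DOS applied together: data appears on the 5th DOS pulse.
print(dos_pulse_with_data([True] * 5))

# DOS free running with DEN asserted from the 3rd pulse onwards.
print(dos_pulse_with_data([False, False, True, True, True, True]))
```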


ABSOLUTE MAXIMUM RATINGS [See Notes]

Waveform - measurement level

Test

Supply voltage Vcc

Input voltage V_IN

Output voltage V_OUT

-0.5V to 7.0V

-0.5V to Vcc + 0.5V

V _H

Delay from output

high to output

high impedance

0.5V

Clamp diode current per pin I_K(see note 2)

Static discharge voltage (HBM)

Storage temperature T_S

Junction Temperature, Commercial

Junction temperature, Industrial

Junction Temperature, Military

Package power dissipation

18mA

500V

-65°C to 150°C

100°C

Delay from output

low to output

high impedance

0.5V

V _L

115°C

155°C

5000mW

Delay from output

high impedance to

output low

1.5V

NOTES ON MAXIMUM RATINGS

Delay from output

high impedance to

output high

1. Exceeding these ratings may cause permanent damage.

Functional operation under these conditions is not implied.

2. Maximum dissipation for 1 second should not be exceeded,

only one output to be tested at any one time.

3. Exposure to absolute maximum ratings for extended

periods may affect device reliability.

0.5V

1.5V

V_H- Voltage reached when output driven high

V_L- Voltage reached when output driven low

4. Current is defined as positive into the device.

ELECTRICAL CHARACTERISTICS

Operating Conditions (unless otherwise stated)

PDSP16510A C0: Tamb = 0°C to +70°C, Vcc = 5.0V ± 5%
PDSP16510A B0: Tamb = -40°C to +85°C, Vcc = 5.0V ± 10%
PDSP16510A A0: Tamb = -55°C to +125°C, Vcc = 5.0V ± 10%

Characteristic           Symbol   Min.   Typ.   Max.   Units   Notes

Output high voltage      V_OH     2.4    -      -      V       I_OH = 4mA
Output low voltage       V_OL     -      -      0.4    V       I_OL = -4mA
Input high voltage       V_IH     2.0    -      -      V       SCLK, DIS, DOS, DEN need 3V
Input low voltage        V_IL     -      -      0.8    V       DEN needs 0.7V max
Input leakage current    I_IN     -10    -      +10    µA      GND < V_IN < V_CC
Input capacitance        C_IN     -      10     -      pF
Output leakage current   I_OZ     -50    -      +50    µA      GND < V_OUT < V_CC
Output S/C current       I_SC     10     -      300    mA      V_CC = Max

SWITCHING CHARACTERISTICS

Characteristic

Symbol

Ø

Min

Max

Conditions

Clock Frequency ( MHz )

Clock High Period ( ns )

Clock Low Period ( ns )

Max DOS, DIS Frequency

DC

13

40

Max Ø high time is 1msec

T_CH

T_CL

10

Ø_D

FØ

Less than 1024 points or Mult Dev Mode 1

Note: F = 4 / (6 + 0.001 Ø T_CL)

Max DIS Frequency

Max DOS Frequency

Ø_D

Ø

1024 points or Mult Dev Modes 2 and 3

SCLK to DIS/DOS RELATIONSHIP

Both DIS and DOS must be synchronous to SCLK. Ideally they should both be produced from SCLK, in which case the

SCLK rising edge would either be first or coincident with the DIS and DOS rising edges.

In any event, the rising edge of SCLK must not fall between 2ns and 10ns after the rising edge of either DIS or DOS
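A timing check for this restriction might look like the sketch below. The function name is hypothetical, and treating the 2ns and 10ns limits themselves as forbidden is a conservative assumption, since the text does not say whether the boundaries are included.

```python
def sclk_edge_allowed(delay_ns):
    """Check an SCLK rising edge against the DIS/DOS keep-out window.

    delay_ns is the SCLK rising edge time minus the strobe rising edge
    time; zero or negative values mean SCLK is coincident or first.
    """
    return not (2 <= delay_ns <= 10)


for delay in (-3, 0, 5, 12):
    print(delay, sclk_edge_allowed(delay))
```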


ORDERING INFORMATION

PDSP16510A C0 AC      ( Commercial - PGA Package )
PDSP16510A C0 GC      ( Commercial - Leaded Chip Carrier )
PDSP16510A B0 AC      ( Industrial - PGA Package )
PDSP16510A B0 GC      ( Industrial - Leaded Chip Carrier )
PDSP16510A A0 AC      ( Military - PGA Package )
PDSP16510A A0 GC      ( Military - Leaded Chip Carrier )
PDSP16510A/MA/GCPR    ( Military - Screened Leaded Chip Carrier. See separate datasheet for details )



Part number: PDSP16510A
Manufacturer: Mitel Networks Corporation
Description: Stand Alone FFT Processor
