## Atacama Large Millimeter Array

# Programming Manual For Tunable Filter Bank (TFB)

CORL-60.01.07.00-002-E-MAN

Version: E

Status: Approved

2006-03-06

| Prepared By: | | |
|--------------------------|-----------------------------------|------------|
| Name(s) and Signature(s) | Organization | Date |
| G. Comoretto | INAF - Osservatorio di<br>Arcetri | 2006-03-08 |
| Approved By: | | |
| Name and Signature | Organization | Date |
| A. Baudry | Université de Bordeaux | 2006-03-08 |
| J. Webber | NRAO | |
| Released By: | | |
| Name and Signature | Organization | Date |
| | | |
| | | |
| | | |

Programming manual for Tunable Filter Bank Doc #: CORL-60.01.07.00-002-E -MAN Date: 2006-03-06 Status: Approved Page: 2 of 26

## **Change Record**

| Version | Date | Affected<br>Section(s) | Reason/Initiation/Remarks |
|---------|------------|------------------------------------------|---------------------------|
| A | 2004-04-27 | All | Initial version |
| B | 2004-08-07 | All | Aligned to implementation and specifications after pre-prototype tests |
| C | 2005-03-15 | 4.4.1, 4.4.2,<br>4.4.6, 4.4.7 | Updated programming interface for FIR filter |
| | 2005-03-31 | | Change for Mode1 register, and other minor corrections |
| D | 2005-09-27 | 4.4.2, 4.4.5 | Change for minor mode register, to align with firmware. Added support for 3 bit bypass mode. Corrected requantization register use |
| E | 2006-03-06 | 3.1, 4.2.1,<br>4.3, 4.3.6,<br>4.4, 4.4.7 | Added paragraphs on CRC Error Monitoring and Delay Clock Phase. Corrected minor errors in Delay State Counter. Added support for low power mode |

### Table of Contents

- 1 GENERAL DOCUMENT DESCRIPTION
  - 1.1 Purpose
  - 1.2 Scope
- 2 RELATED DOCUMENTS AND DRAWINGS
  - 2.1 References
  - 2.2 Abbreviations and Acronyms
  - 2.3 Glossary
  - 2.4 Related Control Drawings
- 3 FUNCTIONALITY OVERVIEW
  - 3.1 Software/Control Function
  - 3.2 Monitor & Control Functions
  - 3.3 Summary of Monitor Points
- 4 PROGRAMMING INTERFACE
  - 4.1 Overview
  - 4.2 FPGA personality download
    - 4.2.1 Personality soft error monitoring
  - 4.3 Delay subsystem
    - 4.3.1 Test point select register
    - 4.3.2 Mode register
    - 4.3.3 Pseudorandom data generator seed
    - 4.3.4 Device Control register
    - 4.3.5 Delay value register
    - 4.3.6 Clock phase register
  - 4.4 Filter Bank Subsystem
    - 4.4.1 Test Point Register
    - 4.4.2 Mode Registers
    - 4.4.3 to 4.4.7 (titles illegible in the source)
    - 4.4.8 Monitor points

#### 1 General document description

#### 1.1 Purpose

This document describes the programming model and interface for the correlator Tunable Filter Bank (TFB). A few of the described functions are still to be confirmed in the final version of the filter.

#### 1.2 Scope

This manual refers to communication between the Station Card and the TFB card. Communication uses the Station Control Card (SCC) bus interface described in CORL-60.00.00.00-020-B-MAN.
#### 2 Related Documents and Drawings

#### 2.1 References

- CORL-60.00.00.00-020-B-MAN Correlator Control Bus Manual
- CORL-60.01.07.00-001-A-SPE Tunable Filter Bank Specification Document
- CORL-60.00.00.00-007-D-DWG Schematics for CPLD2 interface chip
- CORL-60.01.07.00-60.01.01.00-A-ICD Internal ICD: TFB to Baseline Correlator

#### 2.2 Abbreviations and Acronyms

| Abbreviation | Meaning |
|--------------|---------|
| BFB | Baseline Filter Board |
| CPLD | Complex Programmable Logic Device |
| DDS | Direct Digital Synthesizer |
| DLL | Delay Locked Loop (clock generator in FPGAs) |
| FIR | Finite Impulse Response filter |
| FPGA | Field Programmable Gate Array |
| LO | Local Oscillator |
| LSB | Least significant bit |
| LUT | Lookup Table |
| MSB | Most significant bit |
| RD | Random Data |
| RDG | Random Data Generator |
| SCC | Station Control Card |
| TBD | To be determined |
| TFB | Tunable Filter Bank |
| TP | Test point |

#### 2.3 Glossary

**Channel**: frequency interval chosen by the IF processor, and sampled by a single sampler

**Frequency division mode**: Correlator operating mode in which different correlator planes process separate sub-channels (portions of the 2 GHz IF bandwidth)

**Sub-channel**: frequency interval chosen by each FIR filter

**Pre-Prototype board**: Version of the TFB with a limited number of sub-channels, and some extra test features. Not included in the ALMA deliverables, but referred to in this document.
**Time division mode**: Correlator operating mode in which different correlator planes process separate segments of data (portions of the 1 ms data frame)

#### 2.4 Related Control Drawings

- CORL-60.01.07.00-005-A-DWG Internal ICD Drawing for Tunable Filter Bank
- CORL-60.00.00.00-007-D-DWG Schematics for CPLD2 interface chip
- Board Schematics for Prototype Board
- CORL-60.01.07.01-005-A4-DWG TFB Prototype II Schematics
- CORL-60.01.07.02-001-B-DWG Delay FPGA Schematics Drawing

#### 3 Functionality Overview

The Tunable Filter Bank provides two basic functionalities:

**Fine delay:** Samples are delayed in steps of one sample, from 0 to 255 samples. Optionally, the delay range is extended to 32K samples. The delay is common to all sub-channels.

**Filtering:** 32 sub-channels are derived from the input bandwidth. Each sub-channel can be independently positioned in frequency and may have a different bandwidth. Filter output is re-quantized to 2-bit samples. Output samples can also be re-quantized to 4-bit samples, using two paired sub-channels. The filter section can be bypassed. If the filter is unused or bypassed, it can be placed in a low power mode.

These functionalities are performed respectively by two subsystems: the delay section and the filter bank section. These two sections occupy independent intervals within the board address space.

Monitor and debug functions are provided. The delay section monitors input samples and provides sample and state counters. The filter section provides digital total power measurements at the filtered sub-channel outputs. All of these data are useful for correct reconstruction of the correlation function from the digitized correlation coefficients.
#### 3.1 Software/Control Function

The control software performs the following operations:

- FPGA personality download
- FPGA Initialization
- Filter tap download
- Fine delay setting
- Local Oscillator setting
- Fine LO setting
- LO phase setting
- Output quantization setting
- Low power mode

#### 3.2 Monitor & Control Functions

The board provides the following monitor functionalities:

- Input sample statistics
- Input sample state count
- Output sample total power
- Test signal generation/check
- Altera personality integrity vs. soft errors in configuration memory

#### 3.3 Summary of Monitor Points

Hardware monitor points for each FPGA and CPLD in the system are provided via a front panel connector. Each FPGA is connected to this front panel connector by two dedicated lines, and each line can be assigned in software to 16 (or more) internal monitor points.

Software monitor and test points are provided at the inputs of the delay and filter sections. Pseudorandom test data can be inserted both at the input of the delay section, and at the output of the filter section.

#### 4 Programming interface

#### 4.1 Overview

The interface follows the CPLD2 interface chip specification. This chip provides basic interface functionality and a common scheme for FPGA personality download. An address space of 16 separate addresses is provided on each board, with two registers (Control and Data) for each address. By specifying an address in the Control Register, several registers can be addressed with a single Data location. This scheme has been expanded for the TFB, as more than 16 FPGAs are present on the board.
Moreover, each filter FPGA implements two independent filters. Therefore we adopted the technique of grouping pairs of filter FPGAs (for a total of 4 filters) in a single address. The byte written in the control register specifies both the filter in the group, and the register to be addressed within the filter. This is implemented by duplicating the control register in each of the 2 FPGAs, while the data registers are unique for each filter. Each FPGA has a physical address identification port, with a geographic address code hardwired on the board, that it uses to identify itself.

The "Board addressing scheme" table lists the address space. Locations 0-7 map the 32 filter channels (sub-channels). Locations 8-10 map the delay function for the three input bits (bit B is the LSB, bit D is the MSB, or sign). Address 11 is used in a test version of the TFB to program a general purpose test FPGA. This chip is not present in the final version of the TFB.

Board addressing scheme

| Address | Function |
|---------|-----------------------------|
| 0 | Filter sub-channels 0-3 |
| 1 | Filter sub-channels 4-7 |
| 2 | Filter sub-channels 8-11 |
| 3 | Filter sub-channels 12-15 |
| 4 | Filter sub-channels 16-19 |
| 5 | Filter sub-channels 20-23 |
| 6 | Filter sub-channels 24-27 |
| 7 | Filter sub-channels 28-31 |
| 8 | Delay bit B |
| 9 | Delay bit C |
| 10 | Delay bit D |
| 11 | Test chip (test board only) |

#### 4.2 FPGA personality download

The board contains only two different FPGA personalities, corresponding to the two subsystems (plus the test chip in the pre-prototype board). Personality download can thus be performed simultaneously for all FPGAs of each subsystem.

The TFB card uses Altera FPGAs (for filtering and soft error monitoring) together with Xilinx FPGAs (input data and fine delay, CPLD2 interface); Xilinx FPGAs are adopted in other correlator boards.
Differences in the personality download process are however hidden from the programmer's point of view. The download to the Altera FPGAs is performed via the CPLD2 chip, as described in the Control Bus Manual.

A JTAG connector is available for CPLD personality download (CPLD2 chip and Altera CPLD for personality soft error monitoring), and a second JTAG chain connects all the FPGAs on the board. Both interfaces are available on the backplane connector, for remote configuration.

#### 4.2.1 Personality soft error monitoring

The Altera FPGAs have an internal circuit that checks the configuration memory for soft errors every few seconds, and asserts a dedicated line when an error is detected. No such check is available on the Xilinx delay FPGAs.

A second CPLD (an Altera MAX 3000) is present in revision 2 of the board (employing Stratix-II FPGAs) to monitor these soft error lines. These signals are OR-ed together and the result is sent to the CPLD2 chip on the INIT/CK\_INH1 line. As a consequence, on a soft error the red LED on the board is lit, and bit 7 of the CPLD2 monitor register is asserted. The control software should periodically monitor this bit, and reload the FPGA configuration if an error is detected.

This signal is asserted during personality download. This bit was previously used to monitor the INIT configuration signal, now available only on bit 6 of the CPLD2 monitor register. The personality download routine should be modified to monitor only bit 6 of this register, instead of bits 6 and 7.

#### 4.3 Delay subsystem

The delay subsystem is derived from the baseline Filterboard. The main difference is that separate FPGAs are used for the three bits, and no 4-bit modes are available.
Moreover, the delay range has been extended to allow for new modes, and to provide sufficient delay range for delay tracking within a single integration. Apart from these (important) differences, all control and monitor functions are the same.

Within each delay FPGA, six mode registers and four monitor registers are available. They are selected by the control register content, whose bit assignment is as follows:

| Bit | Function |
|-----|----------------------|
| 2-0 | Mode register select |
| 3 | Unused |
| 5-4 | Monitor select |
| 6 | Unused |
| 7 | DLL Reset |

Delay Control Register bit assignment

The RESET bit resets the two DLL clock generators in each chip.

Registers are listed in the following table, with size expressed in bytes (number of write operations needed to specify the register content):

| Register | Size | Function |
|----------|------|---------------------------------------|
| 0 | 1 | Test point select |
| 1 | 1 | Mode |
| 2 | 2 | Device control (DC) |
| 3 | 4 | Pseudorandom Generator seed |
| 4 | 2 | Delay value (sync-ed to STROBE pulse) |
| 5 | 1 | Clock phase adjustment |

Delay FPGA registers

Multibyte values are always specified MSB first. Each new write shifts the current register value by 8 bits, and places the new value in the least significant 8 bits.

Each chip has 4 monitor locations, selected by bits 5 and 4 of the control register. Monitor locations 0-2 (control register = 0x00-0x20) allow the 20-bit check counter to be read, according to the table below. Location 3 reads back the lower byte of the Pseudorandom Generator seed written in register 3, thus checking that a value can be written and read back correctly.
Bits 6 and 7 of location 2 map the LOCK flags of the two DLLs in the chip, confirming that the clock generation circuitry works correctly, while bit 4 signals that the counter value is valid and can be safely read.

| Register | Function |
|----------|----------|
| 0x00 | Lower byte of counter (bits 11:4) |
| 0x10 | Most significant byte of counter (bits 19:12) |
| 0x20 | Least significant bits of counter (bits 3:0), VALID bit (bit 4) and DLL LOCK flags (bits 6:7) |
| 0x30 | MS byte of RDG seed |

Delay monitor locations

Most operations are synchronized with the STROBE input from the backplane. This signal is nominally a 50% duty cycle square wave, with a period of 1 ms. The rising edge marks a clock cycle (and thus an input sample) nominally coincident with a 1 ms boundary. The signal is fed to delay chip C, and from this propagated to delay chips B and D, and then to the filter chip array. The strobe is delayed by two clocks on chip C, and thus needs to be realigned among different chips. This is done automatically, using a hardware geographic address input in each chip. The internal strobe signal has a delay of 5 clock cycles with respect to the rising edge of the board STROBE input, and the STROBE output to the filter chip array has a delay of 10 clock cycles, to match the signal propagation delay, with the same duty cycle.
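The register access scheme described above can be sketched in a few helper functions. This is only an illustrative sketch: the function names are hypothetical, the actual SCC bus read and write calls are not shown, and the position of the counter's low nibble in monitor location 2 (bits 3:0) and of the VALID flag (bit 4) is taken from the monitor location table, with the VALID position left as a parameter.

```python
def delay_control_byte(register=0, monitor=0, dll_reset=False):
    """Compose the delay control register byte (hypothetical helper).

    Bits 2-0 select the mode register, bits 5-4 select the monitor
    location, bit 7 resets the DLLs. Bits 3 and 6 are unused.
    """
    if not 0 <= register <= 7:
        raise ValueError("mode register select is a 3-bit field")
    if not 0 <= monitor <= 3:
        raise ValueError("monitor select is a 2-bit field")
    return (0x80 if dll_reset else 0) | (monitor << 4) | register


def msb_first(value, size):
    """Split a register value into `size` bytes, most significant byte first."""
    return [(value >> (8 * i)) & 0xFF for i in reversed(range(size))]


def shift_in(current, byte, size):
    """Model one data write: shift the register by 8 bits, new byte goes low."""
    mask = (1 << (8 * size)) - 1
    return ((current << 8) | (byte & 0xFF)) & mask


def decode_monitor(loc0, loc1, loc2, valid_bit=4):
    """Rebuild the 20-bit check counter and the flags from locations 0-2.

    loc0 holds counter bits 11:4, loc1 bits 19:12, and the low nibble of
    loc2 holds bits 3:0. `valid_bit` is an assumption taken from the table.
    """
    count = (loc1 << 12) | (loc0 << 4) | (loc2 & 0x0F)
    return {
        "count": count,
        "valid": bool(loc2 & (1 << valid_bit)),
        "dll_lock": ((loc2 >> 6) & 1, (loc2 >> 7) & 1),
    }
```

For instance, `decode_monitor(0x80, 0x1E, 0xDA)` yields a count of 124938 with VALID set and both DLLs locked, and `msb_first(0x1234, 2)` gives the two bytes to write, in order, for a 2-byte register.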
#### 4.3.1 Test point select register

| Select | Test point TP0 | Test point TP1 |
|--------|-------------------------------|---------------------------------|
| 0 | - | 1MSEC (Internal strobe) |
| 1 | - | - |
| 2 | DIN2 (Control data bus) | DIN0 (Control data bus) |
| 3 | INIT (1ms strobe) | TST (LSB of monitor counter) |
| 4 | BIT1 (from other FPGAs) | MON0 (Monitor bus) |
| 5 | BIT2 (from other FPGAs) | MON-OE (Data bus output enable) |
| 6 | RA17 (RDG output) | DII0 (Input sample pin) |
| 7 | - | - |
| 8 | DI17 (Input sample) | DI0 (Input sample) |
| 9 | - | - |
| A | SEL3 (Select line for reg. 3) | RA0 (RDG output) |
| B | C_D (from CPLD2) | - |
| C | WRITE (from CPLD2) | DO0 (Output sample) |
| D | DO1 (Output sample) | CS (from CPLD2) |
| E | C-ENA (Count enable) | - |
| F | DLL0 (Locked signal) | DLL1 |

List of available test points on TP0 and TP1 pins.

The test point select register selects which internal signals are connected to the two TP lines, which are routed to the front panel connector. The four least significant bits select TP0 and the four most significant bits select TP1. The signals available are reported in the TP0/TP1 table. The meaning of each signal can be inferred from the delay FPGA schematic. Their detailed description is not given here, as this feature is used only for hardware debugging, and requires a detailed knowledge of the schematic anyway. Some signals are unallocated, as they were used for the second delay bit in the original FPGA, and will be assigned later.

For example, writing 0x20 on the Test Point Select register selects the 1 msec strobe pulse on TP0, and the input bit DIN0 on TP1.

#### 4.3.2 Mode register

The mode register sets various options in the delay subsystem that can be controlled by a single bit. The function for each bit is listed in the table below. Most of these functions are used for debug.
In normal operations only bits 2 and 4 are used.

Bit 2 was used in the original filterboard to align the 1 ms STROBE signal among delay chips. The function is now performed in hardware, and this bit is ignored.

Bit 4 enables the 16-bit delay value. If this bit is cleared, the upper 8 bits of the delay are always set to zero, allowing the delay value to be specified with one byte only.

Bit 5 forces all output bits to the value for sample #0.

Bit 6, together with bit 1, substitutes bits 30 and 31 of the input data word with input exchange lines B\_EX1 and B\_EX2 respectively. In this way, it is possible to check the integrity of these lines using the Random Data Generator/checker.

Bit 7 forces the output clock signals to zero. The DLL1 feedback signal is also blanked, and therefore the corresponding LOCK signal will become false.

| Bit | Function |
|-----|-----------------------------------------------------------------|
| 0 | Select random data instead of input |
| 1 | Generate internally 1ms STROBE signal |
| 2 | Adjust STROBE signal timing (obsolete, used in old filterboard) |
| 3 | 0 = RDG repeats every 1 ms; 1 = free running RDG |
| 4 | Two byte delay value |
| 5 | All output bits set to the same value |
| 6 | Test lines for Delay exchange bits with Random Data Gen. |
| 7 | Blank clock output to filter section |

Mode Register bit assignment

#### 4.3.3 Pseudorandom data generator seed

The pseudorandom generator is a standard component of the Correlator subsystem. It allows checking for data integrity over any interconnection line. The generator is a standard 35-bit Linear Feedback Shift Register (polynomial $x^{35}+x^{33}+1$), modified in order to provide 32 consecutive samples at every clock cycle. Each of the 32 samples can be checked using the same generator polynomial, but is uncorrelated to the others.
Repetition time is 8.5 seconds for the sample sequence, and 270 s for the 32-bit word sequence.

The RDG uses a 32-bit seed value, loaded in register 3 (MSB first). The MSB can be read back using monitor register 3, thus allowing for simple read-back tests. If bit 3 of the Mode Register is clear, this register content is loaded in the RDG synchronously with the STROBE signal, otherwise it is never loaded. After the RDG has been initialized, bit 3 can be set, leaving the RDG free running, or left clear, thus repeating the same sequence every ms.

The output of the RDG is connected to the input stream bit reversed. This is because in the usual notation for polynomial algebra, bit 0 corresponds to the *most recent* sample, while in the ALMA correlator bit 0 is the oldest. Therefore bit RA(i) of the RDG output word corresponds to $x^{31-i}$ of the generator polynomial, and also to bit (31-i) of the seed.

The RDG is set to a nonzero value at startup, to avoid the locked "all zeros" state. To prevent it from running, a zero value must be loaded in the register.

The RDG output is substituted for the data at the input of the chip. Thus the first word to emerge from the chip after RDG initialization is the seed, bit-reversed and delayed/rotated by the number of samples specified in the delay register.

#### 4.3.4 Device Control register

The Device Control register (register 4) is a 16-bit register used for various functions in the monitor counter. It is loaded with two write operations, MSB first. It is used to select which parameter to monitor.
Its bit assignment is:

| Bit | Function |
|------|----------|
| 4-0 | Input bit (time slice) to monitor (0 = oldest, 31 = youngest) |
| 7-5 | Counter input: 0-3 = value of bit 0-3; 4 = state; 5 = RD checker error; 6 = always; 7 = never |
| 11-9 | Comparison value for state counter |
| 12 | Select bit to monitor (unused in this version) |
| 15 | Reset counter |

Device Control Register bit assignment

It is possible to perform various statistics on any of the 32 input samples. All bits are available to all delay counters, and thus it is possible to monitor any 3 independent parameters, one per chip. Bit 0 (bit A, LSB) is not available in this board, as opposed to the baseline filter bank, and thus any reference to this bit gives undefined results.

Only one sample per word can be monitored, as the circuit works at a clock rate of 125 MHz. The sample to be monitored is selected using bits 4:0 (sample 0 being the oldest one), and the selected sample bit is delivered to the other two delay chips using dedicated exchange signals. If operations on the 3-bit sample are to be performed (state count), all three chips must be programmed with the same value in this field.

The counter counts the occurrences of "1" (or TRUE) values in the selected signal in a specified 1 ms interval, thus the maximum count value is 125000.

The parameter to be monitored is selected by bits 7:5. It is possible to monitor:

- 0-3: A specific sample bit. Bit 0 does not exist, bit 1 is the least significant one, bit 3 is the sign. Bits are rearranged in each chip in order to be presented consistently to the monitor counter, i.e. bit 3 is the sign in all the three delay chips. Samples are selected from the input word, possibly substituted by the RDG output.
- 4: The frequency of one particular state (specified in bits 11:9). Bit 11 corresponds to the sign (bit D), and bit 9 to the least significant bit (bit B). The state is specified as a pseudo-Gray code, as used by the sampler. Bit 8 is ignored. The state is specified in the same way for all three chips; bits are rearranged internally in a consistent way. Up to three states for a particular sample can be monitored simultaneously in the three chips.
- 5: Output of the RD checker. This checker always monitors the bit processed by the chip, relative to the sample selected by DC register bits 4:0. It is possible to route a bit from another chip using bit 6 in the Mode register, and selecting sample 30 or 31. To perform a bit error rate check, the RD checker is automatically synchronized to the input data stream, by loading a short sequence of input bits before the measurement. Then it is left running and its output is compared to the input data stream.
- 6 and 7 specify "count always" and "count never", and can be used to check the monitor counter itself.

Monitor operations are inhibited when the reset counter bit (bit 15 of the DC register) is set.

The monitor circuit operates synchronously with the STROBE signal. At each STROBE pulse the counter is reset, and begins monitoring the specified condition for the following 1 ms interval. At each subsequent STROBE pulse, the count result is transferred to a storage register, which can be accessed from the monitor location.

The effective integration time is slightly less than 1 ms, and corresponds to 124938 clock cycles. Therefore state counts should be normalized to the value read in the "count always" function. Count results are always available and stable.
However, if the SCC reads the result across a one millisecond boundary, the different portions of the count register are read when different results were stored, and inconsistent results are obtained. To synchronize read operations, the VALID signal is available as bit 5 of monitor register 2. When this signal is high (for approximately 0.5 ms after the 1 ms STROBE), the count register can be safely read by the SCC.

#### 4.3.5 Delay value register

The delay is set by writing one byte (the LSB) or two consecutive bytes (MSB first) to the Delay register. The same value must be broadcast to all 3 chips on the board, and usually to all boards for the same antenna. Therefore it is reasonable to assume that the SCC could set the value for all boards under its control within a 1 ms frame. The actual update occurs at the next 1 ms strobe pulse.

The baseline filterboard uses only one byte for the delay value, always in the range 0-63 bits. To emulate this behavior, bit 4 of the Mode register must be set to zero. If this bit is set to 1, then both bytes must be specified for a 16-bit delay value. Even when only one byte is used, the delay range is extended to 255 bits, i.e. the 3 most significant bits in the delay value must be set to zero to closely emulate baseline filterboard behavior.

Delay circuitry is composed of two sections. The lower 5 bits of the delay value are used to rotate bits within each 32-sample word, moving the shifted out samples to the next word. Recall that bit 0 in each word is the oldest, and bit 31 is the most recent one. For example, with a delay of 4 samples, word 0x76543210 is shifted as 0x6543210? and the next word is 0x???????? ("?" values coming from adjacent words).

For delays greater than 32, the word is delayed using a variable length FIFO. If the FIFO length (delay) increases, the inserted samples are not predictable. If the FIFO length decreases, samples are simply deleted.
In any case, since the delay block is pipelined, a delay change does not occur instantaneously (on a single sample), but is spread over 6 consecutive words (48 ns), and the corresponding samples are in general invalid in this period. This is comparable to or shorter than the settling time of the variable phase sampler clock, and is consistent with system specifications.

Delay changes are applied at the FIFO input. Therefore, the change appears at the chip output with a delay equal to its current setting. For example, if the delay is set to 2000 samples, any change appears at the output after 500 ns. In this way, the delay between the adjustment of the variable sample clock and that in the delay chip is constant, depending only on antenna location.

Maximum delay is limited by the FIFO buffer memory size. The current implementation uses all the internal memory available in an XC2S50E FPGA, which corresponds to 32K samples or 8 µs of delay range. Minimum delay is 0, and maximum is the maximum value for the implemented range minus 64 (0x7fbf for the current implementation). This is necessary because the FIFO cannot read the same word being written.

When the delay is set to zero, each word is presented at the output after ten 125 MHz clock cycles. The STROBE signal is propagated to the FIR section with the same delay.

The extended delay range is useful to reduce synchronization problems between fine and bulk delay. When operating in frequency multiplexing mode, the station board provides bulk delays in units of 64x32 samples (at 4 GHz). The FIR filter kernel introduces a comparable delay in the data stream. Therefore, when fine delay overflows and bulk delay is adjusted, a phase glitch of the order of a microsecond occurs in the data stream.
Moreover, changing the delay between the tunable mixer and the correlator changes the LO effective phase, and the phase offset must be compensated.

If the "fine" delay has sufficient range, it should be possible to follow the geometric delay variations for the whole integration period without updating the bulk delay. As the maximum delay rate is about one 4 GHz sample every 86 ms, a delay range of 32K samples would allow for a maximum integration time of 45 minutes, well above any realistic need.

The proposed delay compensation strategy would therefore be to set the "fine" delay to an appropriately large value, accounting for all variations in the geometric delay during the integration period. The bulk delay in the station card is set to compensate for most of the geometric delay, but is not adjustable during the integration. Then, as required for delay tracking during the observations, only the fine delay is adjusted.

#### 4.3.6 Clock phase register

The phase of the FPGA internal clock can be dynamically adjusted using this register. The value written is interpreted as a phase with 1/16 turn (22.5 degrees, or 0.5 ns) resolution. Only the 4 least significant bits are considered.

As the phase step is implemented using internal routing delays, the actual resolution is considerably worse, and non-uniform. The actual step is around 45 degrees, or 1 ns. Measured values are:

| Setting | Clock delay | Setting | Clock delay |
|----------|-------------|------------|-------------|
| 0, 1, 2 | 0.00 ns | 8, 9, 10 | 4.00 ns |
| 3 | 0.92 ns | 11 | 5.00 ns |
| 4, 5, 6 | 2.00 ns | 12, 13, 14 | 6.10 ns |
| 7 | 2.92 ns | 15 | 7.20 ns |

Clock delay as a function of Clock Phase Register setting

Only the internal clock phase is affected. The clock signal propagated to the Filter bank FPGAs is independently generated.
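Because the measured steps are non-uniform, software that wants a given clock delay may prefer to work from the measured values rather than from the nominal 0.5 ns step. The following is a minimal sketch under that assumption; the function and constant names are hypothetical, and the numbers are simply those of the measured-delay table above.

```python
# Measured clock delay in ns for each Clock Phase Register setting,
# copied from the table of measured values (settings 0-15).
MEASURED_DELAY_NS = {
    0: 0.00, 1: 0.00, 2: 0.00, 3: 0.92,
    4: 2.00, 5: 2.00, 6: 2.00, 7: 2.92,
    8: 4.00, 9: 4.00, 10: 4.00, 11: 5.00,
    12: 6.10, 13: 6.10, 14: 6.10, 15: 7.20,
}


def nominal_delay_ns(value):
    """Nominal delay for a written value: 0.5 ns per step, 4 LSBs only."""
    return (value & 0x0F) * 0.5


def closest_setting(target_ns):
    """Return the setting whose measured delay is nearest to `target_ns`.

    Ties are resolved in favour of the lowest setting (an arbitrary choice).
    """
    return min(range(16), key=lambda s: (abs(MEASURED_DELAY_NS[s] - target_ns), s))
```

For example, `closest_setting(1.0)` returns 3, whose measured delay is 0.92 ns, whereas the nominal interpretation of setting 3 would be 1.5 ns.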
The internal clock phase affects both the capture window of the input data to the card, and the phase of the data sent to the filter bank subsystem.

#### 4.4 Filter Bank Subsystem

Each address in the Filter Subsystem corresponds to 4 independent sub-channels. Thus the Control Register selects not only which register will map to the Data Register, but also which sub-channel will be addressed. The bit assignment for the Control Register is as follows:

| Bit | Function |
|-----|--------------------|
| 3-0 | Register select |
| 7-4 | Sub-channel select |

FIR Control Register bit assignment

The Register Select field selects which register will be written in a Data Write operation, or which monitor function will be read in a Data Read operation.

The sub-channel select field selects the sub-channel for both read and write. Each bit selects one of the 4 sub-channels, with bit 4 corresponding to the first one within the group, and bit 7 to the last one. More than one filter may be programmed simultaneously (e.g. for filter tap loading) by setting to "1" more than one bit in the sub-channel select field. During read operations, only one filter is selected, corresponding to the lowest numbered bit selected.

For example, writing 0xc3 to control register #6 selects register 3 in sub-channels #26 and #27 for subsequent data write operations, and monitor register 3 in sub-channel #26 for subsequent data read operations.

Filter registers are listed in the table "Programmable registers in each Sub-Channel filter". Size is expressed in bytes.
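The control byte encoding for the filter subsystem can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the mapping of sub-channels to board addresses (address n covers sub-channels 4n to 4n+3) is taken from the board addressing table, and raising an error when no sub-channel bit is set is an assumption, since the hardware behavior in that case is not described here.

```python
def filter_target(sub_channels, register):
    """Return (board address, control byte) for the given sub-channels.

    All sub-channels must belong to the same group of four, because they
    share one board address. Bits 3-0 select the register and bits 7-4
    select the sub-channels within the group.
    """
    if not 0 <= register <= 15:
        raise ValueError("register select is a 4-bit field")
    subs = set(sub_channels)
    if not subs or any(not 0 <= s <= 31 for s in subs):
        raise ValueError("sub-channels must be in the range 0-31")
    addresses = {s // 4 for s in subs}
    if len(addresses) != 1:
        raise ValueError("sub-channels must share one board address")
    select = 0
    for s in subs:
        select |= 1 << (4 + s % 4)  # bit 4 is the first filter of the group
    return addresses.pop(), select | register


def read_sub_channel(address, control):
    """Sub-channel that answers a data read: the lowest selected bit."""
    select = (control >> 4) & 0x0F
    if select == 0:
        raise ValueError("no sub-channel selected (behavior undocumented)")
    lowest = (select & -select).bit_length() - 1
    return 4 * address + lowest
```

With these helpers, `filter_target([26, 27], 3)` gives `(6, 0xC3)`, and `read_sub_channel(6, 0xC3)` gives 26, matching the example in the text.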
| Register | Size | Function                             |
|----------|------|--------------------------------------|
| 0        | 1    | Test point select                    |
| 1        | 1    | Mode1 register                       |
| 2        | 1    | Mode2 register                       |
| 3        | 4    | Pseudorandom Generator Seed          |
| 4        | 2    | DDS Frequency                        |
| 5        | 2    | DDS phase offset                     |
| 6        | 0    | Load DDS Frequency                   |
| 7        | 0    | Load DDS Phase Offset                |
| 8        | 2    | Tap Value (for 2<sup>nd</sup> FIR)   |
| 9        | 1    | Tap Address (for 2<sup>nd</sup> FIR) |
| 10       | 1    | Output LUT values                    |
| 11       | 2    | Monitor mode register                |
| 12       | 2    | TP integration time (ms)             |
| 13       | 0    | Start TP integration                 |
| 14       | 0    | Low Power Mode (even filters only)   |
| 15       | 0    | Unused                               |

Programmable registers in each Sub-Channel filter

For multibyte registers, the most significant byte is loaded first, as for the delay chip. A size of zero bytes means that the location is "address only", i.e. writing the location causes an action to occur, but the value written is not used. Register 15 is not used.

The Random Data Generator is similar to the delay chip one, and is not described here.

#### 4.4.1 Test Point Register

Each FPGA has two lines, connected to a front panel connector, for debug purposes. Each of these lines can be connected to any of 16 points inside the filter. (If required, more test points could be added at a later stage.) Each line is assigned to one filter, with TP0 assigned to the lowest (even) numbered filter in the FPGA. The internal node monitored is selected by writing to register 0 of the corresponding filter. Only the lowest 4 bits are used.
Node assignment is listed in the table below:

| Select | Test point connected to node                                  |
|--------|---------------------------------------------------------------|
| 0      | stb - 1 ms strobe signal                                      |
| 1      | front_ms_stb - Edge detector on STB                           |
| 2      | reset_counter                                                 |
| 3      | data_valid - in the monitor register                          |
| 4      | B_selected - Input stream selected for test or bypass - bit B |
| 5      | C_selected - Input stream selected for test or bypass - bit C |
| 6      | D_selected - Input stream selected for test or bypass - bit D |
| 7      | Bit_selected - Input stream selected for test                 |
| 8      | prn_ref - Pseudo random noise generator output                |
| 9      | bust_b -                                                      |
| A      | bust_c -                                                      |
| B      | bust_d -                                                      |
| C      | error_count_3 - Bit comparison error: true if B,C,D differ    |
| D      | error_count_4 - Pseudo random checker error                   |
| E      | LSB output                                                    |
| F      | MSB output                                                    |

List of available test points on TP pins.

#### 4.4.2 Mode Registers

The Mode1 and Mode2 Registers set some 1-bit functions in the sub-channel. Since these bits must be set independently, two 8 bit registers have been used instead of a single 16 bit register. Not all bits are used; the unused bits are "don't care". The meaning of each bit is listed in the following tables.

#### Mode1 register

Bit PROMPT\_LD is used only for debug. If this bit is set, phase and frequency values are loaded as soon as the "load DDS frequency" and "load DDS phase offset" locations are addressed, instead of waiting for the 1 ms pulse as described below.

Bit NO\_CLEAR\_PHASE prevents the internal LO phase from being cleared on a frequency change. Then, after a frequency change, the phase is derived from its previous value and the new frequency.

Bit STOP\_DDS stops the DDS operations.
This can be used to synchronize operations of the filters. By setting this bit in all filters before integration, all DDS can be initialized, with appropriate commands, without hardware time constraints. Clearing this bit with a broadcast operation will start the DDS operation in a coordinated way. The signal is synchronized to the 1 ms pulse.

Bit MODE\_4BIT is used to select the next 2 significant bits in the output requantization circuit. This is used in 4 bit mode, where two filters are used together to produce a 4 bit sample. One filter (with this bit cleared) generates the 2 most significant bits, and a second (with this bit set) the 2 least significant ones. This is described in chapter 4.4.5.

| Bit | Function                                                                 |
|-----|--------------------------------------------------------------------------|
| 0   | PROMPT_LD: DDS Phase and Frequency are immediately loaded.               |
| 1   | NO_CLEAR_PHASE: Do not clear Phase on Frequency change                   |
| 2   | MODE_4BIT: Select next 2 significant bits in the output requantization   |
| 3   | STOP_DDS: DDS is stopped. Sync-ed by STROBE                              |
| 4   | OUTPUT_TEST: The output stream is substituted by a pseudorandom sequence |
| 5   | REPEAT_TEST: If set, RDG is reinitialized every 1 ms pulse.              |
| 6   | RESET_DLL: Reset the DLLs in the chip                                    |

Mode1 Register bit assignment

Bit OUTPUT\_TEST allows the output to be substituted by a signal generated by a 35 bit Pseudorandom Data Generator (RDG). The generator is a standard component of the correlator subsystem, and the sequence can be checked at the input of the next stage. This is described in more detail in chapter 4.3.3.

Bit LOAD\_RDG (listed as REPEAT\_TEST in the table above) controls the RDG. If it is set, the RDG is reinitialized with a programmable seed at every 1 ms pulse. This is useful to generate an exactly known data pattern, to test the next stages in the correlator data path. The seed is written to register 3 (Pseudorandom Generator Seed), LSB first.
LSB of the seed can be read back on monitor register 3, and can be used for simple communication tests in which a value is written and read back. This bit must be set high for at least 1 ms to load the seed, as in the delay chip.

Bit RESET\_DLL resets the on-chip DLL clock generators. It is only active in the first of the two FIRs implemented on each chip (even numbered sub-bands), as the DLLs are shared between them. It is suggested that the DLLs are reset at start-up; a reset is also necessary if they lose lock (as seen through bits 7-5 of monitor register 2).

#### Mode2 register

The bypass mode is used when the correlator operates in time division mode. Each filter will output one of the 32 input samples, requantized to 2 bit. In this mode, the 9 bit output of the filter is composed of the 3 bits of the sample selected by Monitor Mode Register bits 4-0, left padded with zeros. This value is then fed to the output requantization and monitor circuitry, which is thus the only portion of the circuit used. The 3 bit value is requantized to 2 bit, and converted from the pseudo-Gray code used in the sampler to the 2-bit signed binary code used in the correlator. The monitor functions can be used to perform input and output signal statistics.

Bit HALF\_DELAY is used in oversample modes. When this bit is set, an extra delay of 16 samples (at 4 GHz) is added at the filter input, providing output samples delayed by half a clock period. Processing both normal and delayed samples allows part of the 2-bit quantization loss to be recovered.

Bit HALF\_BAND is used when the filter is used with a bandwidth of 31.25 MHz. The complex-to-real output stage is modified in order to shift the complex filter output by 15.625 MHz instead of 31.25 MHz, and each output sample is repeated twice.

Bit ONE\_DELAY is analogous to HALF\_DELAY for half band mode.
In this case, however, the extra delay is added between the two filter stages, for simplicity.

Bit REVERSE\_FREQ controls the frequency order in the output signal. If this bit is cleared, frequency ordering in the output signal is the same as in the input signal: output frequencies 0 and 62.5 MHz correspond to input frequencies (LO-31.25 MHz) and (LO+31.25 MHz) respectively. If the bit is set, the frequency order in the output signal is reversed with respect to the input signal, and frequency 0 corresponds to (LO+31.25 MHz). The local oscillator frequency is always mapped to 31.25 MHz, i.e. the output band is centered on the LO frequency. If the HALF\_BAND bit is set, all these frequencies are scaled by a factor of 2, e.g. the LO frequency is always mapped to 15.625 MHz.

| Bit | Function                                                                                      |
|-----|-----------------------------------------------------------------------------------------------|
| 0   | BYPASS: Bypass mode. One of the input samples is directly fed to the output quantizer         |
| 1   | HALF_DELAY: Delay input by 16 samples (half sample delay at output)                           |
| 2   | HALF_BAND: Decimate output by 2 and shift complex output by 15.625 MHz instead of 31.25 MHz   |
| 3   | ONE_DELAY: Delay input of 2<sup>nd</sup> stage filter by one delay                            |
| 5   | REVERSE_FREQ: Output frequency scale is reversed.                                             |
| 6-7 | BYPASS_FMT: Control the output representation in bypass mode                                  |

Mode2 Register bit assignment

Frequency mapping for direct frequency order

Frequency mapping for reversed frequency order

The BYPASS\_FMT bits select which sample bits and which representation are used in bypass mode for the 2 output bits. In bypass mode, the 3 bit pseudo-Gray code sample from the sampler is converted to a 3 bit binary value. In 2 bit operations, the 2 most significant bits of this value are sent to the correlator.
It is possible to use 4 correlator planes to perform a 3 bit correlation, splitting the signal in 2 parts. For debug purposes (e.g. distribution test), it is possible to send the 2 most significant bits of the raw Gray code to the output. The bits sent to the output are thus:

| Bits 7,6 | Output                                                                        |
|----------|-------------------------------------------------------------------------------|
| 00       | 2 bit bypass: the 2 most significant bits of the converted sample             |
| 01       | 3 bit bypass, MS part: the most significant bit of the converted sample       |
| 10       | 3 bit bypass, LS part: the two least significant bits of the converted sample |
| 11       | Direct mapping of input bits B and C                                          |

Possible values for BYPASS\_FMT bits

The BYPASS\_FMT bits are active only if the BYPASS bit is set. The MODE\_4BIT flag is not used in bypass mode; the output representation (2 or 3 bit) is selected using only these bits.

#### 4.4.3 DDS Frequency and Phase Registers

The DDS frequency is set by writing a 16 bit value (MSB first) to the DDS Frequency register. The value is interpreted as an unsigned frequency, in the range 0-2 GHz, with a step of about 31 kHz. At the frequency update instant (see below), the following actions are performed:

- the DDS phase accumulator is cleared (unless bit NO\_CLEAR\_PHASE, bit 1 of the Mode1 register, is set)
- the DDS frequency is set to the specified frequency
- for the 32 successive clock cycles, the appropriate sample phase offsets are loaded in the time multiplexed slices of the mixer

As a consequence, every time the frequency is changed, a severe phase discontinuity occurs in the data. The phase is however predictable after the glitch, and consistent over different channels and filter boards across the correlator. This should not cause any limitation, as long as no frequency change "on the fly" is allowed during normal operation.

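
As a rough illustration of the 16 bit frequency encoding described above, the following sketch converts a requested LO frequency into a register word and its two bytes in MSB-first order. It assumes that the step is exactly 2 GHz / 2^16 (about 30.5 kHz, consistent with the quoted 31 kHz) and that requests are rounded to the nearest step. The function names are hypothetical and are not part of any TFB or SCC software.

```python
# Sketch only: assumes the DDS step is exactly 2 GHz / 2**16.
# All names here are illustrative, not part of any TFB software.
DDS_FULL_SCALE_HZ = 2.0e9
DDS_WORD_BITS = 16
DDS_STEP_HZ = DDS_FULL_SCALE_HZ / (1 << DDS_WORD_BITS)  # about 30.5 kHz


def dds_frequency_word(freq_hz: float) -> int:
    """Return the 16 bit unsigned word nearest to freq_hz."""
    word = round(freq_hz / DDS_STEP_HZ)
    if not 0 <= word < (1 << DDS_WORD_BITS):
        raise ValueError("frequency outside the DDS range")
    return word


def msb_first(value: int, nbytes: int = 2) -> list[int]:
    """Split a register value into bytes, most significant byte first."""
    return [(value >> (8 * i)) & 0xFF for i in reversed(range(nbytes))]


word = dds_frequency_word(1.0e9)
print(hex(word), msb_first(word))
```

Under these assumptions a 1 GHz request maps to the word 0x8000, written as the bytes 0x80 and 0x00.
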
It may be necessary to change the phase of some of the local oscillators during operation. For example, when the bulk delay changes, the LO phase as seen by the correlator also changes, and this variation must be compensated. The global phase register is used for this purpose. A phase offset is written to this register as a 10 bit value, in units of 1/1024 of a turn, and is added to the DDS output.

In order to ensure consistency and predictability of the phase, frequency and phase update operations must be exactly synchronized. Since the SCC has to control several filters, it is usually impossible to program 32 DDS within one 1 ms frame. Thus it is not sufficient to use the 1 ms strobe alone to synchronize these operations, and a different scheme has been devised.

The values written to the frequency and phase registers are not used directly, but are instead written to temporary registers. When all filters have been programmed, without hardware-imposed time constraints, the SCC broadcasts a write to location 6 or 7 of all the involved filters. This write must be completed in a selected 1 ms frame, just before the requested update time. Since a broadcast operation takes 32 microseconds, this constraint can be easily met. At the next 1 ms pulse, the frequency and/or phase offset registers are updated.

The frequency update sequence is performed only if the write operation is done on address 6. Writing to address 7 triggers a load of the phase value only. The phase offset register is loaded both after a broadcast to address 6 and after a broadcast to address 7, and thus must be set to a known value prior to a frequency change.

Phase synchronization can be overridden by the PROMPT\_LD bit in the Mode1 register: if this bit is set, the phase offset is immediately applied to the DDS output. This means that the phase will be incorrect until both bytes have been written.

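
The two-step update described above (stage the values, then trigger the load) can be sketched as follows. This is only an illustration under several assumptions: `FakeBus` is a hypothetical stand-in that records writes instead of driving the SCC bus, the 10 bit phase value is assumed to be right-aligned in the 2 byte register, a dummy byte of 0 is used for the "address only" location, and the sub-channel mask addresses only one group of 4 filters (the real broadcast across groups is not modelled).

```python
# Sketch only: FakeBus and every function name here are hypothetical.
REG_DDS_FREQ = 4      # DDS Frequency (2 bytes)
REG_DDS_PHASE = 5     # DDS phase offset (2 bytes)
REG_LOAD_FREQ = 6     # address only: load frequency and phase offset


def control_byte(register: int, subchannel_mask: int) -> int:
    """Bits 3-0 select the register, bits 7-4 select sub-channels."""
    return ((subchannel_mask & 0xF) << 4) | (register & 0xF)


def phase_word(turns: float) -> int:
    """10 bit phase offset in units of 1/1024 turn, wrapped to one turn."""
    return round(turns * 1024) % 1024


class FakeBus:
    """Records (control byte, data bytes) pairs instead of touching hardware."""

    def __init__(self) -> None:
        self.log: list[tuple[int, list[int]]] = []

    def write(self, control: int, data: list[int]) -> None:
        self.log.append((control, list(data)))


def stage_and_load(bus: FakeBus, mask: int, freq_word: int, turns: float) -> None:
    phase = phase_word(turns)
    # Step 1: fill the temporary registers, MSB first; no timing constraint.
    bus.write(control_byte(REG_DDS_FREQ, mask), [freq_word >> 8, freq_word & 0xFF])
    bus.write(control_byte(REG_DDS_PHASE, mask), [phase >> 8, phase & 0xFF])
    # Step 2: trigger write, to be completed in the 1 ms frame that
    # precedes the requested update time. The value written is ignored.
    bus.write(control_byte(REG_LOAD_FREQ, mask), [0])


bus = FakeBus()
stage_and_load(bus, mask=0b1111, freq_word=0x8000, turns=0.25)
for control, data in bus.log:
    print(hex(control), data)
```

Keeping the trigger as a separate final write mirrors the point made in the text: only that last write has to meet the 1 ms frame deadline.
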
#### 4.4.4 FIR Tap Coefficient Registers

The first stage filter has fixed tap coefficients, stored in ROM. The second stage filter is programmable, and the filter shape is determined by loading a set of 32 tap coefficients.

To initialize the second stage FIR tap coefficients, each value is loaded in the Tap Coefficient Register (9 bit signed, MSB first). Then the tap address (5 bit, 0 to 31) is written in the Tap Address Register. This operation initiates a memory write cycle to the Tap Coefficient Memory. The sequence is repeated 32 times, until all values are loaded.

Tap values are common for the I and Q branches of the complex filter, i.e. the FIR performs a convolution of complex data with a real FIR function. The function is symmetric around zero delay, i.e. the same coefficients are applied for positive and negative lags.

If more than one filter shares the same tap values (the usual case), this operation can be broadcast.

#### 4.4.5 Output requantization register

The 9 bit output sample must be re-quantized to a two bit sample to be used by the correlator. This is performed by multiplying the sample value by an appropriate scale factor, and choosing two bits of the result. 4 bit quantization is also possible, using two filters in parallel.

The scale factor can be computed using the digital total power computed by the monitor circuit. An 8 bit scale factor is used, allowing for about 1% accuracy in threshold setting. The quantization step is set to 2<sup>11</sup> and 2<sup>9</sup>, for 2 and 4 bit requantization respectively.

For two bit operations, the scale factor must be set to the value 2<sup>11</sup>/0.996/RMS(out) = 2056/RMS(out), where RMS(out) is the measured RMS value, and 0.996 is the optimal step for 2 bit quantization.

For 4 bit operations the scale factor must be set to 1560/RMS(out). Two filters must be used with exactly the same programming. The second filter in the group must have the bit MODE\_4BIT set in the Mode1 register.
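
The scale factor formulas above can be evaluated as in the following sketch. It assumes that the scale factor is written as an integer rounded to the nearest value and that it must fit in 8 bits (1 to 255); the function name and the range check are illustrative only.

```python
# Sketch only: rounding and range limits are assumptions, not firmware facts.
def requant_scale_factor(rms_out: float, four_bit: bool = False) -> int:
    """Return the 8 bit scale factor for a measured output RMS value."""
    if rms_out <= 0:
        raise ValueError("RMS(out) must be positive")
    numerator = 1560.0 if four_bit else 2056.0  # constants quoted in the text
    scale = round(numerator / rms_out)
    if not 1 <= scale <= 255:
        raise ValueError("scale factor does not fit in 8 bits")
    return scale


print(requant_scale_factor(20.0))                 # 2 bit requantization
print(requant_scale_factor(20.0, four_bit=True))  # 4 bit requantization
```

With these assumptions, a measured RMS of 20 gives scale factors of 103 and 78 for 2 bit and 4 bit operation respectively.
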
If the 4 bit mode is selected in the Mode1 register, then the next two bits are used. The scale factor must also be set to the optimum value for 4 bit quantization.

In the bypass mode, the re-quantization circuit is not active. Re-quantization from 3 bit Gray code to 2 bit signed code is performed using a fixed table. Output format and mode (2 bit or 3 bit) are selected using the BYPASS\_FMT bits in the Mode2 register.

#### 4.4.6 Monitor Mode Register, Integration Period register

The monitor functions are controlled by the Monitor Mode Register. With this register it is possible to:

- Check one of the input lines with a Random Data Checker
- Select which input line will be used for output in the bypass mode
- Specify the integrator prescaler
- Specify continuous or single-shot integration

| Bit   | Function                                                       |
|-------|----------------------------------------------------------------|
| 0-4   | Input channel for bypass or state count: 0=oldest, 31=newest   |
| 5-7   | Sample bit, or reference state                                 |
| 8     | 0=input sample, 1=output sample                                |
| 9-11  | Signal to monitor                                              |
| 12-14 | Prescaler setting. A total of 1+2\*value bits are discarded    |
| 15    | Continuous (1) or single-shot (0) integration                  |

Monitor Mode Register bit assignment

Bits 4-0 select which of the 32 input samples is examined in the monitor function. This sample is also used in the bypass mode, to generate the output time-multiplexed sample to the station card, and can be monitored on the test point lines.

Bit 8 selects whether the input or output samples are monitored. If bit 8 is set to 0, the monitor circuit examines one of the 32 3-bit samples. The sample bit field (bits 7-5) has several functions, depending on the test performed (bits 11-9). For state count, the field is compared to the sample, and matches are counted.
For most tests, this field selects the bit to be tested: 1=bit B, 2=bit C, 3=bit D. For RDG error rate count, a value of 0 selects an "equality test": all 3 bits are assumed equal, and mismatches are counted. Bit 7 selects a "bust" mode to test the data checker itself. When this bit is set, and sample 11 is selected, the random data checker is initialized with the complement of the true signal, producing deterministic errors.

If bit 8 is set, the 2-bit requantized output of the filter is monitored for functions 0, 1 and 2. The total power and DC value functions for the output sample refer to the 9-bit filter output. The DC value is the arithmetic mean of the selected sample, and may be useful to check for a DC offset in the signal.

The monitor circuit can monitor the following values, depending on the value in bits 9-11 of the register:

- 0: Number of "1" bits in the selected sample (3 bit input, or 2 bit output)
- 1: State count for selected input state (3 bit input, or 2 bit output)
- 2: RDG Error rate count from RD Checker (3 bit input, or 2 bit output)
- 3: Total power (of the 3 bit input, or 9 bit output)
- 4: DC value (of 3 bit input, or 9 bit output)
- 5: Unused
- 6: Count always (to check the counter)
- 7: Count never (also to check the counter)

Function 2 (RDG Error rate) requires that the filter inputs receive a random data pattern, for example generated by the delay chips. The "equality test" also requires that the pattern be synchronized between the three sample bits.

The integration period is specified in units of 1 ms, and is an unsigned 16 bit quantity. Therefore, the maximum integration period is about 1 minute. The minimum integration period depends on the polling capability of the SCC. An integration period of 0 is not allowed. Integration is started when the "Start TP integration" location (register 13 in the filter register table) is broadcast.
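
The bit fields of the Monitor Mode Register described above can be packed into the 16 bit register value as in the following sketch. The function and argument names are hypothetical; only the field positions and widths are taken from the bit assignment table.

```python
# Sketch only: names are illustrative; field layout follows the table.
def pack_monitor_mode(sample_index: int, bit_field: int, use_output: bool,
                      function: int, prescaler: int, continuous: bool) -> int:
    """Pack the Monitor Mode Register fields into a 16 bit word."""
    fields = (
        (sample_index, 5, 0),       # bits 4-0: input sample (0 = oldest)
        (bit_field, 3, 5),          # bits 7-5: sample bit or reference state
        (int(use_output), 1, 8),    # bit 8: 0 = input, 1 = output
        (function, 3, 9),           # bits 11-9: signal to monitor
        (prescaler, 3, 12),         # bits 14-12: prescaler setting
        (int(continuous), 1, 15),   # bit 15: continuous integration
    )
    word = 0
    for value, width, shift in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word |= value << shift
    return word


# Total power (function 3) of the filter output, prescaler 7, single shot.
print(hex(pack_monitor_mode(0, 0, True, 3, 7, False)))
# State count (function 1) of input sample 5 against state 2, continuous.
print(hex(pack_monitor_mode(5, 2, False, 1, 0, True)))
```

The two example calls give 0x7700 and 0x8245. The resulting word would then be written to register 11, most significant byte first, like the other multibyte registers.
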
If bit 15 of the Monitor Mode Register is set, integration is automatically restarted at the end of the integration period. This bit is sensed at the end of the integration period. To stop this feature, bit 15 must be cleared.

To ensure that the result will fit in the output register, a prescaler is provided, controlled by bits 12-14 of the Monitor Mode Register. The number of bits discarded is equal to (1+2\*prescaler setting), i.e. it ranges from 1 (field set to 0), adequate for 1 ms of integration time, to 13 (field set to 7), allowing an integration at 125 MHz for 60 s to fit in the 20 bit result.

At the end of the integration period, the monitor counter is copied to a 20 bit monitor output register. Bit 4 of monitor location 2 is set to 1, and cleared when the register is read. In this way, it is possible to poll the register waiting for a new valid value.

#### 4.4.7 Low power mode

In several observing modes the TFB will be operated with a reduced number of sub-channels, or in the bypass mode. In these modes, the digital filter section is not used. In order to reduce the overall power dissipation, it is appropriate to switch off the unused filter sections, and thus improve the system reliability and the average component lifetime.

A low power mode has therefore been provided. It should be selected for all chips in the bypass mode, and for all unused chips when fewer than the maximum number of sub-channels are used.

Low power mode is selectable on a chip by chip basis: the Low Power Mode Register of the even sub-channel controls both filters in the same chip. When "0" is written to this register, the filter input and the local oscillator frequency are set to zero, reducing the total power required by the FPGA. The signal distribution, output and monitor sections are unaffected.
To enable normal operations, a "1" must be written to the Low Power Mode Register. At power up all chips are programmed in low power mode, to ensure minimum power consumption until initialization is completed.

#### 4.4.8 Monitor points

Each FIR has 4 readable locations. Location is selected by bits 0-1 of the FPGA control register, while filter address is given by bits 4-7 of the same register. Register content is analogous to that of the delay chips, to provide a common interface.

| Register | Function                           |
|----------|------------------------------------|
| 0        | Monitor counter bits 19-12         |
| 1        | Monitor counter bits 11-4          |
| 2        | Bits 3-0: Monitor counter bits 3-0 |
|          | Bit 4: data valid bit              |
|          | Bits 7-5: DLL locked flags         |
| 3        | MSB of RDG seed                    |
| 4        | Data capture register              |

FIR monitor registers

The monitor output register gives a 20 bit unsigned value, distributed between addresses 0, 1 and 2. Register 2 also provides a "data valid" bit (bit 4), which is set when new data has been loaded and is automatically cleared after a read, and 3 DLL "locked" signals (at most 3 DLLs are likely to be used in the chip).

Monitor location 3 reports the 8 most significant bits of the RDG seed register. This can be used to check that a value written in this register can be correctly read back.

The data capture register captures 4 consecutive 2-bit samples at each 1 ms time boundary. LSBs are stored in even bits, MSBs in odd bits, with the oldest sample in bits 1-0. The expected sample values for deterministic input signals (pseudo random noise with specified seed, constant values) can be computed using a circuit model, and compared with the captured values for testing.
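
As an illustration of the register layout above, the following sketch reassembles the 20 bit monitor counter from the three bytes read at locations 0, 1 and 2, and unpacks the data capture register into four raw 2-bit codes. The function names are hypothetical, and the byte values are assumed to have already been read from the bus by some other means; no conversion of the 2-bit codes to signed sample values is attempted.

```python
# Sketch only: decodes bytes already read from monitor locations 0, 1, 2 and 4.
def decode_monitor(reg0: int, reg1: int, reg2: int) -> tuple[int, bool, int]:
    """Return (20 bit counter, data valid flag, 3 bit DLL locked flags)."""
    counter = ((reg0 & 0xFF) << 12) | ((reg1 & 0xFF) << 4) | (reg2 & 0x0F)
    data_valid = bool(reg2 & 0x10)       # bit 4 of location 2
    dll_locked = (reg2 >> 5) & 0x07      # bits 7-5 of location 2
    return counter, data_valid, dll_locked


def decode_capture(reg4: int) -> list[int]:
    """Return the four captured 2-bit codes, oldest sample first."""
    return [(reg4 >> (2 * i)) & 0x03 for i in range(4)]


print(decode_monitor(0xAB, 0xCD, 0x3E))
print(decode_capture(0b11100100))
```

For the example bytes, the counter is 0xABCDE with the data valid bit set and DLL flags equal to 1, and the capture byte decodes to the codes 0, 1, 2, 3.
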