ALPACA PDR#2 Overall System Capabilities, RF Sampling, and Data Storage

> Brian Jeffs, June 18, 2019



**BYU Electrical & Computer Engineering** IRA A. FULTON COLLEGE OF ENGINEERING







# **ALPACA Performance Specs. (1)**

| Performance Characteristic                             | Specification                                              |
|--------------------------------------------------------|------------------------------------------------------------|
| Frequency Coverage (tunable within this range)         | 1300 – 1720 MHz (420 MHz total BW)                         |
| Beamformer real-time processing bandwidth              | 305.2 MHz                                                  |
| Number of real-time beams                              | 40                                                         |
| Integrated spectra data products per beam, per channel | XX pol (real float), YY pol (real float), XY pol (complex) |
| Pulsar / Transient mode:                               |                                                            |
| Number of frequency channels; BW per channel           | 1250 coarse chan.; 244.1 kHz separation, 325.5 kHz BW      |
| Fastest integration dump interval                      | 64 microseconds                                            |
| HI Spectral Line (zoom spectrometer) mode:             |                                                            |
| Total number of frequency channels; BW per channel     | 36,000 (spanning 183.7 MHz); 5.1 kHz                       |
| Shortest integration dump interval                     | 100 ms                                                     |
| Beamformer calibration mode:                           |                                                            |
| Covariance matrix outputs for each of 512 coarse channels | Lower triangular 144x144 matrices, 500 ms shortest dump interval |
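
As a rough sizing aid for the calibration-mode row above, the sketch below counts the unique entries in one lower-triangular 144x144 covariance matrix and shows one way to pack and unpack them. The row-by-row packing order and the function names are assumptions made for illustration; the actual output format is not specified in these slides.

```python
NUM_PORTS = 144  # input ports (antennas), from the specification tables


def lower_triangle_count(n):
    """Number of entries on or below the diagonal of an n x n matrix."""
    return n * (n + 1) // 2


def pack_lower_triangle(matrix):
    """Flatten the lower triangle row by row (ASSUMED ordering)."""
    return [matrix[r][c] for r in range(len(matrix)) for c in range(r + 1)]


def unpack_hermitian(packed, n):
    """Rebuild the full Hermitian matrix from its packed lower triangle."""
    full = [[0j] * n for _ in range(n)]
    idx = 0
    for r in range(n):
        for c in range(r + 1):
            value = packed[idx]
            full[r][c] = value
            if c != r:
                # The upper triangle is the conjugate of the lower triangle.
                full[c][r] = value.conjugate()
            idx += 1
    return full


entries_per_matrix = lower_triangle_count(NUM_PORTS)
```

With 144 ports this comes to 10,440 complex values per matrix, which is the quantity that scales the calibration-mode output volume.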



# **ALPACA Performance Specs. (2)**

| Performance Characteristic                                 | Specification                                             |
|------------------------------------------------------------|-----------------------------------------------------------|
| LO and IF frequencies                                      | NONE: direct sampling of bandpass RF                      |
| ADC sample rate; resolution                                | 2,000 Msamp/s; 12 bits (10+ ENOB)                         |
| Complex baseband sample rate                               | 500 Msamp/s                                               |
| 1<sup>st</sup> stage PFB FFT length; oversample ratio      | 2048 channels; 4/3 oversampled                            |
| 2<sup>nd</sup> stage (zoom) PFB length; oversample ratio   | 64, pruned to 48 non-overlapped channels; 1/1             |
| Peak I/O data rates:                                       |                                                           |
| Output data rate per FPGA board; input rate per HPC        | 52.1 Gbps; 37.5 Gbps (8 bit real + 8 bit imag. samples)   |
| Total max output data rate in pulsar spectrometer mode     | 50.0 Gbps (16 bit int real & 32 bit int complex: 16r+16i) |
| Optional (unfunded) beamformed voltage data mode:          | (Ability to support these modes is undetermined)          |
| Beamformed raw voltages data rate, total over all HPCs     | 520.8 Gbps (cmplx int 16, 40 beams, X&Y pol)              |
| Beamformed raw voltages data rate, one HPC                 | 20.83 Gbps (cmplx int 16, 40 beams, X&Y pol)              |
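
The channel widths and peak data rates in these tables can be cross-checked with a few lines of arithmetic. The sketch below is one reading of how the figures fit together, not the authors' own calculation. It assumes 8 antenna inputs per FPGA board (144 ports over 18 boards), 50 coarse channels per HPC (1250 channels over 25 HPCs), and that "cmplx int 16" means 16 bits per complex sample; the board, HPC, and port counts come from the other specification tables in this document.

```python
# Values quoted in the specification tables.
BASEBAND_RATE_HZ = 500e6      # complex baseband sample rate
PFB_LENGTH = 2048             # 1st stage PFB FFT length
OVERSAMPLE = 4 / 3            # 1st stage oversample ratio
NUM_COARSE_CHANNELS = 1250
NUM_BEAMS = 40
NUM_POLS = 2
NUM_PORTS = 144
NUM_HPCS = 25
DUMP_INTERVAL_S = 64e-6       # fastest pulsar-mode dump interval

# ASSUMPTIONS made for this cross-check (see lead-in).
PORTS_PER_BOARD = 8
CHANNELS_PER_HPC = NUM_COARSE_CHANNELS // NUM_HPCS   # 50
BITS_PER_VOLTAGE_SAMPLE = 16   # 8 bit real + 8 bit imag
BITS_PER_BEAM_VOLTAGE = 16     # reading of "cmplx int 16"
BITS_PER_SPECTRUM_POINT = 16 + 16 + 32   # XX + YY + XY (16r+16i)

channel_spacing_hz = BASEBAND_RATE_HZ / PFB_LENGTH
channel_bw_hz = channel_spacing_hz * OVERSAMPLE        # per-channel sample rate
beamformer_bw_hz = NUM_COARSE_CHANNELS * channel_spacing_hz

fpga_output_bps = (PORTS_PER_BOARD * NUM_COARSE_CHANNELS
                   * channel_bw_hz * BITS_PER_VOLTAGE_SAMPLE)
hpc_input_bps = (NUM_PORTS * CHANNELS_PER_HPC
                 * channel_bw_hz * BITS_PER_VOLTAGE_SAMPLE)
pulsar_output_bps = (NUM_BEAMS * NUM_COARSE_CHANNELS
                     * BITS_PER_SPECTRUM_POINT / DUMP_INTERVAL_S)
beam_voltage_per_hpc_bps = (NUM_BEAMS * NUM_POLS * CHANNELS_PER_HPC
                            * channel_bw_hz * BITS_PER_BEAM_VOLTAGE)
beam_voltage_total_bps = beam_voltage_per_hpc_bps * NUM_HPCS
```

Under these assumptions the results agree with the tabulated 244.1 kHz, 325.5 kHz, 305.2 MHz, 52.1 Gbps, 37.5 Gbps, 50.0 Gbps, 20.83 Gbps, and 520.8 Gbps figures.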



# **ALPACA Performance Specs. (3)**

| Performance Characteristic                           | Specification                                                                                                                                      |
|------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|
| Number of input ports (antennas)                     | $138 + 6$ spare = 144, index: $0 \le j \le 143$                                                                                                    |
| Number of Xilinx ZCU111 FPGA boards   fid index      | 18   index: $0 \le p \le 17$ , fid = p                                                                                                             |
| Number of HPCs                                       | 25 index: $0 \le i \le 24$                                                                                                                         |
| Number of GPUs per HPC                               | 2 index: $0 \le l \le 1$                                                                                                                           |
| Number of processing threads per GPU                 | 1                                                                                                                                                  |
| xid index: identifies a unique GPU & hashpipe thread | $\operatorname{xid} = q = 2i + l,  0 \le q \le 49$                                                                                                 |
| Frequency channels processed by $q$th xid            | $k \in \{q, q + 50, q + 100, \dots, q + 1200\}$ for xid range of $0 \le q \le 49$.<br>This implies the channel index range is $0 \le k \le 1249$.   |
| Reduced bandwidth modes: total BW processed options  | 305.2, 244.2, 183.7, 122.1, or 61.0 MHz total BW<br>(i.e., select and process any of the five 61 MHz wide subbands)                                |
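
The xid and channel indexing in the table above can be written out as a short sketch. The function names are hypothetical, and `xid_for_channel` is simply the inverse implied by the interleaved channel assignment rather than something stated in the slides.

```python
NUM_HPCS = 25
GPUS_PER_HPC = 2
NUM_XIDS = NUM_HPCS * GPUS_PER_HPC   # 50
NUM_COARSE_CHANNELS = 1250


def xid(hpc_index, gpu_index):
    """xid = q = 2i + l for HPC index i and GPU index l."""
    if not 0 <= hpc_index < NUM_HPCS:
        raise ValueError("HPC index must satisfy 0 <= i <= 24")
    if not 0 <= gpu_index < GPUS_PER_HPC:
        raise ValueError("GPU index must satisfy 0 <= l <= 1")
    return 2 * hpc_index + gpu_index


def channels_for_xid(q):
    """Coarse channels k = q, q + 50, ..., q + 1200 handled by xid q."""
    if not 0 <= q < NUM_XIDS:
        raise ValueError("xid must satisfy 0 <= q <= 49")
    return list(range(q, NUM_COARSE_CHANNELS, NUM_XIDS))


def xid_for_channel(k):
    """Inverse mapping implied by the interleaving: q = k mod 50."""
    if not 0 <= k < NUM_COARSE_CHANNELS:
        raise ValueError("channel index must satisfy 0 <= k <= 1249")
    return k % NUM_XIDS
```

Each xid receives 25 channels spread evenly across the band, so every GPU sees a comb of the full processed bandwidth rather than one contiguous block.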

# Improved Array Geometry


- Reduced component count
- Reduced processing requirements
- Similar FOV sensitivity flatness
- Improved $T_{sys}$ due to lower mutual coupling and better impedance match
- Reduced cryogenic cooling complexity: 1 compressor, 3 cold heads vs 2 and 4 respectively



## Beamformer Subsystems



#### Notes:

- 1. This solution assumes the newly developed 100 GbE FPGA block will support 4 independent 25 GbE ports.
- 2. If assumption (1) is not satisfied, then each ZCU111 must be dedicated to a single switch, and each of the 4 25 GbE ports per HPC is routed to a different switch.
- 3. In the case of (2), the number of available ports per switch drops from 7 to 2.
- 4. To support a single xid thread per GPU, channel bonding is required for the 25 GbE port pairs.
- 5. All network cables are single QSFP28 100 GbE to quadropus 4 x SFP28 25 GbE fan out connectors.

## Xilinx RFSoC board, F-Engine


- RFSoC FPGA has 8 on-chip ADCs, 4 Gsamp/sec, with digital down conversion and lowpass filtering to complex baseband.
- Will directly sample RF over fiber downlinks with no analog mixer.
- 4 x 25 GbE ports support the I/O data rate to the HPCs/GPUs.
- 18 of these ZCU111 boards will be used to support 138 antenna inputs.
- Oversampled polyphase filter bank for coarse frequency channelization.



#### ADC, Sampling, and Pre-Filtering



- Passband corners: 1300 MHz and 1720 MHz
- Band-defining filter attenuates adjacent RFI ahead of the RF over fiber link to limit the dynamic range the link must handle
- Anti-alias filter sits just ahead of the ADC to reduce noise aliased into the passband
- Pre-ADC filter is lower order, lower cost

#### ADC, Digital Down Conversion




- Built-in ADC Digital Down Converter (DDC)
- 2 GHz sample frequency
- Mixes sampled real RF down to complex baseband
- Decimates to final sample rate of 500 MHz
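
The mix, filter, and decimate steps listed above can be sketched in plain Python. This is a generic illustration, not the RFSoC's built-in DDC: the NCO frequency of 1510 MHz (the centre of the 1300 to 1720 MHz band) is an assumption, a Hamming-windowed sinc FIR stands in for the on-chip decimation filters, and the 1420 MHz test tone and all function names are hypothetical.

```python
import cmath
import math

ADC_RATE_HZ = 2.0e9     # ADC sample rate from the slides
DECIMATION = 4          # 2 GHz real samples -> 500 MHz complex baseband
NCO_HZ = 1510.0e6       # ASSUMED: centre of the 1300-1720 MHz band


def lowpass_taps(num_taps, cutoff_hz, fs_hz):
    """Hamming-windowed sinc low-pass FIR, normalized to unity DC gain."""
    fc = cutoff_hz / fs_hz
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]


def ddc(real_samples, nco_hz=NCO_HZ, fs_hz=ADC_RATE_HZ,
        decimation=DECIMATION, num_taps=63):
    """Mix real RF samples to complex baseband, low-pass, and decimate."""
    # 1) Multiply by a complex NCO so the band centre lands at DC.
    mixed = [s * cmath.exp(-2j * math.pi * nco_hz * n / fs_hz)
             for n, s in enumerate(real_samples)]
    # 2) Low-pass at half the output rate, 3) keep every 4th filter output.
    taps = lowpass_taps(num_taps, fs_hz / (2 * decimation), fs_hz)
    out = []
    for end in range(num_taps - 1, len(mixed), decimation):
        out.append(sum(h * mixed[end - k] for k, h in enumerate(taps)))
    return out


def tone_frequency(z, fs_hz):
    """Estimate a single tone's frequency from the mean phase step."""
    acc = sum(z[n + 1] * z[n].conjugate() for n in range(len(z) - 1))
    return cmath.phase(acc) * fs_hz / (2 * math.pi)


# Hypothetical test tone at 1420 MHz, sampled directly at 2 GHz.
rf = [math.cos(2 * math.pi * 1420e6 * n / ADC_RATE_HZ) for n in range(2048)]
baseband = ddc(rf)
estimated_hz = tone_frequency(baseband, ADC_RATE_HZ / DECIMATION)
```

With the assumed NCO setting, the 1420 MHz tone should appear near -90 MHz in the 500 MHz complex baseband stream, that is, 90 MHz below the band centre.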

## ADC, Digital Down Conversion for 2 Gsps sampling



- Built-in ADC pre-decimation anti-alias filter response.
- Decimation by 4 moves 250 MHz to the full bandwidth point at 1.0.
- Complex baseband sample rate is 500 MHz.
- 420 MHz usable passband.
- No NCO tuning to select 305 MHz beamformer band: just pick from PFB channels.
- Ready for upgrade to 420 MHz beamformer with more HPCs.
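
To give a feel for the upgrade point above, the sketch below estimates how many coarse PFB channels and HPCs a full 420 MHz beamformer would need. It assumes the current allocation of 50 coarse channels per HPC (1250 channels over 25 HPCs) carries over unchanged; the slides do not state how an upgraded system would be partitioned.

```python
import math

BASEBAND_RATE_HZ = 500e6
PFB_LENGTH = 2048
CHANNELS_PER_HPC = 50   # ASSUMED to stay at today's 1250 / 25 allocation

channel_spacing_hz = BASEBAND_RATE_HZ / PFB_LENGTH


def channels_needed(bandwidth_hz):
    """Coarse channels required to span the given bandwidth."""
    return math.ceil(bandwidth_hz / channel_spacing_hz)


def hpcs_needed(bandwidth_hz):
    """HPC count at the assumed per-HPC channel allocation."""
    return math.ceil(channels_needed(bandwidth_hz) / CHANNELS_PER_HPC)


current_hpcs = hpcs_needed(1250 * channel_spacing_hz)
upgrade_channels = channels_needed(420e6)
upgrade_hpcs = hpcs_needed(420e6)
```

Under that assumption, covering the full 420 MHz passband takes 1721 of the 2048 PFB channels and about 35 HPCs, compared with 25 today.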



## **XB-Engine: Correlator / Beamformer**

25 x Tyan 2U Server DP Xeon, 96 GB RAM



50 x Nvidia GeForce RTX 2080 Ti GPUs





#### Transient Mode, Coarse PFB (1 of 25)










NVIDIA RTX 2080 Ti GPU, 2 of 2 per HPC













## **Data Sink and Archive Options**

- 1. New external small Lustre file store dedicated to ALPACA
  - Included in ALPACA budget
  - 76 TB, 100 Gbps max write rate, two 24-disk servers (standard hard disks)
  - Requires an additional InfiniBand switch and NICs for the HPCs.
- 2. Use existing AO file store capacity
- 3. Users provide own resources, e.g. specialized back ends after beamforming
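
For a sense of scale on option 1, the sketch below estimates how long continuous recording would take to fill the 76 TB store. It assumes 1 TB = 10^12 bytes and a sustained write rate with no file-system overhead; the 50 Gbps case is the peak pulsar spectrometer output rate quoted in the performance specs.

```python
def fill_time_hours(capacity_tb, write_rate_gbps):
    """Hours of continuous recording before the store is full.

    ASSUMES decimal units (1 TB = 1e12 bytes, 1 Gbps = 1e9 bits/s)
    and a constant sustained write rate.
    """
    capacity_bits = capacity_tb * 1e12 * 8
    seconds = capacity_bits / (write_rate_gbps * 1e9)
    return seconds / 3600


hours_at_pulsar_rate = fill_time_hours(76, 50)    # peak pulsar-mode output
hours_at_max_write = fill_time_hours(76, 100)     # file store write limit
```

At the peak pulsar-mode rate the store would fill in roughly 3.4 hours, and in about 1.7 hours at the full 100 Gbps write rate.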



# **Test Fixtures for Beamformer Development**

- Analog RF beamformer test source
  - Produces 8 outputs of a single sinusoid, plus noise, using Agilent signal generators and RF splitter
  - Connectors and voltage levels compatible with two separate FMCs or RFoF driver boards
  - Tests proper ADC function, synchronous sampling across 2 ZCU111s, and beamformer performance
- Slow Packet Generator
  - MATLAB-based, for non-real-time simulation of the ZCU111 100 GbE output
  - Tests network and beamformer control logic and basic operation
  - Modified version of FLAG packet generator
  - Easy synthesis to simulate arbitrary signal structures arriving at array
- Real-Time Packet Generator
  - Firmware implementation on a single ZCU111, output through a physical 100 GbE port
  - Tests full data rate data integrity and processing speed through switches and HPCs
  - Signal options are noise and single sinusoids with specified phase gradient across 8 simulated antennas
- Beamformed data products sniffer
  - Permits block dump and inspection of beamformer outputs for all required observation modes
  - Hosted on one HPC, implemented on CPU (not GPU), data path through network switches
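
The kind of test signal described for the packet generators, a single sinusoid with a specified phase gradient across 8 simulated antennas plus noise, can be sketched as follows. This illustrates the signal content only; it is not the FLAG packet generator or the ZCU111 firmware, it produces no network packets, and the function names, default parameters, and the simple phase-steered beamformer used as a check are all assumptions.

```python
import cmath
import math
import random


def simulate_antenna_voltages(num_samples, num_antennas=8,
                              tone_cycles_per_sample=0.1,
                              phase_step_rad=0.3, noise_sigma=0.1, seed=0):
    """Complex tone with a linear phase gradient across antennas, plus noise."""
    rng = random.Random(seed)
    data = []
    for n in range(num_samples):
        tone = cmath.exp(2j * math.pi * tone_cycles_per_sample * n)
        row = []
        for m in range(num_antennas):
            noise = complex(rng.gauss(0, noise_sigma), rng.gauss(0, noise_sigma))
            row.append(tone * cmath.exp(1j * phase_step_rad * m) + noise)
        data.append(row)
    return data


def beam_power(data, steer_phase_step_rad):
    """Mean output power of a phase-steered, equal-weight beamformer."""
    num_antennas = len(data[0])
    weights = [cmath.exp(1j * steer_phase_step_rad * m) / num_antennas
               for m in range(num_antennas)]
    total = 0.0
    for row in data:
        y = sum(w.conjugate() * x for w, x in zip(weights, row))
        total += abs(y) ** 2
    return total / len(data)


data = simulate_antenna_voltages(256)
on_source = beam_power(data, 0.3)                      # matched to the gradient
off_source = beam_power(data, 0.3 + 2 * math.pi / 8)   # steered to an array null
```

Steering to the simulated phase gradient should recover close to unit tone power, while steering to the first null of the 8-element array leaves only the averaged noise, which gives a quick sanity check on beamformer weight handling.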