# Com Block

# COM-1001SOFT BPSK/QPSK/OQPSK DEMODULATOR VHDL SOURCE CODE OVERVIEW

### Overview

The COM-1001SOFT is a continuous PSK demodulator written in generic VHDL. The code is portable to most FPGAs.

Features:

- BPSK/QPSK/OQPSK demodulation
- Programmable symbol rate up to fclk/4
- Selectable root raised cosine filter: 20%, 25% or 40% rolloff
- Differential/non-differential decoding
- 4-bit soft-quantized demodulated output
- Extensive monitoring: carrier lock, frequency error, AGC gain, SNR estimate.
- Matlab stimulus generation and VHDL test bench

The main component (*DEMOD.vhd*) is optimized for small device utilization and robustness. Its main limitations are:

- Input sampling rate must be between 4 and 8 times the modulation symbol rate.
- Natural frequency acquisition range is limited to ±1% of the symbol rate.
- Frequency acquisition can be extended to ±10% of the symbol rate by manually enabling the AFC (which must be manually disabled during tracking).
- Limited input frequency translation

An additional front-end component *(RECEIVER1b.vhd)* can be instantiated for a wider range of input sampling rate and frequency translation, at the expense of a larger device utilization.

A Bit Error Rate Tester *BER\_ROOT.vhd* is included to measure the BER and detect any phase ambiguity over a specified window.

# Software Licensing

This software is supplied under the following key licensing terms:

1. A nonexclusive, nontransferable license to use the VHDL source code internally, and
2. An unlimited, royalty-free, nonexclusive transferable license to make and use products incorporating the licensed materials, solely in bit stream format, on a worldwide basis.

The complete VHDL/IP Software License Agreement can be downloaded from http://www.comblock.com/download/softwarelicense.pdf

# Portability

The VHDL source code is written in generic VHDL and thus can be ported to FPGAs from various vendors.

### **Configuration Management**

The current software revision is 31.

| Directory | Contents                                                                                                                                                |
|-----------|---------------------------------------------------------------------------------------------------------------------------------------------------------|
| /doc      | Specifications, user manual, implementation documents                                                                                                   |
| /src      | .vhd source code, .pkg packages, .xdc constraint files (Xilinx).<br>One component per file. |
| /sim      | VHDL test benches                                                                                                                                       |
| /matlab   | Matlab .m file for generating stimulus files<br>for VHDL simulation and for end-to-end<br>BER performance analysis at various signal<br>to noise ratios |

Project files:

- Xilinx ISE 14 project file: com1001\_14.xise
- Xilinx Vivado v2017.4 project file: project\_1.xpr

## VHDL development environment

The VHDL software was developed using the following development environment:

- (a) Xilinx ISE 14.7 for synthesis, place and route
- (b) Xilinx Vivado 2017.4 for synthesis, place and route and VHDL simulation

The entire project fits easily within a Xilinx Artix7-100T. Therefore, the ISE project can be processed using the free Xilinx WebPack tools.

### **Device Utilization Summary**

The demodulator size is fixed (not parameterized).

Device: Xilinx Artix7-100T

| demod.vhd      | Used | % of Xilinx Artix7-100T |
|----------------|------|-------------------------|
| Registers      | 2732 | 2%                      |
| LUTs           | 2858 | 4%                      |
| Block RAM/FIFO | 0    | 0%                      |
| DSP48          | 0    | 0%                      |
| GCLKs          | 1    | 3%                      |

| Optional front-end receiver1b.vhd | Used | % of Xilinx Artix7-100T |
|-----------------------------------|------|-------------------------|
| Registers                         | 2222 | 1%                      |
| LUTs                              | 3193 | 5%                      |
| Block RAM/FIFO                    | 1    | 1%                      |
| DSP48                             | 6    | 2%                      |
| GCLKs                             | 1    | 3%                      |

### Clock and decoding speed

The entire design uses a single global clock CLK. Typical maximum clock frequencies for various FPGA families are listed below:

| Device family                   | Maximum clock frequency | Maximum symbol rate |
|---------------------------------|-------------------------|---------------------|
| Xilinx Artix-7, -1 speed grade  | 110 MHz                 | 27.5 MSymbols/s     |
| Xilinx Kintex-7, -2 speed grade | 180 MHz                 | 45 MSymbols/s       |
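
The symbol rates in the table follow from the fclk/4 limit listed in the features. A minimal sketch of that arithmetic is shown below; the function name `max_symbol_rate` is hypothetical and is not part of the VHDL source.

```python
def max_symbol_rate(fclk_hz: float) -> float:
    """Maximum programmable symbol rate, assuming the fclk/4 limit."""
    return fclk_hz / 4.0


if __name__ == "__main__":
    # Clock frequencies taken from the table above.
    for family, fclk_hz in (("Artix-7 -1", 110e6), ("Kintex-7 -2", 180e6)):
        rate = max_symbol_rate(fclk_hz)
        print(f"{family}: {rate / 1e6:.1f} MSymbols/s")
```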

### **Configuration Options**

In order to provide configuration flexibility without unduly increasing the hardware complexity, some features require generating different firmware versions. In particular, the channel filter (root raised cosine) rolloff can take three distinct values: 20%, 25% and 40%.

Three versions of the *raised\_cos4x* root raised cosine filters are included in the source code /src/demod directory. Filter selection is done by defining the FILTER\_OPTION ASCII constant in the top level generic section.

- "B" (x"42") for 20% rolloff
- "C" (x"43") for 25% rolloff
- "D" (x"44") for 40% rolloff

### When to use the receiver1b.vhd front-end

The additional front-end component is needed in the following use cases:

- The A/D converter sampling frequency is greater than 8 times the nominal modulation symbol rate.
- IF undersampling is used, in which case the digitized IF signal is connected to the ADC\_DATA\_I\_IN input and the ADC\_DATA\_Q\_IN input must be zeroed.
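
The two use cases above can be expressed as a simple design-time check. This is only a sketch: the function `needs_front_end` and its arguments are hypothetical names chosen for illustration, and it encodes nothing beyond the two conditions listed in this section.

```python
def needs_front_end(sample_rate_hz: float,
                    symbol_rate_hz: float,
                    if_undersampling: bool = False) -> bool:
    """Return True when the receiver1b.vhd front-end is called for.

    Conditions taken from the text: sampling rate above 8x the nominal
    symbol rate, or IF undersampling.
    """
    samples_per_symbol = sample_rate_hz / symbol_rate_hz
    return if_undersampling or samples_per_symbol > 8.0


if __name__ == "__main__":
    # 100 MSamples/s ADC, 10 MSymbols/s signal: 10 samples per symbol.
    print(needs_front_end(100e6, 10e6))
    # 40 MSamples/s ADC, 10 MSymbols/s signal: 4 samples per symbol.
    print(needs_front_end(40e6, 10e6))
```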

### Clock / Timing

All signal processing, inputs and outputs are synchronous with the external CLK clock.

### VHDL software hierarchy

Device: xc7a100t-1csg324

- COM1001\_TOP - Behavioral (src\com1001\_top.vhd)
  - RECEIVER1\_001 - RECEIVER1B - Behavioral (src\receiver1\receive…)
  - DEMOD\_001 - DEMOD - behavioral (src\demod\demod.vhd)
    - DIGITAL\_DC1\_001 - DIGITAL\_DC1B - DIGITAL\_DC\_ARCH (src…)
    - RAISED\_COS\_001 - RAISED\_COS4B - RAISED\_COS\_arch (src\d…)
    - RAISED\_COS\_002 - RAISED\_COS4B - RAISED\_COS\_arch (src\d…)
    - RAISED\_COS\_001 - RAISED\_COS4C - RAISED\_COS\_arch (src\d…)
    - RAISED\_COS\_002 - RAISED\_COS4C - RAISED\_COS\_arch (src\d…)
    - RAISED\_COS\_001 - RAISED\_COS4D - RAISED\_COS\_arch (src\d…)
    - RAISED\_COS\_002 - RAISED\_COS4D - RAISED\_COS\_arch (src\d…)
    - AGC3\_001 - AGC3 - AGC3\_ARCH (src\demod\agc3d.vhd)
    - FREQ\_ACQ\_005 - FREQ\_ACQ - BEHAVIOR (src\demod\freq\_a…)
    - COSTAS\_001 - COSTAS1B - COSTAS\_ARCH (src\demod\cost…)
    - BIT\_TIMING\_LOOP\_001 - BIT\_TIMING1B - BIT\_TIMING\_ARC…
    - SNR\_MEASUREMENT\_001 - SIGNAL\_NOISE\_RATIO - BEHAVIO…
  - BER\_001 - BER\_ROOT - behavioral (src\BER\BER\_ROOT.VHD)

The code is stored with one, and only one, entity per file.

The top program is *com1001\_top.vhd*. It includes the demodulator *demod.vhd*, the optional digital receiver front-end component *receiver1b.vhd*, and an ancillary Bit Error Rate tester *ber\_root.vhd*.

*demod.vhd* encompasses most BPSK/QPSK demodulation functions: AGC, AFC, carrier tracking (Costas loop) and Gardner symbol timing tracking loop.

- the frequency translation is implemented within *digital\_dc1b*. The frequency translation is realized in the form of a complex vector rotation, using sine/cosine lookup tables (*signed\_sin\_cos\_tbl*) and pipeline multipliers (*signed\_mult12x9\_10\_s*) made of half adders *ha* and full adders *fa*.
- channel filtering is made by means of two root raised cosine filters *raised\_cos4x*, one for each complex axis.
- Bit timing recovery is implemented in the *bit\_timing1b* component. The bit timing loop is a classic Gardner loop. It works on (I,Q) input signals sampled at twice the symbol rate. The input signals are taken after center frequency compensation, root raised cosine filtering and AGC. One first computes the timing error as follows: $I_{j-1/2}\,(I_j - I_{j-1}) + Q_{j-1/2}\,(Q_j - Q_{j-1})$, where $I_{j-1/2}$ denotes a half symbol offset with respect to sample $I_j$. When the loop is tracking, $I_{j-1/2}$ is at the center of the symbol (optimum sampling instant) and $I_j$ and $I_{j-1}$ are at the time of bit transition. When averaged over all data bit patterns, $(I_j - I_{j-1})$ is zero when the loop is tracking.

Because the bit timing errors are usually small (200 ppm max due to crystal frequency offsets), the bit timing loop is a first-order loop: the bit timing error is simply scaled, without accumulation or integration, to control the bit timing NCO around its nominal value. The bit timing NCO controls the resampling of the input samples at the demodulator input. Prior to the resampling (decimation), the input samples are first subject to a simple x2 interpolation so that the granularity of the resampling is at most 1/8th of a symbol. The bit timing NCO is part of the *demod* entity.
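
The timing error expression and the first-order correction described above can be sketched in a few lines of floating-point Python. This is an illustrative model only, not a transcription of *bit\_timing1b*: the function names, the `gain` parameter and the sign convention of the NCO correction are assumptions.

```python
def gardner_timing_error(i_prev, i_mid, i_curr, q_prev, q_mid, q_curr):
    """Timing error from three samples per axis, spaced half a symbol apart.

    i_prev, i_mid, i_curr correspond to I[j-1], I[j-1/2] and I[j]
    (likewise for Q).
    """
    return i_mid * (i_curr - i_prev) + q_mid * (q_curr - q_prev)


def first_order_nco_step(nominal_step, timing_error, gain):
    """First-order loop: the scaled error offsets the nominal NCO step.

    No accumulator and no integrator. The sign of the correction is an
    assumption made for this sketch.
    """
    return nominal_step + gain * timing_error


if __name__ == "__main__":
    # Outer samples equal on both axes: the error is zero.
    print(gardner_timing_error(0.0, 1.0, 0.0, 0.0, -1.0, 0.0))
    # Outer samples unbalanced on the I axis: the error is non-zero.
    print(gardner_timing_error(0.25, 1.0, -0.25, 0.0, 0.0, 0.0))
```

In a fixed-point implementation the gain would typically be a power of two so that the scaling reduces to a bit shift.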

- A digital AGC *agc3* is used to normalize the signal amplitude and to condition the resulting signal in a variety of formats: 4-bit unsigned (soft-quantization), 8-bit signed (for carrier tracking loop processing).
- Most carrier acquisition, carrier tracking and AFC functions are implemented within the *costas* entity. The Costas carrier tracking loop is a second-order loop which cancels the average phase error and the average frequency error when tracking. The actual carrier NCO is part of the *demod* entity.
- The inverse signal-to-noise ratio is obtained in *snr.vhd* by computing the standard deviation of the 4-bit soft-quantized samples around the noiseless 1111/0000 values. The standard deviation is averaged over 4096 symbols, thus resulting in an accuracy better than 0.6 dB.
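
The inverse SNR measurement described in the last bullet can be modeled as the RMS deviation of each 4-bit soft sample from the nearest noiseless value (0 for 0000, 15 for 1111). The sketch below is a floating-point approximation written for illustration; the function name and the nearest-value decision rule are assumptions, not the actual VHDL arithmetic.

```python
import math


def inverse_snr_estimate(soft_samples):
    """RMS deviation of 4-bit soft samples around the noiseless 0/15 values.

    A larger result means a noisier signal. The text averages over 4096
    symbols; any non-empty window is accepted here.
    """
    if not soft_samples:
        raise ValueError("at least one sample is required")
    total = 0
    for sample in soft_samples:
        if not 0 <= sample <= 15:
            raise ValueError("samples must be 4-bit unsigned values")
        # Distance to the nearest noiseless value (assumed decision rule).
        deviation = sample if sample < 8 else 15 - sample
        total += deviation * deviation
    return math.sqrt(total / len(soft_samples))


if __name__ == "__main__":
    print(inverse_snr_estimate([0, 15, 15, 0]))   # noiseless input
    print(inverse_snr_estimate([1, 14, 2, 13]))   # noisy input
```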

Device: xc7a100t-1csg324

- COM1001\_TOP - Behavioral (src\com1001\_top.vhd)
  - RECEIVER1\_001 - RECEIVER1B - Behavioral
    - AGC17 (src\receiver1\agc17.vhd)
    - AGC21\_001 - AGC21 - behavioral (src\receiver1\agc21.vhd)
    - BIAS\_REMOVAL\_001 - BIAS\_REMOVAL - behavioral (src\receiver1\bias\_…)
    - CIC\_FILTER\_001 - CIC - behavioral (src\receiver1\cic.vhd)
    - CIC\_FILTER\_002 - CIC - behavioral (src\receiver1\cic.vhd)
    - FIRHALFBAND3\_11 - FIRHALFBAND3 - Behavioral (src\receiver1\firhalf…)
    - FIRHALFBAND3\_01 - FIRHALFBAND3 - Behavioral (src\receiver1\firhalf…)
  - DEMOD\_001 - DEMOD - behavioral (src\demod\demod.vhd)
  - BER\_001 - BER\_ROOT - behavioral (src\BER\BER\_ROOT.VHD)

The optional *receiver1b.vhd* implements the following functions:

- the AGC *agc17.vhd* prevents saturation at the external A/D converters.
- *agc21.vhd* normalizes the internal complex signal.
- bias\_removal.vhd
- *digital\_dc3.vhd* translates the complex input signal center frequency. The frequency shift can be large (for example fclk/4 for IF undersampling)
- CIC (*cic.vhd*) and half-band FIR filters (*firhalfband3.vhd*) are used to decimate while preventing noise aliasing.
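
For readers unfamiliar with CIC decimators, the sketch below shows the general technique: a cascade of integrators running at the input rate, followed by a cascade of combs running at the decimated rate. It is a generic textbook structure, not a model of *cic.vhd*; the number of stages, the decimation ratio and the unit differential delay are assumptions.

```python
def cic_decimate(samples, rate, stages=3):
    """Generic CIC decimator (differential delay of 1).

    The DC gain is rate ** stages; hardware implementations usually
    compensate for it with a bit shift.
    """
    integrators = [0] * stages
    comb_delays = [0] * stages
    output = []
    for n, x in enumerate(samples):
        # Integrator cascade at the input sampling rate.
        acc = x
        for i in range(stages):
            integrators[i] += acc
            acc = integrators[i]
        # Keep one sample out of `rate`, then run the comb cascade.
        if (n + 1) % rate == 0:
            y = acc
            for i in range(stages):
                y, comb_delays[i] = y - comb_delays[i], y
            output.append(y)
    return output


if __name__ == "__main__":
    # A constant input settles at the DC gain rate ** stages.
    print(cic_decimate([1] * 32, rate=4, stages=2))
```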

### Matlab - VHDL simulation

The Matlab program /matlab/siggen\_psk3.m generates a stimulus file for the demodulator. The program features PRBS11 test sequence generation, PSK modulation, additive white Gaussian noise, Doppler, Doppler rate, and clock drift.

The resulting input.txt file is read by the VHDL testbench /sim/tbcom1001.vhd and fed into the front-end receiver (*receiver1b.vhd*), follow-on demodulator (*demod.vhd*) and subsequent bit error rate tester (*ber\_root.vhd*).
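
As an illustration of the PRBS11 test sequence, the sketch below generates a maximal-length 2047-bit sequence with an 11-stage shift register. The text does not state which polynomial siggen\_psk3.m uses; the feedback from stages 11 and 9 (the ITU-T O.150 PRBS11 arrangement) and the all-ones seed are assumptions.

```python
def prbs11(n_bits, seed=0x7FF):
    """Generate n_bits of a PRBS11 sequence (period 2**11 - 1 = 2047).

    Feedback taps on stages 11 and 9 are an assumption; the seed must be
    non-zero or the register stays stuck at zero.
    """
    state = seed & 0x7FF
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(n_bits):
        feedback = ((state >> 10) ^ (state >> 8)) & 1
        bits.append(feedback)
        state = ((state << 1) | feedback) & 0x7FF
    return bits


if __name__ == "__main__":
    sequence = prbs11(2 * 2047)
    # The sequence repeats after 2047 bits.
    print(sequence[:2047] == sequence[2047:])
```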

### Implementation

### **Bit Error Rate Performances**

The demodulator bit-error-rate performances are within 0.5 dB of the theoretical performance $\frac{1}{2}\,\text{erfc}\left(\sqrt{E_b/N_o}\right)$ of QPSK demodulators at an $E_b/N_o$ of 1 dB.

Due to the simple implementation, demodulation losses increase at higher SNRs.
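
For reference when comparing simulation results, the textbook bit error rate of coherent Gray-coded QPSK (and BPSK) in additive white Gaussian noise is $\frac{1}{2}\,\text{erfc}\left(\sqrt{E_b/N_o}\right)$ with $E_b/N_o$ expressed as a linear ratio. A small helper to evaluate it is sketched below; the function name is hypothetical.

```python
import math


def qpsk_theoretical_ber(ebno_db: float) -> float:
    """Theoretical BER of coherent QPSK in AWGN for Eb/No given in dB."""
    ebno_linear = 10.0 ** (ebno_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebno_linear))


if __name__ == "__main__":
    for ebno_db in (0.0, 1.0, 2.0, 4.0, 6.0):
        print(f"Eb/No = {ebno_db:4.1f} dB -> BER = {qpsk_theoretical_ber(ebno_db):.3e}")
```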



**BER** performance

The demodulator tracking threshold is better than -2 dB $E_b/N_o$.

### Acquisition

Typical acquisition time is 2000 symbols.

### Phase Map (QPSK)

The nominal phase map follows Gray encoding as illustrated below:



As with all QPSK demodulators, there is a phase ambiguity of n\*90 deg in the demodulated output. The phase ambiguity is not resolved in this module. It is typically resolved either through the use of a unique word periodically inserted in the data stream (for example when using an FEC block code) or through bit error rate detection in a Viterbi decoder.

### Differential Decoding (QPSK)

In low data rate applications where the phase noise can affect the bit error rate performances, it can be advisable to use differential QPSK. The phase difference between two successive symbols conveys the information symbol:

- 0 deg = "00"
- 90 deg = "01"
- 180 deg = "10"
- 270 deg = "11"

This implementation is not strictly that of a DPSK demodulator in the sense that the receiver still tracks the carrier phase and frequency using a Costas loop.
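
The differential mapping given in this section can be sketched as follows. Symbols are represented by their quadrant index (phase divided by 90 deg); the function name and the convention that the phase difference is "current minus previous" are assumptions. The example also shows why a constant n\*90 deg phase ambiguity does not affect the decoded bits.

```python
# Phase difference in multiples of 90 deg -> information bits (from the text).
PHASE_STEP_TO_BITS = {0: "00", 1: "01", 2: "10", 3: "11"}


def differential_decode(quadrants):
    """Decode a list of quadrant indices (0..3) into bit pairs."""
    decoded = []
    for previous, current in zip(quadrants, quadrants[1:]):
        step = (current - previous) % 4
        decoded.append(PHASE_STEP_TO_BITS[step])
    return decoded


if __name__ == "__main__":
    received = [0, 1, 3, 3, 2]
    rotated = [(q + 1) % 4 for q in received]  # 90 deg phase ambiguity
    print(differential_decode(received))
    print(differential_decode(rotated))
```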

### Filter Responses

The channel filter rolloff can be selected among 20%, 25% and 40%.



Filter response, 25% rolloff



### Frequency Acquisition & Tracking

The demodulator comprises three frequency acquisition and tracking processes:

- a phase locked loop (PLL), also known as 'Costas Loop'.
- an Automatic Frequency Control (AFC) loop.
- an extended frequency acquisition circuit.

The AFC is to quickly detect and compensate for carrier *frequency* offsets, generally around the time of the initial acquisition. The PLL is to detect and compensate for carrier *phase* errors.

The PLL is a second order loop. It can track the center frequency over a range of  $\pm 1.5$  symbol rate. The digital implementation of the Costas PLL has a small frequency acquisition range of about  $\pm 1\%$  of the modulation symbol rate.

The main purpose of the AFC is to increase the frequency acquisition window to about  $\pm 10\%$  of the modulation symbol rate (typical). Once the AFC 'zooms in', the AFC must be disabled by the user in order to minimize the implementation loss.

The extended frequency acquisition circuit extends the frequency acquisition range to about  $\pm 50\%$  of the modulation symbol rate (typical). The algorithm relies on the spectrum symmetry: it is thus important to ensure bit randomness at the transmitter for a symmetrical spectrum. This loop is significantly slower than the AFC.

If the unknown received carrier frequency uncertainty is larger, the user must program some search algorithm using the nominal center frequency control.
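
The acquisition ranges quoted above suggest a simple rule for choosing an acquisition aid from the expected carrier frequency uncertainty. The sketch below restates those typical figures; the function name and the returned labels are hypothetical, and actual ranges depend on operating conditions.

```python
def acquisition_aid(freq_uncertainty_hz: float, symbol_rate_hz: float) -> str:
    """Pick an acquisition aid from the typical ranges quoted in the text."""
    ratio = abs(freq_uncertainty_hz) / symbol_rate_hz
    if ratio <= 0.01:
        return "PLL"                    # Costas loop alone, about +/-1%
    if ratio <= 0.10:
        return "AFC"                    # about +/-10%, disable once locked
    if ratio <= 0.50:
        return "extended acquisition"   # about +/-50%, slower than the AFC
    return "external search"            # user-programmed frequency search


if __name__ == "__main__":
    for offset_hz in (500.0, 50e3, 300e3, 800e3):
        print(offset_hz, acquisition_aid(offset_hz, 1e6))
```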

For high data rates (> 100 kbps), carrier phase noise is generally negligible. For lower data rates, it may be necessary to adjust the carrier tracking loop gain as a tradeoff between carrier phase noise (originating at the modulator, up-converter, down-converter, etc.) and thermal noise. To this effect, the user is given control of the loop gain over a range of x1, x2, x4 and x8.

The higher loop gain can also be used temporarily during acquisition to increase the frequency acquisition window from approximately 1% to 3% of the symbol rate. However, use of the AFC is preferred because of the faster acquisition time and larger acquisition range.

In some conditions, such as no input signal, the AFC and PLL loops can drift out and inhibit (re-)acquisition. It is possible for the user to reset the accumulators within the AFC and PLL using the LOOPS\_RESET control signal.

### **ComBlock Ordering Information**

COM-1001SOFT BPSK/QPSK/OQPSK demodulator, VHDL/IP core

ECCN: 5A001.b.3

MSS • 845 Quince Orchard Boulevard Ste N• Gaithersburg, Maryland 20878-1676 • U.S.A. Telephone: (240) 631-1111 Facsimile: (240) 631-1676 E-mail: sales@comblock.com

### Demod.vhd block diagram

