Abstract
In this paper we detail the system-level (viz. silicon-package-PCB) electrical co-design of a 130 nm BiCMOS high-speed (25 Gbps) 4-channel multi-rate retimer with integrated advanced signal conditioning circuitry, packaged in a small 6 mm × 6 mm FC BGA package. Electrical optimization of the silicon-package-PCB system over the high-speed channels was achieved through a coupled circuit-to-electromagnetic co-design modeling and simulation methodology. Key figures of merit for system electrical performance (viz. insertion loss, return loss, crosstalk/isolation, jitter, and power supply inductance and resistance parasitics, among others) are modeled and analyzed. Laboratory measurements on a retimer are presented that validate the modeling methodology, and good correlation between simulation and measurement is achieved.
I. Introduction
Multi-gigahertz serial links suffer from signal attenuation and distortion due to intrinsic channel/interconnect transmission line impairments. Signal characteristics, impedance discontinuities, propagation delay, skin effect, dielectric loss, inter-symbol interference (ISI), and reflections are among the issues that impact the link performance of the transmission channel [1–2]. To overcome these challenges, signal conditioning techniques are employed through the integration of dedicated circuitry (e.g. repeaters, viz. retimers or redrivers). A repeater is any active component that acts on a signal to increase the physical length over which the signal can be transmitted successfully [3–4]. The category of repeaters includes both retimers and redrivers. The retimer performs three primary functions: it reshapes, re-amplifies, and re-times the transmitted data to overcome channel impairments and circuit bandwidth limitations [5–6]. A retimer differs from a redriver in that it includes clock and data recovery (CDR), which enables the option of including advanced equalization circuitry (e.g. a feed-forward equalizer (FFE) and a decision feedback equalizer (DFE), among others). Figure 1 shows a block diagram of the typical components of a retimer implementation. A retimer consists of a complete receiver and a driver, which are synchronized by clocks either recovered from the data stream by a CDR or derived from a reference clock. For optimal performance, retimer placement should be fully evaluated through system-level co-design modeling and analysis of the link to assess signal integrity, timing jitter, and power integrity implications.
This work focuses on the electrical optimization of a 4-channel retimer to achieve 25 Gbps+ data rate performance. Section II gives an overview of the retimer device. Section III describes the package and PCB design. The system-level (die + package + PCB) electrical co-design modeling and analysis methodology is detailed in Section IV. Measurements on the system are presented in Section V. Findings and observations are covered in Section VI, and Section VII concludes the paper.
II. Retimer Description
The device is a four-channel multi-rate retimer with integrated signal conditioning (Figure 2.0). It is used to extend the reach and robustness of long, lossy, crosstalk-impaired high-speed serial links while achieving a low bit error rate (BER). The device data path consists of several key blocks, as shown in Figure 2.0: the continuous-time linear equalizer (CTLE), variable gain amplifier (VGA), decision feedback equalizer (DFE), clock and data recovery (CDR), and the differential driver with finite impulse response (FIR) filter. For a complete description of the functionality of each circuit, please refer to the device datasheet [7].
Each channel includes a CTLE and a DFE, which together compensate for the dispersive transmission channel between the source transmitter and the retimer receiver (Figure 2.0). The CTLE is fully adaptive: it adapts according to a figure of merit (FOM) calculation during the lock acquisition process. The FOM calculation is based on the horizontal eye opening (HEO) and vertical eye opening (VEO). The CTLE consists of 4 stages, each with 2-bit boost control, which allows for 256 different boost combinations. The CTLE adaptation algorithm adapts through 16 of these boost combinations. The boost level can be set between 8 dB and 25 dB at 12.89 GHz. The VGA assists in the recovery of extremely small signals, working in conjunction with the CTLE to equalize and scale the signal amplitude.
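The count of 256 boost combinations follows directly from 4 stages with 2 control bits each (4 codes per stage, 4^4 = 256). The short sketch below enumerates them. The packing of the per-stage codes into a single 8-bit word (function `pack_boost`, stage 0 in the low bits) is a hypothetical choice for illustration and is not taken from the device datasheet.

```python
from itertools import product

STAGES = 4          # number of CTLE stages (from the text)
BITS_PER_STAGE = 2  # boost-control bits per stage (from the text)
CODES_PER_STAGE = 2 ** BITS_PER_STAGE


def pack_boost(settings):
    """Pack per-stage boost codes into one integer, stage 0 in the low bits.

    The bit ordering is an assumption made for this example only.
    """
    word = 0
    for stage, code in enumerate(settings):
        if not 0 <= code < CODES_PER_STAGE:
            raise ValueError(f"stage {stage}: code {code} out of range")
        word |= code << (stage * BITS_PER_STAGE)
    return word


# Every combination of per-stage codes: 4 codes per stage over 4 stages.
all_settings = list(product(range(CODES_PER_STAGE), repeat=STAGES))
words = {pack_boost(s) for s in all_settings}

print(len(all_settings))  # 256
```

Each combination maps to a distinct 8-bit value, so the full boost space fits in a single byte; the adaptation algorithm described above visits only 16 of these points.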
To reduce the effects of crosstalk, reflections, and post-cursor inter-symbol interference (ISI), a 5-tap DFE is employed within the data path of each channel. The DFE must be manually enabled, regardless of the selected adapt mode. Once the DFE has been enabled, it can be configured to adapt only during lock acquisition or to adapt continuously. For many applications with lower insertion loss (i.e. < 30 dB), lower crosstalk, and/or lower reflections, part or all of the DFE can be disabled to reduce power consumption.
Each channel includes an independent voltage-controlled oscillator (VCO) and phase-locked loop (PLL), which produce a clean clock that is frequency-locked to the clock embedded in the input data stream. The PLL attenuates the high-frequency jitter on the incoming data, so the recovered clock has substantially reduced jitter. This clean clock is used to re-time the incoming data, removing high-frequency jitter from the data stream and reproducing the data at the output with significantly reduced jitter. The recovered data is then passed to the FIR filter and differential driver together with the recovered clock, which has been cleaned of any high-frequency jitter outside the bandwidth of the CDR clock recovery loop.
The FIR filter pre-distorts the transmitted waveform to compensate for frequency-dependent loss in the output channel. The output differential voltage (VOD) and the pre-cursor and post-cursor equalization of the driver are controlled by manipulating the FIR tap settings. The most common way of pre-distorting the signal is to accentuate the transitions and de-emphasize the non-transitions: the bit before a transition is accentuated via the pre-cursor tap, and the bit after the transition is accentuated via the post-cursor tap. The three-tap FIR filter allows pre- and post-cursor equalization to compensate for a wide variety of output channel media. The main cursor tap is the primary knob for amplitude adjustment; the pre- and post-cursor tap settings can then be adjusted to provide equalization. By compensating for dispersion in the transmission channel at the output of the device, the output FIR enables reach extension over lossy interconnect, minimizing signal impairment and improving jitter across multiple application areas.
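The effect of the pre- and post-cursor taps can be illustrated with a minimal symbol-rate sketch of a three-tap FIR, y[n] = c(-1)·x[n+1] + c(0)·x[n] + c(+1)·x[n-1], applied to NRZ symbols. The tap weights and bit pattern below are hypothetical values chosen for illustration; they are not device settings, and the function `fir_3tap` is not part of any device API.

```python
def fir_3tap(symbols, pre, main, post):
    """Apply y[n] = pre*x[n+1] + main*x[n] + post*x[n-1] to NRZ symbols (+1/-1).

    At the edges the first/last symbol is repeated so the output keeps
    the same length as the input.
    """
    n = len(symbols)
    out = []
    for i in range(n):
        nxt = symbols[min(i + 1, n - 1)]  # next symbol (pre-cursor term)
        prv = symbols[max(i - 1, 0)]      # previous symbol (post-cursor term)
        out.append(pre * nxt + main * symbols[i] + post * prv)
    return out


# Hypothetical bit pattern and tap weights (|pre| + |main| + |post| = 1).
bits = [0, 0, 0, 1, 1, 1, 0, 1, 0, 0]
symbols = [1 if b else -1 for b in bits]
PRE, MAIN, POST = -0.1, 0.7, -0.2

tx = fir_3tap(symbols, PRE, MAIN, POST)
for b, y in zip(bits, tx):
    print(b, round(y, 3))
```

With negative pre- and post-cursor weights, a bit inside a run of identical bits is transmitted at reduced amplitude, the bit just before a transition is boosted by the pre-cursor tap, and the bit just after a transition is boosted by the post-cursor tap, matching the behavior described above.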
Due to its rich equalization and jitter reduction features, the retimer has many applications and can be effectively deployed in a variety of different systems from backplanes to front ports to active cable assemblies. Figure 3.0 below shows a typical use of the retimer device.
III. Package and PCB Design Details
The device is packaged in a 6 mm × 6 mm, 101-pin FcCSP ABM multi-layer package. Figure 4.0 shows a top view of the package routing and the critical 4-channel high-speed data path (highlighted in yellow). Figure 5.0 shows a three-dimensional (3D) representation of the package design.
The high-speed channels, in both the package and the PCB, are designed and optimized to achieve a single-ended impedance of 50 Ω and a differential impedance of 100 Ω. Appropriate signal and power integrity routing guidelines were adopted during the design. Figure 6.0 shows a top view of the EVM PCB routing and Figure 7.0 shows a 3D view of the EVM. It is critical that high-frequency effects be captured accurately during extraction. To that end, the package physical design was merged with the PCB design, and the merged design was used for extraction (Figure 8.0). The objective was to accurately capture the electromagnetic interaction at the package-PCB interface [8].
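To give a feel for why the impedance targets matter, the sketch below uses the standard lumped mismatch relations, Γ = (Z − Z0)/(Z + Z0) and RL = −20·log10|Γ|, to estimate the return loss of a single discontinuity against the 100 Ω differential target. The sample impedances are hypothetical, and a single frequency-independent mismatch is a simplification; it does not replace the 3D extraction described in this paper.

```python
import math


def reflection_coefficient(z_load, z_ref):
    """Reflection coefficient of a load z_load seen from a line of impedance z_ref."""
    return (z_load - z_ref) / (z_load + z_ref)


def return_loss_db(z_load, z_ref):
    """Return loss in dB (positive number; larger means a better match)."""
    gamma = abs(reflection_coefficient(z_load, z_ref))
    if gamma == 0:
        return math.inf  # perfect match, no reflection
    return -20.0 * math.log10(gamma)


Z_DIFF_TARGET = 100.0  # ohms, differential target from the text

# Hypothetical differential impedances of a routed segment.
for z in (85.0, 90.0, 100.0, 110.0):
    rl = return_loss_db(z, Z_DIFF_TARGET)
    print(f"Zdiff = {z:5.1f} ohm -> return loss = {rl:.2f} dB")
```

Under this simplified model, a 10% deviation from the target already limits the return loss of that single discontinuity to roughly 25 dB, before any package or PCB transition effects are added.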
IV. System Co-Design Modeling Flow
To implement system-level co-design, an appropriate modeling methodology is required to assess the impact of parasitics on key performance figures of merit (FOMs). Electrical optimization of the silicon-package-PCB system over the high-speed channels was achieved through a validated coupled circuit-to-electromagnetic co-design modeling and simulation methodology, as described in the steps below. The methodology was previously validated with good correlation to silicon measurements on similar devices [9–12].
Step 1: Initial package and PCB physical designs are created, incorporating manufacturing and assembly rules and engineering/customer specifications.
Step 2: Accurate 3D quasi-static and 3D full-wave electromagnetic extraction (producing both time-domain and frequency-domain models) is performed on the critical signals and power/ground domains.
Step 3: A coupled circuit-to-electromagnetic system-level transient analysis is performed. The extracted S-parameter models are first converted using advanced broadband macromodeling algorithms that enforce causality and passivity.
Step 4: Design of experiments (DOE) and parametric sweeps are performed to assess key FOM performance against requirements. The flow is repeated until the requirements are met.
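As an illustration of the passivity requirement mentioned in Step 3, the sketch below checks a 2-port S-parameter model for passivity: at every frequency, the largest singular value of the S-matrix must not exceed 1. This is only the check, not the macromodeling or passivity-enforcement algorithm used in this work, and the sample S-matrices are hypothetical values, not extracted data.

```python
import math


def max_singular_value_2x2(s):
    """Largest singular value of a 2x2 complex matrix [[s11, s12], [s21, s22]]."""
    (s11, s12), (s21, s22) = s
    # Entries of the Hermitian matrix A = S^H S = [[a, b], [conj(b), d]].
    a = abs(s11) ** 2 + abs(s21) ** 2
    d = abs(s12) ** 2 + abs(s22) ** 2
    b = s11.conjugate() * s12 + s21.conjugate() * s22
    # Largest eigenvalue of A; its square root is the largest singular value.
    lam_max = 0.5 * (a + d) + math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    return math.sqrt(lam_max)


def passivity_violations(freqs_hz, s_matrices, tol=1e-9):
    """Return (frequency, max singular value) pairs where the model is non-passive."""
    violations = []
    for f, s in zip(freqs_hz, s_matrices):
        sv = max_singular_value_2x2(s)
        if sv > 1.0 + tol:
            violations.append((f, sv))
    return violations


# Hypothetical 2-port samples: the second one has |S21| > 1 and is non-passive.
freqs = [1.0e9, 12.89e9]
models = [
    [[0.10 + 0.05j, 0.80 - 0.30j], [0.80 - 0.30j, 0.10 - 0.02j]],
    [[0.20 + 0.00j, 1.02 + 0.00j], [1.02 + 0.00j, 0.20 + 0.00j]],
]

for f, sv in passivity_violations(freqs, models):
    print(f"non-passive at {f / 1e9:.2f} GHz: max singular value = {sv:.3f}")
```

A model that fails this check can generate energy in a transient simulation and cause it to diverge, which is why the S-parameter models are conditioned before the system-level transient analysis.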
V. System Measurements
To validate the system performance predicted by the co-design modeling and analysis methodology detailed in Section IV, selected performance tests were carried out to assess the key FOMs. We present here the laboratory characterization of the system return loss and total output jitter of the device. The characterization was performed across a range of frequencies at nominal and corner voltages and temperatures. The primary equipment used was the Agilent PNA Series Network Analyzer & S-Parameter Test Set. Figure 10 shows the measurement test set-up.
Figure 11 shows sample TX output and RX input return loss results at nominal voltage and temperature. The data displayed are from one randomly selected sample. All data include the loss of the EVM board (~4 dB at 12.9 GHz). The SCC22 BJ(93-4) and SDD22 BJ(93-3) masks and the SCD11/SDC11 BJ(93-5) and SDD11 BJ(93-3) masks are also shown.
The output jitter performance of the device was characterized by assessing the parametric quality of the eye diagram. Figure 12 shows the test set-up for measuring output jitter. The equipment used was the Keysight DCA-X Mainframe with the 86108B Precision Waveform Analyzer. Measurements were made over voltage and temperature. Figures 13 and 14 show the TX output jitter at 25.78125 Gbps and a sample eye diagram, respectively.
VI. Findings and Observations
As discussed here, multi-gigahertz serial links suffer from signal attenuation and distortion due to their intrinsic channel/interconnect transmission line impairments. Owing to its rich equalization schemes and jitter reduction features, the retimer is able to overcome these impairments in order to achieve high channel performance. While the adoption and implementation of these signal conditioning techniques help to ensure a compliant device, the high-frequency effects and electromagnetic interactions at the die-to-package and package-to-PCB junctions can impact performance drastically. The system-level co-design modeling methodology developed here helped to assess these challenges early in the design process. The methodology allowed for multiple iterations of the system design until the desired performance was achieved. Good predictive performance was achieved for both return loss and output jitter. Figure 15 shows the correlation between the simulated and measured return loss for two RX channels. As shown, fairly good correlation has been achieved overall. The simulated data are shown in green and purple, while the measured data are shown in blue and red, respectively. The return loss performance over PVT was also demonstrated to be in compliance with the standard (viz. the IEEE 802.3bj CR4/KR4 mask).
Correlation of simulation to measurement for the total output jitter can only be inferred, as the system-level, circuit-level transient simulation is quite complex to set up. Based on the measurements, all jitter components meet industry specifications. Minimal variation in output jitter over voltage and temperature was observed. Typical jitter performance for the system was 0.18 UI pk-pk total jitter (TJ) at a BER of 1E-12, 0.006 UI RMS random jitter (RJ), and 0.007 UI pk-pk duty cycle distortion (DCD). The low jitter performance was additionally enabled by minimizing the parasitic resistance and inductance of the power supply domains, as assessed by the modeling flow.
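To put the quoted jitter figures in absolute terms, the sketch below converts unit intervals (UI) to picoseconds at 25.78125 Gbps and, as an illustration only, splits the total jitter using the widely used dual-Dirac model, TJ(BER) = DJ(δδ) + 2·Q(BER)·RJ(rms). The dual-Dirac decomposition is an assumption of this example; the paper does not state how the measurement equipment separated the jitter components, so the resulting DJ value is indicative rather than measured.

```python
from statistics import NormalDist

BIT_RATE_BPS = 25.78125e9  # data rate (from the text)
TJ_UI = 0.18               # total jitter, pk-pk at BER 1E-12 (from the text)
RJ_RMS_UI = 0.006          # random jitter, RMS (from the text)
BER = 1e-12

# One unit interval in picoseconds.
ui_ps = 1e12 / BIT_RATE_BPS

# Q-scale factor: number of Gaussian sigmas whose one-sided tail equals BER.
q = -NormalDist().inv_cdf(BER)

# Dual-Dirac assumption: TJ = DJ_dd + 2 * Q * RJ_rms.
rj_pkpk_ui = 2.0 * q * RJ_RMS_UI
dj_dd_ui = TJ_UI - rj_pkpk_ui

print(f"UI            = {ui_ps:.2f} ps")
print(f"Q(BER)        = {q:.3f}")
print(f"TJ            = {TJ_UI * ui_ps:.2f} ps pk-pk")
print(f"RJ (rms)      = {RJ_RMS_UI * ui_ps:.3f} ps")
print(f"RJ (pk-pk)    = {rj_pkpk_ui:.4f} UI")
print(f"DJ (dual-Dirac, implied) = {dj_dd_ui:.4f} UI")
```

Under this assumption, roughly half of the 0.18 UI total jitter budget is attributable to random jitter extrapolated to a BER of 1E-12, with the remainder being deterministic.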
VII. Conclusion
In this paper we presented the system-level co-design modeling methodology developed to validate the performance of a high-speed multi-rate retimer. The rich equalization schemes and jitter reduction techniques adopted helped to overcome the signal integrity challenges of high-frequency lossy interconnects. Additionally, we presented considerations on die, package, and board co-design and co-simulation for the signal and power integrity of the system. Using characterization measurements performed on the EVM system, it was demonstrated that good correlation between measurement and simulation can be achieved for both return loss and output jitter. As demonstrated, the electromagnetic interactions at the interfaces can impact performance significantly. To that end, we have shown that the comprehensive co-design modeling and analysis methodology can be successfully applied to characterize the performance of the device early in the design phase. Additionally, the methodology provides the ability to perform parametric sweeps of the key performance figures of merit under manufacturing and assembly process variations.
Acknowledgment
The author would like to thank the High Speed system team for collecting and making available all the characterization data on the EVM system.