================================================= Create Placed and Routed DCP to Cross SLR ================================================= What You'll Need to Get Started: - RapidWright 2023.1 or later - Vivado 2018.2 or later One of the example programs that is provided with RapidWright solves a challenging problem on UltraScale+ devices (this approach is not valid for Series 7 or UltraScale parts). Crossing super logic region (SLR) boundaries at high speed can prove quite difficult in conventional Vivado flows. The hardware provides dedicated TX/RX flip flops in Laguna sites to enable the creation of paths with very short delay but experience two significant problems: 1. The dedicated super long lines (SLLs) that connect TX and RX Laguna flip flop pairs are often sensitive to hold time violations due to the higher multi-die variability. 2. Paths crossing the SLR boundary are taxed with an additional delay penality called "Inter-SLR Compensation" (ISC). This penalty increases the calculated delay and reduces it potential for high speed. .. figure:: images/ISC.png :width: 350px :align: center Example Vivado tooltip window describing the Inter-SLR Compensation delay penalty In RapidWright, we have created a parametrized, stand-alone application that can automatically generate a placed and routed DCP from scratch that implements a circuit that eliminates and minimizes the two challenges mentioned above. First, it creates a netlist with pairs of flops that are connected and placed and routed across SLR crossings using the dedicated Laguna TX/RX flip flop sites. Next, it custom routes the clock (the circuit has its own BUFGCE) such that it can individually tune the leaf clock buffers (LCBs) for each direction on each side of the SLR. By using the LCBs, the hold time in the first challenge mentioned above is eliminated. To minimize the ISC penalty, a clock root is generated for each clock region (CR) that contains an SLR crossing. Steps to Run ================ 1. Ensure you have RapidWright correctly setup and/or installed. See the :ref:`Getting Started` page for details. 2. Run the command below to print available options to parameterize the SLR crossing output .. code-block:: bash rapidwright SLRCrosserGenerator -h Example output below: :: ============================================================================== == SLR Crossing DCP Generator == ============================================================================== This RapidWright program creates a placed and routed DCP that can be imported into UltraScale+ designs to aid in high speed SLR crossings. See RapidWright documentation for more information. Option Description ------ ----------- -?, -h Print Help -a [String: Clk input net name] (default: clk_in) -b [String: Clock BUFGCE site name] (default: BUFGCE_X0Y218) -c [String: Clk net name] (default: clk) -d [String: Design Name] (default: slr_crosser) -i [String: Input bus name prefix] (default: input) -l [String: Comma separated list of (default: LAGUNA_X2Y120) Laguna sites for each SLR crossing] -n [String: North bus name suffix] (default: _north) -o [String: Output DCP File Name] (default: slr_crosser.dcp) -p [String: UltraScale+ Part Name] (default: xcvu9p-flgc2104-2-i) -q [String: Output bus name prefix] (default: output) -r [String: INT clk Laguna RX flops] (default: GCLK_B_0_1) -s [String: South bus name suffix] (default: _south) -t [String: INT clk Laguna TX flops] (default: GCLK_B_0_0) -u [String: Clk output net name] (default: clk_out) -v [Boolean: Print verbose output] (default: true) -w [Integer: SLR crossing bus width] (default: 512) -x [Double: Clk period constraint (ns)] (default: 1.538) -y [String: BUFGCE cell instance name] (default: BUFGCE_inst) -z [Boolean: Use common centroid] (default: false) 3. A default scenario of a single bi-directional crossing of 512 bits is generated at the LAGUNA_X2Y120 site on a VU9P part if no options are provided. The DCP is generated in the current working directory with the name ``slr_crosser.dcp`` unless the ``-o`` option is specified. .. code-block:: bash rapidwright SLRCrosserGenerator :: ============================================================================== == SLRCrosserGenerator == ============================================================================== Init: 4.787s Create Netlist: 0.123s Place SLR Crossings: 0.121s Custom Clock Route: 3.756s Route VCC/GND: 0.079s Write EDIF: 0.148s Writing XDEF Header: 0.090s Writing XDEF Placement: 0.213s Writing XDEF Routing: 0.404s Writing XDEF Finalizing: 0.079s Writing XDC: 0.039s ------------------------------------------------------------------------------ [No GC] *Total*: 9.839s Wrote final DCP: /home/user/slr_crosser.dcp 4. Open the DCP using Vivado to view the design. It should look similar to the annotated screenshot below: .. figure:: images/SLR_Crossing.png :width: 550px :align: center Vivado Screenshot with bubble annotations of a single, bi-direction 512-bit SLR crossing circuit. 5. You can also unzip the DCP (treating it like an ordinary ZIP file) and inside you'll find Verilog and VHDL stubs that can be imported into RTL designs for black box inclusion. Example output below: :: $ unzip slr_crosser.dcp Archive: slr_crosser.dcp inflating: slr_crosser.edf inflating: slr_crosser.xdef inflating: slr_crosser_late.xdc inflating: slr_crosser_stub.v inflating: slr_crosser_stub.vhdl inflating: dcp.xml $ cat slr_crosser_stub.v // This file was generated by RapidWright 2018.2.0. // This empty module with port declaration file causes synthesis tools to infer a black box for IP. // Please paste the declaration into a Verilog source file or add the file as an additional source. module slr_crosser(clk_in, clk_out, input0_north, input0_south, output0_north, output0_south); input clk_in; output clk_out; input [511:0]input0_north; input [511:0]input0_south; output [511:0]output0_north; output [511:0]output0_south; endmodule $ Optionally, you can open the DCP in Vivado and write out the netlist as EDIF, Verilog or VHDL to be packaged as an IP. The DCP can then be dropped into the IP cache later. 6. As one additional example, the generator is capable of using every SLL in the device. To generate such a DCP for a VU9P device, run: :: rapidwright SLRCrosserGenerator -w 720 -l LAGUNA_X0Y120,LAGUNA_X2Y120,LAGUNA_X4Y120,LAGUNA_X6Y120,LAGUNA_X8Y120,LAGUNA_X10Y120,LAGUNA_X12Y120,LAGUNA_X14Y120,LAGUNA_X16Y120,LAGUNA_X18Y120,LAGUNA_X20Y120,LAGUNA_X22Y120,LAGUNA_X0Y360,LAGUNA_X2Y360,LAGUNA_X4Y360,LAGUNA_X6Y360,LAGUNA_X8Y360,LAGUNA_X10Y360,LAGUNA_X12Y360,LAGUNA_X14Y360,LAGUNA_X16Y360,LAGUNA_X18Y360,LAGUNA_X20Y360,LAGUNA_X22Y360 The resultant DCP should look similar to the following in Vivado: .. figure:: images/Full_SLR_Crossing.png :width: 550px :align: center Vivado Screenshot of all SLLs being used at potentially a 760MHz for a speed grade 2 device.