RapidWright Report Timing Example

Reports the critical path within an example design (e.g. “microblaze4.dcp”).

Background

Please see our FPT‘19 paper, "An Open-source Lightweight Timing Model for RapidWright" (Presentation) for background details on our RapidWright Timing Model.

Steps to Run

  1. Download the example microblaze4.dcp design:
wget http://www.rapidwright.io/docs/_downloads/microblaze4.dcp
  1. If you need to recompile the code, run:
javac src/com/xilinx/rapidwright/examples/ReportTimingExample.java

in the RapidWright directory. Alternatively, build this example using your IDE.

  1. After compiling, run:
java com.xilinx.rapidwright.examples.ReportTimingExample microblaze4.dcp

The source code for ReportTimingExample.java is provided below for easy reference:

public static void main(String[] args) {
    if(args.length != 1) {
        System.out.println("USAGE: <dcp_file_name>");
        return;
    }
    CodePerfTracker t = new CodePerfTracker("Report Timing Example");
    t.useGCToTrackMemory(true);

    // Read in an example placed and routed DCP
    t.start("Read DCP");
    Design design = Design.readCheckpoint(args[0], CodePerfTracker.SILENT);

    // Instantiate and populate the timing manager for the design
    t.stop().start("Create TimingManager");
    TimingManager tim = new TimingManager(design);

    // Get and print out worst data path delay in design
    t.stop().start("Get Max Delay");
    GraphPath<TimingVertex, TimingEdge> criticalPath = tim.getTimingGraph().getMaxDelayPath();

    // Print runtime summary
    t.stop().printSummary();
    System.out.println("\nCritical path: "+ ((int)criticalPath.getWeight())+ " ps");
    System.out.println("\nPath details:");
    System.out.println(criticalPath.toString().replace(",", ",\n")+"\n");
}

Example Output

Please refer to the timing library Javadoc and code for more implementation details. The Java source code for the timing library is located in: RapidWright/src/com/xilinx/rapidwright/timing/.

Example output using the microblaze4.dcp design is included below:

==============================================================================
==                          Report Timing Example                           ==
==============================================================================
                Read DCP:     2.533s    400.912MBs
    Create TimingManager:     1.294s     19.852MBs
           Get Max Delay:     0.000s      0.000MBs
------------------------------------------------------------------------------
                 *Total*:     3.827s    420.764MBs

Critical path: 1876 ps

Path details:
[superSource->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Op2_reg[31]/Q,
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Op2_reg[31]/Q->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[31].ALU_Bit_I1/Not_Last_Bit.I_ALU_LUT_V5/Using_FPGA.Native/LUT6/I0,
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[31].ALU_Bit_I1/Not_Last_Bit.I_ALU_LUT_V5/Using_FPGA.Native/LUT6/I0->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[31].ALU_Bit_I1/Not_Last_Bit.I_ALU_LUT_V5/Using_FPGA.Native/LUT6/O,
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[31].ALU_Bit_I1/Not_Last_Bit.I_ALU_LUT_V5/Using_FPGA.Native/LUT6/O->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Use_Carry_Decoding.CarryIn_MUXCY/Using_FPGA.Native_CARRY4_CARRY8/S[1],
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Use_Carry_Decoding.CarryIn_MUXCY/Using_FPGA.Native_CARRY4_CARRY8/S[1]->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Use_Carry_Decoding.CarryIn_MUXCY/Using_FPGA.Native_CARRY4_CARRY8/CO[7],
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Use_Carry_Decoding.CarryIn_MUXCY/Using_FPGA.Native_CARRY4_CARRY8/CO[7]->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[24].ALU_Bit_I1/Not_Last_Bit.MUXCY_XOR_I/Using_FPGA.Native_I1_CARRY4_CARRY8/CI,
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[24].ALU_Bit_I1/Not_Last_Bit.MUXCY_XOR_I/Using_FPGA.Native_I1_CARRY4_CARRY8/CI->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[24].ALU_Bit_I1/Not_Last_Bit.MUXCY_XOR_I/Using_FPGA.Native_I1_CARRY4_CARRY8/O[2],
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/ALU_I/Using_FPGA.ALL_Bits[24].ALU_Bit_I1/Not_Last_Bit.MUXCY_XOR_I/Using_FPGA.Native_I1_CARRY4_CARRY8/O[2]->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Using_FPGA.Native_i_1__73/I0,
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Using_FPGA.Native_i_1__73/I0->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Using_FPGA.Native_i_1__73/O,
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Using_FPGA.Native_i_1__73/O->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/PreFetch_Buffer_I1/Instruction_Prefetch_Mux[33].Gen_Instr_DFF/Using_FPGA.Native_i_1__33/I0,
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/PreFetch_Buffer_I1/Instruction_Prefetch_Mux[33].Gen_Instr_DFF/Using_FPGA.Native_i_1__33/I0->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/PreFetch_Buffer_I1/Instruction_Prefetch_Mux[33].Gen_Instr_DFF/Using_FPGA.Native_i_1__33/O,
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/PreFetch_Buffer_I1/Instruction_Prefetch_Mux[33].Gen_Instr_DFF/Using_FPGA.Native_i_1__33/O->microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Branch_CMP_Op1_reg[22]/D,
 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Branch_CMP_Op1_reg[22]/D->superSink]

The GraphPath<TimingVertex, TimingEdge> object contains a path/set of edges. From each TimingEdge you can access the net delay and/or the logic delay values for more detail.

This example was designed to illustrate the default way to use the timing library to report the critical path and its associated data path delay.

Compare with Vivado

To compare the output of the RapidWright timing model to Vivado, run the following at the prompt:

vivado -mode tcl microblaze4.dcp

and then run the following command at the Tcl prompt:

report_timing

In Vivado 2018.3, the timing report shows a data path delay of 1.846 ns (1846 ps). Which has an error of 30 ps or ~1.6%. The full Vivado timing report is shown below:

-----------------------------------------------------------------------------------------
| Tool Version      : Vivado v.2018.3 (lin64) Build 2405991 Thu Dec  6 23:36:41 MST 2018
| Date              : Tue Dec 10 18:33:04 2019
| Host              : xsjclavin40x running 64-bit CentOS Linux release 7.4.1708 (Core)
| Command           : report_timing
| Design            : design_1
| Device            : xcvu3p-ffvc1517
| Speed File        : -2  PRODUCTION 1.23 10-29-2018
| Temperature Grade : E
-----------------------------------------------------------------------------------------

Timing Report

Slack (MET) :             0.058ns  (required time - arrival time)
  Source:                 microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Branch_CMP_Op1_reg[8]/C
                            (rising edge-triggered cell FDRE clocked by TS_clk  {rise@0.000ns fall@1.000ns period=2.000ns})
  Destination:            microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count_reg[0]/CE
                            (rising edge-triggered cell FDRE clocked by TS_clk  {rise@0.000ns fall@1.000ns period=2.000ns})
  Path Group:             TS_clk
  Path Type:              Setup (Max at Slow Process Corner)
  Requirement:            2.000ns  (TS_clk rise@2.000ns - TS_clk rise@0.000ns)
  Data Path Delay:        1.846ns  (logic 0.730ns (39.545%)  route 1.116ns (60.455%))
  Logic Levels:           7  (CARRY8=4 LUT2=1 LUT4=1 LUT6=1)
  Clock Uncertainty:      0.035ns  ((TSJ^2 + TIJ^2)^1/2 + DJ) / 2 + PE
    Total System Jitter     (TSJ):    0.071ns
    Total Input Jitter      (TIJ):    0.000ns
    Discrete Jitter          (DJ):    0.000ns
    Phase Error              (PE):    0.000ns

    Location             Delay type                Incr(ns)  Path(ns)    Netlist Resource(s)
  -------------------------------------------------------------------    -------------------
                         (clock TS_clk rise edge)     0.000     0.000 r
                                                      0.000     0.000 r  Clk_0 (IN)
                         net (fo=835, unset)          0.000     0.000    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/Clk
    SLICE_X75Y108        FDRE                                         r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Branch_CMP_Op1_reg[8]/C
  -------------------------------------------------------------------    -------------------
    SLICE_X75Y108        FDRE (Prop_DFF2_SLICEL_C_Q)
                                                      0.081     0.081 f  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Operand_Select_I/EX_Branch_CMP_Op1_reg[8]/Q
                         net (fo=1, routed)           0.300     0.381    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Zero_Detect_I/Using_FPGA.Native_0[21]
    SLICE_X75Y108        LUT6 (Prop_C6LUT_SLICEL_I2_O)
                                                      0.088     0.469 r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Zero_Detect_I/S0_inferred__3/i_/O
                         net (fo=1, routed)           0.010     0.479    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Zero_Detect_I/Part_Of_Zero_Carry_Start/lopt_5
    SLICE_X75Y108        CARRY8 (Prop_CARRY8_SLICEL_S[2]_CO[7])
                                                      0.155     0.634 f  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Data_Flow_I/Zero_Detect_I/Part_Of_Zero_Carry_Start/Using_FPGA.Native_CARRY4_CARRY8/CO[7]
                         net (fo=1, routed)           0.026     0.660    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/jump_logic_I1/MUXCY_JUMP_CARRY2/jump_carry1
    SLICE_X75Y109        CARRY8 (Prop_CARRY8_SLICEL_CI_CO[1])
                                                      0.042     0.702 f  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/jump_logic_I1/MUXCY_JUMP_CARRY2/Using_FPGA.Native_CARRY4_CARRY8/CO[1]
                         net (fo=1, routed)           0.117     0.819    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/jump_logic_I1/MUXCY_JUMP_CARRY3/ex_jump_wanted
    SLICE_X74Y109        LUT4 (Prop_D6LUT_SLICEM_I0_O)
                                                      0.051     0.870 r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/jump_logic_I1/MUXCY_JUMP_CARRY3/Using_FPGA.Native_i_1__100/O
                         net (fo=1, routed)           0.025     0.895    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/mem_wait_on_ready_N_carry_or/MUXCY_I/lopt_9
    SLICE_X74Y109        CARRY8 (Prop_CARRY8_SLICEM_S[3]_CO[7])
                                                      0.163     1.058 r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/mem_wait_on_ready_N_carry_or/MUXCY_I/Using_FPGA.Native_CARRY4_CARRY8/CO[7]
                         net (fo=1, routed)           0.026     1.084    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Use_MuxCy[7].OF_Piperun_Stage/MUXCY_I/of_PipeRun_carry_5
    SLICE_X74Y110        CARRY8 (Prop_CARRY8_SLICEM_CI_CO[4])
                                                      0.099     1.183 r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Decode_I/Use_MuxCy[7].OF_Piperun_Stage/MUXCY_I/Using_FPGA.Native_CARRY4_CARRY8/CO[4]
                         net (fo=324, routed)         0.472     1.655    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/of_piperun_for_ce
    SLICE_X80Y99         LUT2 (Prop_F6LUT_SLICEM_I1_O)
                                                      0.051     1.706 r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count[0]_i_1/O
                         net (fo=2, routed)           0.140     1.846    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count[0]_i_1_n_0
    SLICE_X80Y99         FDRE                                         r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count_reg[0]/CE
  -------------------------------------------------------------------    -------------------

                         (clock TS_clk rise edge)     2.000     2.000 r
                                                      0.000     2.000 r  Clk_0 (IN)
                         net (fo=835, unset)          0.000     2.000    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/Clk
    SLICE_X80Y99         FDRE                                         r  microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count_reg[0]/C
                         clock pessimism              0.000     2.000
                         clock uncertainty           -0.035     1.965
    SLICE_X80Y99         FDRE (Setup_HFF2_SLICEM_C_CE)
                                                     -0.061     1.904    microblaze_0/U0/MicroBlaze_Core_I/Performance.Core/Use_Debug_Logic.Master_Core.Debug_Perf/single_step_count_reg[0]
  -------------------------------------------------------------------
                         required time                          1.904
                         arrival time                          -1.846
  -------------------------------------------------------------------
                         slack                                  0.058