Implementation, in the context of RapidWright and compiling designs for FPGAs, is defined as the placement and routing of a synthesized/mapped netlist to a specific FPGA device. This section will describe the detailed mechanics of how placement and routing can be achieved in RapidWright.
Unlike Vivado, RapidWright supports three levels of placement in its design abstraction: BEL level, site level and module level. Vivado primarily supports only BEL placement (previously, in ISE, sites were the major unit of placement). This section details how RapidWright represents and interacts with design elements at each of these three levels.
Reliable automatic BEL placement in RapidWright is still a work in progress, and care should be taken when using this capability.
Creating correct BEL placements is quite tricky as several factors must be taken into consideration when placing a cell onto a BEL site. Some questions one might need to ask when placing a cell onto a BEL site are:
Is the BEL site already occupied and are all pins map-able to the surrounding BEL connections?
Are all of the cell connections routable within the site and interconnect?
Are the clock and set/reset domains compatible with those already used within the site or are there resources available to route alternatives?
Does this cell depend on any dedicated inter-site wires (such as carry chains or DSP cascades) that are not available?
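As a rough illustration of the first and third questions above (occupancy and clock/set-reset compatibility), the checks can be sketched on a toy model. Everything in this block is hypothetical: `SiteState`, `conflicts` and `place` are illustrative names and not RapidWright classes or methods, and the model assumes a site can use only one clock net and one set/reset net. Pin mapping, routability and dedicated inter-site wires are not modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these types are hypothetical and are not RapidWright classes.
class BelPlacementCheckSketch {

    /** Tracks which BELs are used and which control nets the site is committed to. */
    static final class SiteState {
        final Map<String, String> cellByBel = new HashMap<>(); // BEL name -> cell name
        String clockNet; // null while no clocked cell is placed
        String resetNet; // null while no set/reset net is used
    }

    /** Returns the reasons a placement would be rejected; an empty list means no conflict was found. */
    static List<String> conflicts(SiteState site, String bel, String clockNet, String resetNet) {
        List<String> reasons = new ArrayList<>();
        if (site.cellByBel.containsKey(bel)) {
            reasons.add("BEL " + bel + " is already occupied");
        }
        if (site.clockNet != null && clockNet != null && !site.clockNet.equals(clockNet)) {
            reasons.add("clock domain differs from the one already used in the site");
        }
        if (site.resetNet != null && resetNet != null && !site.resetNet.equals(resetNet)) {
            reasons.add("set/reset domain differs from the one already used in the site");
        }
        return reasons;
    }

    /** Records the placement if no conflict was found; returns whether it was applied. */
    static boolean place(SiteState site, String cell, String bel, String clockNet, String resetNet) {
        if (!conflicts(site, bel, clockNet, resetNet).isEmpty()) {
            return false;
        }
        site.cellByBel.put(bel, cell);
        if (clockNet != null) site.clockNet = clockNet;
        if (resetNet != null) site.resetNet = resetNet;
        return true;
    }

    public static void main(String[] args) {
        SiteState site = new SiteState();
        place(site, "ff0", "AFF", "clk", "rst");
        // A second cell on the same BEL, or with a different clock, is reported as a conflict.
        System.out.println(conflicts(site, "AFF", "clk2", "rst"));
    }
}
```

Returning the list of reasons rather than a bare boolean makes it easier to see which of the questions failed for a given candidate BEL.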
Placing a cell correctly can necessitate updates to the design in the following categories:
Mapping of a Cell object to a BEL.
Pin mappings between the logical and physical cell pins must be added and/or routed within the site (conditions will vary).
Use of one or more SitePIPs as part of routing the site (stored in the respective SiteInst).
Generic pin mappings are assigned when a cell is created and placed. However, these mappings may need to be adjusted based on the context.
A SitePIP configures a routing BEL to propagate a signal from one of its inputs to its output pin. SitePIPs must be turned on in the respective SiteInst when a cell is placed onto a BEL, as the common convention in Vivado is to always leave the site in a legally routed state.
Within RapidWright, it can be straightforward to move a SiteInst from one site to another. An example of how to relocate a site instance from one location to another is shown below:
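Design d = Design.readCheckpoint("example.dcp");
SiteInst si = d.getSiteInstFromSiteName("SLICE_X0Y0");
// Relocate the site instance; the destination site name is illustrative.
si.place(d.getDevice().getSite("SLICE_X1Y1"));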
The user is responsible for updating any existing routing that previously connected to the old site.
One of RapidWright’s unique capabilities is providing another level of hierarchy in implementation. Through the Module and ModuleInstance classes, a complex cell can be replicated and/or relocated across the device. When a pre-implemented module is created for a device, all valid locations are pre-calculated and stored for the anchor site within the Module. Therefore, placement of a ModuleInstance is simply a matter of selecting one of the valid anchor sites and applying it.
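The idea of pre-calculating valid anchor locations can be sketched on a toy device grid. This is only an illustration under stated assumptions: the grid of site-type strings, the `Offset` footprint entries and the `validAnchors` method are hypothetical and do not reflect how RapidWright stores or computes this information internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: the grid, the footprint, and all names here are hypothetical,
// not the RapidWright Module/ModuleInstance API.
class ModuleAnchorSketch {

    /** One site of the module footprint, expressed relative to the anchor site. */
    static final class Offset {
        final int dx;
        final int dy;
        final String siteType;

        Offset(int dx, int dy, String siteType) {
            this.dx = dx;
            this.dy = dy;
            this.siteType = siteType;
        }
    }

    /** Pre-computes every anchor location where all footprint sites land on a matching site type. */
    static List<String> validAnchors(String[][] grid, List<Offset> footprint) {
        List<String> anchors = new ArrayList<>();
        for (int y = 0; y < grid.length; y++) {
            for (int x = 0; x < grid[y].length; x++) {
                if (fits(grid, footprint, x, y)) {
                    anchors.add("X" + x + "Y" + y);
                }
            }
        }
        return anchors;
    }

    /** Checks that each footprint site is on the grid and has the required site type. */
    private static boolean fits(String[][] grid, List<Offset> footprint, int x, int y) {
        for (Offset o : footprint) {
            int sx = x + o.dx;
            int sy = y + o.dy;
            if (sy < 0 || sy >= grid.length || sx < 0 || sx >= grid[sy].length) {
                return false;
            }
            if (!grid[sy][sx].equals(o.siteType)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String[][] grid = {
            {"SLICEL", "SLICEM", "SLICEL", "SLICEM"},
            {"SLICEL", "SLICEM", "SLICEL", "SLICEM"},
        };
        List<Offset> footprint = Arrays.asList(
            new Offset(0, 0, "SLICEL"), new Offset(1, 0, "SLICEM"), new Offset(0, 1, "SLICEL"));
        // Placement then reduces to picking one entry from this list.
        System.out.println(validAnchors(grid, footprint));
    }
}
```

Because the expensive compatibility check is done once up front, placing or relocating an instance later only requires choosing an entry from the stored list.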
In Vivado, there are roughly three different types of routing: intra-site, inter-site and clock routing. This section provides a brief overview of each.
Site (Intra-site) Routing
When a cell is placed onto a BEL, typical Vivado convention is to route the intra-site net portions immediately after. Routing a site implies mapping the physical net to site wires and site PIPs. In RapidWright, some of this intra-site routing happens when the cell is placed and there are a few methods that can also help finish intra-site routing in special cases.
SiteInst.routeIntraSiteNet() will attempt to route one BELPin to another for intra-site nets.
SiteInst.routeSite() will attempt to route all the nets that pertain to the site.
Interconnect (Inter-site) Routing
The majority of work in routing a design is in inter-site routing. This is the task of selecting a set of routing resources that enable a path between a source site pin and one or more sink site pins. The physical routing of a net in RapidWright is simply described by a list of PIPs. RapidWright comes with a rudimentary router for UltraScale architectures, but it is still a work in progress. It doesn’t fully resolve congestion, but provides a working example for more specialized tasks.
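To make the "route as a list of PIPs" idea concrete, here is a minimal sketch that finds a path through a toy routing graph with a breadth-first search and reports it as an ordered list of hops. The graph representation (a map from a node name to the nodes its PIPs can drive) and the `route` method are assumptions made for illustration; they are not the RapidWright router or its data structures, and the sketch ignores congestion, cost and multiple sinks.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: node names and the adjacency map are hypothetical.
class PipPathSketch {

    /**
     * Breadth-first search from source to sink. Returns the hops as "from->to"
     * strings in source-to-sink order, or an empty list if the sink is
     * unreachable (or equal to the source).
     */
    static List<String> route(Map<String, List<String>> pips, String source, String sink) {
        Map<String, String> previous = new HashMap<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(source);
        visited.add(source);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (node.equals(sink)) {
                break;
            }
            for (String next : pips.getOrDefault(node, Collections.<String>emptyList())) {
                if (visited.add(next)) {
                    previous.put(next, node);
                    queue.add(next);
                }
            }
        }
        LinkedList<String> path = new LinkedList<>();
        if (!visited.contains(sink)) {
            return path;
        }
        // Walk back from the sink, prepending each hop so the list reads source to sink.
        for (String n = sink; previous.containsKey(n); n = previous.get(n)) {
            path.addFirst(previous.get(n) + "->" + n);
        }
        return path;
    }

    public static void main(String[] args) {
        Map<String, List<String>> pips = new HashMap<>();
        pips.put("A", Arrays.asList("B", "C"));
        pips.put("B", Arrays.asList("D"));
        pips.put("C", Arrays.asList("D"));
        pips.put("D", Arrays.asList("E"));
        System.out.println(route(pips, "A", "E"));
    }
}
```

A practical router replaces the plain queue with a priority queue and a cost function so that congested or long resources are avoided.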
Clock routing is very architecture specific and is similar to inter-site routing in that it is also implemented by a list of PIPs. However, there are key steps and constraints that must be satisfied beyond typical inter-site routing.
RWRoute is a full design router that has been developed in the RapidWright framework leveraging its lightweight timing model. It is capable of routing designs in both wirelength-driven and timing-driven modes, enabling the open source community to innovate and develop new algorithms. The open source aspect enables creation of domain-specific algorithms such as bundle routing in customized cost functions for the desired figure of merit. It also supports a partial routing mode, which is an essential capability for a future library-based customized flow.
RWRoute has some limitations:
It currently only supports UltraScale+ devices.
The timing model in RapidWright does not estimate hold time and thus RWRoute cannot address hold time violations.
For the most accurate clock routing in timing-driven mode, certain files will need to be created (see tcl/rwroute/README for more information).
When attempting to route designs in timing-driven mode, for the most accurate timing estimates on hard blocks (such as DSPs), the design must be pre-analyzed and a set of files must be created to feed into RWRoute (see tcl/rwroute/README for more information).
By default, RWRoute runs in timing-driven mode, routing a design from scratch. To run an instance of RWRoute the syntax is:
rapidwright RWRoute /PATH/TO/INPUT/DCP/design.dcp /PATH/TO/OUTPUT/DCP/design_routed.dcp
The following options are available:
[--nonTimingDriven] for wirelength-driven routing. RWRoute is non-timing-driven with this option, relying on the Manhattan distance to guide the routing expansion and optimize total wirelength.
[--partialRouting] for partial routing. RWRoute strictly preserves routed nets of a design and works only on the unrouted nets of the design.
[--softPreserve] for enabling an experimental feature during --partialRouting, allowing RWRoute to rip up and re-route otherwise routed (and strictly preserved) nets.
[--wirelengthWeight <arg>] to redefine the wirelength weighting factor (alpha). The greater alpha is, the less runtime the router takes, at the expense of longer wirelength. It is within [0, 1]. Runtime usually converges when alpha is larger than 0.7. The default value is 0.8.
[--timingWeight <arg>] to redefine the timing-driven weighting factor. The smaller the timing weight is, the better critical path delay will be, at the expense of longer runtime. It is within [0, 1]. The default value is 0.35.
[--shareExponent <arg>] to redefine the sharing exponent for timing-driven routing. It is used to control the routing resource sharing when routing connections. When the sharing exponent is 0, the sharing mechanism is criticality-unaware and encourages resource sharing, even when connections are long and timing-critical. With an increasing sharing exponent, the resource sharing is discouraged for critical connections, allowing more suitable routes for them to optimize timing. As a result, the wirelength and routing time are increased. For an effective criticality-aware sharing mechanism, the sharing exponent should be no less than 1. The default value is 2 for an optimized trade-off between the critical path delay reduction and the wirelength-runtime product increase.
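The effect of the wirelength weight and the sharing exponent can be pictured with a toy cost model. The formulas below are assumptions chosen only to reproduce the behavior described in the option list above (a larger alpha leans harder on the Manhattan distance estimate, and a larger sharing exponent shrinks the sharing incentive for critical connections); they are not taken from RWRoute's source and should not be read as its actual cost function. All method names are hypothetical.

```java
// Toy cost model only: the formulas are illustrative assumptions, not RWRoute's implementation.
class RouterCostSketch {

    /** Manhattan distance between two tile coordinates. */
    static int manhattan(int x0, int y0, int x1, int y1) {
        return Math.abs(x0 - x1) + Math.abs(y0 - y1);
    }

    /**
     * Weighted A*-style priority: (1 - alpha) * cost so far + alpha * estimated
     * remaining distance. A larger alpha makes the search greedier toward the sink.
     */
    static double priority(double costSoFar, int distanceToSink, double alpha) {
        if (alpha < 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be within [0, 1]");
        }
        return (1.0 - alpha) * costSoFar + alpha * distanceToSink;
    }

    /**
     * Hypothetical share weight (1 - criticality)^shareExponent, with criticality
     * in [0, 1]. With exponent 0 the weight is 1 for every connection
     * (criticality-unaware); larger exponents reduce the weight for critical
     * connections, so they are encouraged to share less.
     */
    static double shareWeight(double criticality, double shareExponent) {
        if (criticality < 0.0 || criticality > 1.0) {
            throw new IllegalArgumentException("criticality must be within [0, 1]");
        }
        return Math.pow(1.0 - criticality, shareExponent);
    }

    public static void main(String[] args) {
        int distance = manhattan(0, 0, 3, 4);
        System.out.println(priority(10.0, distance, 0.8));
        System.out.println(shareWeight(0.9, 0.0));
        System.out.println(shareWeight(0.9, 2.0));
    }
}
```

In this model a connection with criticality 0.9 keeps a share weight of 1 when the exponent is 0, while an exponent of 2 reduces it to roughly 0.01, which mirrors the described shift from criticality-unaware to criticality-aware sharing.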
There are three tutorials that provide information about using RWRoute in different routing modes.
For all other configuration options, please refer to src/com/xilinx/rapidwright/rwroute/RWRouteConfig.java.