
( ESNUG 486 Item 4 ) -------------------------------------------- [10/08/10]

From: Thierry Sejourne <thierry.sejourne=user domain=st got calm>
Subject: A first look at Atrenta SpyGlass Physical for pre-RTL/RTL design

Hi, John,

I know you like hearing from users about new tools, so I thought I'd
write about Atrenta's new SpyGlass Physical which was announced this June.
We've used it for the past 18 months to help us define the best physical
architecture for our 32 nm, 15 M instance design (in both the pre-RTL and
RTL stages) that we're currently taping out.  This design has an on-chip
bus that connects all of its IP blocks.

It's tough to define a physical architecture at the pre-RTL/RTL stage 
because we don't have any real physical information then.  Yet different
architectural implementations have a major impact on our design specs;
and some architectures are simply not suitable for P&R.  This is
particularly true for our on-chip bus design.

We used SpyGlass Physical to define the best physical architecture for our
design at the RTL or pre-RTL stage instead of at the netlist level.  By using
it at these stages we were able to avoid costly respins at the end of our
design process (RTL and later).  SpyGlass Physical effectively gives us an
early top-level floorplan at the RTL level that includes big block assignment
and pad location.  It can provide preliminary design results based on SOC
abstraction (blackbox based, with many ways to define the timing of the
blackbox).  SpyGlass Physical helped us optimize how we:

     - group all our IP blocks into defined Physical Units.
       (e.g. 4 or 8 physical units.)

     - define our on-chip bus physical architecture to allow
       the split into the defined Physical Units.

SPYGLASS PHYSICAL'S KEY FUNCTIONALITY:

  1. Predict performance using a higher level of abstraction. 

     - We can use incomplete RTL for SpyGlass Physical as long as our
       individual modules can link.
     - Timing constraints can be restricted to only the places we want to
       verify.  For example, SpyGlass Physical didn't go inside our
       design's black boxes, which were represented with physical
       envelopes and timing models above the wire level.
     - We had the option to expand our RTL netlist, for example for the 
       on-chip bus.

  2. A "physical toolbox" which is a graphical environment that can
     generate tables, bar charts, graphs and text reports for making
     implementation trade-offs:

     - Clock browser tool for exploring clock distribution (driver pins, 
       slack observed, period, latency, and the cluster that is fed by
       the clock domain).
     - Physical Unit browser (can list instance count, the number of 
       hard macros, the number of IOs of each Physical Unit and the way 
       they are connected to other Physical Units).

  3. We used its physical toolbox to analyze our floorplan partitioning
     options based on our own metrics:

     - Timing; in our case, high-speed clocks were the dominant drivers
     - Congestion; managing the number of nets crossing our 'Physical 
       Units' (grouping of IP) to be able to reduce congestion later 
       during the place and route stage
     - Area (linked to instance count)
     - Power (you can localize power domains, but this wasn't a factor 
       for our set-top box)
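
As a rough illustration of the congestion metric above (nets crossing
Physical Units), here is a minimal Python sketch.  It is not SpyGlass
Physical code or output; the function name, the net list and the two
candidate groupings are all made-up assumptions used only to show how one
might compare groupings by crossing-net count.

```python
from collections import defaultdict


def count_crossing_nets(nets, unit_of):
    """Count nets that touch more than one Physical Unit.

    nets    -- dict: net name -> list of IP block names the net connects
    unit_of -- dict: IP block name -> Physical Unit name
    Returns (number of crossing nets, crossings per pair of units).
    """
    total = 0
    per_pair = defaultdict(int)
    for blocks in nets.values():
        units = sorted({unit_of[b] for b in blocks})
        if len(units) > 1:
            total += 1
            # Charge the net to every pair of units it spans.
            for i in range(len(units)):
                for j in range(i + 1, len(units)):
                    per_pair[(units[i], units[j])] += 1
    return total, dict(per_pair)


# Hypothetical toy design: four nets between five IP blocks.
nets = {
    "bus_req": ["cpu", "bus"],
    "bus_gnt": ["bus", "cpu"],
    "vid_data": ["decoder", "display"],
    "dma_ack": ["dma", "bus", "decoder"],
}

# Two hypothetical ways of grouping the blocks into Physical Units.
grouping_a = {"cpu": "PU0", "bus": "PU0", "dma": "PU0",
              "decoder": "PU1", "display": "PU1"}
grouping_b = {"cpu": "PU0", "bus": "PU1", "dma": "PU1",
              "decoder": "PU0", "display": "PU1"}

total_a, pairs_a = count_crossing_nets(nets, grouping_a)
total_b, pairs_b = count_crossing_nets(nets, grouping_b)
print("grouping A crossing nets:", total_a)  # 1
print("grouping B crossing nets:", total_b)  # 4
```

In this toy case grouping A keeps the bus traffic inside one unit, so it
would be the less congested candidate by this one metric.  A real flow
would weigh it against timing and area as well.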

INPUTS AND OUTPUTS

Inputs: SpyGlass Physical can work with incomplete RTL or constraints 
(good black-box handling, deals with incomplete SDC and so on).

Outputs: SpyGlass Physical reports congestion, timing, and grouping between
the blocks.  Since this was our first use of the tool, we used the text
reports as input and manually replicated the grouping in the Cadence
environment.

TOOL RUNTIMES

We used Synopsys DC/DC-Topo for synthesis.  We used Cadence SOC Encounter or
Mentor Olympus for place and route, as well as Synopsys IC Compiler for some 
macros.  The maximum size of our defined Physical Units (block groupings)
was about 2 M instances to allow for reasonable P&R runtimes.
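
A size cap like this is easy to check by script before committing to a
grouping.  The sketch below is only an illustration under stated
assumptions: the block names and instance counts are invented, and only
the 2 M limit comes from the text above.

```python
MAX_INSTANCES_PER_UNIT = 2_000_000  # cap quoted in the text


def unit_sizes(instances_of, unit_of):
    """Sum instance counts of the IP blocks assigned to each Physical Unit."""
    sizes = {}
    for block, count in instances_of.items():
        unit = unit_of[block]
        sizes[unit] = sizes.get(unit, 0) + count
    return sizes


def oversized_units(sizes, cap=MAX_INSTANCES_PER_UNIT):
    """Return the names of Physical Units whose size exceeds the cap."""
    return sorted(unit for unit, n in sizes.items() if n > cap)


# Hypothetical instance counts per IP block and a hypothetical grouping.
instances_of = {"cpu": 1_200_000, "bus": 400_000, "dma": 300_000,
                "decoder": 1_500_000, "display": 700_000}
unit_of = {"cpu": "PU0", "bus": "PU0", "dma": "PU0",
           "decoder": "PU1", "display": "PU1"}

sizes = unit_sizes(instances_of, unit_of)
print(sizes)                   # {'PU0': 1900000, 'PU1': 2200000}
print(oversized_units(sizes))  # ['PU1']
```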

The benchmark data below is based on the following design:

     Size: 7 M placed instances, 209 M Transistors, 500 signal pads
   Clocks: 230 total clocks (80 created, 150 generated)
   Blocks: 53 RTL IPs, 40 RTL blocks/glues and 160 Hard IPs,
           8 different IO libraries, 477 memory cuts

Our runtimes with SpyGlass Physical and the downstream tools were as follows:

  - Data prep (tech pre-compilation done once): 4 hours
  - RTL floorplanning & prototyping: 4 hours per run
  - SOC Encounter netlist floorplanning & prototyping: 8 hours per run
  - SOC Encounter or Olympus P&R to timing: 1 week per Physical Unit

DESIGN FLOW AND PROJECT TIME COMPARISONS WITH AND WITHOUT SPYGLASS PHYSICAL

Methodology A: (no SpyGlass Physical)
               No prototyping/partitioning: 37 weeks

   1. RTL Design
   2. Synthesize design, analyze results and iterate at RTL level
   3. Run place and route, analyze and iterate multiple times
      (potential RTL changes discovered at the end of the P&R).
      May never converge if physical architecture is wrong.

Methodology B: (no SpyGlass Physical)
               Prototyping/partitioning at gate-level only: 28 weeks

   1. RTL design
   2. Synthesize design, analyze results and iterate at RTL level
   3. Use SOC Encounter for gate level floorplanning and partitioning 
      choices.  Analyze the results, modify the design and iterate 
      (potential RTL changes).
   4. Implement the design using 1 set of place and route runs (though
      it can take a second set).

Methodology C: (using SpyGlass Physical)
               Prototyping/partitioning at pre-RTL/RTL level: 22 weeks 

   1. RTL design
   2. Optimize physical architecture with SpyGlass Physical at pre-RTL 
      through full RTL level.  Stabilize the architecture, e.g. split
      the on-chip bus and define Physical Units (groups of blocks).
   3. Run SOC Encounter floorplanning.  Analyze the results and refine 
      the partitioning changes.
   4. Implement the design using 1 set of place and route runs (though
      it can take a second set).

We estimated that our total project time savings using SpyGlass Physical
were:
     -  6 weeks or 20% vs gate level partitioning & prototyping
        (Methodology B).
     - 15 weeks or 40% vs no prototyping/partitioning (Methodology A).
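
For reference, the savings follow directly from the three schedule totals
quoted above (37, 28 and 22 weeks).  A short Python check, where the
dictionary keys and the helper name are just illustrative choices:

```python
# Project durations in weeks, as quoted for Methodologies A, B and C.
weeks = {"A": 37, "B": 28, "C": 22}


def savings(baseline, improved):
    """Return (weeks saved, percent saved relative to the baseline)."""
    saved = baseline - improved
    return saved, 100.0 * saved / baseline


saved_b, pct_b = savings(weeks["B"], weeks["C"])
saved_a, pct_a = savings(weeks["A"], weeks["C"])
print(f"C vs B: {saved_b} weeks, {pct_b:.1f}%")  # C vs B: 6 weeks, 21.4%
print(f"C vs A: {saved_a} weeks, {pct_a:.1f}%")  # C vs A: 15 weeks, 40.5%
```

So the quoted 20% and 40% are the rounded values of roughly 21% and 41%.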

However, the time savings, while important, were not nearly as important
as our schedule predictability.  It was important to be confident early
on that we would have a successful implementation.  It's extremely hard to
make
changes late in the design cycle.

The complexity of our design made doing this pre-RTL/RTL optimization 
with SpyGlass Physical a must.  For 3-5 M instances, you think you can
understand your chip with your brain and diagrams.  Above that level, SW
automation is needed to take care of your corner cases.  You can hit the 
wall because what you physically implemented simply may never work.

SPYGLASS PHYSICAL'S LIMITATIONS

  1. We had to go through a learning curve to use the tool; it wasn't
     magic at first.  We got better results with time.

  2. We feel users must guide the tool rather than let it find the optimal
     solution on its own, so we always had to refine its initial placement
     (which can always be improved with user guidance).  Because of this,
     we want the tool to do incremental placement for the refinements
     rather than starting from scratch each time -- which is less
     predictable.  This is not yet available in SpyGlass Physical.

  3. Because it lacks an incremental process, when we add new physical
     constraints, the results from SpyGlass Physical can be unpredictable.

  4. Atrenta uses a cluster model for placement with non-rectilinear shapes
     for black boxes -- which gives optimistic or pessimistic results.  A
     workaround exists by allowing black-box overlaps, but this means manual
     control.

  5. There were some situations where SpyGlass Physical hadn't optimized the
     RTL as well as Design Compiler had.  Its estimates were noticeably off
     in these few circumstances.

With SpyGlass Physical we are able to move our physical partitioning
decisions up to the RTL/pre-RTL level instead of the netlist level.  Its two
big advantages were the ability to direct our partitioning choice by our own
defined metrics with downstream predictability, and its runtime of only
4 hours.

Additionally, Atrenta R&D support was good; we had a local AE and typically
had fixes within 3 days.  We've done 2 tapeouts with SpyGlass Physical,
including one in 32 nm.  We are now deploying it in other groups of our
division.

    - Thierry Séjourné
      STMicroelectronics                         Grenoble, France

Copyright 1991-2024 John Cooley. All Rights Reserved.