
( ESNUG 485 Item 2 ) -------------------------------------------- [06/08/10]

Subject: (ESNUG 473 #1) A follow up eval of Atoptech's Apogee floorplanner

> Recently my group migrated our P&R flow to Atoptech.  To give you a little
> background, our team has been working in 65 nm for about 2 years.  We have
> a fairly wide range of chips and IP ranging from high speed (500+ MHz)
> semi-custom DSPs and CPUs to SOCs with a wide range of IP from small, high
> speed blocks (100k inst, 800+ MHz) to larger, slower speed (700k instance,
> 200+ MHz) cores.  Our P&R flow is netlist-to-GDS, and our focus was in
> this area rather than a full suite from RTL-to-final-verification.
>
>     - [ Iron Man ]


From: [ Iron Man ]

Hi John,

I'm now writing to share our team's experience using Atoptech's Apogee top
level floorplanning and assembly environment.  We've been using Atoptech's
block PNR tool for several years and we felt the move to Apogee was a
natural transition as it would allow us to keep all of our physical design
in a single database/toolset.  We started our migration roughly a year ago,
and I'm summarizing our experience to date.

Our legacy chip assembly and floorplanning environment is a mix of home
brewed scripts and partnerships with a vendor to get native features.  Our
belief is the vendor should provide any features which are general purpose
in nature (i.e. usable by other customers), and provide us an API to code
features which are specific to our needs.  In the past, our top level tool
usage has focused primarily on basic floorplanning and assembly features,
with an emphasis on K.I.S.S. principles.  Our hope in transitioning to
Apogee was to get beyond baseline functionality into a more automated chip
level environment.

TWO CHIPS:

After our initial tool Q/A of Atoptech, we were ready for a production field
trial.  In our first two Apogee chips we focused on bringing up baseline
functionality similar to what we had in our past environment, and on
out of the box QOR.  The first two chips were:

    SOC#1 - 65 nm, 5.1 M std-instances;  814 macros;  28 PnR blocks
            (excludes hard-macros, IP); 56K+ block pins

    SOC#2 - 65 nm, 4.8 M std-instances;  910 macros;  30 PnR blocks
            (excludes hard-macros, IP); 53K+ block pins

Both chips taped out on time using only Apogee for floorplanning & assembly
and have working silicon.


With regard to runtime and capacity, the following stats reflect SOC#1:

  - 25 minutes/13.0 G to (detail) merge all blocks into the chip level
    database.
  - an additional 12 minutes/4.0 G to read in SDC, extract RC, and get
    the full chip timing updated.
  - top level buffering (for design rule violations) based on groute
    took roughly one hour.
  - final top level routing and cleanup took 90 minutes.
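
As a rough tally of the SOC#1 numbers above: if the four steps run back to
back, and "roughly one hour" is read as 60 minutes (an assumption), the
chip-level turnaround comes to about three hours.  A minimal Python sketch
of the arithmetic, with illustrative step labels that are not tool output:

```python
# Wall-clock minutes for the SOC#1 chip-level steps quoted above.
# "Roughly one hour" is taken as 60 minutes (an assumption).
steps = [
    ("merge blocks into chip-level database", 25),
    ("read SDC, extract RC, update timing", 12),
    ("top-level buffering (groute based)", 60),
    ("final top-level routing and cleanup", 90),
]

total_minutes = sum(minutes for _, minutes in steps)
hours, minutes = divmod(total_minutes, 60)
summary = f"{total_minutes} min total (about {hours} h {minutes:02d} min)"
print(summary)
```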


Our general experience with Apogee from these two chips:

  1. Apogee has a unique hierarchical database which is accessible
     via TCL.  Traditionally floorplanners model the data either in a
     pseudo flat fashion (where hierarchy is somewhat available) or where
     hierarchy is abstracted out and you cannot access level N+1.  In
     Atoptech's database, all levels of hierarchy are available and
     you can access data in a very natural form and with filters to
     restrict or allow hierarchical traversing.  You can access timing,
     logic, and physical data throughout the entire hierarchy of your
     design, with read/write access.

     There are no inherent two-level hierarchy restrictions or flattening
     involved.

     Since hierarchy is maintained, it allows the tool to automatically
     abstract timing information if necessary, rather than forcing the user
     to decide whether they wish to see everything (bad runtime) or the top
     only (incomplete data).  Furthermore, you can easily push down/pull up
     data across the hierarchy and also selectively flatten blocks.

  2. At first you couldn't manipulate your hierarchy with Apogee if your
     physical and logical partitioning were different.  This was fixed in
     April '10 but we haven't had time to test this.

  3. Apogee's padring router can route complex multi-domain padrings with no
     manual intervention and produce no DRC or LVS errors.  It could place
     and route both single and dual ring structures as well as flipchip and
     wirebond structures.  (Most tools we've seen still have the model that
     a padring is a single ring of homogeneous pads... something we haven't
     had for a long time.)  Atoptech's padring constraints format allows us
     to accurately describe how our padrings look.

  4. Apogee's GUI let us manipulate complex floorplans including rectilinear
     block shapes, macro block and pad placement, custom (analog) net
     routing, etc.  It scrolls and views with hierarchy fairly efficiently.
     Base layers are viewable by merging in the GDS views... but it would be
     nice if they could have 2 separate palettes for layers... one for
     routing layers and one for base layers.  Most times the user is only
     concerned about routing layers, and it's cumbersome to have to view all
     layers on a single palette (although it seems most tools go this way.)

  5. Apogee generates OK pin assignments out of the box.  Some hand tweaking
     is still currently required.

  6. Apogee lets you deal with top level clock/signal buffering, and its
     QOR is good.  It reduced our overall clock insertion delay and OCV
     impact on timing.

  7. Top level routing works.  We have hundreds of long clock nets at top
     level and Apogee was able to route them all with extremely low jitter
     in a very tight channel automatically.  (In our previous tool, we had
     to either manually route the clock nets or go through quite a few
     iterations to fix clock jitter violations -- despite setting nondefault
     routing rules -- for clock balancing.  This had a big impact on the
     full chip timing closure cycle.)  In Apogee, this was pretty much done
     in one shot, no iterations.  Results are also predictable (i.e. they do
     not vary from run to run, which is very important for clock
     balancing).  We ran into one issue regarding proper extraction of
     non-preferred routing, but I believe this is resolved now.

  8. Top level LVS and DRC correlated very well with Calibre, and the top
     level router produced few DRVs that needed hand cleanup.

  9. Our chips required a lot of custom routing at the top level.  Custom
     (param) route features (PG route, padring route, differential route,
     etc.) were pretty good overall.  Some enhancements are needed.  The
     tool still in some cases produces DRC violations on custom routed nets,
     more so than on detail routed nets.

 10. Hierarchical floorplan interaction (passing floorplan constraints
     between levels) was smoothly handled in Apogee's system, with Atoptech
     tailoring handoff user-options to our needs.
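
To illustrate the kind of filtered hierarchical access described in item 1,
here is a toy sketch in Python.  It is not Apogee's Tcl API (none of its
commands appear in this report); the Block class, the walk function, and
the max_depth filter are hypothetical names invented for the illustration.
It shows one tree being queried at the top level only, down to a chosen
depth, or in full, without flattening anything.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Block:
    """Hypothetical node in a design hierarchy (not an Apogee object)."""
    name: str
    instances: int = 0                      # std-cell instances owned directly
    children: List["Block"] = field(default_factory=list)


def walk(block: Block, max_depth: Optional[int] = None,
         depth: int = 0, path: str = "") -> Iterator[Tuple[str, Block]]:
    """Yield (hierarchical path, block), descending at most max_depth levels.

    max_depth=None means no restriction; max_depth=0 yields only the start.
    """
    full = f"{path}/{block.name}" if path else block.name
    yield full, block
    if max_depth is not None and depth >= max_depth:
        return
    for child in block.children:
        yield from walk(child, max_depth, depth + 1, full)


# Made-up example hierarchy and instance counts.
chip = Block("chip", 1000, [
    Block("cpu", 200, [Block("alu", 50), Block("fpu", 80)]),
    Block("dsp", 300),
])

top_only = [p for p, _ in walk(chip, max_depth=0)]
one_level = [p for p, _ in walk(chip, max_depth=1)]
total_instances = sum(b.instances for _, b in walk(chip))
print(top_only, one_level, total_instances)
```

The same generator serves both the "top only" and the "see everything"
views, which is the point item 1 makes about not having to pick one
representation up front.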

The negative experiences (cons) during our first two chips:

  1. Interactive routing editing needed improvement.  We had issues with
     custom wire editing; it could not drop vias properly and it would
     sometimes create DRC violations.  There are still some issues here,
     but they're minor.  They tweaked it so our interactive cleanup
     experience was easier on the 2nd chip tapeout, but we look forward to
     more GUI-friendly automation.

  2. Database reference library management was not as flexible as we'd
     prefer.  Loading a full hierarchical database (for example, reading a
     Verilog netlist for a block with all of its .libs and LEFs) and then
     doing an incremental update when those files change is a mess.  You
     have to track each .lib and LEF individually -- same as the other
     P&R tools.  We're hoping they'll fix this.


Some nice timing-related features:

  - Native SDC 1.7 support
  - Multi-threaded RC extraction, tight correlation to Star-RCXT
    at both 65 nm and 40 nm nodes
  - Tight timing/SI correlation to our signoff STA tool.  We have
    tested correlation with PT-SI, PT/CeltIC and Goldtime flows
    both at 65 nm and 40 nm.
  - CCS (noise + delay) models as well as OCV and AOCV timing
  - MCMM timing analysis, either multi-threaded or distributed
  - Database natively supports hierarchical design
  - PnR block databases can be merged into the chip level
    database very quickly
  - User chooses how each sub PnR block database is merged at chip level:
       hard_hier mode: sub PnR block is in read-only mode
       flat mode: sub PnR block is in read/write mode.
    This means the tool can do full chip timing optimization.
  - Tool will use physical abstract model from each sub PnR block
    database for top level PnR.
  - Tool won't need to re-extract the RC for sub PnR blocks.
  - Tool can automatically identify the interface logic in each
    sub PnR block during the full chip timing update and thus timing
    update is very quick.
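
The last bullet is worth a small illustration.  The sketch below is a
generic, simplified take on interface-logic identification, not Atoptech's
algorithm: it keeps the cells between block ports and the first (or last)
register they reach, and drops the register-to-register logic that a top
level timing update can skip.  The netlist, the cell names, and the
functions reach and interface_logic are all made up for the example, and
clock paths and side inputs are ignored.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def reach(starts: Iterable[str], edges: Dict[str, List[str]],
          registers: Set[str]) -> Set[str]:
    """Nodes reachable from starts, stopping at (and including) registers."""
    seen: Set[str] = set()
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        if node not in registers:   # a register ends the interface path
            queue.extend(edges.get(node, []))
    return seen


def interface_logic(fanout: Dict[str, List[str]], inputs: Set[str],
                    outputs: Set[str], registers: Set[str]) -> Set[str]:
    """Cells between block ports and the first/last register they meet."""
    fanin: Dict[str, List[str]] = {}
    for src, dsts in fanout.items():
        for dst in dsts:
            fanin.setdefault(dst, []).append(src)
    return reach(inputs, fanout, registers) | reach(outputs, fanin, registers)


# Toy block: in_a -> g1 -> r1 -> g2 -> r2 -> g3 -> out_z
fanout = {"in_a": ["g1"], "g1": ["r1"], "r1": ["g2"],
          "g2": ["r2"], "r2": ["g3"], "g3": ["out_z"]}
kept = interface_logic(fanout, {"in_a"}, {"out_z"}, {"r1", "r2"})
print(sorted(kept))
```

In this toy block only g2, the logic between the two registers, falls
outside the interface set.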

TWO MORE CHIPS:

Our second round of Apogee chips focused on transition to 40 nm and code
stability.  From a 40 nm standpoint, the main points we focused on:

  - Correlation to signoff timing and extraction (Goldtime/Star-RCXT)
  - Low power functionality including voltage islands and power down.
  - 40 nm DRC correlation with Calibre (and more importantly the
    ability to not create DRCs in 40 nm.)

Correlation to signoff at 40 nm was excellent and both chips successfully
taped out.  We additionally worked with Atoptech on developing native RDL
(signal + PG) routing abilities rather than using home brew scripts.  Our
initial goal was to have these two testchips use Apogee's RDL router,
however the solution was not delivered in time for these two chips, so we
hope to deploy it in the next round.  We do have working beta code
in-house, so Atoptech is fairly close.

AND EVEN TWO MORE CHIPS:

Our third round of Apogee chips focused more on cleaning up the issues we
had in our first round.  Again, 2 larger 65 nm SOCs were used.  There's
nothing really earthshaking to report here, as a lot of this focus was to
continue to improve overall QOR and baseline features.  A lot of the effort
was removing hacks or workarounds we had in our first round of tapeouts...
to transition to a flow which is production worthy and can be leveraged
across all of our chips.

We are now done with what we consider Phase 1 of our deployment of Apogee.
We have proven the ability to get baseline functionality of the tool up,
with good out of the box QOR at both 65 nm and 40 nm across a variety of
chips.  Apogee will now be the default tool for floorplanning and chip
assembly in our business unit at 40 nm and we have phased out support for
our legacy tool (although other groups in our company are still using it.)

We are now in Phase 2 of our deployment plans.  This includes supporting
widescale deployment within our group, and from an R&D standpoint to focus
on advanced Apogee features:

  - Feedthru (over the block routing) - Our designs use a semi-abutted
    methodology using some homebrew scripts.  We have been working with
    Atoptech to use native physical driven feedthru insertion.  While
    we chose not to use it on our initial chips, another group
    internally has been working with Atoptech on this functionality and
    has shown that it is production worthy, providing solid QOR for
    standard scenarios as well as our corner case scenarios.  We plan to
    leverage their results and will be migrating to Apogee's native
    solution for this function going forward.

  - Timing Budgeting - In our methodology, our IP providers generate the
    constraints for their blocks, and work with the SOC team to derive
    the chip-level constraints.  This leaves a gap on IO timing budgeting.
    Our first approach towards chip timing closure involves coming up with
    a methodology to propagate IO constraints to the PNR blocks.  In the
    past we have used some homebrew scripts to derive these constraints
    (with mixed results).  We have been Q/A'ing Apogee's native timing
    budgeting features and the results, while not production worthy for
    our needs yet, look promising, and we plan to deploy them on a trial
    chip in the next few months.

  - Timing Closure - even with good timing budgets, some timing issues will
    still exist at the top level.  Apogee has a neat hierarchical
    database which allows for a naturally hierarchical full chip
    timing closure flow.  The tool can automatically extract interface
    logic models on the fly to reduce runtime and avoid costly
    incremental updates.  We have already correlated their full chip timing
    to our signoff full chip STA and we are approaching the accuracy we
    feel necessary to deploy this feature in production (this includes
    abstract generation, timing graph and exception interpretation,
    extraction correlation, timing correlation using CCS noise/delay and
    AOCV in 40 nm).  We plan to deploy this feature on a tapeout happening
    in Q3 of this year (with initial work starting in late Q1).

  - RDL routing - We've been working with Atoptech's R&D to provide an
    optimized RDL signal router to replace our in-house solution.  Although
    this Apogee solution wasn't utilized in the last round of tapeouts, we
    have been provided code and continue to work with them to handle all of
    our corner cases.  We feel it should be production worthy in our next
    round of tapeouts.

  - Top-level CTS -  This is still a work in progress but something we hope
    they deploy for our Q3 tapeouts.

  - Hierarchy manipulation - Our chips are designed bottom up based on the
    IP we are provided.  As our chips continue to grow in size, we need
    the ability to implement blocks that combine numerous IP blocks, rather
    than requiring that each IP be its own physical partition.  We have
    worked with Atoptech to specify the baseline functionality needed for
    this feature (from a floorplanning, partitioning and constraint
    generation standpoint) and are in the process of testing this out with
    a hope to deploy it on a trial basis on one of our next round of chips.
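
For readers unfamiliar with the IO budgeting gap mentioned under Timing
Budgeting above, here is one simple way such budgets can be derived.  This
is a generic proportional split, not Apogee's budgeting algorithm and not
our homebrew scripts; the function budget_path, the segment names, the
2.0 ns period, and the delay estimates are all hypothetical.  It ignores
setup time, clock skew, and uncertainty.

```python
from typing import Dict


def budget_path(period_ns: float, est_delays_ns: Dict[str, float]) -> Dict[str, float]:
    """Split the clock period across path segments in proportion to
    each segment's estimated delay."""
    total = sum(est_delays_ns.values())
    if total <= 0:
        raise ValueError("estimated delays must sum to a positive value")
    return {seg: period_ns * d / total for seg, d in est_delays_ns.items()}


# Hypothetical block-to-block path crossing the top level.
estimates = {"driver_block": 0.6, "top_level": 0.3, "receiver_block": 0.9}
budget = budget_path(2.0, estimates)

# IO constraint values seen from each block's boundary:
# time spent outside the driver block after its output pin ...
output_delay_for_driver = budget["top_level"] + budget["receiver_block"]
# ... and time spent outside the receiver block before its input pin.
input_delay_for_receiver = budget["driver_block"] + budget["top_level"]

for seg, ns in budget.items():
    print(f"{seg}: {ns:.3f} ns")
print(f"output delay (driver block):  {output_delay_for_driver:.3f} ns")
print(f"input delay (receiver block): {input_delay_for_receiver:.3f} ns")
```

The two derived numbers are the kind of values that would end up in each
block's SDC as set_output_delay and set_input_delay constraints.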

So in total, our business unit has done 10 tapeouts using Apogee, both at
the 40 nm and 65 nm nodes, with working silicon back.  We've completed our
initial transition to Apogee and it's our default floorplanning/assembly
environment for 40 nm and below.  

  - [ Iron Man ]

Copyright 1991-2024 John Cooley. All Rights Reserved.