( ESNUG 442 Item 12 ) ------------------------------------------- [03/24/05]
Subject: ( ESNUG 441 #1 ) Two Users Benchmark ACS along with the DC XG Mode

> Although we will continue to support the non-XG mode, I would encourage
> users to move to XG mode. XG mode will become the default in the
> 2005.09 synthesis release.
>
> - Savino Grillo
> Synopsys, Inc. Sunnyvale, CA
From: Ulrich Zaus <ulrich.zaus=user domain=infineon spot calm>
Hi John,
For one of our blocks (500 K instances, 77 K flops and 5 K inferred clock
gates in 130 nm) we had huge runtimes with an ACS flow and DC 2003.06.
ACS budgeting in particular caused problems.  On the recommendation of a
Synopsys FAE, we tried XG mode in the DC 2004.06-SP2-3 release.

The results were impressive.  A top-down compile of our block now takes
less than 9 hours on an Opteron machine.  And by dividing the flow into
several steps, we can keep memory usage low enough to use the faster
32-bit (Linux) version of DC.
To get there we had to change the scripts of our standard Infineon DC
flow:

  1) For all read/write operations we replaced the "db" file format with
     "ddc" to get the capacity of XG.

  2) The DFT UI is new in XG, so all our DFT commands needed to be changed.
     Fortunately Synopsys provides a translation program that took care of
     most of the changes.

  3) Some commands are modified or not supported in XG, so we had to make
     a few further adaptations to our scripts.

It took us about one week of modifications before we could generate a
netlist using the XG flow.
Then we found a problem with scan insertion, and the trouble began.
dft_drc reported lots of uncontrollable registers.  These were caused by
clock gating cells which were not correctly connected to scan enable.
Fully analyzing the problems and finding a workaround took us 2 weeks.
The difficulties were caused by 2 bugs in the hookup_testports command.
As a workaround, hookup_testports needed:

  1) the design to be uniquified beforehand.  To avoid an explicit
     uniquify after elaboration, we moved the hookup to after the compile
     step, which does an implicit uniquify.

  2) the 64-bit (amd64) version of DC, because the 32-bit (Linux) version
     generated GTECH inverters in the scan_enable tree of our design.
     These didn't belong there, and they made some clock gating cells and
     all registers in their fanout uncontrollable for scan test.
Therefore our final flow consists of the following steps:

  1) Analyze RTL
  2) Elaborate and insert clock gates
  3) Top-down compile
  4) Hook up testports of CG cells (64-bit (amd64) version)
  5) Insert scan (no optimization)
  6) Incremental compile

Between the steps the design database is passed on in ddc format.
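
The hand-off between steps can be pictured with a small Python model.  This
is only a sketch under stated assumptions: it does not invoke DC, the stage
and file names are hypothetical, steps 1 and 2 are assumed to share one
session, and every stage except the testport hookup is assumed to run on
the 32-bit build.

```python
from typing import List, NamedTuple, Optional, Tuple


class Stage(NamedTuple):
    name: str               # hypothetical label for one flow step
    bits: int               # 32 or 64: which DC build runs the stage
    ddc_in: Optional[str]   # database read at start (None for the first stage)
    ddc_out: str            # database written for the next stage


# Stage labels follow the step list above; the 32/64-bit split is an
# assumption, except for the hookup stage, which the post says needs 64-bit.
STEPS: List[Tuple[str, int]] = [
    ("elaborate_and_clock_gate", 32),   # steps 1 and 2
    ("top_down_compile", 32),           # step 3
    ("hookup_testports", 64),           # step 4
    ("insert_scan", 32),                # step 5
    ("incremental_compile", 32),        # step 6
]


def plan(steps: List[Tuple[str, int]], design: str = "block") -> List[Stage]:
    """Chain the stages so each one reads the ddc the previous one wrote."""
    stages: List[Stage] = []
    previous: Optional[str] = None
    for index, (name, bits) in enumerate(steps, start=1):
        # Hypothetical file naming: <design>.<index>_<stage>.ddc
        output = f"{design}.{index}_{name}.ddc"
        stages.append(Stage(name, bits, previous, output))
        previous = output
    return stages


if __name__ == "__main__":
    for stage in plan(STEPS):
        source = stage.ddc_in or "RTL"
        print(f"{stage.bits}-bit  {stage.name}: {source} -> {stage.ddc_out}")
```

Running each stage as its own session is what lets the memory-heavy work
stay on the 32-bit build while only the hookup stage uses 64-bit.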
Overall it took us about 3 weeks to move completely to XG, but in return
for a simple top-down compile with an overnight runtime on a 500 K
instance design, it was well worth the effort.  We do clock gating with
Power Compiler and scan insertion with DFT Compiler.  XG is not in the
Infineon standard flow, but with these results we will certainly need to
consider moving more designs to it.
- Ulrich Zaus
Infineon Technologies Munich, Germany
---- ---- ---- ---- ---- ---- ----
From: Chih-Hao Chung <ch-chung=user domain=ali.com.tw>
Hello John,
For over 3 years we have used ACS (a cool feature in DC) to improve our
compile times.  ACS partitions large design blocks into smaller, manageable
pieces, with each block budgeted based on top-level constraints.  Synthesis
is distributed over a Linux farm.  A simple example of our script is below:
acs_read_hdl design.v
set_clock_gating_style
insert_clock_gating
source constraints.tcl
set_compile_partitions -auto -force
acs_set_attribute TestReadyCompile true
acs_compile_design
acs_refine_design
We recently benchmarked ACS 2004.06-SP2 against the 2004.12 release with
some impressive results.  We used ACS on a 150K instance block (VIDEO_CORE),
partitioned it into 3 sub-blocks, and submitted the compile jobs to 3
Linux machines.  The multi-pass ACS flow takes 12 hours to complete in
both releases, but the QoR is much better in 2004.12.
Metric  DC ACS 2004.06-sp2    DC ACS 2004.12       Diff
------  ------------------    --------------    -------
Area           5,218,624.5       4,883,880.0     -6.41%
TNS                  -4.44             -4.32     -2.70%
WNS               -1,621.9         -1,086.76    -32.99%
We also ran the comparison in XG-mode.  Since XG-mode is supported
natively in ACS, we had to change absolutely nothing to get our flow to run.
The QoR was even better with XG, along with a 28% reduction in memory
footprint.
Metric  DC ACS 2004.06-sp2    DC ACS 2004.12       Diff
------  ------------------    --------------    -------
Area           5,184,447.5       4,810,787.5     -7.21%
TNS                   -4.2             -4.33      3.10%
WNS              -2,204.81          -1,162.4    -47.28%
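
The Diff columns in both tables are consistent with a plain relative
change, (new - old) / old, between the two releases.  A minimal Python
sketch that recomputes them from the reported values follows; the function
and variable names are ours and not part of any Synopsys tool.

```python
from typing import Dict, Tuple


def pct_diff(old: float, new: float) -> float:
    """Relative change from old to new, in percent, rounded to 2 decimals."""
    return round((new - old) / old * 100.0, 2)


# (2004.06-sp2 value, 2004.12 value) pairs copied from the two tables.
NON_XG: Dict[str, Tuple[float, float]] = {
    "Area": (5218624.5, 4883880.0),
    "TNS": (-4.44, -4.32),
    "WNS": (-1621.9, -1086.76),
}
XG: Dict[str, Tuple[float, float]] = {
    "Area": (5184447.5, 4810787.5),
    "TNS": (-4.2, -4.33),
    "WNS": (-2204.81, -1162.4),
}


def diff_table(rows: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Recompute the Diff column for one table."""
    return {metric: pct_diff(old, new) for metric, (old, new) in rows.items()}


if __name__ == "__main__":
    for label, rows in (("non-XG", NON_XG), ("XG", XG)):
        for metric, diff in diff_table(rows).items():
            print(f"{label:7s} {metric:5s} {diff:+.2f}%")
```

Because TNS and WNS are reported as negative numbers, a negative Diff means
the violation shrank in magnitude; the one positive entry (TNS in XG mode)
is a slightly larger violation.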
We are very pleased with the results. ACS with XG-mode allows us to
synthesize even larger designs faster with better QoR.
- Chih-Hao Chung
Acer Labs Taipei, Taiwan