( ESNUG 149 Item 1 ) ---------------------------------------------- [9/93]
> Subject: (ESNUG 148 # 4) "Clock Tree Insertion"
>
> Now this guy has been synthesizing his single phase, flip-flop based,
> synchronous design with the default ideal Synopsys clock. He of course
> now has several hundred flops connected to a single clock signal. He
> recognizes that a balanced clock tree must be created and inserted. He
> would like to know how designers:
>
> 1) determined the necessary size of the tree
>
> 2) chose the flops to be connected per branch to assure least
> susceptibility to clock skew
>
> 3) where & how the netlist was modified to reflect these new connections
--- --- ---
From: rainer@mucsun.sps.mot.com (Rainer Makowitz)
I am regularly involved in 'clock tree synthesis' in the 'back end', i.e. the
Physical Design System. As clock skew depends heavily on placement of Gate
Array macros, the only scenario today to achieve minimal clock skew is to do
it in the 'back end'. From a designer's point of view this means: leave it
to your ASIC supplier to construct an optimal tree and backannotate the new
structure into your netlist.
By the way: basically, all high-fan-out nets can be subject to 'clock tree'
synthesis in the back end, e.g. reset or scan enable. However, nets that
drive more or less static signals (where speed is not important) can be
synthesized with a buffer structure in Synopsys, provided you exploit
hierarchy and use a floorplanner. The limiting factor for cluster size would
be the 'edge rate' limit, which can be set in Design Compiler.
Besides clock skew, clock insertion delay is also important, as long buffer
delays are susceptible to temperature/voltage variations on large dies that
may contribute unpredictably to skew.
To give you a brief overview on our methodology:
1. for every technology we use an algorithmic estimator to optimize
the buffering stages (buffers/inverters, array size are input;
stage size, drive strength are output)
2. a special dummy buffer that represents the clock tree delay to the
simulator is removed from the netlist;
3. netlist is extended automatically for the buffers and nets (this
is the 'synthesis') after placement of all Gate Array macro cells;
this process makes sure that clusters are locally interconnected
and buffers are located in the center of clusters;
4. routing of clocknets is done before anything else;
5. clock tree is backannotated into the netlist that the 'front end'
designer uses for his simulation;
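As a rough illustration of the sizing in step 1, here is a minimal Python sketch that only counts buffers per level for a uniform tree. It is not the estimator described above: a real one also weighs drive strength, edge rate and array size. The function name and the fan-out limit of 16 are assumptions made for the example.

```python
def clock_tree_levels(num_flops, max_fanout):
    """Return the buffer count per tree level, leaf level first.

    Assumes a uniform tree in which no buffer drives more than
    max_fanout loads (a hypothetical limit, not a vendor figure).
    """
    if num_flops < 1:
        raise ValueError("num_flops must be at least 1")
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")
    levels = []
    loads = num_flops
    while True:
        # Ceiling division: buffers needed to drive the current loads.
        buffers = (loads + max_fanout - 1) // max_fanout
        levels.append(buffers)
        if buffers == 1:
            break  # a single root buffer drives everything below it
        loads = buffers
    return levels


if __name__ == "__main__":
    # 3000 flip-flops with a hypothetical fan-out limit of 16.
    print(clock_tree_levels(3000, 16))  # [188, 12, 1]
```

The count says nothing about balance; that still depends on placement and routing, as described in steps 3 and 4.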
With today's CAE tools you can achieve clock skew below 1ns for the bigger
array sizes and approx. 3000 FF's. There is significant development under way
to reduce this figure by a 'zero skew' methodology. Floorplanners will also
offer clock tree 'synthesis' in the future.
- Rainer Makowitz
Motorola ASIC, Munich
--- --- ---
From: cindy@zoran.hellnet.org (Cindy Eisner)
well, the asic vendors i am familiar with provide you with their own tools to
do the clock tree. it usually goes like this: you count your registers, and
use their people or documentation to figure out what kind of skew you can
expect. then you synthesize using clock skew like this:
"set_clock_skew -uncertainty 1 dclk"
which means that you think you will have a maximum of 1 (ns) clock skew.
synopsys uses this information to shorten the critical paths by 1 ns, and
to add delay for hold time on the shortest paths. notice that you are
still using one clock for the entire design.
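to make the arithmetic concrete, here is a small python sketch of how a skew uncertainty tightens both checks. it is a simplified model of the idea, not synopsys's actual timing engine, and all names and numbers (in ns) are made up for the example.

```python
def setup_slack(period, data_delay, setup_time, uncertainty):
    """Setup check: assume the capturing clock edge arrives early by
    `uncertainty`, so the data has less time to settle."""
    return (period - uncertainty) - (data_delay + setup_time)


def hold_slack(data_delay, hold_time, uncertainty):
    """Hold check: assume the capturing clock edge arrives late by
    `uncertainty`, so the data must stay stable for longer."""
    return data_delay - (hold_time + uncertainty)


if __name__ == "__main__":
    # Hypothetical paths: one long (setup-critical), one short (hold-critical).
    for skew in (0.0, 1.0):
        print("uncertainty", skew,
              "setup slack", setup_slack(20.0, 18.5, 0.75, skew),
              "hold slack", hold_slack(0.5, 0.25, skew))
```

a path that passes with zero uncertainty can fail either check once 1 ns is assumed, which is why the tool speeds up the long paths and pads the short ones.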
you typically will be provided with a dummy cell to use for pre-layout
timing simulation that will show the correct delay on the tree (but not
the skew).
then, you take the synopsys output that still uses one clock and use the
vendor's tools to create the clock tree and route it. even though you've
synthesized using set_clock_skew, you will probably still have control over
which flip-flops get the same branch of the tree. use your knowledge of
the design to decide - usually hierarchy helps. notice that theoretically,
you should have no problem even if you route the clock tree totally
randomly, because set_clock_skew takes care of setup as well as hold-time
problems.
hope this helps,
- Cindy Eisner
Zoran Microelectronics LTD
--- --- ---
From: rac@lsil.com (Robert Cottrell)
Insertion of a balanced clock tree has to take account of layout, so that the
tree is balanced not only in terms of the numbers of flip-flops attached, but
also wire lengths. This is particularly important in sub-micron technologies
where wire delays dominate over gate delays. Clock tree insertion could
therefore be the responsibility of the ASIC vendor, with, of course, the user
verifying the skew figures after layout.
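For the post-layout check, a minimal Python sketch of the two figures worth verifying: skew (spread of clock arrival times) and insertion delay (latest arrival). The flip-flop names and arrival times are hypothetical; in practice they would come from the vendor's back-annotated delay data.

```python
def clock_skew(arrivals):
    """Skew is the spread between the latest and earliest clock arrival."""
    times = list(arrivals.values())
    if not times:
        raise ValueError("no arrival times given")
    return max(times) - min(times)


def insertion_delay(arrivals):
    """Insertion delay is taken here as the latest clock arrival."""
    times = list(arrivals.values())
    if not times:
        raise ValueError("no arrival times given")
    return max(times)


if __name__ == "__main__":
    # Hypothetical clock arrival times in ns, keyed by flip-flop instance.
    arrivals = {"u1/ff_a": 2.25, "u1/ff_b": 2.5, "u2/ff_c": 3.0}
    print("skew:", clock_skew(arrivals))
    print("insertion delay:", insertion_delay(arrivals))
```

If the measured skew exceeds the uncertainty used during synthesis, the timing constraints no longer cover the real tree.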
Another useful point is that Synopsys allows you to specify some uncertainty
in your clock arrival time at flip-flops, which gets factored into the
optimization constraints so that your design will continue to function
correctly in the presence of skew.
- Dr. Robert Cottrell
LSI Logic Europe
--- --- ---
From: Oren Rubinstein <oren@waterloo.hp.com>
I have a few comments about this issue:
1. When you first synthesize your design, you have to assume some clock
skew, and tell Synopsys about that (set_clock_skew -uncertainty). If
you do that, Synopsys will take care of most of the problems for you.
2. The clock tree insertion method depends on what kind of design you
are doing.
For gate arrays, the clock tree is already there, and you can find
out from the vendor what skew you have.
For standard cell designs, some vendors like to lay out a "standard",
proven, clock tree from the beginning, and again you can find out
about the skew.
Another method would be to have your vendor insert the clock tree for you,
in which case the maximum skew should be written in the contract, and you
tell Synopsys that number.
Finally, if you want to do it yourself, Synopsys 3.0b introduced a new
command (balance_buffer) that does exactly that. I would advise you to
avoid this method. The kind of analysis that Synopsys does is not accurate
enough for a clock tree, because it doesn't take into account the layout.
To create and validate a good clock tree, you need the accuracy of SPICE, so
you run the risk of having unpleasant surprises when it is too late to fix.
- Oren Rubinstein
Hewlett-Packard (Canada) Ltd.
( ESNUG 149 Item 2 ) ---------------------------------------------- [9/93]
Subject: (ESNUG 148 # 5) "report_timing Hates Busses / Loves Individual Bits"
> Inside a simple FIFO design I ran into some post synthesis hold violations
> in simulation. (This kind of surprised me because all the Synopsys reports
> seemed happy concerning all the timing constraints.) Anyway, the simulation
> said that the address pins on the dual port memory instantiated in
> the FIFO had the hold problems. I went back into Synopsys to double check
> things. (BTW the memory instance name is "mammy" and the address pins are
> a multi-bit vector called "A".)
>
> Here's what I found:
>
> dc_shell> report_timing -to mammy/A
> Warning: Can't find object 'mammy/A' in design 'fifo'. (UID-95)
> Error: Design object list required for the '-to' argument. (EQN-19)
>
> What this means is that Synopsys doesn't like vectors, but single
> bits as far as "report_timing" is concerned in rev. 3.0b. My gut hunch
> is that there is probably a whole set of Synopsys commands that dislike
> vectors but I don't have the time to search them out.
From: uunet!uranus!splinter!flieder (Jeff Flieder)
John,
I am writing in response to ESNUG Post 148 Item 5, in which you were
saying that V3.0b does not understand vectors on the point timing command.
The reason for this is that report_timing -to * is a request for a "point
timing" report, so you have to give the command a point, not a collection
of points. It is true that it would be nice if the tool would understand
vectors, and give you point timing to each bit, but this is a minor problem
compared to all the other real glaring problems with the dc_shell interface.
There is a fairly easy way to get timing for all of the points on the bus,
however. Just run the following script:
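/* Quote the pattern; add the instance prefix (e.g. "mammy/A[*]") if needed. */
find(pin,"A[*]")
/* find leaves its result list in dc_shell_status; report each pin in turn. */
foreach (pinName,dc_shell_status) {
report_timing -to pinName
}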
This will find all the pins on the "A" bus and run a report_timing on each
one of them.
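If you would rather keep a plain command file around, the same per-bit expansion can be generated outside the tool. Here is a minimal Python sketch; the bus width of 4 is made up (the original post does not give one), and the "instance/bus[bit]" naming is an assumption that depends on your bus naming style.

```python
def per_bit_commands(instance, bus, width):
    """Build one report_timing command per bit of a bus pin.

    Assumes bits are named instance/bus[0] .. instance/bus[width-1].
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    return ["report_timing -to %s/%s[%d]" % (instance, bus, bit)
            for bit in range(width)]


if __name__ == "__main__":
    # Hypothetical 4-bit address bus on the memory instance "mammy".
    for command in per_bit_commands("mammy", "A", 4):
        print(command)
```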
Oh, and as far as your hold violations, did you turn on fix_hold, cuz if
you didn't Synopsys will indeed ignore hold time violations.
- Jeff Flieder
Ford Microelectronics
[ Editor's Note - I did use fix_hold; it turned out that there were modeling
differences between the ASIC vendor's beta Synopsys library and beta
Verilog library. (Or that was the last conclusion...) - John ]
( ESNUG 149 Item 3 ) ---------------------------------------------- [9/93]
Subject: (ESNUG 148 #2) "Mystery Memory Allocation Errors"
> We have been getting an error message from synopsys3.0b:
>
> "Error: internal memory allocation error."
>
> This is from dc_shell_exec running on an hp755. The error message comes out
> on standard error not standard output. The job keeps running but we
> suspect that it's giving up at certain points judging by our timing
> results. Synopsys claims that it's not them but I found the strings in the
> dc_shell_exec binary file so it is coming from Synopsys. It happens quite
> often but is not easily repeatable. Perhaps due to other jobs using memory
> on the server. We have to go back to synopsys3.0a for now! We have also
> seen it on a sparc10 with a slightly different error message. Anyone else
> seen this?
--- --- ---
From: cindy@zoran.hellnet.org (Cindy Eisner)
how can synopsys claim it is not them? i've been getting this forever - in
v3.0a as well as v3.0b. however, it only happens to me when i kill an
interactive dc_shell with ^c, so i've just ignored it. i use a sparc 10.
- Cindy Eisner
Zoran Microelectronics LTD
--- --- ---
From: frankz@abyss.kpc.com (Frank Zappulla)
We have also been getting an error message from synopsys3.0b:
"Error: internal memory allocation error."
We gave Synopsys a db file of a design that had this error. We have not
heard back from them yet. We also believe it is a Synopsys bug.
- Frank Zappulla
Kubota Pacific Computer
--- --- ---
From: div@NAD.3Com.COM (Dinesh Venkatachalam)
I saw a similar error on 3.0a. There are some known bugs in 3.0a plus what
I have seen (weird logic implementation in case statement, Internal failure)
so I don't think going back to 3.0a would be wise.
- Dinesh Venkatachalam
3COM Corporation
--- --- ---
From: ecker@nixe.zfe.siemens.de (Wolfgang Ecker)
we got the same error messages running synopsys3.0b. They occur independently
of the size of the synthesized designs. The reason for the message is
probably that no more virtual memory is available.
- Wolfgang Ecker
SIEMENS R&D, Germany