( ESNUG 307 Item 9 ) --------------------------------------------- [12/16/98]
Subject: ( ESNUG 304 #11 305 #1) Zimmer's Mother-Of-All Headaches ESNUG Post
> I've got some interrelated questions to throw out to the ESNUG community.
> This is pretty tricky. I suspect there are less than 10 engineers
> worldwide who could understand and/or answer these questions.
>
> Question 1:
> -----------
>
> Like a lot of people, I use default scripts to compile most blocks, and
> I don't mess with these defaults unless I have to. This implies setting
> appropriate constraints WITHOUT knowing the details of the logic inside,
> specifically without knowing what sort of paths exist (seq vs comb).
>
> So, imagine I've got a module that has inputs that go to flops, outputs
> that come from flops, and outputs that come from both inputs and
> flops (commonly known as a Mealy state machine).
>
> So, I want to budget this thing such that the input-to-clock paths get
> 0.85 clock periods, the clock-to-output paths get 0.15 clock periods,
> and the combinational (input-to-output) paths get 0.50 clock periods.
>
> This was easy with the old set_arrival, set_max_delay syntax:
>
> all_inputs_no_clocks = all_inputs() - find(port,clkname)
> set_arrival 0.15 * clk_per all_inputs_no_clocks
> set_max_delay 0.15 * clk_per all_outputs()
> set_max_delay clk_per * ( 0.15 + 0.50 ) \
> -from all_inputs_no_clocks -to all_outputs()
>
> Which is why I've kept using the old syntax all this time.
>
> Now, because of question 2 below, I have to move to the "new" syntax
> of set_input_delay and set_output_delay.
>
> The input-to-clock and clock-to-output paths are easy enough:
>
> all_inputs_no_clocks = all_inputs() - find(port,clkname)
> set_input_delay -clock clkname 0.15 * clk_per all_inputs_no_clocks
> set_output_delay -clock clkname 0.85 * clk_per all_outputs()
>
> Unfortunately, this doesn't work for the input-to-output paths. We've
> now told dc that all the inputs and outputs are related to the clock,
> and that the combined delays take an entire clock cycle. So, there's
> nothing left for the comb budget.
>
> Basically, the sid/sod syntax ASSUMES that a clock cycle is consumed in
> the middle. This isn't true for combinational paths.
>
> The only way that I've found to get around this is to declare the
> combinational paths to be multicycle:
>
> set_multicycle_path 2.0 -setup -from all_inputs_no_clocks \
> -to all_outputs()
> set_multicycle_path 1.0 -hold -from all_inputs_no_clocks \
> -to all_outputs()
>
> (The second line is needed to make the hold calculation work correctly.)
>
> Now we have the clock-related constraints out of the way, and we can put
> combinational constraints on. You would think that this could be done by
> using set_max_delay. Unfortunately, I haven't been able to get this to
> work. Instead, I have to create a virtual clock and put output (or input)
> delay on it.
>
> create_clock -name combclk -period 10.0 -waveform {0, 5.0}
> set_input_delay -clock combclk 0.0 all_inputs_no_clocks -add_delay
> set_output_delay -clock combclk 17.0 all_outputs() -add_delay
>
> All those single-path timing exceptions sound like a potentially serious
> performance hit (although I must admit that on 9802 I haven't actually
> seen much of a problem). Besides, this is MESSY.
>
> Has anyone found a better way?
>
> Question 2:
> -----------
>
> [ Editor's Note: "sid/sod" = "set_input_delay/set_output_delay" ! - John ]
>
> The reason I have to go to the sid/sod syntax (aside from the fact that
> set_arrival has disappeared from the man pages :-) ) is because I am now
> dealing with modules that have multiple clocks all mixed in with logic to
> be compiled (don't ask).
>
> Now, this is what sid/sod was invented for, so this should be easy, right?
> Well, wrong. The sid/sod approach assumes that the constraint setter
> KNOWS what inputs and outputs are related to what clocks. That's fine
> at the top of the chip, where I've been using sid/sod for years so that
> I can break bidi loops, but my lower-level generic scripts DON'T know that,
> and don't WANT to know that.
>
> If you do the sid/sod for each clock and clk_per in a foreach loop,
> you'll get silly things like a path from the slow clock, plus the
> input delay of the slow clock, ending at a flop clocked by the
> fast clock.
>
> Well, I guess this isn't really silly from dc's point of view. You've
> effectively told dc that there is a source for each input from each
> clock domain with a certain delay, and a sink for each output in
> each clock domain with a certain delay, and it doesn't know any better.
>
> But, this isn't what you really WANT. What you want is for the input
> and output delays for a clock to only be applied to paths that end/start
> on that clock. So, an input that goes to flops on both the fast and slow
> clocks (for example) should have the slow clock's input delay budget on
> the path going to the slow clock flop, and the fast clock's input
> delay budget on the path going to the fast clock flop. Likewise for
> outputs.
>
> How do you do this? I'm still trying it out, but the only way I've found
> so far is to do all the same stuff above (Question 1), and then set
> false paths across all the clock boundaries.
>
> foreach(_clock,all_clocks()) {
> foreach(_other_clock,all_clocks() - {_clock}) {
> set_false_path -from _clock -to _other_clock
> }
> }
>
> That's a shame, since I've already gone to a lot of trouble to define the
> clocks using harmonics of the fastest clock, then shrinking them to the
> target period using minus clock skew. So, I shouldn't have to disable
> all the cross-clock paths.
>
> Does anybody know of a better way to do this?
>
> - Paul Zimmer
> Cerent Corporation
From: Juergen Stallmann <juergen.stallmann@pdb.siemens.de>
Hi Paul, Hi John,
Here are some suggestions concerning your question #1 in ESNUG_305.
We're also working with generic constraints and have been facing
this problem in all of our last projects and also in our current project.
We try to reduce the number of combinational paths in sequential
blocks, but as you can imagine, there are still some sequential blocks
with combinational paths, and you probably don't know where they are.
I'm doing it in this way:
    input_delay = 0.5*clock_period
    output_delay = 0.8*clock_period
    set_input_delay input_delay all_inputs_no_clock
    set_output_delay -clock Clkname output_delay all_outputs()
I normally don't specify a related clock for input delays, so dc will
assume the clock of the fanout-tree ff as the related clock. If any input
is related to multiple clock domains, OK, you have to specify a related
clock for these inputs.
With these sid/sod (I know, set_input_delay ....), the combinational
paths would be violated by 0.3*clock_period, even if there's no gate
in the path.
In the next step, I look for all the combinational paths in
this block. This is done with a dc_shell script, which creates a new
dc_shell script consisting of commands like these for each
combinational path:
    set_max_delay path_delay -from input_pin -to output_pin \
                  -group_path combi_group
Usually I allow a combinational delay of 0.3*clock_period. So
path_delay will be defined as:
    path_delay = input_delay + output_delay + 0.3*clock_period
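The arithmetic above can be checked with a small worked example. This is only a sketch in Python, not dc_shell: the 10 ns clock period and the helper name comb_slack are assumptions made for illustration, and it assumes the tool charges input delay, path delay and output delay against the limit, as described in this thread.

```python
def comb_slack(limit, input_delay, output_delay, path_delay):
    """Slack on an input-to-output path: the limit minus everything charged against it."""
    return limit - (input_delay + path_delay + output_delay)


clock_period = 10.0                      # assumed example value, in ns
input_delay = 0.5 * clock_period
output_delay = 0.8 * clock_period

# With no exception, the limit is one clock period, so even a bare wire fails.
slack_default = comb_slack(clock_period, input_delay, output_delay, path_delay=0.0)

# With the set_max_delay value from the formula above, a bare wire has
# the whole 0.3 * clock_period combinational budget left as slack.
path_delay_limit = input_delay + output_delay + 0.3 * clock_period
slack_with_max_delay = comb_slack(path_delay_limit, input_delay, output_delay, path_delay=0.0)

print("slack without set_max_delay:", slack_default)
print("slack with set_max_delay:   ", slack_with_max_delay)
```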
Now the automatically created script can be included. As a result, all
combinational paths are in a separate path group and don't cause
trouble for the sequential paths. So your constraints for both your
sequential paths and your combinational paths will be considered by dc.
In your early design phase, your combinational paths may change
quite often, so the combi_path script should be invoked before each
compile. As you can't run it on the generic database, you can do it
in this way:
    elaborate design
    compile without constraints
    invoke combi_path_script
    apply constraints
    define path_delay
    include created script with set_max_delay commands
    compile
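The script-generating step in this flow might look roughly like the sketch below. It is written in Python purely for illustration (Juergen's generator is itself a dc_shell script that is not shown in this post). The function name, the port names, and the assumption that the combinational input/output pairs have already been extracted from a timing report are all hypothetical; only the shape of the emitted set_max_delay command is taken from the text above.

```python
def make_combi_script(paths, path_delay_var="path_delay", group="combi_group"):
    """Emit one set_max_delay command per (input, output) combinational pair."""
    lines = []
    for input_pin, output_pin in paths:
        lines.append(
            "set_max_delay %s -from %s -to %s \\\n    -group_path %s"
            % (path_delay_var, input_pin, output_pin, group)
        )
    return "\n".join(lines) + "\n"


# Hypothetical port names standing in for whatever the path search found.
example_paths = [("req_in", "ack_out"), ("sel_in", "data_out")]
script_text = make_combi_script(example_paths)
print(script_text)
```

Because the pair list changes as the design changes, the generator would be rerun before each compile, as the flow above describes.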
Concerning your question #2, I must agree, it's quite hard if you're
dealing with different clocks in one block. I try to minimize
the number of blocks with different clocks and ports, related to
both of them. If I can't avoid it, ;-( OK, I sit down and write
the specific constraints, or I try to meet the harder constraints even
for the slower clock systems. Maybe I'll get more gates, but when dealing
with 2.5 million gates, who cares.
Hope this helps a little bit.
- Juergen Stallmann
Siemens AG Paderborn, Germany
---- ---- ---- ---- ---- ---- ----
From: gmann@ford.com (Greg Mann)
John,
I think what Paul has shown is that time budgeting is a difficult problem
and that a universal time budget which allows for combinational paths
from input to output and with multiple clocks is at best unreasonable.
It occurs to me that even with only one clock, your previous method of
time budgeting (set_arrival/set_max_delay...) results in timing which
is probably reasonable for each block but is invalid from the point
of view of the big picture. If you have a block "A" which has a
flip-flop feeding a combinational path in block "B" which feeds logic
going to a flip-flop in block "C", then "A" gets 15% of the clock cycle,
"B" gets 50%, and "C" gets 85%, for a total of 150%. You've given away
more time than you have in a clock cycle. (Have I missed something?)
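As a quick check of that total, here is a sketch in Python using the fractions from Paul's original budget (0.15 for clock-to-output, 0.50 for input-to-output, 0.85 for input-to-clock). The dictionary keys are just labels chosen for this illustration.

```python
# Fraction of one clock period that each block-level budget hands out
# along a single flop-to-flop path crossing three blocks.
budget = {
    "clock_to_output": 0.15,   # block with the launching flip-flop
    "input_to_output": 0.50,   # purely combinational block in the middle
    "input_to_clock": 0.85,    # block with the capturing flip-flop
}

total = sum(budget.values())
overcommit = total - 1.0       # how much more than one period was given away

print("total budget: %.0f%% of a clock period" % (total * 100))
print("overcommitted by: %.0f%%" % (overcommit * 100))
```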
I would try the new design budgeting offered by Synopsys. I haven't
personally used it, but it sounds promising.
- Greg Mann
Ford Microelectronics Colorado Springs, CO
---- ---- ---- ---- ---- ---- ----
From: Brian Jung <bjung@sdd.hp.com>
Hi John,
We ran into this last year too. My problem was an input that went to a
clocked element. The output of the clocked element went through a multiplexor
to an output port. However, the input also went to the multiplexor's
other selectable input.
We used sid for the input wrt "Clk" and sod for the output wrt "Clk".
Then we used set_max_delay from the input to the output with the exact delay
we wanted plus an offset. The offset was the "Clk" period. This took care of
correcting for the "Clk" relations attached to the input and output ports.
I know this may not be "generic" enough for your app w/ multiple clock
domains. Maybe you already knew this... anyhow, that's been my experience.
- Brian Jung
Hewlett-Packard San Diego, CA
---- ---- ---- ---- ---- ---- ----
From: "Klaasen, ir. C.E." <klaasen@natlab.research.philips.com>
Hello Paul (and John),
Perhaps I am one of the (un)lucky 10 people who understand your two
questions. I have encountered the exact same problems as you have
mentioned, and I came up with similar solutions. As far as question 1 is
concerned, this is what one of my Synopsys spokesmen suggested:
If the outputs on the combinational path already have a set_output_delay
on them, because the output is also the output of a sequential path, the
given solution for question 1 seems pretty good.
However, if you only have a set_input_delay, you can still use the
set_max_delay command. This does work! You then have to add the input
delay and the desired comb. delay together ((0.15 + 0.50) * clk_per).
I agree that none of these solutions is very nice or simple. I guess
it would make sense if you could set both an input and an output delay in
combination with a max delay.
- Charles E. Klaasen,
Philips Semiconductors Eindhoven, The Netherlands
---- ---- ---- ---- ---- ---- ----
From: "Paul.Zimmer" <paul.zimmer@cerent.com>
John,
Attached is my summary of what I learned from the responses to my "Mother of
All Constraint Headaches..." posting. The bottom line is that there WAS a
better way. So, I'm glad I posted this on ESNUG!
Question 1:
----------
This boiled down to constraining Mealy paths (without knowing anything about
the actual paths in the design) in a single-clock environment.
My method involved using set_multicycle_path 2 from the inputs to the
outputs, then creating a virtual clock to constrain these paths.
It turns out that there IS a better way to constrain this case. What I
was missing was that set_max_delay COMPLETELY OVERRIDES the normal
clock-to-clock calculation, EVEN IF the clock-to-clock calculation is
"tighter". I didn't understand this (I assumed set_max_delay would set
an independent constraint on the path), and that was why I "couldn't get
set_max_delay to work".
In my case, set_max_delay does exactly what I want, if you know the trick.
The trick is to ADD one full period of the SLOWEST clock to your real
target constraint value, and use this as the max_delay.
When Synopsys looks at this path, the set_max_delay command
overrides the normal calculation (eliminating the need for the
set_multicycle_path), and takes the sum of the input and output
delays (which add up to one full clock cycle), plus the path delay, and
compares it to your max_delay value, which is one full clock cycle plus
your target path delay. The net effect is to compare the actual path
delay to the target path delay, which is exactly what you want.
In a multi-clock case, it will do this for the fast clock too, and get
lots of slack (because your max_delay value had a full period of the
slowest clock), but this is harmless.
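The effect of the trick can be worked through numerically. The sketch below is in Python and only models the comparison described above (input delay plus path delay plus output delay against the max_delay value). The periods, input/output delays and 5 ns target match the final script later in this post; the 4 ns actual path delay and the function name are assumptions for illustration.

```python
def max_delay_slack(max_delay, input_delay, output_delay, path_delay):
    """Slack when set_max_delay replaces the clock-to-clock calculation."""
    return max_delay - (input_delay + path_delay + output_delay)


slow_period = 100.0
target = 5.0                        # real budget for the Mealy paths, in ns
max_delay = slow_period + target    # the trick: add one slow-clock period

path_delay = 4.0                    # assumed actual input-to-output delay

# Slow clock: 60 + 40 = one full slow period, which cancels the added
# period, leaving target - path_delay.
slack_slow = max_delay_slack(max_delay, 60.0, 40.0, path_delay)

# Fast clock: 6 + 4 = 10, so the same max_delay leaves lots of harmless slack.
slack_fast = max_delay_slack(max_delay, 6.0, 4.0, path_delay)

print("slack vs slow clock delays:", slack_slow)
print("slack vs fast clock delays:", slack_fast)
```

The slow-clock case is the one that actually constrains the path, since its input and output delays consume the whole added period.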
Question 2:
-----------
This was essentially how to constrain designs with multiple (harmonic)
clocks WITHOUT setting false_path between the clocks.
No one has suggested an alternative yet. So, my final script looks like:
    create_clock slowclk -period 100 -waveform {0, 50}
    create_clock fastclk -period 10 -waveform {0, 5}
    all_ins_no_clks = all_inputs() - slowclk - fastclk
    set_input_delay -clock slowclk 60.0 all_ins_no_clks -add_delay
    set_input_delay -clock fastclk 6.0 all_ins_no_clks -add_delay
    set_output_delay -clock slowclk 40.0 all_outputs() -add_delay
    set_output_delay -clock fastclk 4.0 all_outputs() -add_delay
    /* constrain Mealy paths to 5.0 ns */
    set_max_delay 100 + 5 -from all_inputs() -to all_outputs()
    /* set false paths between the clocks */
    foreach(_clock,all_clocks()) {
      foreach(_clock1,all_clocks() - {_clock}) {
        set_false_path -from _clock -to _clock1
      }
    }
This produces exactly the same slack values as my old script, but without
the set_multicycle_path stuff.
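For readers who want to see what the nested foreach expands to, here is a small Python sketch that enumerates the same ordered clock pairs. It only prints command text in the form used in the script above; the function name is made up for this illustration, and nothing here talks to dc_shell.

```python
from itertools import permutations


def false_path_commands(clocks):
    """One set_false_path per ordered pair of distinct clocks."""
    return [
        "set_false_path -from %s -to %s" % (src, dst)
        for src, dst in permutations(clocks, 2)
    ]


commands = false_path_commands(["slowclk", "fastclk"])
for command in commands:
    print(command)
```

With N clocks this produces N * (N - 1) exceptions, so the count grows quickly as clocks are added.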
Many thanks to all ESNUGers who responded, especially to Brian Jung of HP,
Juergen Stallmann of Siemens, and Charles Klaasen of Philips, who set me
on the right path.
And I HOPE this interests more than 10 people! :)
- Paul Zimmer
Cerent Corporation