( ESNUG 349 Item 11 ) -------------------------------------------- [4/18/00]
Subject: ( ESNUG 348 #6 ) Cadence's Silicon Ensemble & Cross-Cap At 0.18um
> Here's the deal: I'm working on a 0.18um ASIC with a Japanese foundry.
> Good guys. They're working their butts off and closing timing without
> a problem. Not a word that we need to worry about cross-capacitance.
> Then I get to talking to some of my friends working on processors, and
> I start to get a bad feeling. For one thing, cross-capacitance isn't
> proportional to clock speed, it's proportional to metal pitch. You can
> be running 83 MHz in 0.18um and still have cross-capacitance bite you
> in the butt. Sure, maybe you can leave some margin on the table to
> cover additional delay due to cross-capacitance, but you can't margin
> noise. Get a glitch far enough down your logic cone, clock it into a
> flop, and suddenly you get to debug a frequency band where the part
> fails. Lovely.
>
> So when I press my foundry, I get this cheesy answer that they're
> going to insert buffers every couple of millimeters to prevent
> cross-talk. Sounds good, right?
From: Lou Scheffer <lou@cadence.com>
Hi, John,
Here's my take on this. (I'm from Cadence, by the way.)
This solution doesn't work well for several reasons. It's hard to apply on
nets with multiple drivers (if your methodology allows that.) Blindly
inserting buffers will screw up clock trees and other carefully designed
nets. It's hard to do on busses and still keep interbit skew constraints.
> Not once you run the simulations. For most nets, a 1-2 mm buffering
> distance would be fine, but if you have a very high drive cell
> aggressing on a very low drive cell, it can increase your delay 50% at
> less than 1/2 mm of adjacency, even on wide-pitched metal. 50%.
> That's no small potatoes. I mean, how much margin do you have?
This problem shows up in another way that affects performance even more
seriously. If you have no way to estimate crosstalk induced delay, then
you need to over-estimate coupling C to account for the Miller effect.
Suppose you decide to overestimate cross coupling C by 60% (a fairly
typical number). Then assuming half the capacitance is coupling, ALL nets
are overestimated by 30%. This is why chips still work despite the problem
above, which has been fairly serious since 0.25 micron technology.
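
The 30% figure is simple arithmetic. A minimal Python sketch of it follows; the function and parameter names are illustrative assumptions, not part of any tool:

```python
def total_c_overestimate(coupling_fraction, coupling_padding):
    """Fractional overestimate of a net's total C when only the coupling
    part is padded (for example, to cover the Miller effect)."""
    return coupling_fraction * coupling_padding

# Numbers from the text: half of total C is coupling, padded by 60%.
overestimate = total_c_overestimate(coupling_fraction=0.5, coupling_padding=0.60)
print(f"total C overestimated by {overestimate:.0%}")
```

Since the padding is applied to every net, the pessimism lands on all of them, including nets whose neighbors never switch against them.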
> The best thing I can say for SE is that it runs without crashing.
>
> Here are the problems we've found:
>
> 1. The parasitics coming from HyperExtract are up to 200% off
> compared to 3D field solution. A chimpanzee throwing darts
> at a diagram of parasitics could do better.
200% on which size parasitics? On very small ones, maybe. The models (at
least in the past) have been generated so they get total C as close
as possible (generally within a few percent). Since coupling C is about
1/2 of total C, if a big coupling C was off by >100%, one or the other
of the grounded or coupling C would become negative! So we can be sure
that big coupling Cs are not off by this far.
A more relevant metric is the effect of the error on delay and crosstalk
computation. This is roughly given by:
    (% error in coupling C) x (value of coupling C)
    -----------------------------------------------
                  (value of total C)
This should show (from our experience) that although HyperExtract is
certainly not perfect, the errors are manageable.
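
To make the metric concrete, here is a small Python sketch of the ratio above. The function name and the two example nets (capacitances in fF) are hypothetical and chosen only for illustration:

```python
def relative_impact(coupling_error_pct, coupling_c, total_c):
    """Rough effect, in percent of total C, of an extraction error
    on one coupling capacitance (the ratio given in the text)."""
    return coupling_error_pct * coupling_c / total_c

# Hypothetical nets, capacitances in fF.
# A tiny coupling C that is 200% off barely matters...
small = relative_impact(coupling_error_pct=200.0, coupling_c=1.0, total_c=100.0)
# ...while a big coupling C that is only 20% off matters more.
large = relative_impact(coupling_error_pct=20.0, coupling_c=50.0, total_c=100.0)

print(f"small coupling C: {small:.1f}%  large coupling C: {large:.1f}%")
```

With these made-up values, the 200% error on the small capacitor has less effect than the 20% error on the large one, which is why the raw percentage error alone says little.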
That being said, for any of the 2-D, 2.5-D, etc. extractors, the coupling
capacitances are not as accurate as the total capacitances. That's
because these tools are calibrated against 3-D field solvers, and
the users typically adjust the coefficients to get the total C as close
as possible. Historically they have not worried about whether this C
was coupling or grounded. When all the C was grounded for the purposes
of delay calculation this was OK. Now that there are tools that
can use these coupling Cs, the CAD folks who write these extractors are
trying to get the individual components to better accuracy.
Finally, make sure that you are using an extract parameter set that
has not been pre-compensated for Miller effect. Many times the parameters
are set up to deliberately overestimate coupling, typically by 50-60%.
This is done to include the effect of crosstalk on delay. Even if
you are willing to accept this solution for delay, though, it causes
serious overestimation of crosstalk.
> 2. It's flagging over 1000 noise violations in a few hundred K gates
> of logic. Over 1K noise violations in < 9 mm^2 of randomly
> routed die? Gimme a break! Simulation showed that SE was
> over-estimating noise by > 100%. It was using a slew rate
> less than half of the actual slew rate. It's hard to say how many
> real noise violations are in there, but my simulations are saying
> my usual layout topologies ought to give me < 10 noise violations
> per 100K gates *usually*. Not 1000!
It's certainly not surprising that using a slew rate twice that of the
real one would result in 1000 violations. The next question is why
were the slew rates bad?
The slew rates are derived straight from the library data. Starting
from the inputs and flip-flops, using the load C (and the input slew),
the output slew is looked up in a table. This slew is degraded in the
interconnect, and then forms the slew at the input to the next gate,
and so on.
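
The propagation just described can be sketched in Python as below. This is only an illustration under stated assumptions, not Silicon Ensemble's actual algorithm: the table layout, the bilinear interpolation, the hypothetical `INV` cell data, and the root-sum-square wire degradation (using ln 9 x RC as the 10%-90% step response of a single-pole RC) are all assumptions made for the example.

```python
import math
from bisect import bisect_right

LN9 = math.log(9.0)  # 10%-90% rise time of a single-pole RC step, in units of RC

def _locate(axis, x):
    """Return (lower index, fraction) for x on a sorted axis, clamped to its range."""
    x = min(max(x, axis[0]), axis[-1])
    i = max(0, min(bisect_right(axis, x) - 1, len(axis) - 2))
    return i, (x - axis[i]) / (axis[i + 1] - axis[i])

def output_slew(cell, in_slew, load_c):
    """Bilinear lookup of output slew from a (input slew, load C) table."""
    i, fs = _locate(cell["slew_axis"], in_slew)
    j, fl = _locate(cell["load_axis"], load_c)
    t = cell["table"]
    lo = t[i][j] * (1 - fl) + t[i][j + 1] * fl
    hi = t[i + 1][j] * (1 - fl) + t[i + 1][j + 1] * fl
    return lo * (1 - fs) + hi * fs

def degrade(slew, wire_rc):
    """Assumed wire model: root-sum-square of driver slew and wire step response."""
    return math.hypot(slew, LN9 * wire_rc)

def propagate(start_slew, stages):
    """Walk a path: table lookup at each gate, then degrade in its output wire."""
    slew = start_slew
    for cell, load_c, wire_rc in stages:
        slew = degrade(output_slew(cell, slew, load_c), wire_rc)
    return slew

# Hypothetical inverter table: slews in ns, loads in pF.
INV = {"slew_axis": [0.1, 0.5], "load_axis": [0.01, 0.10],
       "table": [[0.10, 0.40], [0.20, 0.50]]}

path = [(INV, 0.055, 0.02), (INV, 0.03, 0.05)]  # (cell, load C, wire RC in ns)
print(f"slew at end of path: {propagate(0.3, path):.3f} ns")
```

The point of the sketch is the dependency chain: every slew on the path is only as good as the table entries behind it, so a bad output-slew table corrupts everything downstream.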
So if the slews don't match what you see in SPICE, the most likely
reason is the library data is wrong, or that the operating conditions
differ (For example, the fastest possible slew rate is usually obtained
with a fast process, low temperature and high voltage. If you
compare this to a SPICE run at nominal process, voltage, and
temperature, the SPICE may well be slower by a factor of 2).
Note that input slews have a small effect on delays, compared to output
loading. Therefore it's quite common that the table of output slews
(as a function of load C and input slew) is very wrong. This does
not affect delay calculation much (which is why the error is allowed to
persist) but it affects crosstalk a lot.
Historically, this problem is particularly bad with synthesis and
simulation libraries, since these treat slew poorly if at all. For
example, Synopsys for many years did no slew degradation in interconnect
(has this been fixed?) and SDF provides no way to back annotate slew.
So people developing these libraries have no incentive to get the slews
right. This works OK (or at least no worse than usual) until you get
to crosstalk or some other analysis that depends on realistic slew values.
Next, you might want to look very carefully at the thresholds you are
using for crosstalk analysis. Usually, in a modern CMOS process, there
are scrillions of nets with noise about 20% of supply, lots with 30%
noise, many fewer with >40% noise, and so on. Since the number of
nets reported is an extremely strong function of the threshold, a little
work here characterizing your cells can pay big dividends.
Finally, you might want to double check your intuition. Assuming you
fix the slew problem, and set the thresholds correctly, it would certainly
not surprise me in a design this size to have 100 nets that could be bad,
given that all the neighbors that can switch at the same time did so, and
in the same direction. Will this ever happen in operation? If the
neighbors are busses it is certainly possible. Even if they are random
logic, a modern chip can easily do 10^17 cycles (300 MHz x 10 years),
so some very unlikely combinations can happen.
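
The 10^17 figure is easy to check. A short Python sketch, where the 365.25-day year is an assumption of the example:

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumed average year length

def lifetime_cycles(clock_hz, years):
    """Total clock cycles seen by a part running continuously."""
    return clock_hz * years * SECONDS_PER_YEAR

cycles = lifetime_cycles(clock_hz=300e6, years=10)
print(f"{cycles:.2e} cycles")
```

This comes out just under 10^17, so the round number in the text is a fair order-of-magnitude estimate.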
Of course the application has a strong impact on whether you care about
these unlikely occurrences. In a video processor, one bad pixel every
few seconds would never be noticed. In a PC, one error per year would be
totally swamped by the software error rate. If you are building a
pacemaker, though, you might want to fix every single possible crosstalk
problem, no matter how unlikely.
> Part of the trouble with all of this is the incredible lack of data in
> the industry on what to expect in real live designs. Virtually all the
> data I've come across is from contrived layout topologies on test chips
> or from microprocessors with manual or heavily programmed routing.
> That just isn't an option in ASIC design. I haven't found anybody who
> can knowledgeably tell me what kind of cross-capacitance issues to expect
> in automatically routed logic. It gets down to religion. Some people say
> that because this type of routing results in large numbers of extremely
> small aggressors, the cross-capacitance can be neglected (the aggressors
> will never all aggress at the same time). Sounds good. But I sit here
> and look at the routing fanning in and out of some of my memories and
> MUXes, and there are long stretches of massive adjacency. I think the
> large-#-of-small-aggressors argument just doesn't hold up across the
> die. Even in this random routing, there are instances of great
> regularity.
Silicon Ensemble sums all the aggressors, no matter how small, for exactly
this reason. For example, you might have a signal that runs across a 256
bit bus. Though each capacitor is very small, in this case they all might
change in the same direction at the same time, and the small Cs cannot be
neglected.
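
To see why many small capacitors add up, consider a crude capacitive-divider bound, sketched in Python below. This is a simplification assumed for illustration (it ignores the victim driver holding the net, so it is pessimistic), not the analysis Silicon Ensemble performs, and the capacitance values are hypothetical:

```python
def worst_case_noise_fraction(coupling_caps, grounded_c):
    """Capacitive-divider bound on victim noise, as a fraction of supply,
    if every aggressor switches the same way at the same time."""
    total_coupling = sum(coupling_caps)
    return total_coupling / (total_coupling + grounded_c)

# Hypothetical victim crossing a 256-bit bus: 0.1 fF per crossing,
# 100 fF of grounded capacitance on the victim.
one_bit = worst_case_noise_fraction([0.1], grounded_c=100.0)
whole_bus = worst_case_noise_fraction([0.1] * 256, grounded_c=100.0)

print(f"one crossing: {one_bit:.2%}  all 256 crossings: {whole_bus:.1%}")
```

Any single crossing looks negligible on its own, but in this made-up case the summed coupling reaches roughly a fifth of the supply, which is why dropping small aggressors is unsafe.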
Then, of course, users complain that the crosstalk analysis is reporting
too many potential errors.
> The second argument I've come across is that odds are against all these
> lines having the appropriate phase relationship to clobber each other.
> But I can't guarantee that for all signals entering and leaving memories
> and muxes. What do I look like, someone who wants to run vectors for the
> rest of my freakin' career?
>
> All in all, I feel a lot of bad silicon coming on...
Even without regularity, if you have 10^6 nets with 60 neighbors each, and
run it through 10^17 cycles, and the data is random, you can expect that on
some cycle at least one of the nets will have ALL of its neighbors change
in the same direction! Busses and regularity make the situation worse,
though by how much is very unclear.
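
The estimate above can be reproduced with a few lines of Python. The sketch assumes that "random data" means every neighbor switches on every cycle, independently, rising or falling with equal probability; that reading, and the function name, are assumptions of this example:

```python
def expected_events(nets, neighbors, cycles, p_rise=0.5, p_fall=0.5):
    """Expected number of (net, cycle) pairs in which all neighbors of a net
    switch in the same direction, assuming independent neighbors."""
    p_all_same = p_rise ** neighbors + p_fall ** neighbors
    return nets * cycles * p_all_same

# Numbers from the text: 10^6 nets, 60 neighbors each, 10^17 cycles.
events = expected_events(nets=1e6, neighbors=60, cycles=1e17)
print(f"expected worst-case alignments: {events:.3g}")
```

Under these assumptions the expected count is far above one, consistent with the claim. The result is very sensitive to the per-neighbor switching probabilities, so lower switching activity reduces it sharply.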
These problems are not mysterious, nor are they unavoidable. Microprocessor
designers and advanced ASIC users have been dealing with these issues for
years. The issues can be addressed by methodology, better tools, or more
analysis and awareness by the users. If you don't address it at all,
or address it superficially, it's sure to bite you. I suspect a lot
of bad silicon will be built before this lesson is fully appreciated.
The good news (if any) is that these problems, and others such as IR drop,
electromigration, wire self-heat, and hot electron effects, are well
understood. The next generation of tools (from Cadence, at least) has been
designed from the ground up with these effects in mind. So fairly soon we
should be able to get back to the desired situation where the user types in
RTL and gets back legal layouts, where legal now means DRC correct + all
DSM problems fixed.
- Lou Scheffer
Cadence San Jose, CA