( ESNUG 548 Item 1 ) -------------------------------------------- [03/20/15]
Subject: CDNS bigwig launches Innovus with 44 jabs at PrimeTime/ICC/ICC2
Last week at CDNlive'15, CDNS bigwig Anirudh Devgan, Sr. VP of digital R&D,
launched his new Innovus PnR tool. Events like this have three parts:
1.) What Cadence claims their new product does, how it supposedly
works, what its new features are, etc.
2.) What's real from the user perspective. What's really being
said by the Cadence folks -- and its context.
3.) The fun subtle jabs and pokes and dings that Anirudh is
making against Synopsys with this launch -- that officially
Anirudh is NOT making.
That is, at NO POINT did Anirudh ever say the words: Synopsys, PrimeTime,
IC Compiler, IC Compiler II, Aart, Aart de Geus, Star-RC, Star-RCXT, Avanti,
Milkyway, PhysOpt, Apache, Ansys, nor RedHawk.
And that's the fun part. Anirudh NEVER said any of those words. Not
once. But I heard them clear as day from what Anirudh said... Enjoy...
- John Cooley
DeepChip.com Holliston, MA
---- ---- ---- ---- ---- ---- ----
Anirudh said: "Notice the future is in different process nodes. The
world splits at 28nm. Automotive, microcontrollers are
all at 28nm and above. Below 28nm is where mobility,
servers, and networking live."
Cooley heard: "... below 28nm is where Qualcomm/Samsung/Apple mobility,
ARM/Intel servers, and Juniper/Cavium networking live."
---- ---- ---- ---- ---- ---- ----
Anirudh said: "Go back 10 to 15 years ago and every chip was either
analog or digital.
Digital is the advanced nodes like 28 to 10 nm. These
are servers, networking, mobile designs.
Analog is the bigger node like 40 to 130 nm. These are
amplifiers, power regulators, PLL's, etc.
The new big trend is Mixed Signal chips. And AMS's
sweet spot right now is 28 nm.
You need a tool suite that works across ALL THREE:
analog, mixed-signal, and digital."
Cooley heard: A double swipe at Synopsys. Anirudh is effectively
saying: "Ha! Ha! Synopsys only really does digital PnR.
Nobody goes to Aart for analog layout. Same with ATOP
and Olympus. But Cadence owns analog with Virtuoso and
we're stepping up in digital with EDI (now Innovus)!"
SNPS Jab Count: 2 Total SNPS Jab Count: 2
---- ---- ---- ---- ---- ---- ----
Anirudh said: "Innovus handles both mixed-signal and digital designs."
Cooley heard: "Nyah! Nyah! My spiffy new Innovus has:
- a unified db (SNPS doesn't)
- a unified C++ codebase (SNPS doesn't)
- mixed-signal design timing (SNPS doesn't)
- large blocks (SNPS doesn't)
- faster turnaround time than ICC/ICC2
- a single analog and digital db (SNPS doesn't)"
Resulting in 6 more jabs against Aart's ICC/ICC2.
SNPS Jab Count: 6 Total SNPS Jab Count: 8
---- ---- ---- ---- ---- ---- ----
Anirudh said: "The Cadence R&D team launched:
- Tempus STA in 06/2013
- Voltus noise and IR-drop in 11/2013
- Quantus QRC extraction in 07/2014
and now Innovus digital PnR in 03/2015"
"10X TAT and capacity gain."
"Best CPU PPA"
"Integrated signoff. Same database. Same codebase."
"Now does 5-10M inst blocks"
"Getting 20% PPA is what excites our users."
Cooley heard: Tempus is the Cadence answer to SNPS PrimeTime.
Voltus is the Cadence answer to Ansys/Apache RedHawk.
Quantus QRC is the Cadence answer to SNPS Star-RC.
New Innovus is the Cadence answer to SNPS ICC/ICC2.
10X better TAT and capacity -- against old ICC/EDI!
Best ARM PPA -- as compared to Aart's ICC/ICC2!
Integrated signoff -- because ICC/ICC2 and PrimeTime
use different databases!
Now 5-10M inst blocks -- because ICC/ICC2 is still
stuck with 1-2M inst blocks!
And finally, "Getting 20% better PPA against ICC/ICC2
with Innovus is real."
SNPS Jab Count: 8 Total SNPS Jab Count: 18
Apache Jab Count: 1 Total Apache Jab Count: 1
---- ---- ---- ---- ---- ---- ----
Anirudh said: "In the old days, a typical 40/28 nm design was 20 to 30 M
insts cut up into 25-50 top level blocks. Each block was
roughly 0.5 to 1 M instances each -- under the old PnR flow.
Now with 16/14 nm, the designs are 100 M to 150 M insts
which means there would be 150 blocks -- under the old flow.
But with our larger 5-10M inst block size in Innovus, you
are back to only managing 25-50 top-level blocks!"
"A 5 to 10 M block size saves a ton of engineering time by
not having to partition as much."
"A bigger block also helps optimize overall area better."
Cooley heard: "Under the old PnR flow" == Synopsys ICC/ICC2
Anirudh is driving home how bigger 5-10M inst top-level
blocks save engineering time and CPU runtime as compared
to Aart's ICC/ICC2 0.5 to 1 M inst top level block size.
And although Anirudh did not say it, his slide slams SNPS
with 5-10X TAT gain, core algorithm speed-up, full flow
multithreading and a MASSIVELY PARALLEL underpinning.
(See that "MASSIVELY PARALLEL" down bottom? Subtle!!!)
SNPS Jab Count: 9 Total SNPS Jab Count: 27
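The block-count arithmetic in that quote can be sanity-checked with a few lines of Python. This is only a rough sketch: it assumes uniformly sized blocks and that block count is simply design size divided by block size, and the helper name block_count_bounds is made up for illustration. The quoted "25-50" and "150" figures are round numbers, so expect the computed bounds to overlap them rather than match them exactly.

```python
import math

def block_count_bounds(design_insts_m, block_insts_m):
    """Return (fewest, most) blocks for a design-size range and a block-size range.

    Both arguments are (low, high) tuples in millions of instances.
    Assumption: blocks are uniform, so count = design size / block size.
    """
    design_lo, design_hi = design_insts_m
    block_lo, block_hi = block_insts_m
    fewest = math.ceil(design_lo / block_hi)  # smallest design, biggest blocks
    most = math.ceil(design_hi / block_lo)    # biggest design, smallest blocks
    return fewest, most

# Design sizes and block sizes as quoted in the talk (millions of instances).
cases = {
    "40/28 nm design, 0.5-1 M inst blocks": ((20, 30), (0.5, 1.0)),
    "16/14 nm design, 0.5-1 M inst blocks": ((100, 150), (0.5, 1.0)),
    "16/14 nm design, 5-10 M inst blocks": ((100, 150), (5.0, 10.0)),
}

for label, (design, block) in cases.items():
    fewest, most = block_count_bounds(design, block)
    print(f"{label}: {fewest} to {most} blocks")
```

The takeaway is the order of magnitude: a 10X bigger block brings a 16/14 nm design back to tens of blocks instead of hundreds.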
---- ---- ---- ---- ---- ---- ----
Anirudh said: "We're fast! Super fast! Our new Innovus PnR can do
1.8 M insts in a day. That's phenomenal throughput!"
"These are all customer blocks!"
Cooley heard: I have to admit I wasn't listening all that closely to
what Anirudh was saying because this Innovus vs. ICC
customer benchmark data spoke volumes to me at the time.
My reactions:
- The "reference" flow was either ICC or EDI.
- The way I see the data it breaks out to:
                   Innovus       EDI
    9.3 M 28nm    72 hours    700 hours
                  (3 days)    (29 days)

    because ICC can't handle blocks bigger than 2 M insts.

                   Innovus       ICC
    2.8 M 28nm    48 hours    336 hours
                  (2 days)    (14 days)

                   Innovus       ICC
    1.5 M 16nm    20 hours    104 hours
                  (1 day)     (4.3 days)
Holy crap! Going from 1 month to 3 days! Going
from 2 weeks to 2 days! Holy crap!
- I'm assuming these were customer PITA BLOCKS,
not just cherry-picked "nice" blocks. Customers
benchmark those Pain-In-The-Ass (PITA) blocks
which are the toughest blocks to close for timing,
power, area. You can get two specs, but then the
remaining spec balloons out. I'm assuming these
are the Innovus vs. ICC runtimes it takes
to converge on all 3 PPA specs.
- From the press releases, these customers were most
likely Freescale, Juniper, Renesas, Maxlinear, and
Spreadtrum -- but whose block is whose I don't know.
To me, if it vets out, this Innovus vs. ICC benchmark
data is the "money shot" of this product launch. Wow.
SNPS Jab Count: 5 Total SNPS Jab Count: 32
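The speed-up behind each benchmark row is just reference hours divided by Innovus hours. A minimal Python sketch, using only the hours from the breakdown above (the speedup helper and the tuple layout are illustrative, not anything from the Cadence slide):

```python
def speedup(innovus_hours, reference_hours):
    """Turnaround-time gain of Innovus over the reference flow."""
    return reference_hours / innovus_hours

# (block, reference flow, Innovus hours, reference hours) as read off the data
benchmarks = [
    ("9.3 M 28nm", "EDI", 72, 700),
    ("2.8 M 28nm", "ICC", 48, 336),
    ("1.5 M 16nm", "ICC", 20, 104),
]

for block, flow, innovus_hours, reference_hours in benchmarks:
    gain = speedup(innovus_hours, reference_hours)
    print(f"{block}: {gain:.1f}X faster than {flow}")
```

That works out to roughly 9.7X, 7.0X and 5.2X for the three blocks.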
---- ---- ---- ---- ---- ---- ----
Anirudh said: "When I took over Cadence digital R&D, I found EDI had
problems in PPA. To fix that Innovus now has:
- a new digital placer called GigaPlace that does
physical-driven, optimization-driven, slack-
driven, layer-aware fully analytical placement.
- revved up GigaOpt by adding multi-threading and
power/timing/area co-optimizations
- integrated Azuro CCopt directly into Innovus. It
now has FlexH to do better CTS."
"Fast TAT is nice, but what people want is our 20% PPA."
Cooley heard: Innovus has a new placer, GigaOpt has been beefed up,
and CCopt now does better CTS inside Innovus. The
prior PITA BLOCK benchmark data implies all these new
things work well -- otherwise Anirudh would not be making
those public 5.2X to 9.7X claims -- but I still want to
hear from these hands-on Innovus end users on what they
say about GigaPlace, "improved" GigaOpt, and CCopt FlexH.
Anirudh's reiteration that engineers want 20% PPA instead
of 10X TAT is true -- although he claims *both*.
SNPS Jab Count: 1 Total SNPS Jab Count: 33
---- ---- ---- ---- ---- ---- ----
Anirudh said: "Here we have a customer with multiple 16nm CPU blocks
exceeding his target frequency by 10% on all blocks
and reducing his Time-To-Tapeout (a different TAT!) by
1/2 compared to the old flow."
"Innovus' auto net-weighting, pipeline placement, and
elimination of region guides -- plus our new CCopt CTS
FlexH integration -- made this possible."
Cooley heard: Clearly "old flow" is ICC/ICC2. "Look! We did a
multiple ARM core 16nm design that beat Synopsys in
1/2 the Time-To-Tapeout and with 10% faster clocks!"
SNPS Jab Count: 1 Total SNPS Jab Count: 34
---- ---- ---- ---- ---- ---- ----
Anirudh said: "Here's a 3M+ inst 28nm design that was in 4 top level
blocks which took 5 engineers 5 days on 10 machines
to implement using their old flow.
Using Innovus, the same chip took 2 engineers 1 day on
3 machines -- plus it got a 10% area reduction.
This shows the savings one gets with 5-10M inst block
sizes instead of 0.5-1.0M inst block sizes."
Cooley heard: "ICC/ICC2's 0.5-1.0M inst block sizes suck if you can
jump up to 5-10M inst block sizes." Cooley Note To
Self: check this jump in block size claim with users.
10% area cut makes sense if hierarchy is flattened.
SNPS Jab Count: 2 Total SNPS Jab Count: 36
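To put that 3M+ inst example in engineer-days and machine-days, here is a small Python sketch. It assumes every engineer and every machine was busy for the whole run, which the talk did not state, and the effort helper is hypothetical.

```python
def effort(engineers, days, machines):
    """Return (engineer_days, machine_days).

    Assumption: all engineers and machines are busy for the full duration.
    """
    return engineers * days, machines * days

old_eng_days, old_machine_days = effort(engineers=5, days=5, machines=10)
new_eng_days, new_machine_days = effort(engineers=2, days=1, machines=3)

print(f"old flow: {old_eng_days} engineer-days, {old_machine_days} machine-days")
print(f"Innovus:  {new_eng_days} engineer-days, {new_machine_days} machine-days")
print(f"engineer-day ratio: {old_eng_days / new_eng_days:.1f}X")
print(f"machine-day ratio:  {old_machine_days / new_machine_days:.1f}X")
```

Under that assumption the claim is 25 engineer-days versus 2, which is why it is worth checking with real users.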
---- ---- ---- ---- ---- ---- ----
Anirudh said: "Innovus did power optimization on a 16nm 1.6M inst block
with 15+ power domains. Innovus got 4.2% better dynamic
combinational power and 13% better dynamic sequential
than the competition."
"Innovus now does IEEE 1801 -- the new UPF -- which is a
combined version of CPF and UPF."
Cooley heard: Innovus can do block level dynamic power optimization.
They also claim to have solved the CPF vs. UPF problem by
ironically supporting *both*; whereas SNPS only does UPF.
Many still use CPF because Conformal LP used CPF first.
SNPS Jab Count: 2 Total SNPS Jab Count: 38
---- ---- ---- ---- ---- ---- ----
Anirudh said: "Because Innovus can handle congestion better than the
competing flow, we had a mutual 16nm customer who saved
3 iterations and 2.5 weeks on their design."
Cooley heard: "Ha! Ha! We won a TSMC 16nm FinFET benchmark against
ICC/ICC2 because we can handle congestion better."
SNPS Jab Count: 1 Total SNPS Jab Count: 39
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
COOLEY'S GIRLFRIEND TEXTED HIM AT THIS MOMENT AND HE GOT DISTRACTED.
I'm very loosely paraphrasing what Anirudh said next here:
- "Tempus and Voltus and Quantus and Innovus all work
together on the same db -- so it converges faster!"
- "We're adding a better common UI to all our PnR tools."
- "Our reports and visualization are better now."
- "We have neat hooks into mixed signal that Synopsys
doesn't have!"
- "The ARM guys got a 5X faster tapeout of their spiffy
new Cortex-A72 core compared to our competitor!"
AFTER THIS COOLEY STOPPED TEXTING HIS GIRLFRIEND AND REGAINED FOCUS.
SNPS Jab Count: 3 Total SNPS Jab Count: 42
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
---- ---- ---- ---- ---- ---- ----
Anirudh said: "Customers bounce back and forth between STA and PnR
with our competition. Because Tempus and Innovus are
on the same database and the same C++ codebase,
our users close much faster!"
Cooley heard: Anirudh is driving home that because PrimeTime and
ICC/ICC2 are on different databases, it significantly
slows down the overall SNPS design closure flow by
5X to 10X compared to his one db Tempus/Innovus combo.
SNPS Jab Count: 1 Total SNPS Jab Count: 43
---- ---- ---- ---- ---- ---- ----
Anirudh said: "Adding Quantus QRC helped a 38M inst, 28nm customer
with more than 16 timing scenarios cut their timing
ECO loops by 10X compared to the competition."
Cooley heard: "Look! Quantus QRC/Tempus/Innovus kicked ass against
Star-RC/PrimeTime/ICC/ICC2!"
SNPS Jab Count: 1 Total SNPS Jab Count: 44
---- ---- ---- ---- ---- ---- ----
Anirudh said: "Voltus does power optimization, grid rail re-sizing,
IR-drop, switching power, de-cap optimization inside
Innovus. It's integrated within PnR."
"Now engineers overdesign their power grid structure;
leaving too much metal in your chip -- leaving less
area for signal routing. This Voltus integration
remedies that."
Cooley heard: "We're still taking on Ansys/Apache Redhawk! Now we're
using it being hooked into Innovus as a selling point!"
The difficulty is I'm NOT hearing many Redhawk customer
stories about how they've switched to Voltus...
SNPS Jab Count: 0 Total SNPS Jab Count: 44
Apache Jab Count: 1 Total Apache Jab Count: 2
---- ---- ---- ---- ---- ---- ----
Anirudh said: "With this Innovus launch, we have introduced our own
organically developed state-of-the-art PnR solution.
Compared to the competition, Innovus has up to 10X speed-
up, a 20% better PPA, and a much bigger 5-to-10 M inst
block size. It's already proven and in production
with all these customers."
Cooley heard: "Hey! Look! We have real customers actually going on
record that they're using Innovus for production PnR!
We're giving Aart's ICC/ICC2 some serious competition!!!"
============================================================================
FINAL ANALYSIS: Last year at SNUG'14, Aart made some mighty big 10X claims
when he launched his "game changer" IC Compiler II there. And I had a
hoot of a time scooping his ICC2 launch by 4 days with:
"SCOOP II: Multiple spies report that on Monday, Aart de Geus is
going to announce "Project Newton" IC Compiler II (ICC II) in
his upcoming keynote at SNUG'14 in Santa Clara..."
- from http://www.deepchip.com/items/0537-10.html
In that scoop I got the detailed breakdown (new placer, combined data model,
new MCMM timer, old CTS, old Z-Route) of what actually was in and not in the
"new" IC Compiler II.
Three months later, I got the CDNS "Project Novus" scoop; Anirudh's answer
to Aart's ICC2 threat.
"Performance and congestion for our 2.2 M inst block:
                            TNS             WNS          Util
    old EDI 2013:        -75.833 nsec    -0.931 nsec    74.23 %
    Project Novus 2014:   -1.326 nsec    -0.044 nsec    73.92 %
Number of routing cells with horizontal congestions (our design's
bottleneck) dropped by ~25%. LVT/SVT cells dropped by ~15%."
- from http://www.deepchip.com/items/0541-05.html
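For scale, the TNS and WNS numbers in that leak can be turned into percent reductions. A short Python sketch, where percent_reduction is an illustrative helper and the inputs are the values quoted above:

```python
def percent_reduction(old, new):
    """Percent drop in the magnitude of a negative-slack figure."""
    return (abs(old) - abs(new)) / abs(old) * 100.0

# Values from the "Project Novus" quote, in nsec.
tns_old, tns_new = -75.833, -1.326
wns_old, wns_new = -0.931, -0.044

print(f"TNS magnitude cut by {percent_reduction(tns_old, tns_new):.1f}%")
print(f"WNS magnitude cut by {percent_reduction(wns_old, wns_new):.1f}%")
```

Both come out in the mid-to-high 90s percent, with utilization essentially flat (74.23 % vs. 73.92 %).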
So now 6 months after the "Project Novus" leak, I was not surprised to see
CDNS announce "Innovus". Its key features:
- the ARM guys are on record saying Innovus got the best implementation
PPA with their Cortex-A72 -- saying but not directly saying that
Innovus had beat Aart's IC Compiler 2 ARM Cortex-A72 PPA results.
- Innovus claims closing faster because Tempus/Innovus/Voltus/Quantus
all work on a single database -- and same single C++ codebase --
while DC/PrimeTime/ICC/ICC2 use different databases and codebases.
- Innovus claims it does 5-10M inst blocks, while ICC/ICC2 are still
stuck in the 1-2M inst block size limit -- which has all sorts of
ramifications on congestion, timing closure, partitions, etc.
- Innovus claims multi-threading and distributed processing, which
means MCMM scenario acceleration.
- The Innovus 5-10X TAT vs. ICC/EDI claim backed by user benchmark data
was quite impressive. It means significantly faster iterations.
- finally the biggest is Innovus gets 20% better PPA than ICC/ICC2.
Taken altogether this means Innovus/Tempus does noticeably faster digital STA
and PnR with much bigger 5-10 M inst blocks that need fewer iterations to
get signoff level 20% better PPA vs. Aart's PrimeTime/ICC/ICC2 flow.
Competition is good.
- John Cooley
DeepChip.com Holliston, MA
P.S. And those 44 jabs against SNPS plus 2 against Apache were fun, too!
---- ---- ---- ---- ---- ---- ----
Related Articles:
Anirudh quietly tells users of "Project Novus" to nullify ICC II
Readers on ICC II, ATOP, CDNS EDI, upcharges, Z-Route, 24 months
Engineering comments point to SNPS vs. CDNS PNR shakeout at Apple