( DAC'15 Item 6 ) ----------------------------------------------- [12/18/15]
Subject: CDNS Genus vs. MENT Oasys vs. SNPS DC Graphical synth at DAC'15
THEY'RE MARRIED NOW: Waaaaay back in the old (pre-28nm) days, RTL synthesis
was only about taking some Verilog RTL source code and translating it into
the minimum number of logic gates and flip-flops that met your timing specs.
What PnR later did with the resulting gate netlist was an anonymous backend
PD engineer's headache. "It's not my problem. It met the spec. PnR has
promised to take it from there!" As a result, for PnR to guarantee they
could PnR *any* design that met spec, PnR typically added a 20% padding
to the spec.
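To make that padding arithmetic concrete, here is a minimal Python sketch. It assumes one reading of "20% padding": synthesis gets asked to hit a clock period 20% tighter than the real spec, so PnR has that difference to spend. The function names and the 1.0 ns figure are made up for illustration; nothing here comes from any vendor's tool.

```python
def padded_period(spec_period_ns: float, padding: float = 0.20) -> float:
    """Clock period synthesis must meet once the spec is tightened by `padding`.

    Assumed model: padded period = spec period * (1 - padding).
    """
    if not 0.0 <= padding < 1.0:
        raise ValueError("padding must be in [0, 1)")
    return spec_period_ns * (1.0 - padding)


def pnr_headroom_ns(spec_period_ns: float, padding: float = 0.20) -> float:
    """Timing headroom (in ns) left over for PnR under the same assumed model."""
    return spec_period_ns - padded_period(spec_period_ns, padding)


# Hypothetical 1.0 ns spec: synthesis targets roughly 0.8 ns,
# which leaves PnR roughly 0.2 ns to absorb wires, CTS, etc.
target = padded_period(1.0)
headroom = pnr_headroom_ns(1.0)
```

If post-PnR degradation on a path eats more than that headroom, the padding has failed to buy closure.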
"Houston, we have a problem."
- Astronaut James Lovell, on the Apollo 13 moon flight (04/14/70)
The difficulty now at 16nm/14nm is that, because of the new chip physics
mumbo jumbo, this 20% safety margin is just not enough to guarantee PnR
timing closure.
Now, RTL synthesis has evolved into RTL physical synthesis. That is, it's
no longer just Verilog RTL to gates -- it's now to placed gates -- with
power, CTS, path groups, congestion, layer-awareness, etc. all being brought
in at that right moment when your gates are being selected and placed.
So, long story short, in the olde (pre-28nm) girlfriend dating days it was:
Design Compiler was dating Magma or Avanti or EDI or Sierra or Atop
Now at 16nm/14nm on down, it's the synthesis-is-married-to-PnR days with:
SNPS DC Graphical married to SNPS ICC or ICC2
CDNS Genus married to CDNS Innovus
MENT Oasys married to MENT Sierra Olympus-SoC
With the only remaining bachelor PnR tool being Atoptech -- which happily
takes whatever Design Compiler or DC Graphical puts out.
SURVEY QUESTION #1:
"What were the 3 or 4 most INTERESTING specific EDA tools
you saw at DAC this year? WHY did they interest you?"
---- ---- ---- ---- ---- ---- ----
Synthesis is getting interesting again with Cadence Genus. Seems
like they leapfrogged Design Compiler (particularly in run time
and capacity) with a redesigned tool.
---- ---- ---- ---- ---- ---- ----
Cadence Genus.
I didn't think I would see anything new in RTL synth for a while,
and if I did, I expected it to be from a start-up, not CDNS.
- claims faster thru-put, with less area and power
- claims of up to 20% less power for certain designs
They presented several Genus benchmarks of CPU and GPU cores.
Had a few 16FF examples, but mostly 28nm. Our designs are 28nm.
What surprised me was Imagination endorsed Genus! (Imgtec is a
known supporter of SNPS.) I heard SNPS was caught by surprise.
Looks like you can add another Anirudh "jab" on the scorecard
for CDNS against SNPS.
---- ---- ---- ---- ---- ---- ----
The Genus presentation showed that Cadence is serious about synthesis
and is making progress on runtime and scalability.
The Genus "clips" flow looked interesting. It lets RTL engineers
get realistic synthesis results at the block level, without the
overhead of a full chip synthesis. Similar to Oasys RD.
---- ---- ---- ---- ---- ---- ----
Genus RTL Synthesis
The Cadence claim that Genus RTL synthesis is 3-5X faster than their
old RTL Compiler and scalable to 10M+ instances flat is true.
We've been able to turn 6M+ instances in less than a day. (In our
limited tests, Genus PPA was comparable to CDNS RTL Compiler.)
My overall view is Genus is a rebranding of CDNS RC with a major
engine update for better scalability across a large number of
cores and an improved UI added.
Cadence R&D has been making significant investments to have common
engines with their Innovus/Tempus teams -- and we are seeing some
fruit from this. However, they are only halfway through as they
move to common constraint readers, timers, etc. Statistical OCV
support in the Genus timer is missing -- so for those at the bleeding
edge of characterization/signoff, tight correlation with Innovus P&R
will be a challenge.
When they eventually get there, Innovus is going to be the top
platform in the industry.
---- ---- ---- ---- ---- ---- ----
We just got Genus RTL synthesis and have been using it for the last
few days. We used to use Cadence RC (RTL Compiler), which roughly
matched Synopsys DC Graphical for runtime/QoR.
So far in our comparison, for Genus we are seeing timing improvements
for some -- not all -- of our blocks. Genus runtime is significantly
faster as our company is now able to run multi-core.
---- ---- ---- ---- ---- ---- ----
Genus (RTL Synthesis):
- We have seen 5% to 8% improvements in Genus' QoR, particularly
timing; though we have not yet pushed a Genus-synthesized design
through the Innovus back-end tools to verify the improvements are
maintained.
- Timing improvements appear to be driven largely by taking advantage
of beneficial clock skew opportunities.
- We have also seen Genus run successfully on much larger blocks
(12M inst) that couldn't get through RC.
We still have very little experience, and Genus' default switches
are different from RTL Compiler's, so it's not obvious how to get an
apples-to-apples comparison.
---- ---- ---- ---- ---- ---- ----
The only part I remember from the Cadence Genus demo was its claims
of how deeply integrated with Innovus it was.
---- ---- ---- ---- ---- ---- ----
Cadence Genus -
I'm interested in how Genus physical design awareness is embedded
in the RTL synthesis step and how effective it is.
---- ---- ---- ---- ---- ---- ----
Do you have any independent user data about how well the early Genus
RTL physical synthesis estimates correlate to final Innovus PnR?
CDNS marketing says within 5%, but I don't believe them.
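One way a user could check a "within 5%" claim on their own blocks is sketched below in Python. It assumes you have already pulled matching per-path delays out of your synthesis and post-PnR timing reports yourself; the function names and the three sample paths are hypothetical, not tool output or measured data.

```python
def percent_miss(synth_est_ns: float, pnr_final_ns: float) -> float:
    """Signed error of the synthesis estimate relative to the post-PnR value, in %."""
    if pnr_final_ns == 0:
        raise ValueError("post-PnR delay must be non-zero")
    return 100.0 * (synth_est_ns - pnr_final_ns) / pnr_final_ns


def within_tolerance(pairs, tol_pct: float = 5.0):
    """Check (synth_est_ns, pnr_final_ns) pairs against a correlation tolerance.

    Returns (all_within, worst_abs_miss_pct).
    """
    misses = [abs(percent_miss(s, p)) for s, p in pairs]
    if not misses:
        raise ValueError("need at least one path to compare")
    worst = max(misses)
    return worst <= tol_pct, worst


# Hypothetical path delays in ns: (synthesis estimate, post-PnR result).
paths = [(0.95, 1.00), (1.10, 1.08), (0.80, 0.90)]
ok, worst = within_tolerance(paths)
# The third path misses by about 11%, so this made-up block fails a 5% bar.
```

The worst-case miss is reported rather than the average, since a single badly miscorrelated critical path is enough to break timing closure.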
---- ---- ---- ---- ---- ---- ----
I am just interested in answering question number 1:
Cadence's new digital implementation tools (Genus/Innovus),
Jasper formal verification, and Mentor's (Oasys) RealTime
synthesis.
RealTime was interesting for its runtime and capacity, but the
drawback is that it is only qualified against Mentor's
not-so-great P&R (Olympus?) tool.
---- ---- ---- ---- ---- ---- ----
Mentor RTL synthesis suite session covered some technologies
that are new to synthesis. Claims 10x faster, 100M gates at
the chip level, with the ability to consider physical aspects
during RTL synthesis and floorplanning.
Oasys speedup/capacity comes from their "place first" approach,
where optimization is performed at the RTL level rather than on the
synthesized gates. (Apparently it reduces the number of objects
to be handled during synthesis resulting in their speedup and
chip capacity.) It was interesting that placement information
and the RTL partitions were used for timing and optimization.
Right now we're just looking at Oasys. Not full users yet. It's
able to:
- read a full chip RTL,
- synthesize the full chip,
- partition the full chip,
- shape the partitions,
- place the macros and
- analyze congestion.
If we had the ability to do early floorplanning like this as part of
DC Graphical it would definitely help us save some back and forth
iterations between our synthesis and physical design teams.
The ability to cross probe between RTL and physical views is useful.
In the layout view the tool was able to switch between timing,
congestion and power maps -- neat stuff.
---- ---- ---- ---- ---- ---- ----
Mentor's RealTime Designer synthesis tool was very interesting.
The biggest differentiator was the core RTL floorplanning
capability, where RD is able to consume the RTL and generate a
floorplan for early analysis. This is very useful to debug
any issues and also get a sense of timing and congestion early
on in our design flow. It reduced floorplanning time from
3 weeks to 1-2 days.
Mediatek presented on RTL floorplanning using RealTime Designer,
where they were able to generate production quality floorplans for
blocks with 1000s of macros in 2-3x less time than a standard
Cadence First Encounter floorplanning flow. The Oasys synthesis
being able to optimize at the RTL level -- as opposed to the gate
level -- seems to be the driver for the runtime and capacity claims
of 10 times faster vs. DC Graphical with 100 M gate capacity.
Other RealTime features worth mentioning are auto PPA exploration and
a single combined environment for physical and RTL views.
---- ---- ---- ---- ---- ---- ----
That Oasys stuff seems to be a lot like First Encounter.
---- ---- ---- ---- ---- ---- ----
Synopsys DC Graphical
It makes sense for synthesis to give physical guidance to ICC.
---- ---- ---- ---- ---- ---- ----
I nominate the cross-probing and the congestion prediction
between DC Graphical and ICC/ICC2.
---- ---- ---- ---- ---- ---- ----
Nothing has happened with DC Ultra in years.
---- ---- ---- ---- ---- ---- ----