( ESNUG 359 Item 3 ) --------------------------------------------- [9/13/00]
From: [ Original Author Unknown ]
Subject: BOA, simple_compile_mode, Area Recovery, Vera, & Boston SNUG'99
[ Editor's Note: This is a user trip report from last year's Boston SNUG
(1999). I accidentally found it on a search of my database and, after
reading all the technical meat, thought it worthy of ESNUG. - John ]
This is a quick trip report for the Boston SNUG. In general the two-day
conference was very useful, and has highlighted some areas that we may want
to think about.
Big technology "takes" - described below.
a) compile_sequential_area_recovery = true
b) hdlin_use_cin = true
c) Multibit Register Inference
d) Don't use full_case parallel_case
e) Plan verification early, and have a dedicated team to do it.
The first day was a day of tutorials, and the second day was user sessions.
I have the entire proceedings on my desk if anyone is interested in reading
the presentations. I will only comment on the sessions I attended, although
some of the others looked useful, and may well be worth a look.
The first tutorial I attended was the "Getting the most from Design
Compiler: Area optimizing techniques and synthesis for high performance
designs". This tutorial was aimed at multi-instance, multi-channel,
parallel designs running at less than 100 MHz.
Set_Max_Area
------------
Set to 0 to enable the best area recovery.
DC now has four compile steps, with delay being prioritized over area;
they are:
1) Delay optimization TNS
2) Design rules fixing 1 (delay highest priority)
3) Design rules fixing 2 (design rules highest priority)
4) Area optimization.
TNS is the Total Negative Slack of a design (i.e. the sum of all the worst
negative slacks per endpoint.) DC can prioritize area over TNS with
set_max_area -ignore_tns. A critical range can be set to further optimize
critical paths, BUT an overly aggressive value will hurt run-time. TNS is
also shown as a column in the compile log.
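As a worked example of the TNS definition above, here is a minimal Python
sketch. The endpoint names and slack values (in picoseconds) are invented
for illustration, and the helper names are hypothetical; this is not how DC
reports or computes anything internally, only the arithmetic of the
definition.

```python
# Worst slack per timing endpoint, in picoseconds.
# Endpoint names and values are hypothetical, made up for this example.
worst_slack_ps = {
    "regA/D": -300,   # violating endpoint
    "regB/D": -50,    # violating endpoint
    "regC/D": 120,    # meets timing, does not contribute to TNS
    "out1": -50,      # violating endpoint
}


def total_negative_slack(slacks):
    """Sum the worst slack of every endpoint whose slack is negative."""
    return sum(s for s in slacks.values() if s < 0)


def worst_negative_slack(slacks):
    """Single worst violation across all endpoints (0 if none violate)."""
    return min(min(slacks.values()), 0)


print("TNS:", total_negative_slack(worst_slack_ps))
print("WNS:", worst_negative_slack(worst_slack_ps))
```

The point of the distinction: fixing only regA/D improves the worst slack
but leaves two other violating endpoints, which TNS still counts.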
Simple_Compile_Mode
-------------------
Simple_compile_mode has the following benefits: it compiles faster, reduces
area, does not need a uniquify, and automatically does a bottom-up compile.
It is for designs that don't have aggressive timing goals, have multiple
instances, spend significant time in high-level optimization, and don't
spend significant time in top-level optimizations such as DRC,
set_fix_hold, and global nets.
We may not be able to use this, but it may be a good "initial compile"
thought. It is set PRIOR to a compile by set_simple_compile_mode true.
Area_Effort
-----------
Compile has a new switch -area_effort [none | low | med | high]. It focuses
compile on area by taking advantage of wide-fanin gates. This follows
the value of -map_effort, and vice-versa.
Sequential Area Recovery ( THIS COULD BE GOOD)
----------------------------------------------
Enabled by compile_sequential_area_recovery = true. It remaps all the
sequential cells to try to recover area. It doesn't hurt critical-path
slack, but it may reduce positive slack.
Behavioral Optimization of Arithmetic (BOA)
-------------------------------------------
This is a set of arithmetic transformation techniques, performed by
transform_csa, that optimize the implementation of arithmetic
functionality. It uses Carry Save Arithmetic (CSA).
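For readers unfamiliar with carry-save arithmetic, here is a minimal Python
sketch of the general idea: several operands are reduced with 3:2
compressors that never propagate a carry between bit positions, and only one
ordinary (carry-propagate) addition is done at the end. This is the textbook
technique only; the tutorial did not describe what transform_csa does
internally, and the function names below are hypothetical. Operands are
assumed to be non-negative integers.

```python
def carry_save_add(a, b, c):
    """3:2 compressor: turn three operands into a sum word and a carry word.

    Each bit position is handled independently, so no carry ripples
    across the word. The invariant is a + b + c == partial_sum + carry.
    """
    partial_sum = a ^ b ^ c                       # per-bit sum, carries ignored
    carry = ((a & b) | (a & c) | (b & c)) << 1    # per-bit majority, shifted up
    return partial_sum, carry


def csa_sum(operands):
    """Add many operands using carry-save stages and one final real add."""
    operands = list(operands)
    if not operands:
        return 0
    s, c = operands[0], 0
    for x in operands[1:]:
        s, c = carry_save_add(s, c, x)   # keeps s + c equal to the running total
    return s + c                          # the only carry-propagate addition


print(csa_sum([5, 9, 12, 7]))
```

In hardware terms the win is that the slow carry chain appears once, rather
than once per addition in an expression such as a + b + c + d.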
Behavioral Retiming (BRT)
-------------------------
This moves registers around combinational cells, using sequential
optimization techniques. Has a primary goal of timing, with a secondary goal
of area. It preserves functionality at I/O boundaries.
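The claim that retiming preserves functionality at the I/O boundaries can be
illustrated with a small Python model. This is a toy sketch, not BRT itself:
the two logic stages f and g, the function names, and the choice of reset
values are all assumptions made for the example. The same register is placed
either after both logic stages or between them, and the output sequence seen
at the boundary is identical.

```python
def f(x):
    """Hypothetical first combinational stage (8-bit)."""
    return (x * 3) & 0xFF


def g(x):
    """Hypothetical second combinational stage (8-bit)."""
    return (x + 1) & 0xFF


def register_at_output(inputs, reset=0):
    """x -> f -> g -> [REG] -> y : f and g both sit in front of the register."""
    state = g(f(reset))          # assumed reset value, chosen to match below
    outputs = []
    for x in inputs:
        outputs.append(state)    # value visible at the output this cycle
        state = g(f(x))          # captured at the next clock edge
    return outputs


def register_between_stages(inputs, reset=0):
    """x -> f -> [REG] -> g -> y : the register moved backward across g."""
    state = f(reset)             # equivalent reset value for the moved register
    outputs = []
    for x in inputs:
        outputs.append(g(state))
        state = f(x)
    return outputs


stimulus = [1, 2, 3]
print(register_at_output(stimulus))
print(register_between_stages(stimulus))
```

The two versions differ only in how the combinational delay is split around
the register, which is what gives a retiming tool room to improve timing.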
Other Switches To Control Area
------------------------------
hdlin_use_cin = true (this looks good for us - not the default either)
hdlin_infer_mux = default (this is the default, and OK)
hlo_resource_allocation = area_only (not advisable, let the default be)
compile -top This only optimizes paths through the top level. This
could be good for us.
The big "takes" from this were compile_sequential_area_recovery and
hdlin_use_cin. It may also be good to have a look and see how compile
-area_effort works for us. BOA and BRT look good, but probably won't do
anything for our designs. Also, some of these need the DC Ultra license,
which we have fewer of.
The afternoon session that I attended was "Breaking the Verification
Bottleneck: Hierarchical Functional Verification". This was primarily a
high-level overview, and did not really tell us anything that we had not
thought about before. Maybe an interesting read in your spare time.
Friday brought two user sessions and the keynote speech from Aart de Geus.
The speech was good, and focused on the direction Synopsys is heading in.
There are three areas that they are trying to link together - Physical
Synthesis, IP design and re-use, and High level verification. This has been
enabled by a bunch of new products - TetraMAX, VERA, Chip Architect,
FlexRoute, Core Consultant, and Core Builder.
The first user session was "Synthesis / Design Productivity", which was
split into three presentations.
1) Using Multibit Register Inference. This was good, and is something
that I think we would really want to do. It revolves around using
bus-wide cells in the library, instead of using single-bit cells.
Ultimately the best implementation is to have wide cells in the
library (which I have begun to pursue), but it can have an impact
without this. Turned on by hdlin_infer_multibit = default_all. Usage of
the automatic multibit inference capabilities available in Synopsys
leads to a fully-automated flow of inferring gated clocks with
associated power savings of up to 60%. Combined with an innovative
cell design, the inference technique provides area savings of up to
50% in storage elements area, which in turn lead to area savings of
up to 15% in chip area. Tight control over the structure of the
multibit elements enables low-skew CTS, even when combined with
non-gated storage elements.
2) "full_case parallel_case", the Evil Twins of Verilog Synthesis. Two
of the most overused and abused directives included in Verilog models
are the directives "// synopsys full_case parallel_case". The popular
myth that exists surrounding "full_case parallel_case" is that these
Verilog directives always make designs smaller, faster and latch-free.
This is false! Indeed, the "full_case parallel_case" switches
frequently make designs larger and slower and can obscure the fact that
latches have been inferred. These switches can also change the
functionality of a design causing a mismatch between pre-synthesis and
post-synthesis simulation, which if not discovered during gate-level
simulations will cause an ASIC to be taped out with design problems.
This paper details the effects of the "full_case parallel_case"
directives and includes examples of flawed and inefficient logic that
is inferred using these switches. This was EXCELLENT, and when the
electronic version comes out, I will pass it on.
3) Understanding your cell library. This was boring, totally irrelevant
to our needs, and incomprehensible.
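As a rough illustration of the pre-synthesis versus post-synthesis mismatch
described in item 2 above, here is a toy Python model. It is not taken from
the paper: the interrupt-decoder example, the function names, and the
assumed gate-level result are assumptions made for illustration, based on
the usual reading of parallel_case (the tool is told that case items never
overlap, so it drops the priority logic). The Verilog being modelled is
shown in the comments.

```python
# Verilog being modelled (illustrative, not from the paper):
#
#   always @(irq) begin
#     {int2, int1, int0} = 3'b000;
#     casez (irq)  // synopsys parallel_case
#       3'b1??: int2 = 1'b1;
#       3'b?1?: int1 = 1'b1;
#       3'b??1: int0 = 1'b1;
#     endcase
#   end


def simulate(irq):
    """Pre-synthesis behavior: items are tested in order, first match wins."""
    if irq & 0b100:
        return (1, 0, 0)
    if irq & 0b010:
        return (0, 1, 0)
    if irq & 0b001:
        return (0, 0, 1)
    return (0, 0, 0)


def synthesized_with_parallel_case(irq):
    """Assumed gate-level behavior: each item decoded alone, no priority."""
    return ((irq >> 2) & 1, (irq >> 1) & 1, irq & 1)


# Inputs where the RTL simulation and the assumed gates disagree.
mismatches = [
    irq for irq in range(8)
    if simulate(irq) != synthesized_with_parallel_case(irq)
]
print([format(irq, "03b") for irq in mismatches])
```

In this model every input with more than one request bit set behaves
differently after synthesis, which is exactly the kind of problem that only
gate-level simulation would catch.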
The next user session was "Design Reuse/IP Prototyping", which was again
split into three presentations.
1) VHDL Coding Styles for Synthesizable, Reusable Designs. Behavioral
(RTL). VHDL has features that can greatly enhance the process of
reusing a design. High-level languages provide ways to optimize and
modify reusable designs without compromising the integrity of the
underlying design. In this paper, we explore several methods of
coding designs that allow user-customizable features and optimal
synthesis (performance and gate count), using the native language
features of VHDL. We examine the use of the following features, which
make the code reusable without hampering its synthesizability:
- Generics
- A Package of constants
- Generate statements
- Tying of ports and leaving ports "open"
- Unconstrained arrays and aggregates
- Configuration specifications
- VHDL attributes
- Block statements
This paper also presents real-life
examples of such reusable constructs and the results of synthesis using
Synopsys. Designers can apply any or all of these ideas to make their
designs more feature-rich, efficient, and highly reusable to create
multi-million gate System-On-A-chip designs. This was an excellent
presentation, and maybe we should look into it.
2) Design Methodology for Developing IP in FPGA's for ASIC Production.
This was all about trying to keep the ASIC flow and the FPGA flow as
near to identical as possible. Interesting, but not applicable to us.
3) Designing for Flexibility. This was about busses, and the different
schemes available. It was badly presented, and I lost the thread. It
presented three busses - Non-Pipeline, Pipelined, and Out of Order, and
that was essentially it. If you are interested in busses, then this may
be interesting.
The last user session was "High-Level Verification", which was again split
into three presentations.
1) A Recipe for Multi-million Gate ASIC Verification. This was
interesting, and I came away with the following conclusions:
a) Have a verification plan, and make it at the beginning of the project.
b) Set up the verification infrastructure early, and allow the RTL
designers to regress against it.
c) Use Bus Functional Models, and simple behavioral models that everyone
has access to.
d) Have a verification team.
2) Object Oriented Approach to Verification. The task of verifying the
correct behavior of an Application Specific Integrated Circuit (ASIC)
is challenging. A hardware verification team must interpret the
specification of one or more bus protocols and the ASIC, then verify
that the ASIC behaves as specified while adhering to the bus protocols.
Typically the verification process involves modeling the bus protocols
and ASIC in a hardware simulator using a hardware description language.
Hardware description languages are well suited to describe the behavior
of circuits, but fall short when used as a verification language.
Hardware description languages don't provide any high level programming
support such as strong type checking, user defined data types and
generic programming. These features are beginning to become a
necessity to create an effective verification test environment. This
paper will discuss how Verilog, C++ and Object Oriented methodologies
have been used to create verification environments at Sun. It will
also discuss Object Oriented methodologies that can be used to help
decrease development time and increase the quality of verification
environments. These methodologies aren't language dependent and will
work for Vera as well as they do for C++. Interesting, could be useful.
3) Application of MicroPlatform Engineering Techniques in the Development
of VGA Compatible Cores. By working on multiple abstraction levels
the hardware-software co-verification of the INT416-SM and INT416-EX
VGA compatible cores was accelerated. Device drivers for the VGA and
the VGA HDL model were developed concurrently, communicating through the
Primitive Device Drivers/Hardware Abstraction Layer/Bus interface. Our
system initially modeled the RTL for the VGA core using VCS, using a
PLI/socket interface to the PDD. The CPU, bridge, and memory subsystems
were initially abstracted as software processes, and later added to the
RTL HDL simulation. Foundation Express/FPGA Express was used to
implement the INT416-SM configuration onto a Xilinx Virtex XCV300 FPGA.
The key point out of this talk was to develop models concurrently. My
brain was fried trying to follow this guy and decipher the
project-specific information. I don't think it was totally useful.
And that was it.
- [ Original Author Unknown ]