( BSNUG 00 Item 4 ) ------------------------------------------- [ 10/13/00 ]
Subject: VCS, Verilog-XL, Scirocco VHDL, VSim, Roadrunner, Radiant, FSMs
SPEED IS EVERYTHING: So there's some controversy around SystemC. Today's
designers still use Verilog and/or VHDL to make chips, so the tech meat
on how to maximize your VCS and/or Scirocco simulations was eagerly lapped
up by the Boston SNUG audience -- especially the detailed talk on how to
use the Roadrunner and Radiant speed-ups in VCS.
"FB4 - Maximizing VHDL Simulation Performance 3 stars (out of 3)
This presentation is definitely worth reading. Even though it was
geared toward VHDL simulators, and we are utilizing Verilog
simulators, the concepts transfer pretty much intact to Verilog-XL
and VSim. The main thrust of this session was to understand how
typical simulators work and thus, how to code efficiently to minimize
simulation runtimes. Keep in mind: the Scirocco simulator is a true
cycle-based simulator; VSim and Verilog-XL are not. Some of the
techniques described will not apply to our (current) simulators.
The techniques presented can, and should, be used both in
non-synthesizable and synthesizable Verilog code. Main themes: reduce
the number of events that the simulator has to handle; use as high an
abstraction level as possible when modeling; make sure you are using
optimally coded RAM models; don't use gate level models if at all
possible - they are very compute intensive; watch the number of inputs
in sensitivity lists; perform operations only when needed - don't
set up default conditions before other conditions are tested;
structure 'if' statements to reduce common sub-expressions... yada,
yada, yada. You get the picture. There are a *host* of things one
can do to speed up simulations. Read this presentation!"
- Brian Fall of Microchip Technology, Inc.
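One of the themes above, "perform operations only when needed", can be
sketched in Verilog. This is a minimal sketch and not code from the FB4
presentation: the module name, port names, and widths are hypothetical,
and how much it helps will depend on the simulator and the design.

```verilog
// Hypothetical example: names and widths are illustrative only.
module sum_or_product (
    input  wire        sel,
    input  wire [15:0] a, b,
    output reg  [31:0] y
);
    // Slower style, shown here only as a comment for contrast:
    // both results are computed on every evaluation and one is
    // thrown away.
    //     prod = a * b;
    //     sum  = a + b;
    //     y    = sel ? prod : sum;

    // Compute only the result the selected branch needs. The
    // sensitivity list is complete but holds nothing extra.
    always @(sel or a or b) begin
        if (sel)
            y = a * b;
        else
            y = a + b;
    end
endmodule
```

The multiply is evaluated only when sel is high, so evaluations with sel
low skip it entirely.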
"We don't use VCS. We did our own benchmarks of it against NC-Verilog
and found NC-Verilog to be ~2X faster than VCS. We're a small customer so Synopsys
never had the time to teach us the VCS switches to make it run faster."
- an anon engineer
"VCS Tutorial: John Girad, Massoud Eghtessad both of Synopsys.
Overall: Somewhat of a repeat, but some useful new stuff.
a) Roadrunner: If you use a coding style (RTL synth subset for
the most part), Roadrunner (RR) will automagically divide your
code up into 4state/event and 2state/cycle divisions. The mapped
2-state logic should run much faster, but of course accuracy is
sacrificed.
b) VCS now takes always blocks with the same sensitivity list and
merges them into one big block. This speeds up sims.
c) Don'ts for VCS speed:
- no async
- no feedback
- no '#' or time variables
- do not use case, for, etc (all but simple if)
- no '<=' (nonblocking)
d) Do's for VCS speed:
- sync
- full sensitivity list
- simple 'if'
- blocking
e) '#1' delays: +nbaopt option gets rid of them.
f) Instead of adding a # delay in the middle of an always block
(which kills VCS speed optimization), put the code in a task
and call the task with the '#' delay in front (# taskname;)
g) **** Even if the `ifdef SCAN that enables your $recordvars; is not
turned on, this slows down the sim a lot.
h) VCS has a new PLI learning mode, where it monitors all the
PLI read and write calls, and then makes updates to optimize
the PLI interface to speed things up.
i) Got a non-response (political) to my question about whether the
non-PLI Vera interface is still happening. My guess is no.
j) **** VCS will actually stop (at any line of code, randomly)
and start running any other 'always' block with the same
sensitivity list. It can do this recursively through all
'always' blocks with the same sensitivity list.
k) VCS version 6.0 has a new option called +alwaystrigger, where
the compiler will compile all 'always' blocks first, then do the
'initial' blocks.
l) VCS 5.1 and higher has race catcher/analyser (use +race=all).
Not a dynamic checker, but more a lint/parser deal that looks
for common code mistakes.
m) Radiant Technology:
- PreVCS Preprocessor: have to turn on
- config file allows directing Radiant on certain blocks
- use +rad switch to turn it on.
- up to 20x improvement
- more people using this
- kills debugging, SDF, timing checks
- basically Radiant massages the code to get rid of redundant
events. Of course the code no longer matches and thus any
wave output files might be hosed.
- +rad_1 is a relaxed (sort of a 'Radlite') mode
n) 2-state technology:
- 'off' by default
- up to 2x
- not too many people using it
- useful on RTL regression and functional verif
- no strengths
- X's go to 1 and Z's go to 0 (can't change this)
- tri-state maps to logical ORs
- regs initialize to all 0's, wires initialize to all 0's
- '===' identity comparisons, as well as 'casez' and even 'if'
statements, are all complicated by the 2-state mapping
- to check it out a) run VCS with -Xman=4 to get a global
Verilog file of your design (tokens.v). Then parse the file
- Can't do 4-state = 2-state assignments. Must be 4 = 4
- Can do 2-state = 4-state, but strength is lost and x and z
map to 1 and 0 respectively.
o) Codecover: Covermeter
New: functional coverage with user defined expressions. Little
Verilog-esque language to write checkers to check for or against
something happening. Could do most of these in E or Vera, but
these are tied into the entire code coverage tool.
In other parts of the Boston SNUG, the talks on the transfer to Tcl
synthesis and the $assert watchdog approach for verification were my
personal favorites. My talk on how to generate a verification plan
quickly was well received, too."
- Peet James of Qualis Design
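The 2-state mapping reported in item (n) above (x goes to 1, z goes to
0) can be modeled in plain Verilog to see why '===' tests get
complicated. This is only a sketch of the rule as reported. It is not
how VCS implements its 2-state mode, and the module and function names
are hypothetical.

```verilog
// Model of the reported 2-state mapping for a single bit.
// Illustration only; not how VCS implements 2-state mode.
module two_state_model;

    // x -> 1, z -> 0, and 0/1 pass through unchanged.
    function to_2state;
        input v;
        begin
            case (v)               // 'case' matches x and z literally
                1'bx:    to_2state = 1'b1;
                1'bz:    to_2state = 1'b0;
                default: to_2state = v;
            endcase
        end
    endfunction

    reg a;

    initial begin
        a = 1'bx;
        // In 4-state simulation this identity test is true.
        $display("4-state: (a === 1'bx) = %b", a === 1'bx);
        // After the mapping the x is gone, so the same test is false.
        $display("2-state: a = %b, (a === 1'bx) = %b",
                 to_2state(a), to_2state(a) === 1'bx);
    end
endmodule
```

Any testbench code that relies on seeing an x or z (reset checks,
bus-float checks) would behave differently under such a mapping, so
that code is worth reviewing before switching a regression to 2-state.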
"TB2 - FSM Designs with Glitch-free Outputs 2 stars (out of 3)
This paper received the '2nd Best Paper' award for the conference.
Second time the author, Cliff Cummings, has captured a SNUG title. He
promotes a unique method of coding Finite State Machines so that all
the outputs are registered without incurring the heavy area penalty of
simply slapping registers on all the combinatorial signals output from
a module. Instead of having separate registers for the state encoding
and for any module outputs, a one-hot state encoding strategy is used
and any combinatorial outputs are assigned to be asserted during their
appropriate state. For all outputs which assert in unique states, no
other registering is required. For outputs which do not occur in unique
states, simply add an additional 'state' to the state encoding and the
output is now registered. The idea is fairly straightforward, but
requires a little bit of paperwork to map out exactly which outputs get
assigned in what states and which outputs will require the addition of
an additional 'state' for assertion. Worth a read for another design
technique."
- Brian Fall of Microchip Technology, Inc.
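The idea described above, where state register bits double as the
module outputs, can be sketched as follows. This is a minimal sketch
and not code from the TB2 paper: the four-state read controller, its
inputs (go, ws), its outputs (rd, ds), and the encoding are all
hypothetical. The sketch is not strictly one-hot. It simply picks an
encoding in which each output is one bit of the state register, plus
one extra bit to separate two states that would otherwise drive
identical outputs.

```verilog
// Hypothetical read controller with outputs taken straight from
// the state register, so they are registered and glitch-free.
module fsm_outenc (
    input  wire clk, rst_n, go, ws,
    output wire rd, ds
);
    // State encoding is {extra, ds, rd}. READ and DLY both drive
    // rd=1, ds=0, so the extra bit is added to tell them apart.
    localparam [2:0] IDLE = 3'b0_00,
                     READ = 3'b0_01,
                     DLY  = 3'b1_01,
                     DONE = 3'b0_10;

    reg [2:0] state, next;

    // State register with asynchronous active-low reset.
    always @(posedge clk or negedge rst_n)
        if (!rst_n) state <= IDLE;
        else        state <= next;

    // Next-state logic.
    always @(state or go or ws) begin
        case (state)
            IDLE:    if (go) next = READ; else next = IDLE;
            READ:    next = DLY;
            DLY:     if (ws) next = READ; else next = DONE;
            DONE:    next = IDLE;
            default: next = IDLE;
        endcase
    end

    // Outputs are just the low two state bits; no output decode logic.
    assign {ds, rd} = state[1:0];
endmodule
```

The paperwork mentioned in the review corresponds to building the
encoding table in the localparam list: one column per output, with
extra columns added only where two states share the same output
pattern.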