It's amazing where Synopsys/Verilog/VHDL consulting can take you.  This week
  I'm working a 10 day contract in Tel Aviv, Israel.  Because it's such an 
  interesting place, I brought my fiancee along.  Mind you: my impression of
  Israel is completely different than hers!  *Her* Israel consists of either
  spending a nice, lazy day on a sunny beach or touring places like Nazareth 
  or Jerusalem.  *My* Israel consists of waking up at 6 AM every morning to
  have a lively discussion with her about how much spending money I'm going 
  to give her for the day, followed by working in a dull, windowless office 
  on various engineering tasks until around 9 PM every night.  I only get to
  "see" Israel through the stories my fiancee tells me at the end of the day.
  (I've also been promised that I'll get to see actual photos of some of these
  places she has seen once we get back home!)  My married Israeli co-workers
  tell me: "Get used to it.  That's the way life is after you get married!"

  My only other complaint about work in Israel is that I got a bad contract.
  (Yesterday I read in the "Jerusalem Post" that a captured would-be Islamic 
  Jihad suicide bomber was getting paid $6,000 (for his family) *PLUS* "the 
  prospect of 72 virgins in paradise.")  If I had known it was a standard 
  business practice in the Middle East to offer virgins for contract jobs I
  would have asked for some, too!  Damn!  :^)
                                                  - John Cooley
                                                    the ESNUG guy

( ESNUG 235 Item 1 ) ---------------------------------------------- [3/96]

  [ Editor's Note: Because Dave's paper is so long, I'm breaking it up
    over two ESNUG's (235 & 236) sent on the same day.  - John  ]

Subject: (ESNUG 234 #5)  "What's The Non-Propaganda Story On BC and DW?"

>I'm looking for information / real user testimony on ESNUG on Behavioral
>Compiler.  I am also interested in the use model for DesignWare.  We have
>been doing design for several years and have only been using the "free" DW
>components.  Now with Behavioral Compiler, my Synopsys sales rep is really
>pushing DesignWare, saying that without it BC can't work its magic.  (The
>only facts I can distill from my AE and the databooks is that I can't design
>pipelined arithmetic without DW.)  He is also saying that in our industry,
>there is usually a DesignWare seat for every Design Compiler seat, which I
>found hard to believe.


From: dblack@apple.com (David C. Black)

John, I'll answer this by giving you my paper from the recent SNUG '96
meeting titled "Experience & Suggestions on Using Behavioral Compiler".

  - David C. Black
    Apple Computer


Design Characteristics
----------------------

Our project was to design an ASIC that mixed graphics and video for
presentation on a television set.  The design includes several features
involving vertical convolution that improve the appearance of the graphics
on interlaced displays for both NTSC (U.S.) and PAL (European)
televisions.  Since the design interfaces and manages standard DRAM,
internal line buffers and lookup tables were used to meet performance
goals.  These internal buffers and tables were implemented with embedded
memories.  The technology characteristics of our design were:  CMOS 0.5
micron 100k gate array, Plastic Quad Flat Pack, and a 54 MHz main clock.

The following are design elements:

  - Video processing "pipeline"
  - DRAM controller
  - Asynchronous processor bus interface

Additionally, we chose to adhere to the following constraints:

  - Synchronous design including reset
  - Other clock domains synchronized to main clock (using RTL)

The final implementation included:

  - 38k bits embedded synchronous SRAM
      - 4 dual ports
      - 3 large single port 10k bits
      - 2 misc. single ports
  - Approximately 60k gates random logic


EDA Design Environment
----------------------

At the outset of the project, our group had no EDA tools for ASIC design.
Due to our remote location from main R&D, we were also left to support the
workstations ourselves.

Our SparcStation was actually acquired for an entirely different purpose,
with EDA support as a secondary concern.  Relatively soon into the design,
we realized it was a necessary part of the environment because of the
development approach used by Synopsys.  It turned out that some bugs in BC
caused fatal errors on HP, whereas on the Sparc architecture intelligible
error messages were issued.

Specialized Synopsys Scripts
----------------------------

- I/O ring synthesis (table driven)
- Structural stitching (bottom up)
- DW memory wrapper generation
- DW simple function implementation
- Log file checker
- Statistics tools (lines of code, area, timing, bug management)

Motivations
-----------

We chose to use Synopsys Behavioral Compiler because it promised to
provide a higher level of abstraction that would enable easier coding and
faster simulation.  This would improve the design cycle time by letting us
focus less on lower level details and more on the desired functionality.

Additionally, a commensurate reduction in text should reduce typos.
Reductions in debug time could also be realized because behavioral designs
tend to simulate faster.  For video design, this is a real issue.  A single
screen of data could easily take 8-10 hours using conventional RTL
simulation.  Behavioral simulations cut this in half.

Another aspect of higher levels of abstraction is the relative ease in
making changes.  Because state machine coding and register assignment are
avoided, commensurate reductions in complexity would make the code easier
to change.  This also would aid reusability of the code.

Finally, there are a number of optimization and design exploration
features in Synopsys Behavioral Compiler that would improve code size.

Since the video processing portion of the design is a natural pipeline,
the pipeline optimization features would be extremely handy.

NOTE: As an unexpected benefit, the Behavioral Compiler training course
uses an image processing design for the main example.

Conventions - File Naming
-------------------------

Source files were given a common naming template as follows.  All ASIC HDL
source was contained under a single directory (hdl/), and every Verilog
module was placed in its own directory.  Verilog module names were
restricted to always begin with an uppercase letter.  Source files were
named <Module_Name>_<Type>.v.m4, where Type was one of 'beh' or 'RTL'.  A
makefile rule processed each source file and named the resulting Verilog
file <Type>.v within the module's directory.  This allowed the Synopsys
script to quickly find the source code and determine what to do with it
(behavioral scheduling or RTL compilation).

  - Aids module/file identification for other readers (esp. when tracking
    down problems)
  - Allows automation for some tasks
  - Aids avoidance of problems (e.g. module name collision)
  - Aids scripting mixed behavioral/RTL design


Conventions - Default Scripts
-----------------------------

A directory was established to contain commonly used dc_shell scripts.  A
common .synopsys_dc.setup was implemented by setting up a symbolic link
from each user's home directory to the common script directory.  The
common .synopsys_dc.setup script then set up a search path that included
the current directory and the script directory, in that order, to allow
overrides.  This script used the get_unix_variable() function to obtain
guidance for selecting some of the other setup scripts.  For example,
get_unix_variable(VENDOR) yielded a name used to select the target vendor
setups.  This allowed quick retargeting of the netlist by creating a new
script containing vendor specific information.

The common script directory included a script for each of the following
tasks:

  - Determining type of module and file names
  - Behavioral analysis and scheduling
  - Default constraints with overrides
  - RTL compilation

This allowed focus on the coding, and only required overrides for the
exceptions from general rules.  In general, this simplified development
once the common scripts were in place.  This approach also avoids
reinventing the wheel for each module and ensures the general fixes are
applied to all modules.  For designs with multiple engineers, the benefits
of consistency would be even greater.

The scripts take advantage of file naming conventions.  Source Verilog was
always named either beh.v or rtl.v, and a similar approach was taken for
all output files.

Signal Naming Conventions Aided Debug of BC Related Architecture
----------------------------------------------------------------

An interesting aspect of Synopsys implementation of behavioral coding is
the concept of "signals" versus "local variables".  Signals are basically
your primary I/O; although, in a design containing more than one "always"
block, a signal is any wire or register that crosses always block
boundaries.  Local variables on the other hand are those entities that
stay entirely within the bounds of the "always" block.

An aspect of Verilog I was not familiar with was that registers can be
declared within the scope of a named begin/end block.  It was my practice to
define all variables at the front of the code.  This makes local variables
stand out better.

Signals are considered to be either input or output resources.  Reading an
input or writing an output is a scheduled event and has important
implications to the resultant code.  On the other hand, local variables
are used without similar scheduling impacts. Additionally, output signals
are the only items that may use Verilog non-blocking assignments; whereas,
local variables must use blocking assignments. Thus it is important to
keep these elements distinct within your code.
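
A minimal sketch of the distinction, under stated assumptions: the module,
its ports and the arithmetic are hypothetical (none of them come from the
design), and the code is plain Verilog written for simulation, not checked
against BC.  Note that Verilog allows such declarations only inside a named
block, and only for variables such as reg, not for wires.

```verilog
// Hypothetical module; names and arithmetic are illustrative only.
module Scale (Clk, Pixel_i, Gain_i, Result_o);
    input        Clk;
    input  [7:0] Pixel_i;    // signal: primary input
    input  [7:0] Gain_i;     // signal: primary input
    output [7:0] Result_o;   // signal: primary output
    reg    [7:0] Result_o;

    always begin : MAIN
        // Local variable: declared inside the named block and never
        // visible outside this always block.
        reg [15:0] Product_r;

        @(posedge Clk);
        Product_r = Pixel_i * Gain_i;    // local variable: blocking
        Result_o <= Product_r[15:8];     // output signal: non-blocking
    end
endmodule

// Small testbench so the sketch can be simulated on its own.
module Scale_tb;
    reg        Clk;
    reg  [7:0] Pixel, Gain;
    wire [7:0] Result;

    Scale dut (.Clk(Clk), .Pixel_i(Pixel), .Gain_i(Gain), .Result_o(Result));

    initial begin
        Clk = 0; Pixel = 8'd200; Gain = 8'd128;
        #5 Clk = 1;                      // one rising edge
        // 200 * 128 = 25600 = 16'h6400, so the upper byte is 100.
        #1 $display("Result = %0d", Result);
        $finish;
    end
endmodule
```

Keeping Product_r inside the named block means a tool (or a reader) can
tell at a glance that it never crosses the always block boundary.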

I quickly settled on the idea of maintaining a naming convention to aid me
in this effort.  By using a consistent suffix convention, I was also able
to write tools that would automatically ferret out some mistakes.

	input  TestMode_il;     // used to speed testing of large counter
	output Clk90MHz_o;      // exported to audio control logic
	output Clk15Hz_o;       // watchdog timer granularity
	input  Clk;
	input  Reset;
	reg    [15:0] Count_r;


Applicability of Behavioral Code
--------------------------------

A natural question when first introduced to Behavioral Compiler is "How
much of my code will be applicable to behavioral coding?"  When we first
approached this question, the thought was to limit behavioral compilation
to only the video processing sections of the design.  Since this was a
major portion of the code, it would yield an acceptable result (cost of
tool versus benefits).

After taking the training course and diving into initial coding, we were
surprised to discover that many more modules would benefit from the
behavioral coding approach.  Keep in mind that behavioral compiler
produces a synchronous design with a state machine and datapath.  Since our
design was fundamentally synchronous by design practice, anywhere a state
machine was present was an opportunity for behavioral code.
The resultant reduction in coding was considered enough to justify the use.

Here are some statistics from the design:

  - 51 modules with 8K lines unexpanded m4 source (34% comments).
  - Hierarchy 6 levels deep (2 template and 4 functional).
  - 18 behavioral modules (5500 lines)
  - 19 RTL modules (2100 lines)
  - 14 structural modules (650 lines)


Planning Issues
---------------

   + Use the largest possible display and a spreadsheet program to view
     schedule tables

One feature of Behavioral Compiler is its ability to create large tables
of output information about the scheduling of operations in your design.
With large tables comes the problem of how best to view the information.
My preference has been to import the files (tab delimited) into a
spreadsheet and view them on a large monitor display.

   + Some designs are more naturally represented in RTL

It seemed obvious at the outset that some modules would not fit the
behavioral paradigm.  Modules with asynchronous clocking issues are the
obvious case.  Some bus protocol interactions were more easily represented
in RTL.  Interestingly, a timing generator for the video pipeline seemed
more natural to code in RTL.

On the other hand, many more modules fit the behavioral style than
initially thought.  Anything with a state machine basically fits the
model, and you don't have to code the state machine directly!  Just model
the behavior and let the compiler do the work.

   + Keep Behavioral and RTL code in separate modules

Although BC provides mechanisms to allow mixing behavioral and RTL code in
the same Verilog module, I found the results to be not worth the price.
There are interactions between the code that need to be watched, and that
result in errors if not managed carefully.  For example, signals crossing
the always block boundary are considered "signals" as discussed earlier.
If you use RTL to assign output values from within the behavioral code you
may be creating registers or states where you did not expect.

I did try the careful mixing at first, but quickly concluded that it was not
worth the risk.  I believe the KISS (Keep It Simple Stupid!) principle
applies.

   + Pipelining behavioral description takes some thought

One of behavioral compiler's strengths is its ability to automatically
pipeline code and allow you to explore tradeoffs in latency.  Video
processing is quite applicable to the pipelining approach, and one of our
modules uses this to great advantage.  Unfortunately, some restrictions
prevented utilizing this feature in two other modules.

For example, pipelines are restricted to forever loops, and may not
contain while loops themselves.  Careful use of if-else or case statements
within the forever can get around these restrictions, but it takes some
careful thought.  I may have missed a few opportunities to do this because
I was not used to thinking of the behavior in this manner.
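
A minimal sketch of that kind of restructuring, assuming a hypothetical
accumulator that waits for valid data.  The module and signal names are
invented for illustration, and the code is plain Verilog for simulation;
whether BC accepts and pipelines this exact form was not verified.

```verilog
// Hypothetical accumulator; names are illustrative, not from the design.
module Accum (Clk, Valid_i, Data_i, Sum_o);
    input         Clk;
    input         Valid_i;
    input  [7:0]  Data_i;
    output [15:0] Sum_o;
    reg    [15:0] Sum_o;

    always begin : MAIN
        reg [15:0] Sum_r;
        Sum_r = 0;
        Sum_o <= 0;
        // Nested-loop form that the restriction rules out:
        //     forever begin
        //         @(posedge Clk);
        //         while (!Valid_i) @(posedge Clk);
        //         Sum_r = Sum_r + Data_i;
        //         Sum_o <= Sum_r;
        //     end
        // Flattened form: one clock edge per pass, with the wait folded
        // into an if statement.
        forever begin : FOREVER_LOOP
            @(posedge Clk);
            if (Valid_i) begin
                Sum_r = Sum_r + Data_i;
                Sum_o <= Sum_r;
            end
        end
    end
endmodule

// Small testbench so the sketch can be simulated on its own.
module Accum_tb;
    reg         Clk, Valid;
    reg  [7:0]  Data;
    wire [15:0] Sum;

    Accum dut (.Clk(Clk), .Valid_i(Valid), .Data_i(Data), .Sum_o(Sum));

    always #5 Clk = ~Clk;                // rising edges at t = 5, 15, 25, ...

    initial begin
        Clk = 0; Valid = 0; Data = 8'd3;
        #12 Valid = 1;                   // valid at the edges t = 15 and 25
        #20 Valid = 0;                   // idle again from t = 32
        // Two samples of 3 were accepted, so Sum should be 6.
        #20 $display("Sum = %0d", Sum);
        $finish;
    end
endmodule
```

Both forms take the same number of clocks per sample; the flattened one
simply makes every pass through the forever loop the same length.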

Simulation Issues
-----------------

   + Scheduling errors are functional problems too (e.g. catch unused
     inputs/operations)

Part of the methodology called for all modules to be run quickly through
Synopsys to ensure synthesizability.  One benefit to this approach is that
some functional bugs are caught prior to long simulation runs.  For
example, one particular piece of code had a rectangle bounds comparison.
An oversight in cut and paste failed to use both ends of the rectangle,
but instead used the end comparison twice.  BC detected this by noting
that the beginning comparison register assignment was unused.  In fact BC
does rigorous checks for all resources and quickly located doubly
initialized registers as well.
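
To make the class of bug concrete, here is a small sketch under stated
assumptions: the bounds, function names and values are hypothetical and
are not taken from the design.  It only shows why simulation alone may not
flag the slip, while an unused-resource check can.

```verilog
// Hypothetical bounds check; names and values are illustrative only.
module RectCheck_tb;
    parameter X_START = 10;
    parameter X_END   = 20;

    // Cut-and-paste slip: the end comparison appears twice, so X_START
    // is never read.
    function InRectBuggy;
        input [9:0] X;
        InRectBuggy = (X <= X_END) && (X <= X_END);
    endfunction

    // Intended form: both ends of the rectangle are used.
    function InRect;
        input [9:0] X;
        InRect = (X >= X_START) && (X <= X_END);
    endfunction

    initial begin
        // X = 5 lies left of the rectangle, yet the buggy form accepts it.
        $display("buggy=%b fixed=%b", InRectBuggy(10'd5), InRect(10'd5));
        $finish;
    end
endmodule
```

Both functions are legal Verilog and simulate without complaint; only a
check that every assigned or declared resource is actually used points at
the missing start comparison.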

   + Macros for "posedge clock w/ Reset" aided readability and editing

In order to synthesize synchronous resets into a design, BC requires a
specific coding style: every clock edge is followed by a test of Reset
that disables the block named RESET_LOOP.  To simplify typing and aid
reading, we set up simple text macros to shorten this.

	@(posedge Clk); if (Reset) disable RESET_LOOP;

becomes

	@CLOCK;
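
A minimal sketch of how the expanded line sits inside a complete block,
assuming a hypothetical 4-bit counter.  The macro below is a Verilog
preprocessor stand-in chosen to keep the example self-contained; the macro
definition actually used in the project is not shown in this paper, and
the sketch was written for simulation, not checked against BC.

```verilog
// Stand-in for the text macro, written with the Verilog preprocessor
// (an assumption made for this sketch).
`define CLOCK (posedge Clk); if (Reset) disable RESET_LOOP

// Hypothetical counter; names are illustrative only.
module Counter (Clk, Reset, Count_o);
    input        Clk;
    input        Reset;
    output [3:0] Count_o;
    reg    [3:0] Count_o;

    always begin : RESET_LOOP
        // Reset actions: run at start-up and again after every disable,
        // because the always statement restarts the disabled block.
        Count_o <= 0;
        @`CLOCK;
        forever begin : MAIN_LOOP
            Count_o <= Count_o + 1;
            @`CLOCK;
        end
    end
endmodule

// Small testbench so the sketch can be simulated on its own.
module Counter_tb;
    reg        Clk, Reset;
    wire [3:0] Count;

    Counter dut (.Clk(Clk), .Reset(Reset), .Count_o(Count));

    always #5 Clk = ~Clk;                // rising edges at t = 5, 15, 25, ...

    initial begin
        Clk = 0; Reset = 1;
        #12 Reset = 0;                   // released before the edge at t = 15
        // Edges at t = 15, 25, 35 and 45 each add one, so Count should be 4.
        #40 $display("Count = %0d", Count);
        $finish;
    end
endmodule
```

Because the Reset test follows every clock edge, asserting Reset on any
cycle sends the behavior back to the top of RESET_LOOP, which is what
makes the reset synchronous.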

   + Verilog # delays needed for interfacing to real world timing models
     (e.g. memories)

An aspect of the design that caught us unaware was the need to insert
appropriate #delays around the posedge clocks.  The internal synchronous
RAM model used was able to provide a value after a single clock.
Simulation models provided accurate timing, which caused some problems.
Consider an external RAM interface:

	@CLOCK;
	RAM_Addr_o = value;
	RAM_EN_o = `TRUE;
	@CLOCK;
	Data_r <= RAM_Q_i;
	@CLOCK;

The above code doesn't work unless the external RAM model has zero hold
from the clock edge.  Inserting a #delay makes this work.  The Synopsys
translate_on/off pragmas were used to avoid BC warning messages (obviously
the clock period must be greater than hold period).

	@CLOCK;
	RAM_Addr_o = value;
	RAM_EN_o = `TRUE;
	@CLOCK;
	/* synopsys translate_off */
	#`RAM_HOLD;
	/* synopsys translate_on */
	Data_r <= RAM_Q_i;
	@CLOCK;

In practice, we reduced this to a simple text macro.

   + Verilog # delays were needed to recreate pipeline latency

Pipelines are a subject known to have simulation timing issues up front.
From a synthesis coding point of view, only a single clock was needed for
the main graphics pipeline.  For simulation, we inserted Synopsys
translate_off/on pragmas around additional clocks to match the initiation
interval, and finally added a #delay to account for the latency.

	forever begin :FOREVER_LOOP
	    Input_r = Input_i; // explicitly read inputs
	    functional_code
	    /* synopsys translate_off */
	    repeat (`INITIATION_INTERVAL - 1) @CLOCK;
	    /* synopsys translate_on */
	    Output_o <= /* synopsys translate_off */
	                #(CLOCK_PERIOD * (LATENCY_PERIODS - 1))
	                /* synopsys translate_on */
	                output_value;
	    @CLOCK;
	end //FOREVER_LOOP


 [ Editor's Note: This paper is continued in ESNUG post 236.  - John ]


Copyright 1991-2024 John Cooley.  All Rights Reserved.

   !!!     "It's not a BUG,
  /o o\  /  it's a FEATURE!"
 (  >  )
  \ - / 
  _] [_     (jcooley 1991)