( ESNUG 466 Item 7 ) -------------------------------------------- [07/12/07]

Subject: Testbench gotchas

7.1 Multiple levels of the same virtual method

    Gotcha: A virtual method can be re-defined multiple times, leading to
            confusion as to which virtual method is actually used.

A base class can have a virtual method.  If the base class itself is not
declared as a virtual (abstract) class, then its virtual method must have a
definition so that the base class can be constructed.  That is, when the
base class is constructed, it requires a stand-alone definition of its
virtual method.  When that base class is extended, the child class can
provide a new definition of the base class virtual method.  The child class
can repeat the virtual keyword, but the method remains virtual in derived
classes either way.  When the child class is extended to yet another child
level (grandchild), the grandchild can redefine the virtual method of its
parent.

The example below shows a base class, called base, with a virtual method
called print_something.  The base class is extended by class ext1.  In turn,
ext1 is extended by class ext2.  The print_something method is designated as
virtual in all three classes.  The gotcha is in understanding which method
is actually used when all three levels have different definitions for the
same virtual method.  The answer is that a constructed object only ever has
one active definition of a virtual method, and it is always the definition
from the most-derived class of that object, regardless of the type of the
handle through which the method is called.

   package tb_classes;

     class base;
       virtual task print_something;
         $display("printing from base");
       endtask: print_something
     endclass: base

     class ext1 extends base;
       virtual task print_something;
         $display("printing from ext1");
       endtask: print_something
     endclass: ext1

     class ext2 extends ext1;
       virtual task print_something;
         $display("printing from ext2");
       endtask: print_something
     endclass: ext2
   endpackage: tb_classes

   program tb_base;
     import tb_classes::*;
     base b = new;
     initial begin
       b.print_something;     // message displayed is "printing from base"
     end
   endprogram: tb_base

   program tb_ext1;
     import tb_classes::*;
     base b;
     ext1 e1 = new;
     initial begin
       b = e1;
       b.print_something;     // message displayed is "printing from ext1"
       e1.print_something;    // message displayed is "printing from ext1"
     end
   endprogram: tb_ext1

   program tb_ext2;
     import tb_classes::*;
     base b;
     ext1 e1;
     ext2 e2 = new;
     initial begin
       e1 = e2;
       b  = e1;
       b.print_something;     // message displayed is "printing from ext2"
       e1.print_something;    // message displayed is "printing from ext2"
       e2.print_something;    // message displayed is "printing from ext2"
     end
   endprogram: tb_ext2

This gotcha can be avoided by proper training and an understanding of how
System Verilog virtual methods work, coupled with adopting a proper
object-oriented verification usage methodology, such as the Synopsys VMM.
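
One point worth knowing when working with these overrides: a child class can
still reach its parent's version of a virtual method explicitly through the
built-in super handle, which keeps the override chain visible in the code.
A minimal sketch, assuming a hypothetical further extension called ext3 of
the ext2 class from the example above:

   class ext3 extends ext2;
     virtual task print_something;
       super.print_something;       // explicitly invokes ext2's definition
       $display("printing from ext3");
     endtask: print_something
   endclass: ext3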


7.2 Event trigger race conditions

    Gotcha: An event that is triggered in the same time step in which
            a process begins looking for the event may not be sensed.

Verilog provides a basic inter-process synchronization mechanism via the
event data type.  There are two gotchas associated with Verilog's event
synchronization.  The first gotcha may or may not really be considered a
gotcha, but experience has shown this to be an issue over the years.  That
is, many engineers don't know that the feature even exists in the language,
and are unaware of how to use it.  An engineer who had been using Verilog
for a number of years recently attended a Verilog training class with his
team.  When the section on event data types and usage was presented, the
engineer asked if this was something new with System Verilog.  The answer
was no, it has been in the language since the beginning.  To this, he
replied, "Why hasn't anyone told me about this before?"

The second, and more significant, gotcha is that there can easily be
simulation race conditions with Verilog's event triggering.  The
following code demonstrates this race condition.

   module event_example1;

     event get_data, send_data;   // handshaking flags

     initial -> get_data;         // trigger get_data event at time zero

     always @(get_data) begin     // wait for a get_data event
       ... // do code to get data
       ... // when done, trigger send_data
       -> send_data;              // sync with send_data process
     end

     always @(send_data) begin    // wait for a send_data event
       ... // do code to send data
       ... // when done, trigger get_data
       -> get_data;               // sync with get_data process
     end

   endmodule

In this simple example, the two always blocks model simple behavioral
handshaking, using the event data type to signal the completion of one
block and enable the other.  The initial block is used to start the
handshaking sequence.

The gotcha lies in the fact that, at simulation time zero, each of the
procedural blocks must be activated.  If the initial block activates
and executes before the always @(get_data) block activates, then the
sequence will never start.

How to avoid this gotcha: In Verilog, the only way to solve this issue is
to delay the trigger in the initial block from occurring until all the
procedure blocks have been activated.  This is done by preceding the
statement with a delay, as shown in the code below.

   initial #0 -> get_data;  // start handshaking at time 0, but after all
                            // procedural blocks have been activated

Using the #0 delay will hold off triggering the get_data event until all the
procedure blocks are activated.  This ensures that the always @(get_data)
block will sense the start of a handshake sequence at time zero.

But, using #0 is another gotcha!  The Verilog #0 delay is easily abused,
and does not truly ensure that the delayed statement will execute after all
other statements in a given time step.  Many Verilog trainers have
recommended that #0 never be used.  There are alternatives based on
nonblocking assignments that have more reliable and predictable event
ordering.  Not using #0 is a good guideline, except for event data types:
in Verilog, there is no way to defer an event trigger to the nonblocking
event queue.
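
For Verilog code that can use a variable instead of an event data type, the
nonblocking-assignment style of handshaking looks roughly like the sketch
below.  This is an illustrative sketch, not part of the original example;
the signal names simply mirror the event names used earlier.

   module nba_handshake;
     reg get_data = 0, send_data = 0; // handshake flags as variables

     initial get_data <= 1;       // the nonblocking update lands in the
                                  // NBA region, after all procedural
                                  // blocks at time zero have reached
                                  // their @() and are waiting

     always @(get_data) begin     // wait for get_data to change
       ... // do code to get data
       send_data <= ~send_data;   // signal completion by toggling
     end

     always @(send_data) begin    // wait for send_data to change
       ... // do code to send data
       get_data <= ~get_data;     // signal completion by toggling
     end
   endmodule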

System Verilog comes to the rescue with two solutions that will remove
the event trigger race condition, and remove the need to use a #0.

How to avoid this gotcha, solution 1: System Verilog defines a nonblocking
event trigger, ->>, that will schedule the event to trigger in the
nonblocking queue.  For the example in this section, this eliminates the
race condition at time zero, and eliminates the need for a #0 delay.
Triggering the get_data event in the nonblocking queue allows for the
always procedure blocks to become active before the event is triggered.

   initial ->> get_data;  // start handshaking at time 0 nonblocking
                          // queue, after all procedural blocks have
                          // been activated

How to avoid this gotcha, solution 2: System Verilog provides a second
approach that covers many more situations than the simple example shown in
this section.  This second solution uses a trigger persistence property
that makes the trigger visible throughout the entire time step, and not
just at the instantaneous moment when the event was triggered.

   module event_example2 ( ... );

     event get_data, send_data;   // handshaking flags

     initial -> get_data;         // trigger get_data event at time zero

     always begin
       wait(get_data.triggered);  // wait for a get_data event
       ... // do code to get data
       ... // when done, trigger send_data
       -> send_data;              // sync with send_data process
     end

     always @(send_data) begin    // wait for a send_data event
       // could have used wait(send_data.triggered) here also, but it is
       // not needed since there is no race condition between the two
       // always blocks
       ... // do code to send data
       ... // when done, trigger get_data
       -> get_data;               // sync with get_data process
     end
   endmodule

The wait(get_data.triggered) statement unblocks in any time step in which
get_data has been triggered.  It does not matter whether the trigger occurs
before or after the wait statement is reached.  So, in the example above,
if the initial block activates and executes before the first always block,
the trigger persistence will still be visible when the first always block
becomes active and executes the wait(get_data.triggered) statement.


7.3 Using semaphores for synchronization

    Gotcha: Semaphore keys can be added to a bucket without having
            first obtained those keys.

    Gotcha: Semaphore keys can be obtained without waiting for prior
            requests to be serviced.

The Verilog event data types have been used for years as a means to
synchronize procedural blocks.  But, this method of procedural handshaking
and communication is too limiting for modern, object-oriented verification
methodologies.  System Verilog provides two additional inter-process
synchronization mechanisms, semaphores and mailboxes, which offer more
flexibility and versatility than simple event triggering.  Both of these
new synchronization methods have subtle behaviors
that must be considered and worked with when being used.  This section
describes gotchas involving semaphores.  Sec 7.4, which follows, describes
the gotchas involving mailboxes.

Semaphores are like a bucket that can hold a number of keys or tokens.
Methods are available to put any number of keys into the bucket and to get
any number of keys out of the bucket.  The put() method is straightforward:
the number specified as an argument to put() is the number of keys placed
in the bucket.

One gotcha is that any number of keys can be placed into the bucket,
regardless of how many were retrieved from the bucket.  Thus, incorrect
code can keep adding more keys to the bucket than were retrieved from
the bucket.  Indeed, a process can add keys to the bucket without
having retrieved any keys at all.
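
A minimal sketch of how the key count can silently grow; the semaphore name
and task are illustrative only:

   semaphore lock = new(1);    // bucket created holding 1 key

   task bad_release;
     lock.put(1);              // GOTCHA: adds a key even though this
                               // process never called lock.get(1);
                               // the bucket now holds 2 keys, so two
                               // processes can hold the "lock" at once
   endtask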

How to avoid this gotcha: This gotcha has to be managed by understanding how
the semaphore get() and put() methods work, and using them properly.

A second gotcha occurs when a process has to wait for keys.  Keys can be
retrieved from the bucket by using the get() method.  The get() method is
blocking.  If the number of keys requested is not available, the process
suspends execution until that number of keys is available.

The get() method has a subtle, non-intuitive gotcha.  If the number of keys
requested is not available, then the request is put into a FIFO queue and
will wait until the number of keys becomes available.  If more than one
process requests keys that are not available, the requests are queued in the
order received.  When keys become available, the requests in the queue are
serviced in the order in which the requests were received, First In, First
Out.  The gotcha is that, when get() is called (a new request), an attempt
will be immediately made to retrieve the requested keys, without first
putting the request into the FIFO queue.  Thus, a new request for keys can
be serviced, even if other requests are waiting in the semaphore request
queue.  The following example demonstrates this gotcha.

   module sema4_example ( ... );

     semaphore queue_test = new; // create a semaphore bucket

     initial begin: Block1       // at simulation time zero...
       queue_test.put(5);        // bucket has 5 keys added to it
       queue_test.get(3);        // bucket has 2 keys left
       queue_test.get(4);        // get(4) cannot be serviced because
                                 // bucket only has 2 keys; therefore
                                 // request is put in the FIFO queue
       $display("Block1 completed at time %0d", $time);
     end: Block1

     initial begin: Block2
       #10                       // at simulation time 10...
       queue_test.get(2);        // GOTCHA! Even though the get(4) came
                                 // first, and is waiting in the FIFO
                                 // queue, get(2) will be serviced first
       queue_test.get(1);        // this request will be put on the fifo
                                 // queue because the bucket is empty;
                                 // it will not be serviced until the
                                 // get(4) is serviced
       $display("Block2 completed at time %0d", $time);
     end: Block2

     initial begin: Block3
       #20                       // at simulation time 20...
       queue_test.put(3);        // nothing is run from the FIFO queue
                                 // since get(4) is first in the queue
       #10                       // at simulation time 30...
       queue_test.put(2);        // get(4) and get(1) can now be serviced,
                                 // in the order in which they were
                                 // placed in the queue
       $display("Block3 completed at time %0d", $time);
     end: Block3
   endmodule

When a get() method is called and there are enough keys in the bucket to
fill the request, the requested keys are retrieved immediately, even if
there are previous get() requests waiting in the FIFO queue for keys.  In
the example above, the Block1 process begins execution at simulation time
0.  It executes until get(4) is called.  At that time, there are only 2
keys available.  Since the request cannot be filled, it is put on the
queue, and the execution of Block1 is suspended until 4 keys are available.

Next, a separate process, Block2 requests 2 keys at simulation time 10.
The get(2) executes and retrieves the 2 remaining keys from the bucket
immediately, even though there is the get(4) in the queue waiting to be
serviced.  The process then executes a get(1).  The request cannot be
serviced because the bucket is now empty, and hence is put on the queue.

At simulation time 20, the Block3 process puts three keys back in the
semaphore bucket.  The get(4) request sitting in the FIFO queue still
cannot be serviced, because there are not enough keys available.  There
is also a get(1) request in the queue, but it is not serviced because
that request was received after the get(4) request.  Once placed on the
queue, get() requests are serviced in the order in which they were
received.  At simulation time 30, two more keys are added, and the get(4)
can finally be serviced, followed by the get(1).

This gotcha of having a get() request serviced immediately, even when other
get() requests are waiting in the FIFO queue, can be avoided if the get()
requests are restricted to getting just one key at a time.  If a process
needs more than one key, then it needs to call get(1) multiple times, as
sketched below.  When the process is done, it can return multiple keys with
a single put(); it is not necessary to call put(1) multiple times.
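
A minimal sketch of this one-key-at-a-time style; the task and argument
names are illustrative only:

   task get_keys (semaphore sem, int num_keys);
     repeat (num_keys)
       sem.get(1);             // each 1-key request queues in FIFO order,
                               // so a later request cannot jump ahead
   endtask

   task put_keys (semaphore sem, int num_keys);
     sem.put(num_keys);        // all keys can be returned in one call
   endtask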


7.4 Using mailboxes for synchronization

    Gotcha: Run-time errors occur if an attempt is made to read
            the wrong data type from a mailbox.

A second inter-process synchronization capability in System Verilog is
mailboxes.  Mailboxes provide a mechanism for both process synchronization
and the passage of information between processes.  By default, mailboxes are
typeless, which means that messages of any data type can be put into the
mailbox.  The gotcha is that, when messages are retrieved from the mailbox
with the get() method, the receiving variable must be the same data type as
the value placed in the mailbox.  If the receiving variable is a different
type, then a run time error will be generated.

There are three ways of avoiding this gotcha.  First is the brute force
method of managing the data types manually.  The manual approach is error
prone.  It places a burden on the verification engineers to track what type
of data was put in the mailbox, and in what order, so that the correct
types are retrieved from the mailbox.

The second approach is to use the try_get() method instead of the get()
method.  The try_get() method retrieves the message via an argument passed
to try_get(), and returns a status value.  One of three values can be
returned:

 - Returns 1 if the message and the receiving variable are type
   compatible, and the message is retrieved.
 - Returns -1 if the message and the receiving variable are type
   incompatible, in which case the message is not retrieved.
 - Returns 0 if there is no message in the mailbox to retrieve.

The return value of try_get() can be processed by conditional statements to
determine the next verification action.  The following example illustrates
using a typeless mailbox and the put(), get() and try_get() methods.

   module mbox_example1 ( ... );
     logic [15:0] a, b;
     int i, j, s;
     struct packed {int u, v, w;} d_in, d_out;

     mailbox mbox1 = new;    // typeless mailbox

     initial begin
       mbox1.put(a);    // OK, any data type messages can be put in
       mbox1.put(i);    // OK, any data type messages can be put in
       mbox1.put(d_in); // OK, any data type messages can be put in

       mbox1.get(b);    // OK, the next message (a) is the same type as b
       mbox1.get(b);    // ERROR: b is wrong type for the next message (i)
       s = mbox1.try_get(d_out);  // must check status to see if OK
       case (s)
          1: $display("try_get() succeeded");
         -1: $display("try_get() failed due to type error");
          0: $display("try_get() failed due to no message in mailbox");
       endcase
     end
   endmodule

The third approach to avoiding a mailbox run-time error gotcha is to use
typed mailboxes.  These mailboxes have a fixed storage type.  The compiler
or elaborator will give an error if the code attempts to place a message
with an incompatible data type into the mailbox.  The get() method can be
safely used, because it is known beforehand what data type will be in the
mailbox.

The next example illustrates declaring a typed mailbox.

   typedef struct {int a, b;} data_packet_t;

   mailbox #(data_packet_t) mbox2 = new;    // typed mailbox

With this typed mailbox example, only messages of data type data_packet_t
can be put into mbox2.  If an argument to the put() method is any other
type, a compilation or elaboration error will occur.
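
A brief usage sketch of the typed mailbox declared above; the producer and
consumer blocks are illustrative only:

   data_packet_t pkt_in, pkt_out;

   initial begin                // producer
     pkt_in = '{a:1, b:2};
     mbox2.put(pkt_in);         // only data_packet_t messages are accepted
   end

   initial begin                // consumer
     mbox2.get(pkt_out);        // safe: the message type is known
     $display("got packet: a=%0d b=%0d", pkt_out.a, pkt_out.b);
   end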


7.5 Coverage reporting

    Gotcha: get_coverage() and get_inst_coverage() do not break down
            coverage to individual bins.

System Verilog provides powerful functional coverage for verification.  As
part of functional coverage, verification engineers define covergroups.  A
covergroup encapsulates one or more coverpoint and cross coverage point
definitions.  A coverpoint samples a specific expression in the design and
divides its values into one or more bins, where each bin covers specific
values or ranges of values of that expression.  Cross coverage specifies
coverage of combinations of cover bins.  An example covergroup definition:

   enum {s1,s2,s3,s4,s5} state, next_state;

   covergroup cSM @(posedge clk);
     coverpoint state {
       bins state1  = {s1};
       bins state2  = {s2};
       bins state3  = {s3};
       bins state4  = {s4};
       bins state5  = {s5};
       bins st1_3_5 = (s1=>s3=>s5);
       bins st5_1   = (s5=>s1);
     }
   endgroup

These covergroup bins will count the number of times each state of a state
machine was entered, as well as the number of times certain state transition
sequences occurred.

System Verilog also provides built-in methods for reporting coverage.  It
seems intuitive for coverage reports to list coverage by the individual bins
within a covergroup.  GOTCHA!

When the System Verilog get_inst_coverage() method is called to compute
coverage for an instance of a covergroup, the coverage value returned is
based on all the coverpoints and crosspoints of the instance of that
specific covergroup.

When the System Verilog get_coverage() method is called, the computed
coverage is based on data from all the instances of the given covergroup.

The gotcha with coverage reporting is that the computed coverage is based
on coverpoints and cross points as a whole.  There are no built-in methods
to report the details of the individual bins within a coverpoint or cross
point.  If the coverage is not 100%, the designer has no way to tell which
bins are empty.

How to avoid this gotcha: If the coverage details for each bin are needed,
then each covergroup should have just one coverpoint, and that coverpoint
should have just one bin, as sketched below.  Then, when coverage is
reported for that covergroup, it represents the coverage of that single bin.
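
A minimal sketch of this style, reusing the state machine example above;
the covergroup and instance names are illustrative only:

   covergroup cSM_st1_3_5 @(posedge clk);
     coverpoint state {
       bins st1_3_5 = (s1=>s3=>s5);  // single bin in a single coverpoint
     }
   endgroup

   cSM_st1_3_5 cov_st1_3_5 = new;

   // the instance coverage now reflects just this one bin
   initial #1000 $display("st1_3_5 coverage is %f percent",
                          cov_st1_3_5.get_inst_coverage());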


7.6 $unit declarations

    Gotcha: $unit declarations can be scattered throughout multiple
            source code files.

$unit is a declaration space that is visible to all design units that are
compiled together.  The purpose of $unit is to provide a place where design
and verification engineers can place shared definitions and declarations.
Any user-defined type definition, task definition, function definition,
parameter declaration or variable declaration that is not placed inside a
module, interface, test program, or package is automatically placed in
$unit.  For all practical purposes, $unit can be considered to be a
predefined package name that is automatically wildcard imported into all
modeling blocks.  All declarations in $unit are visible without having to
specifically reference $unit.  Declarations in $unit can also be explicitly
referenced using the package scope resolution operator.  This can be
necessary if an identifier exists in multiple packages.  An example of
an explicit reference to $unit is:

   typedef enum logic [1:0] {RST, WAITE, LOAD, RDY} states_t; // in $unit

   module chip (...);
     $unit::states_t state, next_state;  // get states_t def from $unit
     ...
   endmodule

A gotcha with $unit is that these shared definitions and declarations can be
scattered throughout multiple source code files, and can be at the beginning
or end of a file.  At best, this is an unstructured, spaghetti-code modeling
style that can lead to design and verification code that is difficult to
debug, difficult to maintain, and nearly impossible to reuse.  Worse, $unit
definitions and declarations scattered across multiple files can result in
name resolution conflicts.  Say, for example, that a design has a $unit
definition of an enumerated type containing the label RST.  By itself, the
design may compile just fine.  But, then, let's say an IP model that is
added to the design also contains a $unit definition of an enumerated type
containing a label called RST.  The IP model also compiles just fine by
itself, but, when the design files, with their $unit declarations, are
compiled along with the IP model file, with its $unit declarations, there
is a name conflict.  There are now two definitions in the same name space
trying to reserve the label RST.  GOTCHA!

How to avoid this gotcha: Use packages for shared declarations, instead of
$unit.  Packages serve as containers for shared definitions and
declarations, preventing inadvertent spaghetti code.  Packages also have
their own name space, which will not collide with definitions in other
packages.  There can still be name collision problems if two packages are
wildcard imported into the same name space.  This can be prevented by using
explicit package imports and/or explicit package references, instead of
wildcard imports (see Sec 2.6 for examples of wildcard and explicit
imports).
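
A brief sketch of the package-based alternative; the package name is
illustrative only:

   package chip_types_pkg;
     typedef enum logic [1:0] {RST, WAITE, LOAD, RDY} states_t;
   endpackage

   module chip (...);
     import chip_types_pkg::states_t;  // explicit import, no wildcard
     states_t state, next_state;
     ...
   endmodule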


7.7 Compiling $unit

    Gotcha: Separate file compilation may not see the same $unit
            declarations as multi-file compilation.

A related, and major gotcha, with $unit is that multi-file compilation and
separate file compilation might not see the same $unit definitions.  The
$unit declaration space is a pseudo-global space.  All files that are
compiled together share a single $unit space, and thus declarations made in
one file are visible, and can be used in another file.  This can be a useful
feature.  A user-defined type definition can be defined in one place, and
that definition can be used by any number of modules, interfaces or test
programs.  If the definition is changed during the design process (as if
that ever happens!), then all design blocks that reference that shared
definition automatically see the change.  On the other hand, a software tool
that can compile each file separately, such as DC, will see a separate, and
different, $unit each time the compiler is invoked.  If some $unit
definitions are made in one file, they will not be visible to another file
that is compiled separately.

Synopsys VCS, Formality and Magellan are multi-file compilers.  DC and
LEDA are separate file compilers.  If $unit declarations are scattered
between multiple files, and the files are not compiled together, then
DC and LEDA will not see the same $unit declarations as VCS, Formality
and Magellan.

How to avoid this gotcha: The gotcha of different tools seeing different
$unit declarations can easily be avoided by using packages instead of $unit.
Packages provide the same advantages of shared definitions and declarations,
but in a more structured coding style.  If $unit is used, then a good style
is to ensure that all $unit definitions and declarations are made in one,
and only one, file.  That file must then always be compiled together with
any file or files that use those declarations, as sketched below.
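
For example, the shared declarations could live in a single file that is
always listed on the command line with the files that use it.  The file
name and command below are illustrative only; check the documentation of
each tool for its exact options:

   // common_defs.sv -- the only file containing $unit declarations
   typedef enum logic [1:0] {RST, WAITE, LOAD, RDY} states_t;

   // compile the shared file together with every file that uses it, e.g.:
   //   vcs -sverilog common_defs.sv chip.sv tb.sv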