( ESNUG 477 Item 3 ) -------------------------------------------- [11/20/08]

Subject: ( ESNUG 476 #9 ) The first US-based C/C++ chip design I've seen

> Standardization isn't only about second sourcing your synthesis, it's
> about verification.  CatapultC uses a synthesizable subset of C++ that
> is supported by QuestaSim, NC-Sim and VCS as well as standard ANSI C++
> compilers such as GCC and Microsoft Visual C++.
> 
>     - Bryan Bowyer
>       Mentor Graphics                            Wilsonville, OR


  [ Editor's Note: This is the first US-based use of C/C++ for real
    design that I've seen.  The others were done in Japan.  - John ]


From: [ Captain America ]

Hi, John,

Please keep me anonymous.

We first looked at CatapultC in 2003, but the tool was still fairly immature
back then.  We started using it on production designs almost 3 years ago; it
is now a robust and mature tool.  We're using it to make production chips
directly from our C++ algorithms.
 
We use CatapultC for 2 types of design implementation:

  1. To synthesize Verilog RTL for our ASIC implementation; to go to
     a real chip.

  2. To design the identical circuit and synthesize Altera FPGA cores
     for prototyping purposes.  With FPGA prototyping we can check to
     see if our design is working in real-time; it's MUCH faster than
     simulation.  We must make minor tweaks to the C code.


Both our hardware designers and our system designers writing algorithms
use CatC.   Here's why:
 
  - Graphical user interface.  CatapultC has an accessible, visual view of
    the hardware circuit it is scheduling, as well as the clock reference
    between the C code and the Verilog RTL code.

  - Click on the C code, the tool brings up the corresponding Verilog RTL.

  - We can also click on a component and see the C code.  This is useful
    because sometimes we get a component that is unusually large and
    expensive so then we see why...e.g. we should have used a 16 bit
    multiplier instead of 32 bit one.   We can also find components in the
    critical timing path and look backwards at our C code.

  - Hierarchical design support.  We do our designs hierarchically where
    we define IP blocks first, then integrate them.  For example, our design
    may have 5 blocks and we want to connect them without worrying about
    detailed timing.  With CatC, we use AC channel to construct the channels
    and connect the blocks.  AC channel (a CatC class that emulates a TLM
    FIFO) provides a convenient means of audio and video synchronization.
    This is especially useful for multi-rate design, where our blocks are
    running at different clock speeds. 

  - Interface synthesis.  Interfaces can be hard to design, especially for
    our asynchronous designs with multiple blocks at different clock rates
    and different data rates between blocks.  With CatapultC we focus on
    writing C algorithms and don't have to worry about defining the
    interface because Catapult generates it automatically -- our source is
    interface independent.  Mentor provides a family of interfaces to select
    from, and CatapultC implements it.  Our engineers go to the dialog box
    and specify the interface they want by clicking on it.  For example:

          direct connection
          one line
          2 line handshake
          FIFO

    Catapult automatically implements the selected interface at the inputs
    and outputs, and we don't have to change our C code or deal with
    detailed timing.

  - Pipelining of loops and unrolling loops.  If we are processing an
    image or a data stream, we write a loop and generate output.  Pipelining
    of loops specifies how often the next iteration may start, so we can
    control the rate at which inputs and outputs are generated.

    With loop unrolling we specify the number of times to copy the loop body,
    which enables multiple loop bodies to execute in parallel.  Certain
    dependencies may prohibit unrolling and limit pipelining, but in general
    loop unroll and pipeline constraints provide us with good flexibility
    when optimizing our designs.

  - Tradeoff between performance and area.  CatapultC gives us timing and
    area estimates, then later we run our Verilog RTL through Synopsys DC
    for the final result.  Catapult gives an area score rather than the gate
    count, so we have to experiment with the ratio a bit as it depends on
    the library.  I don't have exact benchmark numbers but we have found
    that the correlation is good.  We are able to make architectural
    decisions based on it because it is all relative.

  - Bug fixing.  Bugs pop up at any time.  If we have a problem with
    performance then lots of rework might be done to the C code.  We do not
    have the luxury to wait for our C code to be completely written and our
    hardware designers to do manual implementation with Verilog RTL before
    doing iterations and fixing the bugs in our C code.  Our final design
    quality is better because we have time for architectural experimentation
    that we just wouldn't have if we implemented the Verilog RTL manually.
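
To make the loop unrolling bullet above concrete, here is a minimal sketch
in plain C++ of what "unroll by 2" means.  It is not taken from the design
described in this post, and every name in it is hypothetical.  In CatC the
unroll factor is set as a constraint rather than written by hand; the second
function only spells out the transformation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t N = 8;  // hypothetical block size

// Rolled form: one multiply-accumulate per loop iteration.
std::int32_t dot_rolled(const std::array<std::int16_t, N>& a,
                        const std::array<std::int16_t, N>& b) {
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < N; ++i) {
        acc += static_cast<std::int32_t>(a[i]) * b[i];
    }
    return acc;
}

// Unrolled by 2, written out by hand: two copies of the loop body per
// iteration.  Each copy has its own accumulator, so the copies do not
// depend on one another and could map to two multipliers side by side.
std::int32_t dot_unrolled2(const std::array<std::int16_t, N>& a,
                           const std::array<std::int16_t, N>& b) {
    static_assert(N % 2 == 0, "sketch assumes an even trip count");
    std::int32_t acc0 = 0;
    std::int32_t acc1 = 0;
    for (std::size_t i = 0; i < N; i += 2) {
        acc0 += static_cast<std::int32_t>(a[i]) * b[i];
        acc1 += static_cast<std::int32_t>(a[i + 1]) * b[i + 1];
    }
    return acc0 + acc1;
}
```

Both functions return the same value for the same inputs.  The unrolled
form trades area (a second multiplier) for half as many iterations, which is
the kind of performance versus area choice the bullets above describe.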


3 Months versus 6 Months:

We have one group doing a traditional Verilog RTL-based design flow where
the designers manually create their Verilog RTL code.  It's very time
consuming. 

We do a lot of architectural exploration and optimization with CatapultC,
typically about 3 months worth.  Once our C/C++ coding is done and we are
happy with the architectural exploration, it takes us 2 hours to go from
C/C++ to RTL for a 300K gate design with Catapult C.

For us to do a manual Verilog RTL implementation of those same 300K gates
would take at least 6 months.

Plus, in real life, the spec changes and the RTL guys must retrofit their
design, which can add significant delay to the project.  For us, CatC
tweaks are trivial.
 
We currently are working on an A/V chip.  One of the blocks we have been
working on is 275 K gates right now.  It was initially 1 M gates.  We
reduced it through exploration and optimization of the CatC architecture.


ANSI C/C++ versus SystemC:

Our system designers like to use ANSI C/C++ because it gives good freedom of
expression regarding design content.  It has a construct called 'class'
which is like an object with private data and a public interface.  The
private data is hidden because it is complicated for the user.  This is a
standard feature of ANSI C/C++ that is useful in modeling hardware because
there is quick mapping between class and hardware block.
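
A minimal sketch of that class-to-block mapping, assuming a hypothetical
4-tap FIR filter as the block (the name Fir4 and its coefficients are
illustrative only, not from any real design).  The private members stand in
for the block's internal registers, and the one public method is its
interface.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 4-tap FIR block.
class Fir4 {
public:
    explicit Fir4(const std::array<std::int16_t, 4>& coeffs)
        : coeffs_(coeffs) {}

    // One call consumes one input sample and produces one output sample.
    std::int32_t step(std::int16_t sample) {
        // Shift the delay line; the oldest sample falls off the end.
        for (std::size_t i = taps_.size() - 1; i > 0; --i) {
            taps_[i] = taps_[i - 1];
        }
        taps_[0] = sample;

        // Multiply-accumulate across all taps.
        std::int32_t acc = 0;
        for (std::size_t i = 0; i < taps_.size(); ++i) {
            acc += static_cast<std::int32_t>(coeffs_[i]) * taps_[i];
        }
        return acc;
    }

private:
    std::array<std::int16_t, 4> coeffs_;   // fixed coefficients
    std::array<std::int16_t, 4> taps_{};   // delay line, starts at zero
};
```

Feeding a single 1 followed by zeros into step() returns the coefficients
one by one, which is a quick sanity check for a filter like this.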
 
C/C++ doesn't have a concept of time, just of algorithmic behavior.  So the
paradigm is that our system engineers write C++ algorithms and let Catapult
handle most of the scheduling and timing.  This is our preference, as our
system engineers can focus more on the development of good algorithms, which
has most impact on the differentiation of the final chip, without being
too bogged down by the details of scheduling and timing.

C++ is extremely fast to simulate...  it's at least 10X faster than SystemC
and 2 to 3 orders of magnitude faster than RTL.  This is because with C++,
everything is sequential so there is no overhead.  In contrast, with SystemC
you run your algorithm on top of the SystemC simulator kernel.  You are
explicitly expressing parallelism, and thus there is a lot of overhead
associated with it.  You must create a different task; you must run multiple
threads with lots of context switching in between.  Generic C++ works well
for us because most of our design is algorithmic in nature.  For other types
of design, especially ones with complicated control, SystemC may be the right
choice of language.

CatapultC supports AC integer and fixed-point data types.  AC data types let
the user control the bit width of variables to optimize the cost of hardware.
Both data types are easier to use and map to hardware better than SC data
types.  AC fixed-point data types are more natural and particularly useful in
describing and implementing signal processing algorithms.  AC data types
also simulate much faster than SC data types.
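
To show what controlling the bit width buys, here is a minimal sketch of a
W-bit unsigned integer in plain C++.  UIntW is a hypothetical stand-in
written for this illustration; it is NOT the AC data type API, which comes
from Mentor's own headers and is not reproduced here.  The sketch only
models the wrap-around of a W-bit hardware adder.

```cpp
#include <cstdint>

// Hypothetical stand-in for a bit-accurate unsigned integer of W bits.
template <unsigned W>
class UIntW {
    static_assert(W >= 1 && W <= 32, "sketch supports 1..32 bits");

public:
    // Values are truncated to the low W bits on construction.
    constexpr UIntW(std::uint64_t v = 0) : value_(v & mask()) {}

    constexpr std::uint32_t value() const {
        return static_cast<std::uint32_t>(value_);
    }

    // Addition keeps only the low W bits, like a W-bit adder with the
    // carry-out dropped.
    friend constexpr UIntW operator+(UIntW a, UIntW b) {
        return UIntW(a.value_ + b.value_);
    }

private:
    static constexpr std::uint64_t mask() {
        return (std::uint64_t{1} << W) - 1;
    }
    std::uint64_t value_;
};
```

With a type like this, a counter that only needs 10 bits can be declared as
10 bits in the source, so the C++ simulation wraps at 1024 the same way the
hardware would, instead of carrying a 32-bit int through the design.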


Simulation:

CatC has 3 types of simulation, all w/o changing the original C++ testbench.

  1. Cycle-based.  This is the fastest.  It models the RTL behavior of the
     real design, not real gates.  We can do this with any C++
     debugger/compiler.

  2. RTL-based.  This runs after the C code is actually synthesized to
     Verilog RTL code.
     Here we can automatically verify the RTL implementation against the
     original algorithm source.

  3. Gate-level based.  Go through Synopsys DC, import the gates back to
     CatC and CatC does the simulation.

Catapult also has SystemC extensions so it can use SystemC for verification,
if you want.
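
The idea of keeping one C++ testbench across implementations can be
sketched as follows.  This is plain C++ with hypothetical names, and it says
nothing about how CatC itself connects the RTL or the gates to the
testbench; it only shows a testbench that takes the design as a parameter,
so the stimulus and the checks stay the same whichever implementation is
plugged in.

```cpp
#include <cstdint>
#include <functional>

// Golden reference: the algorithm as originally written (hypothetical
// example, the average of two 8-bit pixels, rounded up).
std::uint8_t ref_average(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((a + b + 1) / 2);
}

// A second implementation of the same function that avoids the 9-bit
// intermediate sum, standing in for "the design" under test.
std::uint8_t dut_average(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>((a | b) - ((a ^ b) >> 1));
}

using Design = std::function<std::uint8_t(std::uint8_t, std::uint8_t)>;

// Testbench: exhaustive stimulus over all input pairs.  Returns the
// number of outputs that differ from the golden reference.
int run_testbench(const Design& dut) {
    int mismatches = 0;
    for (int a = 0; a < 256; ++a) {
        for (int b = 0; b < 256; ++b) {
            const std::uint8_t x = static_cast<std::uint8_t>(a);
            const std::uint8_t y = static_cast<std::uint8_t>(b);
            if (dut(x, y) != ref_average(x, y)) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

Calling run_testbench(dut_average) reports zero mismatches, while a design
that rounds down instead of up would be caught by the same unchanged
testbench.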


CatC's Incremental Methodology:

In CatapultC, the user interacts with the tool at every stage, working step
by step.  Since Catapult remembers previous constraints, we can generate
multiple solutions through different constraints.

The basic steps:

  0. Create or open a Project
  1. Create multiple C/C++ source files using the CatC text editor or
     any other text editor you prefer.  I like vi myself.
  2. Add source files to the project.
  3. In the GUI/menu, set up the library (FPGA or ASIC process) and a clock
     frequency, or frequencies for multiple clocks.
  4. Set architectural constraints, done separately from our C code file.
  5. Set Resource constraints.
  6. Catapult does scheduling, generates multiple internal forms of the
     design, then generates Verilog/VHDL RTL.
  7. Catapult presents view of the design.
  8. Go to project directory, verify (Cycle-based, RTL-based, Gate-level
     based).


CatC's Strengths and Weaknesses:

Strengths:

  1. The time savings for implementing our RTL (3 months vs. 6 months) and
     the quality of our system architecture (275 K vs. 1 M gates) due to
     ability to do more exploration and optimization.

  2. Good foundation because it has broad support for generic ANSI C/C++,
     versus forcing us to just use a SystemC subset of C, not even C++.

     CatC is not very picky in terms of type of syntax and C++ language
     feature types, e.g. template, class, function overload, operator
     overload.  Mentor went to great lengths to be tolerant of the coding
     style and language syntax.

  3. Mentor's local support has been excellent, both in responding to bug
     reports and in advising on the most efficient way to code and implement.
     We're a big customer, so I expect this.  But there are EDA horror
     stories where the EDA vendor "disappears" once the P.O. was signed.
     Mentor was and is still there way after the initial purchase.

Weaknesses:

  1. Multi-level hierarchy.  AC channel is currently limited in that there
     are only 2 levels of hierarchy.  We can connect the top level block
     with AC channel, but we cannot connect a block with the AC channel
     further in hierarchy.  I would like Mentor to go one level deeper.
     I want to use AC channel inside the blocks, to allow the elements
     inside the loop of the blocks to communicate.  Today we must take a
     2nd loop and build blocks around it.
 
  2. You can't use power as a design constraint.  Catapult's constraints are
     mostly around clock rates today.  Catapult does allow you to evaluate
     power after RTL is done, and they have a flow that lets you use
     Sequence's power tool.  I would prefer that Mentor support power
     constraint directly inside Catapult.

  3. There is a learning curve to shifting to high level design synthesis.

     CatC is useful if your design is highly datapath intensive.  It is
     also good for single clock rate designs, particularly with signal
     processing, which take about 2-3 weeks to go from a known good
     algorithm to the first working design. 

     If your design is control intensive or multi-rate, you need to learn
     how to pipeline and use AC channels, which takes about 6 weeks.  With
     CatC, your C++ code becomes the design source to which modifications
     are made, and you can quickly adapt the design to changing interface
     specifications and other design requirements.
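
For readers who have not seen the channel style discussed in this post, here
is a minimal sketch of two blocks connected only by FIFO channels, with the
second block running at half the data rate of the first.  Channel is a
hypothetical stand-in written for this illustration; it is NOT the AC
channel API, only an untimed queue with a similar flavor.  The block names
are hypothetical as well.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical stand-in for a FIFO channel between blocks.
template <typename T>
class Channel {
public:
    void write(const T& v) { fifo_.push_back(v); }
    T read() {
        T v = fifo_.front();
        fifo_.pop_front();
        return v;
    }
    std::size_t size() const { return fifo_.size(); }

private:
    std::deque<T> fifo_;
};

// Block 1: scales every sample it finds on its input.
void scale_block(Channel<std::int16_t>& in, Channel<std::int32_t>& out) {
    while (in.size() >= 1) {
        out.write(3 * static_cast<std::int32_t>(in.read()));
    }
}

// Block 2: 2:1 decimator.  It consumes two samples per output sample,
// so its output runs at half the input rate.
void decimate_block(Channel<std::int32_t>& in, Channel<std::int32_t>& out) {
    while (in.size() >= 2) {
        const std::int32_t first = in.read();
        const std::int32_t second = in.read();
        out.write(first + second);
    }
}

// Top level: the blocks are connected by channels only.  Nothing here
// describes clocks or detailed timing.
void top(Channel<std::int16_t>& in, Channel<std::int32_t>& out) {
    Channel<std::int32_t> mid;  // internal channel between the two blocks
    scale_block(in, mid);
    decimate_block(mid, out);
}
```

Writing four samples into the input channel and calling top() leaves two
samples on the output channel.  Because the source describes only what each
block reads and writes, the interface and rate details are left to the tool.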

We've had good experiences with CatapultC for years now and recommend it.

    - [ Captain America ]