( ESNUG 251 Item 1 ) -------------------------------------------- [9/12/96]

Subject: ( ESNUG 249 #7 250 #2) What's The Best Way To Synth Multipliers?

> DO: constrain multipliers *accurately* (don't over or under constrain)
> and Design Compiler will do a good job meeting that constraint.  DON'T:
> flatten or remove hierarchy in a design with a multiplier.  Also, you
> will need a DesignWare licence to get the faster multipliers.


From: kurt@wsfdb.com (Kurt Baty)

Hi, John,

Here is a set of compiles of the DW02_mult, comparing the gate count and
speed of the Carry-Save-Addition (CSA) multipliers versus the Wallace
architecture.  The two ASIC vendors' libraries were both 0.6 micron CMOS
processes.  Each multiplier was compiled against both ASIC libs under
worst-case industrial conditions, with set_input_drive, input and output
loads, and a wire load table.  (These tables represent over 100 hours of
SPARC 10/51 compute time.)

The peak differences between the two multiplier architectures happen when
they're multiplying two same-sized (bit-wise) vectors.  The difference
largely goes away when the bit widths differ greatly.  For example, there
won't be much difference between a Wallace tree and a CSA implementation if
you were multiplying 32 bits by 5 bits.  Therefore, all the data is based
on A_width and B_width being equal.
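
One way to see why the gap shrinks for lopsided widths is to count adder
levels in the textbook versions of the two structures.  The sketch below is
only an illustration, not the DesignWare implementation.  It assumes plain
(non-Booth) partial products, one row per bit of the narrower operand, and
it ignores the final carry-propagate adder.  The function names are made up
for this example.

```python
def wallace_levels(rows: int) -> int:
    """Count 3:2 carry-save levels needed to reduce `rows` rows down to 2."""
    levels = 0
    while rows > 2:
        # Every full group of three rows becomes two rows; leftovers pass through.
        rows = 2 * (rows // 3) + rows % 3
        levels += 1
    return levels


def array_levels(rows: int) -> int:
    """A linear carry-save array absorbs one additional row per level."""
    return max(rows - 2, 0)


# Rows = assumed number of partial products (width of the narrower operand).
for rows in (5, 8, 16, 32):
    print(f"{rows:2d} rows: array {array_levels(rows):2d} levels, "
          f"Wallace {wallace_levels(rows)} levels")
```

Under these assumptions a 32 x 5 multiply has only five rows to reduce, so
both structures need the same number of levels, while a 32 x 32 multiply
leaves the linear array with far more levels than the tree.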

ASIC Vendor 1			

  A_width    CSA Multipliers        Wallace Trees
  B_width    area    speed          area   speed  faster %
    2 bits     64     4.68           100    7.41   -58%
    3         187     8.44           262    9.31   -10%
    4         271    11.68           392   11.32     3%
    5         433    12.93           550   12.54     3%
    6         568    15.01           720   15.18    -1%
    7         724    17.05           824   17.41    -2%
    8         904    18.96           984   18.46     3%
    9        1080    23.05          1290   18.25    21%
   10        1430    22.18          1619   19.55    12%
   11        1776    24.1           1898   19.76    18%
   12        1831    27.03          2178   22.11    18%
   13        2696    25.39          2439   21.23    16%
   14        2951    27.36          2736   24.68    10%
   15        3116    29.65          3185   24.08    19%
   16        3493    31.55          3658   25.51    19%
   17        3688    44.72          4077   27.41    39%
   18        4115    34.73          4541   25.84    26%
   19        4531    36.87          5055   26.27    29%
   20        4944    39.84          5589   26.08    35%
   21        5491    40.46          5804   29.02    28%
   22        5970    40.79          6382   29.99    26%
   23        6541    42.4           7135   29.31    31%
   24        6957    45.41          7389   34.69    24%

        [ No data for 25 to 27 bit widths. ]

   28        9493    50.97         10145   29.99    41%

        [ No data for 29 to 31 bit widths. ]

   32       12328    56.11         12791   36.35    35%


ASIC Vendor 2			

  A_width    CSA Multipliers        Wallace Trees
  B_width    area    speed          area   speed  faster %
    2 bits     73     4.86           114    6.92   -42%
    3         217     9.03           187    9.77    -8%
    4         293    12.43           338   12.72    -2%
    5         371    14.12           475   13.19     7%
    6         511    17.41           598   16.28     6%
    7         722    18.66           858   16.83    10%
    8         890    21.08          1137   17.71    16%
    9         995    23.17          1282   17.81    23%
   10        1409    25.75          1522   20.79    19%
   11        1411    29.27          1986   19.65    33%
   12        1818    29.57          2072   21.94    26%
   13        2009    32.29          2452   21.65    33%
   14        2292    34.19          2833   22.04    36%
   15        2550    37.08          3072   22.46    39%
   16        3168    37.63          3604   25.09    33%
   17        3150    41.29          3776   25.98    37%
   18        3807    41.36          4409   24.94    40%
   19        4054    44.31          4622   26.56    40%
   20        4475    46.92          5051   27.67    41%
   21        5073    48.02          5443   27.9     42%
   22        5135    51.74          5806   29.02    44%
   23        5895    52.22          6575   27.97    46%
   24        6409    54.57          7177   29.85    45%

        [ No data for 25 to 27 bit widths. ]

   28        8676    61.91          9177   30.44    51%

        [ No data for 29 to 31 bit widths. ]

   32       10871    70.69         12124   30.43    57%
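
The "faster %" column appears to be the Wallace delay reduction relative to
the CSA delay, with the "speed" columns read as path delays (smaller is
faster; the tables do not state units).  That reading is an inference from
the numbers, not something stated in the tables.  A minimal sketch that
recomputes a few Vendor 1 rows under that assumption:

```python
def faster_percent(csa_delay: float, wallace_delay: float) -> int:
    """Percent by which the Wallace delay undercuts the CSA delay (assumed formula)."""
    return round(100 * (csa_delay - wallace_delay) / csa_delay)


# (width, CSA delay, Wallace delay, tabulated faster %) copied from ASIC Vendor 1.
vendor1_rows = [
    (2, 4.68, 7.41, -58),
    (8, 18.96, 18.46, 3),
    (16, 31.55, 25.51, 19),
    (32, 56.11, 36.35, 35),
]

for width, csa, wallace, tabulated in vendor1_rows:
    computed = faster_percent(csa, wallace)
    print(f"{width:2d} bits: computed {computed:4d}%, table {tabulated:4d}%")
```

A negative value means the Wallace version came out slower than the CSA
version at that width.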


These tables show that, starting at about eight bits, the Wallace tree
architecture is significantly faster, at a cost of only up to about ten
percent more gates.  (What's not shown, though I know it to be the case, is
that A_width not being equal to B_width would slightly diminish the
advantages of the Wallace architecture.)
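
The gate-count cost can be checked row by row from the area columns.  The
sketch below is just arithmetic on a handful of rows copied from the two
tables; the helper name is made up for this example, and the rows chosen
(16, 24 and 32 bits) are an arbitrary sample.

```python
def area_overhead_percent(csa_area: int, wallace_area: int) -> float:
    """Extra Wallace area as a percentage of the CSA area."""
    return 100 * (wallace_area - csa_area) / csa_area


# (vendor, width, CSA area, Wallace area) copied from the tables.
sample_rows = [
    (1, 16, 3493, 3658),
    (1, 24, 6957, 7389),
    (1, 32, 12328, 12791),
    (2, 16, 3168, 3604),
    (2, 24, 6409, 7177),
    (2, 32, 10871, 12124),
]

for vendor, width, csa, wallace in sample_rows:
    overhead = area_overhead_percent(csa, wallace)
    print(f"Vendor {vendor}, {width} bits: Wallace area overhead {overhead:.1f}%")
```

Running the same calculation over every row shows how the overhead varies
with width and library.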

The reason you see a variation between these two ASIC libraries is the
relative difference in the speed of doing the majority versus doing the
inputs to carry out on their adders.  As that ratio tightens, there is less
speed gain.

  - Kurt Baty
    WSFDB Consulting


