( ESNUG 512 Item 4 ) -------------------------------------------- [10/18/12]

From: [ Trent McConaghy of Solido Design ]
Subject: Solido Brainiac's Trip Report of the recent CICC'12 conference

Hi, John,

The Custom Integrated Circuits Conference (CICC) 2012 took place from Sunday, 
Sept 9 through Wednesday, Sept 12 at the Doubletree Hotel in San Jose, CA.

Approximately 300 custom circuit designers and custom CAD engineers attended 
CICC.  Sunday had the tutorials.  Monday morning started with a keynote.  
The remainder of the conference had five simultaneous threads of technical 
sessions, with paper presentations and panels.  On Monday and Tuesday 
evening, there was a combined poster session, exhibitors' display, and food 
and drinks. 

Below are some of my takeaways on FinFETs and other new devices, variation 
and reliability, analog / mixed-signal design, and other topics that caught 
my eye.

DEVICES

"At the 22-nm technology node, fully-depleted tri-gate [FinFET] transistors 
were introduced for the first time on a high-volume manufacturing process." 
This is the opening line in the paper by Chris Auth of Intel.  It's hard to 
overstate the ramifications of these devices for the semiconductor industry. 
Intel had announced them earlier, but this paper described them in detail. 

Salient data:

    - A new bar for low-voltage operation.  Threshold voltages can be 0.2 
      to 0.25 V, about 0.1 V lower than in the previous 32 nm technology. 
      This is enabled by steep subthreshold slopes (70 mV/decade) and 
      very low drain-induced barrier lowering (DIBL) (50 mV/V).

    - Contacts are self-aligned, via new steps in the process flow.  
      This makes it really, really hard for contacts to inadvertently 
      short to gates.  This also means that gate width can be optimized 
      for transistor performance independent of yield requirements.
      
    - Fins are 8 nm wide at the channel.  Fin height is 34 nm, balancing 
      drive current vs. capacitance.  Fin pitch (center-to-center distance 
      between adjacent fins) is 60 nm, balancing density and parasitic 
      capacitance vs. sane aspect ratios and space for raised source / 
      drains.  Gate pitch is 90 nm.
      
    - SRAM cell size is 0.092 um^2, which maintains the traditional 
      scaling trend (0.5X every 2 years).  It delivers a 70% speed gain 
      at fixed voltage compared to the 32-nm SRAM design.

    - PMOS devices are strained.
    
    - Variability challenges include introduction of new corners, 
      especially at the top of the fin; and multiple sidewall 
      considerations.  However, TDDB, NBTI, and PBTI are improved 
      compared to 32 nm.
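As a sanity check on that 70 mV/decade figure: subthreshold slope is SS = n * (kT/q) * ln(10), so the ideality factor n can be backed out.  A quick sketch (the formula is standard device physics; the 300 K temperature is my assumption):

```python
import math

def subthreshold_ideality(ss_mV_per_dec, temp_K=300.0):
    # SS = n * (kT/q) * ln(10); thermal voltage kT/q is ~25.85 mV at 300 K
    kT_q_mV = 1000.0 * 1.380649e-23 * temp_K / 1.602176634e-19
    return ss_mV_per_dec / (kT_q_mV * math.log(10))

n = subthreshold_ideality(70.0)   # Intel's reported 70 mV/decade
```

That gives n of roughly 1.18, close to the ideal 1.0 -- which is exactly the point of a fully-depleted channel.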

Continuing this trend, several papers at CICC discussed FinFETs and 
partially/fully depleted silicon-on-insulator (PD/FD SOI) devices:

    - Dick James (Chipworks) described Chipworks' reverse-engineering of 
      Intel's 22nm devices.  The most remarkable thing is how well 
      Chipworks' numbers lined up with Intel's -- a real testament to 
      reverse engineering.
      
    - Terence Hook (IBM) described FDSOI and FinFET devices, and 
      compared them to bulk with vivid descriptions.  
      
    - "The transistor of 2010 looks very little like that of 10 years 
      previous, with bulging silicon-germanium junctions and heretofore 
      exotic materials such as hafnium oxide commonplace.   The 
      transistors of 2012 are physically so different as to be nearly 
      unrecognizable in many ways."  "It has been observed that a FinFET 
      is basically an FDSOI device turned on its side" [though some may 
      disagree]. 
      
    - FinFETs use up to 50% less area than planar devices, though power 
      density becomes a limiting factor.

    - Terence described implications for designers, which include the 
      following:
      
        - Better low-voltage operation, though slightly worse high-
          voltage operation. 
        - Threshold voltage is independent of any previous states, 
          i.e. no floating-body effect. 
        - No body effect for SOI FinFETs, and a tiny body effect for 
          bulk FinFETs.  On FinFETs, the width parameter is now 
          quantized into a discrete number of fins. 
        - Fully-depleted means less dependence on doping variation, 
          which improves matching.  [Though fins can vary.]
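Since width now comes in fin quanta, sizing means picking a fin count.  A toy sketch of the usual tri-gate effective-width bookkeeping, plugging in Intel's published fin dimensions (the formula W_eff = n * (2H + W) is the standard approximation, not something from the talk):

```python
def finfet_effective_width_nm(n_fins, fin_height_nm=34, fin_width_nm=8):
    # Tri-gate: the gate wraps both sidewalls plus the fin top,
    # so each fin contributes 2*H + W of channel width.
    return n_fins * (2 * fin_height_nm + fin_width_nm)

# One 34 nm x 8 nm fin gives 76 nm of effective width; any device
# width is a multiple of that -- no continuous W knob anymore.
```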
          

Countering the excitement over FinFETs, A. Khakifirooz and other authors 
from IBM, ST, LETI, Renesas, and GlobalFoundries had a paper on extremely 
thin SOI (ETSOI).  ETSOI is in the same category as FDSOI. 

    - They pointed out that leakage is a major problem with FinFETs, and 
      "once the fin is doped to a level to obtain the required threshold 
      voltage Vt for SRAM, the random dopant fluctuations (RDFs) are 
      comparable to, if not higher than, planar devices, defeating one 
      of the main potential advantages of the FinFET structure." (!)
      
    - In contrast, ETSOI devices can be back-biased to increase Vt, 
      which allows for an extra degree of control compared to FinFETs. 
      The channel can be kept undoped for all devices, which eliminates 
      RDFs to achieve "record low transistor mismatch."

PROCESS VARIATION AND RELIABILITY

Hidetoshi Onodera (Kyoto University) and I (Solido Design Automation) 
co-chaired a session on Modeling & Design for Variability and Reliability.
 
Xin Li (CMU) discussed ways to construct models of circuit performance,
as a function of a large number of process variables, using a relatively
small number of simulations.  The approach used sparse regression and
Bayesian model fusion techniques.  For example, an SRAM column example 
had >10K input variables, and needed <1K simulations to model.
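The sparse-regression half of that idea can be sketched in a few lines.  Below is a scaled-down toy (2,000 variables, 200 "simulations", and plain iterative soft-thresholding rather than the paper's actual algorithm; all numbers are invented):

```python
import numpy as np

def ista_lasso(X, y, lam=1.0, n_iter=2000):
    # Solve min_w 0.5*||y - X w||^2 + lam*||w||_1 by iterative
    # soft-thresholding; the L1 penalty drives most weights to zero.
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz const. of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = w - X.T @ (X @ w - y) / L      # gradient step
        w = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2000))           # 200 sims, 2000 process variables
w_true = np.zeros(2000)
w_true[:5] = [3.0, -2.0, 4.0, 1.5, -3.0]   # only 5 variables actually matter
y = X @ w_true                             # noiseless "circuit performance"
w_hat = ista_lasso(X, y)
```

With far fewer samples than variables, the fit still tracks the sparse truth -- the same under-determined regime as >10K variables modeled from <1K simulations.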

Elie Maricau (a student of Georges Gielen, KU Leuven) described how 
reliability effects are modeled in devices and in analog circuits. 

    - Aging effects include hot carrier injection (HCI), time-
      dependent dielectric breakdown (TDDB), and negative/positive bias 
      temperature instability (N/P BTI).  To properly model reliability 
      in analog circuits, one must include all important factors (W, L, 
      Vgs, Vds, temp), cover a broad, continuous range of values 
      (e.g. Vgs=0V to 1.5V, different W's...), and model time-varying 
      stress effects. 
      
    - Elie reminded us that TDDB is stochastic, and that below 45 nm 
      BTI becomes stochastic too.  That is, the performance distribution 
      of a design shifts over time due to TDDB and BTI.  The circuits 
      most sensitive to reliability effects are the ones most sensitive 
      to process variation.
      
    - He described techniques to design for reliability.  Improving 
      circuits for process variation helps reliability too.  Another 
      example, for a high-voltage line driver, was to adaptively enable 
      extra small output blocks to lower the ON-resistance when 
      degradation is detected.
      
Jyothi Velamala (a student of Kevin Cao, ASU) presented a paper on 
statistical aging under dynamic voltage scaling.  He described physical 
explanations for NBTI, and the models that result.  From that, he showed a 
"statistical aging" analysis along with test chip data.

Xin Li (of GlobalFoundries) (yes, a second Xin Li in the same session!) 
discussed how to model low-frequency noise statistics as a function of 
device geometry.  In a departure from past work, the model was a function of 
lognormal-distributed (as opposed to normally-distributed) random variables.
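The practical wrinkle with lognormal variables: the mean sits above the median, so naively averaging samples overstates the "typical" device.  A quick illustration (the parameter values below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

# A lognormal RV is exp() of a normal RV: skewed and strictly positive,
# a better fit for low-frequency noise magnitudes than a plain Gaussian.
mu, sigma = np.log(1e-9), 0.8            # arbitrary log-scale parameters
noise = np.exp(rng.normal(mu, sigma, 200_000))

median = np.median(noise)                # ~ exp(mu)
mean = noise.mean()                      # ~ exp(mu + sigma**2/2) > median
```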

Kiyohiko Sakakibara (Renesas) described how an STI surface "bump" that 
occurs at the edge of a channel causes a "hump" in a device Ids vs. Vgs 
operating characteristic.  In short, bump causes hump. If the bump is not 
modeled, then the device operating characteristic is inaccurate, leading to 
inaccurate analog circuit simulations.  Kiyohiko described a solution -- in 
design, to use a ring-gate structure on half the periphery of the channel. 
This solution simplifies the models, and leads to more accurate predictions. 

Like the KU Leuven and ASU speakers, Anthony Oates (TSMC) also discussed 
problems arising from the interaction between process variation and 
reliability.  He defined "fall-out" as the "percentage of circuits whose 
delay after [aging] stress exceeds the value associated with the ... 3-sigma 
tail of its statistical distribution."  In other words, it is the drop in 
yield from a 3-sigma initial yield, due to aging.  He illustrated with data 
on a ring oscillator, showing how fall-out grows with each new process node. 
He also showed data on the soft-error-rate (SER) reliability issue; and 
mentioned that FinFETs are less susceptible to SER.
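Anthony's fall-out metric is easy to reproduce with a toy Monte Carlo (the delay numbers below are invented, purely to show the definition):

```python
import numpy as np

rng = np.random.default_rng(42)

def fallout(fresh_delays, aged_delays):
    # Fraction of aged circuits whose delay exceeds the 3-sigma
    # point of the *fresh* delay distribution.
    limit = fresh_delays.mean() + 3.0 * fresh_delays.std()
    return float(np.mean(aged_delays > limit))

fresh = rng.normal(100.0, 5.0, 100_000)         # hypothetical RO delays, ps
aged = fresh + rng.normal(8.0, 2.0, 100_000)    # aging shift + extra spread

f_fresh = fallout(fresh, fresh)   # ~0.1% by construction (3-sigma tail)
f_aged = fallout(fresh, aged)     # far larger: the whole curve moved right
```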

AMS SYSTEM SIMULATION AND MODELING 

Nagib Hakim (Intel) described a flow that Intel is using to model 
statistical variation of AMS systems.
 
    - He described how most post-silicon bugs found at Intel come from 
      interactions among blocks.
       
        - 46% of bugs were from circuit simulation issues, such as 
          insufficient modeling
        - 18% of bugs were due to PVT variation
        - 18% were from mixed-signal verification (MSV) misses, such 
          as insufficient MSV test plan coverage 
        - the remaining bugs were from timing and logic simulation
        
    - Given sufficient simulator capacity, anything could be simulated. 
      But it can take too long.  So the challenge is to find a sweet 
      spot between speed and accuracy.
      
    - Nagib described a SystemVerilog-based approach that combines 
      trustworthy human-designed behavioral models with simulation- or 
      silicon-calibrated models that are a function of variation.  In a 
      PLL example, simulation time went from 8 h to 1 s.  The methodology 
      takes engineering time and effort, so is currently only 
      appropriate where the time to model is amortized by high reuse of 
      the system / model over many chips.

Ji-Eun Jang (a student of Jaeha Kim, Seoul National University) described a 
method for event-driven simulation of AMS systems with SystemVerilog.  She 
used SystemVerilog because it is the only analog HDL that supports "struct"-
style signals.  She leveraged "struct" and a unified basis function to model 
a wide variety of input signals and (linear) channels.  She demonstrated the 
work on a DFE receiver, showing how simulation time was insensitive to a 
100x finer timestep due to the event-driven nature of the simulation.
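That timestep-insensitivity is the defining property of event-driven simulation, sketched below in Python rather than SystemVerilog: work scales with the number of events, while a fixed-timestep loop pays for every tick whether or not anything changed.  (The "simulator" here is a trivial stand-in, not Ji-Eun's method.)

```python
import heapq

def event_driven(events, t_end):
    # Evaluate only when a scheduled event fires -- no timestep at all.
    queue = list(events)              # (time, value) pairs
    heapq.heapify(queue)
    state, evals = 0.0, 0
    while queue and queue[0][0] <= t_end:
        _, state = heapq.heappop(queue)
        evals += 1
    return state, evals

def fixed_step(events, t_end, dt):
    # Evaluate at every tick, regardless of activity.
    pending = sorted(events)
    state, evals, t = 0.0, 0, 0.0
    while t <= t_end:
        while pending and pending[0][0] <= t:
            _, state = pending.pop(0)
        evals += 1
        t += dt
    return state, evals
```

Shrinking dt by 100x multiplies the fixed-step work by ~100x; the event-driven loop does not even have a dt.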

David Root (Agilent) described compact and behavioral modeling techniques, 
using measurements from a nonlinear vector network analyzer (NVNA). 

    - The NVNA is more than a spectrum analyzer, linear network 
      analyzer, or oscilloscope. 

    - The idea is to adaptively choose different stimuli to be input to 
      the device under test (DUT), and to measure the response.  The 
      model can be a function of Vgs, Vds, temperature, and phase.

    - It can be performed in time or frequency domain, and does not 
      assume linearity. 

    - He described three ways to use the NVNA, in increasing order of 
      difference from status quo:

        - In time domain, tune parameters for a pre-existing compact 
          model 
        - In time domain, build an "advanced" compact model (e.g. 
          based on neural networks)
        - In frequency domain, build an "advanced" compact model.
          It  uses "x-parameters" which are a rigorous extension of
          s-parameters to large-signal conditions.  That is, whereas
          s-parameters work for linear models, x-parameters include 
          harmonics and nonlinear distortion.

One of the tutorials addressed the long-standing but little-discussed 
challenge of working with PDKs.  Cadence callbacks, anyone? The tutorial was 
organized by Larry Nagel (Omega Enterprises) and Siva Mudanai (Intel).  The 
tutorial was given by Juan Cordovez (Sentinel IC), and by Sylwester Warecki 
(Peregrine), and focused on the nuts-and-bolts of working with / around 
PDKs in design and simulation.
 
KEYNOTE: SCALING IN MEMORY AND MICROPROCESSORS

Bill Dally (Stanford) gave the Monday keynote.  He reviewed the challenges of 
microprocessor and memory design as we scale, and what it means to custom 
circuit designers:

    - As we scale down, wires don't scale.  Memory is really an 
      interconnect problem.  Program architecture can reduce data 
      movement: configuring the memory to the program saves 30% energy.

    - "Route packets not wires".  This goes back to his NoC (network-on-
      chip) proposals of 10+ years ago.


    - Traditional CPU design has focused on single-thread throughput,
      at the expense of energy spent on fancy scheduling.  Such cores
      need 2 nJ / instruction, compared to modern streamed cores needing
      just 50 pJ / instruction.  What an era we live in: nobody
      blinked when Bill said "2 nJ is expensive." 

    - On choosing projects: "if you're wildly successful, what 
      difference does it make?"

    - He suggested "difference-making" projects for custom circuit 
      design: 

        - Variation tolerance.  He doesn't want to have to guardband 
          20% or more for PVT, variation, and noise.

        - Low energy/bit, low-latency interconnect.  For chip, 
          package, module.

        - Avoid squandering power on overhead.  This means to reduce 
          power needs of clocking, flip flops, and synchronizers. 
          Also, for scheduling instructions: "Never do every time at 
          runtime [in hardware] what you can do once in compile time 
          [in software]."
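To see why "2 nJ is expensive": at a given instruction rate, energy per instruction translates directly into watts.  Quick arithmetic on Bill's numbers (the 1 GIPS rate is my own assumption, for illustration):

```python
def sustained_power_W(energy_per_inst_J, inst_per_s):
    # P = (energy / instruction) * (instructions / second)
    return energy_per_inst_J * inst_per_s

cpu_W = sustained_power_W(2e-9, 1e9)       # 2 nJ/inst at 1 GIPS -> ~2 W
stream_W = sustained_power_W(50e-12, 1e9)  # 50 pJ/inst at 1 GIPS -> ~50 mW
```

A 40x gap, per instruction stream, before any parallelism is even considered.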

TOMORROW'S MARKET DRIVER... 

A final topic of interest is augmented reality (AR).  Today's semiconductor 
market is dominated by chips for smartphones.  Tomorrow's smartphones may 
well be in the form of AR-style goggles like Google Glass.  Two papers 
anticipate the new chips needed for AR applications:

    - Just as GPUs are key to today's video apps, object recognition 
      will be key to identifying objects in a scene, so that the scene 
      can be intelligently augmented.  In software, object recognition 
      is too slow and too power-hungry, especially for mobile 
      applications.  So object recognition chips will become necessary. 
      Anticipating this, Junyoung Park (KAIST) presented their latest 
      object recognition chip.  It is a NoC with cores to compute 
      "region-of-interest", "feature detection", "description 
      generation", and more.  This version uses reinforcement learning, 
      which avoids the manual tuning required by past approaches.

    - Hyo-Eun Kim (KAIST) and other authors from Samsung described how 
      the challenge is not just AR, but to keep the chip count low 
      while doing 3D reconstruction and 3D display.  To address this, 
      they proposed a "unified media application processor" (UMAP) that 
      can quickly and efficiently perform all of these tasks.  It is 
      also a heterogeneous many-core system.  Will this be in the Galaxy 
      Nexus VIII?

In closing, CICC was a great opportunity to learn a lot, and spend time with 
many high-quality custom design people.

    - Trent McConaghy
      Solido Design                              Vancouver, Canada