ICCD 2003 Advance Program
Room locations at the Doubletree Hotel, San Jose CA, are included in parentheses
after the session name. Please check back for updates and changes.
Registration : 12 PM - 1:30 PM
This is a special conference workshop targeting issues in low-power circuit and system design. Invited speakers will provide the context for discussion and interaction among the participants. Note that attendees must register separately for the workshop.
Workshop moderator: Bora Nikolic, UC Berkeley
1:00 - 1:15 PM Workshop welcome
1:15 - 2:00 PM David Brooks, Harvard University
"Architectural and System Level Power Analysis and Optimization"
2:00 - 2:45 PM Siva Narendra, Intel Corporation
"Silicon Integration Choices in Sub-45nm Power Limited Microprocessors"
2:45 - 3:15 PM Break
3:15 - 4:00 PM Anand Raghunathan, NEC
"Addressing the battery gap: Emerging system-level architectures
and design methodologies for mobile appliances"
4:00 - 4:45 PM Azeez Bhavnagarwala, IBM Research
"Scaling Limitations and Energy Efficient Solutions for CMOS SRAM Caches"
4:45 - 5:30 PM Wrap-up and discussion
Registration: 8 AM - 5 PM
9:30-10:15
Professor Mark Horowitz, Stanford University
Abstract
For the past 13 years, my research has included the design of high-speed
chip-to-chip communication links. When this work started, most chip
I/O were TTL levels, and ran at tens of MHz. Our initial work focused
on understanding the basic problems of chip I/O, creating the needed circuit
blocks to solve these problems, and predicting how the performance would
scale with technology. Today the situation is quite different. The basic
issues are well understood, and there are many variants of the needed
circuit blocks. However, the speed demands have also grown with time,
causing new issues to arise, and forcing the resulting designs to grow
dramatically in complexity. This talk will review the basic issues of
a high-speed link, first looking at the problems of link design 10 years
ago. After describing how the basic issues of timing and signaling are
addressed and how these solutions scale, I will look at the issues that
designers are facing today, and the techniques being used to cope with
these problems. These issues include: circuits limited by the bandwidth
of the external wires, worsening transistor matching, and rising bandwidth
requirements. The talk will close with some projections about the future,
and will look at both electronic and optical links.
Biography
Mark Horowitz is the Yahoo Founder's Professor of
Electrical Engineering and Computer Science at Stanford University.
He received his BS and MS in Electrical Engineering from MIT in 1978,
and his PhD from Stanford in 1984. Dr. Horowitz is the recipient of
a 1985 Presidential Young Investigator Award, and an IBM Faculty development
award, as well as the 1993 best paper award at the International Solid
State Circuits Conference. Dr. Horowitz's research area is in digital
system design, and he has led a number of processor designs including
MIPS-X, one of the first processors to include an on-chip instruction
cache, TORCH, a statically-scheduled, superscalar processor that supported
speculative execution, and FLASH, a flexible DSM machine. He has also
worked in a number of other chip design areas including high-speed and
low-power memory design, high-bandwidth interfaces, and fast floating
point. In 1990 he took leave from Stanford to help start Rambus Inc,
a company designing high-bandwidth chip interface technology. His current
research includes multiprocessor design, low power circuits, memory
design, and high-speed links.
10:15-11:00
Dr. William Pulleyblank, IBM Corporation
Abstract
Computer simulation is being broadly recognized as a third
pillar of research in science and engineering, joining Theory and
Experimentation. However the resulting modeling requirements go far beyond the
capabilities of current supercomputers. I will discuss this problem as well as
different solution approaches currently being tried for certain problems. This
takes us into the domain of tera-scale and peta-scale computing. In addition to
the significant hardware problems to be solved, we face software issues that
are at least as large. I will discuss these topics in the context of BlueGene/L
- a 360 teraflop/s supercomputer being built at IBM Research which will run on
the Linux operating system. Building BlueGene/L has necessitated a number of
innovations in order to achieve the targeted levels of performance. We also
discuss issues of reliability and availability, the so-called autonomic issues,
as well as projected applications and performance.
Biography
William R. Pulleyblank is the Director of Exploratory System Servers in IBM’s Research Division and the Director of the IBM Deep Computing Institute. He has also served as the Research relationship executive responsible for the Financial Services sector in IBM, the Utility and Energy Services industry, and for the Business Intelligence group. Before joining IBM Research in 1990, Dr. Pulleyblank was the holder of the Canadian Pacific Rail/NSERC Chair of Optimization and Computer Applications at the University of Waterloo. He is a member of a number of boards, including the Advisory Committee of the Division of Mathematics & Physical Sciences of the National Science Foundation, iCORE Board of Directors, the Advisory Council of the Pacific Institute for the Mathematical Sciences - PIMS, and a member of the Scientific Advisory Panel of The Fields Institute for Research in Mathematical Sciences. In addition, he serves on the editorial boards of a number of journals. Dr. Pulleyblank’s personal research interests are in Operations Research, Combinatorial Optimization, and Applications of Optimization. In addition to writing a number of scientific papers and books, he has consulted for several companies including: Mobil Oil on helicopter routing; Marks and Spencer on depot management; Statistics Canada on survey validation; and CP Rail on train scheduling.
11:00-11:45
Keynote Address: Advanced EDA Tools for High-Performance Design
Ted Vucurevich, Cadence Design Systems, Inc.
Abstract
This talk will focus on Cadence's vision for advanced EDA tools and technology for high-performance designs at sub-nanometer process nodes. There will be specific emphasis on design for manufacturability to maximize yield, "reliable" design techniques and supporting technologies, and tool support for new high performance and low power circuit design techniques.
Biography
Ted Vucurevich serves as a Cadence
Senior Vice President, responsible for driving advanced technology development
and directing Cadence Laboratories. In addition, he serves as an
executive fellow.
Vucurevich leads the Strategic Technology Office (STO). The STO researches,
plans, and promotes a world-class Cadence technology roadmap and vision to Cadence
employees, customers, and analysts. As director of Cadence Laboratories, Vucurevich
represents Cadence on various external boards and interfaces between research
efforts and product development.
In his prior role as chief architect at Cadence, Vucurevich helped develop the
strategies and technology initiatives in system-on-a-chip (SoC)-based design,
DSM infrastructure, software interoperability, design methodology development,
and Internet-based electronic system design.
Vucurevich joined Cadence in 1992 as director of the Analog Physical Design
group. In 1994 he was promoted to work as an architect in the Viper Development
group. He was later named chief architect and held that position for five
years. Prior to Cadence, Vucurevich worked 14 years at Analog Devices
where he held roles in product, design, and computer-aided design (CAD) engineering.
He was a co-founder of the Linear Signal Processing Division, where he was
responsible for the implementation of a complete mixed-signal ASIC CAD environment.
Vucurevich received his bachelor of science degree in electrical engineering
from the University of Arizona.
11:45-1:00
Lunch, Sponsored by IBM
1:00-3:00
Session 1.1: Energy Efficiency (Carmel)
Memory is a major source of power consumption. The first paper addresses low-power register files with an asymmetrically ported approach that yields lower power consumption and access time. The paper "Power Efficient Data Cache Designs" describes a combination of different threshold voltages with appropriate cache organizations, i.e. multi-banking, to achieve a good trade-off between power and performance. In their contribution "On Reducing Register Pressure and Energy in Multiple-Banked Register Files" the authors delay the dispatch of instructions in order to reduce register requirements as well as dynamic and static power dissipation. The last presentation decomposes operands to achieve lower power in multiplications.
Session Chair : Nihar Mahapatra, Michigan State University
1.1.1. Energy Efficient Asymmetrically Ported Register Files
Aneesh Aggarwal and Manoj Franklin, University of Maryland
1.1.2. Power Efficient Data Cache Designs
Jaume Abella and Antonio Gonzalez, Departament d'Arquitectura de Computadors (UPC), Spain
1.1.3. On Reducing Register Pressure and Energy in Multiple-Banked Register Files
Jaume Abella and Antonio Gonzalez, Departament d'Arquitectura de Computadors (UPC), Spain
1.1.4. Low Power Multiplication Algorithm for Switching Activity Reduction through Operand Decomposition
Masayuki Ito, D. Chinnery and Kurt Keutzer, UC-Berkeley
Session 1.2 : Timing Verification (San Jose)
The first paper deals with verification of timing. It eliminates the need for the circuit under verification to be partitioned by automatically decomposing the circuit and using appropriate abstractions. The second paper tackles the issue of overtesting of sequential circuits with respect to delay faults. The third paper reports on an accurate simulation method for crosstalk defects. The last paper of the session applies bounded model checking to circuits with multiple clocks.
Session chair: Alex Orailoglu, University of California-San Diego
1.2.1. Verification of Timed Circuits with Failure Directed Abstractions
Hao Zheng, IBM Microelectronics
Chris Myers, David Walter and Scott Little, University of Utah
Tomohiro Yoneda, National Institute of Informatics, Tokyo
1.2.2. Procedures for Identifying Untestable and Redundant Transition Faults in Synchronous Sequential Circuits
Gang Chen and Sudhakar Reddy, University of Iowa
Irith Pomeranz, Purdue University
1.2.3. Event-Centric Simulation of Crosstalk Pulse Faults in Sequential Circuits
Marong Phadoongsidhi and Kewal K. Saluja, University of Wisconsin-Madison
1.2.4. Specifying and Verifying Systems with Multiple Clocks
Edmund Clarke, Daniel Kroening and Karen Yorav, Carnegie-Mellon University
Session 1.3: Electrical Analysis for System LSI (Santa Clara)
Electrical modeling of LSI has risen to the fore as a research topic with sub-100nm geometries on the cusp of wide deployment. The first paper reports 10x improvement in speed as well as memory reduction in 3-D capacitance extraction. The second paper reports a method for coupling noise estimation that has an average error of 8% w.r.t. SPICE, but is 10x faster. The authors report the ability to run on an 80K net microprocessor in 13 minutes. The next paper reports a method for leakage computation based on a symbolic representation of the leakage scenarios for a module. This allows input don't cares to be considered in the computation, making the calculations less pessimistic. Finally, we have a method for worst case crosstalk delay computation based on an iterative algorithm that avoids non-linear driver simulations and computes the delay to within 10% of SPICE.
Session chair: Wolfgang Roethig, NEC Electronics
1.3.1. Enhanced QMM-BEM Solver for 3-D Finite-Domain Capacitance Extraction with Multilayered Dielectrics
Wenjian Yu and Zeyi Wang, Tsinghua University, Beijing China
1.3.2. An Improved method for Fast Noise Estimation based on Net Segmentation
Chih-Liang Huang and Aurobindo Dasgupta, Intel Corporation
1.3.3. Symbolic Failure Analysis of Custom CMOS Circuits due to Excessive Leakage Current
Hui-Yuan Song, Saswat Bohidar and Iris Bahar, Brown University
Joel Grodstein, Intel Corporation
1.3.4. An Efficient Algorithm for Calculating the Worst-case Delay due to Crosstalk
Venkat Rajappan and Sachin Sapatnekar, University of Minnesota
3:30-5:30
Session 2.1: Power Optimization (Carmel)
Session chair: Borivoje Nikolic, University of California-Berkeley
2.1.1. A Compact Model for Analysis and Design of CAM Block Power Network with On-chip Decoupling Capacitors
Payman Zarkesh-Ha, Ken Doniger, William Loh, Dechang Sun, Rick Stephani and Gordon Priebe, LSI Logic
2.1.2. Precomputation-based Guarding for Dynamic and Leakage Power Reduction
Afshin Abdollahi and Massoud Pedram, USC
Farzan Fallah and Indradeep Ghosh, Fujitsu Labs of America
2.1.3. Charge-recycling voltage domains for energy-efficient low-voltage operation of digital CMOS circuits
S. Rajapandian, Z. Xu, and K. L. Shepard, Columbia University
2.1.4. Low Power Adder with Adaptive Supply Voltage (short)
Hiroaki Suzuki, Woopyo Jeong, and Kaushik Roy, Purdue University
2.1.5. A Transparent Voltage Conversion Method and Its Application to a Dual-Supply-Voltage Register File (short)
Nestoras Tzartzanis and William W. Walker, Fujitsu
Session 2.2 : Gene Chip Design: special session (San Jose)
Session chair: Ken Shepard, Columbia University
2.2.1. Embedded Tutorial: Detection of Biological Molecules: From Self-Assembled Films to Self-Integrated Devices (invited)
Rastislav Levicky, Department of Chemical Engineering, Columbia University
2.2.2. Design Flow Enhancements for DNA Arrays
Andrew B. Kahng, Ion Mandoiu, Sherief Reda and Xu Xu, University of California-San Diego
Alex Zelikovsky (GSU)
Session 2.3: System Level Design (Santa Clara)
Four of the five papers in this session investigate optimizations involving communication fabrics in system LSI. The first paper presents a technique for customized bus architecture synthesis based on simulated annealing. The second paper is an interesting exploration of power reduction by means of dynamic adjustment of frequencies of the source, target and interconnecting bus during synchronous communication. The third paper presents an approach to make efficient use of embedded FPGA RAM for mapping variables shared between the host and an FPGA-based co-processor. The next paper proposes a method for efficient compiled code simulation.
Session chair: Anand Raghunathan, NEC Labs
2.3.1. Bus Architecture Synthesis for Hardware-Software Co-Design for Deep Submicron Systems on Chip
Nattawut Thepayasuwan, V. Damle, and A. Doboli, SUNY-Stony Brook
2.3.2. Dynamically Optimized Synchronous Communication for Low Power System on Chip Designs
Vikas Chandra, Carnegie-Mellon University
Gary Carpenter and Jeff Burns, IBM Austin Research Lab
2.3.3. Interface Synthesis using Memory Mapping for an FPGA Platform
Manev Luthra, Sumit Gupta and Nikil Dutt, University of California-Irvine
Rajesh Gupta, University of California, San Diego
Alex Nicolau, University of California
2.3.4. Efficient Synthesis of Networks On Chip (short)
Alessandro Pinto, Luca P. Carloni and Alberto L. Sangiovanni-Vincentelli, University of California-Berkeley
2.3.5. Reducing Compilation Time Overhead in Compiled Simulators (short)
9:00-10:30
Session 3.1: Systems Performance (Carmel)
The first paper in this session studies changes in interrupt handler performance across a range of IA-32 based systems. The second paper examines commercial server workload behavior when using bus structures that dynamically compress their traffic. The final paper in this session analyzes multi-hop bypassing networks for ALU results.
Session chair: Ed Grochowski, Intel Corporation
3.1.1. Profiling Interrupt Handler Performance through Kernel Instrumentation
Branden Moore, Thomas Slabach and Lambert Schaelicke, University of Notre Dame
3.1.2. Design and Performance of Compressed Interconnects for High Performance Servers
Krishna Kant and Ravishankar Iyer, Intel Corporation
3.1.3. Routed Inter-ALU Networks for ILP Scalability and Performance
Karthikeyan Sankaralingam, Vincent Ajay Singh, Stephen W. Keckler and Doug Burger, University of Texas-Austin
Session 3.2 : uP Test & Diagnosis (San Jose)
The first paper of the session studies how the critical paths of a microprocessor behave in silicon and suggests a test strategy based on the results. The second paper describes a SAT-based ATPG which does not require that the controller be separated from the datapath. Next, a fault tolerance design for branch predictors is proposed. The final paper extends simple-fault diagnosis to multiple faults using n-detection test sets.
Session chair: Sudhakar Reddy, The University of Iowa
3.2.1. Automatic Generation of Critical-Path Tests for a Partial-Scan Microprocessor
Joel Grodstein, Dilip Bhavsar and Vijay Bettada, Intel Corporation
Richard Davies, Hewlett Packard
3.2.2. Test Generation for Non-separable RTL Controller-datapath Circuits using a Satisfiability based Approach
Loganathan Lingappan and Niraj K. Jha, Princeton University
Srivaths Ravi, NEC Research Labs
3.2.3. Cost-Effective Graceful Degradation in Speculative Processor Subsystems: The Branch Prediction Case (short)
Sobeeh Almukhaizim, Thomas Verdel and Yiorgos Makris, Yale University
3.2.4. Multiple Fault Diagnosis Using n-Detection Tests (short)
Zhiyuan Wang and Malgorzata Marek-Sadowska, University of California-Santa Barbara
Kun-Han Tsai and Janusz Rajski, Mentor Graphics Corporation
Session 3.3: Physical Design (Santa Clara)
The first paper presents a physical design methodology that was applied to the design of a current generation microprocessor, and a tool-set derived from this experience. As integration by means of smaller geometries becomes a challenge, it is natural to consider chip stacking; the next paper explores floor-planning and place-and-route tools for stacked chips. Finally, we have a placement algorithm targeted at achieving a desired cell density distribution with application in IP based design as well as for facilitating ECO.
Session chair: Shantanu Dutt, University of Illinois-Chicago
3.3.1. A Physical Design Methodology for 1.3GHz SPARC64 Microprocessor
Noriyuki Ito, Hiroaki Komatsu, Yoshiyasu Tanamura, Roichi Yamashita, Hiroyuki Sugiyama, Yaroku Sugiyama and Hirofumi Hamamura, Fujitsu, Kawasaki Japan
3.3.2. Physical Design of the 2.5D Stacked System
Yangdong Deng and Wojciech Maly, Carnegie-Mellon University
3.3.3. Flow-Based Cell Moving Algorithm for Desired Cell Distribution
Bo-Kyung Choi, Huaiyu Xu and Majid Sarrafzadeh, UCLA
11:00-12:30
Session 4.1: Performance Optimization (Carmel)
Performance optimizations need realistic benchmarks. The first presentation introduces a benchmark suite for network processors. As buses increase area and power consumption, the next paper proposes a scheme for hardware-based compression and decompression of underutilized processor buses. The last paper combines pipelined multiplicative division with IEEE rounding.
Session Chair: Christos Kozyrakis, Stanford University
4.1.1. NpBench: A Benchmark Suite for Control plane and Data plane Applications of the Network Processor
Byeong Kil Lee and Lizy Kurian John, University of Texas-Austin
4.1.2. Hardware-Only Compression of Underutilized Address Buses: Design and Performance, Power, and Cost Analysis
Nihar Mahapatra, Jiangjiang Liu, and Krishnan Sundaresan, Michigan State University and University at Buffalo
4.1.3. Pipelined Multiplicative Division with IEEE Rounding
Guy Even, Tel-Aviv University
Peter-Michael Seidel, Southern Methodist University
Session 4.2 : Clock & Signal Distribution (San Jose)
Session chair: Frank O'Mahony, Intel Corporation
4.2.1. Design of resonant global clock distributions
S. C. Chan and K. L. Shepard, Columbia University
P. J. Restle, IBM
4.2.2. Modeling and Mitigation of Jitter in Multi-Gbps Source-Synchronous I/O Links
Ganesh Balamurugan and Naresh Shanbhag, University of Illinois
4.2.3. A Mixed-Mode Delay-Locked Loop Architecture (short)
Daniel Eckerbert, L. Svensson and P. Larsson-Edefors, Chalmers University of Technology, Goteborg Sweden
4.2.4. Optimal Inductance for On-chip RLC Interconnections (short)
Shidhartha Das, Kanak Agarwal, David Blaauw and Dennis Sylvester, University of Michigan
Session 4.3: Performance and Power-Driven Physical Design (Santa Clara)
The first paper presents a dynamic programming based formulation of automatic insertion of buffers and flip-flops in use at Intel for the synthesis of high-end production chips. The next paper presents a Game Theory based formulation of the voltage scaling and gate sizing problem with the claim that it leads to rapid convergence and better results. The final paper presents a method to reduce wire length by up to 60% in prescribed-skew clock routing.
Session chair: Arun Balakrishnan, NEC Electronics
4.3.1. Spec Based Flip-Flop And Buffer Insertion
Nataraj Akkiraju and Mosur Mohan, Intel Corporation
4.3.2. A Microeconomic Model for Simultaneous Gate Sizing and Voltage Scaling for Power Optimization
N. Ranganathan and A. K. Murugavel, University of South Florida
4.3.3. A Simple Yet Effective Merging Scheme for Prescribed-Skew Clock Routing
Rishi Chaturvedi and Jiang Hu, Texas A&M University
12:30-1:30
Lunch, Sponsored by Intel
1:30-3:30
Session 5.1: Instruction Execution (Carmel)
This session deals with faster instruction execution. The first contribution presents a hardware-based prefetching technique to alleviate load misses. In the next paper the authors propose an algorithm to match dispatch resources to average needs rather than peak needs. The next paper depicts an efficient VLIW DSP architecture for baseband processing. The session is concluded with a paper on a novel approach to speculative multithreading.
Session chair: Nihar Mahapatra, Michigan State University
5.1.1. Hardware-based pointer data prefetcher
Shih-Chang Kevin Lai, Silicon Integrated Systems, Hsin-Chu Taiwan
Shih-Lien Lu, Intel Corp
5.1.2. A Dependence Driven Efficient Dispatch Scheme
Sriram Nadathur and Akhilesh Tyagi, Iowa State University
5.1.3. An Efficient VLIW DSP Architecture for Baseband Processing
Tay-Jyi Lin, Chin-Chi Chang, Chen-Chia Lee, and Chein-Wei Jen, National Chiao Tung University, Hsin-Chu Taiwan
5.1.4. Dynamic Thread Resizing for Speculative Multithreaded Processors
Mohamed Zahran and Manoj Franklin, University of Maryland
Session 5.2 : Test Compression Technology: invited session (San Jose)
One of the most notable breakthroughs in the field of testing in recent years is test compression, where orders-of-magnitude improvements in test data volume and application time have been achieved. This session captures the major contributions made in the field and new improvements.
Session chair: Jim Sproch, Sr. Director of Research and Development, Test Automation Products, Synopsys
5.2.1. Care Bit Density and Test Cube Clusters: Multi-Level Compression Opportunities (invited)
Bernd Koenemann, Cadence Design Systems
5.2.2. XMAX: X-Tolerant Architecture for MAXimal Test Compression (invited)
Subhasish Mitra and Kee Sup Kim, Intel Corporation
5.2.3. Test Data Compression and Compaction for Embedded Test of Nanometer Technology Designs (invited)
Janusz Rajski and Jerzy Tyszer, Mentor Graphics
Session 5.3: Physical Design for Regular Fabrics and FPGA's (Santa Clara)
The first paper explores synthesis into a regular circuit fabric based on a novel BDD-like Boolean representation. The paper claims greater than 60% reductions in area and delay. The next paper proposes a new model to characterize the routing process under timing-driven constraints and use it to estimate wiring resources required in such routing for FPGAs. The final paper presents a high-quality time efficient detailed router for FPGAs.
Session chair: Tim Burks, Magma Design Automation
5.3.1. Non-Crossing Ordered BDD for Physical Synthesis of Regular Circuit Structure
Aiqun Cao and Cheng-Kok Koh, Purdue University
5.3.2. Interconnect Estimation for FPGAs under Timing Driven Domains
Parivallal Kannan and Dinesh Bhatia, University of Texas-Dallas
5.3.3. ROAD: An Order-Impervious Optimal Detailed Router for FPGA's
Hasan Arslan and Shantanu Dutt, University of Illinois at Chicago
4:00-5:30
Session 6.1: Array Design Optimization (Carmel)
Energy optimization of internal processor tables and arrays is the focus of this session. We start with a paper on dynamically resizing the data TLB depending on application execution, thus reducing power requirements. The second presentation considers several approaches for reducing reorder buffer complexity and power dissipation. Finally, a virtual page tag reduction scheme targets lower-power TLBs, and power-efficient clustered microarchitectures are described.
Session chair: Peter-Michael Seidel, Southern Methodist University
6.1.1. Reducing dTLB Energy Through Dynamic Resizing
Victor Delaluz, M. Kandemir, A. Sivasubramaniam, M.J. Irwin and N. Vijaykrishnan, Penn State University
6.1.2. Distributed Reorder Buffer Schemes for Low Power
Gurhan Kucuk, Oguz Ergin, Dmitry Ponomarev and Kanad Ghose, SUNY-Binghamton
6.1.3. Virtual Page Tag Reduction for Low-power TLBs (short)
Peter Petrov and A. Orailoglu, University of California-San Diego
6.1.4. Dynamic Cluster Resizing (short)
Jose Gonzalez and Antonio Gonzalez, Intel Labs, Barcelona Spain
Session 6.2 : Test Compaction (San Jose)
The session opens with a paper that applies linear programming to test compaction, yielding a polynomial method achieving results close to optimum. The next paper identifies don't cares in a test set and assigns them in order to ensure effective compression. The third paper constructs a compact test set for a circuit consisting of several sub-circuits, starting with their individual test sets. The last paper of the session proposes an efficient heuristic for relaxing test sequences.
Session chair: Piet Engelke, University of Freiburg, Germany
6.2.1. Independent Test Sequences Compaction through Integer Programming
Petros Drineas, Rensselaer Polytechnic Institute
Yiorgos Makris, Yale University
6.2.2. On Combining Pinpoint Test Set Relaxation and Run-Length Codes for Reducing Test Data Volume
Seiji Kajihara and Yasutoshi Doi, Kyushu Institute of Technology, Iizuka Japan
Lei Li and Krishnendu Chakrabarty, Duke University
6.2.3. Static Test Compaction for Multiple Full-Scan Circuits (short)
Irith Pomeranz, Purdue University
Sudhakar M. Reddy, University of Iowa
6.2.4. A Method to Find Don't Care Values in Test Sequences for Sequential Circuits (short)
Yoshinobu Higami, Shin-ya Kobayashi and Yuzo Takamatsu, Ehime University, Matsuyama Japan
Seiji Kajihara, Kyushu Institute of Technology
Irith Pomeranz, Purdue University
Session 6.3: Techniques for Synthesizing into Fabrics: invited session (Santa Clara)
The design and manufacture of conventional ASICs faces many hurdles. It has been proposed that some of these hurdles can be overcome by means of a methodology that implements into a predetermined fabric. This session presents an EDA company view of this methodology, followed by a case study at IBM of trying to implement an SOC by limiting the application specific portion to a minimum. Finally, we have a panel discussion on the pros and cons of this methodology.
Session chair: Sinan Kaptanoglu, Altera Corporation
6.3.1. Simplifying SoC design with the Customizable Control Processor Platform (invited)
C. Ogilvie, R. Ray, R. Devins, M. Kautzman, M. Hale, R. Bergamaschi, B. Lynch, and S. Gaur, IBM
6.3.2. Structured ASICs: Opportunities and Challenges (invited)
Behrooz Zahiri, Magma Design Automation
6.3.3. Panel Discussion: "System LSI Implementation Fabrics for the Future" (special panel discussion)
Sinan Kaptanoglu, Altera, moderator
Abstract for Panel Discussion
IC design methodology and practice are going through major changes. The ASIC industry is being squeezed by FPGAs on one end, and by standard ICs on the other. Most of the changes are fueled by sub-100 nm process technology. The mask costs are skyrocketing, and first time silicon success is way down due to complications with process parasitics, cross talk, high device leakage current and other issues. The state of the art CAD tools are unable to deal with the process related complications and the ever increasing design complexity fully automatically. More than ever before, the involvement of the designer in critical CAD related decisions has exceeded the capacity of what we may rightfully expect from any individual.
Structured ASICs (SASICs) are one response to these problems. SASICs aim to fill a void between FPGAs and traditional ASICs. SASICs reduce design risk and complexity at the expense of design performance and possibly total cost, if the production volume is very high. SASICs are dismissed by some pundits as a mere marketing gimmick. Others claim SASICs are the long-awaited salvation of the otherwise doomed ASIC industry. Predictions of SASICs disappearing in a few years are common. However, it is equally common to hear predictions that SASICs will expand tremendously and will push ASICs into a small corner of very high performance and very high production volume.
SASICs are still very much being defined in terms of their boundaries and their application space. Some SASICs are very close to traditional ASICs. Other SASICs are quite different; they are perhaps closer to traditional gate arrays in their look and feel. A third group of SASICs provide a much closer link with FPGAs.
In the panel discussion we will explore the fluid boundaries among ASICs, SASICs, and FPGAs. We will also explore different flavors of SASICs. We will discuss CAD tools and design methodology for SASICs. Of course, as any good panel should do, we will speculate on the implications of all these changes and what the future may bring. We may even come up with a few bold (and controversial) predictions about the fate of SASICs and what they will mean to the IC industry and IC CAD industry five years from now.
Bella Mia Restaurant, Downtown San Jose CA
9:00-10:30
Session 7.1: Hardware Partitioning (Carmel)
Session chair: Tom Dillinger, Sun Microsystems
7.1.1. Multiple-Vdd Scheduling/Allocation for Partitioned Floorplan
Dongku Kang, Mark C. Johnson and Kaushik Roy, Purdue University
7.1.2. SCATOMi : Scheduling Driven Circuit Partitioning Algorithm for Multiple FPGAs using Time-multiplexed, Off-chip, Multicasting Interconnection Architecture
Young-Su Kwon and Chong-Min Kyung, KAIST, Taejon, Korea
Bong-Il Park, Dynalith Systems
7.1.3. A Study of Hardware Techniques That Dynamically Exploit Frequent Operands to Reduce Power Consumption in Integer Function Units (short)
K. Gandhi and Nihar R. Mahapatra, Michigan State University
Session 7.2 : Energy-Aware Design and Application (San Jose)
Consideration of energy and power consumption during system design is a must. The first three papers consider optimization of various energy and power related metrics in high-level design. The last paper proposes a detailed simulation model and protocol for energy efficiency driven workload distribution in a wireless ad-hoc network.
Session chair: Farzan Fallah, Fujitsu Labs of America
7.2.1. KnapBind: An Area-Efficient Binding Algorithm for Low-leakage Datapaths
Chandramouli Gopalakrishnan and Srinivas Katkoori, University of South Florida
7.2.2. A Novel Synthesis Strategy Driven by Partial Evaluation Based Circuit Reduction for Application Specific DSP Circuits (short)
Madhubanti Mukherjee and Ranga Vemuri, University of Cincinnati
7.2.3. Power Fluctuation Minimization During Behavioral Synthesis using ILP-Based Datapath Scheduling (short)
Saraju Mohanty, N. Ranganathan and Sunil K. Chappidi, University of South Florida
7.2.4. An Energy-Aware Simulation Model and Transaction Protocol for Dynamic Workload Distribution in Mobile Ad Hoc Networks
Farhad Ghasemi and M. Pedram, Sharif University of Technology
Peng Rong, USC
Session 7.3: High-Speed Design Issues and Test Challenges: invited session (Santa Clara)
Above 10 GHz, engineers face unique challenges in design and testing. The first two papers present design issues and case studies. The last paper describes test solutions for devices operating above 10 GHz.
Session Chair: Kee Sup Kim, Director of DFX, Intel Communications Group, Intel
7.3.1. CMOS High-Speed Serial I/O's: Present and Future (invited)
M. Lee, William Dally, Ramin Farjad-Rad, Hiok-Tiaq Ng, Ramesh Senthinathan, John Edmondson, and John Poulton, Velio
7.3.2. Fully Differential Receiver Chipset for 40 Gb/s Applications using InP/InGaAs Single Heterojunction Bipolar Transistors (invited)
Kursad Kiziloglu, S. Seetharaman, K.W. Glass, C. Bil, H.V. Duong, G. Asmanis, Intel Corporation
7.3.3. Paradigm Shift in Designing and Testing >GB/s Data Communication Systems (invited)
Mike Li, Jan Wilstrup, Wavecrest
11:00-12:30
Session 8.1: Efficiency & Reliability (Carmel)
The first paper in this session studies memory systems built around NAND flash memory devices, storing both code and data and using caching tailored to NAND flash characteristics. The second paper investigates microarchitectural redundancy for enhancing defect tolerance and increasing yield. It examines component-level redundancy as well as array row and column redundancy and defective queue-entry isolation. The final paper looks at a real-time scheduling problem involving a dynamic voltage and frequency scaling mechanism that attempts to match the multimedia decoder's output rate to the display rate.
Session chair: Ed Grochowski, Intel Corporation
8.1.1. Cost-Efficient Memory Architecture Design of NAND Flash Memory Embedded Systems
Chanik Park, Jaeyu Seo, Sunghwan Bae, Shinhan Kim and Bumsoo Kim, Samsung, Seoul, Korea
8.1.2. Exploiting Microarchitectural Redundancy For Defect Tolerance
Premkishore Shivakumar, Stephen W. Keckler, Charles R. Moore and Doug Burger, University of Texas-Austin
8.1.3. Reducing Multimedia Decode Power using Feedback Control
Zhijian Lu, John Lach, Mircea Stan and Kevin Skadron, University of Virginia
Session 8.2: Novel Methods in Logic Synthesis (San Jose)
The first paper explores a fast method for detecting symmetries in a logic netlist that works very well in practice. Once detected, symmetries have useful applications in verification and synthesis. The second paper explores a fast new method for Boolean decomposition, while the final paper explores the application of SAT-based methods in the inner loop of Espresso-like heuristic two-level minimization.
Session chair: Rajeev Murgai, Fujitsu Labs of America
8.2.1. Structural Detection of Symmetries in Boolean Functions
Guoqiang Wang, Andreas Kuehlmann and Alberto Sangiovanni-Vincentelli, University of California-Berkeley
8.2.2. Boolean Decomposition Based on Cyclic Chains
Elena Dubrova, Maxim Teslenko and Johan Karlsson, KTH Royal Institute of Technology, Stockholm, Sweden
8.2.3. SAT-Based Algorithms for Logic Minimization
Samir Sapra, Michael Theobald and Edmund Clarke, Carnegie Mellon University
12:30-1:30
Lunch
1:30-3:00
Session 9.1: Communications & Context Management (Carmel)
The first contribution presents low-density parity-check codes and novel architectures to achieve high bit error rate performance. The authors of the next paper show new schemes to reduce the negative impact of context switches on branch prediction in embedded processors. The last two contributions reduce operand transport complexity using a novel distributed register file architecture and present a scalable, high-performance network-on-chip architecture, respectively.
Session chair: Christos Kozyrakis, Stanford University
9.1.1. Low-Density Parity-Check Decoder Architecture for High Throughput Optical Fiber Channels
Anand Selvarathinam, Euncheol Kim and Gwan Choi, Texas A&M University
9.1.2. Improving Branch Prediction Accuracy in Embedded Processors in the Presence of Context Switches
Sudeep Pasricha and Alex Veidenbaum, University of California-Irvine
9.1.3. Reducing Operand Transport Complexity of Superscalar Processors using Distributed Register Files (short)
Santithorn Bunchua, D. Scott Wills, and Linda M. Wills, Georgia Institute of Technology
9.1.4. Xpipes: a Latency Insensitive Parameterized Network-on-chip Architecture for Multi-Processor SoCs (short)
Matteo Dall'Osso, Gianluca Biccari, Luca Giovannini, Davide Bertozzi, and Luca Benini, Universita di Bologna, Bologna, Italy
Session 9.2: Board Test and Power-Aware Test (San Jose)
The first paper in this session achieves test power reduction by modifying the scan chain. Next, an MILP-based test scheduling algorithm taking the power profile of non-embedded cores into account is presented. The final paper of the session introduces a novel model for board test and proposes extending the JTAG architecture by adding a delay/skew sensor.
Session chair: Seiji Kajihara, Kyushu Institute of Technology, Japan
9.2.1. Aggressive Test Power Reduction Through Test Stimuli Transformation
Ozgur Sinanoglu and Alex Orailoglu, University of California-San Diego
9.2.2. Power-Time Tradeoff in Test Scheduling for SoCs
Mehrdad Nourani and James Chin, University of Texas-Dallas
9.2.3. Multiple Transition Model and Enhanced Boundary Scan Architecture to Test Interconnects for Signal Integrity
Mohammad Tehranipour, Nisar Ahmed and Mehrdad Nourani, University of Texas-Dallas