ICCD 2003 Technical Program

ICCD 2003 Advance Program

Room locations at the Doubletree Hotel, San Jose CA, are included in parentheses after the session name. Please check back for updates and changes.

Sunday, October 12

Registration : 12 PM - 1:30 PM

Conference Workshop : Low-Power Circuit and System Design
1:00 PM - 5:30 PM (Santa Clara)

This is a special conference workshop targeting issues in low-power circuit and system design. Invited speakers will provide the context for discussion and interaction among the participants. Note that attendees must register separately for the workshop.

Workshop moderator: Bora Nikolic, UC Berkeley

1:00 - 1:15 PM Workshop welcome

1:15 - 2:00 PM David Brooks, Harvard University

"Architectural and System Level Power Analysis and Optimization"

2:00 - 2:45 PM Siva Narendra, Intel Corporation

"Silicon Integration Choices in Sub-45nm Power Limited Microprocessors"

2:45 - 3:15 PM Break

3:15 - 4:00 PM Anand Raghunathan, NEC

"Addressing the battery gap: Emerging system-level architectures and design methodologies for mobile appliances"

4:00 - 4:45 PM Azeez Bhavnagarwala, IBM Research

"Scaling Limitations and Energy Efficient Solutions for CMOS SRAM Caches"

4:45 - 5:30 PM Wrap-up and discussion

Monday, October 13

Registration: 8 AM - 5 PM

Convocation session 9 AM - 11:45 AM (Donner Pass)

9:00-9:30

Welcome from Ken Shepard, ICCD 2003 General Chair, and Tom Dillinger, ICCD 2003 Technical Program Chair

9:30-10:15

Keynote Address: High-Speed Link Design; Then and Now

Professor Mark Horowitz, Stanford University

Abstract

For the past 13 years, my research has included the design of high-speed chip-to-chip communication links. When this work started, most chip I/O used TTL levels and ran at tens of MHz. Our initial work focused on understanding the basic problems of chip I/O, creating the needed circuit blocks to solve these problems, and predicting how the performance would scale with technology. Today the situation is quite different. The basic issues are well understood, and there are many variants of the needed circuit blocks. However, the speed demands have also grown with time, causing new issues to arise and forcing the resulting designs to grow dramatically in complexity. This talk will review the basic issues of a high-speed link, first looking at the problems of link design 10 years ago. After describing how the basic issues of timing and signaling are addressed and how these solutions scale, I will look at the issues that designers are facing today and the techniques being used to cope with these problems. These issues include circuits limited by the bandwidth of the external wires, worsening transistor matching, and rising bandwidth requirements. The talk will close with some projections about the future and will look at both electronic and optical links.

Biography

Mark Horowitz is the Yahoo Founder's Professor of Electrical Engineering and Computer Science at Stanford University. He received his BS and MS in Electrical Engineering from MIT in 1978, and his PhD from Stanford in 1984. Dr. Horowitz is the recipient of a 1985 Presidential Young Investigator Award and an IBM Faculty Development Award, as well as the 1993 best paper award at the International Solid State Circuits Conference. Dr. Horowitz's research area is digital system design, and he has led a number of processor designs including MIPS-X, one of the first processors to include an on-chip instruction cache; TORCH, a statically scheduled superscalar processor that supported speculative execution; and FLASH, a flexible DSM machine. He has also worked in a number of other chip design areas including high-speed and low-power memory design, high-bandwidth interfaces, and fast floating point. In 1990 he took leave from Stanford to help start Rambus Inc, a company designing high-bandwidth chip interface technology. His current research includes multiprocessor design, low power circuits, memory design, and high-speed links.

10:15-11:00

Keynote Address: Terascale Computing and BlueGene

Dr. William Pulleyblank, IBM Corporation

Abstract

Computer simulation is being broadly recognized as a third pillar of research in science and engineering, joining Theory and Experimentation. However, the resulting modeling requirements go far beyond the capabilities of current supercomputers. I will discuss this problem as well as different solution approaches currently being tried for certain problems. This takes us into the domain of tera-scale and peta-scale computing. In addition to the significant hardware problems to be solved, we face software issues that are at least as large. I will discuss these topics in the context of BlueGene/L, a 360 teraflop/s supercomputer being built at IBM Research which will run on the Linux operating system. Building BlueGene/L has necessitated a number of innovations in order to achieve the targeted levels of performance. I will also discuss issues of reliability and availability, the so-called autonomic issues, as well as projected applications and performance.

Biography

William R. Pulleyblank is the Director of Exploratory System Servers in IBM’s Research Division and the Director of the IBM Deep Computing Institute. He has also served as the Research relationship executive responsible for the Financial Services sector in IBM, the Utility and Energy Services industry, and the Business Intelligence group. Before joining IBM Research in 1990, Dr. Pulleyblank was the holder of the Canadian Pacific Rail/NSERC Chair of Optimization and Computer Applications at the University of Waterloo. He is a member of a number of boards, including the Advisory Committee of the Division of Mathematics & Physical Sciences of the National Science Foundation, the iCORE Board of Directors, the Advisory Council of the Pacific Institute for the Mathematical Sciences (PIMS), and the Scientific Advisory Panel of The Fields Institute for Research in Mathematical Sciences. In addition he serves on the editorial boards of a number of journals. Dr. Pulleyblank’s personal research interests are in Operations Research, Combinatorial Optimization, and Applications of Optimization. In addition to writing a number of scientific papers and books, he has consulted for several companies including Mobil Oil on helicopter routing, Marks and Spencer on depot management, Statistics Canada on survey validation, and CP Rail on train scheduling.

11:00-11:45

Keynote Address: Advanced EDA Tools for High-Performance Design

Ted Vucurevich, Cadence Design Systems, Inc.

Abstract

This talk will focus on Cadence's vision for advanced EDA tools and technology for high-performance designs at sub-nanometer process nodes. There will be specific emphasis on design for manufacturability to maximize yield, "reliable" design techniques and supporting technologies, and tool support for new high performance and low power circuit design techniques.

Biography

Ted Vucurevich serves as a Cadence Senior Vice President, responsible for driving advanced technology development and directing Cadence Laboratories. In addition, he serves as an executive fellow.

Vucurevich leads the Strategic Technology Office (STO). The STO researches, plans, and promotes a world-class Cadence technology roadmap and vision to Cadence employees, customers, and analysts. As director of Cadence Laboratories, Vucurevich represents Cadence on various external boards and interfaces between research efforts and product development.

In his prior role as chief architect at Cadence, Vucurevich helped develop the strategies and technology initiatives in system-on-a-chip (SoC)-based design, DSM infrastructure, software interoperability, design methodology development, and Internet-based electronic system design.

Vucurevich joined Cadence in 1992 as director of the Analog Physical Design group. In 1994 he was promoted to work as an architect in the Viper Development group. He was later named chief architect and held that position for five years. Prior to Cadence, Vucurevich worked 14 years at Analog Devices where he held roles in product, design, and computer-aided design (CAD) engineering. He was a co-founder of the Linear Signal Processing Division, where he was responsible for the implementation of a complete mixed-signal ASIC CAD environment.

Vucurevich received his bachelor of science degree in electrical engineering from the University of Arizona.

11:45-1:00

Lunch, Sponsored by IBM

1:00-3:00

Session 1.1: Energy Efficiency (Carmel)

Memory is a major source of power consumption. The first paper addresses low-power register files by taking an asymmetrically ported approach that yields lower power consumption and access time. The paper called "Power Efficient Data Cache Designs" describes a combination of different threshold voltages with appropriate cache organizations, i.e. multi-banking, to achieve a good trade-off between power and performance. In their contribution "On Reducing Register Pressure and Energy in Multiple-Banked Register Files" the authors delay the dispatch of instructions in order to reduce register requirements as well as dynamic and static power dissipation. The last presentation decomposes operands to achieve lower power in multiplications.

Session Chair : Nihar Mahapatra, Michigan State University

1.1.1. Energy Efficient Asymmetrically Ported Register Files

Aneesh Aggarwal and Manoj Franklin, University of Maryland

1.1.2. Power Efficient Data Cache Designs

Jaume Abella and Antonio Gonzalez, Departament d'Arquitectura de Computadors (UPC), Spain

1.1.3. On Reducing Register Pressure and Energy in Multiple-Banked Register Files

Jaume Abella and Antonio Gonzalez, Departament d'Arquitectura de Computadors (UPC), Spain

1.1.4. Low Power Multiplication Algorithm for Switching Activity Reduction through Operand Decomposition

Masayuki Ito, D. Chinnery and Kurt Keutzer, UC-Berkeley

Session 1.2 : Timing Verification (San Jose)

The first paper deals with verification of timing. It eliminates the need for the circuit under verification to be partitioned by automatically decomposing the circuit and using appropriate abstractions. The second paper tackles the issue of overtesting of sequential circuits with respect to delay faults. The third paper reports on an accurate simulation method for crosstalk defects. The last paper of the session applies bounded model checking to circuits with multiple clocks.

Session chair: Alex Orailoglu, University of California-San Diego

1.2.1. Verification of Timed Circuits with Failure Directed Abstractions

Hao Zheng, IBM Microelectronics
Chris Myers, David Walter and Scott Little, University of Utah
Tomohiro Yoneda, National Institute of Informatics, Tokyo

1.2.2. Procedures for Identifying Untestable and Redundant Transition Faults in Synchronous Sequential Circuits

Gang Chen and Sudhakar Reddy, University of Iowa
Irith Pomeranz, Purdue University

1.2.3. Event-Centric Simulation of Crosstalk Pulse Faults in Sequential Circuits

Marong Phadoongsidhi and Kewal K. Saluja, University of Wisconsin-Madison

1.2.4. Specifying and Verifying Systems with Multiple Clocks

Edmund Clarke, Daniel Kroening and Karen Yorav, Carnegie-Mellon University

Session 1.3: Electrical Analysis for System LSI (Santa Clara)

Electrical modeling of LSI has risen to the fore as a research topic with sub-100nm geometries on the cusp of wide deployment. The first paper reports a 10x improvement in speed as well as memory reduction in 3-D capacitance extraction. The second paper reports a method for coupling noise estimation that has an average error of 8% w.r.t. SPICE, but is 10x faster. The authors report the ability to run on an 80K net microprocessor in 13 minutes. The next paper reports a method for leakage computation based on a symbolic representation of the leakage scenarios for a module. This allows input don't cares to be considered in the computation, making the calculations less pessimistic. Finally, we have a method for worst case crosstalk delay computation based on an iterative algorithm that avoids non-linear driver simulations and computes the delay to within 10% of SPICE.

Session chair: Wolfgang Roethig, NEC Electronics

1.3.1. Enhanced QMM-BEM Solver for 3-D Finite-Domain Capacitance Extraction with Multilayered Dielectrics

Wenjian Yu and Zeyi Wang, Tsinghua University, Beijing China

1.3.2. An Improved method for Fast Noise Estimation based on Net Segmentation

Chih-Liang Huang and Aurobindo Dasgupta, Intel Corporation

1.3.3. Symbolic Failure Analysis of Custom CMOS Circuits due to Excessive Leakage Current

Hui-Yuan Song, Saswat Bohidar and Iris Bahar, Brown University
Joel Grodstein, Intel Corporation

1.3.4. An Efficient Algorithm for Calculating the Worst-case Delay due to Crosstalk

Venkat Rajappan and Sachin Sapatnekar, University of Minnesota

3:30-5:30

Session 2.1: Power Optimization (Carmel)

Session chair: Borivoje Nikolic, University of California-Berkeley

2.1.1. A Compact Model for Analysis and Design of CAM Block Power Network with On-chip Decoupling Capacitors

Payman Zarkesh-Ha, Ken Doniger, William Loh, Dechang Sun, Rick Stephani and Gordon Priebe, LSI Logic

2.1.2. Precomputation-based Guarding for Dynamic and Leakage Power Reduction

Afshin Abdollahi and Massoud Pedram, USC
Farzan Fallah and Indradeep Ghosh, Fujitsu Labs of America

2.1.3. Charge-recycling voltage domains for energy-efficient low-voltage operation of digital CMOS circuits

S. Rajapandian, Z. Xu, and K. L. Shepard, Columbia University

2.1.4. Low Power Adder with Adaptive Supply Voltage (short)

Hiroaki Suzuki, Woopyo Jeong, and Kaushik Roy, Purdue University

2.1.5. A Transparent Voltage Conversion Method and Its Application to a Dual-Supply-Voltage Register File (short)

Nestoras Tzartzanis and William W. Walker, Fujitsu

Session 2.2 : Gene Chip Design: special session (San Jose)

Session chair: Ken Shepard, Columbia University

2.2.1. Embedded Tutorial: Detection of Biological Molecules: From Self-Assembled Films to Self-Integrated Devices (invited)

Rastislav Levicky, Department of Chemical Engineering, Columbia University

2.2.2. Design Flow Enhancements for DNA Arrays

Andrew B. Kahng, Ion Mandoiu, Sherief Reda and Xu Xu, University of California-San Diego
Alex Zelikovsky, GSU

Session 2.3: System Level Design (Santa Clara)

Four of the five papers in this session investigate optimizations involving communication fabrics in system LSI. The first paper presents a technique for customized bus architecture synthesis based on simulated annealing. The second paper is an interesting exploration of power reduction by means of dynamic adjustment of frequencies of the source, target and interconnecting bus during synchronous communication. The third paper presents an approach to make efficient use of embedded FPGA RAM for mapping variables shared between the host and an FPGA-based co-processor. The final paper proposes a method for efficient compiled code simulation.

Session chair: Anand Raghunathan, NEC Labs

2.3.1. Bus Architecture Synthesis for Hardware-Software Co-Design for Deep Submicron Systems on Chip

Nattawut Thepayasuwan, V. Damle, and A. Doboli, SUNY-Stony Brook

2.3.2. Dynamically Optimized Synchronous Communication for Low Power System on Chip Designs

Vikas Chandra, Carnegie-Mellon University
Gary Carpenter and Jeff Burns, IBM Austin Research Lab

2.3.3. Interface Synthesis using Memory Mapping for an FPGA Platform

Manev Luthra, Sumit Gupta and Nikil Dutt, University of California-Irvine
Rajesh Gupta, University of California, San Diego
Alex Nicolau, University of California

2.3.4. Efficient Synthesis of Networks On Chip (short)

Alessandro Pinto, Luca P. Carloni and Alberto L. Sangiovanni-Vincentelli, University of California-Berkeley

2.3.5. Reducing Compilation Time Overhead in Compiled Simulators (short)

Mehrdad Reshadi and Nikil Dutt, University of California-Irvine

Tuesday, October 14

Registration: 8 AM - 4 PM

9:00-10:30

Session 3.1: Systems Performance (Carmel)

The first paper in this session studies changes in interrupt handler performance across a range of IA-32 based systems. The second paper examines commercial server workload behavior when using bus structures that dynamically compress their traffic. The final paper in this session analyzes multi-hop bypassing networks for ALU results.

Session chair: Ed Grochowski, Intel Corporation

3.1.1. Profiling Interrupt Handler Performance through Kernel Instrumentation

Branden Moore, Thomas Slabach and Lambert Schaelicke, University of Notre Dame

3.1.2. Design and Performance of Compressed Interconnects for High Performance Servers

Krishna Kant and Ravishankar Iyer, Intel Corporation

3.1.3. Routed Inter-ALU Networks for ILP Scalability and Performance

Karthikeyan Sankaralingam, Vincent Ajay Singh, Stephen W. Keckler and Doug Burger, University of Texas-Austin

Session 3.2 : uP Test & Diagnosis (San Jose)

The first paper of the session studies how the critical paths of a microprocessor behave in silicon and suggests a test strategy based on the results. The second paper describes a SAT-based ATPG which does not require the controller to be separated from the datapath. Next, a fault tolerance design for branch predictors is proposed. The final paper extends simple-fault diagnosis to multiple faults using n-detection test sets.

Session chair: Sudhakar Reddy, The University of Iowa

3.2.1. Automatic Generation of Critical-Path Tests for a Partial-Scan Microprocessor

Joel Grodstein, Dilip Bhavsar and Vijay Bettada, Intel Corporation
Richard Davies, Hewlett Packard

3.2.2. Test Generation for Non-separable RTL Controller-datapath Circuits using a Satisfiability based Approach

Loganathan Lingappan and Niraj K. Jha, Princeton University
Srivaths Ravi, NEC Research Labs

3.2.3. Cost-Effective Graceful Degradation in Speculative Processor Subsystems: The Branch Prediction Case (short)

Sobeeh Almukhaizim, Thomas Verdel and Yiorgos Makris, Yale University

3.2.4. Multiple Fault Diagnosis Using n-Detection Tests (short)

Zhiyuan Wang and Malgorzata Marek-Sadowska, University of California-Santa Barbara
Kun-Han Tsai and Janusz Rajski, Mentor Graphics Corporation

Session 3.3: Physical Design (Santa Clara)

The first paper presents a physical design methodology that was applied to the design of a current generation microprocessor, and a tool-set derived from this experience. As integration by means of smaller geometries becomes a challenge, it is natural to consider chip stacking; the next paper explores floor-planning and place-and-route tools for stacked chips. Finally, we have a placement algorithm targeted at achieving a desired cell density distribution, with application in IP based design as well as for facilitating ECO.

Session chair: Shantanu Dutt, University of Illinois-Chicago

3.3.1. A Physical Design Methodology for 1.3GHz SPARC64 Microprocessor

Noriyuki Ito, Hiroaki Komatsu, Yoshiyasu Tanamura, Roichi Yamashita, Hiroyuki Sugiyama, Yaroku Sugiyama and Hirofumi Hamamura, Fujitsu, Kawasaki Japan

3.3.2. Physical Design of the 2.5D Stacked System

Yangdong Deng and Wojciech Maly, Carnegie-Mellon University

3.3.3. Flow-Based Cell Moving Algorithm for Desired Cell Distribution

Bo-Kyung Choi, Huaiyu Xu and Majid Sarrafzadeh, UCLA

11:00-12:30

Session 4.1: Performance Optimization (Carmel)

Performance optimizations need realistic benchmarks. The first presentation focuses on a specific benchmark for network processors. As busses increase area and power consumption, the next paper proposes a scheme for hardware-based compression and decompression of underutilized processor busses. The last paper combines pipelined multiplicative division with IEEE rounding.

Session Chair: Christos Kozyrakis, Stanford University

4.1.1. NpBench: A Benchmark Suite for Control plane and Data plane Applications of the Network Processor

Byeong Kil Lee and Lizy Kurian John, University of Texas-Austin

4.1.2. Hardware-Only Compression of Underutilized Address Buses: Design and Performance, Power, and Cost Analysis

Nihar Mahapatra, Jiangjiang Liu, and Krishnan Sundaresan, Michigan State University and University at Buffalo

4.1.3. Pipelined Multiplicative Division with IEEE Rounding

Guy Even, Tel-Aviv University
Peter-Michael Seidel, Southern Methodist University

Session 4.2 : Clock & Signal Distribution (San Jose)

Session chair: Frank O'Mahony, Intel Corporation

4.2.1. Design of resonant global clock distributions

S. C. Chan, K. L. Shepard, Columbia University
P. J. Restle, IBM

4.2.2. Modeling and Mitigation of Jitter in Multi-Gbps Source-Synchronous I/O Links

Ganesh Balamurugan and Naresh Shanbhag, University of Illinois

4.2.3. A Mixed-Mode Delay-Locked Loop Architecture (short)

Daniel Eckerbert, L. Svensson and P. Larsson-Edefors, Chalmers University of Technology, Goteborg Sweden

4.2.4. Optimal Inductance for On-chip RLC Interconnections (short)

Shidhartha Das, Kanak Agarwal, David Blaauw and Dennis Sylvester, University of Michigan

Session 4.3: Performance and Power-Driven Physical Design (Santa Clara)

The first paper presents a dynamic programming based formulation of automatic insertion of buffers and flip-flops in use at Intel for the synthesis of high-end production chips. The next paper presents a Game Theory based formulation of the voltage scaling and gate sizing problem with the claim that it leads to rapid convergence and better results. The final paper presents a method to reduce wire length by up to 60% in prescribed-skew clock routing.

Session chair: Arun Balakrishnan, NEC Electronics

4.3.1. Spec Based Flip-Flop And Buffer Insertion

Nataraj Akkiraju and Mosur Mohan, Intel Corporation

4.3.2. A Microeconomic Model for Simultaneous Gate Sizing and Voltage Scaling for Power Optimization

N. Ranganathan and A. K. Murugavel, University of South Florida

4.3.3. A Simple Yet Effective Merging Scheme for Prescribed-Skew Clock Routing

Rishi Chaturvedi and Jiang Hu, Texas A&M University

12:30-1:30

Lunch, Sponsored by Intel

1:30-3:30

Session 5.1: Instruction Execution (Carmel)

This session deals with faster instruction execution. The first contribution presents a hardware-based prefetching technique to alleviate load misses. In the next paper the authors propose an algorithm to match dispatch resources to average needs rather than peak needs. The next paper depicts an efficient VLIW DSP architecture for baseband processing. The session is concluded with a paper on a novel approach to speculative multithreading.

Session chair: Nihar Mahapatra, Michigan State University

5.1.1. Hardware-based pointer data prefetcher

Shih-Chang Kevin Lai, Silicon Integrated Systems, Hsin-Chu Taiwan
Shih-Lien Lu, Intel Corp

5.1.2. A Dependence Driven Efficient Dispatch Scheme

Sriram Nadathur and Akhilesh Tyagi, Iowa State University

5.1.3. An Efficient VLIW DSP Architecture for Baseband Processing

Tay-Jyi Lin, Chin-Chi Chang, Chen-Chia Lee, and Chein-Wei Jen, National Chiao Tung University, Hsin-Chu Taiwan

5.1.4. Dynamic Thread Resizing for Speculative Multithreaded Processors

Mohamed Zahran and Manoj Franklin, University of Maryland

Session 5.2 : Test Compression Technology: invited session (San Jose)

One of the most notable breakthroughs in the field of testing in recent years is test compression, where orders-of-magnitude improvements in test data volume and application time have been achieved. This session captures the major contributions made in the field and new improvements.

Session chair: Jim Sproch, Sr. Director of Research and Development, Test Automation Products, Synopsys

5.2.1. Care Bit Density and Test Cube Clusters: Multi-Level Compression Opportunities (invited)

Bernd Koenemann, Cadence Design Systems

5.2.2. XMAX: X-Tolerant Architecture for MAXimal Test Compression (invited)

Subhasish Mitra and Kee Sup Kim, Intel Corporation

5.2.3. Test Data Compression and Compaction for Embedded Test of Nanometer Technology Designs (invited)

Janusz Rajski and Jerzy Tyszer, Mentor Graphics

Session 5.3: Physical Design for Regular Fabrics and FPGA's (Santa Clara)

The first paper explores synthesis into a regular circuit fabric based on a novel BDD-like Boolean representation. The paper claims greater than 60% reductions in area and delay. The next paper proposes a new model to characterize the routing process under timing-driven constraints and use it to estimate wiring resources required in such routing for FPGAs. The final paper presents a high-quality time efficient detailed router for FPGAs.

Session chair: Tim Burks, Magma Design Automation

5.3.1. Non-Crossing Ordered BDD for Physical Synthesis of Regular Circuit Structure

Aiqun Cao and Cheng-Kok Koh, Purdue University

5.3.2. Interconnect Estimation for FPGAs under Timing Driven Domains

Parivallal Kannan and Dinesh Bhatia, University of Texas-Dallas

5.3.3. ROAD: An Order-Impervious Optimal Detailed Router for FPGA's

Hasan Arslan and Shantanu Dutt, University of Illinois at Chicago

4:00-5:30

Session 6.1: Array Design Optimization (Carmel)

Energy optimization of internal processor tables and arrays is the focus in this session. We start with a paper on dynamically resizing the data TLB depending on application execution, thereby lowering power requirements. The second presentation considers several approaches for reducing reorder buffer complexity and power dissipation. Finally, the goal of lower power TLBs is tackled by a virtual page tag reduction scheme, and power-efficient clustered microarchitectures are described.

Session chair: Peter-Michael Seidel, Southern Methodist University

6.1.1. Reducing dTLB Energy Through Dynamic Resizing

Victor Delaluz, M. Kandemir, A. Sivasubramaniam, M.J. Irwin and N. Vijaykrishnan, Penn State University

6.1.2. Distributed Reorder Buffer Schemes for Low Power

Gurhan Kucuk, Oguz Ergin, Dmitry Ponomarev and Kanad Ghose, SUNY-Binghamton

6.1.3. Virtual Page Tag Reduction for Low-power TLBs (short)

Peter Petrov and A. Orailoglu, University of California-San Diego

6.1.4. Dynamic Cluster Resizing (short)

Jose Gonzalez and Antonio Gonzalez, Intel Labs, Barcelona Spain

Session 6.2 : Test Compaction (San Jose)

The session is opened by a paper which applies linear programming to test compaction, yielding a polynomial method achieving results close to optimum. The next paper identifies don't cares in a test set and assigns them in order to ensure effective compression. The third paper constructs a compact test set for a circuit consisting of several sub-circuits, starting with their individual test sets. The last paper of the session proposes an efficient heuristic for relaxing test sequences.

Session chair: Piet Engelke, University of Freiburg, Germany

6.2.1. Independent Test Sequences Compaction through Integer Programming

Petros Drineas, Rensselaer Polytechnic Institute
Yiorgos Makris, Yale University

6.2.2. On Combining Pinpoint Test Set Relaxation and Run-Length Codes for Reducing Test Data Volume

Seiji Kajihara and Yasutoshi Doi, Kyushu Institute of Technology, Iizuka Japan
Lei Li and Krishnendu Chakrabarty, Duke University

6.2.3. Static Test Compaction for Multiple Full-Scan Circuits (short)

Irith Pomeranz, Purdue University
Sudhakar M. Reddy, University of Iowa

6.2.4. A Method to Find Don't Care Values in Test Sequences for Sequential Circuits (short)

Yoshinobu Higami, Shin-ya Kobayashi and Yuzo Takamatsu, Ehime University, Matsuyama Japan
Seiji Kajihara, Kyushu Institute of Technology
Irith Pomeranz, Purdue University

Session 6.3: Techniques for Synthesizing into Fabrics: invited session (Santa Clara)

The design and manufacture of conventional ASICs faces many hurdles. It has been proposed that some of these hurdles can be overcome by means of a methodology that implements into a predetermined fabric. This session presents an EDA company view of this methodology and a case study at IBM of trying to implement an SOC by limiting the application specific portion to a minimum. Finally, we have a panel discussion on the pros and cons of this methodology.

Session chair: Sinan Kaptanoglu, Altera Corporation

6.3.1. Simplifying SoC design with the Customizable Control Processor Platform (invited)

C. Ogilvie, R. Ray, R. Devins, M. Kautzman, M. Hale, R. Bergamaschi, B. Lynch, and S. Gaur, IBM

6.3.2. Structured ASICs: Opportunities and Challenges (invited)

Behrooz Zahiri, Magma Design Automation

6.3.3. Panel Discussion: "System LSI Implementation Fabrics for the Future" (special panel discussion)

Sinan Kaptanoglu, Altera, moderator

Abstract for Panel Discussion

IC design methodology and practice are going through major changes. The ASIC industry is being squeezed by FPGAs on one end, and by standard ICs on the other. Most of the changes are fueled by sub-100 nm process technology. The mask costs are skyrocketing, and first time silicon success is way down due to complications with process parasitics, cross talk, high device leakage current and other issues. The state of the art CAD tools are unable to deal with the process related complications and the ever increasing design complexity fully automatically. More than ever before, the involvement of the designer in critical CAD related decisions has exceeded the capacity of what we may rightfully expect from any individual.

Structured ASICs (SASICs) are one response to these problems. SASICs aim to fill a void between FPGAs and traditional ASICs. SASICs reduce design risk and complexity at the expense of design performance and possibly total cost, if the production volume is very high. SASICs are dismissed by some pundits as a mere marketing gimmick. Others claim SASICs are the long awaited salvation of the otherwise doomed ASIC industry. Predictions of SASICs disappearing in a few years are common. However, it is equally common to hear predictions that SASICs will expand tremendously and will push ASICs into a small corner of very high performance and very high production volume.

SASICs are still very much being defined in terms of their boundaries and their application space. Some SASICs are very close to traditional ASICs. Other SASICs are quite different; they are perhaps closer to traditional gate arrays in their look and feel. A third group of SASICs provide a much closer link with FPGAs.

In the panel discussion we will explore the fluid boundaries among ASICs, SASICs, and FPGAs. We will also explore different flavors of SASICs. We will discuss CAD tools and design methodology for SASICs. Of course, as any good panel should do, we will speculate on the implications of all these changes and what the future may bring. We may even come up with a few bold (and controversial) predictions about the fate of SASICs and what they will mean to the IC industry and IC CAD industry five years from now.

7:00-9:00

Banquet

Bella Mia Restaurant, Downtown San Jose CA

Wednesday, October 15

Registration: 8:30 AM - 4 PM

9:00-10:30

Session 7.1: Hardware Partitioning (Carmel)

Session chair: Tom Dillinger, Sun Microsystems

7.1.1. Multiple-Vdd Scheduling/Allocation for Partitioned Floorplan

Dongku Kang, Mark C. Johnson and Kaushik Roy, Purdue University

7.1.2. SCATOMi : Scheduling Driven Circuit Partitioning Algorithm for Multiple FPGAs using Time-multiplexed, Off-chip, Multicasting Interconnection Architecture

Young-Su Kwon and Chong-Min Kyung, KAIST, Taejon, Korea
Bong-Il Park, Dynalith Systems

7.1.3. A Study of Hardware Techniques That Dynamically Exploit Frequent Operands to Reduce Power Consumption in Integer Function Units (short)

K. Gandhi and Nihar R. Mahapatra, Michigan State University

Session 7.2 : Energy-Aware Design and Application (San Jose)

Consideration of energy and power consumption during system design is a must. The first three papers consider optimization of various energy and power related metrics in high-level design. The last paper proposes a detailed simulation model and protocol for energy efficiency driven workload distribution in a wireless ad-hoc network.

Session chair: Farzan Fallah, Fujitsu Labs of America

7.2.1. KnapBind: An Area-Efficient Binding Algorithm for Low-leakage Datapaths

Chandramouli Gopalakrishnan and Srinivas Katkoori, University of South Florida

7.2.2. A Novel Synthesis Strategy Driven by Partial Evaluation Based Circuit Reduction for Application Specific DSP Circuits (short)

Madhubanti Mukherjee and Ranga Vemuri, University of Cincinnati

7.2.3. Power Fluctuation Minimization During Behavioral Synthesis using ILP-Based Datapath Scheduling (short)

Saraju Mohanty, N. Ranganathan and Sunil K. Chappidi, University of South Florida

7.2.4. An Energy-Aware Simulation Model and Transaction Protocol for Dynamic Workload Distribution in Mobile Ad Hoc Networks

Farhad Ghasemi and M. Pedram, Sharif University of Technology
Peng Rong, USC

Session 7.3: High-Speed Design Issues and Test Challenges: invited session (Santa Clara)

Above 10 GHz, engineers face unique challenges in design and testing. The first two papers present design issues and case studies. The last paper describes test solutions for devices operating above 10 GHz.

Session chair: Kee Sup Kim, Director of DFX, Intel Communications Group, Intel

7.3.1. CMOS High-Speed Serial I/O's: Present and Future (invited)

M. Lee, William Dally, Ramin Farjad-Rad, Hiok-Tiaq Ng, Ramesh Senthinathan, John Edmondson, and John Poulton, Velio

7.3.2. Fully Differential Receiver Chipset for 40 Gb/s Applications using InP/InGaAs Single Heterojunction Bipolar Transistors (invited)

Kursad Kiziloglu, S. Seetharaman, K.W. Glass, C. Bil, H.V. Duong, G. Asmanis, Intel Corporation

7.3.3. Paradigm Shift in Designing and Testing >GB/s Data Communication Systems (invited)

Mike Li, Jan Wilstrup, Wavecrest

11:00-12:30

Session 8.1: Efficiency & Reliability (Carmel)

The first paper in this session studies memory systems built around NAND flash memory devices, storing both code and data and using caching tailored to NAND flash characteristics. The second paper investigates microarchitectural redundancy for enhancing defect tolerance and increasing yield. It examines component-level redundancy as well as array row and column redundancy and defective queue-entry isolation. The final paper looks at a real-time scheduling problem involving a dynamic voltage and frequency scaling mechanism that attempts to match the multimedia decoder's output rate to the display rate.

Session chair: Ed Grochowski, Intel Corporation

8.1.1. Cost-Efficient Memory Architecture Design of NAND Flash Memory Embedded Systems

Chanik Park, Jaeyu Seo, Sunghwan Bae, Shinhan Kim and Bumsoo Kim, Samsung, Seoul Korea

8.1.2. Exploiting Microarchitectural Redundancy For Defect Tolerance

Premkishore Shivakumar, Stephen W. Keckler, Charles R. Moore and Doug Burger, University of Texas-Austin

8.1.3. Reducing Multimedia Decode Power using Feedback Control

Zhijian Lu, John Lach, Mircea Stan and Kevin Skadron, University of Virginia

Session 8.2: Novel Methods in Logic Synthesis (San Jose)

The first paper explores a fast method for detecting symmetries in a logic netlist that works very well in practice. Once detected, symmetries have useful applications in verification and synthesis. The second paper explores a fast new method for Boolean decomposition, while the final paper explores the application of SAT-based methods in the inner loop of Espresso-like heuristic two-level minimization.

Session chair: Rajeev Murgai, Fujitsu Labs of America

8.2.1. Structural Detection of Symmetries in Boolean Functions

Guoqiang Wang, Andreas Kuehlmann and Alberto Sangiovanni-Vincentelli, University of California-Berkeley

8.2.2. Boolean Decomposition Based on Cyclic Chains

Elena Dubrova, Maxim Teslenko and Johan Karlsson, KTH Royal Institute of Technology, Stockholm Sweden

8.2.3. SAT-Based Algorithms for Logic Minimization

Samir Sapra, Michael Theobald and Edmund Clarke, Carnegie Mellon University

12:30-1:30

Lunch

1:30-3:00

Session 9.1: Communications & Context Management (Carmel)

The first contribution presents low-density parity-check codes and novel architectures to achieve high bit error rate performance. The authors of the next paper show new schemes to reduce the negative impact of context switches on branch prediction in embedded processors. The last two contributions address operand transport complexity using a novel distributed register file architecture and present a scalable, high-performance network-on-chip architecture, respectively.

Session chair: Christos Kozyrakis, Stanford University

9.1.1. Low-Density Parity-Check Decoder Architecture for High Throughput Optical Fiber Channels

Anand Selvarathinam, Euncheol Kim and Gwan Choi, Texas A&M University

9.1.2. Improving Branch Prediction Accuracy in Embedded Processors in the Presence of Context Switches

Sudeep Pasricha and Alex Veidenbaum, University of California-Irvine

9.1.3. Reducing Operand Transport Complexity of Superscalar Processors using Distributed Register Files (short)

Santithorn Bunchua, D. Scott Wills, and Linda M. Wills, Georgia Institute of Technology

9.1.4. Xpipes: a Latency Insensitive Parameterized Network-on-chip Architecture for Multi-Processor SoCs (short)

Matteo Dall'Osso, Gianluca Biccari, Luca Giovannini, Davide Bertozzi, and Luca Benini, Universita di Bologna, Bologna Italy

Session 9.2: Board Test and Power-Aware Test (San Jose)

The first paper in this session achieves test power reduction by modifying the scan chain. Next, an MILP-based test scheduling algorithm taking the power profile of non-embedded cores into account is presented. The final paper of the session introduces a novel model for board test and proposes extending the JTAG architecture by adding a delay/skew sensor.

Session chair: Seiji Kajihara, Kyushu Institute of Technology, Japan

9.2.1. Aggressive Test Power Reduction Through Test Stimuli Transformation

Ozgur Sinanoglu and Alex Orailoglu, University of California-San Diego

9.2.2. Power-Time Tradeoff in Test Scheduling for SoCs

Mehrdad Nourani and James Chin, University of Texas-Dallas

9.2.3. Multiple Transition Model and Enhanced Boundary Scan Architecture to Test Interconnects for Signal Integrity

Mohammad Tehranipour, Nisar Ahmed and Mehrdad Nourani, University of Texas-Dallas

Conference Workshop : Low-Power Circuit and System Design 1:00 PM - 5:30 PM (Santa Clara)