# **Extending the Applicability of Parallel-Serial Scan Designs**

Baris Arslan, Ozgur Sinanoglu and Alex Orailoglu Computer Science and Engineering Department University of California, San Diego La Jolla, CA 92093

{barslan, ozgur, alex}@cs.ucsd.edu

## ABSTRACT

Although scan-based designs are widely used in order to reduce the complexity of test generation, test application time and test data volume are substantially increased. We propose two different methodologies for test cost reduction in scan-based designs. The first methodology improves on the Illinois Scan Architecture, aiming at reducing the high test cost of the test vectors that necessitate the serial test application mode. The second methodology employs on-chip serial transformations to generate an input stimulus that can be applied efficiently. The transformation-based methodology utilizes the proposed scan design to obtain the minimal cost input stimulus. The experimental results indicate that a substantial test cost reduction, reaching 90% levels, can be obtained.

## 1. INTRODUCTION

Full scan is the most widely used design for testability methodology in today's integrated circuits. In scan-based design, the circuit can be set to any possible state by a serial shift-in operation and the current state of the circuit can be obtained by a serial shift-out operation. Consequently, the full controllability and observability of the memory elements reduces the sequential test generation problem into a combinational one, keeping the test generation complexity of today's large circuits within practical limits. Nevertheless, scan-based designs are costly in terms of test application time and test data volume due to access serialization.

The number of scan chains in the design is determined by the number of scan input pins. Increases in the scan input pins help reduce the average length of the scan chains, driving the test application time down. Nevertheless, an increased scan input count necessitates a higher cost automatic test equipment (ATE) with a large pin count. The costs of such ATEs range at the level of thousand dollars per pin [1]. The Illinois scan architecture [2] that partitions the scan cells of the circuit into multiple scan chains has been proposed as a test cost reduction methodology for scan-based designs. Instead of sourcing each scan chain with a dedicated scan input pin, which necessitates a higher cost ATE with a high pin count, all scan chains are identically loaded in parallel from a single pin. Due to the constraints set on the test data by the new configuration, some faults may remain undetected in the parallel mode. The Illinois scan architecture additionally supports a serial mode in order to test the faults that are not detected in the parallel mode. The Illinois scan architecture, due to its simplicity and the test cost reduction levels that it is capable of delivering, has served as a foundation for many subsequent methods [3, 4] and still stands as one of the standard test cost reduction methodologies. An improvement on the Illinois Scan can be seen in [3] which provides two different scan cell partitionings, reducing the number of serial test patterns. A feedback architecture that enables the use of prelude vectors to resolve the dependencies generated by the parallel load mode in [2] is presented in [4], eliminating the serial test patterns.

In this work, we propose two new methodologies for test cost reduction. The first methodology targets the fundamental shortcoming of the original Illinois scan architecture, namely, the time-consuming serial test application step. A new scan design is proposed that improves on the serial test application mode of the Illinois scan architecture, enabling the load of only a subset of test slices of a particular test pattern in the serial mode and the remaining slices in the parallel mode. The hardware cost of the new design is identical to the original Illinois scan architecture. The second methodology utilizes the serial transformations that are modeled in [5] in order to transform a given test set into a new one that can be applied efficiently in a parallel load scan architecture. The proposed methods perfectly complement each other, wherein the transformation-based methodology is utilized to obtain the input stimulus that can be applied efficiently by the improved scan architecture.

Section 2 discusses the Illinois scan architecture. The new scan design that overcomes the fundamental limitations of the Illinois scan architecture is outlined in Section 3. Section 4 outlines the serial on-chip transformation based methodology that generates from a given test vector an input stimulus that can be applied in the parallel mode. Section 5 provides an experimental evaluation of the proposed methods and a brief conclusion is drawn in Section 6.

## 2. ILLINOIS SCAN ARCHITECTURE

The Illinois scan architecture increases the number of scan chains to reduce the test cost. Nonetheless, the reduction is achieved while retaining the original scan input count. The Illinois scan architecture supports two different configurations of the scan cells, namely, serial and parallel modes. In the parallel mode, the scan chains are partitioned into multiple smaller scan chains, reducing the average length of the scan chains. Instead of sourcing each scan chain with a dedicated scan pin that requires a higher pin count ATE, all scan chains are sourced in parallel with the same test data from the original scan input. In the Illinois scan architecture, the scan cells equidistant from the head of the scan chains, i.e. a test slice, are all constrained to the same logic value due to the parallel load of the scan chains from the same pin. Consequently, the set of applicable patterns is limited to a subset of all possible test patterns, wherein the cells in the same slice are restricted to the same logic value, resulting in a set of faults that are untestable in the parallel mode. The remaining undetected faults are tested in the serial mode.
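The parallel-mode constraint described above can be illustrated with a minimal sketch. It assumes fully specified patterns (no don't-care bits) and balanced scan chains; the function names and the sample pattern are hypothetical and are not taken from the paper.

```python
def conflicting_slices(chains):
    """Return 1-based indices of test slices whose cells do not all agree.

    `chains` is a list of equal-length strings over '0'/'1', one string per
    scan chain. Position k of every chain together forms test slice k + 1.
    """
    length = len(chains[0])
    assert all(len(c) == length for c in chains), "chains must be balanced"
    return [k + 1 for k in range(length)
            if len({c[k] for c in chains}) > 1]


def applicable_in_parallel(chains):
    """A pattern can be broadcast only if no slice holds both 0 and 1."""
    return not conflicting_slices(chains)


# Hypothetical 4-chain, 4-cell pattern (illustrative only).
pattern = ["1100", "1100", "1000", "1000"]
print(conflicting_slices(pattern))
print(applicable_in_parallel(pattern))
```

In this sketch a single conflicting slice is enough to push the whole pattern into the serial mode, which is the limitation that Section 3 addresses.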

In the serial mode, the scan chains are configured as in the traditional scan architecture. Both test application time and test data size for these particular vectors are identical to those of the traditional architecture, delivering no improvement in test cost but preserving the fault coverage. The selection between serial and parallel modes is achieved through multiplexers that pass the test data from the scan input in the parallel mode or from the scan output of the previous scan chain in the serial mode. Further details can be found in [2].



Figure 1: A Sample Test Pattern

Test generation and application with the Illinois scan architecture is performed in two steps. In the first step, test generation is performed under the constraints imposed by the parallel mode. In the second step, a second test set is generated for the faults remaining undetected in the first step by removing all constraints. The first test set is applied in the parallel mode and the second test set is applied in the serial mode.

## 3. PROPOSED SCAN DESIGN

Assuming N internal scan chains and one scan input in the Illinois scan architecture, an N-fold decrease in the test application time and test data size is delivered by the test vectors generated under the constraints imposed in the parallel mode. Since the test set for the remaining undetected faults is applied in the serial mode, the test cost for this particular set of test vectors is identical to the original test cost of the traditional scan configuration and subsequently no reduction can be delivered. Although increasing the number of internal scan chains may seem to promise a further reduction due to the increased reduction ratio in the parallel mode, the more stringent constraints imposed by the increased scan chain count result in more undetected faults in the parallel mode and subsequently to a sizable test vector set to be applied in serial mode.

In the Illinois scan architecture, the cost of the application of a test vector in serial mode is N times the cost of its application in parallel mode. Consequently, serial test vectors constitute a strong limitation on the attainable levels of test cost reduction with the Illinois Scan Architecture. Due to the configuration of the scan chains, each test vector is applied completely in either parallel or serial mode. Although a fault may not be detected under the constraints of the parallel mode, the removal of the logic value identicality constraint for only a small number of test slices may allow the generation of a test vector that detects this particular fault. A test application methodology that enables the application of a subset of the test slices in serial mode removes the constraints for these particular slices. The underlying motivation can be seen more clearly in the example test vector that is shown in Figure 1. Since this particular test vector consists of both logic '0' and '1' in the second test slice, it cannot be applied in the parallel mode, necessitating the use of the serial mode. Although the parallel application is performed in 4 cycles, the serial application requires 16 cycles. A close look shows that all but the second slice containing complementary logic values can be applied in parallel mode. Consequently, the application of all test slices but the second one in parallel mode and only the application of the second slice in serial mode requires a total of 7 cycles instead of 16 cycles; 1 cycle each for test slices 1, 3 and 4 and 4 cycles for test slice 2. While the potential benefits of this approach are evident, Illinois Scan falls short of being able to exploit it, as the scan output of each scan chain is connected to the scan input of the subsequent scan chain in it.
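The cycle counts quoted above (4, 16 and 7) can be reproduced with a short sketch. Since Figure 1 is not reproduced here, the pattern below is a hypothetical 4-chain, 4-slice vector chosen only so that its second slice mixes 0 and 1; the function name is likewise an assumption.

```python
def cycle_counts(slices):
    """Return (serial, parallel, per_slice_mixed) shift-cycle counts.

    `slices` is a list of strings, one per test slice; each string holds
    one bit per scan chain, so len(slices[0]) is the chain count N.
    """
    n_chains = len(slices[0])
    serial = n_chains * len(slices)    # one long chain, one cell per cycle
    parallel = len(slices)             # valid only if every slice is uniform
    # Uniform slices are broadcast in 1 cycle, mixed slices take N cycles.
    mixed = sum(1 if len(set(s)) == 1 else n_chains for s in slices)
    return serial, parallel, mixed


# Hypothetical stand-in for the Figure 1 pattern: slices 1..4.
figure1_like = ["1111", "1100", "1111", "0000"]
serial, parallel, mixed = cycle_counts(figure1_like)
print(serial, parallel, mixed)
```

The `mixed` figure models the idealized slice-wise scheme discussed in this paragraph, not yet the proposed design, which reduces the count further.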

If the serial configuration is generated by connecting test slices instead of scan chains, the application of the test slices with complementary logic values in serial mode with the rest applied in parallel mode is enabled. Nevertheless, since the number of test slices is quite large in comparison to the number of scan chains, the required number of multiplexers can be quite a bit larger than Illinois Scan in this particular configuration, increasing the hardware cost. A proposed scan design that supports the aforementioned test application methodology with the same hardware cost as Illinois Scan is provided in Figure 2. Instead of configuring the serial mode by connecting the scan outputs with the subsequent scan inputs, the head scan cells of the scan chains are connected in the serial mode. In the new design, when a test slice necessitates application in serial mode, a broadcast is performed in parallel mode and then the current slice is set by using the serial mode. The first broadcast step enables the shift of the test data in head scan cells of the scan chains to the next level, preserving previously applied test slices. Furthermore, the broadcast mode eliminates the need to apply new test data serially to all scan cells in the current test slice. Assume that the last k values of a test slice of length N exhibit the same logic value. N + 1 cycles are required if one broadcast and N serial shift cycles are utilized. Nevertheless, the broadcast of the logic value of the last k bits and only the shift of the first N - k bits of the test slice suffices in order to apply this particular test slice, a total of N - k + 1 cycles. Subsequently, in the worst case, the application of a test slice requires N cycles.

Figure 2: Proposed Scan Design
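The N - k + 1 cycle count can be sketched per slice as follows. The sketch assumes a slice is written as a string whose trailing characters correspond to the "last k values" in the text; the function name is hypothetical.

```python
def slice_cycles(test_slice):
    """Cycles to load one slice: 1 broadcast plus N - k serial shifts.

    k is the length of the run of identical values at the end of the slice,
    so a uniform slice (k = N) costs a single broadcast cycle.
    """
    n = len(test_slice)
    last = test_slice[-1]
    k = n - len(test_slice.rstrip(last))   # length of the trailing run
    return n - k + 1


for s in ("1111", "1100", "0101"):
    print(s, slice_cycles(s))
```

Because the final bit always matches itself, k is at least 1, which gives the worst case of N cycles stated above.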

Test application in the proposed design for the example in Figure 1 is performed as follows. In the first two cycles the parallel mode is employed and logic 0 and 1 are loaded into scan chains, respectively. In the third test cycle, the rightmost bit of the second slice, logic 0, is applied in the parallel mode. Since the current content of the test slice is the broadcast value, only the first 2 bits of the second slice are applied in serial mode. Finally, the parallel mode is used for the application of the first test slice, resulting in a total of 6 cycles. The test data is modified to indicate whether the serial or parallel mode is used. For each cycle, a single bit, *the configuration selection bit*, is used to represent the specific scan configuration. In the parallel mode, the configuration selection bit is followed by the logic value to be broadcast, and in the serial mode, it is followed by the number of required serial shifts and then subsequently the bits to be shifted in.
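One possible reading of this test data format is sketched below. Several details are assumptions rather than facts from the text: one selection bit per operation with 0 meaning parallel, a fixed-width binary shift count, and serial bits emitted in reverse order so that the first bit shifted in travels farthest. The pattern is a hypothetical stand-in for Figure 1, listed in load order (deepest slice first).

```python
def encode_pattern(slices_in_load_order):
    """Return (bitstream, cycles) for a pattern under the assumed format."""
    n = len(slices_in_load_order[0])
    count_bits = max(1, (n - 1).bit_length())  # assumed width of shift count
    stream, cycles = [], 0
    for s in slices_in_load_order:
        last = s[-1]
        k = n - len(s.rstrip(last))            # trailing run of equal values
        stream.append("0" + last)              # parallel op: broadcast `last`
        cycles += 1
        shifts = n - k
        if shifts:                             # serial op for the first N-k bits
            head = s[:shifts]
            stream.append("1" + format(shifts, f"0{count_bits}b") + head[::-1])
            cycles += shifts
    return "".join(stream), cycles


# Hypothetical pattern: slice 4 first, slice 1 last.
load_order = ["0000", "1111", "1100", "1111"]
stream, cycles = encode_pattern(load_order)
print(stream, len(stream), cycles)
```

Under these assumptions the example needs 6 cycles, matching the count in the text. The stream length depends entirely on the assumed field widths and should not be read as a figure from the paper.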

The first step of the test application with the new scan design is identical to the Illinois scan architecture. An initial test generation is performed under the constraints of the parallel mode and this particular set of test vectors is applied by broadcasting the test data. In the second step, a new test set is generated for the remaining undetected faults. The new design differentiates itself during the application of this second test. Although the Illinois scan architecture applies these particular test vectors by using completely the serial mode, the new design enables the application of only the test slices with conflicting logic values in serial mode, reducing substantially the application cost of this particular test set.

## 4. TRANSFORMATION BASED TEST COST REDUCTION

Serial on-chip transformations through logic gate insertion between the scan chains are modeled as a matrix formulation and a matrix band algebra is developed to implement the scan chain modification that realizes the desired transformation in [5]. In the modified scan chain, the stimulus inserted is transformed to the actual test vector through the transformation during the shift cycles. The scan chain modifications are performed by only utilizing bijective transformations such as XOR gates and inverters in order to ensure preservation of fault coverage. A simple on-chip transformation through 2 XOR gates can be seen in Figure 3. The transformation matrix that corresponds to the depicted scan chain is denoted by T in the figure, and this particular modification transforms, for example, the stimulus '11010' into '11111' over the shift cycles. For a transformation matrix, T, and a test set, I, the stimulus to be inserted, S, can be determined as follows:

$$I \times T^{-1} = S \tag{1}$$
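Equation (1) can be evaluated with ordinary linear algebra over GF(2). The sketch below is illustrative only: the matrix `T` is a hypothetical choice (identity plus two off-diagonal ones, loosely mirroring two XOR gates) picked so that it reproduces the '11010' to '11111' mapping quoted above. It is not claimed to be the matrix of Figure 3, and the helper names are assumptions.

```python
def gf2_inverse(matrix):
    """Invert a square 0/1 matrix over GF(2) by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    rows = [list(r) + [int(i == j) for j in range(n)]
            for i, r in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col]), None)
        if pivot is None:
            raise ValueError("matrix is singular over GF(2)")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[col])]
    return [row[n:] for row in rows]


def gf2_vec_mat(vec, matrix):
    """Row vector times matrix, with addition taken modulo 2."""
    return [sum(v & m for v, m in zip(vec, col)) % 2 for col in zip(*matrix)]


# Hypothetical transformation matrix (not taken from the paper's figure).
T = [[1, 0, 0, 0, 0],
     [0, 1, 1, 0, 0],
     [0, 0, 1, 0, 0],
     [0, 0, 0, 1, 1],
     [0, 0, 0, 0, 1]]
I = [1, 1, 1, 1, 1]                       # desired test vector
S = gf2_vec_mat(I, gf2_inverse(T))        # stimulus per equation (1)
print(S, gf2_vec_mat(S, T))
```

Multiplying the resulting stimulus by `T` again recovers the test vector, which is the bijectivity property the text relies on for preserving fault coverage.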

When a single scan input is used to load multiple scan chains by broadcasting the test data as in Illinois Scan, a particular test vector can be applied only if each slice of this vector is comprised of the identical logic value. Since serial on-chip transformation enables the application of a stimulus that is different from the actual test vector, the given set of test vectors can be transformed into a stimulus, wherein each test slice consists of the identical logic value and consequently can be applied in parallel mode. The described transformation scheme is the fundamental underpinning of the test cost reduction methodology presented in this section. Each scan chain of the circuit is modified by gate insertion to perform a particular transformation. The transformations of the scan chains are selected in order to generate the actual test vectors from the broadcast data. Formally, in a scan design with N scan chains, for a given test vector I, assume that $I_i$ represents the part of the test vector corresponding to scan chain $i$ and the transformation matrix of scan chain $i$ is denoted as $T_i$. The given test vector can be completely applied in parallel mode if a set of transformations can be obtained that generates an input stimulus, wherein each slice consists of the identical logic value even though the test slices of the applied test vector are not comprised of the identical logic values. If $S_i^k$ denotes the $k$-th bit of the part of the input stimulus that corresponds to the $i$-th scan chain and L denotes the length of the scan chains, the solution of the following set of equations for a given test set, I, in order to identify the unknown transformation matrix, T, provides the necessary transformations that enable the parallel application of the test vectors.

$$I_i \times T_i^{-1} = S_i \quad \forall \ 1 \le i \le N \tag{2}$$

$$S_i^k = S_j^k \quad \forall \ 1 \le i, j \le N, \ 1 \le k \le L \tag{3}$$
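Equations (2) and (3) together ask for one broadcast stimulus that every chain's transformation maps to that chain's part of the test vector. Since each $T_i$ is invertible, this is equivalent to checking $S \times T_i = I_i$ for all chains. The following sketch verifies a candidate solution. It does not search for the transformations, and the matrices, vectors and function names are hypothetical.

```python
def gf2_vec_mat(vec, matrix):
    """Row vector times matrix, with addition taken modulo 2."""
    return [sum(v & m for v, m in zip(vec, col)) % 2 for col in zip(*matrix)]


def broadcast_satisfies(stimulus, targets, transforms):
    """True if the single stimulus yields every chain's target segment."""
    return all(gf2_vec_mat(stimulus, t) == list(seg)
               for seg, t in zip(targets, transforms))


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
one_xor = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]   # hypothetical single-XOR chain

# Two chains of length 3 whose second slice conflicts (0 versus 1).
targets = [[1, 0, 1], [1, 1, 1]]
stimulus = [1, 0, 1]

print(broadcast_satisfies(stimulus, targets, [identity, identity]))
print(broadcast_satisfies(stimulus, targets, [identity, one_xor]))
```

With unmodified chains the conflict blocks the broadcast, while the hypothetical modification of the second chain lets the same stimulus produce both segments.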

In Figure 4, the given test vector, I, cannot be applied in parallel mode due to conflicts at the slices 2, 3 and 4. If the modifications that are shown on the depicted scan chains are performed, test data is transformed to an input stimulus, S, wherein each slice consists of the same logic value. Consequently, this input stimulus can be applied in the parallel mode and the transformations during the shift cycles generate the actual test vector.

Figure 3: Serial Transformations

Figure 4: Serial Transformations

Due to the possible existence of unsolvable equations (2) and (3), a subset of the test vectors may remain inapplicable in parallel mode, necessitating the rather costly serial mode. For this particular set of test vectors, the proposed scan design presented in Section 3 perfectly complements the transformation based scheme. Since the scan design in Section 3 enables the serial application of individual test slices, during the process of the identification of the required transformations, if a test vector cannot be transformed into a form that can be completely applied in parallel, the maximal number of test slices that can be applied in parallel mode is instead targeted, thus minimizing the number of test slices that requires the costly serial mode. Consequently, only the application of test slices that cannot be transformed into a parallel applicable form is performed in serial mode.

The scan chain modifications are performed for a given test set. No architecture specific test generation is required for the proposed transformation based test cost reduction methodology. This property further differentiates this particular method from the Illinois scan architecture, wherein a constrained test generation is performed to obtain the test vectors that can be applied in parallel mode. The elimination of the necessity of test generation makes the transformation based method perfectly suitable for the testing of the intellectual property cores.

## 5. EXPERIMENTAL RESULTS

The performance of the proposed methodologies has been analyzed on the larger ISCAS89 [6] benchmark circuits. The ATALANTA test generation tool [7] and the HOPE fault simulation tool [8] have been used for the experiments.

In the first set of experiments, the test application time and data volume reduction ratios in the application of the *serial test vectors* are analyzed. Table 1 lists the percentage of the improvements in test time and data volume over the Illinois scan architecture for various scan input counts. In Table 1, the columns denote test time improvement, T, and test data volume improvement, V, for scan input pin counts of 3, 4, 6, 10, 16, 20 and 30, respectively. The results show that a reduction ratio reaching 90% levels can be achieved in the application of the serial vectors.

|         | S=3  |      | S=4  |      | S=6  |      | S=10 |      | S=16 |      | S=20 |      | S=30 |      |
|---------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|
| Circuit | T    | V    | T    | V    | T    | V    | T    | V    | T    | V    | T    | V    | T    | V    |
| s13207  | 65.2 | 31.5 | 72.8 | 47.6 | 80.9 | 63.6 | 82.8 | 69.1 | 80.9 | 70.9 | 79.6 | 69.5 | 72.2 | 63.0 |
| s15850  | 65.4 | 32.0 | 73.8 | 48.6 | 80.7 | 62.8 | 76.1 | 60.8 | 78.9 | 68.7 | 65.9 | 52.0 | 42.2 | 27.3 |
| s35932  | 66.3 | 32.4 | 74.4 | 49.4 | 82.6 | 65.8 | 88.6 | 78.1 | 92.2 | 85.4 | 93.5 | 88.5 | 90.1 | 85.4 |
| s38417  | 64.4 | 29.3 | 70.9 | 43.0 | 81.1 | 63.1 | 86.4 | 75.3 | 75.2 | 63.9 | 74.4 | 62.8 | 72.9 | 65.1 |
| s38584  | 66.2 | 32.8 | 73.7 | 48.6 | 81.5 | 64.1 | 87.6 | 76.7 | 86.6 | 78.6 | 87.7 | 80.5 | 78.9 | 72.6 |

Table 1: Improvement over Illinois Scan in Serial Mode

|         | S=3  |      | S=4  |      | S=6  |      | S=10 |      | S=16 |      | S=20 |      | S=30 |      |
|---------|------|------|------|------|------|------|------|------|------|------|------|------|------|------|
| Circuit | T    | V    | T    | V    | T    | V    | T    | V    | T    | V    | T    | V    | T    | V    |
| s13207  | 42.7 | 42.0 | 57.9 | 57.1 | 71.2 | 70.6 | 83.2 | 82.1 | 89.0 | 88.0 | 90.0 | 88.7 | 90.9 | 89.4 |
| s15850  | 20.8 | 20.2 | 44.2 | 41.5 | 59.6 | 55.1 | 76.3 | 72.8 | 82.5 | 79.7 | 82.9 | 79.6 | 77.1 | 72.5 |
| s35932  | 17.2 | 11.9 | 30.9 | 26.9 | 51.2 | 47.8 | 72.8 | 70.7 | 83.0 | 81.4 | 86.7 | 85.5 | 90.0 | 88.9 |
| s38417  | 0.0  | 0.0  | 3.0  | 1.7  | 35.1 | 33.5 | 60.5 | 59.1 | 74.9 | 74.2 | 79.2 | 77.7 | 82.0 | 80.1 |
| s38584  | 2.1  | 1.2  | 25.8 | 25.2 | 49.7 | 49.2 | 68.5 | 67.4 | 80.1 | 78.6 | 82.7 | 80.3 | 83.9 | 81.4 |

Table 2: Time & Volume Reduction Percentages over Traditional Scan Design

In the second set of experiments, test cost reduction levels delivered by the proposed scan design over the traditional scan architecture are analyzed. Table 2 lists the test cost reduction levels, wherein the columns denote test time reduction, T, and test data volume reduction, V, for scan input pin counts of 3, 4, 6, 10, 16, 20 and 30, respectively. The highest reduction ratios for each circuit are indicated in boldface. The reduction ratio consistently improves as the scan input count increases with the exception of s15850. The results provided in Table 2 confirm that the proposed architecture delivers a substantial test cost reduction, exceeding 90% levels for some circuits, despite its simplicity and negligible hardware cost.

In the last set of experiments, the performance of the on-chip transformation-based test cost reduction methodology that is presented in Section 4 is analyzed. In this set of experiments, for a given compacted test set, the on-chip transformations that generate the minimal cost input stimulus are identified. Table 3 provides the test application time reduction ratios obtained by this particular methodology over traditional scan architecture. The highest reduction ratios for each circuit are indicated in boldface. A reduction level reaching 90% for some circuits by only transforming the given test set is obtained even though no dedicated test generation is performed. A closer look indicates that the proposed method delivers a reduction that is quite close to the theoretical maximum for small scan input counts in some circuits. For example, the cost reduction ratios for s13207 of 66.0% and 74.2% for scan chain counts of 3 and 4, respectively, are almost identical to the levels of the theoretical maximum ratios of 66.6% and 75.0% for these particular scan input counts.
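The theoretical maxima quoted above are consistent with an ideal S-fold speed-up, that is, a reduction of 1 - 1/S when every vector is applied in parallel. The short sketch below makes that arithmetic explicit under this assumption and compares it with the s13207 entries of Table 3; the function name is hypothetical.

```python
def theoretical_max_reduction(s):
    """Ideal reduction in percent when all vectors load S chains in parallel."""
    return 100.0 * (1.0 - 1.0 / s)


# Reported s13207 reductions from Table 3 for S = 3 and S = 4.
reported = {3: 66.0, 4: 74.2}
for s, value in reported.items():
    bound = theoretical_max_reduction(s)
    print(f"S={s}: reported {value:.1f}%, bound {bound:.2f}%, gap {bound - value:.2f}")
```

For S = 3 the bound is two thirds, which the text quotes truncated as 66.6%.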

The experimental results indicate that a substantial reduction in test cost can be obtained by the proposed methodologies. The reduction levels achieved by the transformation based methodology for a given test set are quite promising in terms of its applicability to intellectual property core testing, wherein the lack of structural information prevents the possibility of a test cost reduction oriented test generation.

| Circuit | S=3  | S=4  | S=6  | S=10 | S=16 | S=20 | S=30 |
|---------|------|------|------|------|------|------|------|
| s13207  | 66.0 | 74.2 | 81.7 | 87.2 | 89.3 | 89.4 | 88.9 |
| s15850  | 64.8 | 72.3 | 78.8 | 82.6 | 84.4 | 84.4 | 81.5 |
| s35932  | 57.3 | 60.0 | 65.5 | 61.0 | 57.2 | 54.6 | 56.8 |
| s38417  | 64.2 | 70.0 | 75.8 | 79.6 | 77.8 | 73.4 | 66.9 |
| s38584  | 64.5 | 71.9 | 78.6 | 82.4 | 83.0 | 82.6 | 81.4 |

**Table 3: Transformation-based Test Time Reduction** 

## 6. CONCLUSIONS

Scan designs are widely used to improve the controllability and observability of the circuit. Nevertheless, scan-based designs substantially increase test application time and test data volume. In this paper, two new methodologies are proposed in order to reduce the test cost of scan-based designs.

Test vectors that can only be applied in serial mode severely limit the test cost reduction levels in the Illinois scan architecture. A scan design that targets this particular shortcoming of the Illinois scan architecture is proposed herein. The proposed scan design enables the efficient application of serial vectors by supporting both parallel and serial modes within the application of a single test vector.

The transformation-based methodology augments the proposed test architecture by utilizing on-chip transformations in order to generate an input stimulus that can be efficiently applied. The transformation-based methodology uses a given test set with no necessity for test generation. Consequently, the transformation-based methodology is especially suitable for intellectual property core testing.

Experimental results indicate that a substantial reduction in test cost, reaching 90% levels, can be attained; the transformation-based methodology is especially useful for smaller scan chain counts, delivering near theoretical optimal cost reduction levels.

## 7. REFERENCES

- [1] V. Agrawal and M. Bushnell, Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits, Kluwer Academic Publishers, 2000.
- [2] I. Hamzaoglu and J. Patel, "Reducing Test Application Time for Full Scan Embedded Cores", in *FTCS*, pp. 260–267, 1999.
- [3] A. R. Pandey and J. H. Patel, "Reconfiguration Technique for Reducing Test Time and Test Data Volume in Illinois Scan Architecture Based Designs", in *VTS*, pp. 9–15, 2002.
- [4] N. Oh, R. Kapur, T. W. Williams and J. Sproch, "Test Pattern Compression Using Prelude Vectors in Fan-out Scan Chain with Feedback Architecture", in *DATE*, pp. 110–115, 2003.
- [5] O. Sinanoglu and A. Orailoglu, "Modeling Scan Chain Modifications for Scan-In Test Power Minimization", in *ITC*, pp. 602–611, 2003.
- [6] F. Brglez, D. Bryan and K. Kozminski, "Combinational Profiles of Sequential Benchmark Circuits", in *ISCAS*, vol. 3, pp. 1929–1934, May 1989.
- [7] H. K. Lee and D. S. Ha, On the Generation of Test Patterns for Combinational Circuits, Technical Report 12-93, Department of Electrical Eng., Virginia Polytechnic Institute and State University.
- [8] H. K. Lee and D. S. Ha, "HOPE: An Efficient Parallel Fault Simulator", in *DAC*, pp. 336–340, 1992.