PEKO Suite

( Placement Example with Known Optimal wirelength )

Director : Prof. Jason Cong

Authors : Chin-Chih Chang and Min Xie

Copyright© 2002-2003 the Regents of University of California


¡¡

Motivation

Placement is an important step in the overall IC design process in DSM technologies, as it defines the on-chip interconnects, which have become the bottleneck in determining circuit performance. The development in placement algorithms is slow but steady. Various algorithms have been proposed in the past 30 years,including min-cut methods, iterative methods, and analytical methods. However, in terms of wirelength reduction, the rate of improvement is only 5-10% every 2-3 years since the 1980s. This prompts us to think whether there remains any room for improvement for circuit placement (at least in terms of wirelength minimization).

According to ITRS¡¯01, the maximum transistor number per chip will be over 1.6 billion, with a frequency of 28.7 GHz by the year 2016. Such design complexity will pose significant challenge to existing placement algorithms. There is a need to quantify the optimality and scalability of state-of-the-art placement engines.

However, most existing works only compared with known heuristics; they did not tell how close the solutions were to the optimal. The study in [1] attempted to quantify the suboptimality of placement algorithms in terms of chip area by ¡°stitching¡± small designs to form large designs. Although the circuits in [1] did not have known optimal solutions, upper bounds of the optimal solutions were derived and compared with the solutions given by several VLSI heuristics.

 

Circuit Description

The PEKO suite is developed at UCLA VLSI CAD LAB. The placement examples are constructed such that their optimal wirelengths are known. The module numbers and netlist profiles are extracted from ISPD98 [2]. To mimic real designs and to make the legalization process easier, we inserted 15% white space into the examples by expanding one dimension of the chip. A detailed description about the algorithm to generate these examples is given in [5]. We generated four suites of PEKO examples, named from Suite1 to Suite4. The examples are given in both GSRC BookShelf and LEF/DEF format and can be downloaded here.

When generating Suite1, we remove all the connections to pads with the concern that such connections may give hint about where to place each module. Suite2 are generated by scaling the profiles by a factor of 10. Table 1 gives the description of Suite1.

Table 1. Characteristics of Suite1

circuit

#cell

#net

#row

row utilization

optimal wirelength

Peko01

12506

13865

113

85%

8.14E+05

Peko02

19342

19325

140

85%

1.26E+06

Peko03

22853

27118

152

85%

1.50E+06

Peko04

27220

31683

166

85%

1.75E+06

Peko05

28146

27777

169

85%

1.91E+06

Peko06

32332

34660

181

85%

2.06E+06

Peko07

45639

47830

215

85%

2.88E+06

Peko08

51023

50227

227

85%

3.14E+06

Peko09

53110

60617

231

85%

3.64E+06

Peko10

68685

74452

263

85%

4.73E+06

Peko11

70152

81048

266

85%

4.71E+06

Peko12

70439

76603

266

85%

5.00E+06

Peko13

83709

99176

290

85%

5.87E+06

Peko14

147088

152255

385

85%

9.01E+06

Peko15

161187

186225

402

85%

1.15E+07

Peko16

182980

189544

429

85%

1.25E+07

Peko17

184752

188838

431

85%

1.34E+07

Peko18

210341

201648

460

85%

1.32E+07

Using similar methods, we generated two suites that allow connections to pads and name them Suite3 and Suite4 respectively. Suite4 is generated by scaling the profiles by a factor of 10. Fixed modules are used in these examples to emulated pads. Table 2 gives the description of Suite3.

Table 2. Characteristics of Suite3

circuit

#cell

#net

#row

Row utilization

optimal wirelength

Peko01

12506

14111

113

85%

8.22E+05

Peko02

19342

19584

140

85%

1.27E+06

Peko03

22853

27401

152

85%

1.51E+06

Peko04

27220

31970

166

85%

1.76E+06

Peko05

28146

28446

169

85%

1.95E+06

Peko06

32332

34826

181

85%

2.07E+06

Peko07

45639

48117

215

85%

2.89E+06

Peko08

51023

50513

227

85%

3.15E+06

Peko09

53110

60902

231

85%

3.65E+06

Peko10

68685

75196

263

85%

4.75E+06

Peko11

70152

81454

266

85%

4.72E+06

Peko12

70439

77240

266

85%

5.02E+06

Peko13

83709

99666

290

85%

5.89E+06

Peko14

147088

152772

385

85%

9.03E+06

Peko15

161187

186608

402

85%

1.16E+07

Peko16

182980

190048

429

85%

1.25E+07

Peko17

184750

189581

431

85%

1.35E+07

Peko18

210341

201920

460

85%

1.32E+07

 

Experimental Result

We experimented with four state-of-the-art placers, including:

¡¤         Dragon. Dragon is based on a multilevel framework. It uses hMetis to derive an initial partition result on the circuits, then undergoes a series of refinement stages doing bin-based swapping with simulated annealing [3]. We used Dragon v.2.20.

¡¤         Capo. Capo is built on a multilevel partitioner. It aims to enhance the routability with such techniques as tolerance computation and block splitting heuristics [4]. We used Capo v.8.0.

¡¤         mPL. mPL is also based on a multilevel framework.It uses non-linear programming to handle the non-overlapping constraints on the coarsest level, and uses Goto based relaxation in subsequent refinement stages. We used mPL v.1.2. Compared with mPL v.1.0, mPL v.1.2 uses an additional V cycle and adopts distance based clustering in the second V cycle [5].

¡¤         QPlace. QPlace is the placement engine used in Silicon Ensemble v.5.3 of Cadence Inc. The version we used is QPlace v.5.1.55.

We set an upper limit of 24 hours to the runtime of a placer. Processes running over 24 hours will be terminated. The experiments are performed on the following two machines:

¡¤         Sun Blade workstation 750 MHz running SunOs 5.8 with 4GB of memory. Experiments with Dragon and mPL are performed on this machine.

¡¤         Pentium IV 1.4 GHz running Windows 2000 with 2GB of memory. Experiment with Capo is performed on this machine.

We were given the latest version of Capo v.8.5 for Linux and experimented on a Pentium IV 2.2GHz running Redhat v.8.0. The results are included in the two spreadsheets below.

Fig. 1. and Fig. 2. gives the experimental results. Here are the spreadsheets of the experimental data with Suite1 and Suite2.

Figure 1. Solution Quality vs Cell Number

Figure 2. Runtime vs Cell Number

The experimental results indicate that the wirelengths produced by these tools are 1.66 to 2.53 times the optimal in the worst cases, and are 1.46 to 2.38 times the optimal on the average. As for scalability, the average solution quality of each tool shows deterioration by an additional 4% to 25% when the problem size increases by a factor of 10.

Our experiment is not intended to compare the four placers. Rather, it is for a gap analysis to see how much room remains for improvement in placement algorithms. Please do not interpret the results here as a ranking of the placers.

PEKO has obvious limitations: all the nets in the optimal solutions are local nets, i.e., they only connect cells in contiguous areas, which may not be true on real circuits. Therefore, achieving good results on PEKO does not guarantee good performance on real circuits. Also, it is not encouraged to tune the placer in order to beat these examples.

¡¡Reference

1.    L. W. Hagen, D. J. Huang and A. B. Kahng, ¡°Quantified Suboptimality of VLSI Layout Heuristics,¡± Proc. Design Automation Conference, pp. 216-221, 1995.

2.    C. J. Alpert, "The ISPD98 Circuit Benchmark Suite," Proc. International Symposium on Physical Design, pp. 85-90, 1998.

3.    M. Wang, X. Yang and M. Sarrafzadeh, "Dragon2000: Standard-cell Placement Tool for Large Industry Circuits," Proc. International Conference on Computer-Aided Design, pp. 260-264, 2000.

4.    E. Caldwell, A. B. Kahng and I. L. Markov. "Can Recursive Bisection Produce Routable Placements?" Proc. Design Automation Conference, pp. 477-482, 2000.

5.    T. Chan, J. Cong, T. Kong, and J. Shinnerl, "Multilevel Optimization for Large-Scale Circuit Placement," Proc. International Conference on Computer-Aided Design, pp. 171-176, 2000.

6.    C. C. Chang, J. Cong, M. Xie, "Optimality and Scalability Study of Existing Placement Algorithms," To appear in Proc. Asia South-Pacific Design Automation Conference, 2003.


¡¡Please direct your questions to xie@cs.ucla.edu. Thanks.