PEKO Benchmarks

PEKO Suite

( Placement Example with Known Optimal wirelength )

Authors : Chin-Chih Chang and Min Xie

¡¡

Motivation

Placement is an important step in the overall IC design process in DSM technologies, as it defines the on-chip interconnects, which have become the bottleneck in determining circuit performance. The development in placement algorithms is slow but steady. Various algorithms have been proposed in the past 30 years,including min-cut methods, iterative methods, and analytical methods. However, in terms of wirelength reduction, the rate of improvement is only 5-10% every 2-3 years since the 1980s. This prompts us to think whether there remains any room for improvement for circuit placement (at least in terms of wirelength minimization).

According to ITRS¡¯01, the maximum transistor number per chip will be over 1.6 billion, with a frequency of 28.7 GHz by the year 2016. Such design complexity will pose significant challenge to existing placement algorithms. There is a need to quantify the optimality and scalability of state-of-the-art placement engines.

However, most existing works only compared with known heuristics; they did not tell how close the solutions were to the optimal. The study in [1] attempted to quantify the suboptimality of placement algorithms in terms of chip area by ¡°stitching¡± small designs to form large designs. Although the circuits in [1] did not have known optimal solutions, upper bounds of the optimal solutions were derived and compared with the solutions given by several VLSI heuristics.

Circuit Description

The PEKO suite is developed at UCLA VLSI CAD LAB . The placement examples are constructed such that their optimal wirelengths are known. The module numbers and netlist profiles are extracted from ISPD98 [2]. To mimic real designs and to make the legalization process easier, we inserted 15% white space into the examples by expanding one dimension of the chip. A detailed description about the algorithm to generate these examples is given in [5]. We generated four suites of PEKO examples, named from Suite1 to Suite4. The examples are given in both GSRC BookShelf and LEF/DEF format and can be downloaded here .

When generating Suite1, we remove all the connections to pads with the concern that such connections may give hint about where to place each module. Suite2 are generated by scaling the profiles by a factor of 10. Table 1 gives the description of Suite1.

Table 1. Characteristics of Suite1

circuit	#cell	#net	#row	row utilization	optimal wirelength
Peko01	12506	13865	113	85%	8.14E+05
Peko02	19342	19325	140	85%	1.26E+06
Peko03	22853	27118	152	85%	1.50E+06
Peko04	27220	31683	166	85%	1.75E+06
Peko05	28146	27777	169	85%	1.91E+06
Peko06	32332	34660	181	85%	2.06E+06
Peko07	45639	47830	215	85%	2.88E+06
Peko08	51023	50227	227	85%	3.14E+06
Peko09	53110	60617	231	85%	3.64E+06
Peko10	68685	74452	263	85%	4.73E+06
Peko11	70152	81048	266	85%	4.71E+06
Peko12	70439	76603	266	85%	5.00E+06
Peko13	83709	99176	290	85%	5.87E+06
Peko14	147088	152255	385	85%	9.01E+06
Peko15	161187	186225	402	85%	1.15E+07
Peko16	182980	189544	429	85%	1.25E+07
Peko17	184752	188838	431	85%	1.34E+07
Peko18	210341	201648	460	85%	1.32E+07

Using similar methods, we generated two suites that allow connections to pads and name them Suite3 and Suite4 respectively. Suite4 is generated by scaling the profiles by a factor of 10. Fixed modules are used in these examples to emulated pads. Table 2 gives the description of Suite3.

Table 2. Characteristics of Suite3

circuit	#cell	#net	#row	Row utilization	optimal wirelength
Peko01	12506	14111	113	85%	8.22E+05
Peko02	19342	19584	140	85%	1.27E+06
Peko03	22853	27401	152	85%	1.51E+06
Peko04	27220	31970	166	85%	1.76E+06
Peko05	28146	28446	169	85%	1.95E+06
Peko06	32332	34826	181	85%	2.07E+06
Peko07	45639	48117	215	85%	2.89E+06
Peko08	51023	50513	227	85%	3.15E+06
Peko09	53110	60902	231	85%	3.65E+06
Peko10	68685	75196	263	85%	4.75E+06
Peko11	70152	81454	266	85%	4.72E+06
Peko12	70439	77240	266	85%	5.02E+06
Peko13	83709	99666	290	85%	5.89E+06
Peko14	147088	152772	385	85%	9.03E+06
Peko15	161187	186608	402	85%	1.16E+07
Peko16	182980	190048	429	85%	1.25E+07
Peko17	184750	189581	431	85%	1.35E+07
Peko18	210341	201920	460	85%	1.32E+07

Experimental Result

We experimented with four state-of-the-art placers, including:

¡¤ Dragon. Dragon is based on a multilevel framework. It uses hMetis to derive an initial partition result on the circuits, then undergoes a series of refinement stages doing bin-based swapping with simulated annealing [3]. We used Dragon v.2.20.

¡¤ Capo. Capo is built on a multilevel partitioner. It aims to enhance the routability with such techniques as tolerance computation and block splitting heuristics [4]. We used Capo v.8.0.

¡¤ mPL. mPL is also based on a multilevel framework.It uses non-linear programming to handle the non-overlapping constraints on the coarsest level, and uses Goto based relaxation in subsequent refinement stages. We used mPL v.1.2. Compared with mPL v.1.0, mPL v.1.2 uses an additional V cycle and adopts distance based clustering in the second V cycle [5].

¡¤ QPlace. QPlace is the placement engine used in Silicon Ensemble v.5.3 of Cadence Inc. The version we used is QPlace v.5.1.55.

We set an upper limit of 24 hours to the runtime of a placer. Processes running over 24 hours will be terminated. The experiments are performed on the following two machines:

¡¤ Sun Blade workstation 750 MHz running SunOs 5.8 with 4GB of memory. Experiments with Dragon and mPL are performed on this machine.

¡¤ Pentium IV 1.4 GHz running Windows 2000 with 2GB of memory. Experiment with Capo is performed on this machine.

We were given the latest version of Capo v.8.5 for Linux and experimented on a Pentium IV 2.2GHz running Redhat v.8.0. The results are included in the two spreadsheets below.

Fig. 1. and Fig. 2. gives the experimental results. Here are the spreadsheets of the experimental data with Suite1 and Suite2 .

Figure 1. Solution Quality vs Cell Number

Figure 2. Runtime vs Cell Number

The experimental results indicate that the wirelengths produced by these tools are 1.66 to 2.53 times the optimal in the worst cases, and are 1.46 to 2.38 times the optimal on the average. As for scalability, the average solution quality of each tool shows deterioration by an additional 4% to 25% when the problem size increases by a factor of 10.

Our experiment is not intended to compare the four placers. Rather, it is for a gap analysis to see how much room remains for improvement in placement algorithms. Please do not interpret the results here as a ranking of the placers.

PEKO has obvious limitations: all the nets in the optimal solutions are local nets, i.e., they only connect cells in contiguous areas, which may not be true on real circuits. Therefore, achieving good results on PEKO does not guarantee good performance on real circuits. Also, it is not encouraged to tune the placer in order to beat these examples.

¡¡Reference

1. L. W. Hagen, D. J. Huang and A. B. Kahng, ¡°Quantified Suboptimality of VLSI Layout Heuristics,¡± Proc. Design Automation Conference, pp. 216-221, 1995.

2. C. J. Alpert, "The ISPD98 Circuit Benchmark Suite," Proc. International Symposium on Physical Design, pp. 85-90, 1998.

3. M. Wang, X. Yang and M. Sarrafzadeh, "Dragon2000: Standard-cell Placement Tool for Large Industry Circuits," Proc. International Conference on Computer-Aided Design, pp. 260-264, 2000.

4. E. Caldwell, A. B. Kahng and I. L. Markov. "Can Recursive Bisection Produce Routable Placements?" Proc. Design Automation Conference, pp. 477-482, 2000.

5. T. Chan, J. Cong, T. Kong, and J. Shinnerl, "Multilevel Optimization for Large-Scale Circuit Placement," Proc. International Conference on Computer-Aided Design, pp. 171-176, 2000.

6. C. C. Chang, J. Cong, M. Xie, "Optimality and Scalability Study of Existing Placement Algorithms," To appear in Proc. Asia South-Pacific Design Automation Conference, 2003.

¡¡Please direct your questions to xie@cs.ucla.edu. Thanks.