| PTLS-Linux-v10.tar.gz |
| PTLS-SunOS-v10.tar.gz |
| PTLS-Source-v10.tar.gz |
The PTLS program accepts a Boolean network in Berkeley Logic Interchange Format (BLIF) and generates transistor netlists in SPICE format, which can be simulated with standard circuit simulators such as HSPICE. The program also reports the number of pass transistors, inverters and buffers, and the delay of the final circuit under an Elmore-like delay model. The user must specify the kind of BDD's to use, monolithic or multi-level; the results differ depending on this choice. For MCNC benchmarks we advise monolithic BDD's, and for ISCAS benchmarks (except C17) multi-level BDD's. ISCAS benchmarks should first be processed in SIS with script.rugged (or a similar script). In either case, the logical netlist must not contain "constant" nodes; these can be eliminated with the "sweep" command in SIS. The two flows are shown in the following figure.
Monolithic BDD's (-global) | Multi-level BDD's (-local) |
sis> read_blif file1.blif | sis> read_blif file1.blif |
sis> sweep | sis> source script |
sis> write_blif file1Sweep.blif | sis> sweep |
sis> quit | sis> write_blif file1Sweep.blif |
$ ptls -global file1Sweep.blif | sis> quit |
 | $ ptls -local file1Sweep.blif |
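The two flows differ only in the optional SIS script and in the option passed to ptls. The following Python sketch builds the command sequences for either flow without running anything. The function names `sis_commands` and `ptls_command` are hypothetical helpers introduced for illustration, and the pairing of `-global` with monolithic BDD's and `-local` with multi-level BDD's is inferred from the commands and recommendations on this page.

```python
def sis_commands(blif_path, swept_path, script=None):
    """Return the SIS commands that prepare a BLIF file for PTLS.

    script is the SIS script to source before sweeping (for example
    "script.rugged" for ISCAS benchmarks); None skips that step.
    """
    commands = [f"read_blif {blif_path}"]
    if script is not None:
        commands.append(f"source {script}")
    # "sweep" removes the constant nodes that PTLS cannot handle.
    commands += ["sweep", f"write_blif {swept_path}", "quit"]
    return commands


def ptls_command(swept_path, multi_level):
    """Return the ptls invocation as an argument list.

    Assumption: -local selects multi-level BDD's and -global selects
    monolithic BDD's, as inferred from the flows shown above.
    """
    option = "-local" if multi_level else "-global"
    return ["ptls", option, swept_path]


if __name__ == "__main__":
    for line in sis_commands("file1.blif", "file1Sweep.blif", script="script.rugged"):
        print("sis>", line)
    print("$", " ".join(ptls_command("file1Sweep.blif", multi_level=True)))
```

Keeping command construction separate from execution makes it easy to review the exact sequence before handing it to SIS.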
Figure 2 shows the BDD for the carry output of a 3-bit adder and the corresponding PTL implementation obtained by directly mapping BDD nodes onto pass transistors.
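Direct mapping can be illustrated with a small Python sketch. It assumes the carry function is the majority of three input bits a, b and c, and it assumes the variable order a, b, c; neither detail is confirmed by the figure. Each BDD node is treated as a 2:1 multiplexer built from two NMOS pass transistors, one gated by the node variable and one by its complement. The node layout and the helper names are hypothetical.

```python
from itertools import product

# A node is (variable, low_child, high_child); leaves are the constants 0 and 1.
# Assumed BDD for carry = majority(a, b, c) with variable order a, b, c.
NODE_C = ("c", 0, 1)                 # function: c
NODE_B_AND_C = ("b", 0, NODE_C)      # function: b AND c
NODE_B_OR_C = ("b", NODE_C, 1)       # function: b OR c
CARRY_BDD = ("a", NODE_B_AND_C, NODE_B_OR_C)


def evaluate(node, inputs):
    """Follow the BDD from the root to a leaf, as a mux tree steers a signal."""
    while isinstance(node, tuple):
        variable, low, high = node
        node = high if inputs[variable] else low
    return node


def unique_nodes(node, seen=None):
    """Collect each internal node once; shared nodes are counted one time."""
    seen = {} if seen is None else seen
    if isinstance(node, tuple) and id(node) not in seen:
        seen[id(node)] = node
        unique_nodes(node[1], seen)
        unique_nodes(node[2], seen)
    return list(seen.values())


def pass_transistor_count(root):
    """Two pass transistors per BDD node; inverters and buffers are not counted."""
    return 2 * len(unique_nodes(root))


if __name__ == "__main__":
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, "->", evaluate(CARRY_BDD, {"a": a, "b": b, "c": c}))
    print("pass transistors:", pass_transistor_count(CARRY_BDD))
```

The count covers only the pass transistors of the mux tree; the inverters needed for complemented control signals and any buffers are left out of this sketch.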
The netlists generated by the PTLS program tend to have smaller delays than netlists generated by direct mapping of BDD's, albeit at the cost of area. PTLS uses a max-flow min-cut algorithm to minimize this area penalty (with some inaccuracy due to area estimation). The static CMOS implementation of the same carry function, obtained by running script.delay in SIS, is shown in the following figure.
The PTL implementation shown in Figure 3 compares favorably with the static CMOS implementation in area, power and delay. This trend holds not only for arithmetic circuits (which can usually be implemented efficiently in PTL) but also for AND-intensive random logic circuits. The efficiency of the PTL implementation stems from the following factors:
Exploration of larger design space due to libraryless-ness
Optimality of decomposition algorithm up to the accuracy in estimation
Merits and demerits of NMOS transistors in PTL
Normally, designers rely on cell characterization to obtain area, power and delay numbers for library cells, and use these numbers while designing a circuit. Libraryless design can be a problem here: for every new circuit, the designer would have to consider the whole design space and characterize all the cells that could possibly be used. Cell characterization can be avoided by using transistor-level delay models, in which case characterizing the transistor alone suffices. For PTL, the designer characterizes the pass transistor (and inverters with weak pull-up) and then uses C-R-C pi-models to estimate the delay. NMOS pass transistors pass rising transitions more slowly than falling transitions. Since inverters are inserted after every few (typically 3) transistors in series, a rising transition in the current segment becomes a falling transition in the next segment, and the effect of the slow rising transition averages out.
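As a rough illustration of transistor-level delay estimation, the Python sketch below computes the textbook Elmore delay of a chain of pass transistors, each reduced to a series resistance followed by a lumped node capacitance. This is not the PTLS delay model, which is only described here as "Elmore-like". The unit resistance and capacitance values are placeholders, and the delay of the inserted inverter is ignored.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay at the far end of an RC ladder.

    resistances[i] is the series resistance of stage i, and capacitances[i]
    is the capacitance to ground at the node that follows it. Each node
    capacitance is charged through all the resistance upstream of it.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    delay = 0.0
    upstream_resistance = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_resistance += r
        delay += upstream_resistance * c
    return delay


def chain_delay(num_transistors, r_on=1.0, c_node=1.0):
    """Delay of num_transistors identical pass transistors in series.

    r_on and c_node are placeholder values, not characterized data.
    """
    return elmore_delay([r_on] * num_transistors, [c_node] * num_transistors)


if __name__ == "__main__":
    unbroken = chain_delay(6)
    # Two segments of 3 separated by an inverter (inverter delay ignored).
    segmented = 2 * chain_delay(3)
    print("6 in series:", unbroken)
    print("2 segments of 3:", segmented)
```

Because the delay of an unbroken chain grows roughly with the square of its length, splitting the chain into short segments reduces the total, which is consistent with inserting inverters every few transistors.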
The following table lists the ISCAS benchmarks for which the "Local" option for BDD's is recommended. For all other benchmarks, use the "Global" option, keeping in mind the guideline at the end of Section 3.
Example | Option for PTLS |
C7552 | Local |
C6288 | Local |
C3540 | Local |
C1908 | Local |
C432 | Local |
C499 | Local |
C1355 | Local |
C2670 | Local |
C5315 | Local |
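The table can be turned into a small lookup when scripting benchmark runs. The Python sketch below is one way to do it; the helper name `recommended_option` is hypothetical, and mapping the table's "Local" and "Global" entries to the `-local` and `-global` options is an assumption based on the ptls commands shown earlier.

```python
# ISCAS benchmarks for which the table recommends the "Local" option.
LOCAL_BENCHMARKS = {
    "C7552", "C6288", "C3540", "C1908", "C432",
    "C499", "C1355", "C2670", "C5315",
}


def recommended_option(benchmark):
    """Return the ptls option suggested for a benchmark name.

    Names are matched case-insensitively; anything not in the table
    falls back to -global, as the text advises.
    """
    name = benchmark.strip().upper()
    return "-local" if name in LOCAL_BENCHMARKS else "-global"


if __name__ == "__main__":
    for name in ("C432", "C17", "c6288"):
        print(name, recommended_option(name))
```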