
Pass transistor logic (PTL) offers a good area/power-delay trade-off as an alternative to static CMOS circuits in today's technologies. It may continue to do so even when leakage power becomes dominant in the sub-100 nanometer era, because PTL implementations occupy less area than the corresponding static CMOS implementations. However, design automation tools targeting pass transistor logic, specifically synthesis and layout tools, are not available, and the potential of pass transistor logic remains unexplored. The Pass Transistor Logic Synthesizer (PTLS) tool is being developed to address the design automation needs of pass transistor logic. The current executable of PTLS addresses the performance-driven synthesis problem for PTL: it uses the recursive bipartitioning of BDD's reported in the papers listed under References to minimize the delays in PTL circuits under an Elmore-like delay model.

2. Executable & Source Code

3. I/O Formats

The PTLS program accepts a Boolean network in Berkeley Logic Interchange Format (BLIF) and generates transistor netlists in SPICE format, which can be simulated using standard circuit simulators such as HSPICE. The program also reports the number of pass transistors, inverters and buffers, and the delay of the final circuit under the Elmore-like delay model. The user should specify the kind of BDD's to be used, monolithic or multi-level; the results differ depending on the choice. We advise using monolithic BDD's for MCNC benchmarks and multi-level BDD's for ISCAS benchmarks (except for C17). For ISCAS benchmarks, one should also process the netlist in SIS with script.rugged (or a similar script). In either case, the logical netlists should not have "constant" nodes; these can be eliminated using the "sweep" command in SIS. The two flows are shown in Figure 1.

Figure 1. Overall Flow for the Performance Driven PTL Synthesis
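The requirement that the netlist contain no "constant" nodes can be checked before PTLS is run. The following is a minimal sketch in Python and is not part of PTLS; the function name find_constant_nodes is hypothetical. It relies only on the BLIF convention that a ".names" line listing a single signal (an output with no inputs) defines a constant.

```python
def find_constant_nodes(blif_text):
    """Return the names of nodes declared by a .names line with no inputs."""
    # Join physical lines that end in a backslash (BLIF line continuation)
    # and drop '#' comments.
    logical_lines = []
    pending = ""
    for raw in blif_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        logical_lines.append(pending + line)
        pending = ""

    constants = []
    for line in logical_lines:
        tokens = line.split()
        # ".names out" lists only an output signal, so the node is a constant.
        if len(tokens) == 2 and tokens[0] == ".names":
            constants.append(tokens[1])
    return constants


example = """.model demo
.inputs a b
.outputs f g
.names a b f
11 1
.names g
1
.end
"""
print(find_constant_nodes(example))
```

If the returned list is not empty, the netlist should be passed through the "sweep" command in SIS before it is given to PTLS.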

If file1.blif is a file describing combinational logic in BLIF format, then the commands to run for the flows in Figure 1(a) and 1(b) are shown in Table 1. The generated output file contains the transistor netlist; it is named file1Sweep.blif.sp and can be simulated using HSPICE.

Table 1. Actual commands to be run for the flows in 1(a) and 1(b)

The PTLS package uses the Colorado University Decision Diagram (CUDD) package and the parser from the BDD based Synthesis (BDS) package to build the BDD's. The PTLS program does not use any BDD optimization algorithms except for variable ordering algorithms. The "-global" option specifies that monolithic BDD's are to be built, while the "-local" option specifies that a BDD is to be built separately for each node in the Boolean network. Typically, one should use the "-global" option for MCNC benchmarks (and C17) and "-local" for ISCAS benchmarks. In essence, use monolithic BDD's whenever they can be built and are of reasonable size; otherwise, use local BDD's.

4. Example

Figure 2 shows the BDD for the carry output of a 3-bit adder and the corresponding PTL implementation obtained by directly mapping the BDD nodes onto pass transistors.

Figure 2. (a) BDD for carry output for 3-bit adder, (b) Corresponding PTL implementation
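To make the direct mapping concrete, the following Python sketch builds a reduced ordered BDD for the carry function by plain Shannon expansion and counts its nodes. It is an illustration only, not the CUDD-based code used by PTLS. The variable order a0, b0, a1, b1, a2, b2, the absence of a carry-in, and the figure of two pass transistors per BDD node (a 2:1 multiplexer, ignoring inverters and buffers) are assumptions made for this example.

```python
ORDER = ["a0", "b0", "a1", "b1", "a2", "b2"]  # assumed variable order


def carry(a0, b0, a1, b1, a2, b2):
    """Carry out of a 3-bit adder, assuming no carry-in."""
    a = a0 + 2 * a1 + 4 * a2
    b = b0 + 2 * b1 + 4 * b2
    return (a + b) >> 3


def build_bdd(func, num_vars):
    """Build a reduced ordered BDD for func by Shannon expansion.

    Returns (root, nodes), where nodes maps a node id to
    (level, low_child, high_child). Ids 0 and 1 are the terminals.
    """
    unique = {}  # (level, low, high) -> node id, merges isomorphic nodes
    nodes = {}

    def expand(level, prefix):
        if level == num_vars:
            return func(*prefix)
        low = expand(level + 1, prefix + (0,))
        high = expand(level + 1, prefix + (1,))
        if low == high:  # redundant test: skip the node
            return low
        key = (level, low, high)
        if key not in unique:
            unique[key] = len(unique) + 2
            nodes[unique[key]] = key
        return unique[key]

    return expand(0, ()), nodes


root, nodes = build_bdd(carry, len(ORDER))
per_level = [sum(1 for lvl, _, _ in nodes.values() if lvl == i)
             for i in range(len(ORDER))]
print("BDD nodes:", len(nodes))
print("nodes per variable:", dict(zip(ORDER, per_level)))
# Assumed mapping: one 2:1 multiplexer (two pass transistors) per node.
print("pass transistors under direct mapping:", 2 * len(nodes))
```

Under these assumptions the BDD has one b1 node and two a2 nodes, which matches the three sub-BDD's referred to in the discussion of Figure 3.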

The PTLS tool recursively bipartitions the BDD, halving the critical paths. The resulting BDD's and one-hot multiplexers are used to implement the given function. Figure 3 shows the PTL implementation generated by the PTLS program for the function shown in Figure 2(a). The functions O0, O1 and O2 are the select functions for the one-hot multiplexer, and exactly one of {O0, O1, O2} evaluates to 1 for any assignment of the inputs a0, b0 and a1. The output of the multiplexer implements the carry function, while the data functions are simply the PTL implementations of the BDD's rooted at the b1 node and the a2 nodes in the BDD for the carry function shown in Figure 2(a).

Figure 3. PTL implementation of the carry function using BDD decomposition and one-hot multiplexer
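The decomposition into select and data functions can be checked on truth tables. The Python sketch below is a simplified illustration, not the PTLS bipartitioning algorithm: it cuts the carry function after the inputs a0, b0 and a1, groups the top assignments by the sub-function they lead to, and confirms that a one-hot multiplexer over those sub-functions reproduces the carry. The function names, the cut position and the absence of a carry-in are assumptions made for this example.

```python
from itertools import product


def carry(a0, b0, a1, b1, a2, b2):
    """Carry out of a 3-bit adder, assuming no carry-in."""
    return ((a0 + 2 * a1 + 4 * a2) + (b0 + 2 * b1 + 4 * b2)) >> 3


def decompose(func, num_top, num_bottom):
    """Split func at a cut after its first num_top variables."""
    bottoms = list(product((0, 1), repeat=num_bottom))
    data = []    # distinct truth tables over the bottom variables
    select = {}  # top assignment -> index of the data function it reaches
    for top in product((0, 1), repeat=num_top):
        table = tuple(func(*top, *bottom) for bottom in bottoms)
        if table not in data:
            data.append(table)
        select[top] = data.index(table)
    return select, data, bottoms


def mux_output(select, data, bottoms, top, bottom):
    """Evaluate the one-hot multiplexer for one input assignment."""
    # O_k is 1 only for the data function reached by this top assignment.
    one_hot = [int(select[top] == k) for k in range(len(data))]
    row = bottoms.index(bottom)
    # Each data value is gated by its select line; the gated values are ORed.
    return max(o & d[row] for o, d in zip(one_hot, data))


select, data, bottoms = decompose(carry, 3, 3)
print("data functions:", len(data))
```

Each top assignment maps to exactly one data function, so the select lines are one-hot by construction; the number of distinct data functions determines the width of the multiplexer.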

The netlists generated by the PTLS program tend to have smaller delays than the netlists generated by direct mapping of BDD's, albeit at the cost of area. Using a max-flow min-cut algorithm, PTLS minimizes the area penalty (with some inaccuracies due to area estimation). The static CMOS implementation of the same carry function, obtained by running script.delay in SIS, is shown in the following figure.

The PTL implementation shown in Figure 3 compares favorably with the static CMOS implementation in area, power and delay. This trend holds not only for arithmetic circuits (which can usually be implemented efficiently in PTL) but also for AND-intensive random logic circuits. The efficiency of the PTL implementation is due to the following factors:

Remark:

Normally, designers use cell characterization to obtain the area, power and delay numbers for the cells; these numbers are then used while designing a circuit. Libraryless design may create problems for designers: every time they need to come up with a circuit, they have to consider the whole design space and characterize all the cells that could possibly be used. Cell characterization can be avoided by using transistor-level delay models, in which case characterizing the transistor suffices. For PTL, the designer has to characterize the pass transistor (and inverters with weak pull-up) and then use C-R-C PI models to estimate the delay. NMOS pass transistors pass rising transitions more slowly than falling transitions. Since inverters are inserted after every few (typically 3) transistors in series, a rising transition in the current segment becomes a falling transition in the next segment, and the effect of the slow rising transition averages out.
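The effect of inserting inverters into a series chain can be illustrated with a textbook Elmore delay calculation. The Python sketch below is not the delay model implemented in PTLS: it lumps each pass transistor into one series resistance and one node capacitance instead of a C-R-C PI model, treats the inverter as a fixed delay, and uses unit-less values chosen only for illustration. All names and parameters are assumptions.

```python
def elmore_chain_delay(resistances, capacitances):
    """Elmore delay at the far end of an RC ladder.

    resistances[i] is the series resistance of stage i and
    capacitances[i] is the capacitance at the node after stage i.
    """
    delay = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r            # resistance between the driver and this node
        delay += upstream_r * c    # each capacitor sees all upstream resistance
    return delay


def segmented_delay(n, segment_len, r, c, t_inv):
    """Delay of n series transistors with an inverter after every segment."""
    total = 0.0
    remaining = n
    while remaining > 0:
        k = min(segment_len, remaining)
        total += elmore_chain_delay([r] * k, [c] * k)
        remaining -= k
        if remaining > 0:          # an inverter separates two segments
            total += t_inv
    return total


# Hypothetical unit values: R = 1, C = 1, inverter delay = 0.5.
unbroken = elmore_chain_delay([1.0] * 6, [1.0] * 6)
split = segmented_delay(6, 3, 1.0, 1.0, 0.5)
print(unbroken, split)
```

For a uniform chain the unbroken delay grows quadratically with the number of series transistors, while the segmented delay grows roughly linearly, which is the usual motivation for limiting the number of transistors between inverters.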

5. Test cases

The following table lists the ISCAS benchmarks for which the "Local" option for BDD's is recommended. For all other benchmarks, one should use the "Global" option, keeping in mind the guideline at the end of Section 3.


Example	Option for PTLS
C7552	Local
C6288	Local
C3540	Local
C1908	Local
C432	Local
C499	Local
C1355	Local
C2670	Local
C5315	Local

Table 5. The options for ISCAS benchmarks

References

R. S. Shelar and S. S. Sapatnekar, Recursive Bipartitioning of BDD's for Performance Driven Synthesis of Pass Transistor Logic, Proceedings of IEEE/ACM ICCAD, Nov. 2001, pp. 449-452.
R. S. Shelar and S. S. Sapatnekar, BDD Decomposition for the Synthesis of High Performance PTL Circuits, Workshop Notes of IEEE IWLS, June 2001, pp. 298-303.

Executable (i686/Linux 2.4.9)	PTLS-Linux-v10.tar.gz
Executable (Sparc/SunOS 5.8)	PTLS-SunOS-v10.tar.gz
Source code	PTLS-Source-v10.tar.gz

Flow in Figure 1(a)	Flow in Figure 1(b)
sis> read_blif file1.blif	sis> read_blif file1.blif
sis> sweep	sis> source script
sis> write_blif file1Sweep.blif	sis> sweep
sis> quit	sis> write_blif file1Sweep.blif
$ ptls -global file1Sweep.blif	sis> quit
	$ ptls -local file1Sweep.blif

Pass Transistor Logic Synthesizer (PTLS)

1. Introduction
