An Approach to Low-power, High-performance, Fast Fourier Transform Processor Design
Department of Electrical Engineering
The Fast Fourier Transform (FFT) is one of the most widely used digital signal processing algorithms. While advances in semiconductor
processing technology have enabled the performance and integration of FFT processors to increase steadily, these advances have also,
unfortunately, caused the power consumed by processors to increase as well. As a result, the number of potential FFT
applications limited by power rather than performance is significant and growing.
For many CMOS circuits, energy consumption is proportional to the square of the supply voltage. Consequently, tremendous efficiency can be gained by
aggressively reducing the supply voltage. In order to maintain good performance with robust operation, however, MOSFET threshold voltages
must also be reduced and some circuits must be redesigned.
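
The quadratic dependence on supply voltage can be illustrated with a small calculation. This is a minimal sketch, assuming the idealized switching-energy model E = C * Vdd^2 with an unchanged switched capacitance C; it ignores leakage and short-circuit energy, and the function name `dynamic_energy_ratio` is hypothetical rather than taken from this work.

```python
def dynamic_energy_ratio(vdd_new: float, vdd_ref: float) -> float:
    """Switching energy per operation at vdd_new relative to vdd_ref.

    Assumption: E = C * Vdd**2 with the same switched capacitance C at
    both voltages; leakage and short-circuit energy are ignored.
    """
    if vdd_new <= 0 or vdd_ref <= 0:
        raise ValueError("supply voltages must be positive")
    return (vdd_new / vdd_ref) ** 2


# Example: scaling the supply from 3.3 V down to 1.1 V.
ratio = dynamic_energy_ratio(1.1, 3.3)
print(f"energy per operation: {ratio:.3f} of the 3.3 V value")
```

Under this model, a threefold reduction in supply voltage lowers the ideal switching energy per operation to about one ninth. Measured savings will differ, because circuit delay and leakage also change with voltage.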
A proposed data-caching algorithm caches data from main memory using a much smaller multi-ported cache, and facilitates increased energy-
efficiency (by reducing communication energy) and performance (through deep pipelining). This algorithm also allows the processor to be
partitioned such that roughly one-half of the processor, comprising the datapath and caches, has high activity, while the other half (main memory)
has lower activity and can be operated at a higher threshold voltage to reduce leakage currents.
Spiffee is a full-custom, 460,000-transistor, single-chip, 1024-point, 36-bit (18 Re + 18 Im) FFT processor designed to operate at very low
supply voltages. It calculates a complex radix-2 butterfly every cycle and uses unique "hierarchical bitline" SRAM and ROM memories which
operate well in a low-Vdd, low-Vt environment. The processor's substrate and well nodes are connected to pads and are accessible for biasing
to adjust transistor thresholds. Spiffee has been fabricated in a standard 0.7 µm (Lpoly = 0.6 µm) CMOS process and is fully functional. At a
supply voltage of 1.1 V, it operates at 16 MHz and 9.5 mW with an adjusted energy efficiency more than 16 times greater than the previously
most-efficient known FFT processor. At 3.3 V, it is functional at 173 MHz. Portions of the processor were fabricated in a low-threshold process
using 0.8 µm design rules and Lpoly = 0.26 µm. Early measurements predict that the processor will operate at 57 MHz and less than 10 mW at
Vdd = 400 mV, an efficiency more than 65 times greater than that of other known processors.
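
To put the one-butterfly-per-cycle figure in context, the sketch below shows a textbook radix-2 butterfly and an idealized estimate of transform time. It assumes a decimation-in-time butterfly and a perfectly filled pipeline with no memory or I/O stalls; the function names are hypothetical, and the result is a lower bound rather than a measured value. It is also not the "adjusted energy efficiency" metric quoted above.

```python
def radix2_butterfly(a: complex, b: complex, w: complex) -> tuple:
    """One decimation-in-time radix-2 butterfly: (a + w*b, a - w*b)."""
    t = w * b
    return a + t, a - t


def butterflies_per_fft(n: int) -> int:
    """A radix-2 FFT of length n (a power of two) uses (n/2)*log2(n) butterflies."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    return (n // 2) * (n.bit_length() - 1)


def ideal_fft_time_s(n: int, clock_hz: float) -> float:
    """Idealized transform time, assuming one butterfly completes every cycle."""
    return butterflies_per_fft(n) / clock_hz


# Figures from the abstract: 1024 points, 16 MHz and 9.5 mW at 1.1 V.
time_s = ideal_fft_time_s(1024, 16e6)
energy_j = 9.5e-3 * time_s  # power times idealized time
print(f"{butterflies_per_fft(1024)} butterflies, "
      f"{time_s * 1e6:.0f} us, {energy_j * 1e6:.2f} uJ per transform")
```

Under these assumptions, a 1024-point transform takes 5120 butterflies, or about 320 µs at 16 MHz, which corresponds to roughly 3 µJ per transform at 9.5 mW.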