dspZONE Products for the week of October 27, 2008
picoChip Says…
Leadership Extended With 262 GIPS, 35 GMACS Multi-Core Wireless Processors
Expanded product line sets new benchmark in high-performance signal processing; the market’s only device to deliver true single-chip IO-MIMO for WiMAX
picoChip has expanded its leadership in massively multi-core architectures with the announcement of new high-performance picoArray devices. The PC202-10, PC203-10 and PC205-10 multi-core DSPs deliver processing power of up to 262GIPS and 35GMACS. These enhanced members of the industry-leading PC20x series allow designers to deliver advanced features and use fewer devices in highly demanding applications such as advanced wireless infrastructure products.
The PC20x-10 devices are based on the same proven multi-core picoArray architecture as existing PC20x products, but with a larger array of processors, whilst retaining full code compatibility. The picoChip picoArray has been demonstrated in independent benchmarks to deliver a 40-fold advantage in price-performance, and eight times higher absolute performance, compared with traditional DSPs.
The performance uplift in the PC20x-10 devices is achieved not by traditional clock speed increases or by shrinking the semiconductor manufacturing technology employed: instead the multi-core nature of the picoArray is used to increase the number of processor elements in the devices from 248 to 273. This strategy also increases memory size by eight per cent, giving designers more scope to implement value-added functionality as well as to create and enhance robust turnkey solutions. Because clock rate does not change, the new devices are fully code-compatible; software written for the smaller chips can immediately run on the new devices.
The extra processing headroom of the PC20x-10 devices gives OEMs significantly more flexibility in creating 3G, WiMAX and 4G wireless products. For example, complex multi-sector cellular basestations can be realized using fewer chips; femtocells can include new features or custom functionality; and premium products can be equipped with enhanced features. The PC205-10 is at the heart of picoChip’s single-chip WiMAX solution, the industry’s only offering capable of delivering true IO-MIMO – a critical requirement of mobile WiMAX.
Doug Pulley, CTO and co-founder at picoChip said: “The PC20x-10 devices represent a vital step in the evolution of the picoChip product offering. The PC20x family is the market leader, and for most applications we see today provides more than enough processing power. However, we see some customers pushing the boundaries in design innovation and that’s where the PC20x-10s will really come into their own. An attraction of our multi-core architecture is how easy it is to scale the array to suit the needs of the problem: smaller for lower price or bigger, as here, for higher performance.”
Launched in 2006, the PC20x series of cost-effective multi-core DSP chips for next-generation wireless broke the $1/GMAC barrier. Amongst other applications they have become the industry-leading femtocell solutions and the de facto standard for WiMAX PHY implementations. All three basic variants in the series are suited to complete software radio systems, and full reference designs are available for WiMAX (both 16d and 16e) and WCDMA (including HSDPA, with upgrade to HSUPA).
EN-Genius Says…
Although picoChip’s recent announcement is a relatively low-level upgrade of their current 202/203/205 family of multi-core DSPs, it’s a good opportunity to finally take a closer look at the remarkable array-based processing architecture that has helped them win places in so many designs. When it debuted back around 2002, picoChip was one of a flurry of alternative DSP architectures (such as the MorphICs, Chameleon, and Morpho/Motorola compute fabrics) that promised to deliver more processing power, more flexibility, and lower energy consumption. Six years later, its flexibility, efficiency, and ease of programming have helped it survive and thrive while most of its contemporaries have been consigned to the dustbin of high-tech history.
The picoChip family of compute arrays uses anywhere from 250 to 350 16-bit LIW DSP-oriented cores, each running at 160 MHz. Each processor has its own local instruction and data memory, allowing it to run as a stand-alone element with no need to steal bus bandwidth or other resources from its neighbours for any basic operation. The DSPs can be connected at will using a switched matrix that is mapped into the processor memory space. Bus bandwidth is allocated on a time-slot basis and is completely deterministic, so the bus appears to be completely non-blocking. picoChip says that this eliminates the non-determinism caused by resource conflicts, cycle stealing, cache misses, pipeline bubbles, and other random events experienced by many other multi-processor architectures.
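To make the time-slot idea concrete, here is a minimal Python sketch of a statically scheduled bus. It is not picoChip's actual interconnect or tooling: the single shared bus, the frame length, the block names, and the helper functions `build_schedule` and `worst_case_wait` are all assumptions made for illustration. The point it shows is that once every link owns fixed slots in a repeating frame, the longest possible wait for a transfer is known before the program ever runs.

```python
def build_schedule(links, frame_len):
    """Give each (src, dst, slots_needed) link fixed slots in a repeating frame.

    Returns a list of length frame_len. Entry i is the (src, dst) pair that
    owns slot i, or None if the slot is idle. (Hypothetical model, not the
    vendor's real allocation algorithm.)
    """
    schedule = [None] * frame_len
    next_free = 0
    for src, dst, slots_needed in links:
        if next_free + slots_needed > frame_len:
            raise ValueError("not enough slots in the frame for all links")
        for _ in range(slots_needed):
            schedule[next_free] = (src, dst)
            next_free += 1
    return schedule


def worst_case_wait(schedule, link):
    """Longest gap, in slots, between two consecutive slots owned by `link`."""
    owned = [i for i, owner in enumerate(schedule) if owner == link]
    if not owned:
        raise ValueError("link has no slots in this schedule")
    frame_len = len(schedule)
    gaps = []
    for k, slot in enumerate(owned):
        nxt = owned[(k + 1) % len(owned)]
        gap = (nxt - slot) % frame_len
        # A link with a single slot waits one whole frame for its next turn.
        gaps.append(gap if gap else frame_len)
    return max(gaps)


# Illustrative links between made-up processing blocks.
LINKS = [("fir", "fft", 2), ("fft", "demap", 1)]
SCHEDULE = build_schedule(LINKS, frame_len=4)

print(SCHEDULE)
print(worst_case_wait(SCHEDULE, ("fir", "fft")))
print(worst_case_wait(SCHEDULE, ("fft", "demap")))
```

Because the schedule is fixed when the design is built, no link can delay another at run time, which is the property the article describes as the bus appearing non-blocking.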
The processor array is also sprinkled with hardware accelerator cores that look just like another processor to the regular DSPs. They help make quick work of compute-intensive tasks like FFT/IFFT, Viterbi, Turbo coding, or MIMO processing. Off-chip memory (32-bit SDRAM or DDRAM) shares the same internal switched bus and looks like another processor to the DSP cores.
The 160 MHz processor clock does mean that it’s not much good for signals that require more than 160 Msample/s, but that’s more than enough to handle both the voice processing and baseband processing tasks in 3G/4G wireless or other systems (WiMAX, Wi-Fi). An on-board ARM9 RISC engine that can handle UI, host processor or MAC functions is available on some models, so all you need to do is add an RF front-end and some memory to build a complete wireless picocell or commercial-grade access point.
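The sample-rate ceiling follows from simple arithmetic: a core clocked at 160 MHz that must touch every sample has 160 MHz divided by the sample rate in cycles to spend on each one. The short Python sketch below works that out. The function name `cycles_per_sample` and the example rates are illustrative assumptions, not figures from picoChip (3.84 Mchip/s is the standard WCDMA chip rate; the 20 Msample/s figure is arbitrary).

```python
CORE_CLOCK_HZ = 160e6  # per-core clock quoted in the article


def cycles_per_sample(sample_rate_hz, clock_hz=CORE_CLOCK_HZ):
    """Clock cycles one core can spend on each sample of a stream."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return clock_hz / sample_rate_hz


# Illustrative rates only; real designs split the work across many cores.
for label, rate in [
    ("clock-rate limit", 160e6),
    ("arbitrary 20 Msample/s stream", 20e6),
    ("WCDMA chip rate", 3.84e6),
]:
    print(f"{label}: {cycles_per_sample(rate):.1f} cycles per sample per core")
```

At the 160 Msample/s limit a core has a single cycle per sample, which leaves no room for useful work. At baseband rates the budget grows to tens of cycles per core, and the array multiplies that by however many cores are assigned to the task.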
Over the years I’ve seen several very promising compute array chips fail to gain market traction, often because they were painfully difficult to program. picoChip appears to have solved this problem by developing an ANSI C compiler that masks enough of the complexity of the compute array to make development relatively straightforward without introducing too much inefficiency. The compiler breaks the high-level code down into tasks for each core or core cluster as needed. The one unconventional step a programmer is required to perform is to define the software as a series of functional blocks and provide the compiler with a map that defines the connections between them. From there, the compiler builds local processing clusters out of the individual cores to handle specific functions or even build multi-tiered hierarchical processing structures. If necessary, a separate processor can be used as a controller to manage the cluster. The compiler also has a manual mode that allows you to control things like how many cores to assign to a particular task, or assign variables to a specific core, but picoChip says it’s rarely used.
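As a rough picture of what "functional blocks plus a connection map" might look like, here is a small Python sketch. It does not use picoChip's tool syntax, which the article does not show. The block names, the core counts, and the naive `place` function are all hypothetical, and a real tool would also have to consider bus slots, memory, and hardware accelerators.

```python
# Hypothetical functional blocks: name -> number of cores requested.
BLOCKS = {"adc_in": 1, "fir": 4, "fft": 8, "demap": 2}

# Hypothetical connection map: (producer, consumer) pairs.
CONNECTIONS = [("adc_in", "fir"), ("fir", "fft"), ("fft", "demap")]


def place(blocks, connections, total_cores):
    """Assign each block a contiguous run of core indices, in declaration order.

    A deliberately naive stand-in for the mapping step the compiler performs.
    """
    for src, dst in connections:
        if src not in blocks or dst not in blocks:
            raise KeyError(f"connection {src}->{dst} names an unknown block")
    placement = {}
    next_core = 0
    for name, cores_needed in blocks.items():
        if next_core + cores_needed > total_cores:
            raise ValueError("design needs more cores than the array provides")
        placement[name] = list(range(next_core, next_core + cores_needed))
        next_core += cores_needed
    return placement


PLACEMENT = place(BLOCKS, CONNECTIONS, total_cores=248)
for name, cores in PLACEMENT.items():
    print(f"{name}: cores {cores[0]}..{cores[-1]}")
```

The division of labour is the part worth noticing: the programmer states what the blocks are and how they connect, and the tool decides where they live. The manual mode described above would correspond to overriding the per-block core counts or the placement by hand.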
This most recent addition to the picoChip family adds around 10% more processing power to the 248-core array by putting to work the column of redundant cores that is normally held in reserve to mask defective processor cores. It’s not a new strategy (Wintegra and Cavium, for example, use redundant compute elements to improve yield), but it does highlight the improvements in yield and the higher production volume that these parts are enjoying. While picoChip would not confirm it, I’d expect they’ll offer a similar upgrade for their 350-core model in the near future.
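The "around 10%" figure can be checked against the core counts quoted in the press release (248 processor elements growing to 273). The Python snippet below does the arithmetic. It assumes that performance scales linearly with core count at a fixed clock, which is a simplification rather than a vendor statement.

```python
BASE_CORES = 248      # processor elements in the existing PC20x devices
UPGRADED_CORES = 273  # processor elements in the PC20x-10 devices

extra_cores = UPGRADED_CORES - BASE_CORES
uplift_pct = 100.0 * extra_cores / BASE_CORES

print(f"extra cores: {extra_cores}")
print(f"uplift: {uplift_pct:.1f}%")
```

That works out to 25 additional cores, or an uplift of just over 10 per cent.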
Available now, the PC202-10 is priced from $31 in volume.