programmablelogicZONE Products for the week of December 15, 2008
Achronix Semiconductor Says…
Patented picoPIPE Acceleration Delivers World’s Fastest FPGA
- Capable of 1.5 GHz peak performance
- First device in the Speedster family embeds 20 lanes of 10.3 Gbit/s SerDes and four independent 1066 Mbit/s DDR2/DDR3 controllers
- Speedster uses familiar LUT-based fabric and standard synthesis and simulation tools so designers can use their existing RTL
Signaling a breakthrough in three decades of field-programmable gate array (FPGA) design where performance has been sacrificed for flexibility and time-to-market, Achronix Semiconductor announced that it has already begun shipping the world’s fastest FPGAs. The Speedster family, with the SPD60 as its initial member, delivers speeds up to 1.5 GHz, which represents a three-fold increase in performance over existing FPGAs.
Achronix early engagement customers have already found success with Speedster in applications requiring ASIC-like performance, namely networking, telecommunications, test and measurement, encryption and other high-performance applications. These types of applications are an ideal fit for the Speedster family of FPGAs. Achronix has partnered with leading synthesis vendors to make industry-standard tools and methodologies compatible with the Speedster family. Designers can leverage their existing Verilog and VHDL designs. The Achronix CAD environment supports both Synopsys (formerly Synplicity) Synplify Pro and Mentor Graphics’ Precision Synthesis tools for RTL synthesis. In addition, the Achronix CAD environment provides the necessary tools for physical implementation, performance optimization, timing analysis, simulation, debug, and device programming.
“Designers have been lulled into expecting only incremental performance improvements with each new generation of FPGA,” said John Holt, Achronix founder, chairman and CEO. “Our product provides a disruptive leap in performance that opens up new worlds of application design previously unavailable to engineers using FPGAs.” The Speedster family of FPGAs uses the Achronix patented picoPIPE acceleration technology that speeds the way data moves through the FPGA fabric. In the absence of a global clock, picoPIPEs use simple handshake protocols to efficiently control data flow, resulting in significantly improved performance, all while using standard RTL for design entry and employing familiar FPGA tools. By coupling this innovative technology with a 10.3 Gbits/s serializer/deserializer (SerDes) to facilitate high system throughput and integrated DDR2/DDR3 controllers for high-speed memory interfaces, the Speedster family provides the I/O speed to match its outstanding core performance. The device is manufactured in TSMC’s high performance 65 nm G+ CMOS process.
“The FPGA market is a tough arena to enter and a truly innovative approach is needed to compete and succeed in this space,” said Rich Wawrzyniak, senior market analyst at Semico Research Corp. “Based on their innovative approach to combining the benefits of their picoPIPE technology with a synchronous interface, coupled with an experienced team of FPGA industry executives and designers, Achronix appears to be the company that can finally provide an FPGA with ASIC-like performance and deliver a product that can compete with ASIC’s at the high end of the market.”
The Speedster 10.3 Gbits/s SerDes supports numerous high-speed interfaces, such as 40G/100G Ethernet, CEI-6G, 10 Gbits/s backplane, XFI, PCI Express (Generations 1 and 2), XAUI, Serial Rapid IO and Infiniband. Speedster FPGAs also include a complete out-of-the-box DDR2/DDR3 solution comprising a physical layer and controller that support memory interface speeds of up to 1066 Mbits/s. A key feature of Achronix’ picoPIPE technology is its tolerance to substantial variations in supply voltage. This gives users a valuable power-management tool: power consumption can be lowered by adjusting the core supply voltage.
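To put the voltage claim in perspective, here is a minimal sketch of the textbook first-order CMOS relationship, in which dynamic power scales with the square of the supply voltage at a fixed switching rate. This is a generic model, not an Achronix specification: the function name and the example voltages are hypothetical, and leakage and any accompanying change in throughput are ignored.

```python
def relative_dynamic_power(v_new: float, v_nominal: float) -> float:
    """Ratio of dynamic power at v_new to dynamic power at v_nominal.

    Assumes the first-order CMOS model P ~ activity * C * V^2 * f with
    activity, capacitance and switching rate held constant (an assumption;
    the release gives no figures).
    """
    if v_new <= 0 or v_nominal <= 0:
        raise ValueError("supply voltages must be positive")
    return (v_new / v_nominal) ** 2


if __name__ == "__main__":
    # Hypothetical example: dropping a 1.0 V core supply to 0.9 V.
    ratio = relative_dynamic_power(0.9, 1.0)
    print(f"dynamic power falls to {ratio:.0%} of nominal")
```

Under this model a 10% reduction in core voltage trims roughly a fifth of the dynamic power, which is why tolerance to voltage variation is useful as a power-management knob.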
EN-Genius Says…
I had a nice tame FPGA development tool scheduled for this review slot, but when I stumbled on Achronix' first public demonstration of its SPD60, a 1.5 GHz FPGA, here at the FPGA Summit, my priorities changed. When they came out of stealth mode in September 2008 and announced their plans to produce an ultra-fast FPGA, I’d unsuccessfully attempted to get in touch with Achronix to see if there was anything to their ambitious claims or if they were just another Silicon Valley scam. I had not been able to actually talk with anyone from the company until I caught their demo on the show floor. From what I was able to learn, it was worth the wait: judging by my initial discussions with Achronix technical staff, their recently-released SPD60, a 47 k-LUT FPGA, actually runs at its 1.5 GHz rated speed and could prove to be an awfully useful device in designs where performance and time-to-market are more important than price or power consumption. Due to the time constraints involved with getting this review to press for this issue, it will not have as much deep technical detail as usual, but hopefully it will arouse your interest enough for a more detailed follow-on article I hope to publish in the future.
The key to Achronix’ extreme performance is their unique asynchronous logic technology that allows data to pour through the SPD60’s internal elements at something close to the switching speed of its individual devices instead of having to wait for a traditional central clock signal to push everything along. Actually, it’s not the classical asynchronous logic with absolutely no clocking at all. Achronix has developed an architecture where the data itself provides some local timing information to the logic it's interacting with. The release above does a pretty good job of explaining it but I think a few additional comments might clarify things a bit more.
Simply put, data being moved between logic elements is passed to the receiving stage as soon as the receiving logic issues a ready handshake signal. Since there is no sample clock to identify a time period where data is valid, each bit is represented by two signal lines, one for a logical 0 and the other for a logical 1. Race conditions are avoided on the receive side with logical elements that will not generate a new output until all inputs are received. Inputs and outputs from each LUT block or multiplier element are connected to other FPGA elements using a crosspoint switch mechanism that’s very similar to the ones used in most SRAM-based FPGAs, except that the switch provides a gating function for the data so that each inter-switch connection along a route can act like a pipeline stage to store a bit. Besides increasing the data capacity of each pipeline significantly, this self-gated arrangement dramatically simplifies (but does not eliminate) timing closure issues that can plague some conventional FPGA designs. If this has intrigued you and you want to know more, I suggest that you download Achronix' well-written white paper, Introduction to Achronix FPGAs.
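The three ideas in that description (two wires per bit, gates that wait for all inputs, and stages that accept data only when ready) can be mimicked in software. The sketch below is a toy model, not Achronix's circuit: the names `NULL`, `ZERO`, `ONE`, `Stage`, and `run_pipeline` are hypothetical, and the "handshake" is reduced to a simple ready check.

```python
from typing import List, Optional, Tuple

Rails = Tuple[int, int]  # (wire for logical 0, wire for logical 1)

NULL: Rails = (0, 0)  # neither wire asserted: no data present
ZERO: Rails = (1, 0)  # the "0" wire is asserted
ONE: Rails = (0, 1)   # the "1" wire is asserted


def encode(bit: int) -> Rails:
    return ONE if bit else ZERO


def decode(rails: Rails) -> Optional[int]:
    """Return 0 or 1 for valid data, None when no data is present."""
    if rails == NULL:
        return None
    if rails == ZERO:
        return 0
    if rails == ONE:
        return 1
    raise ValueError("both rails asserted: illegal code word")


def dual_rail_and(a: Rails, b: Rails) -> Rails:
    """AND gate that produces no output until both inputs have arrived."""
    va, vb = decode(a), decode(b)
    if va is None or vb is None:
        return NULL
    return encode(va & vb)


class Stage:
    """One-bit pipeline stage; it is 'ready' only while it holds no data."""

    def __init__(self) -> None:
        self.slot: Rails = NULL

    def ready(self) -> bool:
        return self.slot == NULL

    def push(self, rails: Rails) -> None:
        assert self.ready() and rails != NULL
        self.slot = rails

    def pop(self) -> Rails:
        rails, self.slot = self.slot, NULL
        return rails


def run_pipeline(bits: List[int], depth: int) -> List[int]:
    """Send bits through `depth` stages with no clock, only ready checks."""
    stages = [Stage() for _ in range(depth)]
    pending = [encode(b) for b in bits]
    received: List[int] = []
    while pending or any(not s.ready() for s in stages):
        if not stages[-1].ready():               # consumer takes the output
            received.append(decode(stages[-1].pop()))
        for i in range(depth - 2, -1, -1):       # move tokens toward the output
            if not stages[i].ready() and stages[i + 1].ready():
                stages[i + 1].push(stages[i].pop())
        if pending and stages[0].ready():        # producer injects new data
            stages[0].push(pending.pop(0))
    return received


if __name__ == "__main__":
    print(run_pipeline([1, 0, 1, 1], depth=3))
    print(dual_rail_and(ONE, NULL))
```

Each stage holds a token until its successor is empty, so several bits can be in flight along one route at once, which is the sense in which every gated connection behaves like a pipeline register.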
Virtually all data enters and leaves the SPD60 via one of its many SerDes interfaces. I only had a few minutes to take a look at the output of one of its twenty-eight 10.3 Gbit/s transceivers after it had been passed through a backplane, but the signal I saw was clean and had a nice open data eye (one would expect no less for a show demo, though). I was told that the transceivers use a sophisticated mix of digital and analog equalization techniques (including some sort of DFE) plus transmit pre-emphasis. Unfortunately, the flurry of activity on the show floor and a tight schedule made it impossible to get many details on the transceiver's guts – something I’ll definitely correct in my next review. In addition, the device sports eight 5 Gbit/s SerDes connections: just the right amount to implement four two-lane 10 Gbit/s PCIe 2.0 host system interfaces or a single honking-fast 40 Gbit/s connection. In these early parts, there is no hard MAC logic for any of the SerDes, but Achronix will supply IP that can turn any transceiver into virtually any SerDes-based interface including Gigabit Ethernet, PCIe, XAUI, and XFI. Of course, this eats up large chunks of programmable logic that could be used for other purposes, but this is forgivable for these early chips. While no plans have been officially announced, it’s a good bet that we’ll see a second generation of Achronix chips sporting hard MACs or configurable gearboxes (a la Xilinx Virtex-5TXT) that will support most of the common interfaces without using up precious LUTs.
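The lane arithmetic for the eight 5 Gbit/s links is easy to check. The short sketch below uses hypothetical names, and the 8b/10b payload figure is my own addition: PCIe 2.0 uses 8b/10b line coding, so the usable data rate is 80% of the raw line rate.

```python
LANES_5G = 8            # 5 Gbit/s SerDes lanes quoted above
LANE_RATE_GBPS = 5.0    # raw line rate per lane, in Gbit/s


def raw_rate(lanes: int, lane_rate: float = LANE_RATE_GBPS) -> float:
    """Aggregate raw line rate of a bonded group of lanes, in Gbit/s."""
    return lanes * lane_rate


def payload_rate_8b10b(lanes: int, lane_rate: float = LANE_RATE_GBPS) -> float:
    """Usable rate after 8b/10b coding (8 data bits per 10 line bits)."""
    return raw_rate(lanes, lane_rate) * 8 / 10


two_lane_interfaces = LANES_5G // 2  # how many x2 interfaces fit in 8 lanes

if __name__ == "__main__":
    print(two_lane_interfaces, "x2 interfaces at", raw_rate(2), "Gbit/s raw each")
    print("all lanes bonded:", raw_rate(LANES_5G), "Gbit/s raw")
    print("x2 payload after 8b/10b:", payload_rate_8b10b(2), "Gbit/s")
```

So the "10 Gbit/s" and "40 Gbit/s" figures are raw line rates; a protocol using 8b/10b coding would deliver somewhat less payload.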
Despite their impressive performance, these are not miracle chips and, initial appearances notwithstanding, they still follow the laws of physics. In fact, device limitations and trade-offs that Achronix clearly admits to are good indications that the products are quite real and should come close to working as advertised. One of the most apparent handicaps of the Achronix architecture is that the extra handshaking required at each logic stage adds to the complexity of the chip, resulting in lower logic density per unit area of silicon than a conventional FPGA. The other drawback is that the same handshaking logic will slow down the pipeline speed of any data architecture which requires results to be fed back into the input. This means that any feedback-based filters or iterative processing functions will operate significantly below the 1.5 GHz that straight pipelined logic elements can sustain. In most cases, even highly-iterative functions will still run significantly faster than on a conventional FPGA, but if designers want to exploit the full potential of the chips they may have to look into alternative functions that are more amenable to pipelined architectures.
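As one illustration of what an "alternative function" might look like, consider a first-order recursive filter, where every output depends on the previous output, and a truncated feed-forward approximation that has no feedback path and can therefore be pipelined freely. This is my own generic example, not something Achronix showed; the function names, the coefficient 0.5, and the 16-tap length are hypothetical choices.

```python
from typing import List


def iir_first_order(xs: List[float], a: float) -> List[float]:
    """Feedback form: y[n] = a * y[n-1] + x[n]."""
    y = 0.0
    out = []
    for x in xs:
        y = a * y + x  # each result must loop back before the next can start
        out.append(y)
    return out


def fir_truncated(xs: List[float], a: float, taps: int) -> List[float]:
    """Feed-forward form: y[n] ~= sum of a**k * x[n-k] for k < taps."""
    coeffs = [a ** k for k in range(taps)]
    out = []
    for n in range(len(xs)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * xs[n - k]  # depends only on inputs, never on outputs
        out.append(acc)
    return out


if __name__ == "__main__":
    samples = [1.0, -1.0, 0.5, 0.25, -0.75] * 10
    exact = iir_first_order(samples, a=0.5)
    approx = fir_truncated(samples, a=0.5, taps=16)
    worst = max(abs(e - p) for e, p in zip(exact, approx))
    print(f"worst-case difference: {worst:.2e}")
```

The feed-forward version costs more multipliers and is only an approximation (the error shrinks as the coefficient gets smaller or the tap count grows), but none of its terms has to wait on a previous result, which is the property that lets a deeply pipelined fabric run at full rate.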
The other inconvenient truth is that speed is usually purchased by burning lots of power, and Achronix FPGAs are no exception. Between the extra complexity of the handshakes in the data path and the ridiculous speeds that this thing runs at, it’s not surprising that the SPD60 power consumption can run in the neighborhood of 50 W for designs where a significant amount of its logic is being used at its highest speeds. To be fair, Achronix says that many of the reference designs they’ve developed draw much less power – on the order of 20 W – 30 W. I also suspect that the logic itself is not all to blame and that the SerDes transceivers are also major contributors to such high power consumption. In any case, when you consider how much processing power you can extract from a single Speedster device, the 20 W – 50 W you have to feed it seems like a reasonable price to pay.
The most impressive thing about the SPD60 is that, other than its blazing performance (and the large heatsink required to keep it cool), it is designed to work pretty much like an ordinary mid-size FPGA. The SPD60 asynchronous logic elements are hidden from the outside world by a thin wrapper of synchronous logic that allows designers to treat it like any other FPGA or ASIC. It packs 47 k LUT elements, 144 18-kbit block RAMs, 735 kbit of distributed RAM, and 98 18 x 18 hardware multipliers, and it can work with either Synopsys' Synplify Pro or Mentor Graphics' Precision Synthesis tools for RTL synthesis. The synthesis output is then fed to Achronix' own post-processor to handle the actual place and route functions.
After seeing their demonstration, the big question in my mind is not whether Achronix' Speedster line of high-performance FPGAs will work, but whether they can be produced in sufficient volumes on a commercial fab process to make them commercially feasible. I do not know enough about the gate level architecture to say whether the Speedster will fall victim to small process variations that make the devices non-functional or whether such variations will simply affect the overall speed of the chip. My other concern is whether the design and back-end tools will actually be as simple to use as is claimed. I’ll try to find out more about this in subsequent reviews, along with verifying that their transceivers work somewhere near as well in real life as they appeared to in the demo. The only other question in my mind is whether there is a large enough market to support such an exotic family of devices profitably. I suspect that Achronix is, in performance, 2 - 3 years ahead of anything that mainstream applications will require. I also suspect that, even today, there are enough high-end applications (most likely advanced wireless, medical imaging, ultra-high-end network security, and various military signal processing tasks) that could keep the Achronix production line open until the rest of the world catches up with them.
The SPD60 is currently available to customers. Achronix says that volume pricing for the Speedster FPGA family will range from under $200 for their smallest part (the SPD30) in high production volumes to $2500 for their largest part (the SPD180) in low volumes. Given that the SPD60 is in the middle of the size range, it’s fair to assume that, depending on volume, it will probably cost between $800 and $1500.
SPD60 Product Brief Speedster Development Kit