networkZONE Products for the week of September 22, 2008
LSI Corporation Says…
Tarari T2000 10G Content Processor Breaks 1 W Per Gbit/s Barrier
Sets new 10 Gbit/s standard for performance and power consumption
LSI Corporation has announced the LSI Tarari T2000 series of silicon-based content processing solutions for high-speed networking applications. The T2000 offers 10 gigabit per second (Gbit/s) performance on a single chip, breaking the one watt per Gbit/s barrier for the first time.
“Tarari T2000 content processors can significantly offload any X86-, MIPS- or PowerPC-based processor in a network or server environment,” said Randy Smerik, senior vice president, Network Components Group, LSI. “The pay-off is much faster content processing at a fraction of the cost and power consumption which is a critical requirement for equipment providers.”
Operating at wire speeds, Tarari T2000 content processors offload critical network applications such as anti-virus and anti-spam detection, intrusion prevention, content-based billing and filtering, bandwidth management and QoS. For high-bandwidth applications, two T2000 chips can be coupled to provide up to 16 Gbit/s of throughput in a very small form factor. T2000 software can detect and load balance up to four 16 Gbit/s PCI Express boards, providing up to 64 Gbit/s of performance for the most demanding network environments.
Bob Wheeler, senior analyst at The Linley Group, said, “LSI is the market leader in content processing technology because they understand the problem better. Now, they also have the broadest and fastest product line for PCI Express.”
T2000-based systems use low-cost DDR RAM, available at a fraction of the cost of expensive SRAM required by other solutions. The family is based upon common industry interfaces and a common application programming interface, resulting in shorter design cycles, faster times to market and longer product life.
EN-Genius Says…
When I reviewed the Tarari FPGA-based T10 10 Gbit/s deep packet inspection engine board back in February of 2007, the company promised to introduce a single-chip ASSP version some time in the near future. I suspect that the T2000 roll-out was delayed a bit as Tarari underwent acquisition by LSI last year, but it seems to have been worth the wait. The T2000’s expanded processing capacity and enhanced I/O should go a long way towards addressing the growing bandwidth demands and increasingly complex security and QoS issues that carriers and enterprise networks are encountering. Designers will also welcome the T2000’s greatly improved power/performance ratio, which is well suited to the high-density systems that are beginning to dominate both CPE and network-side equipment.
The T2000 uses a similar architecture and programming model to the T1000 (announced March 2008) but adds significantly more processing power and throughput. At the heart of both processors is a cluster of regular expression (RegEx) processing engines that can look for patterns anywhere within a packet or across packet boundaries and, if necessary, flag the packet for further processing. The RegEx cores are fed by an array of what LSI refers to as DMA Agents, which parse the traffic and assign a particular stream to a particular RegEx core. The T2000 boasts 15 RegEx cores compared to six for the earlier T1000. Although it’s much faster than the T1000, the new chip is still able to use DDR2 DRAM instead of the hot, costly SRAM used in earlier high-performance engines, thanks to an integrated L1/L2 cache that eliminates most of the delays involved with the slower memory.
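As a rough illustration of what pinning a stream to a core means, here is a minimal Python sketch. The article does not say how the DMA Agents choose a core, so the CRC32 hash over the flow's addresses and ports, and the function name `core_for_flow`, are assumptions made purely for illustration. The only figure taken from the text is the count of 15 RegEx cores.

```python
import zlib

NUM_REGEX_CORES = 15  # T2000 core count quoted in the article


def core_for_flow(src: str, dst: str, sport: int, dport: int,
                  cores: int = NUM_REGEX_CORES) -> int:
    """Map a flow to a core index.

    Hashing the flow key with CRC32 is an assumption for this sketch,
    not a description of how the DMA Agents actually work.
    """
    key = f"{src}:{sport}->{dst}:{dport}".encode()
    return zlib.crc32(key) % cores


# Every packet of the same flow lands on the same core, so that core can
# keep the match state needed to find patterns spanning packet boundaries.
first = core_for_flow("10.0.0.1", "10.0.0.2", 40000, 80)
again = core_for_flow("10.0.0.1", "10.0.0.2", 40000, 80)
print(first == again)
```

The point of a sticky assignment like this is that per-stream match state never has to move between cores.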
Much of the speed of the T1000 and T2000 comes from careful attention to minimizing data latency at every potential choke point. Unlike most packet processors, which must first store an incoming packet in a buffer memory (Bay Micro and Xelerated being the notable exceptions), LSI’s Tarari uses a cut-through mechanism that begins processing the first portion of a packet even before the last bytes have arrived. Chunks of incoming packets are DMA-ed directly to their assigned RegEx core as they begin to stream, with no CPU intervention required. The cut-through mechanism accumulates intermediate inspection results as bytes are processed and uses them on subsequent parts of the packet.
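The idea of carrying intermediate results from one chunk to the next can be sketched in a few lines of Python. This is a software analogy only: it uses a Knuth-Morris-Pratt automaton for a single literal byte pattern in place of a real RegEx engine, and the class name `StreamMatcher` and its fields are hypothetical. What it shares with the cut-through approach described above is that no chunk is buffered and the match state survives between calls.

```python
class StreamMatcher:
    """Scans a byte stream chunk by chunk, keeping match state between chunks."""

    def __init__(self, pattern: bytes):
        if not pattern:
            raise ValueError("pattern must be non-empty")
        self.pattern = pattern
        self.fail = self._failure_table(pattern)
        self.state = 0    # number of pattern bytes matched so far
        self.offset = 0   # total bytes consumed from the stream
        self.hits = []    # stream offsets at which a match ends

    @staticmethod
    def _failure_table(p: bytes) -> list:
        # Standard KMP failure function: longest proper prefix of p[:i+1]
        # that is also a suffix of it.
        fail = [0] * len(p)
        k = 0
        for i in range(1, len(p)):
            while k and p[i] != p[k]:
                k = fail[k - 1]
            if p[i] == p[k]:
                k += 1
            fail[i] = k
        return fail

    def feed(self, chunk: bytes) -> list:
        """Process one chunk as it arrives; earlier chunks are never revisited."""
        for b in chunk:
            while self.state and b != self.pattern[self.state]:
                self.state = self.fail[self.state - 1]
            if b == self.pattern[self.state]:
                self.state += 1
            self.offset += 1
            if self.state == len(self.pattern):
                self.hits.append(self.offset)
                self.state = self.fail[self.state - 1]
        return self.hits


m = StreamMatcher(b"EVIL")
m.feed(b"xxEV")   # pattern starts in this chunk...
m.feed(b"ILyy")   # ...and completes in the next one
print(m.hits)
```

Because the only thing kept between calls is a small integer state, the scanner can start work on the first bytes of a packet without waiting for the rest.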
Internal latency is further reduced by a multi-tiered approach to storage. Much like the virtualization techniques used in high-powered servers, the T2000 memory controller allows each RegEx engine to believe it ‘owns’ a piece of virtual memory. The processor cluster gets around most external memory delays by storing the most frequently used patterns (sometimes called top-of-tree results) on-chip within the rules engine itself. If the engine has to go off-chip for a less frequently used pattern, the integrated L1/L2 cache shortens the tree to reduce access time.
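The tiering described above can be pictured with a small Python model. It is a sketch under stated assumptions, not the chip's design: the class name `TieredPatternStore`, the LRU replacement policy and the cache size are all invented for illustration, and a plain dictionary stands in for external DRAM.

```python
from collections import OrderedDict


class TieredPatternStore:
    """Toy three-tier lookup: resident hot set, small cache, slow backing store."""

    def __init__(self, on_chip: dict, external: dict, cache_size: int = 4):
        self.on_chip = on_chip        # hottest entries, always resident
        self.external = external      # full table in slower memory
        self.cache = OrderedDict()    # LRU cache in front of external memory
        self.cache_size = cache_size
        self.external_reads = 0       # how often the slow path was taken

    def lookup(self, key):
        if key in self.on_chip:               # tier 1: no memory access needed
            return self.on_chip[key]
        if key in self.cache:                 # tier 2: cache hit
            self.cache.move_to_end(key)
            return self.cache[key]
        self.external_reads += 1              # tier 3: go off-chip
        value = self.external[key]
        self.cache[key] = value
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)    # evict least recently used
        return value


store = TieredPatternStore(
    on_chip={"root": 1},
    external={"root": 1, "rare-a": 2, "rare-b": 3},
    cache_size=1,
)
for key in ["root", "rare-a", "rare-a", "rare-b"]:
    store.lookup(key)
print(store.external_reads)
```

In this model only the first touch of each rarely used pattern pays the external-memory cost, which is the effect the on-chip storage and cache are meant to achieve.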
The Tarari designers have also enhanced overall system performance with the ability to perform multiple operations with a single host bus transaction. Precious host processor cycles and PCIe bus bandwidth are conserved, as the T2000 requires only a single transaction to perform multiple inspection, decryption, decompression and scanning operations. This in large part explains why, at an IDF demo, a T1000 device was able to process 3 Gbit/s worth of antivirus traffic on a system with an Intel Tolapai host processor while consuming only 3% of the host’s capacity. LSI says that the T2000 also works nicely with either a Raza or Cavium part as its host processor (no, there is no direct competition since Cavium dropped RegEx processing capability from their PCIe-equipped product line).
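To make the single-transaction idea concrete, here is a hedged Python sketch in which one submitted job names an ordered chain of stages. Nothing here reflects the real Tarari API: `Job`, `FakeAccelerator` and the stage names are hypothetical, zlib stands in for hardware decompression, and a substring test stands in for signature scanning. Decryption is left out to keep the example self-contained.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class Job:
    payload: bytes
    ops: list                                   # ordered stages, e.g. ["decompress", "scan"]
    signatures: list = field(default_factory=list)


class FakeAccelerator:
    """Software stand-in for an offload device; counts host bus transactions."""

    def __init__(self):
        self.transactions = 0

    def submit(self, job: Job) -> dict:
        self.transactions += 1                  # one round trip, however many stages run
        data, result = job.payload, {}
        for op in job.ops:
            if op == "decompress":
                data = zlib.decompress(data)
            elif op == "scan":
                result["hits"] = [s for s in job.signatures if s in data]
            else:
                raise ValueError(f"unknown op: {op}")
        result["bytes_out"] = len(data)
        return result


acc = FakeAccelerator()
job = Job(zlib.compress(b"hello EICAR world"), ["decompress", "scan"], [b"EICAR", b"nope"])
print(acc.submit(job), acc.transactions)
```

Without chaining, the host would pay one transaction to decompress and another to scan, with the intermediate data crossing the bus twice.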
Regardless of which host processor you choose, the power savings you’ll see over most other approaches are significant. At a bit under 1 W/Gbit/s for the 10G chip (a bit more than 1 W/Gbit/s for the slower versions), the T2000 shaved 20 W off a previous design that used TCAMs and an earlier-model search engine that delivered only 2.5 Gbit/s.
The T2000’s I/O capacity has been appropriately beefed up to match the throughput of its 15 RegEx engines, beginning with an 8-lane PCI Express (Gen 1) host system interface. Depending on which of the T1000 series you are currently using, this represents either a 4x or 8x increase in bandwidth.
Outboard processing capability is expanded with a second 40 Gbit/s expansion bus (CPX) interface that allows you to gang up to two T2000s as slave processors. The CPX also provides a way to attach your own specialized FPGA-based accelerators for added performance in specialized applications (LSI says that its reference designs typically use Xilinx parts but the T2000 can work with Altera parts too). The CPX bus is a multi-lane serial interface connected to the T1000 internal crosspoint switch that’s also used to connect its internal resources, so any outboard hardware looks just like an on-chip resource except for the added delay. The second CPX port can be used to add a second slave T2000 processor, but in many cases it will be used to attach an FPGA for hardware XML processing acceleration, cryptography, compression/decompression, or custom applications like normalization (upper case to lower case). The T2000’s ability to support RegEx and XML processing simultaneously is an industry exclusive.
With several generations of Tarari processors behind them, LSI’s software support is relatively mature. Its host-resident driver software performs load balancing for up to four master devices. It’s interesting to note that the logic in the T2000 Agent cluster controller eliminates the need for the code-bloating software locks that some processors rely on to avoid timing and resource conflicts in host systems with multi-core processors. Driver software is available for 32/64-bit Linux, BSD Unix, and VxWorks operating systems, with others in process. Most of Tarari’s previous chips support Windows but there is no driver for the T2000, at least for the moment. Nevertheless, it’s no surprise that there is big interest in using the chips on MS Exchange servers for anti-spam functions, so I expect that there is a Windows driver in the T2000’s future.
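The driver-level load balancing across up to four devices mentioned above might look something like the following Python sketch. The article does not describe LSI's policy, so the least-outstanding-bytes rule and the names `LeastLoadedBalancer`, `dispatch` and `complete` are assumptions chosen only to show the general shape.

```python
MAX_DEVICES = 4  # limit quoted in the article


class LeastLoadedBalancer:
    """Sends each request to the device with the fewest outstanding bytes."""

    def __init__(self, device_count: int):
        if not 1 <= device_count <= MAX_DEVICES:
            raise ValueError("expected 1 to 4 devices")
        self.pending = [0] * device_count   # outstanding bytes per device

    def dispatch(self, nbytes: int) -> int:
        # Ties go to the lowest-numbered device.
        dev = min(range(len(self.pending)), key=self.pending.__getitem__)
        self.pending[dev] += nbytes
        return dev

    def complete(self, dev: int, nbytes: int) -> None:
        self.pending[dev] -= nbytes


lb = LeastLoadedBalancer(4)
print([lb.dispatch(n) for n in (100, 50, 10, 10, 5)])
```

A real driver would also have to cope with device failure and completion interrupts, which this sketch ignores.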
With all this deep packet inspection capability at its disposal, there is a wealth of applications that the T2000 can address. Some of the top targets on LSI’s application hit list include:
- Enterprise gateways – security (intrusion detection, anti-virus, anti-spam, anti-malware and UTM)
- Specialized functions for service providers (wireless & wireline) to support content-based billing
- QoS management/enforcement
- Content-aware enterprise servers (anti-virus, anti-spam & XML processing)
I also won’t quibble with LSI about their prediction that the T2000 could become an important player in mobile networks, helping to support QoS and LoS management as the bulk of traffic shifts from voice to data and carries a growing percentage of latency-sensitive, high-bandwidth P2P services.
Rough pricing for the T2000 is $200 to $300, depending on performance.