A New Standard for Multiprocessing DSP Systems

Today's applications for computer technology are more demanding than ever before, and nowhere is this more evident than in the world of digital signal processors (DSPs). DSP chip manufacturers and board vendors are responding to this need with products that possess an inherent ability to support scalable, multiprocessing system architectures. These systems are powerful enough to address the most demanding DSP applications of today and provide a clear upgrade path for the future.

Multiprocessing has emerged as the only viable way of addressing a vast assortment of high-end DSP applications. Applications that require multiprocessing performance include medical, 3D graphics acceleration, military, industrial control, high-end audio, telephony, and wireless communications infrastructure. Designers of these systems judge DSP performance by three metrics: processing per dollar, processing per watt, and processing per unit area.

The real advantage of multiprocessing systems is the ability to tune the performance and cost of a system to yield the required functionality and processing performance. This feature of multiprocessing system architectures is known as scalability. A scalable architecture allows users to tune performance based on the number of processing nodes required.

In fact, the ability of a multiprocessing system to scale is not just convenient, it's required; and the ability to design scalable multiprocessing systems starts at the chip level. The new ADSP-21160 SHARC processor, with its large internal memory blocks, multiple internal bus structure, and integrated I/O subsystem, possesses all of the features necessary to build multiprocessing systems that provide true scalability to any number of processors. Like its predecessor (the ADSP-21060 SHARC), the ADSP-21160 is setting the pace for high-performance multiprocessing floating-point DSP systems.

Defining Multiprocessing Systems

Processor technology has progressed at such a fast rate over the past decade that most of us cannot even remember how impressed we were with the power of our new 80286 PCs in the late 1980s. But while chip manufacturers have awed us with their huge strides in processor technology, they have also exposed the performance limitations inherent in single-processor systems. Thus it is not surprising that high-performance system designers have started using aggregates of processors to build more powerful (multiprocessing) computing systems.

Figure 1. Multiprocessing Systems: (top) shared memory, (left) distributed memory, and (right) shared and distributed memory

This trend has become very apparent in the embedded DSP industry. As DSP applications become more and more demanding, board-level suppliers are responding with PCI and VME system components that squeeze larger numbers of processors into smaller spaces. Packaging technology has played a role here, as DSP chip-level manufacturers developed smaller package sizes that are relatively easy to cool.

However, a system of multiple processors cannot be considered truly multiprocessing based solely on the fact that more than one processor is used. The term multiprocessing implies that the processors in the system are able to work together, in an efficient manner, to perform the required calculations. This means that the exchange of data between processors is critical, and an effective multiprocessing DSP must possess a means for achieving this data transfer.

The SHARC processor family has answered this challenge with an internal I/O processor (or DMA engine) that allows data communication to occur without impeding the progress of the processing core. As a result, every time a SHARC processor is added to a multiprocessing network, both processing horsepower and data communication bandwidth are increased. This feature of the SHARC family, together with its unique link-port architecture, is one of the most important ingredients in its ability to support multiprocessing system design.

The new ADSP-21160 increases the number of DMA channels from the 10 available in the first-generation ADSP-21060 SHARC to 14. This provides a separate, independent DMA channel for each of the transmit and receive buffers of the 2 serial ports, the 6 bidirectional link port buffers, and the 4 bidirectional external port buffers. With these enhanced DMA capabilities, the ADSP-21160 has the flexibility to support a variety of scalable multiprocessing system architectures.
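The channel count can be checked with a short tally. This is only a sketch of the arithmetic stated above; the dictionary labels are descriptive names chosen for illustration and are not register or API names from any SHARC tool.

```python
# Tally of the ADSP-21160 DMA channels as enumerated in the text.
# The keys are illustrative labels, not identifiers from any SHARC tool.
dma_channels = {
    "serial port transmit/receive buffers": 2 * 2,  # 2 serial ports, TX and RX each
    "link port buffers": 6,
    "external port buffers": 4,
}

total_channels = sum(dma_channels.values())

for name, count in dma_channels.items():
    print(f"{name}: {count}")
print(f"total DMA channels: {total_channels}")
```

The four serial port channels, six link port channels, and four external port channels account for all 14 channels.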

The Link Port Architecture

Multiprocessing system architectures come in two basic flavors: shared memory and distributed memory. The ADSP-21160 SHARC possesses built-in features that allow it to gluelessly support both of these architectures, as well as architectural hybrids. The key lies in the ADSP-21160's unique link-port architecture.

In shared memory systems, every processor has access to a global memory block (made up of internal and external memory) with processors exchanging data via a shared bus. This approach is reminiscent of traditional single-processor programming since all of the data is located in a single memory block. However, the shared memory architecture lacks the inherent ability to scale, since the addition of each new device on the bus decreases the average bus bandwidth available to each processor.
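The scaling limit of a shared bus can be illustrated with a minimal sketch. It assumes an idealized model in which the bus bandwidth is split evenly among the processors and arbitration overhead is ignored; the function name and the bandwidth value are hypothetical and are not figures for any particular device.

```python
def per_processor_bus_bandwidth(total_bus_bandwidth: float, num_processors: int) -> float:
    """Average bus bandwidth per processor under an idealized even split."""
    if num_processors < 1:
        raise ValueError("num_processors must be at least 1")
    return total_bus_bandwidth / num_processors


# Hypothetical bus bandwidth in arbitrary units (an assumption for illustration).
BUS_BANDWIDTH = 100.0

shares = {n: per_processor_bus_bandwidth(BUS_BANDWIDTH, n) for n in (1, 2, 4, 8)}
for n, share in shares.items():
    print(f"{n} processor(s): {share} units each")
```

Under this assumption, every doubling of the processor count halves the average bandwidth each processor sees, which is the behavior that keeps a pure shared memory architecture from scaling.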

The SHARC family of processors gets around this issue through the use of dedicated data communication ports known as link ports. Link ports provide high-bandwidth, point-to-point connections between processors for the sole purpose of inter-processor communication. This allows the ADSP-21160 to support a distributed memory architecture in which all inter-processor communication takes place over the links, leaving the full bandwidth of the data bus for servicing external memory and I/O peripherals. Distributed memory architectures are truly scalable, and they allow users to configure very large scale multiprocessing networks using a natural mesh-like architecture.
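As a rough illustration of the mesh idea, the sketch below lays processors out on a hypothetical two-dimensional grid and connects each one to its nearest neighbors with point-to-point links. The grid topology and the helper names are assumptions made for this example; the only figure taken from the text is the six link ports per processor.

```python
LINK_PORTS_PER_PROCESSOR = 6  # six link ports, per the DMA description above


def mesh_neighbors(row, col, rows, cols):
    """Return the grid coordinates directly linked to the node at (row, col)."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def mesh_link_count(rows, cols):
    """Number of point-to-point links in a rows x cols nearest-neighbor mesh."""
    return rows * (cols - 1) + cols * (rows - 1)


def max_links_per_node(rows, cols):
    """Largest number of links any single node needs in the mesh."""
    return max(
        len(mesh_neighbors(r, c, rows, cols))
        for r in range(rows)
        for c in range(cols)
    )


for size in (2, 4, 8):
    print(
        f"{size}x{size} mesh: {size * size} processors, "
        f"{mesh_link_count(size, size)} links, "
        f"at most {max_links_per_node(size, size)} links per node"
    )
```

In this layout no node needs more than four links regardless of mesh size, so the per-processor link budget is never exceeded, while the total number of links (and with it the aggregate inter-processor bandwidth) grows as processors are added.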

One of the key strengths of the link-port architecture, however, is that system designers are not forced to choose between shared and distributed memory. Architectural hybrids combining these two philosophies are easy to construct, allowing users to glove-fit their system to their application.

The ADSP-21160's ability to support these multiple system architectures is another key aspect of system scaling. System designers are provided the freedom to easily tune their system's form and functionality as well as processing performance.

A Balanced Approach

It is well known that the most serious problem facing multiprocessing DSP system and chip-level designers is data flow. In order for a DSP to even approach its peak computational performance, it must be fed with a constant stream of data. This means that a multiprocessing system's ability to route data among the various nodes in the system is as important as its ability to process the data.

Early multiprocessing system architectures suffered from the malady of having high theoretical MFLOPS numbers but very few usable MFLOPS. This came about as a result of attaching rather inefficient communication engines to very high-performance RISC-style processors that were not designed to be used in multiprocessing systems. The result was a sub-linear scaling characteristic in which system performance increased only slightly as processors were added to the system.

The SHARC processor family, on the other hand, places its I/O subsystem and processing core on equal footing, creating a balance between processing and data routing efficiency. This balance allows the DSP application to supply the ADSP-21160's high-speed SIMD core with a constant stream of data, resulting in nearly linear scaling over a wide range of system sizes.
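The contrast between sub-linear and nearly linear scaling can be sketched with a toy model. It is not a measurement of any real system: the two functions, the timing values, and the assumption that shared-bus transfers queue one after another are all hypothetical simplifications chosen to show the shape of the two curves.

```python
def speedup_blocking_shared_bus(n, compute_time, transfer_time):
    """Toy model: the core stalls during transfers, and the transfers of
    all n processors queue on a single shared bus."""
    step_time = compute_time + transfer_time * n
    return n * compute_time / step_time


def speedup_overlapped_links(n, compute_time, transfer_time):
    """Toy model: transfers run over dedicated links in parallel with
    computation, so a step takes as long as the slower of the two."""
    step_time = max(compute_time, transfer_time)
    return n * compute_time / step_time


# Hypothetical per-step times in arbitrary units (assumptions for illustration).
COMPUTE_TIME = 1.0
TRANSFER_TIME = 0.25

for n in (1, 2, 4, 8, 16):
    blocking = speedup_blocking_shared_bus(n, COMPUTE_TIME, TRANSFER_TIME)
    overlapped = speedup_overlapped_links(n, COMPUTE_TIME, TRANSFER_TIME)
    print(f"n={n:2d}  blocking shared bus: {blocking:5.2f}  overlapped links: {overlapped:5.2f}")
```

In the first model the speedup flattens as processors are added, because each new processor lengthens everyone's wait for the bus. In the second, the speedup tracks the processor count for as long as transfers fit within the compute time, which is the balance the text describes.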

Of course, taking advantage of this balance in an actual application is a software development task as well. Third-party board-level and software vendors supply software development tools for the SHARC family of processors to simplify this effort.

Today, a variety of native and portable programming tools are available, including SHARC-specific run-time environments and industry-standard real-time operating systems. These products not only simplify the task of targeting an application at a multiprocessing network, they also help programmers take full advantage of the SHARC's balanced hardware design and squeeze the maximum performance out of their embedded SHARC systems.

COTS and the Multiprocessing DSP

Multiprocessing digital signal processing systems have become commonplace in a wide variety of military and commercial applications, including radar, sonar, industrial control, image processing, and telecommunications. As the need for higher speed and more compact systems arises, multiprocessing DSP systems will become even more widespread.

One of the fastest-growing opportunities for the ADSP-21160 processor is the commercial off-the-shelf (COTS) board-level vendor market. The use of COTS products (rather than custom board developments) has become a mandate for both military and commercial users. Fueling this trend is a need for lower product development costs and a faster time to market.

The SHARC family, with its unique multiprocessing architecture, lends itself very well to the development of modular system components that can be used together to build high-end multiprocessing systems with essentially any performance and functionality characteristics. This flexibility is the key feature required by COTS customers.

Potential application areas for the ADSP-21160 appear boundless, with opportunities in many different markets. Regardless of the application, however, it is clear that the multiprocessing movement is here to stay in the embedded DSP marketplace. The ADSP-21160 SHARC from Analog Devices has secured a position to lead this processing revolution into the next millennium, truly setting a new standard for multiprocessing digital signal processing.
