A parallel DSP-based architecture for a real-time multiple path acoustic echo cancellation
This thesis presents a new multi-path acoustic echo canceller for teleconferencing. The canceller stores in its memory digitized versions of multiple echo paths for all the microphones in the teleconferencing system. The echo paths were obtained by echo path modelling. Simulations show that the canceller achieves better echo cancellation than a conventional single-path echo canceller, and the improvement is quite noticeable when the active microphone is switched. The canceller also incorporates a new adaptation technique that takes the microphone-specific echo paths into account. The echo canceller is implemented on a highly parallel, pipelined multiprocessor architecture with a multiple-instruction, multiple-data stream (MIMD) structure. Because real-time acoustic echo cancellation is computationally intensive, this custom multiprocessor architecture exploits the intrinsic parallelism of a specific adaptive algorithm. The design objective was to provide a system platform for efficient resource utilization and adaptive algorithm execution. The system consists of five high-speed floating-point digital signal processors (DSPs). One master processor handles system control, program downloads, data transfers, microphone selection, and scalar adaptive filter computations. The other four slave processors execute the filtering and adaptive algorithm computations in parallel. The interprocessor communication scheme accommodates larger adaptive filters by permitting the addition of more slave processors. The system achieves high speedups when implementing FIR filtering with the DLMS algorithm for real-time acoustic echo cancellation: with four SPMs, the speedup increases from 2.756 to 3.967 as the filter order increases from 32 to 2048.
This indicates that the highest computational speedup is achieved when the DLMS algorithm is used with a high filter order.
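
The division of work described above can be illustrated with a minimal sketch. It assumes that DLMS denotes the delayed LMS algorithm and that the filter taps are split into contiguous blocks, one per slave processor; neither assumption is confirmed by the abstract. All names (split_taps, dlms_partitioned, simulate_echo, parallel_efficiency) are hypothetical, and the workers run sequentially here rather than on separate DSPs.

```python
from collections import deque


def split_taps(order, workers):
    """Return contiguous (start, stop) tap ranges, one per worker."""
    base, extra = divmod(order, workers)
    ranges, start = [], 0
    for k in range(workers):
        stop = start + base + (1 if k < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def simulate_echo(x, h):
    """Convolve x with a known echo path h (zero initial state)."""
    return [sum(h[i] * x[n - i] for i in range(len(h)) if n - i >= 0)
            for n in range(len(x))]


def dlms_partitioned(x, d, order, workers=4, mu=0.01, delay=1):
    """Adapt an FIR filter with delayed LMS, taps split across workers.

    x: far-end reference samples, d: microphone samples.
    Update rule (assumed): w(n+1) = w(n) + mu * e(n-D) * x(n-D).
    Returns (weights, errors).
    """
    ranges = split_taps(order, workers)
    w = [0.0] * order
    history = [0.0] * (order + delay)   # history[i] holds x[n - i]
    pending = deque([0.0] * delay)      # errors waiting to be applied
    errors = []
    for xn, dn in zip(x, d):
        history = [xn] + history[:-1]
        # Slave step: each worker forms a partial dot product on its taps.
        partials = [sum(w[i] * history[i] for i in range(a, b))
                    for a, b in ranges]
        # Master step: scalar work only (sum partials, form the error).
        e = dn - sum(partials)
        errors.append(e)
        pending.append(e)
        e_delayed = pending.popleft()   # e(n - delay)
        # Slave step: each worker updates its own taps with the old error.
        for a, b in ranges:
            for i in range(a, b):
                w[i] += mu * e_delayed * history[i + delay]
    return w, errors


def parallel_efficiency(speedup, workers):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / workers
```

Applied to the reported figures, parallel_efficiency(2.756, 4) is about 0.69 and parallel_efficiency(3.967, 4) is about 0.99, so the four slave processors are used almost fully at the highest filter order.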