NAMULINT
New Approaches to Distortion Processing for Multimedia Applications over Intelligent Mobile Devices
Project Data
Research Group: SigMAT, Signal Processing, Multimedia Transmission and Speech/Audio Technologies Group
Funding: Ministerio de Economía y Competitividad.
Project code: TEC2013-46690-P
Main Researcher: Victoria E. Sánchez Calle (E-mail: victoria [at] ugr [dot] es)
Summary
The main goal of this project is to achieve a high quality of service (QoS) in multimedia applications for intelligent mobile devices (IMDs), which are expected to play a relevant role in the near future. To this end, we propose a set of new approaches for processing the degradation that may affect multimedia signals. This degradation is a consequence of the specific context in which IMDs are used: a connection mode and a physical environment which, in general, differ from those of a classical personal computer. In particular, we will consider two types of multimedia applications of special relevance for IMDs.
- First, we will consider human-IMD interaction by means of automatic speech recognition (ASR). In this case, the physical environment involves an acoustic context which will typically be noisy. Although current ASR techniques can achieve high word accuracy under low-noise conditions, their performance falls drastically when noise is present. Our proposal for addressing this problem is to replace the traditional single-microphone approach, which forces the noise to be estimated from silence intervals, with a new scheme that uses two microphones integrated in the IMD. This dual-channel information can be used to develop both improved speech enhancement algorithms and a new feature compensation scheme able to exploit a two-channel input.
- Second, we will consider speech and video transmission applications. In this case, the main context-related distortion is the loss of data caused by real-time transport over IP networks, which may be worsened by the wireless link employed by IMDs. Again, we propose new points of view aimed at achieving a high QoS. For speech transmission, we think that this goal can be achieved by developing a specific representation of the excitation signal suited to signal reconstruction. Finally, we think that the recovery of lost video blocks requires directly tackling the non-stationarity problem. Thus, we propose the development of new estimation techniques, such as those based on kernels or frequency-domain extrapolation, and their combination with an adaptive sequential filling of the lost blocks.
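As a toy illustration of the two-microphone idea in the first item above, the following sketch shows how a second channel could supply a noise estimate in every frame, rather than only during silence intervals. It is not the project's algorithm: it applies a plain spectral-subtraction gain, and it assumes (unrealistically) that the second microphone picks up only noise, at the same power as in the primary channel. The function names, frame length, hop size, and gain floor are all hypothetical choices made for the example.

```python
import numpy as np

def stft_frames(x, frame_len=256, hop=128):
    """Split x into Hann-windowed frames and return their one-sided spectra."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def dual_channel_gain(primary, reference, floor=0.05):
    """Spectral-subtraction gain using the reference channel as noise estimate.

    Assumption (for illustration only): the reference microphone is
    noise-dominated and sees the same noise power as the primary one.
    """
    P = stft_frames(primary)
    R = stft_frames(reference)
    noisy_psd = np.abs(P) ** 2
    noise_psd = np.abs(R) ** 2  # per-frame noise estimate, no silence detection
    gain = np.maximum(1.0 - noise_psd / np.maximum(noisy_psd, 1e-12), floor)
    return gain, gain * P  # gain mask and enhanced primary spectrum
```

In this sketch the gain stays close to 1 in time-frequency bins where the primary channel carries much more energy than the reference, and drops to the floor where both channels look alike. A real device would also have to deal with speech leaking into the second microphone and with differing noise levels between the two channels.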
Key words:
Multimedia Signals, Distortion, Intelligent Mobile Devices, Robust Speech Recognition, Voice over IP, Video Transmission.