Depersonalization of Speech Using Speaker-Specific Transform Based on Long-Term Spectrum

M. Rujzl, M. Sigmund

Issue: 4/2023
Periodical: Radioengineering Journal
DOI: 10.13164/re.2023.0523

Keywords: Speech depersonalization, long-term spectrum, voice transformation, depersonalized speech evaluation

Abstract: This paper introduces a novel approach for hiding personal information in speech signals. The proposed approach applies a transform whose warping function is obtained individually for each speaker from a long-term linear prediction spectrum. The depersonalized speech was compared with a commonly used technique based on vocal tract length normalization. The proposed approach manipulates the fundamental frequency more widely and provides intelligibility that is higher by 5% in clean speech and by 8% at a signal-to-noise ratio of 5 dB. It also significantly alters the derived glottal pulses, making them difficult to use for personality analysis. The speech intelligibility index and glottal pulse distortion are new aspects in the field of voice depersonalization.
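
The abstract does not say how the long-term linear prediction spectrum is computed or how the warping function is derived from it. As a rough illustration of the first step only, the sketch below estimates a long-term LP envelope by averaging frame autocorrelations over a whole recording and solving for the predictor with the Levinson-Durbin recursion. The function name, the prediction order, the frame and hop lengths, and the FFT size are all assumptions made for this example, not values from the paper, and the speaker-specific warping itself is not implemented here.

```python
import numpy as np

def long_term_lp_spectrum(signal, order=16, frame_len=400, hop=200, n_fft=512):
    """Hypothetical helper: long-term LP envelope of one recording.

    Averages the short-time autocorrelation over all frames, then runs
    Levinson-Durbin to get A(z) = 1 + a1 z^-1 + ... + ap z^-p.
    Returns (a, envelope_db), where envelope_db has n_fft // 2 + 1 bins.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hamming(frame_len)
    r = np.zeros(order + 1)
    n_frames = 0
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        full = np.correlate(frame, frame, mode="full")
        # Zero lag sits at index frame_len - 1; keep lags 0..order.
        r += full[frame_len - 1:frame_len + order]
        n_frames += 1
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    r /= n_frames

    # Levinson-Durbin recursion on the averaged autocorrelation.
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k                  # remaining prediction error

    # LP envelope: err / |A(e^{jw})|^2, expressed in dB.
    A = np.fft.rfft(a, n_fft)
    envelope_db = 10.0 * np.log10(err / np.abs(A) ** 2)
    return a, envelope_db
```

Calling `long_term_lp_spectrum(x)` on a mono recording `x` returns the predictor coefficients and the envelope in dB on a uniform frequency grid from 0 to half the sampling rate. Averaging autocorrelations before solving, rather than averaging per-frame spectra, is one design choice among several and may differ from what the authors did.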