Comparison of digital signal processing methods and deep learning models in voice authentication

Руда, Х.С. and Сабодашко, Д.В. and Микитин, Г.В. and Швед, М.Є. and Бордуляк, С.М. and Коршун, Наталія Володимирівна (2024) Comparison of digital signal processing methods and deep learning models in voice authentication. Electronic professional scientific journal "Cybersecurity: Education, Science, Technique" (1(25)). pp. 140-160. ISSN 2663-4023

Abstract

This paper addresses the shortcomings of traditional authentication methods, such as passwords, which often prove unreliable due to various vulnerabilities. The main drawbacks of these methods include the loss or theft of passwords, their weak resistance to various types of attacks, and the complexity of password management, especially in large systems. Biometric authentication methods, particularly those based on physical characteristics such as voice, present a promising alternative because they offer a higher level of security and user convenience. Biometric authentication systems have advantages over traditional methods because the voice is a unique characteristic of each person, making it substantially more challenging to forge or steal. However, there are challenges regarding the accuracy and reliability of such systems. Specifically, voice biometric systems can encounter issues related to changes in voice due to health, emotional state, or the surrounding environment. The primary objective of this paper is to compare contemporary deep learning models with traditional digital signal processing methods used for speaker recognition. For this study, text-dependent methods (Mel-Frequency Cepstral Coefficients, MFCC; Linear Predictive Coding, LPC) and text-independent methods (ECAPA-TDNN, Emphasized Channel Attention, Propagation and Aggregation in Time Delay Neural Network; ResNet, Residual Neural Network) were selected to compare their effectiveness in voice biometric authentication tasks. The experiment involved implementing biometric authentication systems based on each of the described methods and evaluating their performance on a specially collected dataset.
Additionally, the paper provides a detailed examination of audio signal preprocessing methods used in voice authentication systems to ensure optimal performance in speaker recognition tasks, including noise reduction using spectral subtraction, energy normalization, enhancement filtering, framing, and windowing.
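As an illustration of the preprocessing chain named above, the following is a minimal sketch in plain Python, not the authors' implementation. It covers energy normalization, filtering, framing, and windowing; spectral-subtraction noise reduction is omitted for brevity. It assumes that "enhancement filtering" corresponds to a first-order pre-emphasis filter and that a Hamming window is used. The function names, the coefficient 0.97, and the 400-sample frame with a 160-sample hop (25 ms and 10 ms at an assumed 16 kHz sampling rate) are illustrative choices that do not come from the paper.

```python
import math


def normalize_energy(signal):
    """Scale the signal to unit root-mean-square energy (silence is returned unchanged)."""
    if not signal:
        return []
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return [x / rms for x in signal] if rms > 0 else list(signal)


def pre_emphasis(signal, alpha=0.97):
    """Assumed form of 'enhancement filtering': y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_signal(signal, frame_len, hop_len):
    """Split into overlapping frames; trailing samples that do not fill a frame are dropped."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop_len)
    ]


def hamming_window(frame_len):
    """Hamming window coefficients (assumed window type)."""
    if frame_len == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]


def preprocess(signal, frame_len=400, hop_len=160):
    """Normalize, pre-emphasize, frame, and window a mono signal given as a list of floats."""
    emphasized = pre_emphasis(normalize_energy(signal))
    window = hamming_window(frame_len)
    return [
        [x * w for x, w in zip(frame, window)]
        for frame in frame_signal(emphasized, frame_len, hop_len)
    ]
```

Calling `preprocess(samples)` on a list of float samples returns a list of windowed frames, which a feature extractor such as MFCC or LPC could then consume. A production system would more likely rely on a numerical library than on pure Python lists.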

Item Type: Article
Uncontrolled Keywords: biometric technologies; voice authentication; digital signal processing; mel-frequency cepstral coefficients; linear predictive coding; deep learning; neural networks
Subjects: Articles in periodicals > Professional (included in the list of professional publications approved by the Ministry of Education and Science)
Divisions: Faculty of Information Technologies and Mathematics > Professor Volodymyr Buriachok Department of Information and Cyber Security
Depositing User: Наталія Володимирівна Коршун
Date Deposited: 07 Oct 2024 10:17
Last Modified: 07 Oct 2024 10:17
URI: https://elibrary.kubg.edu.ua/id/eprint/49791
