International Joint Conference on Neural Networks (IJCNN 2025) – VERIMEDIA Workshop
The paper introduces RawNetLite, a lightweight deep learning model that detects audio deepfakes directly from raw waveforms. Unlike traditional methods that rely on handcrafted features, RawNetLite processes raw audio end-to-end, using convolutional and recurrent layers to capture spectral and temporal patterns. To improve generalization, the authors train the model on multiple datasets (FakeOrReal, AVSpoof2021, CodecFake) and apply waveform-level augmentations such as pitch shifting and added noise. They also use Focal Loss, which down-weights easy examples so that training concentrates on hard-to-classify ones. Results show excellent in-domain performance (over 99% F1 on FakeOrReal) and strong cross-dataset robustness, with the best result on unseen data (83.4% F1) obtained when all datasets and augmentations are combined. The study concludes that diverse training data, combined with augmentation and a loss that emphasizes hard examples, is key to building reliable audio deepfake detectors.
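To make the Focal Loss idea concrete, here is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), in plain Python. It is not the authors' implementation: the function name, the values gamma=2.0 and alpha=0.25, and the choice of label 1 as the positive class are all assumptions made for illustration, since the summary does not state them.

```python
import math

def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a batch (illustrative sketch).

    probs  : predicted probabilities of the positive class, each in [0, 1]
    labels : ground-truth labels, each 0 or 1
    gamma  : focusing parameter (assumed value; gamma=0 disables focusing)
    alpha  : weight for the positive class (assumed value)
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal in length")

    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)            # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p             # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t)^gamma shrinks the loss of well-classified examples
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)

# A confident correct prediction contributes far less than a confident wrong one.
easy = binary_focal_loss([0.9], [1])
hard = binary_focal_loss([0.1], [1])
```

With gamma set to 0 the modulating factor disappears and the expression reduces to an alpha-weighted cross-entropy, which is a convenient way to check the function.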
Authors: Andrea Di Pierno, Luca Guarnera, Dario Allegra, Sebastiano Battiato
#Audio #deepfake #ff4ll #VERIMEDIA #IJCNN
