Original Article

Deep Guard AV: Audio-Visual Deepfake Detection Framework Using Hybrid Audio Learning, CNN-LSTM Video Analysis, and Automated Transcript Logging

Aylwin Vivian Singh¹, Arjun Vinod Shinde²

¹ Department of Computer Science (Artificial Intelligence), Shri Shankaracharya Technical Campus/Chhattisgarh Swami Vivekanand Technical University, India.

² Assistant Professor, Department of Computer Science, Shri Shankaracharya Technical Campus/Chhattisgarh Swami Vivekanand Technical University, India.

Published Online: March-April 2026

Pages: 493-499

Abstract

The increasing realism of synthetic media generated with deep learning has intensified the threat that deepfake videos pose in domains such as social media, journalism, legal evidence, and digital identity verification. Existing deepfake detection systems often focus on a single modality, which limits their robustness against sophisticated multimodal manipulations. This paper presents Deep Guard AV, an audio-visual deepfake detection framework that jointly analyzes the video frames and the speech signal extracted from an input video, while preserving textual transcripts for forensic logging and interpretability. The framework processes each video input through a dual-stream pipeline: visual frames are analyzed with a Convolutional Neural Network–Long Short-Term Memory (CNN-LSTM) architecture to capture spatial and temporal inconsistencies, and the extracted audio is processed with a hybrid deep learning model that combines waveform-based and spectrogram-based representations. In parallel, the speech content is transcribed and saved locally to maintain an auditable forensic record of processed media. A weighted fusion strategy combines the outputs of the audio and video models to produce the final authenticity score. Experimental evaluation indicates that integrating the audio and video modalities improves detection robustness compared with unimodal analysis. The proposed framework provides an effective and scalable approach to practical deepfake forensics and improves transparency through transcript preservation.
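
The weighted fusion step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the weights, the score convention, or the decision threshold, so everything below is an assumption made for illustration: each model is taken to output a probability in [0, 1] that the input is fake, the hypothetical `w_video` parameter sets the video weight (the audio weight is its complement), and a hypothetical threshold of 0.5 turns the fused score into a label.

```python
def fuse_scores(p_video: float, p_audio: float, w_video: float = 0.6) -> float:
    """Weighted late fusion of two per-modality fake probabilities.

    All defaults are illustrative assumptions, not values from the paper.
    """
    if not 0.0 <= w_video <= 1.0:
        raise ValueError("w_video must lie in [0, 1]")
    for p in (p_video, p_audio):
        if not 0.0 <= p <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
    # Convex combination: the audio weight is the complement of the video weight,
    # so the fused score stays in [0, 1].
    return w_video * p_video + (1.0 - w_video) * p_audio


def classify(score: float, threshold: float = 0.5) -> str:
    """Map a fused score to a label using an assumed decision threshold."""
    return "fake" if score >= threshold else "real"


if __name__ == "__main__":
    # The video model is fairly sure the clip is fake; the audio model is not.
    fused = fuse_scores(p_video=0.9, p_audio=0.2)
    print(round(fused, 2), classify(fused))
```

With a convex combination like this, a confident prediction from one stream can still be tempered by the other, which is one plausible reading of how fusing modalities could improve robustness over a single-modality detector. In practice the weights would be tuned on validation data.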
