Advanced Liveness Detection Using Vision Transformers and the DINO Framework
Abstract
Face recognition systems are increasingly used in banking, mobile authentication, and secure access control. However, they are vulnerable to presentation attacks such as printed photos, replayed videos, and spoof masks. This project proposes a robust face anti-spoofing (liveness detection) method based on a Vision Transformer (ViT) trained with DINO self-supervised learning. Unlike typical CNN-based approaches, which rely heavily on local texture cues and large labeled datasets, the proposed approach captures global facial patterns through the transformer's self-attention mechanism and learns discriminative representations from augmented facial images. The system consists of face preprocessing, feature extraction, and binary classification to determine whether the input is live or spoofed. It is evaluated using standard biometric anti-spoofing metrics: Accuracy, APCER, BPCER, and ACER. The proposed method aims to improve generalization to unseen spoofing attacks while remaining practical for deployment. This work highlights the potential of transformer-based self-supervised models for secure and reliable biometric authentication systems.
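The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the project's code: the function name `spoof_metrics`, the label convention (1 = live, 0 = spoof), and the sample labels are assumptions made for illustration. APCER is computed here over all attack samples pooled together; ISO/IEC 30107-3 defines it per attack instrument species, so a per-attack-type breakdown may be needed in practice.

```python
def spoof_metrics(y_true, y_pred):
    """Compute Accuracy, APCER, BPCER and ACER for binary liveness labels.

    Assumed label convention (illustrative): 1 = live (bona fide), 0 = spoof (attack).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Predictions made on attack samples and on bona fide samples, respectively.
    attack_preds = [p for t, p in zip(y_true, y_pred) if t == 0]
    bona_fide_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not attack_preds or not bona_fide_preds:
        raise ValueError("need at least one attack and one bona fide sample")

    # APCER: share of attack presentations wrongly accepted as live.
    apcer = sum(1 for p in attack_preds if p == 1) / len(attack_preds)
    # BPCER: share of bona fide presentations wrongly rejected as spoof.
    bpcer = sum(1 for p in bona_fide_preds if p == 0) / len(bona_fide_preds)
    # ACER: mean of the two error rates.
    acer = (apcer + bpcer) / 2
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

    return {"accuracy": accuracy, "apcer": apcer, "bpcer": bpcer, "acer": acer}


# Made-up labels: four live samples followed by four spoof samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
metrics = spoof_metrics(y_true, y_pred)
print(metrics)
```

In this toy example two of the four spoof samples are accepted as live and one of the four live samples is rejected. Accuracy weights all samples equally, so it can look acceptable on an imbalanced test set even when one of the two error rates is high, which is why APCER and BPCER are reported separately.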