🔥 Applications of Foundation Models in Biometrics

In this section, we review recent papers on the applications of foundation models in biometrics:

Foundation Models for Biometric Recognition
Foundation Models for Soft-biometric Detection
Foundation Models for Deepfake and Forgery Detection
Foundation Models for Anti-spoofing
Foundation Models for Synthetic Biometric Generation

Foundation Models for Biometric Recognition

Paper Title	Year	Modality / Task	Paper	Code
Exploring wav2vec 2.0 on speaker verification and language identification	2020	speaker and language identification	link	NA
ChatGPT and biometrics: an assessment of face recognition, gender detection, and age estimation capabilities	2024	face verification, gender detection, age estimation	link	NA
How Good is ChatGPT at Face Biometrics? A First Look into Recognition, Soft Biometrics, and Explainability	2024	face verification	link	NA
ChatGPT Meets Iris Biometrics	2024	iris recognition	link	NA
Foundation versus Domain-specific Models: Performance Comparison, Fusion, and Explainability in Face Recognition	2025	face verification	link	NA
Benchmarking Foundation Models for Zero-Shot Biometric Tasks	2025	face verification, soft biometric attribute prediction (gender and race), iris recognition, iris presentation attack detection, face morph detection, and face deepfake detection	link	NA
A fine-tuned wav2vec 2.0/HuBERT benchmark for speech emotion recognition, speaker verification and spoken language understanding	2021	speaker verification	link	link
Iris-SAM: Iris Segmentation Using a Foundation Model	2024	iris segmentation	link	link
SAM-Iris: A SAM-Based Iris Segmentation Algorithm	2025	iris segmentation	link	NA
Froundation: Are foundation models ready for face recognition?	2024	face recognition	link	link
HumanOmni: A Large Vision-Speech Language Model for Human-Centric Video Understanding	2025	audio-visual human video recognition (emotion recognition, expression description, and action understanding)	link	link
FaceLLM: A Multimodal Large Language Model for Face Understanding	2025	face recognition, anti-spoofing, deepfake detection, attribute prediction, expression, parsing, pose, crowd counting	link	link
FaceXBench: Evaluating Multimodal LLMs on Face Understanding	2025	face recognition, anti-spoofing, deepfake detection, attribute prediction, expression, parsing, pose, crowd counting	link	link
Face-Human-Bench: A Comprehensive Benchmark of Face and Human Understanding for Multi-modal Assistants	2025	facial attributes, age estimation, expression recognition, attack detection, recognition; human attributes, action, spatial/social relations, re-ID	link	link
From Pixels to Words: Leveraging Explainability in Face Recognition through Interactive Natural Language Processing	2024	face recognition explainability	link	NA
FaceOracle: Chat with a Face Image Oracle	2025	face image quality assessment	link	NA
UniSpeech-SAT: Universal speech representation learning with speaker aware pre-training	2022	speaker ID, verification, diarization, phoneme recognition, keyword spotting, emotion recognition	link	link
Large-scale self-supervised speech representation learning for automatic speaker verification	2022	speaker verification	link	link
General facial representation learning in a visual-linguistic manner	2022	face parsing, alignment, attribute recognition	link	link
Marlin: Masked autoencoder for facial video representation learning	2023	face attribute recognition, expression recognition, deepfake detection, lip synchronization	link	link
Self-Supervised Facial Representation Learning with Facial Region Awareness	2024	face expression and attribute recognition	link	link
Pose-disentangled contrastive learning for self-supervised facial representation	2023	face expression, face recognition, head pose estimation	link	link
Pros: Facial omni-representation learning via prototype-based self-distillation	2024	face parsing, attribute recognition, emotion detection, landmark detection	link	link
ComFace: Facial Representation Learning with Synthetic Data for Comparing Faces	2024	face expression change, weight change, age change estimation	link	NA
SwinFace: a multi-task transformer for face recognition, expression recognition, age estimation and attribute estimation	2023	face attributes, age estimation, expression recognition, face recognition	link	link
FaceXFormer: A Unified Transformer for Facial Analysis	2024	face parsing, landmarks, head pose estimation, age/gender/race estimation, attribute recognition, expression recognition	link	link
Task-adaptive Q-Face	2024	head pose estimation, face attribute recognition, age estimation, expression recognition	link	NA
Faceptor: A generalist model for face perception	2024	face parsing, landmarks, age and gender estimation, attribute recognition, expression recognition, face recognition	link	link

Foundation Models for Soft-biometric Detection

Paper Title	Year	Modality / Task	Paper	Code
Robust light-weight facial affective behavior recognition with CLIP	2024	facial expression classification; action unit detection	link	link
CLIPER: A unified vision-language framework for in-the-wild facial expression recognition	2024	face static & dynamic expression recognition	link	link
EmoCLIP: A vision-language method for zero-shot video facial expression recognition	2024	video facial emotion recognition	link	link
FineCLIPER: Multi-modal fine-grained CLIP for dynamic facial expression recognition with adapters	2024	dynamic facial expression recognition	link	NA
Face-mllm: A large face perception model	2024	face age/gender, expression, action units, attributes	link	NA
FaceGPT: Self-supervised Learning to Chat about 3D Human Faces	2024	face 3DMM parameter generation	link	NA
FaceBench: A Multi-View Multi-Level Facial Attribute VQA Dataset for Benchmarking Face Perception MLLMs	2025	face attribute detection	link	link
Face-LLaVA: Facial Expression and Attribute Understanding through Instruction Tuning	2025	face expression recognition, action unit detection, facial attribute detection, age estimation, and deepfake detection	link	link
FaceInsight: A Multimodal Large Language Model for Face Perception	2025	face attribute recognition, age/gender/race estimation, and expression prediction	link	NA
R1-omni: Explainable omni-multimodal emotion recognition with reinforcement learning	2025	audio-visual emotion recognition with reasoning	link	link
ChatGPT and biometrics: an assessment of face recognition, gender detection, and age estimation capabilities	2024	face gender detection, age estimation	link	NA
How Good is ChatGPT at Face Biometrics? A First Look into Recognition, Soft Biometrics, and Explainability	2024	age, gender, ethnicity, hair color	link	NA
ChatGPT Meets Iris Biometrics	2024	iris–face matching; soft-biometrics	link	NA

Foundation Models for Deepfake and Forgery Detection

Paper Title	Year	Modality / Task	Paper	Code
MFCLIP: Multi-modal Fine-grained CLIP for Generalizable Diffusion Face Forgery Detection	2024	face forgery detection	link	link
Forensics Adapter: Adapting CLIP for Generalizable Face Forgery Detection	2024	face forgery detection	link	link
MADation: Face Morphing Attack Detection with Foundation Models	2025	face morph attack detection	link	link
FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning	2024	deepfake detection, anti-spoofing, unseen diffusion forgery	link	link
Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation	2022	voice spoofing & deepfake detection	link	link
X2-dfd: A framework for explainable and extendable deepfake detection	2024	face deepfake detection	link	link
Ffaa: Multimodal large language model based explainable open-world face forgery analysis assistant	2024	forgery analysis assistant	link	link
Towards general visual-linguistic face forgery detection (v2)	2025	face forgery detection	link	link
Evaluating the Effectiveness of Attack-Agnostic Features for Morphing Attack Detection	2024	face morph attack detection	link	link
Are Music Foundation Models Better at Singing Voice Deepfake Detection? Far-Better Fuse them with Speech Foundation Models	2024	speaker deepfake detection	link	NA
Rethinking Vision-Language Model in Face Forensics: Multi-Modal Interpretable Forged Face Detector	2025	face deepfake detection + description	link	link
Standing on the shoulders of giants: Reprogramming visual-language model for general deepfake detection	2025	face deepfake detection	link	link
Can ChatGPT detect deepfakes? A study of using multimodal large language models for media forensics	2024	face deepfake detection	link	link
How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception	2024	audio-visual deepfake detection	link	NA
ChatGPT Encounters Morphing Attack Detection: Zero-Shot MAD with Multi-Modal Large Language Models and General Vision Models	2025	face morph detection	link	NA

Foundation Models for Anti-spoofing

Paper Title	Year	Modality / Task	Paper	Code
FLIP: Cross-domain face anti-spoofing with language guidance	2023	fine-tune CLIP image encoder for face (FLIP alignment)	link	link
On Self-Supervised Learning and Prompt Tuning of Vision Transformers for Cross-sensor Fingerprint Presentation Attack Detection	2023	SSL via masked-fingerprint prediction with prompt tuning	link	NA
CPL-CLIP: Compound Prompt Learning for Flexible-Modal Face Anti-Spoofing	2024	face anti-spoofing	link	NA
FM-CLIP: Flexible modal CLIP for face anti-spoofing	2024	cross-modal anti-spoofing	link	NA
La-SoftMoE CLIP for Unified Physical-Digital Face Attack Detection	2024	Unified physical-digital face attack detection	link	NA
Cfpl-fas: Class free prompt learning for generalizable face anti-spoofing	2024	face anti-spoofing	link	NA
InstructFLIP: Exploring Unified Vision-Language Model for Face Anti-spoofing	2025	face anti-spoofing	link	link
Reliable and Balanced Transfer Learning for Generalized Multimodal Face Anti-Spoofing	2025	Multimodal face anti-spoofing	link	link
FaceShield: Explainable Face Anti-Spoofing with Multimodal Large Language Models	2025	face anti-spoofing (classification and attack localization)	link	link
Interpretable face anti-spoofing: Enhancing generalization with multimodal large language models	2025	face anti-spoofing	link	NA
Exploring Task-Solving Paradigm for Generalized Cross-Domain Face Anti-Spoofing via Reinforcement Fine-Tuning	2025	face anti-spoofing (spoofing detection and reasoning)	link	NA
VL-FAS: Domain Generalization via Vision-Language Model For Face Anti-Spoofing	2024	face anti-spoofing	link	NA
FoundPAD: Foundation Models Reloaded for Face Presentation Attack Detection	2025	face anti-spoofing	link	link
Towards Iris Presentation Attack Detection with Foundation Models	2025	iris anti-spoofing	link	NA
Exploring ChatGPT for Face Presentation Attack Detection in Zero and Few-Shot in-Context Learning	2025	face presentation attack detection	link	link
Are Foundation Models All You Need for Zero-shot Face Presentation Attack Detection?	2025	face presentation attack detection	link	link
Shield: An evaluation benchmark for face spoofing and forgery detection with multimodal large language models	2025	face anti-spoofing (RGB, infrared, depth) and forgery detection	link	link
ChatGPT Meets Iris Biometrics	2024	iris presentation-attack detection	link	NA

Foundation Models for Synthetic Biometric Generation

Paper Title	Year	Modality / Task	Paper	Code
Toward open-world text-driven face generation and manipulation via StyleGAN3	2024	Text-to-face synthesis	link	NA
AnyFace++: A unified framework for free-style text-to-face synthesis and manipulation	2024	Text-guided face editing	link	NA
AnyFace: Free-style text-to-face synthesis and manipulation	2022	Text-to-face generation	link	NA
Towards counterfactual image manipulation via CLIP	2022	Controllable text-to-face	link	link
Prompt-Based Modality Bridging for Unified Text-to-Face Generation and Manipulation	2024	Prompt-based face synthesis	link	NA
Tecm-clip: Text-based controllable multi-attribute face image manipulation	2022	face attribute / expression editing	link	link
Stylemc: Multi-channel based fast text-guided image generation and manipulation	2022	face multi-attribute editing	link	link
Photoverse: Tuning-free image customization with text-to-image diffusion models	2023	Few-shot personalised face portrait generation	link	link
Fastcomposer: Tuning-free multi-subject image generation with localized attention	2024	fast subject-driven face text-to-image	link	link
Moa: Mixture-of-attention for subject-context disentanglement in personalized image generation	2024	multi-concept face portrait generation	link	NA
Photomaker: Customizing realistic human photos via stacked id embedding	2024	high-fidelity face personalisation	link	link
Face0: Instantaneously conditioning a text-to-image model on a face	2023	Identity-preserving face text-to-image	link	NA
Ip-adapter: Text compatible image prompt adapter for text-to-image diffusion models	2023	face instant personalisation	link	link
Dreamidentity: Improved editability for efficient face-identity preserved image generation	2023	face identity-guided generation	link	NA
Portraitbooth: A versatile portrait model for fast identity-preserved personalization	2024	face few-shot portrait generation	link	NA
Instantid: Zero-shot identity-preserving generation in seconds	2024	face real-time personalisation	link	link
ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning	2024	face identity-consistent generation	link	link
Facestudio: Put your face everywhere in seconds	2023	face ID & style controllable text-to-image	link	link
IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models	2024	identity-aware face editing	link	NA
Arc2face: A foundation model for id-consistent human faces	2024	identity-conditioned face generation	link	link
Face Reconstruction from Face Embeddings using Adapter to a Face Foundation Model	2024	General identity-conditioned face generation	link	link
Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance	2025	Identity-conditioned 3D head / avatar generation	link	link
ClipSwap: Towards High Fidelity Face Swapping via Attributes and CLIP-Informed Loss	2024	Face swapping	link	NA