techmn/rfop

Rethinking Fusion and Orthogonal Projection for Face-Voice Association (FAME 2026)

Paper Link: arXiv (https://arxiv.org/abs/2512.02860)

Overview

RFOP revisits fusion and orthogonal projection for face-voice association, focusing on the relevant semantic information within the two modalities.
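
To illustrate the general idea behind an orthogonal projection objective, the sketch below follows the formulation commonly used in earlier work such as FOP: embeddings of the same identity are pushed toward cosine similarity 1, while embeddings of different identities are pushed toward orthogonality (cosine similarity 0). This is an illustration in plain Python under that assumption, not RFOP's implementation; the function names `cosine` and `orthogonal_projection_loss` are hypothetical.

```python
import math


def cosine(u, v):
    # Cosine similarity between two non-zero vectors given as lists of floats.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def orthogonal_projection_loss(embeddings, identities):
    # Hypothetical helper: (1 - mean same-identity similarity) + |mean cross-identity similarity|.
    same, diff = [], []
    for i in range(len(embeddings)):
        for j in range(i + 1, len(embeddings)):
            sim = cosine(embeddings[i], embeddings[j])
            if identities[i] == identities[j]:
                same.append(sim)
            else:
                diff.append(sim)
    # An empty group contributes a mean of 0.0 (an assumption made for this sketch).
    s = sum(same) / len(same) if same else 0.0
    d = sum(diff) / len(diff) if diff else 0.0
    return (1.0 - s) + abs(d)
```

The loss reaches zero when same-identity embeddings are perfectly aligned and different identities are mutually orthogonal.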

Installation

Please follow the instructions here to create the environment and install the required libraries.

Training

Use the following command to train the model:

python main.py --batch_size 64 --epochs 50 --dim_embed 256
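
The flags in the command above can be parsed with Python's standard `argparse` module. The sketch below is an assumption about how such a parser might look, not the actual contents of `main.py`. The function name `build_parser` and the help strings are hypothetical, and the defaults simply mirror the values shown in the command.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; the real main.py may define more flags or other defaults.
    parser = argparse.ArgumentParser(description="Train a face-voice association model (sketch)")
    parser.add_argument("--batch_size", type=int, default=64, help="samples per mini-batch")
    parser.add_argument("--epochs", type=int, default=50, help="number of training epochs")
    parser.add_argument("--dim_embed", type=int, default=256, help="embedding dimension")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--batch_size", "64", "--epochs", "50", "--dim_embed", "256"])
```

Note that `--dim_embed` is passed to both the training and the scoring commands, so the same value should be used in both.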

Score Computation

Use the following command to compute the score for the trained model:

python computeScore.py --ckpt <path to checkpoint.pth.tar> --dim_embed 256 
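
This README does not state which score `computeScore.py` reports. Face-voice verification is commonly evaluated with the equal error rate (EER), so the sketch below shows how an EER could be computed from per-pair distances and match labels. It is an illustration under that assumption, not the script's actual logic; `equal_error_rate` and its inputs are hypothetical names.

```python
def equal_error_rate(distances, labels):
    """Approximate EER for distance scores (smaller distance = more likely a match).

    distances: list of floats, one per face-voice pair.
    labels: list of 0/1, where 1 marks a pair from the same identity.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one matching and one non-matching pair")

    best_gap, eer = float("inf"), 1.0
    # Sweep every observed distance as a threshold: a pair is accepted if d <= threshold.
    for threshold in sorted(set(distances)):
        false_accepts = sum(1 for d, y in zip(distances, labels) if y == 0 and d <= threshold)
        false_rejects = sum(1 for d, y in zip(distances, labels) if y == 1 and d > threshold)
        far = false_accepts / n_neg
        frr = false_rejects / n_pos
        # Keep the operating point where the two error rates are closest.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```

The threshold sweep is quadratic in the number of pairs, which keeps the sketch short; a sorted cumulative count would be preferable for large evaluation sets.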

Acknowledgements

The codebase is inspired by the FOP repository. We thank the authors for releasing their valuable codebase.

Similar Works

  • FAME   Face-voice Association in Multilingual Environments (FAME Challenge)
  • PAEFF   Precise Alignment and Enhanced Gated Feature Fusion for Face-Voice Association (InterSpeech 2025)
  • SBNet   Single-branch Network for Multimodal Training (ICASSP 2023)
  • FOP   Fusion and Orthogonal Projection for Improved Face-Voice Association (ICASSP 2022)

Citation

@misc{rfop2025,
      title={RFOP: Rethinking Fusion and Orthogonal Projection for Face-Voice Association}, 
      author={Abdul Hannan and Furqan Malik and Hina Jabbar and Syed Suleman Sadiq and Mubashir Noman},
      year={2025},
      eprint={2512.02860},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2512.02860}, 
}
