MuSARCyto: Multi‐head self‐attention‐based representation learning for unsupervised clustering of cytometry data

Gupta, Anubha ; Hooda, Ritika ; Motwani, Sachin ; Sagar, Dikshant ; Aggarwal, Priya ; Abrol, Vinayak ; Gupta, Ritu (2025) MuSARCyto: Multi‐head self‐attention‐based representation learning for unsupervised clustering of cytometry data. Cytometry Part A: Bioimaging, 107 (8). pp. 551-567. ISSN 1552-4922

Full text not available from this repository.

Official URL: https://doi.org/10.1002/cyto.a.24956

Related URL: http://dx.doi.org/10.1002/cyto.a.24956

Abstract

Cytometry enables simultaneous assessment of individual cellular characteristics, offering vital insights for diagnosis, prognosis, and monitoring various human diseases. Despite its significance, the process of manual cell clustering, or gating, remains labor-intensive, tedious, and highly subjective, which restricts its broader application in both research and clinical settings. Although automated clustering solutions have been developed, manual gating continues to be the clinical gold standard, possibly due to the suboptimal performance of automated solutions. We hypothesize that their performance can be improved via an appropriate representation of data from the clustering point of view. To this end, this work presents a novel unsupervised deep learning (DL) architecture wherein an efficient cytometry data representation is learned that helps discover cluster assignments. Specifically, we propose MuSARCyto, a multi-head self-attention-based representation learning network (RN) for the unsupervised clustering of cytometry data, utilizing a fully-connected representation network backbone. To benchmark MuSARCyto against the state-of-the-art cytometry clustering methods, we propose a cluster evaluation metric adjudicator score (Adn), which is an ensemble of prevalent cluster evaluation metrics. Extensive experimentation demonstrates the superior performance of MuSARCyto against the existing state-of-the-art cytometry clustering methods across six publicly available mass and flow cytometry datasets. The proposed DL architectures are small and easily deployable for clinical settings. This work further suggests using DL methods for identifying meaningful clusters, particularly in the context of critical immunology applications.

Item Type: Article
Source: Copyright of this article belongs to John Wiley & Sons, Inc.
ID Code: 141769
Deposited On: 22 Jan 2026 17:57
Last Modified: 22 Jan 2026 17:57
