MultiNicheNet: modeling cell-cell communication from multi-sample multi-condition single-cell transcriptomics data
Studying intercellular communication is essential for an improved understanding of tissue biology and disease pathophysiology. Advances in single-cell and spatial transcriptomics help address this need through their ability to generate molecular profiles of cells within a tissue1. Several computational approaches have been developed to investigate cell-cell communication from these profiles2. Most tools infer cell-cell communication by predicting interactions between ligands (membrane-bound or secreted extracellular proteins) expressed by sender cell types and receptors expressed by receiver cell types3. These tools provide a comprehensive list of expressed ligand-receptor pairs that could potentially interact, but they do not provide a functional understanding of cell-cell communication because they do not infer how the receiver cell type responds to ligand-receptor binding.
Other tools approach the cell-cell communication inference problem differently by incorporating downstream signaling of ligand-receptor interactions. We previously developed NicheNet (https://github.com/saeyslab/nichenetr) which predicts downstream affected target genes of expressed ligand-receptor pairs by combining the expression data of interacting cells with a model of ligand-target regulatory potential4. NicheNet then prioritizes expressed ligand-receptor pairs according to how strongly their predicted targets are enriched in the receiver cell type (their so-called ligand activity).
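The idea of ligand activity can be illustrated with a minimal sketch. It assumes that activity is scored as the Pearson correlation between a ligand's regulatory potential scores and a binary indicator of which genes are affected in the receiver; this is an illustrative choice, not necessarily the exact statistic NicheNet uses. The function names, ligand names, gene names, and scores are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def ligand_activity(regulatory_potential, genes, affected_genes):
    """Score each ligand by how well its predicted targets match the affected genes.

    regulatory_potential: dict mapping ligand -> dict mapping gene -> score
    (a toy stand-in for a prior-knowledge ligand-target model).
    """
    membership = [1.0 if gene in affected_genes else 0.0 for gene in genes]
    return {
        ligand: pearson([scores.get(gene, 0.0) for gene in genes], membership)
        for ligand, scores in regulatory_potential.items()
    }


# Toy data: hypothetical ligands, genes, and scores.
genes = ["G1", "G2", "G3", "G4"]
affected = {"G1", "G2"}  # genes affected in the receiver cell type
potential = {
    "LIG_A": {"G1": 0.9, "G2": 0.8, "G3": 0.1, "G4": 0.0},
    "LIG_B": {"G1": 0.1, "G2": 0.0, "G3": 0.7, "G4": 0.9},
}

activities = ligand_activity(potential, genes, affected)
ranked = sorted(activities, key=activities.get, reverse=True)
```

In this toy example, the ligand whose high-scoring predicted targets coincide with the affected genes is ranked first, which is the intuition behind prioritizing ligand-receptor pairs by activity rather than by expression alone.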
Both types of approaches have been applied successfully to study communication in steady state as well as differences in communication between conditions. In the context of differential cell-cell communication analysis, the first type of tool prioritizes differential cell-cell communication patterns based on the differential expression of the ligand-receptor pairs. In contrast, NicheNet predicts “differentially active” ligand-receptor interactions for which prior knowledge supports that they could function upstream of the differentially expressed (DE) genes in a receiver cell type of interest. Because these two prioritization criteria capture complementary information, a tool that relies on only one of them may miss relevant interactions.
Moreover, both types of tools suffer from additional limitations when applied to infer differential cell-cell communication from multi-sample scRNA-seq data (e.g., from a cohort of several patients and healthy controls). Running the current cell-cell communication tools in their default mode on multi-sample data pools all cells across samples before generating results. This approach is statistically inadequate because it ignores sample-to-sample variation, an issue that has already been discussed extensively in the context of classic differential expression (DE) analyses5–7. Notably, this pooling procedure is also suboptimal from a biological perspective because it ignores that cell-cell communication occurs between cells within the same sample.
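The difference between pooling cells and keeping samples separate can be shown with a small sketch. It uses per-sample mean expression as a simplified stand-in for the sample-level aggregation used by multi-sample DE methods; the sample identifiers, expression values, and function names are hypothetical toy choices.

```python
from collections import defaultdict
from statistics import mean

# Each cell: (sample_id, condition, expression of one ligand). Toy values only.
cells = (
    [("P1", "disease", 10.0)] * 6   # one patient contributes many cells
    + [("P2", "disease", 1.0)] * 2
    + [("P3", "disease", 1.0)] * 2
    + [("H1", "control", 1.0)] * 2
    + [("H2", "control", 1.0)] * 2
)


def pooled_means(cells):
    """Mean expression per condition after pooling all cells across samples."""
    by_condition = defaultdict(list)
    for _sample, condition, expression in cells:
        by_condition[condition].append(expression)
    return {cond: mean(values) for cond, values in by_condition.items()}


def sample_level_means(cells):
    """Mean expression per sample, grouped by condition (one value per sample)."""
    by_sample = defaultdict(list)
    for sample, condition, expression in cells:
        by_sample[(condition, sample)].append(expression)
    result = defaultdict(dict)
    for (condition, sample), values in by_sample.items():
        result[condition][sample] = mean(values)
    return {cond: dict(samples) for cond, samples in result.items()}


pooled = pooled_means(cells)
per_sample = sample_level_means(cells)
disease_average_of_samples = mean(per_sample["disease"].values())
```

In this toy cohort, the pooled disease mean is dominated by the single patient with the most cells, whereas the sample-level view shows that only one of three patients has elevated expression. Only the sample-level view exposes the variation between patients that a statistical test needs.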
This is an important issue because of the expected rise in multi-sample datasets due to technological advances, for example, in sample multiplexing8. In parallel to this evolution, more and more datasets are added to existing atlases in projects like the Human Cell Atlas9. These atlases consist of several healthy and diseased samples of several tissues from multiple individuals. Deciphering the role of cell-cell communication in the pathogenesis of these diseases requires tools that can correct for the source of the data and relevant clinical covariates. Ideally, these tools should also be able to exploit the wealth of these multi-sample multi-condition datasets and tackle more complex questions than just pairwise comparisons (such as comparing therapy response or disease progression over time between several diseases).
In summary, there is a need for dedicated differential cell-cell communication tools that consider both the expression and activity of ligand-receptor pairs and that can handle the challenges and exploit the opportunities of multi-sample scRNA-seq datasets.
Aim and rationale for the approach
To address this need, we propose MultiNicheNet (https://github.com/saeyslab/multinichenetr), a novel tool for differential cell-cell communication analysis from multi-sample multi-condition scRNA-seq data. The rationale behind MultiNicheNet is to build upon the principles of state-of-the-art approaches for DE analysis of multi-sample scRNA-seq data6. As a result, the algorithm considers inter-sample heterogeneity, can correct for batch effects and covariates, and can cope with complex experimental designs to address more challenging questions than pairwise comparisons.
Details of the suggested approach
The main idea behind MultiNicheNet’s prioritization strategy is to uncover essential interactions by considering several complementary aspects informative for cell-cell communication inference. As ideal ligand-receptor pairs, we consider those that are more strongly expressed in the condition of interest, whose predicted target genes are enriched in the receiver cell type, that are cell-type specific, and that are present in most samples of the condition of interest. These criteria are calculated by applying state-of-the-art DE approaches like muscat6. The MultiNicheNet software package provides not only this prioritization framework but also options for further downstream analyses and several intuitive visualizations. These visualizations let users explore the data behind the predictions, which is essential before proceeding to experimental validation.
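The prioritization idea described above can be sketched as a weighted combination of scaled criteria. This is a minimal illustration under stated assumptions: min-max scaling, equal default weights, and a weighted average are choices made here for clarity, since the scaling and weighting that MultiNicheNet actually uses are not described in this section. All names and values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """One sender-receiver ligand-receptor pair with toy criterion values."""
    name: str
    differential_expression: float      # e.g. log fold change of the pair
    ligand_activity: float              # target gene enrichment in the receiver
    cell_type_specificity: float        # higher means more cell-type specific
    fraction_samples_expressing: float  # share of samples in the condition


CRITERIA = (
    "differential_expression",
    "ligand_activity",
    "cell_type_specificity",
    "fraction_samples_expressing",
)


def min_max_scale(values):
    """Rescale values to [0, 1]; constant inputs map to 0.5."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0.5] * len(values)
    return [(v - lowest) / (highest - lowest) for v in values]


def prioritize(interactions, weights=None):
    """Rank interactions by a weighted average of their scaled criteria."""
    weights = weights or {criterion: 1.0 for criterion in CRITERIA}
    total_weight = sum(weights[c] for c in CRITERIA)
    scaled = {
        c: min_max_scale([getattr(i, c) for i in interactions]) for c in CRITERIA
    }
    scores = {
        interaction.name: sum(weights[c] * scaled[c][idx] for c in CRITERIA)
        / total_weight
        for idx, interaction in enumerate(interactions)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy candidates with invented values.
candidates = [
    Interaction("LIG_A--REC_A", 2.0, 0.30, 0.8, 0.9),
    Interaction("LIG_B--REC_B", 2.5, 0.05, 0.3, 0.4),
    Interaction("LIG_C--REC_C", 0.2, 0.25, 0.6, 1.0),
]

ranking = prioritize(candidates)
```

In this toy ranking, the pair with the strongest differential expression but little support from the other criteria falls behind a pair that scores well on all four. This reflects the motivation for combining expression, activity, specificity, and sample-level presence rather than relying on a single criterion.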
We applied MultiNicheNet to scRNA-seq data of several tissues and diseases (breast cancer, squamous cell carcinoma, MIS-C, and lung fibrosis)10–13. These applications demonstrate that MultiNicheNet both retrieves known biology and generates novel hypotheses, including the possible identification of previously undescribed subgroups of patients. Additional data modalities, such as spatial co-localization from spatial transcriptomics data and proteomics, were used to further validate some of the top predictions.
How it will affect the broader field
We anticipate that MultiNicheNet will be a useful tool for studying dysregulated cell-cell communication patterns from patient cohort scRNA-seq data. This might lead to improved insights into disease pathogenesis, indicate potential treatment strategies, and identify potential biomarkers for patient stratification.