Unraveling the complexities of pulmonary arterial hypertension (PAH) is challenging due to its multifaceted nature, encompassing molecular, cellular, tissue, and organ-level alterations. The advent of omics technologies, including genomics, epigenomics, transcriptomics, metabolomics, and proteomics, has generated a vast array of public and nonpublic datasets from both humans and model organisms, opening new avenues for understanding PAH. However, the insights provided by individual omics datasets into the molecular mechanisms of PAH are inherently limited. In response, efforts are increasing to develop integrative omics approaches designed to synthesize multidimensional omics data into a cohesive understanding of the molecular dynamics of PAH. In this review, we discuss various strategies for integrating multiomic data and illustrate their application in PAH research. We explore the challenges encountered and the profound potential of leveraging omics data for comprehensive molecular insight as well as for the identification of novel therapeutic targets and biomarkers specific to PAH. Furthermore, in this review, we seek to elucidate the process and rationale behind conducting integrative omics studies in PAH, raising critical questions about the feasibility and future prospects of multiomic integration in unraveling the complexities of this disease.
INTRODUCTION
Pulmonary arterial hypertension (PAH) is a fatal vasculopathy characterized by pulmonary vasoconstriction and adverse remodeling of the distal pulmonary arteries. Progression of the disease is manifested by a significant increase in pulmonary artery pressure, which strains the right ventricle (RV), leading to hypertrophy and ultimately heart failure, the leading cause of death in PAH patients.1 The pathogenesis of PAH is complex and involves sophisticated interactions between multiple organs—including the lungs, RV, bone marrow, and spleen—and different cell types—including smooth muscle cells, endothelial cells, fibroblasts, and inflammatory cells.2
The advent of omics technologies has advanced our understanding of the molecular intricacies of PAH, revealing extensive molecular dysfunction through genomic, epigenomic, transcriptomic, proteomic, and metabolomic studies.3 While studies of individual omic layers have uncovered potential new therapeutic targets and biomarkers, it is increasingly recognized that a narrow focus on a single omic layer provides an incomplete picture of the intricate mechanisms linking molecular variations to clinical disease manifestations. Biological systems are characterized by complex networks and interactions that span multiple omic domains and underlie the pathology of PAH.4
The integration of multiomic data is essential to unravel the complex mechanisms of PAH and provide the basis for novel therapies and interventions. However, this integrative approach poses significant computational challenges, ranging from the development of sophisticated statistical methods to the creation of comprehensive databases that link omic levels to biological functions and disease states.5 Addressing these challenges requires computational and programming expertise not traditionally found in biological laboratories. The move toward multiomic integration requires a collaborative effort that brings together the knowledge of biologists, bioinformaticians, and computer scientists. This interdisciplinary collaboration is critical to overcoming the computational hurdles of multiomic data analysis, thereby moving PAH research into a new era of discovery and therapeutic development.
In this review, we will explore the field of integrative multiomic studies and their central role in advancing our understanding of PAH. First, we will provide an insightful overview of the core omics data types central to PAH research. Building on this foundation, we will explore the principles of multidimensional data integration and provide a thorough examination of the cutting-edge methods and tools that are shaping this vibrant area of study (Figure 1). Through a series of illustrative examples, we will highlight the real-world applications and significant achievements of multiomic studies in PAH, demonstrating their ability to unravel the intricacies of the disease. Finally, we will discuss the current challenges and limitations of integrative multiomic approaches and assess the gap between expectations and actual achievements, challenging common myths and highlighting the tangible benefits these studies offer.
Figure 1. Navigating omic data integration: methods and challenges. Multiomics combines data from multiple platforms, offering comprehensive insight into biological systems. It begins with meticulous sample collection and preparation, encompassing various biological specimens like blood and tissue, pivotal for capturing a wide array of omic information, such as genomics, transcriptomics, epigenomics, proteomics, and metabolomics. The integration of multiomic data faces notable challenges in analysis and synthesis, including the high costs of omics technologies, computational complexities, dataset variability, and limited data sharing among researchers. Nevertheless, by harnessing machine learning and statistical approaches (including pairwise integration, dimensionality reduction, and network-based methodologies), the integration process unlocks invaluable insights. These include the identification of novel biomarkers and therapeutic targets and the development of enhanced risk prediction models, thereby illustrating the transformative power of integrated omic data in pushing the boundaries of our understanding of complex biological systems.
MULTIOMIC: WHAT ARE WE TALKING ABOUT?
Multiomics, also known as integrated omics or panomics, combines multiple datasets to analyze, visualize, and interpret the mechanisms of biological processes.6 It aims to identify molecular markers by uncovering genomic, transcriptomic, proteomic, and metabolic changes and capturing spatiotemporal dynamics.7 Multiomics provides insights into molecular functions, interactions, and cellular outcomes, helping to identify predictive biomarkers and drug targets and refine disease prognosis.8 An understanding of single-omics strategies is essential before embarking on multiomics analysis, especially in PAH where each approach provides unique insights into the molecular mechanisms of the disease.
Genomics
Genomic techniques are designed to explore interindividual variation at both the germline and somatic levels by sequencing the genome of interest.9 The evolution from first-generation Sanger sequencing to third-generation long-read sequencing has facilitated whole genome/exome sequencing with sufficient depth to characterize the mutational landscape within a given sample.10 For example, advances in genomics have revealed heterozygous germline mutations in the BMPR2 gene as the primary genetic cause of most cases of familial PAH, with over 600 mutations identified. These mutations, including nonsense, frameshift, splice-site, missense, and copy number variants, are present in over 75% of affected families. Genomic studies have also identified mutations in more than 18 other genes such as ACVRL1, ENG, SMAD9, and TET2, advancing our understanding and paving the way for targeted diagnostics and therapies.11–14
Epigenomics
Epigenetics, the study of heritable traits or stable changes in cell function without changes in DNA sequence, encompasses histone modification, DNA methylation, and noncoding RNA (ncRNA) regulation.15 Epigenomics studies these modifications across the genome and provides insights into their role in cellular processes and disease development.16 Techniques such as chromatin immunoprecipitation sequencing (ChIP-Seq) map histone modifications, while assay for transposase-accessible chromatin sequencing reveals the dynamics of chromatin accessibility.17 Whole-genome bisulfite sequencing and DNA methylation microarrays profile DNA methylation patterns, and RNA sequencing (RNA-Seq) unveils ncRNA modifications.18,19 Epigenetic modifications play a critical role in PAH, with DNA hypermethylation associated with abnormal cell proliferation and resistance to apoptosis in small pulmonary arteries. Authors of studies have identified hypermethylation of specific genes such as BMPR2 and SOD2 in PAH, influencing disease pathogenesis.20,21 Epigenetic age acceleration observed in PAH patients suggests accelerated aging in key tissues and blood components.22 In addition, omic technologies have highlighted the regulatory role of ncRNAs, such as miR-17-5p and miR-23a-3p in PAH pathology, affecting potent signaling pathways like BMP/SMAD.23,24
Transcriptomics
Transcriptomics techniques such as next-generation sequencing and RNA microarrays enable the profiling of differentially expressed genes. These methods provide insight into distinguishing normal from disease states by quantifying mRNA abundance across thousands of genes.25,26 Bulk RNA-Seq provides a broad overview but lacks resolution of individual cell behavior, in contrast with single-cell RNA-Seq, which dissects transcriptomes at high resolution and identifies distinct cell types and states. The newly developed spatial RNA-Seq technology (spatial transcriptomics) preserves the spatial context of RNA expression and maps gene expression within tissue architecture for comprehensive studies of biological systems.25
Transcriptomic studies in PAH have provided tremendous insight into disease mechanisms and therapeutic targets. Rodor et al.27 uncovered the involvement of endothelial cells in PAH inflammation by demonstrating upregulated major histocompatibility complex class II pathways in a mouse model. Similarly, Potus et al.14 identified decreased TET2 expression in peripheral blood mononuclear cells from PAH patients, suggesting a role in disease pathophysiology. Single-cell RNA-Seq allows detailed cellular examination, highlighting gene expression variations within specific cell groups.28 Notably, activation of the NF-κB pathway in monocytes and dendritic cells has been observed in experimental PAH models.29 Moreover, comparative transcriptomic analyses revealed dysregulated genes across pulmonary artery cell clusters in PAH.30 On the other hand, spatial transcriptomics revealed immune cell patterns near damaged vessels in rat lungs with induced PAH, shedding light on the nuances of the disease.31 These advances provide insights into the pathogenesis of PAH and potential therapeutic strategies, underscoring the importance of omics technologies in unraveling complex diseases.
Proteomics
Proteomics investigates the functional implications of all proteins expressed in cells, tissues, or organisms, using mass spectrometry-based techniques to analyze the flow of protein signals.32 High-resolution mass spectrometers, including LTQ Orbitrap and MALDI-TOF-TOF, accurately measure protein masses. Given the central role of proteins in biological processes, accurate measurement of proteome changes during disease development is critical for biomarker discovery.33 In PAH, Hołda et al.34 used iTRAQ-based LC-MS to analyze the RV proteome in MCT-induced PAH rats, revealing early upregulation of fatty acid β-oxidation and myosin-7 proteins and late overexpression of fibrosis-related proteins. Le Ribeuz et al.35 identified differentially expressed proteins in PAH-related cells, suggesting that KCNK3 deficiency induces cancer-related functions. In lung lobectomy homogenates, increased CLIC4 and decreased haptoglobin levels were associated with PAH.36 Plasma proteomic analysis in idiopathic/heritable PAH identified survival-associated proteins.37 With advances in proteomic technologies, Olink and SomaScan have emerged as critical tools, providing advanced capabilities to explore the intricate protein landscape within biological systems. Olink technology uses proximity extension assays for precise, high-throughput quantification of protein levels across multiple targets simultaneously, even in small sample volumes.38 In contrast, SomaScan employs a large library of aptamers to measure over 7000 proteins in a single run, providing unparalleled data breadth.39 Both technologies have played a pivotal role in PAH research, revealing novel biomarkers and therapeutic targets and advancing our understanding of the molecular mechanisms of the disease.
For instance, Boucherat et al.40 used Olink to identify proteins associated with cardiac fibrosis in PAH patients, including latent transforming growth factor beta binding protein 2 (LTBP-2), which correlates with RV function, whereas Rhodes et al.41 used SomaScan to identify 6 prognostic proteins in a UK PAH cohort, complementing NT-proBNP and clinical risk factors for patient risk stratification. These studies underscore the importance of proteomic analysis in elucidating pathological pathways and disease progression in PAH.
Metabolomics
Metabolomics, a branch of omics, is instrumental in elucidating the metabolic pathways underlying physiological or pathological processes. Using proton nuclear magnetic resonance spectroscopy or mass spectrometry, metabolomics analyzes biological samples to reveal intricate metabolic signatures.42 Over the past decade, the importance of metabolomics in identifying novel circulating markers of PAH has increased dramatically.43 In a PAH animal model (Sugen5416 plus the ovalbumin immunization), metabolomics revealed elevated levels of oxidized glutathione, xanthine, and uric acid, leading to increased xanthine oxidase-mediated reactive oxygen species release, which is known to impair pulmonary artery function.44 In addition, analysis of lung tissue from patients with advanced PAH revealed metabolic pathways that contribute to pulmonary artery remodeling, including imbalanced arginine pathways and altered heme metabolites.45 Moreover, metabolite profiling in idiopathic/heritable PAH patients identified altered nucleosides, energy metabolism intermediates, and decreased sphingomyelins, steroids, and phosphatidylcholines, which correlated with disease severity and patient survival.37 The authors of these studies have underscored the role of metabolomics in elucidating PAH mechanisms and its potential for in-depth phenotypic characterization and prognostic assessment.
NAVIGATING INTO MULTIDIMENSIONAL DATA
Integrating single-omic data into multidimensional/omic data is a challenging task that is crucial for understanding pathogenic mechanisms and identifying diagnostic or prognostic biomarkers. This transformative process merges information from different omics domains into comprehensive models.7 Data preprocessing, including rigorous quality control and normalization, ensures biological comparability across data types.46 Various integration tools, such as clustering, predictive modeling, and network-based methods, cater to specific data combinations and require careful selection to balance statistical robustness with biological relevance. The chosen methodology depends on the research objective, whether it is biomarker discovery, therapeutic target identification, or mechanistic insight.47 For biomarker discovery, clustering and predictive modeling prioritize data-driven insights, while authors of mechanistic studies tend to integrate biological context with data patterns using pairwise integration and network-based methods.48,49 The choice between supervised and unsupervised strategies also plays a crucial role in integration methodologies, with supervised approaches enhancing predictive modeling and biomarker discovery, while unsupervised strategies uncover novel patterns and enrich mechanistic explorations.50 A thoughtful selection process, considering biological nuances and data characteristics, is essential to unlock the full potential of multiomic research, especially in complex diseases like PAH.
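The preprocessing step described above can be sketched in a few lines. The following is a minimal, dependency-free illustration, not a recommended pipeline: it assumes each omic layer has already passed quality control, standardizes every feature to a z-score, and concatenates the layers over the samples that are present in all of them. The function names (`zscore_features`, `integrate_layers`), the sample IDs, and all numeric values are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev

def zscore_features(layer):
    """Standardize each feature of one omic layer across samples.

    `layer` maps sample ID -> {feature name: value}.
    """
    samples = sorted(layer)
    features = sorted(layer[samples[0]])
    scaled = {s: {} for s in samples}
    for f in features:
        values = [layer[s][f] for s in samples]
        mu, sd = mean(values), pstdev(values)
        for s in samples:
            # A constant feature carries no information; map it to 0.
            scaled[s][f] = 0.0 if sd == 0 else (layer[s][f] - mu) / sd
    return scaled

def integrate_layers(layers):
    """Concatenate standardized layers over samples shared by all layers.

    `layers` maps layer name -> layer; features are prefixed with the
    layer name so that identically named features do not collide.
    """
    shared = set.intersection(*(set(layer) for layer in layers.values()))
    merged = {s: {} for s in sorted(shared)}
    for name, layer in layers.items():
        scaled = zscore_features({s: layer[s] for s in shared})
        for s in shared:
            for f, v in scaled[s].items():
                merged[s][f"{name}:{f}"] = v
    return merged

# Toy data (hypothetical values, not taken from any study).
rna = {"P1": {"BMPR2": 5.0, "TET2": 8.0},
       "P2": {"BMPR2": 7.0, "TET2": 6.0},
       "P3": {"BMPR2": 9.0, "TET2": 7.0}}
protein = {"P1": {"LTBP2": 100.0},
           "P2": {"LTBP2": 300.0},
           "P4": {"LTBP2": 200.0}}

merged = integrate_layers({"rna": rna, "protein": protein})
for sample, features in merged.items():
    print(sample, features)
```

Only samples P1 and P2 survive the merge because they are the only ones measured in both layers. Real studies would typically add layer-specific normalization (for example, library-size correction for RNA-Seq) before a step like this.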
CLUSTERING/DIMENSIONALITY REDUCTION-BASED APPROACHES
Clustering and dimensionality reduction are fundamental techniques in data science that simplify complex datasets and facilitate their interpretation. These methods unify disparate data types into a coherent analytical space, easing downstream integration and analysis.51 Clustering categorizes data points based on similarity to identify disease subtypes or patterns relevant to diagnosis and prognosis. Techniques such as hierarchical clustering and k-means clustering reveal hidden structures in the data, shedding light on disease subpopulations and potential markers.52 Dimensionality reduction simplifies data by reducing the number of variables considered, improving manageability and analysis. Methods such as principal component analysis (PCA) and multidimensional scaling distill complex datasets into informative components, preserving essential information while eliminating redundancy.53 The integration of multiomic data through clustering and dimensionality reduction is revolutionizing our understanding of biological systems and disease mechanisms. These methods enable researchers to gain unprecedented insights into biological systems and drive innovation in biomedical research.54 Techniques such as CIA/MCIA and FALDA demonstrate how dimensionality reduction fuses molecular data, facilitating the discovery of new disease subtypes and biomarkers while improving our understanding of complex biological interactions.55 This approach promises more precise and effective therapeutic strategies in the future.
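To make the idea of dimensionality reduction concrete, the sketch below computes the first principal component of a small concatenated feature matrix by power iteration on the sample covariance matrix. It is a simplified, dependency-free illustration and not the procedure of any study cited here: the function name `first_principal_component`, the feature layout (2 transcript columns followed by 2 protein columns), and all values are assumptions made for the example.

```python
from statistics import mean

def first_principal_component(rows, n_iter=200):
    """Return (loadings, scores) for the first principal component.

    `rows` is a list of samples, each a list of feature values.
    """
    n, p = len(rows), len(rows[0])
    col_means = [mean(r[j] for r in rows) for j in range(p)]
    centered = [[r[j] - col_means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0] * p  # starting vector for power iteration
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project each centered sample onto the leading direction.
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores

# Toy matrix: 3 hypothetical controls followed by 3 hypothetical PAH
# samples; columns are 2 transcript features and 2 protein features.
samples = [
    [1.0, 1.2, 0.9, 1.1],
    [1.1, 0.9, 1.0, 1.0],
    [0.9, 1.0, 1.1, 0.9],
    [3.0, 3.1, 2.9, 3.2],
    [3.2, 2.9, 3.1, 3.0],
    [2.9, 3.0, 3.0, 3.1],
]
loadings, scores = first_principal_component(samples)
print([round(s, 2) for s in scores])
```

In this toy example the two groups fall on opposite sides of zero along the first component, which is the kind of separation an unsupervised analysis looks for. With real data, features are usually standardized first so that no single layer dominates the variance.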
In PAH, clustering and dimensionality reduction are commonly used in the unsupervised analysis of single-omic data to investigate whether datasets naturally segregate into groups based on experimental conditions (eg, PAH versus control, treated versus untreated).56–58 Their use in a multiomic context is less common, although multiomic data integration via clustering is emerging as a valuable aid to sample classification and biomarker identification in PAH research. This approach leverages different omic layers to unravel complex biological networks in PAH, providing a holistic view of disease pathology and facilitating precise sample clustering. As a result, unique cluster-specific biomarkers are discovered, offering promising avenues for targeted therapies and personalized medicine in PAH.59 For example, Wang et al.60 used a comprehensive approach, integrating mRNA, lncRNA, circRNA, and miRNA expression profiles of pulmonary artery samples, to differentiate hypoxia-induced pulmonary hypertension rats from controls. Unsupervised hierarchical clustering categorized samples into distinct groups based on molecular signatures, revealing the molecular landscape underlying PAH pathogenesis. Similarly, researchers integrated transcriptome and proteome analyses to differentiate between control and decompensated RV in PAH patients. They used RNA-Seq and proteomic approaches to study RV tissue from patients clinically categorized as control, compensated RV, and decompensated RV. PCA and unsupervised clustering revealed a distinct separation of decompensated RV samples, demonstrating the robustness of integrated omics in delineating pathological states in PAH. Such approaches not only enhance our understanding of the heterogeneity of PAH but also provide valuable insights into potential biomarkers and therapeutic targets for this complex disease.
PREDICTIVE MODELING APPROACHES
Predictive modeling has emerged as a powerful approach in the field of multiomics and big data exploration. It involves a series of steps, including collecting and merging data from multiple sources, selecting relevant features, training models, validating results, and translating findings into clinical practice.40,61,62 First, disparate data, such as patient records and genetic databases, are combined into comprehensive datasets. Then relevant biomarkers are identified using feature selection techniques to train machine learning models that predict disease outcomes. These models are evaluated using validation datasets, with biomarkers that show strong predictive ability being prioritized for further validation. Notably, this method does not require prior knowledge of omics integration, relying instead on algorithm training.63 Common machine learning methods include logistic regression, support vector machines, random forests, neural networks, Bayesian models, and boosting techniques.61 In summary, predictive modeling provides a robust and data-driven means to unravel complex biological processes and discover clinically relevant biomarkers in multiomic datasets.
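The train-then-validate workflow outlined above can be illustrated with logistic regression, one of the methods listed. The sketch below is a deliberately small, dependency-free version fitted by batch gradient descent. It omits feature selection, regularization, and cross-validation, all of which a real analysis would need. The data are hypothetical standardized values for 2 features, and the names `train_logistic` and `predict` are illustrative only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    Returns (weights, bias).
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(p):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    """Classify each sample as 1 if its predicted probability is >= 0.5."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0
            for xi in X]

# Hypothetical standardized features; label 1 = poor outcome.
X_train = [[-1.5, 0.2], [-1.0, -0.4], [-0.8, 0.9], [-1.2, -0.7],
           [0.9, 0.1], [1.1, -0.3], [1.4, 0.6], [0.7, -0.8]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]

# Held-out samples used only for validation.
X_val = [[-1.1, 0.0], [-1.2, 0.1], [1.0, -0.1], [1.1, 0.0]]
y_val = [0, 0, 1, 1]

weights, bias = train_logistic(X_train, y_train)
val_pred = predict(X_val, weights, bias)
accuracy = sum(p == t for p, t in zip(val_pred, y_val)) / len(y_val)
print("weights:", [round(w, 2) for w in weights], "validation accuracy:", accuracy)
```

Keeping the validation samples out of training is the essential design choice here. A biomarker that predicts well only on the data used to fit the model has not been validated.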
In PAH, Pi et al.43 conducted a comprehensive analysis of metabolomic data from 117 PAH patients to uncover metabolites and metabolic pathways associated with indicators of disease severity and RV vulnerability. Their investigation focused on 5 key outcomes: RV dilation, NT-proBNP levels, REVEAL 2.0 score, 6-minute walk distance, and mortality. They first examined overall metabolic differences and their associations with these outcomes, followed by a detailed analysis of individual metabolites and pathways. The team used multivariate analysis techniques such as PCA and partial least squares discriminant analysis to understand global metabolic variation in relation to disease severity. Associations between outcomes and metabolites were assessed using linear and Cox regression analyses, adjusting for relevant factors. Of note, 65 metabolites were identified as associated with mortality, leading to the development of a predictive model using 11 consistent metabolites. This model, validated in an external cohort, shows promise for improving the care of PAH patients.43 In a similar study, researchers integrated transcriptomic and proteomic analyses to characterize RV changes in PAH patients with RV dysfunction. They identified 5 proteins (LTBP-2, COL6A3, COL18A1, TNC, and CA1) that were elevated in the blood of PAH patients. Predictive modeling was used to establish associations between these proteins and patient survival, with LTBP-2 showing additional predictive value compared with conventional risk assessment methods.40 These findings underscore the potential of omic studies and predictive modeling to advance our understanding of PAH pathogenesis and improve biomarker discovery. Nevertheless, the application of predictive modeling approaches in a multiomic context in PAH remains relatively unexplored.
Pairwise Omics Data Integration
Pairwise omics data integration has emerged as a promising approach in biomarker discovery and understanding of disease mechanisms, enabling the identification of molecular signatures associated with disease etiology, progression, and treatment response.64 It involves combining two omic datasets, such as genomics and transcriptomics or transcriptomics and proteomics, to uncover molecular relationships and interactions.65 A prominent method for pairwise integration is the analysis of expression quantitative trait loci (eQTLs), which links genetic variants (genomic data) to changes in gene expression (transcriptomic data).66 Several computational methods for performing eQTL analyses exist, including robust algorithms such as GEMMA and Matrix eQTL.67,68 Other approaches, such as Bayesian methods and machine learning algorithms, provide flexible and versatile tools for eQTL analysis.69 Moreover, recent advances in single-cell sequencing technologies have paved the way for cell-specific eQTL mapping, enabling the dissection of transcriptional regulatory networks at unprecedented resolution.70 In PAH, eQTL analyses decode the complex interplay between genetic variation and gene expression dysregulation. Studies exploring the relationship between genomic alterations and gene expression patterns in PAH have identified potentially novel eQTLs associated with immune-related pathways, shedding light on PAH pathophysiology and offering insights into patient characterization and identification.
For example, Prohaska et al.71 identified genome-wide single nucleotide polymorphisms associated with both RASA3 expression and disease severity and mortality in PAH patients. Similarly, Ulrich et al.72 performed transcriptome-wide eQTL analysis and uncovered novel genetic influences on gene expression variability, particularly in immune-related pathways, emphasizing the utility of eQTL in characterizing PAH patients.
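At its core, an eQTL test asks whether the expression of a gene changes with the number of copies of a variant allele. The sketch below shows that additive model as a simple least-squares regression of expression on allele dosage (0, 1, or 2). It is an illustration only: dedicated tools such as those named above additionally handle covariates, population structure, and multiple testing, none of which is attempted here. The SNP identifiers, genotypes, and expression values are hypothetical.

```python
from statistics import mean

def eqtl_slope(genotypes, expression):
    """Least-squares slope and Pearson r of expression on allele dosage."""
    g_bar, e_bar = mean(genotypes), mean(expression)
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(genotypes, expression))
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    syy = sum((e - e_bar) ** 2 for e in expression)
    slope = sxy / sxx             # expression change per extra allele copy
    r = sxy / (sxx * syy) ** 0.5  # strength of the linear association
    return slope, r

# Hypothetical expression of one gene in 6 individuals.
expression = [10.0, 10.4, 8.9, 9.1, 8.0, 7.6]

# Hypothetical allele dosages for 2 candidate SNPs in the same individuals.
snps = {
    "snp_hypothetical_1": [0, 0, 1, 1, 2, 2],
    "snp_hypothetical_2": [0, 2, 1, 1, 2, 0],
}

results = {name: eqtl_slope(dosage, expression) for name, dosage in snps.items()}
best_snp = max(results, key=lambda name: abs(results[name][1]))
for name, (slope, r) in results.items():
    print(f"{name}: slope={slope:.2f}, r={r:.2f}")
print("strongest candidate:", best_snp)
```

In a genome-wide setting this regression is repeated for millions of SNP-gene pairs, which is why efficient implementations and stringent correction for multiple testing matter.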
Correlation analysis, another widely used approach, quantifies pairwise associations between omic features (eg, genes, proteins, metabolites) across samples, revealing coregulated or coexpressed molecular signatures.73 Additionally, pathway analysis tools map molecular features to known biological pathways, helping to identify dysregulated pathways associated with disease phenotypes.74 For example, Hou et al.75 used high-throughput sequencing to study mRNA and lncRNA interactions in PAH pathogenesis. They integrated cis and trans assays, constructed a lncRNA-mRNA coexpression network based on Pearson correlation coefficients, and established a lncRNA-miRNA regulatory network. Functional analysis revealed regulatory networks involving 285 mRNAs and 147 lncRNAs, highlighting the importance of transcriptome and epigenome integration in understanding lncRNA-mRNA interactions in PAH.75 Moreover, Chelladurai et al.76 performed a pairwise integrative analysis, combining RNA-Seq and ChIP-Seq, to compare the transcriptional profile of fibroblasts derived from individuals with PAH against healthy controls. This comprehensive approach uncovered a robust correlation between the altered histone modification signatures and the aberrant gene expression pattern observed in PAH fibroblasts.76 Similarly, researchers have focused on pairwise integration of transcriptomic and proteomic data to uncover mechanisms of RV dysfunction and identify novel biomarkers to assess RV function in PAH. This combination of knowledge is critical to elucidate molecular mechanisms and improve our understanding of PAH.40,56
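A correlation-based coexpression network of the kind described above can be sketched as follows. The example computes Pearson correlation coefficients between every feature in one layer and every feature in another, measured across the same samples, and keeps pairs whose absolute correlation exceeds a cutoff. The cutoff of 0.9, the feature names, and the expression values are assumptions for illustration and are not taken from the cited studies.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def cross_layer_edges(layer_a, layer_b, threshold=0.9):
    """Return (feature_a, feature_b, r) for pairs with |r| >= threshold.

    Each layer maps feature name -> values across the same ordered samples.
    """
    edges = []
    for name_a, values_a in layer_a.items():
        for name_b, values_b in layer_b.items():
            r = pearson(values_a, values_b)
            if abs(r) >= threshold:
                edges.append((name_a, name_b, round(r, 3)))
    return edges

# Hypothetical expression across 5 matched samples.
lncrnas = {"lncRNA_1": [1, 2, 3, 4, 5],
           "lncRNA_2": [5, 4, 3, 2, 1]}
mrnas = {"mRNA_A": [2.1, 3.9, 6.2, 8.0, 9.8],
         "mRNA_B": [5, 1, 4, 2, 3]}

edges = cross_layer_edges(lncrnas, mrnas)
for edge in edges:
    print(edge)
```

The sign of each retained coefficient distinguishes positively coexpressed pairs from inversely related ones. Correlation alone does not establish regulation, so edges like these are best treated as hypotheses for follow-up.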
Network-Based Methodologies
Network-based methods play a critical role in multiomics integration by modeling complex interactions between biological molecules, facilitating the identification of key regulatory elements, pathways, and potential therapeutic targets.77 These methods fall into 2 main categories based on network construction: those that use established, experimentally validated interactions sourced from scientific literature databases and those that use correlational or statistical approaches.78 Networks based on established interactions include protein-protein interaction (PPI) networks from sources such as STRING and BioGRID, gene regulatory networks detailing the relationships between transcription factors and target genes, and pathway-based networks from databases such as KEGG and Reactome.79 In contrast, statistical methods such as weighted gene coexpression network analysis (WGCNA) and correlation networks compute pairwise correlations or use advanced machine learning techniques to infer functional associations or coregulation between omics features, potentially revealing novel interactions.80 While established interaction-based networks are valued for their accuracy, correlational or statistical methods are essential for their ability to explore and hypothesize novel biological connections, albeit with the risk of false positives. In the field of PAH, network-based approaches provide a systemic understanding of the molecular mechanisms driving pathogenesis and facilitate the identification of regulatory elements, signaling pathways, and therapeutic targets.
For example, Li et al.82 characterized differentially expressed genes in PAH lungs and constructed a PPI network to evaluate functional relationships between hub genes using the STRING database. This tool integrates multiple data sources, including experimental evidence and computational predictions, to predict and visualize protein interactions. It assigns confidence scores to these interactions based on supporting evidence, facilitating the creation of a visual network where proteins are nodes and their interactions are edges.81 Through this analysis, the study authors identified 9 hub genes that were significantly upregulated in PAH lung tissue compared with controls, revealing connections to pathways involved in DNA-templated transcription, sister chromatid cohesion, mitotic nuclear division, and regulation of actin cytoskeleton. These findings provide potential mechanistic insights into the development of PAH by elucidating the interplay of biological processes at the molecular level within PPI networks, which will facilitate the identification of novel therapeutic targets.82
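The notion of a hub gene in a PPI network can be made concrete with a short sketch. Assuming a list of interactions that each carry a confidence score between 0 and 1, the example keeps only interactions at or above a chosen cutoff and ranks genes by the number of remaining partners (their degree). The gene labels and scores are placeholders, and the 0.7 cutoff is an illustrative choice. This is not the procedure used in the study above, where additional centrality measures may have been applied.

```python
from collections import Counter

def hub_genes(edges, min_score=0.7, top_n=2):
    """Rank genes by degree after filtering edges on confidence score.

    `edges` is a list of (gene_a, gene_b, score) tuples.
    Returns the `top_n` (gene, degree) pairs, highest degree first.
    """
    degree = Counter()
    for gene_a, gene_b, score in edges:
        if score >= min_score:
            degree[gene_a] += 1
            degree[gene_b] += 1
    return degree.most_common(top_n)

# Placeholder interactions with hypothetical confidence scores.
interactions = [
    ("G1", "G2", 0.95),
    ("G1", "G3", 0.90),
    ("G1", "G4", 0.80),
    ("G1", "G5", 0.75),
    ("G2", "G3", 0.85),
    ("G2", "G4", 0.72),
    ("G3", "G5", 0.40),  # below the cutoff, so it is ignored
]

hubs = hub_genes(interactions)
print(hubs)
```

Degree is the simplest hub criterion. Raising or lowering the score cutoff trades network coverage against the reliability of the retained interactions.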
While several statistical network approaches are available for PAH research, WGCNA is the most widely used method. For example, Kasavi83 conducted a comprehensive study integrating omics data by analyzing genome-wide gene and miRNA expression patterns in idiopathic PAH patients and controls. Using WGCNA in R, the author constructed a gene coexpression network to identify clusters of highly correlated genes, which were then integrated with the human PPI network to uncover novel molecular insights. Using miRNA-target gene interactions from the miRTarBase database, the author aimed to unveil molecular signatures and potential therapeutic drug candidates.83 Similarly, Duo et al.84 applied a WGCNA approach to identify key modules associated with PAH and to develop a diagnostic signature and immune landscape for the disease. Their research contributed to a better understanding of the molecular mechanisms underlying PAH and provided valuable insights for diagnostic and therapeutic advances in PAH.84
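The distinguishing feature of a weighted coexpression network is that correlations are not cut at a hard threshold. Instead, an unsigned adjacency is obtained by raising the absolute correlation to a power, a_ij = |cor(x_i, x_j)|^β, so that weak correlations shrink toward zero while strong ones are preserved. The sketch below illustrates only this first step in plain Python. It does not use or reproduce the WGCNA R package, and module detection is omitted. The value β = 6, the gene names, and the expression values are assumptions for illustration.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    `expr` maps gene name -> expression values across samples.
    Self-connections are set to 0 so they do not count toward connectivity.
    """
    genes = sorted(expr)
    return {gi: {gj: 0.0 if gi == gj
                 else abs(pearson(expr[gi], expr[gj])) ** beta
                 for gj in genes}
            for gi in genes}

def connectivity(adjacency):
    """Sum of connection weights of each gene to all other genes."""
    return {gene: sum(row.values()) for gene, row in adjacency.items()}

# Hypothetical expression of 3 genes across 5 samples.
expression = {
    "gene_a": [1, 2, 3, 4, 5],
    "gene_b": [2.1, 3.9, 6.2, 8.0, 9.8],
    "gene_c": [5, 1, 4, 2, 3],
}

adjacency = soft_threshold_adjacency(expression)
k = connectivity(adjacency)
for gene in sorted(k):
    print(gene, round(k[gene], 4))
```

In this toy example the tightly correlated pair keeps a weight close to 1, whereas the weakly correlated gene ends up almost disconnected, which is the behavior that allows modules of coexpressed genes to emerge in later steps.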
CHALLENGES ASSOCIATED WITH MULTIOMIC RESEARCH
Over the past decade, the advent of omics technologies has revolutionized our understanding of disease etiology and led to groundbreaking discoveries in the identification of novel biomarkers and therapeutic targets in PAH. This transformative shift has broadened the scope of research methodologies, allowing us to examine molecular changes comprehensively, including global transcriptomic and proteomic changes, rather than focusing solely on individual genetic or protein components. By adopting this holistic approach, researchers have gained deeper insights into the intricate molecular landscape of disease, unraveling its complexity and elucidating key biological pathways and mechanisms involved. The next frontier is to integrate multiple layers of omic data to obtain a unified view and identify multiomic hubs affected in PAH, ultimately improving our understanding of its pathophysiology, accelerating biomarker discovery, and prioritizing potential therapeutic targets. However, the integration of multiomic data poses several challenges (Figure 1), including the need for diverse expertise (biological, computational, and programming skills), data heterogeneity, and the significant costs associated with such approaches. Consequently, previous studies on PAH have predominantly focused on single-omic analyses, while integrated multiomic approaches remain largely unexplored in the field. Overcoming these challenges is crucial for unlocking the full potential of multiomic integration and advancing our understanding and treatment of complex diseases such as PAH.
The heterogeneity and variability inherent in omics research can be broadly categorized into 2 main types. The first type encompasses the biological variability of the samples themselves and the criteria used to include or exclude them. This category includes factors such as the age, sex, and origin of the biological subjects as well as the methods used to obtain the tissues. For instance, differences between blood samples obtained via venipuncture and those obtained via right heart catheterization may introduce variability. Similarly, when working with tissue biopsies, the specific part of the tissue collected (eg, superior versus inferior or left versus right lung sections) and the method of collection (autopsy versus biopsy) warrant careful consideration to accurately interpret omics data. In addition, the selection of a control cohort, which could range from healthy individuals without any comorbidities to patients without a diagnosis of PAH or individuals undergoing lung cancer resection, can significantly influence the data. The second type of heterogeneity arises from data processing decisions, such as the choice of reference genome, alignment methods (eg, STAR versus HISAT2), various cutoffs and quality control measures, and the analytical pipelines employed, which are often subject to individual investigator preferences. The absence of consensus and of standardized methodologies for data collection, processing, and analysis across omics studies limits the reproducibility and comparability of results.
For example, a recent systematic review of the epigenetic changes associated with RV dysfunction highlighted reproducibility issues across microarray studies of the microRNAs involved in RV failure in PAH.85 This variability underscores the critical need for standardized approaches in omics research to improve the comparability and reliability of findings.
The financial burden associated with multiomics research has become increasingly apparent with the advent of advanced technologies like single-cell analyses, spatial transcriptomics, and Olink proteomics. While these methods are transformative, they significantly increase the cost of conducting large cohort studies. As a result, data mining of previously published datasets is emerging as a compelling research methodology to mitigate both the financial constraints and the sample availability challenges. This is particularly relevant for rare diseases such as PAH, where obtaining tissue samples from organs such as the RV and lungs can be difficult for many laboratories. However, relying on previously published data to supplement new research brings its own set of complexities, particularly regarding the reproducibility of results. One major issue is that metadata, which are crucial for understanding the context and conditions under which the data were collected, are not always completely or readily available to the research community. Furthermore, some datasets may not be shared or accessible at all, limiting their utility for further investigation and slowing the pace of new discoveries in the field. This lack of accessibility does not serve the interests of patients, scientific advancement, or research progress. This scenario complicates efforts to replicate studies or build on existing research and underscores the need for better standards and practices for data sharing and documentation in the multiomics field.
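One practical way to approach the metadata problem when mining public datasets is to audit each sample record against a checklist before reuse. The following Python sketch is only an illustration under stated assumptions: the field names mirror the sources of variability discussed above (subject characteristics, collection method, control definition, reference genome, aligner) but do not correspond to any established reporting standard, and the sample identifiers and values are hypothetical.

```python
# Hypothetical checklist of metadata fields; not an established standard.
REQUIRED_FIELDS = (
    "age",
    "sex",
    "tissue",
    "collection_method",
    "control_definition",
    "reference_genome",
    "aligner",
)


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in one sample record."""
    return [field for field in required if record.get(field) in (None, "")]


def audit(samples, required=REQUIRED_FIELDS):
    """Map each incompletely annotated sample ID to its missing fields."""
    report = {}
    for sample_id, record in samples.items():
        missing = missing_fields(record, required)
        if missing:
            report[sample_id] = missing
    return report


# Hypothetical sample records for illustration only.
samples = {
    "sample_1": {
        "age": 54, "sex": "F", "tissue": "lung",
        "collection_method": "biopsy",
        "control_definition": "lung cancer resection",
        "reference_genome": "GRCh38", "aligner": "STAR",
    },
    "sample_2": {
        "age": 61, "sex": "", "tissue": "RV",
        "collection_method": "autopsy",
        "control_definition": "no PAH diagnosis",
        "reference_genome": "GRCh38",
    },
}

incomplete = audit(samples)
print(incomplete)
```

A report of this kind makes explicit which datasets can be compared directly and which require contacting the original investigators before integration.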
PROMISES OF MULTIOMICS INTEGRATION FOR IDENTIFYING NOVEL BIOMARKERS AND TREATMENT TARGETS
The incorporation of multiomic approaches into PAH research is still in its infancy and remains largely confined to basic science aimed at improving our foundational understanding of the disease. Multiomic analyses in PAH aim to unravel complex biological mechanisms, including the impact of histone modifications on the transcriptome, the regulatory functions of ncRNA on gene expression, and the delineation of eQTLs.72,86–90 Such exploratory work is pivotal because it builds a robust framework of fundamental biological knowledge, albeit with a delayed trajectory toward direct clinical applicability. In contrast, the use of multiomic methodologies in clinical and translational PAH research has been relatively limited. To date, the application of multiomic approaches has focused primarily on the identification and prioritization of potential biomarkers. Examples include the detection of gene alterations at both the transcriptomic and proteomic levels in the RV and blood of patients with RV failure due to PAH40 and the combination of gene expression profiling with the identification of novel protein alterations in the lungs of patients with PAH.91 Similar analysis has been used to discover novel therapeutic targets in PAH.87 However, whether a comprehensive multiomic biomarker panel that includes specific markers from epigenomics, proteomics, and metabolomics outperforms traditional single-omic strategies remains underexplored. Similarly, the exploration of the benefits of panomic therapeutic strategies that may emerge from multiomic research is still in its preliminary stages.
CONCLUSION
In conclusion, the integration of multiomic data holds significant promise for unraveling the complexities underlying the pathophysiology of PAH. However, moving from exploratory research to tangible clinical applications remains a major challenge, and bridging the gap between multiomic discoveries and their clinical use will require continued research and innovation in this field. This effort depends not only on advances in technology and analytics but also on a multidisciplinary approach that brings together clinicians, researchers, and patients. By fostering collaboration across these groups, we can accelerate the translation of multiomic insights into treatments that significantly improve patient outcomes, pushing the boundaries of what is currently possible in PAH care.
References
Disclosure: The authors have no relevant personal financial relationships to disclose.