Advances in high-dimensional mass cytometry cell and tissue analyses for translational biomarker discovery

The systematic profiling of cellular functions and phenotypes, especially for cancer and inflammatory diseases, enables patient stratification strategies and personalized medicine approaches. These studies include quantification of cell surface phenotypic and activation markers, and detection of secreted proteins and peptides. This information can aid discovery of both clinical biomarkers and ex vivo/in vitro read-outs to guide development of novel therapeutics. The focus of this review is to highlight the value of the precision medicine data generated using novel high-dimensional technologies, such as mass cytometry (MC) and imaging MC, and to review emerging data tools which enable comprehensive data analysis and disease mechanism elucidation. Integration of these novel multiplex read-outs and corresponding mathematical analyses has facilitated discovery of previously unrecognized prognostic clinical biomarkers.

Correspondence to: Ilona Kariv, Ph.D., Merck & Co., Inc., 33 Avenue Louis Pasteur, Boston, MA, USA 02115, Tel: 617-992-2098, Fax: 617-992-2487, E-mail: Ilona_Kariv@merck.com

During the last two decades our knowledge of cellular homeostasis during disease origination and progression has significantly advanced due to the systematic profiling of cellular functions and phenotypes, especially for immune-cell mediated pathologies [1,2]. In fact, many clinical trials now include cellular immunophenotyping as a biomarker component [3,4]. Patient specimens, often as small as a core needle biopsy or superfluous volumes of routinely collected fluids such as blood or sputum [5], are tested by a multitude of single cell profiling methods aimed at gaining a better understanding of the cellular milieu. These studies include quantification of cell surface phenotypic and activation markers [6], ex vivo analysis evaluating response to therapies [7], and detection of secreted proteins and peptides [8]. Multiplexed analyses throughout clinical trials have enabled the collection of valuable information from precious and limited patient specimens to investigate cellular disease perturbations. Ultimately these data aid in the development of disease diagnostic and prognostic tools and, more importantly, determine patient stratification strategies which enable personalized medicine approaches. The focus of this review is to highlight the value of the translational data generated using novel technologies, such as mass cytometry (MC) and imaging MC, which have brought single cell immunophenotyping and imaging of the tumor microenvironment and other tissues to a new forefront, and to review emerging data tools which enable comprehensive data analysis and interpretation, often identifying previously unrecognized connections between cellular phenotypes and functions in healthy and diseased states [9][10][11][12].
Although still in early development, MC based single cell immunophenotyping and analysis are steadily gaining recognition and becoming a mainstay not only in immune-mediated disease research, but also in interrogating basic biology for other therapeutic targets [13][14][15][16]. MC, or cytometry by time of flight (CyTOF), merges concepts of both flow cytometry and mass spectrometry [17,18]. This method relies on the binding of labeled antibodies (Abs) to detect specific cell surface antigens (Ags) or intra-cellular proteins, similar to well-established flow cytometry (FC) and other immunofluorescence based techniques. However, fluorescent signal spillover associated with traditional flow cytometry methods limits signal detection to 12-15 analytes for a single sample [18,19]. Furthermore, a typical phenotyping panel of 12 colors may permit only 2-3 markers to be dedicated to the identification of each of the major immune cell subsets. Phenotyping of highly heterogeneous samples therefore becomes particularly challenging, because numerous markers are required to positively identify multiple cellular populations and activation states. In contrast to FC, detection Abs for the MC method are labeled with lanthanide metals, and the actual read-out is the mass of the analytes associated with single cells, as determined by high-resolution mass spectrometry. Given that the lanthanide metal tags are not endogenously found in biological systems, this detection method has virtually no background signal that needs to be compensated as in fluorescence-based techniques [21]. The merging of highly specific Ab binding and precise detection of metal traces by mass spectrometry allows an unprecedented number of analytes, currently upwards of 50, to be accurately quantified at the single-cell level [22,23].
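As a rough illustration of the panel-size argument above, the following Python sketch divides the channels of a panel evenly among the immune subsets to be identified. The choice of five subsets and the even split are assumptions made purely for illustration; real panels allocate markers unevenly and reserve channels for viability and lineage exclusion.

```python
def markers_per_subset(panel_size: int, n_subsets: int) -> float:
    """Average number of channels available per cell subset (even split assumed)."""
    if n_subsets <= 0:
        raise ValueError("n_subsets must be positive")
    return panel_size / n_subsets


# Hypothetical scenario: five major immune subsets to identify.
N_SUBSETS = 5
fc_budget = markers_per_subset(12, N_SUBSETS)  # 12-color flow cytometry panel
mc_budget = markers_per_subset(50, N_SUBSETS)  # ~50-channel mass cytometry panel

print(f"FC: {fc_budget:.1f} markers/subset, MC: {mc_budget:.1f} markers/subset")
```

Under these assumptions the flow panel leaves roughly 2-3 markers per subset, in line with the estimate in the text, whereas a 50-channel panel leaves about ten.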
Furthermore, the resolution of available analytes is relatively similar across the entire mass range available for CyTOF [24], allowing for the same staining sensitivity for abundant and lower density antigens. This advantage simplifies Ab panel design, as low abundance antigens can be discerned equally well by almost all the metal tags spanning the whole mass channel range. This is in contrast to flow cytometry, where rare antigens are typically stained by Abs conjugated to the brightest fluorophores, such as PE or BV421 [25][26][27], in order to provide adequate signal resolution from background.
Even though CyTOF instruments have been commercially available for only five years, in this short period of time an ever increasing number of novel and groundbreaking findings have been published [28][29][30][31]. For example, Hansmann et al. (2015) used a multiplexed 39-parameter MC analysis to identify a novel memory B cell subset (CD24lo CD38+ CD27+) that is highly enriched in, and unique to, multiple myeloma (MM) patient peripheral blood mononuclear cells (PBMCs) [32]. While the precise function of these cells remains to be elucidated, these memory B cells were not detected, using a similar multiplexed MC analysis, in other cancer types, demonstrating a unique association and a potential diagnostic biomarker for MM. Similarly, Thomas, et al. . In this study, the authors systematically profiled disease mediated changes in the immune cell compartment in tumor specimens, normal adjacent tissue, and the peripheral blood in early lung adenocarcinoma treatment-naïve patients in response to treatment. More than 30 cell surface markers of lymphoid and myeloid origin were profiled in specimens from 32 patients, allowing for the identification of immune cells likely responsible for promoting an immunosuppressive microenvironment. The authors report that the tumor microenvironment, even as early as stage I lung adenocarcinoma, was enriched for PD-1hi CTLA-4hi regulatory T cells and PPARγhi macrophages [39]. Furthermore, an increase in CX3CL1 positive cells [39] at the tumor lesion could provide a potential target for novel therapeutic strategies. Given these findings, it is likely that a neo-adjuvant or adjuvant immunotherapy may be an appropriate course of treatment [39], with the ultimate goal of shifting the balance and repopulating the tumor microenvironment with infiltrating lymphocytes, and harnessing the individual's own immune system to fight the cancer.
Multiparametric studies of single cells as enabled by mass cytometry can also assist in defining sample integrity for clinical biomarker studies. This is of particular importance considering that collection of patient specimens and downstream analyses often occur at different locations and are accompanied by significant time delays associated with sample transport and the need for simultaneous analysis of all collected samples for a given study. Researchers therefore frequently rely on data obtained using cryopreserved specimens. Indeed, significant evaluations have been carried out in the past to determine proper isolation, shipping and storage conditions of various bio-specimens [40,41], as well as to assess the effects of cryopreservation on sample integrity [42,43]. The conclusions from many of these studies are variable and contradictory. While some publications state that fresh and frozen specimens are equivalent [44], others report potential limitations in the use of frozen specimens [45]. These discrepancies are possibly due in part to the restrictions associated with FC, which allow only a focused subset of markers to be evaluated on fresh and frozen specimens. Our recent study employed an unbiased and comprehensive MC phenotyping panel. These data revealed a significant reduction in detection of cellular subsets upon cryopreservation, most notably myeloid derived suppressor cells (MDSC), defined by co-expression of CD66b and CD15 together with an HLA-DRdim CD14- phenotype, while most lymphoid markers were unaffected, as evident from the staining equivalency of fresh and cryopreserved specimens [46]. More recent reports [47,48] have also demonstrated the applicability of MC-based single-cell analysis to interrogate novel immune-checkpoint receptor therapies.
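The phenotype-based comparison of fresh and cryopreserved specimens can be sketched in Python as a simple boolean gate. This is a minimal illustration only: the threshold values, the marker dictionary layout, and the toy events are hypothetical and are not taken from the cited study [46].

```python
from typing import Dict, List

Cell = Dict[str, float]

# Hypothetical gate thresholds on transformed intensities (illustration only).
POS = 2.0       # at or above: marker positive
DIM_LOW = 0.5   # lower bound of the HLA-DR "dim" window
DIM_HIGH = 2.0  # upper bound of the HLA-DR "dim" window
NEG = 0.5       # below: marker negative


def is_mdsc(cell: Cell) -> bool:
    """CD66b+ CD15+ HLA-DRdim CD14- gate, following the phenotype named in the text."""
    return (
        cell["CD66b"] >= POS
        and cell["CD15"] >= POS
        and DIM_LOW <= cell["HLA-DR"] < DIM_HIGH
        and cell["CD14"] < NEG
    )


def mdsc_frequency(cells: List[Cell]) -> float:
    """Fraction of events that fall inside the MDSC gate."""
    if not cells:
        return 0.0
    return sum(is_mdsc(c) for c in cells) / len(cells)


# Toy events, not measurements from any study.
fresh = [
    {"CD66b": 3.1, "CD15": 2.8, "HLA-DR": 1.0, "CD14": 0.1},
    {"CD66b": 2.5, "CD15": 3.0, "HLA-DR": 0.9, "CD14": 0.2},
    {"CD66b": 0.1, "CD15": 0.2, "HLA-DR": 3.5, "CD14": 3.0},
    {"CD66b": 0.0, "CD15": 0.1, "HLA-DR": 0.2, "CD14": 0.1},
]
frozen = [
    {"CD66b": 3.0, "CD15": 2.7, "HLA-DR": 1.1, "CD14": 0.1},
    {"CD66b": 0.2, "CD15": 0.1, "HLA-DR": 3.4, "CD14": 2.9},
    {"CD66b": 0.1, "CD15": 0.0, "HLA-DR": 0.3, "CD14": 0.2},
    {"CD66b": 0.0, "CD15": 0.2, "HLA-DR": 0.1, "CD14": 0.0},
]

print("fresh:", mdsc_frequency(fresh), "frozen:", mdsc_frequency(frozen))
```

Applying the same gate to matched fresh and frozen aliquots makes a loss of a specific subset directly comparable, which is the kind of readout a broad panel enables without choosing the markers in advance.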
To date most of the FC and MC knowledge is derived from single cell suspensions, thus neglecting cellular co-localization and the potential contribution of neighboring cells to tissue homeostasis. A more recent advancement in mass cytometry is the application of metal tagged analytes to interrogate sections of tissues and adherent cells, rather than cell suspensions, giving researchers an unprecedented spatial resolution of the cellular microenvironment and composition. Imaging mass cytometry (IMC) and multiplexed ion beam imaging (MIBI) combine resolutions similar to light microscopy with the high multiplexing abilities of CyTOF [49]. For the IMC method, tissue sections are ablated by a high-powered laser, releasing clouds of particles which are then carried to the mass cytometer and analyzed accordingly [50]. In contrast, MIBI quantifies secondary ions which are released when tissues are subjected to a rasterized oxygen primary ion beam under vacuum [51]. Both methods employ previously established methods for tissue preparation and staining with isotopically pure lanthanide tagged antibodies, and acquisition is accomplished by a mass spectrometer. The acquired data are then processed to reconstruct images of the tissue sections, which can be further analyzed by traditional bioinformatics methods. Currently IMC and MIBI allow for up to 32 markers to be simultaneously visualized on single tissue sections [52], while traditional colorimetric immunohistochemistry (IHC) staining requires consecutive processing of the same slide to achieve multiplexing, and then only for up to four analytes [53]. Immunofluorescent (IF) staining methods allow for the quantification of up to 7 simultaneous markers, with the number of analytes limited by spectral overlap of the fluorescent probes [54].
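The image reconstruction step can be illustrated with a short Python sketch, assuming each acquired spot is stored as an (x, y) raster position plus per-channel ion counts. The record layout, the function name, and the channel label "Er167" are placeholders chosen for illustration; actual instrument file formats differ.

```python
from typing import Dict, List, Tuple

# One record per acquired spot: (x, y, {channel label: ion count}).
Spot = Tuple[int, int, Dict[str, float]]


def reconstruct_image(spots: List[Spot], channel: str,
                      width: int, height: int) -> List[List[float]]:
    """Place each spot's signal for one channel on a height x width pixel grid."""
    image = [[0.0] * width for _ in range(height)]
    for x, y, counts in spots:
        if 0 <= x < width and 0 <= y < height:  # ignore spots outside the grid
            image[y][x] = counts.get(channel, 0.0)
    return image


# Toy 3 x 2 acquisition with a single placeholder channel.
spots = [
    (0, 0, {"Er167": 5.0}), (1, 0, {"Er167": 0.0}), (2, 0, {"Er167": 12.0}),
    (0, 1, {"Er167": 1.0}), (1, 1, {"Er167": 7.0}), (2, 1, {"Er167": 0.0}),
]
image = reconstruct_image(spots, "Er167", width=3, height=2)
for row in image:
    print(row)
```

Repeating this for every channel yields one intensity image per marker, all registered to the same pixel grid, which is what allows dozens of markers to be overlaid on a single tissue section.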
Because of the limitations of the traditional methods, an unbiased multiplexed analysis of tissue sections and adherent cells is challenging, given that only a select few analytes can be employed for interrogation. Giesen, et al. (2014) applied a 32-plexed IMC panel to 32 FFPE breast cancer samples and interrogated the tissue for HER2 expression levels, a common biomarker currently used for patient stratification [55,56], while studies of smaller cohorts of breast cancer tissue samples, also focusing on HER2, were carried out by both Angelo et al. and Rost et al. using MIBI. The data highlighted in all three of these studies provided proof of concept by using tissue specimens with known HER2 levels and showed a significant correlation with established methods [50,51]. Furthermore, these studies demonstrate the applicability of both IMC and MIBI to significantly improve delineation of tumor heterogeneity and cellular interactions. Even though commercial IMC and MIBI modules have only recently become available to research and clinical laboratories, it is likely that the as yet unrealized potential of these technologies will further advance our understanding of the tumor and tissue microenvironment, and thus benefit the discovery
With the widespread application of highly multiplexed single cell analysis methods, the development of new data processing methods, needed to condense big data sets into manageable and meaningful results, has become the next frontier. For example, the use of dimensionality reducing algorithms such as ViSNE [61] and clustering methods common for SPADE analysis [62] has greatly improved the post-acquisition data analysis workflow, making traditional semi-manual hierarchical or two-dimensional gating obsolete.
Both ViSNE and SPADE process the data in an unbiased manner by employing algorithms that evaluate all the different signals simultaneously and group cell subsets based on the expression of all tested markers, thus uncovering cellular phenotypes which may have been overlooked using a traditional hierarchical gating approach. While ViSNE and SPADE have been the most widely used and validated applications for the analysis of high-dimensional data, a number of other algorithms using dimensionality reduction and clustering are becoming available for use [63].
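The idea of grouping cells on all markers at once, rather than gating two markers at a time, can be sketched in Python as follows. Note that this is neither ViSNE nor SPADE: plain k-means with a farthest-point initialization is swapped in as a simple stand-in, the arcsinh transform with a cofactor of 5 is a commonly used convention for mass cytometry counts rather than something specified in this review, and the marker names and toy events are invented for illustration.

```python
import math
from typing import List

Vector = List[float]


def arcsinh_transform(events: List[Vector], cofactor: float = 5.0) -> List[Vector]:
    """Compress the dynamic range of raw counts before clustering."""
    return [[math.asinh(v / cofactor) for v in event] for event in events]


def _dist2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance across all markers."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Vector], k: int, n_iter: int = 20) -> List[int]:
    """Plain k-means on all markers simultaneously (deterministic initialization)."""
    centers = [points[0]]
    while len(centers) < k:
        # Farthest-point initialization: pick the point least covered so far.
        centers.append(max(points, key=lambda p: min(_dist2(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: _dist2(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy events with three hypothetical markers: CD3, CD19, CD14.
raw = [
    [120.0, 2.0, 1.0], [100.0, 1.0, 3.0],   # T-cell-like
    [2.0, 90.0, 1.0], [1.0, 110.0, 2.0],    # B-cell-like
    [3.0, 2.0, 150.0], [1.0, 1.0, 130.0],   # monocyte-like
]
labels = kmeans(arcsinh_transform(raw), k=3)
print(labels)
```

Because every marker contributes to the distance, cells are grouped by their full expression profile; the cluster numbers themselves are arbitrary and only the grouping is meaningful.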
However, these analytical methods, geared towards analyzing single cell data, do not accurately depict spatial information when imaging tissue sections. For this purpose, the histology topography cytometry analysis toolbox (histoCAT) has been developed by researchers at the University of Zurich [64]. This software relies on the preprocessing of images, such as cell segmentation by CellProfiler [65] or other methods, followed by pixel based processing in histoCAT to recreate spatial co-localization of the cells. The single cell information is then overlaid on the segmentation masks, allowing the identification of cellular features, including abundance of measured markers, cellular morphological features such as size and shape, and, more significantly, information about the cellular microenvironment, such as co-localization of neighboring cells and cell crowding [50,51]. Schapiro, et al. used histoCAT to analyze 49 diverse breast cancer specimens and matched normal controls, which were imaged via multiplexed IMC. They identified 29 unique phenotype clusters as defined by expression of specific epitopes, and subsequently focused on clusters containing tumor associated macrophages (TAMs) [64] due to their known involvement in tumor progression or inhibition thereof [66,67]. Among other findings, their analysis indicated that the immediate cellular environment of TAMs is distinctly proliferative and hypoxic, as evident from expression of Ki-67 and carbonic anhydrase IX, respectively [64]. Because both intra-tumor hypoxia and hyperproliferation are common hallmarks of breast and other cancers [68], these findings support proof of concept for multiplex tissue staining and data analysis tools.
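A neighborhood readout of this kind can be sketched in Python under simplifying assumptions. The sketch below is not histoCAT's implementation: it replaces mask-based, pixel-level neighbor detection with a plain distance cutoff between cell centroids, and the coordinates, phenotype labels, and radius are invented for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# (x, y, phenotype) per segmented cell; coordinates in arbitrary units.
Cell = Tuple[float, float, str]


def neighbor_counts(cells: List[Cell], target: str, radius: float) -> Dict[str, int]:
    """Count phenotypes of cells lying within `radius` of each `target` cell.

    A cell close to two target cells is counted once per target cell.
    """
    counts: Counter = Counter()
    for i, (xi, yi, pi) in enumerate(cells):
        if pi != target:
            continue
        for j, (xj, yj, pj) in enumerate(cells):
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                counts[pj] += 1
    return dict(counts)


# Toy tissue layout with hypothetical phenotype labels.
cells = [
    (0.0, 0.0, "TAM"),
    (1.0, 0.0, "Ki-67+ tumor"),
    (0.0, 1.5, "CAIX+ tumor"),
    (10.0, 10.0, "T cell"),
    (10.5, 10.0, "TAM"),
]
print(neighbor_counts(cells, target="TAM", radius=2.0))
```

Comparing such counts against those obtained after shuffling the phenotype labels is one common way to judge whether an observed neighborhood is enriched beyond chance; that permutation step is omitted here for brevity.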
More recently, multifaceted tools that integrate features common to a number of algorithms, such as ImmunoClust [69] and Scaffold/immune reference maps [70], have further facilitated automated mapping of cellular phenotypes in the whole organism. Using these advanced algorithms, Spitzer et al. used mass cytometry to interrogate and analyze the bone marrow from wildtype and mutant C57BL/6 mice and projected the cell data onto intuitive maps, making it possible to distinguish immune cell organization in specific genetic variants and even circadian rhythms of tissues [70]. These studies exemplify how genetic, environmental, and pathophysiological causative factors can be linked to disease initiation and progression [70].
While mass cytometry has clear advantages over traditionally used fluorescent methods, there are various reasons why many labs continue to rely on flow cytometry studies. It is a well-established technology that has been used for immunophenotyping since the early 1970s, with instrument operation and data analysis workflows thoroughly optimized. Due to its long-standing existence, the FC community represents a vast network of expertise that new users can rely on. Reagents are easily accessible for a multitude of antigens for different species as Ab conjugates with multiple fluorophores. In addition, the widespread use of FC has ultimately brought instrument and reagent costs within budget for most research institutions, thus making the choice a simpler one. Furthermore, the development of newer and brighter fluorochromes [71], increased acquisition speeds [72] and improved optics implemented during the last four decades of flow cytometry use provide researchers with advanced single cell profiling methods. The most recent innovation in FC, a novel platform (Symphony™, Becton Dickinson, CA), allows for quantification of up to 30 parameters. Therefore, many institutions opt not to choose between flow and mass cytometry, but rather prefer to combine different detection methods to obtain comprehensive data sets to advance knowledge of disease precision medicine. In fact, Lavin et al. (2017), in their efforts to thoroughly map the innate immune landscape of early lung adenocarcinoma, employed numerous analysis methods beyond mass and flow cytometry, ranging from imaging of tissue sections by IHC to multiplexed profiling of secreted soluble factors and high-dimensional RNA sequencing.
These types of analyses wouldn't be possible without the development of new tools [49], refinement of established platforms [72] and improvements in data processing and analysis methods [63]. Thus, enhancing knowledge of cellular phenotypes using single cell profiling methods, and furthermore understanding of the cellular spatial co-localization in the complex tissue environment, will certainly improve clinical biomarker strategies and enable discovery of novel therapeutics.
