Complexity across scales: a walkthrough to linking neuroimaging readouts to molecular processes


Introduction
Neurodegenerative disorders are often classified as "multifactorial syndromes" because they share many genetic, clinical, psychological and environmental factors [1,2]. They are highly debilitating clinical conditions that result in progressive neuronal degeneration. Alzheimer's disease (AD) in particular is characterized by progressive neuronal dysfunction and a steady decline in cognition and behavior. AD is broadly classified into two categories: sporadic and familial. The familial form is most commonly caused by mutations in three major genes, namely APP, PSEN1 and PSEN2 [3][4][5]. The sporadic form, in contrast, is a complex amalgamation of genetic polymorphisms, environment and social lifestyle [6][7][8]. Although the search for novel biomarkers commenced decades ago, there is still no reliable diagnosis or effective treatment for AD [9][10][11]. Barrett and Hunter's team report that the lack of effective treatment for AD could be primarily due to misdiagnosis of the disease by physicians [12,13], possibly resulting from a lack of attention in routine medical examinations. Existing treatment for AD is limited to symptomatic relief [14,15]. However, it is widely argued that neurodegenerative changes commence much earlier than the clinical manifestation of the disease. Early detection would therefore not only improve diagnostic accuracy in the clinic but also help clinicians offer better and earlier treatment for cognitive and behavioral problems [16,17], along with better quality of life and economic outcomes.
State-of-the-art brain imaging technologies provide high-resolution information on structural and functional alterations. They therefore offer unprecedented opportunities for early diagnosis as well as for regular monitoring of a progressive clinical condition such as AD. Furthermore, imaging techniques help trace the transition between diagnostic states such as Mild Cognitive Impairment (MCI) and AD.
Depending on brain complexity, imaging techniques reveal different dimensions of brain structure and function. They can be broadly classified into three groups namely [

Structural neuroimaging
Structural Magnetic Resonance Imaging (sMRI), Computed Tomography (CT) and Diffusion Tensor Imaging (DTI) are some of the prominent structural neuroimaging techniques. Structural MRI is widely used to examine the shape, size and structural alterations of brain regions [20,21]. DTI is an advanced MR technique that helps in understanding structural connectivity between brain regions [22,23]. These techniques primarily help identify observable indicators such as "tissue damage" or loss of brain regions, as well as measurable indicators such as white or gray matter changes and morphological changes such as cortical thinning [24,25]. These indicators are collectively classified as neuroimaging biomarkers because they are quantitative tracers of disease progression. Some important neuroimaging biomarkers are listed below:

Atrophy
Brain atrophy is one of the most prominent neuroimaging biomarkers for AD. Atrophy refers to the loss of neurons and tissue, which ultimately results in shrinkage of the brain [26,27]. Whole-brain atrophy has previously been estimated to progress at about 2% per year in AD patients, whereas the rate of atrophy in normal ageing does not exceed 0.7% per year [28]. According to Frisoni, et

Magnetic resonance spectroscopy (MRS) is a widely used non-invasive imaging technique that measures the metabolites found in brain tissues, such as myo-inositol, choline and N-acetyl aspartate. Advanced MRS techniques help identify patients well ahead of the clinical onset of AD [46,47].

Brain glucose metabolism
Recent advances in functional imaging have contributed significantly to the identification of patterns among patients who are at risk of developing AD [48,49]. The earliest PET-based studies were used to detect altered glucose metabolism in patients who were at genetic risk of developing AD [50-52]. PET radioisotopes such as oxygen-15 (15O) help trace changes in cerebral blood flow, which are often caused by increased neuronal activity [53][54][55]. Similarly, [18F]fludeoxyglucose positron emission tomography (FDG-PET) detects bilateral temporoparietal hypometabolism [56-58]. These PET measures have been widely used as differential diagnostic biomarkers to discriminate between patients with AD dementia and those with vascular dementia [59,60]. Another radioisotope-based biomarker that is widely used in diagnostic studies is 11C-labeled Pittsburgh Compound-B ([11C]PiB). Increased PiB binding potential was found to be common among MCI patients, whereas decreased FDG uptake was observed only in patients with AD, which makes the pair a crucial diagnostic biomarker [61,62] (Figure 2).

Perfusion
Imaging techniques such as SPECT and DTI enable early detection of hypoperfusion in the white matter and cortex [63,64]. Abnormal cerebral perfusion is a clear indicator of the diagnostic transition from MCI to AD [65,66]. Borroni et al. and Chao et al. have demonstrated patterns of hypoperfusion in the parietal, temporal and posterior cingulate cortex in patients progressing from MCI to AD [67,68]. Another study, performed by Caroli et al., compared three diagnostic groups, namely cognitively normal (CN), MCI and AD. It reported a hippocampal hypoperfusion pattern in patients with amnestic MCI in transition to AD [69] (Figure 3).

Emerging combinatorial biomarkers for AD
Clinical neuroimaging biomarkers are useful resources for AD diagnosis. However, the characteristics of these imaging biomarkers are not yet adequate for diagnosis of patients at an individual level. This is largely due to the lack of longitudinal imaging data [70,71]. Combining known genetic biomarkers with imaging data could improve the prediction pattern across all patients [72][73][74]. Neuroimaging genetics is an emerging field in which quantitative phenotypic features from brain imaging are used as readout to inspect the role of genetic variation in brain function [75,76].
Large-scale genome-wide association studies (GWAS) have contributed to the identification of many risk variants associated with AD, in genes such as CLU, PICALM, BIN1 and CR1 [77][78][79]. These studies have created a substantial shift away from conventional AD detection through standard cognitive tests. Of the genes mentioned above, CLU is the most significant gene used in combinatorial imaging analysis. Its risk variant rs11136000 has been associated with reduced hippocampal volume in patients with Late Onset Alzheimer Disease (LOAD) [80][81][82]. Apart from CLU, the risk variant rs541458 of PICALM was found to be associated with CSF Abeta 42 levels [83][84][85]. Similarly, large-scale initiatives across the globe have already started investing in combining genetic and imaging-derived biomarkers for better AD diagnosis (Table 1).

Large scale initiatives on neuroimaging and genetics
Here, we summarize the various initiatives that are focusing on integrating multi-scale data such as imaging and genetics for efficient diagnosis and treatment.

ADNI
ADNI is considered one of the biggest ongoing multicenter studies for developing longitudinal clinical, imaging, genetic and neuropsychological biomarkers for early detection of AD. The initial phase (ADNI-1) had the largest enrollment, comprising 400 early MCI subjects, 200 AD patients and 200 controls. Owing to its success, the study was extended into the next phase (ADNI-2) with an additional 550 participants. This study aimed at developing a standardized protocol for data integration and collection for MRI, PET and CSF biomarkers in a global environment [95,96]. It produced interesting hypotheses that went beyond the conventional understanding of AD pathology. One of the earlier studies demonstrated that image-derived biomarkers such as "atrophy" and "hypometabolism" exhibited a pattern based on disease progression and severity [97,98]. Many successive studies also demonstrated the importance of CSF biomarkers and PET-based biomarkers as early indicators of pre-clinical AD [99][100][101]. A sister initiative, the ADNI Genetics Core, enables researchers to relate genetic alterations to imaging features in order to understand disease progression over time [102][103][104].

Table 1
Study | Cohort | SNPs | Imaging readouts | Outcome
Voineskos, et al. [86] | Philadelphia Neurodevelopmental cohort | rs12148337 | White matter fractional anisotropy | The mutation had a polygenic risk score with white matter FA in schizophrenic population
Louwersheimer, et al. [87] | Amsterdam Dementia Cohort | rs2070045-G (SORL1) | Hippocampal atrophy | SORL1 SNP rs2070045-G allele was related to CSF-tau and hippocampal atrophy, 2 endophenotype markers of AD, suggesting that SORL1 may be implicated in the downstream pathology in AD.
Oliveira-Filho, et al. [94] | Boston Cohort | rs20417 | White matter hyperintensity volume | rs20417 polymorphism was associated with increased WMHv (P = .037), not cardioembolic stroke patients.

The European Alzheimer's disease Neuroimaging Initiative (E-ADNI)
The overall goal of the E-ADNI initiative was to apply the standardized protocol for collecting imaging, genetic, clinical and psychological data by adapting it to the European centers of the Alzheimer's Disease Consortium (EADC). The initiative was launched to encourage the academic EADC centers to adopt the ADNI protocol for enrolling participants [105,106].

The Italian Alzheimer's Disease Neuroimaging Initiative (I-ADNI)
The I-ADNI initiative was launched as a successor to the US-ADNI study. It validates the acquisition and processing protocol for structural MRI scans obtained from different clinics across Italy, following the procedure of the original ADNI study [107,108].

The Australian Imaging Biomarkers and Lifestyle Study of Aging (AIBL)
The AIBL (https://aibl.csiro.au/about/) initiative consists of 1,200 Australian participants who were assessed longitudinally for over 5 years. The study was launched in 2006 to identify biomarkers, cognitive assessments, genetic markers such as APOE, and social and health factors for monitoring AD progression and enabling early AD treatment. The AIBL initiative has yielded many insights, for example that AD patients are more prone to anemia than patients with MCI [109,110]. Participants enrolled in this initiative are assessed every 18 months for any clinical indication of the disease.

EPAD
EPAD (http://ep-ad.org/) stands for European Prevention of Alzheimer's Dementia Consortium. It is a major European initiative for developing systematic and flexible approaches to clinical trials of drugs for preventing Alzheimer's dementia. The adaptive trial design in EPAD promises to bring drugs to market faster and at lower cost. The imaging protocol of EPAD is adapted from the AMYPAD initiative, which brings together academic and private research groups for PET-based studies that explore amyloid-beta as a therapeutic marker for AD [111,112].

AMYPAD
AMYPAD (http://www.amypad.eu/) stands for Amyloid Imaging to Prevent AD. This project was initiated to investigate beta-amyloid biology through PET scans of a pre-symptomatic population, with beta-amyloid as a diagnostic and therapeutic biomarker for AD. The AMYPAD project is funded by the Innovative Medicine Initiative (IMI) program and will initially run for 5 years. In the course of this project, patients susceptible to AD will be scanned for beta-amyloid through PET imaging. The initiative aims at improving the diagnostic standards for AD treatment and prevention (http://www.alzheimer-europe.org/News/EU-projects/Thursday-17-December-2015-AMYPAD-project-progresses-to-second-stage-of-applications-for-IMI2-Call-5).

PPMI

ENIGMA
ENIGMA (http://enigma.ini.usc.edu/) stands for Enhancing NeuroImaging Genetics Through Meta-Analysis. This consortium brings together researchers from diverse domains such as imaging genomics, neurology and psychiatry to understand brain structure and function through MRI, DTI, fMRI, genetic and patient data. The study has so far analyzed 12,826 subjects. The preliminary ENIGMA project aimed to identify common genetic variants associated with hippocampal or intracranial volume using Genome Wide Association Studies (GWAS). ENIGMA2 was the next project, exploring genetic variants associated with subcortical volumes, and ENIGMA-DTI was designed to explore genetic variants associated with white matter microstructure. Apart from meta-analysis-based studies, the consortium also focuses on understanding how psychiatric conditions such as schizophrenia, bipolar disorder and depression affect brain function [114,115].

NeuroImage
NeuroImage (http://www.neuroimage.nl/) is an International Multiscale Attention-Deficit/Hyperactivity Disorder (ADHD) Genetics Initiative (IMAGE) funded by the National Institute of Mental Health. The goal of the study is to gather and analyze endophenotypic, phenotypic and genetic information about ADHD. This study is based on a collection of 5,578 subjects from 8 European countries. In the course of this project, structural and functional MRI scans are performed on patients, along with neuropsychological assessments and GWAS analysis in order to detect functional abnormalities underlying ADHD [116,117].
Initiatives such as ADNI and PPMI have invested heavily in systematically harvesting genetic and imaging data. They form the basis for associating imaging readouts with genetic variation and may facilitate the generation of hypotheses about mechanistic links between genes and imaging features.

Mining links between neuroimaging readouts and molecular processes from literature
High-throughput imaging technologies have been employed to understand the molecular mechanisms underlying clinical conditions. Such efforts have led to the identification of novel biomarkers across disease domains, especially AD [118,119]. However, the rapid growth of the literature around these combinatorial studies has made it increasingly difficult to aggregate and mine the reported findings [120]. Technologies for automated text processing ("text-mining") may help to retrieve relevant documents and to extract relevant knowledge from text.

Ontologies and terminologies
One of the most efficient ways to address the challenge of mining unstructured information is the use of ontologies and controlled terminologies. Ontologies are formal representations of knowledge that can represent entire research domains. They are helpful when concepts need to be shared across research communities in an unambiguous fashion. This is crucial because it enables different research groups to communicate with each other without misinterpreting the biological context [121][122][123]. Ontologies also facilitate the exchange of data and knowledge between machines; they are in fact readable by both human experts and machines. When transformed into terminologies (dictionaries), they can readily be integrated into text-mining systems and are very useful for information extraction and knowledge representation. Furthermore, ontologies have the potential to enable automated reasoning over knowledge representations [124,125].
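The step from a terminology (dictionary) to text mining can be illustrated with a minimal Python sketch of dictionary-based tagging. The three entries and their synonyms are hypothetical examples chosen for illustration, not terms taken from any published ontology, and production text-mining systems use far more elaborate matching than this regular expression.

```python
import re

# Hypothetical mini-terminology: preferred term -> synonyms.
# Entries are illustrative only and are not copied from a real ontology.
TERMINOLOGY = {
    "hippocampal atrophy": ["hippocampal atrophy", "hippocampal volume loss"],
    "cerebral blood flow": ["cerebral blood flow", "CBF"],
    "cortical thinning": ["cortical thinning"],
}


def build_matcher(terminology):
    """Compile one case-insensitive regex plus a synonym -> preferred-term lookup."""
    lookup = {}
    for preferred, synonyms in terminology.items():
        for synonym in synonyms:
            lookup[synonym.lower()] = preferred
    # Try longer synonyms first so multi-word terms win over shorter overlaps.
    alternatives = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(s) for s in alternatives) + r")\b",
        re.IGNORECASE,
    )
    return pattern, lookup


def tag_text(text, terminology=TERMINOLOGY):
    """Return (start, end, matched text, preferred term) for every dictionary hit."""
    pattern, lookup = build_matcher(terminology)
    return [
        (m.start(), m.end(), m.group(0), lookup[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


if __name__ == "__main__":
    sentence = "Reduced CBF was accompanied by hippocampal volume loss."
    for hit in tag_text(sentence):
        print(hit)
```

Mapping every synonym back to one preferred term is what lets different wordings in the literature be retrieved under a single concept.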

Existing ontologies in the field of neuroimaging
Similar to other biological domains, the field of neuroimaging research has advanced semantically by generating various terminologies and ontologies in the past. Some of the more widely recognized neuroimaging ontologies are listed below:

Quantitative Imaging Biomarker Ontology (QIBO)
The QIBO ontology was developed to standardize quantitative imaging biomarkers for better therapeutic intervention. It consists of 488 terms organized into classes such as imaging agent, imaging instrument and biological intervention. QIBO represents concepts across several fields, including imaging physics and biology [126].

Magnetic Resonance Imaging Ontology (MRIO)
This ontology captures the concepts needed to describe the outcome of MRI scans. It has been designed to overcome the heterogeneity in MRI readouts. The authors mainly capture measured data coming from T1, T2 and tissue, as well as other factors such as temperature. The MRIO ontology focuses mainly on two MRI representations, namely MRI simulators and DICOM images, and conceptualizes the terms that can be observed using these scanned images [127,128].

NeuroLog
The NeuroLog consortium was established in the year 2006 for sharing and reusing data and tools for neuroimaging studies. This software architecture aids in efficient integration of neuroimaging data and tools from various neuroimaging research centers. This consortium also takes charge of the autonomous data management from each center to maintain the confidentiality of the neuroimaging data. Furthermore, the usage of semantically annotated tools inbuilt in the system architecture provides better standardization of neuroimaging datasets and therefore offers better accessibility through the federated schema based ontology [129,130].

NeuroImage Feature Terminology (NIFT)
Although many ontologies have been established in the area of neuroimaging, there is still no terminology that facilitates systematic representation and retrieval of measured indices with high relevance for neurodegenerative diseases. The existing ontologies represent what the imaging scan captures, but they do not contain concepts that link imaging readouts to disease pathology. Motivated by the apparent need for such a terminology, we have developed NIFT, the "neuro-image feature terminology". NIFT represents a wide spectrum of terms linked to radiological and neuropsychological indices as well as measured indices highly relevant to neurodegenerative diseases (e.g. AD and PD) [131]. The NIFT terminology comprises highly generic concepts describing common neuroimaging features, but at the same time it is very specific and represents disease-centric pathological measures used in imaging scans in the domain of Alzheimer's disease and Parkinsonism. NIFT can act as a resource for capturing molecular as well as clinical readouts, which is crucial for bridging these two domains and for retrieving relevant documents that can be used further in multi-layered disease models. As such, NIFT is well suited to support the identification of novel mechanisms underlying the etiology of AD and PD.

Retrieval of relevant publications using the NIFT terminology
The main purpose of developing ontologies and terminologies is to retrieve relevant publications and automatically extract relevant information from the literature. To enable specific retrieval and information extraction in the imaging domain, we integrated the NIFT terminology into our in-house text-mining system SCAIView [132,133]. SCAIView was developed at Fraunhofer SCAI to enable biologists and clinical researchers to perform semantic search and information extraction from the scientific literature. A free version of this literature mining environment, SCAIView academia, provides access to semantically annotated PubMed abstracts. For PubMed Central (PMC) full-text publications, SCAIView allows a full-text search as well. With NIFT integrated, we used the system to systematically retrieve relevant documents containing useful information on imaging readouts linked to molecular entities. The resulting literature corpus was then used for mechanistic modeling purposes.

Mechanistic modeling of neuroimaging indices
We wanted to understand the significance of measured indices obtained from imaging techniques and their association with clinical tests in order to improve the prediction of an underlying neurodegenerative disease, in this case AD. For this, we performed an optimized search query using our literature-mining environment SCAIView.

We used the query "((((([Neuroimaging Feature]) AND [MeSH Disease: "Alzheimer Disease"]) AND [Alzheimer Ontology Node: "Evaluation"]) AND [BRCO]) AND [PTS]) AND [Organism: "Homo sapiens"]" to retrieve relevant publications that comprise disease-specific terms, brain region and cell-type information (BRCO) and pathway mentions (PTS). The Alzheimer Ontology (ADO) concept "evaluation" provides a wide spectrum of entities that describe various clinical tests that are significant for diagnosing AD. Once the articles were retrieved, we modeled their content in order to identify the underlying molecular mechanisms.
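The nested structure of such a query can be assembled programmatically. The following Python sketch only builds the query string, left-nesting each additional filter with AND in the way the closing parentheses of the query above suggest. The helper names `field` and `build_query` are hypothetical, and nothing here calls SCAIView itself, whose programmatic interface is not described in this article.

```python
def field(name, value=None):
    """Format one filter as [Name] or [Name: "value"] (hypothetical helper)."""
    return f"[{name}]" if value is None else f'[{name}: "{value}"]'


def build_query(filters):
    """Left-nest the filters with AND: ((A) AND B) AND C ..."""
    if not filters:
        raise ValueError("at least one filter is required")
    query = filters[0]
    for term in filters[1:]:
        # Wrap everything built so far in parentheses before adding the next filter.
        query = f"({query}) AND {term}"
    return query


FILTERS = [
    field("Neuroimaging Feature"),
    field("MeSH Disease", "Alzheimer Disease"),
    field("Alzheimer Ontology Node", "Evaluation"),
    field("BRCO"),
    field("PTS"),
    field("Organism", "Homo sapiens"),
]

if __name__ == "__main__":
    print(build_query(FILTERS))
```

Keeping the filters in a list makes it easy to drop or swap a single constraint, for example the organism filter, and rebuild a balanced query.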

Mechanistic modeling of neuroimaging features with molecular pointers
One major motivation to develop the NIFT terminology was to support the generation of cause-and-effect models in the area of neurodegenerative diseases. With the integration of imaging features in cause-and-effect models, we hope to bridge between the molecular level (genome, pathways) and the macroscopic anatomical level of brain structures such as brain regions and the entire organ.
Using the query described above, we generated a literature corpus highly enriched for mentions of interesting imaging features together with interesting molecular processes. One of the resulting models that link imaging features to the molecular pathophysiology of AD deals with the influence of cerebral blood flow on cognitive impairment in AD. The overall workflow applied is shown in Figure 4.

NIFT application example: hypothetical model for linking high-level cerebral blood flow with molecular processes
The scientific community has long been interested in vascular biology, in which large and small blood vessels might play a role in AD progression [134,135]. Although clinical studies of AD patients provide substantial evidence that vascular lesions are a major factor in AD, the molecular mechanism behind this remains unexplained [136,137]. Therefore, we establish here our first hypothetical model that links high-level complex biology such as cerebral blood flow with molecular processes. This model is highly putative owing to the lack of experimental validation and of clinical resources to support the hypothesis.
AD is highly diverse and complex in terms of the cellular and molecular players that together result in the disease pathology. Apart from molecular deposits such as plaques and tangles, there is increasing evidence for a role of vascular abnormalities in AD pathology, so much so that these co-morbid conditions are classified under the term "vascular dementia" [138][139][140]. The links between vascular lesions and cognitive impairment are based on observations that have been captured using advanced neuroimaging techniques such as SPECT [141,142]. With radioisotopic tracers, depletion of blood flow can be traced through reduced glucose consumption in a particular brain region [143,144]. Apart from SPECT, MRI-based techniques are continually refined to detect early neoplasms and altered blood flow at high resolution [145,146].

Hypothetical mechanism for cerebral blood flow in AD
SIRT1 stands for "Silent Information Regulator 2 homolog 1". In general, the role of sirtuins is to maintain cellular functions and promote the longevity of cells in humans as well as in model organisms [147,148]. Sirtuins have been reported to protect the brain from infarction by regulating the blood flow to all parts of the brain, especially the cerebral region [149][150][151]. Under normal conditions, SIRT1 has been reported to play a protective role by enhancing the non-amyloidogenic cleavage of amyloid precursor protein (APP) through NF-κB inhibition. The inhibition of NF-κB contributes to the clearance of amyloid plaques from the brain [152,153]. In AD, however, SIRT1 is reported to be underexpressed, which in turn promotes the accumulation of amyloid beta in the cerebral cortex through NF-κB activation. The accumulation of APP in the cerebral region could further deplete nutrients such as oxygen from the blood, resulting in the inhibition of cerebral blood flow. A lack of oxygen and other nutrients in the brain could cause various mental and psychiatric abnormalities and lead to cognitive impairment [154,155].
We also hypothesize that the overexpression of SIRT1 co-activates a regulator that drives the transcription of ADAM10 [156][157][158]. This could trigger ADAM10 to partially compete with the gamma-secretase for the APP fragment, resulting in the activation of the Notch signaling pathway, which is well known for its role in neuronal repair [159][160][161]. In AD, however, the ADAM10 mutants Q170H and R181G do not compete with alpha-secretase; the beta-secretases therefore accumulate in the brain, resulting in impaired cerebral blood flow [162][163][164].
Another plausible mechanism of reduced cerebral blood flow involves APOE activity. Increased expression of APOE facilitates the molecular interaction between amyloid beta and butyrylcholinesterase (BCHE), which results in the formation of a BCHE-Abeta-APOE (BaβA) complex [165][166][167]. This complex alters the structure of BCHE, which accelerates the catalytic activity of the enzyme and results in the formation of amyloid plaques [168][169][170], as seen in Figure 5. Increased expression of APOE also disrupts neuronal activity in the hippocampus, resulting in atrophy. Hippocampal atrophy is also one of the causative factors of cognitive decline in AD [171][172][173].
Apart from the well-known AD genes, PICALM has recently been emerging as a potential AD candidate. PICALM plays a crucial role in the intracellular trafficking of endothelial proteins, resulting in endocytosis. The protective allele of PICALM, rs3851179, facilitates amyloid beta clearance through endocytosis [174][175][176]. LRP1 is another crucial protein, whose major function is cholesterol transport and transcytosis of various molecules, including amyloid beta, across the blood-brain barrier (BBB) [177][178][179]. As PICALM plays a major role in the internalization of endothelial proteins, it also internalizes the sLRP1 and amyloid-beta complex by trafficking through two other proteins, Rab5 and Rab11. This in turn results in amyloid transcytosis and clearance across the BBB [180][181][182]. LRP1 also activates GLUT1, a major glucose transporter across the BBB [183][184][185]. Under normal conditions, there is a free flow of glucose and other nutrients across the BBB. In AD, however, GLUT1 function is altered by Gly286Asp, resulting in inhibition of glucose metabolism [186][187][188].
Here, we have presented a hypothetical mechanism around cerebral blood flow in AD. We call this model "putative" and "hypothetical" because its individual steps lack causal proof and substantial experimental validation. The overall workflow of the altered regulation of cerebral blood flow can be seen in Figure 5.
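To make the idea of a cause-and-effect chain concrete, the SIRT1 branch of the hypothesis can be written as a small signed graph. The Python sketch below is an illustration under simplifying assumptions and not the modeling framework used for Figure 5: the node labels are hypothetical shorthand, each edge is reduced to "increases" (+1) or "decreases" (-1), and the net effect along a path is taken to be the product of the edge signs.

```python
# Signed cause-and-effect edges transcribed from the SIRT1 branch of the text.
# +1 = "increases", -1 = "decreases". Node names are illustrative labels only.
EDGES = {
    ("SIRT1", "NF-kB activity"): -1,
    ("NF-kB activity", "amyloid-beta accumulation"): +1,
    ("amyloid-beta accumulation", "cerebral blood flow"): -1,
    ("cerebral blood flow", "cognitive impairment"): -1,
}


def find_path(edges, source, target, visited=None):
    """Depth-first search; returns the first path found as a list of nodes, or None."""
    if source == target:
        return [source]
    visited = (visited or set()) | {source}
    for upstream, downstream in edges:
        if upstream == source and downstream not in visited:
            rest = find_path(edges, downstream, target, visited)
            if rest is not None:
                return [source] + rest
    return None


def net_effect(edges, source, target):
    """Multiply edge signs along the path; None if the target is unreachable."""
    path = find_path(edges, source, target)
    if path is None:
        return None
    sign = 1
    for a, b in zip(path, path[1:]):
        sign *= edges[(a, b)]
    return sign


if __name__ == "__main__":
    for target in ("cerebral blood flow", "cognitive impairment"):
        effect = net_effect(EDGES, "SIRT1", target)
        print("SIRT1 ->", target, ":", "increases" if effect > 0 else "decreases")
```

Under these assumptions the chain reproduces the qualitative claim of the text: SIRT1 activity supports cerebral blood flow and counteracts cognitive impairment, so its underexpression points in the opposite direction.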