Micro RNA’s (miRNA’s) may help explain expression of multiple genes in Alzheimer’s Frontal Cortex

MicroRNA’s (miRNA’s) are non-coding RNA’s that can regulate gene expression. miRNA’s are small (~22 nt) and can bind by complementation to mRNA’s resulting in direct degradation and/or interference with translation. We used paired-end RNAseq and small RNAseq to examine expression of mRNA’s and miRNA’s, respectively, in total RNA’s isolated from frontal cortex of Alzheimer’s disease (AD) and control (CTL) subjects. mRNA’s were aligned to the hg38 human genome using Tophat2/Bowtie2 and relative expression levels determined by Cufflinks as FPKM (fragments per kilobase of exon per million mapped reads). miRNA’s were aligned to the hg19 human genome and quantitated as number of mature known miRNA reads with miRDeep*. Using a false-discovery rate (FDR) of 10% (FDR ≤ 0.10) applied to 11,794 genes with FPKM > 2.0, we identified 55 genes upregulated in AD and 191 genes downregulated in AD. The top 145 miRNA’s with the highest mean expression were mostly under-expressed in AD (132/145 miRNAs had AD/CTL < 1.0)). AD samples had increased coefficients of variation for miRNA and mRNA expression, implying greater heterogeneity. miRTAR analysis showed that 32 (58%) of the mRNA genes upregulated in AD could be controlled by one or more of 60 miRNA’s under-expressed >1.5-fold in AD (AD/CTL < 0.67). AD frontal cortex gene expression analysis showed extensive regulation by miRNAs that appear to interact with the majority of overexpressed protein coding genes. The temporal sequence of appearance in AD brain and etiologies of this extensive, heterogeneous regulation by miRNA’s are unknown. (243 words) Correspondence to: James P. Bennett, Jr. M.D., Ph.D, Neurodegeneration Therapeutics, Inc., 3050A Berkmar Drive, Charlottesville, VA 22901-3450, Tel: 434-529-6457, Fax: 434-529-6458, E-mail: aging.mitochondria@gmail.com


Introduction
Alzheimer's disease (AD) most often occurs sporadically (>95%) and is the leading cause of dementia, which, according to Alzheimer's Disease International (World Alzheimer Report 2010, "The Global Economic Impact of Dementia"), results in costs that represent the world's 18 th largest economy and is highest per capita in North American countries. It is estimated that the United States currently spends ~220 billion dollars per year for dementia care, most of which is due to AD. AD progresses clinically through loss of one or more domains of cognition, known as mild cognitive impairment (MCI) [1][2][3]. Individuals with MCI have ~50% chance of progressing into AD dementia over 3-5 years and demonstrate impaired cerebral cortical glucose metabolism [3][4][5][6][7][8][9][10][11] and increased oxidative stress damage products [12]. These findings suggest an early impairment of bioenergetic function potentially arising from oxidative stress damage.
Although advanced AD brains demonstrate accumulation of beta amyloid peptide in extracellular "plaques", and hyper-phosphorylated intraneuronal tau seen as "neurofibrillary tangles", the hypotheses advancing causality of either of these proteins in soluble or aggregated forms for genesis or progression of AD dementia remain controversial [13]. Strategies of reducing beta amyloid protein levels, either by antibody clearance or synthesis inhibition, have to date not altered dementia progression [14][15][16][17][18][19]. These results have stimulated some to question the "amyloid cascade hypothesis" of AD pathogenesis, but the interpretations of these negative clinical research results remain controversial [13,18,19].
Improvements in sequencing technologies have allowed development of "next-generation sequencing" (NGS), which in an unbiased manner sequences fragments of DNA or cDNA's. Alignment algorithms available in the public domain align these fragments to genomes, allowing quantitation of coding gene cDNA abundance, gene mutations, exon-intron boundaries, among many possibilities. The more recent development of technologies that allow small segment sequencing has allowed exploration of non-coding RNA's, that include micro RNA's (miRNA's), small RNA's whose genes exist on the intergenic DNA formerly referred to as "junk DNA", but now recognized to contain coding sequences for a variety of small RNA's whose functions are being explored. miRNA's are now considered to be part of the gene expression regulatory system, along with histone post-translational modifications and DNA methylation. miRNA's, of which there are thousands currently known in the human genome, promiscuously regulate expression of many genes. A given miRNA can control expression of dozens of mRNA's, by both binding tightly to the untranslated regions (UTR's) of a gene, inducing degradation of that mRNA, or by interfering with the translation of the mRNA, reducing the resulting protein content. Likewise, a given mRNA can have many miRNA's that regulate it with varying degrees of specificity.
Thus, modern NGS studies of gene expression can now include identification and quantitation of miRNA's as regulators of gene expression. With this approach, a more comprehensive picture of how gene expression changes are regulated can emerge.

Brain samples
Brain samples were collected over multiple years by the University of Virginia Brain Resource Facility. Individuals, who died with antemortem diagnoses of Alzheimer's dementia, after provision of informed consent by next of kin, underwent autopsy with brain removal. The brain was divided sagittally and half was immersed in formalin for histopathological diagnosis. The remaining half was sectioned coronally. From these sections, representative areas were dissected and rapidly frozen in liquid nitrogen-chilled 2-methlybutane. Both flash frozen blocks and coronal sections were stored at -80 degrees. For the present study, we restricted ourselves to samples that were autopsied and frozen within 12 hours of death. All brains were examined by a neuropathologist to confirm diagnosis of Alzheimer's Disease (AD) or normal adult brain (CTL).

Total RNA isolation
For this study, 25, 20 um cryostat sections of approximately 1 by 1.5 cm blocks of frontal cortex grey matter were collected and immediately placed in Qiazol. Total RNA was isolated according to Qiagen miRNeasy instructions with sonication of the tissue in Qiazol, on-column DNase digestion and an additional RPE buffer wash. RNA quality was assessed on an Agilent Bioanalyzer  capillary gel electrophoresis system, and RNA was quantitated using a Nanodrop spectrophotometer. Total RNA samples were sent for RNA sequencing (CoFactor Genomics). CoFactor carried out the library preparation, quantitation and Illumina sequencing (see below for details).

RNA sequencing
Ribo-depletion, NGS library preparation: Total RNA isolated by us was processed for library construction by Cofactor Genomics (http:// cofactorgenomics.com, St. Louis, MO) according to the following procedure: Briefly, species specific rRNA-probes were hybridized to total RNA in order to remove nuclear-encoded and mitochondrial contaminating ribosomal RNA from the sample. The resulting ribodepleted RNA was then fragmented. First-strand cDNA synthesis was performed using reverse transcriptase and random primers in the presence of Actinomycin D, followed by second-strand cDNA synthesis with DNA polymerase I and RNase H. Double-stranded cDNA was end-repaired and A-tailed for subsequent adaptor ligation. Indexed adaptors were ligated to the A-tailed cDNA. Enrichment by PCR was performed to generate the final cDNA sequencing library. Libraries were sequenced as paired-end 150 base pair reads on a NextSeq500 (Illumina, San Diego, CA) following the manufacturer's protocols.
Small RNA size selection, NGS library preparation: Total RNA was processed for library construction by Cofactor Genomics (http:// cofactorgenomics.com, St. Louis, MO) according to the following procedure: Briefly, adapters were ligated to the 5' and 3' ends of the RNA, allowing cDNA to be generated with a reverse transcriptase. Following a dual-SPRI size selection with Ampure XP beads, PCR was used to add indexes and amplify adapter-ligated fragments. A final size selection was performed using a Pippin HT instrument (Sage Science, Beverly, MA) to enrich for miRNA inserts. Libraries were sequenced as single-end 75 base pair reads on a NextSeq500 (Illumina, San Diego, CA) following the manufacturer's protocols.

Paired-end mRNA libraries
We utilized a format similar to that described in our earlier work [20]. Briefly, compressed, paired-end sequencing files in FastQ format for each mRNA sample were downloaded, decompressed and examined in FastQC. They were then trimmed of Illumina sequencing adapters using Trimmomatic with Phred scores of 20 (99% minimum sequencing accuracy) and re-examined in FastQC to be certain all adapters had been removed. The trimmed sequences were aligned against the hg38 version of the human genome using Tophat2/ Bowtie2, then quantitated against the hg38 human transcriptome using Cufflinks. Mean FPKM (fragments per kilobase of exon per million reads) were used for estimates of relative gene expression.

microRNA's
Single-end sequencing files were aligned against the hg19 version of the human genome using miRDeep* [21]. Total reads for known miRNA's were manually aligned for each CTL and AD sample. miRNA target prediction of the miRNA expression data was carried out using miRTAR [22] with the default thermodynamic settings for miRNA-mRNA interactions. The gene lists studied included all miRNA's where the mean AD/mean CTL ratio was <0.67 (ie, the miRNA's were under-expressed in AD samples > 1.5-fold), and the mRNA genes were significantly (FDR < 10%) overexpressed. miRTAR identified 54 (58%) of the significantly overexpressed mRNA's in AD samples that could be regulated by 16 miRNA's underexpressed in AD brain.

qPCR
Primers for qPCR assay of gene expression were designed using BLAST Primer Design (https://www.ncbi.nlm.nih.gov/tools/primerblast/). Primer pairs (n=12) for qbasePLUS/GeNorm  determination of the optimum reference ("housekeeping") genes to use for data normalization were obtained from Primer Design (http://www. primerdesign.co.uk/products/9460-genorm-kits) and used according to manufacturer instructions. qPCR was carried out in 96-well PCR plates using a BioRad CFX thermocycler and SybrGreen detection of DNA pairs. The GeNorm algorithm in qbasePLUS (v 3.1) was used to determine the least variable reference genes (CYC1, ATP5A1, ACTB) used to normalize gene expression results in qbasePLUS.

Statistics
All statistical processing used Prism v 7.0a (GraphPad) mRNA gene expressions were analyzed by multiple t-tests with two-stage linear step-up procedure of Benjamini, Krieger and Yekutieli to determine False Discovery Rates (FDR) [23]. We report genes that were expressed with FDR < 10% performed by Prism software. Otherwise, 2-way ANOVA's were used to determine significance (P<0.05).

Results
Supplemental Table 1 shows all FPKM values for mRNA's identified in frontal cortex post-mortem samples by paired-end RNAseq from Tophat2/Bowtie2 alignment against the hg38 human genome and Cufflinks quantitation against the hg38 human transcriptome, for coding genes with mean FPKM>2.0, after removal of duplicate samples (n=11,795 total). We studied RNA's isolated from frontal cortex samples harvested from 10 persons with AD and 9 CTL individuals, all collected under informed consent and examined by a neuropathologist. We used the Qiagen miRNeasy  system to isolate total RNA, which includes small RNA's. Table 1 lists the demographics on the brain samples and their RNA quality as assayed with the Agilent Bioanalyzer  system. There was no difference between the two groups (AD, CTL) in terms of mean RNA quality numbers. (4.7) The AD population was slightly older (but not significantly so) at death (AD = 78 +/-3 years; CTL = 71 +/-4 years; p = 0.19 by t-test). Coding gene RNAseq was carried out as described in our earlier work [20], which provides links to all software algorithms and Unix texts used. Briefly, total RNA samples were treated to remove ribosomal RNA's (Ribo-Free, Illumina) then Illumina multiplex sequencing oligonucleotides were attached. The libraries were paired-end sequenced on an Illumina (San Diego, CA) NextSeq 500 at ~ 60 million base depths. The resulting files in FastQ format were decompressed and examined with FastQC before and after trimming, which was carried out with Trimmomatic and minimum Phred scores of 20 (99% sequencing accuracy). Subsequent bioinformatics pipeline involved alignment against the hg38 version of the human genome with Tophat2/Bowtie2 and quantitation of expression against the hg38 human transcriptome using Cufflinks. Table 2 lists the genes identified with FDR (q values) of 10% or less as over-expressed or under-expressed in the AD brain samples, along with their AD/CTL expression ratios determined from the mean FPKM values.
To create Table 3, we used qPCR to assay all AD and CTL cDNA's for the 4 genes yielding the highest AD/CTL ratios (HSPB1, RBM14, RSAD1, GSN) and the 4 genes yielding the lowest AD/CTL ratios (TMEFF1, DPH2, GNG4, CYP26B1) (see Table 2). After normalizing each sample to the geometric mean of the 3 reference genes found to be least variable by GeNorm, we found with qbasePLUS (Table 3) that 7 out of 8 genes had expression levels qualitatively similar (increased or decreased in AD) by qPCR compared to RNA-seq. The one exception was CYP26B1 that was slightly overexpressed in AD by qPCR but underexpressed by RNA-seq.
Small RNA sequencing was carried out on each total RNA sample that was also sequenced for mRNA genes as above. Sequencing files were first examined for residual sequencing library adapters and then restricted with Trimmomatic to Phred scores of 30 (sequencing accuracy > 99.9%), a "sliding window" of 4 bases, each requiring a minimum Phred score of 30, and a maximum length of 24 bases. In this manner, we wished to restrict the small miRNA's identified to those of an appropriate size and very high sequencing accuracy.
The resulting files were processed individually using miRDeep* [21] alignment (operating in Java  mode) against the hg19 human genome. Known miRNA's from each of the CTL and AD samples were manually curated to align expression (as number of reads of each mature miRNA) across both groups. From this Table (Supplemental  Table 2) the mean, SD, coefficient of variance (CV) and mean AD/ mean CTL ratios were calculated. The minimum number of samples for which a miRNA was present was set to be 6 for the CTL samples and 7 for the AD samples. In Supplemental Table 2, the mean number of samples for detected miRNA's was 8.7 for CTL and 9.5 for AD. Figure 1 shows the results of miRNA quantitation, with results expressed as % of mean CTL values +/-SEM. 132 out of 145 miRNA's were under-expressed in AD samples. (AD/CTL < 100% mean CTL) miRTAR [22] was then used to assign interactions of each miRNA most under-expressed in the AD samples (mean AD/mean CTL < 0.67) with the mRNA genes over-expressed in the AD samples with FDR<10%. Default miRTAR free energy values of miRNA binding were used, and interactions were sought with only the 3' UTR of each gene. Table 4 shows the results of the miRTAR analysis of the interactions among miRNA's (under-expressed in AD) and mRNA's (overexpressed in AD) identified in the RNA seq studies. We identified 32 mRNA's (out of 55 over-expressed in AD, 58%) that could be regulated by 60 miRNA's underexpressed in AD, using the conservative thermodynamic default values of miRNA-mRNA interactions in miRTAR.
Of the 32 mRNA's whose expressions were potentially regulated by miRNA's as identified by miRTAR, we found that 7 (BST2, CEP250, GEMIN4, ITGAV, KLH21, LAMB2, SMG5) could be regulated by two miRNA's; 3 (ANKS6, DYRK1A, MAP3K3) could be regulated by 3 miRNA's; 4 (FLCN, IFFO2, MICAL3, PLXNB1) could be regulated by 4 miRNA's; and 1 (TMEM109) could be regulated by 5 miRNA's. We consider these 15 mRNA genes to be the most regulated by the miRNA's we could identify in our AD brain samples. Figure 2 shows the differences in coefficients of variation (CV) as a function of mean expression levels for the mRNA (bottom) and miRNA (top) genes identified in the AD and CTL populations. There was no apparent relationship among CV's and mRNA expression levels. The CV's of the AD samples for both mRNA and miRNA were significantly elevated compared to CTL, suggesting greater heterogeneity in the AD samples.

Discussion
There are at least three major findings of our study: First, we carried out RNA-seq of both coding genes and regulatory small RNA's (miRNA's) on total RNA's isolated from frontal cortex brain samples of an AD cohort (n=10) and a CTL, disease-free, cohort (n=9). Using false-discovery rate correction for multiple samples (FDR < 10%), we found significant up-regulation of 55 genes and down-regulation of 191 genes in the AD population. Using DAVID analyses of the significantly over-and under-expressed genes in our AD samples (not shown), we could not determine any gene ontology groups or pathways that were significantly represented (Benjaminicorrected p values < 0.05). This speaks to the widespread heterogeneity of molecular genetic changes observed in our AD samples.
Second, our miRNA analysis with miRTAR suggests that ~58% of the AD upregulated genes could be regulated by miRNA's that were under-expressed <1.5-fold in the AD samples. This result implies that miRNA regulation potentially plays a significant role in gene expression changes in AD. Because miRNA's can regulate multiple coding genes at both the mRNA and protein translation levels, AD brain, at least at the advanced stages, appears to experience multiple and wide-ranging molecular genetic abnormalities likely derived from alterations in miRNA regulation.
The third major finding is the increased heterogeneity (AD>CTL) in the brain samples for expression of both mRNA's and miRNA's. This result suggests that no single molecular cause for AD is likely to  Gene expression was assayed by RNA-seq as described in Methods. After removal of duplicates, all AD and CTL frontal cortex genes with mean FPKM > 2.0 (n=11,795) were subjected to multiple t-test comparison with FDR correction < 10% (q < 0.10) applied for multiple comparisons.   Table 3. Comparison of mean AD/CTL gene expression ratios between qPCR and RNA-seq be uncovered, and that multiple molecular genetic abnormalities may exist among individuals that result in shared clinical phenotypes over time. Identifying molecular genetic subgroups may allow protective therapies to be "personalized".
There are several important caveats about our study that potentially limit interpretation: First, we were restricted to use of post-mortem tissues from persons who died with advanced disease. Our findings may represent "survival" strategies of neurons and glia in these samples. One cannot assume that we have identified pathogenic patterns of gene expression or miRNA depletion.
Second, we studied a limited number of samples. The current high costs of RNA-seq limits studies in terms of the total population size examined. Since AD afflicts > 5 million persons in the US alone, extrapolation of our findings from a limited cohort to this very large population is risky.
Third, our RNA quality was not ideal, compared to total RNA quality isolated from quickly processed human biopsies, animal tissues or cells. This is an inevitable limitation of post-mortem human brain samples; although some improvements in RNA quality have been reported with "rapid" autopsy/freezing procedures. Because our AD and CTL samples did not differ in overall RNA quality, it is unlikely that our results can be explained on the basis of variable RNA degradations. However, that consideration must be kept in mind.
Fourth, we cannot speculate as to the temporal sequence of appearance of the abnormalities we found. If these abnormalities appear early in clinical disease evolution, then the earliest cognitive changes may signal marked disruption in the AD brain of molecular genetic harmony. If true, this suggests that disease-altering treatments should begin very early, prior to clinical deficits. This would require some type of risk stratification, similar to current approaches to vascular disease and lowering of blood cholesterol levels, as an example.
An alternative explanation is that we are describing late-appearing events. In this case, stratification of therapies in different molecular subgroups may identify specific populations that experience greater or lesser response to a given therapy at the time of symptom onset. In this scenario, therapies might be "personalized" based on molecular genetic profiles. It is presently unclear if this approach would provide a meaningful alteration to disease-modifying therapies for AD that to date have not succeded. However, the degree of increased molecular genetic heterogeneity we observed in the AD population supports exploration of subpopulations in the AD community that could form the basis of stratifying future clinical trials.
Fifth, we did not carry out any miRNA expression studies in cell models to seek confirmation of mRNA changes. It would appear meaningful to overexpress the miRNA's that modulated expression of multiple mRNA's. As an example, five miRNA's (mir-134 (AD/CTL   = 0.541), mir-2110 (AD/CTL=0.557), mir-15a (AD/CTL=0.604), mir-1301 (AD/CTL=0.666), and mir-214 (AD/CTL=0.599)) are predicted by miRTAR to have significant thermodynamic interactions with TMEM109 mRNA. TMEM109, a gene coding for a transmembrane protein felt to yield a voltage-dependent ion channel and to mediate UV light-induced DNA damage, is one of the genes significantly (FDR<10%) overexpressed in our AD frontal cortex samples, consistent with the 5 indicated miRNA's all being underexpressed in our AD brain samples.
Sixth, we used total tissue homogenates for RNA isolation. As a result, our findings represent a combination of RNA's from multiple cell types, such as neurons (5-10% of cells), astrocytes (~80-85% of cells) and unknown but minor amounts of microglia, oligodendroglia and endothelial cells. We cannot assign the changes we observed to any particular cell type. This would require microdissection of specific cell types followed by RNA-seq.
There are multiple miRNA studies of AD brain tissues that use different assay methods, such as microarrays, qPCR and small RNAseq (for a recent review, see [24]). Because miRNA's are promiscuous in terms of the mRNA genes they can regulate, the varieties of AD tissues that have been examined, and the varying techniques to quantitate miRNA's (ie, qPCR, microarray, small RNA-seq), there is presently no "consensus" regarding the deregulation of specific miRNA's critical to AD pathophysiology [24]. Our approach, combining traditional RNA-seq of gene cDNA's and non-coding small RNA-seq from the same total RNA samples, combined with predictive bioinformatics tools, yields interactions among mRNA's and miRNA's that may be relevant to AD brain.
A particular miRNA consistently felt to be critical to AD pathogenesis is miR-132 [25], which has been found to be extensively down-regulated. We also found expression of miR-132 to be downregulated in our samples, with an AD/CTL ratio of 0.389. (miR-132 is highlighted in Supplemental Table 2).
The nature of our study is primarily descriptive rather than mechanistic. For instance, we showed (see above) that multiple miRNA's we found were under-expressed in AD cortex samples (thus relatively over-expressed in CTL samples), were predicted thermodynamically to interact with a single mRNA (TMEM109). Thus any or all of these miRNA's could bind to and lead to the degradation of TMEM109 mRNA in CTL brain samples. This would constitute a pre-translational regulation of gene expression, but it does not inform us as to possible translational mechanisms of controlling TMEM109 protein levels that these miRNA's are also capable of.
Our study has multiple caveats and limitations (discussed above), but there are several predictions that are experimentally testable in human iPSC-derived neural cells. Testing these predictions would be important to define mechanistic interpretations of our findings relevant to AD pathogenesis. One prediction would be that expression of genes both overexpressed in AD brain and potentially regulated by several miRNA's underexpressed in AD brain (such as TMEM109 that is potentially regulated by 5 miRNA's) would adversely affect iPSCderived neural stem cells derived from disease-free (CTL) persons. We are presently carrying out that experiment. Other genes (overexpressedin-AD brains) potentially regulated by multiple miRNA's could be tested singly or in combination. Should these overexpressions reveal quantitative effects, conversion to a high-throughput format would allow screening of potential disease-altering medications.
Our findings support the concept that AD arises from dysfunction of multiple molecular systems and is heterogeneous across individuals in terms of which molecular systems have greater or lesser degrees of dysfunction. A therapeutic strategy for mild cognitive impairment and early AD has been devised, which attempts to improve simultaneously multiple nutritional, endocrinological and mitochondrial dysfunctions. The MEND protocol (Metabolic Enhancement for NeuroDegeneration) has achieved some success in subjects with early dementia [26]. One potential refinement of such a protocol would be to examine how the MEND approach alters gene and miRNA expression in peripheral cells, or perhaps neural membrane exosomes [27].
In summary, we found using RNA-seq and small RNA-seq on the same RNA samples that in AD frontal cortex expression of multiple genes (mRNA) was significantly up or down-regulated. ~58% of the up regulated genes could derive from effects of miRNA's that were under-expressed in AD brains. Because we did not examine the results of overexpressing individual miRNA's in cells, we can provide only correlational changes and cannot speak to potential causation of the miRNA changes we observed.
We hope to stimulate interest in the problems of molecular heterogeneity in a complex brain disease and therapeutic stratification for future clinical studies. In addition, if AD (at least, AD at end stage, which is what our post-mortem samples represent) includes significant miRNA control of gene expression, what factors mediate those changes? Is miRNA expression repressed in AD brain in response to "environmental" changes or other stresses? That such regulations appear to be present in AD support the hypothesis that variables of living can affect AD risk heterogeneously across individuals and that sporadically occurring AD should no longer be regarded as a monolithic illness with a common origin (or a common treatment).
Finally, our approach demonstrated the advantage of performing RNAseq for both mRNA's and miRNA's on the same RNA samples. By using this approach we were able to identify a core set of regulated mRNA's in AD that may have their expression increased by loss of specific miRNA's. Whether therapeutic strategies directed at multiple molecular systems, such as the MEND protocol that appears to improve cognitive function [26], alter expressions of miRNA's or mRNA's remains to be tested.