MicroRNAs as molecular markers for colon cancer. Diagnostic screening in stool & blood

Screening for colon cancer (CC) allows diagnosis of the malignancy at an early stage and potentially reduces mortality, as the cancer can be cured at its earliest stages. Early detection is desirable if accurate, practical and cost-effective diagnostic measures for this cancer are available. Mortality and morbidity from colon cancer represent a major health problem involving a malignant disease that is theoretically preventable through screening. Current screening methods either lack sensitivity and require dietary restriction, which impedes compliance and use (e.g., the immunological fecal occult blood test, FOBTi, obtained from patients’ medical records); are costly, which decreases compliance (e.g., colonoscopy); or could lead to mortality. Compared with the FOBT test, a sensitive noninvasive screen with no requirement for dietary restriction would be more convenient. Colorectal cancer (CRC) is the only cancer for which screening by colonoscopy is recommended. Although colonoscopy is a reliable screening tool, its invasive nature, accompanying abdominal pain, potential complications and high cost have hampered the application of this screening method worldwide. A novel screening approach using the stable miRNA molecules, which are relatively nondegradable when extracted from noninvasive stool and semi-invasive blood samples by currently available commercial kits and manipulated thereafter, would be preferable to a transcriptomic messenger (m)RNA-, a mutation DNA-, an epigenetic- or a proteomic-based test. The approach utilizes reverse transcriptase (RT), followed by a modified quantitative real-time polymerase chain reaction (qPCR).
Although exosomal RNA would not be measured when using a restricted extraction of total RNA from stool or blood, a parallel test could also be carried out on RNA obtained from stool or plasma samples, and appropriate corrections for exosomal loss can be made to obtain accurate and quantitative results. Eventually, a chip would be developed to facilitate diagnosis, as has been carried out for the quantification of genetically modified organisms (GMOs) in foods. The gold standard to which the molecular miRNA test is compared is colonoscopy. If performance criteria are met, as detailed herein, then a miRNA test in human stool or blood samples based on high throughput automated technologies and quantitative expression measurements (commonly used in the diagnostic clinical laboratory) would eventually be advanced to the clinical setting, which would have a marked impact on the prevention of colon cancer.

Correspondence to: Farid E. Ahmed, GEM Tox Labs, Institute for Research in Biotechnology, 2905 South Memorial Drive, Greenville, NC 27834, USA, Tel: +1 (252) 375-9656, Fax:+1 (252) 7561656, E-mail: gemtoxconsultants@yahoo.com


Introduction
Colon cancer is a disease that is different from rectal cancer [1]. Epidemiologic evidence suggests that colon cancers (CCs) and rectal cancers (RCs) differ in their morbidities and etiologies. RC is more common in China where it accounts for over 50% of CRC, compared with < 30% in Western countries. Data from Peking Union Medical College Hospital, China indicated that colon and rectal cancers accounted for 55.6% and 44.4% of CRC, respectively, during the years 1989 through 2008, and are more prevalent in younger Asian individuals [2]. In contrast, CC was shown to account for over 60% of CRC cases in the USA and Europe, and is related to fatty foods, less exercise and a Caucasian ethnic origin [3][4][5][6], which suggests differences in carcinogenesis between CC and RC. Several structural and molecular studies have indicated differences in etiology, clinical manifestation, pathological features and genetic abnormalities between CC and RC [7][8][9]. The proximal colon, distal colon and rectum have different embryological origins. Some studies found that tumor suppressor genes,

Current methods for colon cancer screening
There are different tests to screen for colon cancer, which fall into two broad categories [18]: a) In vivo tests, which detect both polyps and cancer and look at the structure of the colon to find any abnormalities. This is carried out either with an x-ray after ingesting a contrast liquid or by inserting a scope into the rectum (flexible sigmoidoscopy, capsule endoscopy, double contrast barium enema), or with tests that employ special x-ray imaging, such as CT colonography (virtual colonoscopy). Although invasive, these tests allow for the removal of polyps when seen, and therefore play a role in colon cancer prevention. b) In vitro tests, which look for signs that cancer may be present in either stool or blood, and generally examine the genetic material (DNA or RNA) in a noninvasive excrement (stool) or in a semi-invasive body fluid (blood); the aim is to develop tests with high sensitivity and specificity that can function as an acceptable screen for this preventable cancer (e.g., guaiac- and immunological-based FOBTs, and molecular DNA tests in either stool or blood). These tests are less invasive and easier to carry out, but many of them have low sensitivity for the detection of polyps unless they are further developed and refined [1,16-18-29]. Therefore, much effort and expense have been spent during the last 20 years to develop acceptable noninvasive tests [19][20][21][22][23][24][25][26][27][28]. All of these tests and others can also be used when people have symptoms of colon cancer or other digestive diseases, to check on the progression of anomalies. Table 1 compares the tests used to screen for colon cancer.
When recommended, screening usually starts with a fecal occult blood test (FOBT), which detects blood in feces that cannot be seen with the naked eye [17,19,20,23,29]. Many CRCs bleed into the intestinal lumen because blood vessels at the surface of large polyps or cancers are fragile and easily damaged by the passage of feces, releasing a small amount of blood into the stool, and FOBT can detect this otherwise invisible blood through a chemical reaction. The test, however, cannot indicate whether the blood is from the colon or from other parts of the digestive tract (e.g., stomach). Although cancers and polyps can cause blood in stool, other causes of bleeding are ulcers, hemorrhoids, diverticulosis (tiny pouches that form at weak spots in the colon wall), or inflammatory bowel diseases (IBDs; e.g., colitis) [25,30]. Moreover, as blood passes through the gastrointestinal (GI) tract, it becomes degraded, and depending upon the site at which the hemorrhage occurs, the blood products detected in stool by FOBT will vary. Therefore, FOBTs alone have a limited ability to decrease mortality. Despite aggressive screening attempts, 67-85% of colon cancer patients who underwent FOBT died from the disease, indicating that detection does not occur early enough to maximally affect the overall outcome of the disease; FOBT is therefore not a sensitive test, since it apparently misses many early stage cancers and adenomas. Moreover, guaiac FOBT tests require patients to change their diet before testing. Colonoscopy, based upon the same principles as sigmoidoscopy, allows visualization of the entire colon.
Although it is the "gold standard" for CRC screening for the 70 million people older than 50 years of age in the USA, it requires an unpleasant bowel preparation, and the test itself can be uncomfortable, although sedation usually helps. Some people could experience low blood pressure or changes in heart rhythm during the test due to the sedation, although these side effects are not serious. If polyps are removed or a biopsy is taken during the procedure, blood can be observed for a day or two after the test, and in rare cases when bleeding continues, it could require treatment [11]. The test could cost $10 billion per year and exceed the physician capacity to perform this procedure, requires cathartic preparation and sedating or anesthetizing the patient, and carries an increased risk of morbidity or mortality due to perforation of the GI tract [35]. Moreover, studies found that colonoscopy miss rates were 4.0% for right-sided colon cancer, 12-13% for adenomatous polyps 6-9 mm, and 0-6% for polyps ≥ 1 cm in diameter [36]. Clearly, a simple, inexpensive, noninvasive, sensitive and specific screening test is badly needed to identify people at risk for developing advanced adenomas (e.g., polyps ≥ 1 cm with high grade dysplasia) or CRC who would benefit from a subsequent colonoscopy examination.
Virtual colonoscopy (CT colonography) is an advanced type of computed tomography (CT or CAT) scan of both the colon and rectum. It involves examination of a computer-generated 3D presentation of the entire GI tract reconstructed from either computerized tomography (CT) or magnetic resonance imaging. This test does not require sedation, but it requires bowel preparation, the use of a tube placed in the rectum (as in barium enema) to fill the colon with air, and the drinking of a contrast solution before the test in order to tag any remaining stool in the colon or the rectum. The procedure takes about 10 minutes, and it is especially useful for people who do not want to undergo the more invasive colonoscopy test. This method detects lesions based on their site, rather than their histology, and is thus unable to distinguish a benign adenoma from an invasive carcinoma. A meta-analysis of 33 studies involving 6,393 patients showed that this test has a low sensitivity for polyps (48% for polyps < 6 mm, 70% for polyps 6-9 mm and 85% for polyps > 9 mm). Furthermore, the test is expensive and requires the availability of experts, which could reduce patients' compliance [37]. CT colonography is still considered an investigational alternative for asymptomatic individuals who are not at risk; it also exposes patients to a small amount of x-irradiation, and it can miss small lesions [38].
In an effort to find more pragmatic, noninvasive, biomarker-based methods for the early detection of colon cancer, investigators have developed many in vitro tests, such as epigenetic methylation marker changes in genes and chromosomal loci in fecal DNA [39], promoter DNA methylation in stool [40], mutated DNA markers found in neoplastic cells that are excreted in feces [41,42], a test for the minichromosomal maintenance proteins (MCMs) needed for DNA replication [17], proteomics-based approaches in stool or blood [43], transcriptomic mRNA-based approaches in stool or blood [26], and a combination of genetic and epigenetic tests [44].
Molecular studies have shown the presence of K-ras mutations in DNA from the stool of patients, but the drawbacks of this marker include its expression by fewer than half of large adenomas and carcinomas. In addition, its expression in non-neoplastic tissue makes it a less than optimal molecular marker. Besides, mutations are found in only a portion of the tumors, which lowers the sensitivity of the test [45]. For guaiac FOBT, patients must avoid nonsteroidal anti-inflammatory drugs (NSAIDs) such as ibuprofen (Advil), naproxen (Aleve) or aspirin (> 1 adult aspirin, 325 mg/day) for 7 days before testing, as they cause bleeding, although Tylenol ® can be taken as needed; they must also avoid vitamin C in excess of 250 mg/day from all sources, as well as red meats (beef, lamb or liver), for 3 days before testing, because components of blood in meat may give false positive results [1,18-20,29-31]. The procedure requires multiple tests to be repeated every year, potentially reducing compliance [23,32]. If the test finds blood, a colonoscopy will also be needed to look for the source (American Cancer Society, http://ww.cancer.org).
The fecal immunochemical test (FIT), also called immunochemical FOBT (iFOBT), is a more recent test than the traditional guaiac FOBT; it reacts to part of the human hemoglobin protein found in red blood cells. This test is easier to use than guaiac FOBT because it requires no drug or dietary restrictions, and it is less likely to react to bleeding from parts of the upper digestive tract (e.g., stomach) [18,30]. Because, like guaiac FOBT, the FIT may not react to a tumor that is not bleeding [32], multiple stool samples are needed for testing, and if results are positive, a colonoscopy will also be required.
Current participation rates in CRC screening in the USA are less than 30% for both genders, compared to rates of 70 and 80% for breast and cervical cancer screening, respectively [1,18,30]. Participation could be improved by using high throughput automated molecular tests that are less uncomfortable, more economical, easier to comply with, and more accurate (higher sensitivity and specificity, particularly for early stages) than currently available tests [20,25-27].
In contrast to FOBTs, minimally invasive procedures could effectively detect neoplastic lesions. Since > 60% of early lesions are believed to arise in the rectosigmoid areas of the large intestine, rigid sigmoidoscopy, which uses a scope about 60 cm long that can see only half the colon, has been routinely used in the past for screening [33]. Recently, however, there has been an increase in the number of lesions arising from more proximal regions of the colon [20,34-36], requiring the use of flexible, fiber optic sigmoidoscopes. Although these methods are effective and offer a means of removing neoplastic polyps, they still leave undetected all lesions that are beyond the reach of the scope (estimated to be between 25 and 34%) [33].
Double-contrast barium enema (DCBE), also referred to as air-contrast barium enema or a barium enema with air contrast, and sometimes known as a lower GI series, is a type of x-ray test in which a chalky liquid (barium sulfate) and air are used to outline the inner part of the colon and rectum in order to look for abnormal areas on x-rays [16][17][18][19]. A clear liquid diet is taken for a day or two before the procedure, and eating or drinking dairy products is avoided the night before the procedure. The colon and rectum also need to be cleansed the night before the test by laxative intake, and/or by the use of enemas on the morning of the exam. The procedure takes about 45 minutes and does not require sedation. At testing, a small flexible tube is inserted into the rectum, and barium sulfate liquid is pumped into it in order to partially fill and open the colon. Air is then pumped into the colon through the same tube, which may lead to bloating, cramping and discomfort, in addition to an urge for a bowel movement. X-ray pictures of the colon lining are then taken. If polyps or other suspicious areas are observed, a colonoscopy may also be needed. The barium could cause constipation for a few days after the procedure, and there is a small risk that inflating the colon with air could injure or puncture it, in addition to exposure to a relatively small amount of radiation [18].
Mutation of the adenomatous polyposis coli (APC) gene in the stool of patients has been demonstrated at early stages of the disease, by PCR analysis of APC gene templates in ductal DNA and the detection of abnormal truncated polypeptides generated by in vitro transcription and translation of the PCR product. However, the digital protein truncation test is not a reliable screening test because it lacks specificity (i.e., 5 out of 28 controls were positive for FOBT, and another 6 showed rectal bleeding) [46]. Since CRCs exhibit genetic heterogeneity, a multitarget approach has been examined and has undergone clinical testing [47]; it employs mutations in K-ras, APC and p53, the microsatellite instability marker Bat-26, and "long" DNA, which represents DNA of the nonapoptotic colonocytes characteristic of cancer cells exfoliated from neoplasms, but not of normal apoptotic colonocytes. However, DNA alterations were detected in only 16 of 31 (51.6%) patients with invasive cancer, 29 of 71 (40.8%) patients with invasive cancer plus adenoma with high-grade dysplasia, and 76 of 418 (18.2%) patients with advanced neoplasia (tubular adenoma ≥ 1 cm in diameter, polyps with high grade dysplasia, or cancer) [41]. Moreover, these tests are not cost-effective, as screening for multiple mutations is generally very expensive [48]. Preliminary studies suggest that proteomics may distinguish the normal state from adenoma. This approach has, however, not been evaluated as a noninvasive screening tool, and it is therefore considered investigational [49,50]. Currently, the markers most often elevated in advanced CRC are carcinoembryonic antigen (CEA) [51] and the carbohydrate antigen, also called cancer antigen (CA), 19-9 [52], but neither of these markers has been found to be a useful or reliable diagnostic screen for colorectal cancer.
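The detection rates quoted above for the multitarget DNA panel follow directly from the reported counts. As a minimal sketch, the percentages can be recomputed as below; the function name and the group labels are illustrative choices, and only the counts themselves come from the text.

```python
def detection_rate(detected, total):
    """Return the percentage of patients in whom DNA alterations were detected."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * detected / total


# Counts as reported in the text for the multitarget stool DNA panel [41].
# The dictionary keys are shorthand labels introduced here for illustration.
reported_counts = {
    "invasive cancer": (16, 31),
    "invasive cancer plus high-grade dysplasia adenoma": (29, 71),
    "advanced neoplasia": (76, 418),
}

for group, (detected, total) in reported_counts.items():
    print(f"{group}: {detected}/{total} = {detection_rate(detected, total):.1f}%")
```

Laid out this way, the counts show that detection falls as the group broadens from invasive cancer to all advanced neoplasia, which is the basis for the low sensitivity noted in the text.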
Early detection would be greatly enhanced if accurate, practical and cost-effective diagnostic biomarkers for CRC were available. However, despite the advances detailed above, the tests now available neither detect colon cancer in all cases (and thus have low sensitivity), nor are they highly specific. Furthermore, these tests are often costly and produce false-positive or false-negative results; the molecules measured could be unstable and fragment easily in vitro, requiring excessive care and special handling techniques (e.g., mRNA molecules); and some methods entail discomfort or inconvenience to the patients, or could in rare cases result in mortality (e.g., colonoscopy) [35,36]. All of these factors could discourage patients' enthusiasm and/or compliance. Current participation rates in CRC screening are less than 30% in both genders, compared to screening for breast and cervical cancer, which have rates of 70 and 80%, respectively [53]. Participation could thus be enhanced by the use of molecular lab tests that are less uncomfortable, less expensive and offer greater accuracy (more sensitivity and specificity). However, larger clinical studies would be needed to corroborate initial test results.
On the other hand, our data and those of others [25,27,28,54-66] have shown that quantitative changes in the expression of a few miRNA genes in stool or blood that are associated with colon cancer permit the development of more sensitive and specific CRC molecular markers than those currently available on the market. In comparison to the commonly employed FOBT stool test, a reliable noninvasive molecular test would be more convenient, as there would be no requirement for dietary restriction or meticulous collection of samples, and thus such a screening test would be acceptable to a broader segment of the population. Because it uses stable molecules that are not easily degraded when extracted from stool or blood and manipulated thereafter, a miRNA approach for colon cancer is preferable to a transcriptomic mRNA-, mutation DNA-, epigenetic- or proteomic-based test [25,27,55-68], particularly as we and others have shown that these stable, nondegradable miRNA molecules can be easily extracted from stool or from the circulation using commercially available kits. Advantages and disadvantages of the in vivo and in vitro tests are presented in Table 1.

A noninvasive miRNA test in stool or blood for developing a diagnostic screen for colon cancer
Stool testing has several advantages over other colon cancer screening media, as it is truly noninvasive and requires no unpleasant cathartic preparation, formal health care visits, or time away from work or routine activities [17][18][19][20]. Unlike sigmoidoscopy, it reflects the full length of the colorectum, and samples can be taken in a way that represents both the right and the left side of the colon. It is also believed that colonocytes are released continuously and abundantly into the fecal stream [21,24], in contrast to blood, which is released intermittently, as in FOBT [23], and that transformed colonocytes produce more RNA than normal ones [24][25][26][27][28]; this natural enrichment phenomenon therefore partially obviates the need to use a laboratory technique to enrich for tumorigenic colonocytes. Furthermore, because testing can be performed on mail-in specimens, geographic access to stool screening is unimpeded [16,30,44,45]. The American Cancer Society (ACS) (http://ww.cancer.org) has recognized that a diagnostic screen for CRC would be enhanced by employing molecular-based stool testing.
It should be emphasized that although not all of the shed cells in stool are derived from a tumor, data published by us and others [25,27,28,56-68] have indicated that diagnostic miRNA gene expression profiles are associated with an adequate number of exfoliated cancerous cells, that enough transformed RNA is released in the stool, and that a measurable amount of circulating miRNA (either cellular or extracellular) is available in blood. These can be determined quantitatively by a sensitive technique such as PCR in spite of the presence of bacterial DNA, non-transformed RNA and other interfering substances. Quantification is feasible because of the high specificity of the PCR primers employed in this method, which overcomes all of these obstacles; hence, the number of abnormally shed colonocytes in stool, or the total RNA present in plasma or serum, is not a limiting factor [24][25][26][27].
A test that employs miRNA in stool or blood could also result in a robust screen because of the durability of the miRNA molecules [25,27,28]. Moreover, an approach utilizing miRNA genes is more comprehensive and encompassing than a test that is based on, for example, the fragile messenger (m)RNA [26], because it is based on mechanisms at a higher level of control. We believe that ultimately the final noninvasive test in stool or blood will include testing of several miRNA genes that show increased and decreased expression, and that eventually a chip containing a combination of these stable molecules will be produced to simplify testing, as has been developed for the testing of GMOs in foods [69].
Blood is a body fluid that can be obtained through a semi-invasive method (skin puncture) that is commonly used in laboratory testing, which makes it logical to employ on a routine basis and attractive to the technicians performing lab tests. However, working with blood for miRNA profiling presents various challenges in purification and molecular characterization. For example, a naked miRNA molecule would degrade within seconds of vein puncture due to the high levels of nucleases in blood. Blood also contains inhibitory components that can interfere with downstream enzymatic reactions, for example the common anticoagulant heparin, which co-purifies with RNA. Moreover, even high-quality RNA preparations from blood contain contaminants that inhibit an RT-qPCR reaction if too much sample is used in the RT preparation [70]. Therefore, it is recommended to use EDTA- or citrate-anticoagulated blood instead of heparin. Circulating miRNAs, however, have shown stability in several studies, resulting either from the formation of complexes between circulating miRNAs and specific proteins [71,72], or from the containment of miRNAs within protective circulating exosomes or microvesicles [73]. Plasma is preferable to serum when quantifying miRNAs, because its use minimizes the variation caused by clotting [74].
For testing mature miRNAs, there are currently available commercial preparations that save time and provide the advantage of the manufacturer's established validation and QC standards. For example, a Qiagen buffer (miScript HiSpec Buffer ® ), Qiagen, Inc., Frederick, MD, USA, that inhibits the activity of the tailing reverse-transcription (RT) reaction on templates other than miRNA-sized templates provides for an exceptionally specific cDNA synthesis reaction that eliminates background from longer RNA species. To measure pre-miRNA, however, it would be essential to use another buffer (miScript HiFlex Buffer ® ); because this reaction is nonbiased, it results in an increased background signal from cross-reactivity with sequences in a total RNA preparation, which can be distinguished by performing a melt curve analysis when carrying out PCR analysis [74,75].
Small noncoding RNAs that exhibit little variation in different cell types (e.g., snoRNAs and snRNAs) are polyadenylated and reverse transcribed (RT) in the same way as the small miRNAs, and thereby could serve as controls for variability in sample loading and real-time RT-PCR efficiency. They are, however, not suited for data normalization in miRNA profiling experiments because they are not well expressed in serum and plasma samples. Therefore, normalization by a plate mean (i.e., the mean Ct value of all the miRNA targets on the plate), or by the mean of commonly expressed miRNA targets (i.e., only the targets that are expressed in all samples are used to calculate the mean value), would be needed for proper normalization of the amplification reaction [76].
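The two normalization strategies described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure of reference [76]: the function names, the data layout (one dictionary of Ct values per sample, with None marking an undetected target) and the Ct values themselves are assumptions made for the example; the miRNA names are borrowed from elsewhere in this review purely as labels.

```python
from statistics import mean


def plate_mean_delta_ct(plate):
    """Normalize each sample to the mean Ct of all targets detected in that sample.

    plate: {sample: {mirna: Ct or None}}, where None marks an undetected target.
    Returns {sample: {mirna: Ct - plate mean Ct}} for detected targets only.
    """
    normalized = {}
    for sample, cts in plate.items():
        detected = {m: ct for m, ct in cts.items() if ct is not None}
        reference = mean(detected.values())
        normalized[sample] = {m: ct - reference for m, ct in detected.items()}
    return normalized


def common_target_delta_ct(plate):
    """Normalize each sample to the mean Ct of targets detected in every sample."""
    common = set.intersection(
        *({m for m, ct in cts.items() if ct is not None} for cts in plate.values())
    )
    normalized = {}
    for sample, cts in plate.items():
        reference = mean(cts[m] for m in common)
        normalized[sample] = {
            m: ct - reference for m, ct in cts.items() if ct is not None
        }
    return normalized


# Hypothetical Ct values, invented for illustration only.
plate = {
    "control": {"miR-1": 24.0, "miR-9": 26.0, "miR-31": 28.0},
    "patient": {"miR-1": 22.0, "miR-9": 25.0, "miR-31": None},
}

print(plate_mean_delta_ct(plate))
print(common_target_delta_ct(plate))
```

The second function differs from the first only in its reference set: a target that is undetected in any sample (here miR-31 in the hypothetical patient) no longer shifts the reference Ct of the other samples.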
An extraction protocol for miRNAs in blood can, however, be challenging. When setting up an extraction step, there are two options: extract the miRNA molecules either from the cellular blood components, as whole blood is full of cells that can be isolated by differential centrifugation, or from the liquid plasma that contains circulating miRNAs. Attention, however, should be paid to heparin, as this anticoagulant is known to be a strong inhibitor of the polymerase in PCR reactions. Several collection tubes that contain citrate as anticoagulant instead of heparin, such as those made by Qiagen or Tempus, can be used for whole blood collection. If the aim is to isolate miRNAs from plasma, EDTA tubes can be used to collect the blood; the plasma is then isolated and stored at -80 °C until ready for miRNA extraction, as these molecules are very stable under standardized laboratory extraction methods. Extraction can be carried out by a modified Trizol method from Life Technologies, or with the miRNeasy reagent from Qiagen. Columns employed in extraction can become clogged, and RNA may be lost and/or degraded; therefore, the integrity of total RNA needs to be checked on a standard agarose or acrylamide gel, or with an electrophoresis apparatus such as the Agilent Bioanalyzer. To check whether the RT-PCR method works, one should employ another source of RNA, for example cells in culture. RT-qPCR-based screening, like hybridization-based assays, does however need validation. Both Life Technologies' Taqman-based probes and SYBR-based assays (like the LNA Universal miRCURY RT microRNA PCR assay, made by Exiqon, Woburn, MA) have high specificity for short miRNAs, and both methods showed similar efficiencies, without the need to design and validate homemade primers. MiRNA quantification by the two methods, however, showed differences in variability that affect miRNA measurements, and quantification is therefore influenced by the choice of assay methodology. Thus, the method used for quantification must be considered when interpreting analyses of PCR results [77][78][79][80].
Our research team [25,27,28] and others [54,81-96] are of the opinion that a miRNA approach in tissue, cell lines, stool or plasma could meet the criteria for test acceptability by the laboratory staff carrying out these tests. It is a non- or minimally-invasive method; requires at most 1 g of stool or < 2 ml of blood (60% of which is plasma); does not need sampling on consecutive dates; uses samples that can be sent by mail in cold packs; is able to differentiate between normal subjects and colon adenomas/carcinomas; has high sensitivity and specificity for detecting advanced polyps; and can be automated, which makes it relatively inexpensive and more suited for early detection than a test based on, for example, mutated DNA markers. In addition, plasma is free from the interfering clotting products that are present in serum, miRNAs are stable in stool and plasma [25,28], and only 500 µl of plasma and 1 g of stool are required to perform the assay using commercially available kits [27,28]. The availability of powerful approaches for global miRNA characterization such as microarrays [97] and next-generation sequencing (NGS) [98][99][100][101][102][103][104][105][106], of simple, universally applicable assays for quantification of miRNA expression such as qPCR [107], and of statistical/bioinformatics methods for data analysis and interpretation [108][109][110] suggests that the validation pipeline, which often encounters bottlenecks [17], will be more efficient for this assay. There is a pressing need to accelerate the use of sensitive and stable molecular markers, such as miRNA molecules, in non- or minimally-invasive media such as stool and/or blood to improve the detection of CRC [111], particularly at an early tumor-node-metastasis (TNM) disease stage (0-1) [112,113], while the cancer is still curable. An experimental workflow for the quantification of miRNAs is shown in Figure 1.

MicroRNAs as molecular biomarker molecules for screening of colon cancer
The discovery of miRNAs, small non-protein-coding RNAs 17-27 nucleotides long that regulate cell processes in ~ 30% of mammalian genes by imperfectly binding to the 3' untranslated region (UTR) of target mRNAs, thereby preventing protein accumulation through either translational repression or induction of mRNA degradation [114,115], has opened new opportunities for a noninvasive test for the early diagnosis of many cancers [65-68,79,81-92]. The latest miRBase release (v20, June 2013) [http://ww.mirbase.org] contains 24,521 miRNA loci from 206 species, which produce 30,424 mature miRNA products [116]. Each miRNA generally targets hundreds of conserved mRNAs and several hundred nonconserved targets that operate in a complex regulatory network, and it is predicted that miRNAs together regulate thousands of human genes [61,65-67]. MiRNAs are transcribed as long primary precursor molecules (pri-miRNA) that are subsequently processed by the nuclear enzyme Drosha and other agents to the precursor intermediate miRNA (pre-miRNA), which in turn is processed in the cytoplasm by the protein Dicer to generate the mature single-stranded (ss) miRNA [117].
MiRNAs have been shown to regulate development [118] and apoptosis [119], and specific miRNAs are critical in oncogenesis [65], effective in classifying solid [81][82][83][84][85][86][87] and liquid tumors [90][91][92], and serve as oncogenes or suppressor genes [120]. MiRNA genes are frequently located at fragile sites, as well as at minimal regions of loss of heterozygosity or amplification and at common breakpoint regions, suggesting their involvement in carcinogenesis [121]. MiRNAs have great promise as biomarkers for cancer diagnosis, prognosis and/or response to therapy [62,64,122]. Profiles of miRNA expression differ between normal tissues and tumor types, and evidence suggests that miRNA expression profiles can cluster similar tumor types together more accurately than expression profiles of protein-coding mRNA genes [24,26,28,111].
Several miRNAs were shown by microarrays, NGS and RT-qPCR in CRC tissue, cell culture lines, stool and blood to be related to colon cancer tumorigenesis [25,27,28,54-60,64-68,86,87,94,98,114,123] and to ulcerative colitis (UC) [25,95]. One study indicated that a combination of mRNA and miRNA expression signatures represents a broader approach for improving the biomolecular classification of CRC [123]. Another study, employing microarrays and qPCR in addition to an in situ hybridization test to assess differential expression in IBD, showed aberrant expression of 11 miRNAs in inflamed tissue and in HT-29 colon adenocarcinoma cells (3 showing a significant decrease and 8 a significant increase) [95]. Our work supports the notion that quantitative changes in the expression of a few cell-free circulatory mature miRNA molecules in stool and plasma that are associated with colon cancer progression would provide a more sensitive and specific biomarker approach than the tests currently available on the market [25,27,28,111].
As colon cancer-specific miRNAs are identified in stool colonocytes or blood plasma by microarrays, NGS and qPCR-based approaches as presented in this review, the validation of novel miRNA/mRNA target pairs within the pathways of interest could lead to the discovery of cellular functions collectively targeted by differentially expressed miRNAs [123]. For example, a comparison of the top 12 pathways affected by colon cancer and globally targeted by miRNAs overexpressed in CRC shows that coexpressed miRNAs collectively provide a systemic compensatory response to the abnormal phenotypic changes in cancer cells by targeting a broad range of signaling pathways affected in that cancer [108].
Several algorithms, such as TargetScan [108-110], predict miRNA target genes that could be dysfunctional in CRC [124,125]. These programs differ in their requirements for base pairing between miRNA and target mRNA genes, and implement similar but not identical criteria when cross-species conservation is applied. Therefore, these different programs will invariably generate different sets of target genes for probably all miRNAs [126,127].
A study examined global expression of 735 miRNAs in 315 samples of normal colonic mucosa, tubulovillous adenomas, and adenocarcinomas proficient in DNA mismatch repair (pMMR) or defective in DNA mismatch repair (dMMR), representing sporadic and inherited CRC stages I-IV [128]. Results showed the following: a) six of the miRNAs that were differentially expressed in normal tissue and polyps (miR-1, miR-9, miR-31, miR-99a, miR-135b and miR-137) were also differentially expressed with a similar magnitude in normal tissue versus both the pMMR and dMMR tumors, b) all but one miRNA (miR-99a) demonstrated similar expression differences in normal tissue versus carcinoma, suggesting a stepwise progression from normal colon to carcinoma, and that early tumor changes were important in both the pMMR- and dMMR-derived cancers, c) several of these miRNAs were linked to pathways identified for colon cancer, including APC/WNT signaling and cMYC, and d) four miRNAs (miR-31, miR-224, miR-552 and miR-592) showed significant expression differences (≥ 2-fold changes) between pMMR and dMMR tumors. The above data thus suggest the involvement of common biologic pathways in pMMR and dMMR tumors in spite of the presence of numerous molecular differences between them, including differences at the miRNA level [128].
Unlike screening for large numbers of mRNA genes, a modest number of miRNAs is used to differentiate cancer from normal, and unlike mRNA, miRNAs in stool and blood remain largely intact and stable for detection [25-28,111]. Therefore, miRNAs are better molecules to use for developing a reliable noninvasive diagnostic screen for colon cancer, since we found that: a) the presence of Escherichia coli does not hinder detection of miRNA by a sensitive technique such as qPCR, as the primers employed are selected to amplify human and not bacterial miRNA genes, and b) the miRNA expression patterns are the same in primary tumor, or diseased tissue, as in stool and blood samples. The gold standard to which the miRNA test is to be compared should be colonoscopy, with results obtained from patients' medical records; the cheaper immunological FOBT screen, currently used in annual checkups [32], should also be compared with the miRNA results. Although exosomal RNA will be missed [129] when using restricted extraction of total RNA from blood or stool, a parallel test could also be carried out on the RNA obtained from noninvasive stool or blood samples, and the appropriate corrections for exosomal loss can then be made after the tests are completed. A miRNA quantification workflow is shown in figure 1.

Microarray, NGS and RT-qPCR tests for detection of miRNAs in tissue & noninvasive samples
We have shown that we are able to routinely and systematically extract high quality total RNA containing miRNAs from a small number of laser capture microdissected (LCM) cells from tissue [130], colonocytes isolated from human stool [25,27,111] or circulating blood [28,54] using a commercially available kit (RNeasy® isolation kit) from Qiagen, Valencia, CA, USA, followed by another kit from Qiagen, the Sensiscript RT Kit. We show below various molecular techniques that allow for the quantitative detection of miRNAs in various tissues, excrements and human blood, with higher sensitivity and specificity than tests currently available on the market.

Next-generation sequencing (NGS) technologies
The chain-termination method launched by Sanger et al. in 1977, commonly referred to as Sanger's dideoxy sequencing [102], remains the most commonly used DNA sequencing technique today, although it has been partly supplanted by next-generation sequencing technologies that are more cost effective and provide higher throughput, albeit at the expense of read length. The Sanger method is based on DNA polymerase-dependent synthesis of a complementary DNA strand in the presence of 2'-deoxynucleotides (dNTPs) and 2',3'-dideoxynucleotides (ddNTPs); the ddNTPs serve as nonreversible synthesis terminators when added to the growing oligonucleotide chains, resulting in truncated products of varying lengths, which can then be separated by size on polyacrylamide gel electrophoresis. Advances in fluorescence detection allowed the four terminators to be combined into one reaction, using fluorescent dyes of different colors, one for each of the four ddNTPs. Moreover, the original slab gel electrophoresis was replaced with capillary gel electrophoresis, enabling better separation. Single capillaries were then replaced by capillary arrays, allowing many samples of in vivo amplified fragments cloned into bacterial hosts to be analyzed in parallel. Furthermore, the development of linear polyacrylamide and polydimethylacrylamide allowed the reuse of capillaries in multiple electrophoretic runs, thereby increasing sequencing efficiency. These and other advances in sequencing technology have contributed to the relatively low error rate, long read length and robustness of modern Sanger sequencers (Table 3). For example, the high throughput automated Sanger sequencing instrument from Applied Biosystems (ABI 3730xl) has a 96 capillary array format that produces ≥ 900 bp of PHRED 20 quality (PHRED being a measure of the quality of identification of the nucleobases generated by sequencing) per read, for up to 96 kb per 3 h run [98].
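The PHRED 20 figure quoted above can be made concrete. Under the standard Phred definition, Q = -10 log10(P), where P is the probability that a base call is wrong, so Q20 corresponds to a 1-in-100 chance of error. The following is a minimal sketch of that conversion; the function names are illustrative and are not taken from any sequencing software.

```python
import math

def phred_to_error_prob(q):
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)

def error_prob_to_phred(p):
    """Convert a base-call error probability to a Phred quality score."""
    return -10 * math.log10(p)

def count_q20_bases(qualities, threshold=20):
    """Count the bases in a read whose quality meets the threshold (Q20 by default)."""
    return sum(1 for q in qualities if q >= threshold)
```

With these helpers, a read that "produces ≥ 900 bp of PHRED 20 quality" is one for which `count_q20_bases` returns at least 900.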
The Roche 454 instrument was the first next-generation sequencer released to the market. It circumvents the lengthy, labor-intensive and error-prone in vivo cloning step by using in vitro DNA amplification known as emulsion PCR: individual DNA fragment-carrying streptavidin beads, obtained by shearing the DNA and attaching the fragments to beads using adapters, are captured in separate emulsion droplets that act as individual amplification reactors, producing ~10^7 clonal copies of a unique DNA template per bead. Each template-containing bead is then transferred into a well of a picotiter plate, which allows hundreds of thousands of pyrosequencing reactions on clonally related templates to be carried out in parallel, increasing sequencing output [104]. The sequence of the DNA template is determined from a pyrogram, which records the order of nucleotide incorporation through chemiluminescence; the signal intensity is proportional to the amount of pyrophosphate released. The pyrosequencing approach is prone to errors resulting from incorrectly estimating the length of homopolymeric sequence stretches (i.e., indels). The Roche 454 platform, which has been the most widely used next-generation sequencing technology, is capable of generating 80-120 Mb of sequence in 200-300 bp reads in a 4 h run [98].
The Illumina/Solexa approach achieves cloning-free DNA amplification by attaching a ssDNA fragment to a solid surface, known as a single molecule array, or flow cell, and carrying out solid-phase bridge amplification of single molecule DNA templates. In this process, one end of a single DNA molecule is attached to the solid surface by an adapter; the molecule is subsequently bent over and hybridized to complementary adapters, creating a bridge that serves as a template for the synthesis of complementary strands. Following amplification, a flow cell containing more than 40 million clusters is produced, each cluster composed of ~1000 clonal copies of a single template molecule. Templates are sequenced in a massively parallel fashion using a DNA sequencing-by-synthesis approach that employs reversible terminators with removable fluorescent moieties and DNA polymerases capable of incorporating these terminators into growing oligonucleotide chains. The terminators are labeled with fluors of four different colors to distinguish among the different bases at a given sequence position, and the template sequence of each cluster is deduced by reading off the color at each successive nucleotide addition step. Although Illumina technology seems more effective at sequencing homopolymeric stretches than pyrosequencing, it produces shorter sequence reads, and thus cannot resolve short sequence repeats. Moreover, substitution errors have been noted in this platform due to the use of modified DNA polymerases and reversible terminators. Massively parallel sequencing (MPS) by hybridization-ligation, implemented in the oligonucleotide ligation and detection system SOLiD from Applied Biosystems, is based on the polony sequencing technique [106].
Library construction begins with an emulsion PCR single-molecule amplification step, followed by transfer of the products onto a glass surface where sequencing occurs by sequential rounds of hybridization and ligation with 16 dinucleotide combinations labeled by four different fluorescent dyes. Each position is probed twice, and the identity of the nucleotide is determined by analyzing the color resulting from two successive ligation reactions. The two-base encoding scheme allows the distinction between a sequencing error and a polymorphism (an error would be detected in only one reaction, whereas a polymorphism would be detected in both). The SOLiD platform generates 1-3 Gb of sequence in 35 bp reads per 8-day run [104]. Table 3 shows currently available DNA sequencing technologies.
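The two-base encoding idea described above can be illustrated with a small sketch. It assumes the commonly described SOLiD color code, in which each base is given a 2-bit value (A=0, C=1, G=2, T=3) and the color of a dinucleotide is the XOR of the two values, so that the 16 dinucleotides map onto four colors. The classification rule is a simplification for illustration only (it ignores changes at the ends of a read and multi-base variants), and the function names are hypothetical.

```python
# Assumed 2-bit base codes; the color of a dinucleotide is the XOR of its two codes.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def encode_colors(seq):
    """Return the color-space encoding: one color (0-3) per adjacent base pair."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]

def classify_difference(ref_colors, read_colors):
    """Label a read against a reference by its pattern of color mismatches."""
    mismatches = [i for i, (r, c) in enumerate(zip(ref_colors, read_colors)) if r != c]
    if not mismatches:
        return "match"
    if len(mismatches) == 1:
        # An isolated color change is seen in only one ligation reaction.
        return "likely sequencing error"
    if len(mismatches) == 2 and mismatches[1] - mismatches[0] == 1:
        i, j = mismatches
        # A single-base substitution changes both colors that overlap the base,
        # while leaving their XOR (which depends only on the flanking bases) intact.
        if (ref_colors[i] ^ ref_colors[j]) == (read_colors[i] ^ read_colors[j]):
            return "candidate polymorphism"
    return "unclassified"
```

For instance, substituting the third base of ACGTAC (giving ACATAC) changes two adjacent colors, whereas a miscalled color alters only one position.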

Microarray technologies
For microarray studies, we employed the Affymetrix GeneChip miRNA 3.0 Array (Affymetrix, Inc, Santa Clara, CA, USA), which provides 100% miRBase v17 coverage [http://www.mirbase.org] by a one-color approach. The microarray contains 16,772 entries representing hairpin precursors, expressing 19,724 mature miRNA products in 153 species, and provides >3 log dynamic range, with 95% reproducibility and 85% transcript detection at 1.0 amol for a total RNA input of 100 ng.
Global microarray expression studies have shown similarity in expression between stool, plasma and tissue using various microarray formats [131]. Our microarray studies in stool samples obtained from fifteen individuals (three controls, and three each with TNM stage 0-1, stage 2, stage 3, and stage 4 colon cancer) showed 202 preferentially expressed miRNA genes that were either increased (141 miRNAs) or reduced (61 miRNAs) in expression [27]. A scatter plot compares low dose microarray data with the control group, and figure 3 shows a multigroup plot comparing miRNA-193a-3p to the internal standard 18S rRNA in healthy normal controls and the four TNM colon cancer groups (stages 0 to IV). Table 4 presents a comparison of NGS with qPCR technologies for miRNA profiling [98-104].

Table 4. Comparison of NGS with qPCR technologies for miRNA profiling

General
NGS: Offers a hypothesis-free approach; able to detect isomiRs & novel miRNAs. | qPCR: Designed towards a specific miRNA sequence (usually listed in the miRBase).
NGS: Offers less sensitivity compared to qPCR. | qPCR: Offers higher sensitivity compared to NGS.
NGS: Requires higher sample amount than qPCR. | qPCR: Requires less sample than NGS.

Optimized NGS and qPCR methods for miRNA profiling
NGS: RNA spike-ins are added during the RNA isolation step to monitor the reproducibility and linearity of the isolation reactions. | qPCR: RNA spike-ins are also included for RNA QC of the qPCR isolation reactions.
NGS: qPCR-based QC to monitor RNA isolation efficiency, inhibition and outlier detection. | qPCR: qPCR-based QC to monitor RNA isolation efficiency, inhibition and outlier detection.
NGS: Hemolysis indicators, spike-in controls & endogenous miRNA controls for RNA QC. | qPCR: Hemolysis indicators, spike-in controls & endogenous miRNA controls for RNA QC.
NGS: Library preparation utilizes methods optimized for low concentrations of starting material, size selection to maximize miRNA reads, and QC of the library by Bioanalyzer and qPCR. | qPCR: Choice of appropriate methods of RT and qPCR assay.
NGS: Use of a sequencing platform. | qPCR: Use of appropriate qPCR instrument.
NGS: Raw sequencing data (FASTQ files). | qPCR: Raw qPCR data (Cq values).

Data QC and filtering (NGS) / Data QC (qPCR)
NGS: Base and read quality. | qPCR: Tm and melting curve analysis.

PCR technologies
To screen several miRNA genes by PCR in a sequence-specific RT manner, in which a cDNA preparation assays for a specific miRNA, we have employed in our work [25,27,28] sequence-specific stem-loop RT primers designed to anneal to the 3'-end of a mature miRNA, which result in better specificity and sensitivity than conventional linear primers [132]. This step was followed by a SYBR Green®-based real-time qPCR analysis in which a forward primer specific to the 5'-end of the miRNA, a universal reverse primer specific for the stem-loop RT primer sequence, and a 5'-nuclease hydrolysis probe (a TaqMan™ minor groove binding (MGB) probe matching part of the miRNA sequence and part of the RT primer sequence) were employed, using a standard TaqMan® PCR kit from Applied Biosystems on a Roche LightCycler® (LC) 480 instrument in our labs. We employed the E-method [131] to calculate the relative expression of miRNA genes in these modified RT-qPCR studies. It should be emphasized that the Roche LC 480® PCR instrument [133] employs a non-user-influenced method for high throughput measurements, using second derivative calculations and double corrections [134]. One correction utilizes the expression level of a housekeeping gene in an experiment as an internal standard, which reduces error due to sample preparation and handling. The second correction uses a reference expression level of the same housekeeping gene for the analyzed expression in colonocytes or plasma, which avoids variation in the results due to the variability of the housekeeping gene in each sample, especially in experiments that employ different treatments [135].
We conducted a stem-loop RT step with TaqMan minor groove binding (MGB) probes, followed by a modified qPCR expression assay, on 20 selected mature miRNAs in stool [27] and on 15 mature miRNAs in blood [28]. The assay involved amplification of the gene of interest (target) and a second control sequence (reference), also called an external standard, which amplified with the same efficiency as the target gene in the same capillary, a procedure known as "multiplex PCR". Quantification of the target was made by comparison of the intensity of the products. A suitable reference gene has been the housekeeping 18S ribosomal (r)RNA gene, which was used as a normalization standard because of the absence of pseudogenes and the weak variation in its expression [136,137]. This selection has obviated the need to use normalization strategies such as plate mean (a mean CT value of all miRNA targets on the plate), a panel of invariant miRNAs [138], or commonly expressed miRNA targets [76]. Software for finding a normalizer, such as NormFinder [www.mld.dk/publicationnormfinder.htm], which is run as a template within Microsoft Excel®, can also be used. For a more focused approach employing PCR on a selected number of miRNA genes, we used miRNA stem-loop RT primers [132] for the specific miRNA species to be tested, to make a copy of ss-DNA [25,27,28] for real-time PCR expression measurements.
Of the 15 selected miRNAs that exhibited quantifiable preferential expression by qPCR in plasma, and that have also been shown to be related to colon cancer carcinogenesis, nine (miR-7, miR-17-3p, miR-20a, miR-21, miR-92a, miR-96, miR-183, miR-196a and miR-214) exhibited increased expression in plasma (and also in tissue) of patients with CRC, and later TNM carcinoma stages exhibited a greater increase in expression than did adenomas. On the other hand, six of the selected miRNAs (miR-124, miR-127-3p, miR-138, miR-143, miR-146a and miR-222) exhibited reduced expression in plasma (and also in tissue) of patients with colon cancer, with the reduction becoming more pronounced during progression from early to later TNM carcinoma stages [28].
The PCR stool data on 60 samples are tabulated in table 2 and presented graphically in figure 3 using a scatter plot, and also in figure 5. In a volcano plot (Figure 4), the data exhibit minimal variance within groups, resulting in low p-values calculated using 2^(-ΔCT) (SDs of 0.015275 and 0.025166 are minimal, and the variation in raw CT values is only ~0.03 for three replicates). The 95% CI for group 4 was between 134.39 and 135.63, indicating a slight variation between groups. However, because the raw CT variations are low, even the slightest changes resulted in significant p-values; for example, miR-193a-5p was induced in different groups by between 2- and 134-fold. It should be emphasized that there has been no need to use receiver operating characteristic (ROC) curves, because the difference in miRNA expression between healthy individuals and patients with colon cancer, and among stages of cancer, was large and informative.
For example, the presented data can be compared to data obtained from a group of students where half are 1st graders and the other half are high school students (although we have considered more groups, the idea can still be exemplified with just two groups). To separate these groups, we would use height as a measurement (in our case we used gene expression). It turns out that the shortest high school student is a lot taller than the tallest 1st grader, so all those above a given height are high school students. Specificity, sensitivity and area under the curve are all 100%. When we use weight (in our case, the expression of a different gene) we get the same result: the lightest high school student is a lot heavier than the heaviest 1st grader. We can use other measures, such as shoe size or reading level, and again we get the same result.
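The analogy above can be checked numerically. The sketch below computes sensitivity, specificity and the area under the ROC curve for two groups; the heights are hypothetical values chosen only to mirror the analogy, not data from the study, and the function names are illustrative. The AUC is computed through its rank interpretation, i.e. the probability that a randomly chosen member of the higher group outranks a randomly chosen member of the lower group.

```python
def sensitivity_specificity(cases, controls, threshold):
    """Call a sample positive when its value exceeds the threshold."""
    true_pos = sum(1 for v in cases if v > threshold)
    true_neg = sum(1 for v in controls if v <= threshold)
    return true_pos / len(cases), true_neg / len(controls)

def auc(cases, controls):
    """Area under the ROC curve: probability that a random case outranks
    a random control (ties count one half)."""
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Hypothetical heights in cm, chosen only to mirror the analogy in the text.
first_graders = [112, 115, 118, 121, 124]
high_schoolers = [155, 160, 168, 172, 180]

sens, spec = sensitivity_specificity(high_schoolers, first_graders, threshold=140)
area = auc(high_schoolers, first_graders)
```

When the two groups do not overlap, any threshold placed in the gap gives 100% sensitivity and specificity and an AUC of 1; once the groups overlap, the AUC falls below 1 and a ROC analysis becomes informative.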
Thus, our results [25,27,28] are in general agreement with what has been reported in the literature for the expression of these miRNAs in tissue, blood and stool of colon cancer patients, and in cells in culture [54-57,60,62,65,87,94]. This indicates that carefully selected miRNAs can distinguish individuals without colon cancer from those with it, and can even separate different TNM stages. A predictive miRNA expression index (Table 5) similar to that developed for mRNA [139], or a complicated multivariate statistical analysis [140], was therefore not necessary in this case in order to reach conclusions from these data.
The initial number of miRNA genes (whether 15 or 20) could be refined by validation studies to a much lower number (or even a single miRNA molecule) if the data are confirmed in a larger epidemiologically randomized study [141]. Such a study would employ a prospective specimen collection retrospective blinded evaluation (PRoBE) design for randomized selection of control subjects and case patients from a consented cohort population, to avoid bias and to ensure that biomarker selection and outcome assessment do not influence each other, thereby providing statistical confidence in the outcome. The validated miRNA biomarkers can then be placed on a chip to facilitate screening, as has been done for the testing of genetically modified organisms in food [69].
It is necessary to clearly understand the normal, healthy functions of the human body, and their value ranges (e.g. with respect to age, sex, environment), in order to more thoroughly detect what is abnormal by studying human tissue/blood/stool from healthy donors and patients. Such studies need high quality samples from large numbers of subjects (in the hundreds to thousands), designed by an appropriate epidemiologic method that employs a randomized unbiased PRoBE design [141] of control subjects and case patients from a consented cohort population [142].

Method for PCR quantification by Roche's LightCycler 480, normalization and QC issues
The comparative cross point (CP) value (or E-method) [133] was employed, utilizing the LightCycler (LC) Quantification Software™, Version 4.0 [134] for Roche LC PCR instruments (Mannheim, Germany) for the semi-quantitative PCR analysis. The method employs standard curves in which the relative target concentration is a function of the difference between crossing points (or cycle numbers) as calculated by the second derivative maximum [135]: the cycler's software algorithm identifies the first turning point of the plot of fluorescence versus cycle number, and calculates the expression of miRNA genes automatically, without user input and with high sensitivity and specificity. A CP value corresponds to the cycle number at which each well has the same kinetic properties. The CP method corresponds to the 2^(-ΔΔCT) method [143] used by other PCR instruments, although the latter produces reliable quantitative results only if the efficiencies [E = 10^(-1/slope)] of the PCR assay for the target and reference genes are identical and equal to 2 (i.e., doubling of molecules in each amplification cycle). For example, if well A1 has a CP value of 15 and well A2 has a CP value of 16, we deduce that there was twice as much of the gene of interest in well A1. A 10-fold difference is shown by a difference of ~3.3 CP values. It is not possible to compare these values between different primer pairs. The CP method compensates for differences in target and reference gene amplification efficiency, either within an experiment or between experiments.
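The arithmetic in the paragraph above can be sketched in a few lines. This is an illustration of the stated relationships only (efficiency from the standard-curve slope, and fold difference from a CP difference); it is not the LightCycler software's algorithm, and the function names are hypothetical.

```python
import math

def efficiency_from_slope(slope):
    """Amplification efficiency E = 10^(-1/slope), where slope is that of the
    standard curve (CP plotted against log10 of input amount)."""
    return 10 ** (-1.0 / slope)

def fold_difference(cp_a, cp_b, efficiency=2.0):
    """How many times more template well A holds than well B (same assay)."""
    return efficiency ** (cp_b - cp_a)

def cycles_for_fold(fold, efficiency=2.0):
    """Number of cycles separating two wells whose template differs by `fold`."""
    return math.log(fold) / math.log(efficiency)
```

With perfect doubling (E = 2), `fold_difference(15, 16)` gives 2, matching the A1/A2 example, and `cycles_for_fold(10)` gives about 3.32, matching the ~3.3 CP difference quoted for a 10-fold change. A standard-curve slope of about -3.32 corresponds to E = 2.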
It is also essential to normalize the data to a "reference" housekeeping internal standard gene, or in some cases against several standards, because the total input amount may vary from sample to sample when doing relative quantification. To report "fold change" results, the software incorporates all those factors. The CP method can normalize for run-to-run differences, such as those caused by variations in reagent chemistry. For such normalization, one of the relative standards must be designated a "calibrator" for the target and for the reference genes; this can be any of our healthy control stool samples. The calibrator(s) can then be used repeatedly in subsequent runs to guarantee a common reference point, allowing for comparison of all experiments within the series. If necessary, the 2^(-ΔΔCT) value can be calculated by the instrument's software if samples are properly labeled; the 2^(-ΔΔCT) calculations can also be set up manually. To determine the fold change for a particular unknown cancer stool or blood sample that has a target gene CP value of 10, one needs three additional values: a) the reference gene CP value of that same unknown cancer stool or blood sample, b) the target gene CP for the calibrator sample (normal stool or blood), and c) the reference gene CP for the calibrator sample (normal stool or blood) [143].
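The four-value calculation described above can be sketched as follows. The first function is the plain 2^(-ΔΔCT) calculation; the second is a generic efficiency-corrected ratio of the kind that calibrator-normalized methods use, shown here as an assumption about the general form rather than as the instrument's actual implementation. Only the target CP of 10 comes from the text; the other three CP values in the usage note are hypothetical.

```python
def fold_change_ddct(target_sample, ref_sample, target_calibrator, ref_calibrator):
    """Fold change by 2^(-ddCT), assuming both assays double every cycle."""
    d_sample = target_sample - ref_sample              # dCT of the unknown sample
    d_calibrator = target_calibrator - ref_calibrator  # dCT of the calibrator
    return 2.0 ** -(d_sample - d_calibrator)

def fold_change_efficiency_corrected(target_sample, ref_sample,
                                     target_calibrator, ref_calibrator,
                                     e_target=2.0, e_ref=2.0):
    """Calibrator-normalized ratio allowing different target and reference
    efficiencies; reduces to 2^(-ddCT) when both efficiencies equal 2."""
    target_ratio = e_target ** (target_calibrator - target_sample)
    ref_ratio = e_ref ** (ref_calibrator - ref_sample)
    return target_ratio / ref_ratio
```

For a cancer sample with a target CP of 10 and, hypothetically, a reference CP of 8, against a calibrator with target CP 14 and reference CP 8, `fold_change_ddct(10, 8, 14, 8)` returns 16, i.e. a 16-fold increase relative to the calibrator. If the target assay's efficiency were only 1.9, the corrected estimate would drop to about 13, which is why unequal efficiencies matter.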
In all PCR reactions, strict attention must be given to quality control (QC) procedures. As the field has matured, guidelines on reporting qPCR data, known as the minimum information for publication of quantitative real-time PCR experiments (MIQE), have also been implemented by us [136] in order to ensure the uniformity, reproducibility and reliability of the PCR reaction and data integrity.

Tumor heterogeneity due to mismatch DNA repair
To add another level of complexity to colon cancer and show the extent of cancer heterogeneity, colon tumors have shown differential expression of miRNAs depending on their mismatch repair status. MiRNA expression in colon tumors has exhibited an epigenetic component, and altered expression due to mismatch repair may reflect a reversion to regulatory programs characteristic of undifferentiated proliferative developmental states [144].
MiRNAs also undergo epigenetic inactivation [145], and miRNA expression in CRC has been associated with MSI subgroups [146,147]. MiRNAs may regulate chromatin structure by regulating key histone modifications; for example, cartilage-specific miR-140 targets histone deacetylase 4 in mice [148], and miRNAs may be involved in meiotic silencing of unsynapsed chromatin in mice [149]. In addition, the DNA methylation enzymes DNMT1, 3a and 3b were predicted to be potential miRNA targets [150]. Moreover, a specific group of miRNAs (epi-miRNAs), miR-107, -124a and -127, directly target effectors of the epigenetic machinery such as DNMTs, histone deacetylases and polycomb repressive complex genes, and indirectly affect the expression of suppressor genes.
In addition to negatively regulating target mRNA, miRNAs are themselves regulated by other factors. For example, c-myc activates transcription of the miR-17-92 cluster, which has a role in angiogenesis [154], and the TFs NFI-A and C/EBPα compete for binding to the miR-223 promoter, decreasing and increasing miR-223 expression, respectively [155]. MiR-223 also participates in its own feedback, and favors C/EBPα binding by repressing NFI-A translation. Many of the miRNAs located in the introns of protein-coding genes are co-regulated with their host gene [156]. The challenge now is to identify those driver methylation changes that are thought to be critical for the process of tumor initiation, progression or metastasis, and to distinguish these changes from methylation changes that are merely passenger events that accompany the transformation process but have no effect per se on carcinogenesis.

Test performance characteristics (TPC) of the miRNA approach
Cytological methods carried out on purified colonocytes employing Giemsa staining [157] as described for CRC, showed a sensitivity for detecting tumor cells in smears of 80%, which is slightly better than that reported earlier (i.e. about 78%) [158,159].
Numerical underpinning of the miRNAs as a function of total RNA was carried out on colonocytes isolated from stool [160], before any preservative was added, for five healthy control samples and five TNM stage IV colon cancer samples. Total RNA was extracted from them, and the actual amount of total RNA per stool sample and the average CP values were determined, taking into account that some exosomal RNA will not be released from purified colonocytes into stool, an effect that was arbitrarily corrected for [161]. It is evident from the data shown in table 6 that an average CP value of 21.90 for stage IV colon carcinoma is invariably different from a CP value of 26.05 for healthy controls.
Test performance characteristics (TPC) of the miRNA approach obtained by the CP values of the miRNA genes calculated from stool colonocyte samples of normal healthy individuals and patients with colon cancer were compared to the commonly used FOBT test and with colonoscopy results obtained from patients' medical records in 60 subjects (20 control subjects and 40 colon cancer patients with various TNM stages). The data showed high correlation with colonoscopy results obtained from patients' medical records for the controls and colon cancer patients studied.

Statistical methods and bioinformatics analyses
In genomics work, it is important to have an understanding of statistics and bioinformatics to appreciate and make sense of generated data [162]. First, power analysis could be used for estimating sample size for a study [163]. Moreover, power analysis, as well as first and second order validation studies, could be carried out to assess the degree of separation and reproducibility of the data [164].
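As an illustration of how power analysis yields a sample size, the sketch below uses the textbook normal-approximation formula for comparing two group means, n per group = 2[(z_(1-α/2) + z_(power))·σ/δ]^2. This particular formula, the parameter names and the default values are assumptions made for illustration; the cited references may use a different procedure.

```python
import math
from statistics import NormalDist

def samples_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate number of subjects per group needed to detect a mean
    difference `delta` between two groups with common standard deviation `sd`,
    using a two-sided test at significance level `alpha`."""
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / delta) ** 2
    return math.ceil(n)
```

For example, detecting a difference of one standard deviation in CP values (delta equal to sd) at 80% power requires about 16 subjects per group under this approximation, and halving the detectable difference roughly quadruples the required sample size.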
If the difference in miRNA gene expression between healthy individuals and cancer patients, and among the stages, is found to be large and informative for multiple miRNA genes, suggesting that classification procedures could be based on values exceeding a threshold, then a sophisticated classification would not be needed to distinguish between the study data. However, if inconsistent differences in large samples are found, then predictive classification methods can be employed. Programs supplied by Qiagen Corporation can be used free of charge to analyze, normalize and graph molecular data (http://pcrdataanalysis.sabiosciences.com).
The goal in predictive classification will be to assign cases to predefined classes based on information collected from the cases. In the simplest setting, the classes (i.e., tumors) are labeled cancerous and non-cancerous. Statistical analyses for predictive classification of the information collected (i.e., microarrays and qPCR on miRNA genes) attempt to approximate an optimal classifier. Classification can be linear, nonlinear, or nonparametric [162,164]. The miRNA expression data could be analyzed first with parametric statistics such as Student's t-test or analysis of variance (ANOVA) if the data are normally distributed, or with the nonparametric Kruskal-Wallis, Mann-Whitney and Fisher exact tests if they are not [162,165]. If needed, more complicated models such as multivariate analysis and logistic discrimination [166,167] could also be employed.
False discovery rates (the expected proportion of incorrect assignments among the accepted assignments) could also be assessed by statistical methods [168-170], as they could reflect on the effectiveness of the test, because of the need to do follow-up tests on false positives. The optimal number of miRNA genes (whether 20 or fewer) needed for a gene panel that predicts carcinogenesis in stool will need to be established by statistical methods.
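One widely used way to control the false discovery rate when many miRNAs are tested at once is the Benjamini-Hochberg step-up procedure. The text does not state which procedure the cited references use, so the sketch below is offered as an assumption for illustration; the function name and default FDR level are likewise illustrative.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True where the hypothesis is rejected
    while controlling the false discovery rate at level `fdr`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: find the largest rank k with p_(k) <= k * fdr / m.
        if p_values[idx] <= rank * fdr / m:
            cutoff_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected
```

Applied to the per-miRNA p-values from a stool or plasma panel, the True entries mark the miRNAs that remain significant after accounting for the number of genes tested.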
For the corrected index, cross-validation could be used to protect against overfitting, to address the difficulties of using the same data both to fit the model and to assess its fit, and to determine the number of samples needed for a cancer study, where the expected proportion of genes' expression common to two independently randomly selected samples is estimated to be between 20% and 50% [171]. Efron and Tibshirani [172] suggested dividing the data into 10 equal parts and using one part to assess the model produced by the other nine; this is repeated for each of the 10 parts. Cross-validation provides a more realistic estimate of the misclassification rate. The area under the ROC curve, in which sensitivity is plotted as a function of (1 - specificity), is generally used to describe the trade-off between sensitivity and specificity [173].
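The 10-part scheme described above can be sketched as follows. The fold assignment and error estimate are generic; the midpoint-threshold classifier is a hypothetical stand-in for whatever model is actually fitted, and all function names are illustrative.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validated_error(samples, labels, fit, predict, k=10, seed=0):
    """Estimate the misclassification rate: each fold is held out once while
    the model is fitted on the remaining k - 1 folds."""
    errors = 0
    for held_out in k_fold_indices(len(samples), k, seed):
        held = set(held_out)
        train = [i for i in range(len(samples)) if i not in held]
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        errors += sum(1 for i in held_out if predict(model, samples[i]) != labels[i])
    return errors / len(samples)

def fit_midpoint(values, labels):
    """Hypothetical classifier: threshold halfway between the two class means."""
    cases = [v for v, y in zip(values, labels) if y == 1]
    controls = [v for v, y in zip(values, labels) if y == 0]
    return (sum(cases) / len(cases) + sum(controls) / len(controls)) / 2

def predict_midpoint(threshold, value):
    """Label a sample as a case (1) when its value exceeds the threshold."""
    return 1 if value > threshold else 0
```

Because every sample is scored by a model that never saw it during fitting, the resulting error rate is less optimistic than one computed on the training data. Note that this simple classifier assumes each training split contains both classes, which holds for balanced groups of reasonable size.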
The principal component analysis (PCA) method [174], which is a multivariate dimension reduction technique, could also be used to simplify the grouping of genes that show aberrant expression apart from those showing no expression or much reduced expression. In cases where several genes by themselves appear to offer distinct and clear separation between control and cancer cases in stool samples, a predictive miRNA index (PMI) may thus not be needed.
If the miRNA gene panel (or a derived PMI) is found to be better than existing screening methods, then all of the data generated can be used to assess the model, so over-fitting is not a concern. The level of gene expression could be displayed in a database using parallel coordinate plots [175,176] produced by the lattice package in R (version 2.9.0, The R Foundation for Statistical Computing [http://cran.r-project.org]) and S-plus software (Insightful Corporation, Seattle, WA). Other packages such as GESS (Gene Expression Statistical System) published by NCSS [http://www.ncss.com] could also be employed, as needed. Bioinformatics analysis using the basic TargetScan algorithm [103] for up-regulated and down-regulated mRNA genes has been employed. The program yielded 21 mRNA genes encoding different cell regulatory functions. The first 12 of these mRNAs were found with the DAVID program [177] to be active in the nucleus and related to transcriptional control of gene regulation. For down-regulated miRNAs, the DAVID algorithm found the first four of these miRNAs to be clustered in cell cycle regulation categories [26].

Conclusions
The innovation of employing a miRNA approach for colon cancer screening lies in the exploratory use of a screening technology, such as next-generation sequencing (NGS) or microarrays, followed by affordable, quantitative miRNA expression profiling of a few of these molecules in noninvasive stool or blood samples. The fragile total RNA extracted from these samples can be stabilized in the laboratory shortly after stool collection or blood drawing by commercially available kits, which protects it from fragmentation. Global miRNA expression profiling is then followed by standardized, quantitative real-time qPCR tests on fewer selected genes; these tests are not labor intensive and do not require extensive sample preparation. The aim is to develop a panel of a few novel miRNA genes for the diagnostic screening of early left and right sporadic colon cancer more economically, and with higher sensitivity and specificity, than any other colon cancer screening test currently available on the market.
RT-qPCR has been the subject of considerable controversy. While the technique is considered the gold standard for quantifying gene expression in a cell or tissue, there are so many variables involved that different labs could perform the same experiment and end up with different results. Moreover, although a study may produce a statistically significant result, it is hard to know whether that result is truly valid or whether the data might have been skewed by a technical error. Therefore, in 2009, a group of researchers published guidelines to help scientists publish data that are both accurate and reproducible. These guidelines are known as "The Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE)" [136]. They address several key aspects of qPCR, including sample quality control, assay design, PCR efficiency, and normalization. A paper that attempted to identify a set of suitable, reliable reference genes for several different human cancer cell lines, and to determine whether MIQE guidelines are followed, reported that important data are missing in many of the studies, as many publications do not report the efficiency of their reference genes or their qPCR data, and that only 30-40 percent of published studies that investigated reference genes actually followed the MIQE guidelines [136]. Moreover, as the newest incarnation of PCR, digital PCR (dPCR), is now being used by an increasing number of labs to provide for broader quantification, a new set of MIQE guidelines geared to the specific concerns of this brand-new version of PCR has recently been published [136].
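Two of the quantities that the MIQE guidelines ask authors to report, PCR efficiency and normalized expression, can be illustrated with a short calculation. The sketch below is a generic example and not a method taken from this work: it assumes efficiency is estimated from the slope of a standard curve (Cq against log10 of input amount) as E = 10^(-1/slope) - 1, and that relative expression is computed with the widely used 2^(-ΔΔCq) formula, which is valid only when the target and reference assays both amplify at close to 100% efficiency. All function and parameter names are hypothetical.

```python
def amplification_efficiency(log10_inputs, cq_values):
    """Estimate PCR efficiency from a dilution series.
    Fits Cq = intercept + slope * log10(input) by least squares and returns
    E = 10**(-1/slope) - 1, so E = 1.0 means perfect doubling per cycle."""
    n = len(log10_inputs)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(cq_values) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_inputs, cq_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    slope = sxy / sxx
    return 10 ** (-1.0 / slope) - 1.0

def relative_expression(cq_target_case, cq_ref_case,
                        cq_target_control, cq_ref_control):
    """Fold change of a target miRNA in a case versus a control sample,
    each normalized to a reference gene: 2 ** -(delta_case - delta_control).
    Assumes roughly 100% efficiency for both assays."""
    delta_case = cq_target_case - cq_ref_case
    delta_control = cq_target_control - cq_ref_control
    return 2.0 ** (-(delta_case - delta_control))
```

A slope near -3.32 corresponds to an efficiency near 100%; when the efficiencies of the target and reference assays differ appreciably, an efficiency-corrected model should be used instead of the simple formula above.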
It is noteworthy that since the discovery of miRNA in 1993, investigators working in cancer research have paid attention to these regulatory molecules and attempted to develop minimally invasive markers to diagnose this disease. Although methods that employ PCR in stool and blood samples are currently at the forefront of the quantitative methods used to develop reliable screening markers, a chip that contains a combination of these genes is expected to be produced to simplify testing, as has been accomplished in the testing of genetically modified organisms in foods [69].
MiRNAs are interesting biomarkers: they are stable, amplifiable, and functionally important, have ample information content, and play a significant role in gene regulation, and the expression profiles of the 800 validated molecules allow malignant and nonmalignant tissue, as well as different tumor entities, to be distinguished. Most circulating miRNAs are associated with Argonaute2, which is part of the RISC silencing complex. But whether these circulating miRNAs come from normal tissue or tumor tissue, and how they are released into body fluids (through cell death or some other process), are mostly unanswered questions. In healthy tissue, evidence indicates that cells release miRNAs, both in vesicles and in protein complexes, which can then act as intercellular signaling molecules. When taken up by a recipient cell, miRNAs could modulate its gene expression. In tumor tissue, they promote a microenvironment that helps the tumor survive, giving tumors a selective advantage. However, the balance between passive release by various routes and release that is programmed within the cell, as occurs for example in immune cells, is not known.
Many circulating miRNAs linked to solid tumors are also expressed in blood cells. The source of miRNAs is not important, provided they are validated as markers. The challenge has been to establish standardized protocols for extracting and quantifying circulating miRNAs, as the technology keeps developing and improving; however, it is expected that within 5 to 10 years the best way to quantitate miRNAs in blood and other body fluids will have been worked out.
Because results for many tumor markers have not been adequately reported, research data have been difficult to interpret and published work from different sources could not be compared. Guidelines for carrying out tumor marker studies in a transparent fashion and for adequately reporting research findings have therefore been jointly published by the US National Cancer Institute and the European Organization for Research and Treatment of Cancer (NCI-EORTC) [179], so that researchers could have confidence in the outcome and could repeat these data using the published methods.
It is envisioned that eventually a microfluidic device of an implantable biosensor platform that is simple in design, durable in performance and easy to use will be produced, whereby an individual takes noninvasive stool or semi-invasive blood samples at home and inserts them into it for assay of colon cancer disease markers. Identification of early stage disease biomarkers combined with a realistic awareness of self and sustained discipline for good and improved health would allow the individual to take preventative actions quickly, which will help prevent the spread of this cancer.

Recommendations
The following issues are considered important and represent a summary of how we envision miRNAs to influence colon cancer development and progression:

1.
It is necessary to thoroughly understand the normal, healthy functions of the human body and their value ranges (e.g., with respect to age and sex) in order to detect more rapidly what is abnormal, by studying human tissue, blood and stool from healthy donors and patients. Such studies need high-quality samples from large numbers of subjects (in the hundreds to thousands) selected by an appropriate epidemiological design to facilitate reaching meaningful conclusions.

2.
When carrying out biological studies, it is essential to select the number of subjects by an epidemiologically acceptable approach, and to have an adequate number of samples (in the hundreds to thousands) to be able to carry out thoughtful analyses and reach meaningful conclusions.

3.
In its application as a screening approach, global miRNA profiling by a high-throughput omic method such as microarrays, followed by real-time qPCR, as well as digital PCR (dPCR) and next-generation sequencing (NGS), should be looked at as an expedition into the terra incognita of molecular diagnosis to identify novel genes, mechanisms and/or pathways in which a stimulus, whether genetic or environmental, exerts a change on the physiology of the cell.

4.
MiRNA profiling is limited by the availability of cells that can be obtained by noninvasive methods, by the genetic heterogeneity of the tested population, and by environmental factors such as diverse lifestyles and nutritional habits.

5.
MiRNA quantification can be influenced by the choice of methodology, which must be considered when interpreting the miRNA analysis results.

6.
Because array data often underestimate the magnitude of change in miRNA level, it would be essential to use an independent confirmatory method, such as NGS, RT-qPCR, or Northern blotting, to check the magnitude of the miRNA level of the identified target gene(s), as the magnitude of the change in the miRNA level depends on a variety of parameters, particularly the normalization method employed.

7.
To avoid errors due to exosomal RNA loss when using restricted extraction of total RNA from stool or blood, a parallel test should also be carried out on total RNA obtained from stool or plasma samples, and appropriate corrections for exosomal loss need to be made.

8.
Although it is now mainly used as a basic science tool, global miRNA gene expression is moving from laboratories to large-scale clinical trials as a diagnostic tool to describe a pathophysiologic condition, or even to allow clinical states to be determined in diseases such as cancer.

9.
MIQE guidelines, which address several key aspects of qPCR, including sample quality control, assay design, PCR efficiency, and normalization, were published in 2009 to help scientists publish data that are both accurate and reproducible, and have recently been followed by similar guidelines for digital PCR.
10. Although our results show that several miRNA genes can be used to discriminate noninvasively between healthy individuals and patients with colon cancer, it would be necessary to conduct a prospective randomized validation study using the methods that we have outlined herein, but on a larger number of individuals, to have statistical confidence in the data outcome.
11. Effort is needed to identify driver methylation changes believed to be critical to the process of tumor initiation, progression or metastasis, and to distinguish these from methylation changes that are passenger events, which accompany the transformation process but have no effect per se on carcinogenesis.
12. Guidelines for carrying out tumor marker studies in a transparent fashion and for adequately reporting research findings have been jointly published by the US National Cancer Institute and the European Organization for Research and Treatment of Cancer (NCI-EORTC).