An overview of the potential of in vitro studies and transcriptomics for cancer

The concept of cell lines emerged from cancer, the uncontrolled growth of cells; however, cancer cell lines are not easy to develop. At present a good number of human cancer and non-cancer cell lines are available for research aimed at finding a cure for this disease through the development of new therapies and medicines. According to different reports, cases of this disease are increasing day by day. Environmental pollution and habits of the modern lifestyle, such as smoking and tobacco use, are increasing the risk of cancer. Reports on this issue from several countries, including India, are alarming. The risk, severity and symptoms can be better understood by molecular-level analysis of the disease. To date, the advancement of technology from microarray transcriptome analysis to transcriptome sequencing has provided deep insight into cancer. Limited sample sizes and ethical issues related to human subjects necessitate the use of in vitro techniques along with such molecular techniques to gain a thorough understanding of the disease, an approach that also offers good prospects.

Correspondence to: Pankaj Soni, Senior Research Fellow, Molecular Biology Division, ICAR-NBFGR, Lucknow, India, Tel: 8115930054; E-mail: mbt.pankaj@gmail.com


Introduction
The first reported human cell line established in culture was HeLa, which originated from the cervical tumor of a patient named Henrietta Lacks, who later died of cancer in 1951. Thus the very first human cell line was a cancer cell line [1,2]. The exact number of human cell lines has not been established, but the Cancer Cell Line Encyclopedia (CCLE) alone holds approximately 1000 human cell lines and the American Type Culture Collection (ATCC) holds around 4000. The field of application of human cell lines, whether cancerous or non-cancerous, is vast, and the extent of information that can be obtained from them is very large. The study of human cell lines has transformed our understanding of human cancer cell biology, from oncogene function to therapeutic sensitivity [3].
One of the major benefits of using cultured cell lines in cancer research is that they offer an infinite supply of a relatively homogeneous cell population that is capable of self-replication in standard cell culture medium [4]. Like stem cells, cancer cells are widely thought to be able to proliferate indefinitely, and yet it is notoriously difficult to establish immortal cell lines in culture from primary cancers [5]. Despite the difficulties in developing cell lines from different human cancers, much success has been achieved in this area [6], and at present cell lines are available from almost every known type of cancer. The importance of cell lines in cancer biology is increasing with the advent of new technologies, and their contribution is not restricted to cancer but is multidisciplinary; one early and important example is the use of the HeLa cell line in developing the vaccine against poliovirus [7].
With the development of cancer cell lines we have been able to make progress in therapy and medicine, but according to reports the maximum survival period of a cancer patient is limited to five years for different cancers [8][9][10]. According to many statistics, more than 11 million people are diagnosed with cancer every year, and it is estimated that there will be 16 million new cases every year by 2020 [11]. A few prominent types of cancer, such as ovarian and oral cancer, have even more alarming reports: every year 4100,000 women around the globe die of ovarian cancer [12], while oral cancer remains the 8th and 13th most common cancer worldwide for males and females respectively [13].
Studies define cancer as a cluster of diseases involving alterations in the status and expression of multiple genes that confer a survival advantage and undiminished proliferative potential on somatic or germinal cells [14]. For a gene-level understanding of a disease, transcriptomics is a powerful tool in applied as well as basic biological research. Genomic studies play a crucial role in the study of cancer treatment, and there is a need to further explore technologies that provide deep insight into gene expression.
The emergence of "-omic" technologies in the first decade of the 2000s allowed the characterization of cancers at the molecular level, which revealed the genetic heterogeneity of tumors along with numerous potential targets [15]. Expression studies contributed heavily to these molecular-level studies and were carried out long before today's definition of the transcriptome, which is now much more reliable owing to advances in sequencing technologies. Before advanced sequencing, the technologies used for transcriptomics and gene expression studies were microarrays and serial analysis of gene expression (SAGE).
The expression profiling of the first human cell line, HeLa, was done using microarrays, which pioneered 'omics' work [16][17][18]. The transcriptome of HeLa has been characterized with second-generation sequencing technologies, e.g., poly(A)-RNA [19] and small RNAs (Affymetrix ENCODE Transcriptome Project & Cold Spring Harbor Laboratory ENCODE Transcriptome Project 2009). The genome of a cell is defined as its total set of genes or total DNA content, whereas the transcriptome of a cell is dynamic and changes with the intracellular or extracellular environment. There are general differences between the environment of cells growing in vitro and that of a heterogeneous tissue [20]. In the late 1980s, compounds in clinical trials for solid tumors had been identified through screens using transplantable murine neoplasms [21], which helped develop an in vitro human-based tool for identifying new anticancer compounds [22]. After almost two decades, deep analysis of cancer cell lines at the genomic level was found to be critical for establishing oncogenic driver events as well as the mechanisms underlying response to treatments [23,24]. These studies demonstrated the importance and scope of in vitro as well as molecular-level studies.
To date, studies of cancer rely on the use of primary tumors, paraffin-embedded samples, cancer cell lines [25][26][27], xenografts [28][29][30], tumor primary cell cultures [26,27] and/or genetically engineered mice [28]. Cancer cell lines are used to investigate responses to adverse environmental conditions [31,32], genetic perturbations [33] and more, which shows how broad the use of cancer cell lines is. It is also important to mention that not only cancer cell line studies but also in vivo studies have found new direction with the advancement of transcriptomic technology, which yields data that can be analysed in silico. With such data we can explore almost all genes relevant to a study and target the gene or genes of interest according to our hypothesis, which can help solve the problem at hand.

Expression studies on cancer with microarray
The National Center for Biotechnology Information (NCBI) is part of the United States National Library of Medicine (NLM), a branch of the National Institutes of Health (NIH), which is an agency of the U.S. federal government's Department of Health and Human Services. Its databases hold information on 40836 gene loci, 1129699 protein sequences and 3558791 expressed sequence tags (ESTs) related to cancer, which alone indicates how much information is available from expression studies conducted on cancer. Presenting all the microarray and sequencing information related to cancer in a single article would therefore be a tedious task, but a chronological survey can help us understand the development, use and importance of these technologies, from microarrays to advanced sequencing [34,35].
Long before molecular profiling, the technique used to recognise different cancers was histopathology based on morphological observations [4]. Expression studies became feasible after the Southern blot technique was introduced by E. M. Southern in 1975. A gene expression study of cancer using microarrays was first performed in 1996 to assess the development and progression of cancer and the experimental reversal of tumorigenicity, using mRNA from the tumorigenic UACC-903 cell line and the non-tumorigenic UACC-903(+6) cell line. The reliability of the differential expression results in the two cell types was also tested by northern blot analysis of a few of the genes on the microarray, and the results were found to be corroborating [36].
Although the classification of cancer had improved up to the last decade of the 20th century, there was no general approach for identifying new classes of cancer; this became feasible with gene expression studies [37]. Gene expression studies also facilitate the identification of different types of cancer and of cancerous and non-cancerous stages, because microarrays allow thousands of genes to be analysed in parallel with significant accuracy using statistical and bioinformatic approaches. One example is two-way clustering analysis, which separated cancerous from non-cancerous tissue and cell lines from in vivo tissues on the basis of subtle distributed patterns of genes, even when the expression of individual genes varied only slightly between the tissues [38,39].
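The idea of two-way clustering can be illustrated with a minimal sketch: genes (rows) and samples (columns) of an expression matrix are clustered separately, here by average-linkage agglomerative clustering on a correlation distance. The distance measure, the linkage rule, the function names and the toy values are all assumptions chosen for illustration; they are not taken from the cited studies, which may have used different algorithms.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cluster(vectors, k):
    """Average-linkage agglomerative clustering with distance 1 - Pearson r.

    Returns a sorted list of k clusters, each a sorted list of indices.
    """
    clusters = [[i] for i in range(len(vectors))]
    dist = {
        (i, j): 1.0 - pearson(vectors[i], vectors[j])
        for i, j in combinations(range(len(vectors)), 2)
    }

    def linkage(a, b):
        # Mean pairwise distance between the members of clusters a and b.
        total = sum(dist[min(i, j), max(i, j)] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        # Merge the two clusters that are currently closest to each other.
        a, b = min(combinations(clusters, 2), key=lambda p: linkage(*p))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


def two_way_cluster(matrix, k_genes, k_samples):
    """Cluster the rows (genes) and the columns (samples) of a matrix."""
    columns = [list(col) for col in zip(*matrix)]
    return cluster(matrix, k_genes), cluster(columns, k_samples)


# Toy matrix with hypothetical values: rows are genes, columns are samples.
# Samples 0-2 stand for tumor tissue, samples 3-5 for normal tissue.
expression = [
    [9.1, 8.7, 9.4, 2.1, 1.8, 2.5],  # gene 0: high in tumor
    [8.5, 9.0, 8.8, 2.4, 2.0, 1.9],  # gene 1: high in tumor
    [1.5, 2.2, 1.9, 7.9, 8.3, 8.1],  # gene 2: high in normal
    [2.0, 1.7, 2.3, 8.6, 8.0, 8.4],  # gene 3: high in normal
]
gene_clusters, sample_clusters = two_way_cluster(expression, 2, 2)
print(gene_clusters)
print(sample_clusters)
```

On this toy input the samples separate into the tumor and normal groups and the genes into the two co-expressed pairs. Real microarray data would need normalization first and far more efficient clustering code than this quadratic sketch.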
The cell lines developed from different cancers are still used as good experimental models, with the caveat that these cell lines differ from both normal and cancerous tissue. The advanced study and development of biomarker genes is in fact a result of such pioneering gene expression studies with microarrays, which may ultimately lead to the development of effective therapies for such diseases. Cancer cell lines together with microarrays have been used to determine the histologic origin of cells, interpret gene expression patterns in clinical samples, study physiological variation and gene fusion, and gather prognostic information on disease, purposes that were not achievable before the development of this technique [40,41].

Expression studies on cancer with transcriptome sequencing
In contrast to microarray methods, sequence-based approaches directly determine the cDNA sequence. The complexity of the transcriptome was first unravelled to some extent by traditional Sanger sequencing of cDNA and ESTs, which eventually gave way to next-generation sequencing (NGS) [42,43]. Prior to next-generation DNA sequencing, other high-throughput approaches, namely BAC end sequencing, fosmid paired-end sequencing and serial analysis of gene expression (SAGE) sequencing, were applied for genome-wide detection of chromosomal rearrangements in cancer. At present RNA-Seq is a powerful tool for identifying rearrangements that lead to chimeric transcripts and are more likely to have functional consequences in cancer. Since the middle of the first decade of the twenty-first century, sequencing technologies such as 454 Life Sciences (Roche) and Illumina (formerly Solexa) have also been developed for sequencing mRNA [44,45].
The pioneering transcriptome work was probably done on the breast cancer cell line HCC1954, aimed principally at finding gene fusions using reference mRNAs from GenBank, after the mRNA dataset was made available in 2008 [46]. Understanding the transcriptome is essential for interpreting the functional elements of the genome, revealing the molecular constituents of cells and tissues, and studying the development of disease; this can be achieved by hybridization-based or sequence-based approaches. Hybridization techniques have several limitations, such as the need for prior knowledge of the genome sequence and the problem of cross-hybridization, which compromises the accuracy of the results. These issues are addressed by RNA-Seq technology [47].
RNA sequencing is a highly accurate tool for measuring expression across the transcriptome and allows both known and novel features to be detected in a single assay; this enables the detection of transcript isoforms, gene fusions, single nucleotide variants, allele-specific gene expression and other features without prior knowledge [48,49]. mRNA sequencing is used to study novel as well as known features of the coding transcriptome. Other approaches, such as total RNA sequencing, small RNA sequencing, single-cell RNA sequencing, targeted RNA sequencing and ribosome profiling, are also used to gain insight into different processes. Unlike the genome, RNA transcripts are not present at equimolar concentrations and are typically expressed in a context-specific manner. Genes expressed in this way can be used as biomarkers, defined as any molecule derived from a biological sample that can indicate current disease status, evaluate progression of the disease, and assess potential responsiveness to a particular medication [49,50]. Biomarkers can thus also be used to determine the specific stage or condition of a cell or cells. Microarray-based gene expression profiling is another mature, high-throughput transcriptomic approach that has been extensively applied in biomedical and clinical research as the major biomarker tool for almost two decades, although with lower accuracy than RNA-Seq. RNA-Seq is additionally promising because of its ability to discover splice junctions, novel transcripts and alternative splicing variants with accurate measurement of gene expression levels and, most importantly, un-annotated genes (genes for which no reference sequence is available), and because of its sustained decrease in cost. Interestingly, microarray and RNA-Seq data can be applied reciprocally.
In conclusion, RNA-Seq and microarray-based models are comparable and can be used in clinical endpoint prediction for cancer or other diseases. A clinical endpoint here refers to any abnormality or symptom that constitutes one of the target outcomes of the trial [51,52].
A genome-wide screening of transcriptome dysregulation between cancer and normal tissue would provide insight into the molecular basis of cancer initiation and progression. Various studies have found a higher than expected mutation frequency in candidate cancer genes, and RNA sequencing has the potential to detect such abnormal regulation in cancer [53][54][55][56]. The most typical and challenging aspect of cancer is metastasis, a term originally coined in 1829 by Jean Claude Recamier. A defining hallmark of a malignant tumor, metastasis is the process by which cancer cells move from their original site to another location [57]. The process of metastasis is still debated, but various attempts have been made using RNA-Seq to gain substantial information about it in breast cancer, spindle cell sarcoma and other types of cancer [58][59][60].
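A screen for transcriptome dysregulation between cancer and normal tissue can be sketched, in its simplest form, as a per-gene log2 fold change of mean expression. The sketch below assumes already normalized expression values; the gene names, the pseudocount and the threshold are hypothetical choices for illustration. Published analyses generally rely on dedicated statistical models that account for replicate variance rather than on a bare fold-change cut-off.

```python
from math import log2
from statistics import mean


def log2_fold_changes(tumor, normal, pseudocount=1.0):
    """Return {gene: log2((mean tumor + c) / (mean normal + c))}.

    The pseudocount c avoids division by zero for unexpressed genes.
    Only genes present in both inputs are compared.
    """
    return {
        gene: log2(
            (mean(tumor[gene]) + pseudocount)
            / (mean(normal[gene]) + pseudocount)
        )
        for gene in tumor
        if gene in normal
    }


def dysregulated(fold_changes, threshold=1.0):
    """Split genes into up- and down-regulated lists at |log2FC| >= threshold."""
    up = sorted(g for g, fc in fold_changes.items() if fc >= threshold)
    down = sorted(g for g, fc in fold_changes.items() if fc <= -threshold)
    return up, down


# Hypothetical normalized expression values; gene names are placeholders.
tumor = {
    "GENE1": [120.0, 140.0, 130.0],
    "GENE2": [15.0, 12.0, 18.0],
    "GENE3": [50.0, 55.0, 45.0],
}
normal = {
    "GENE1": [30.0, 28.0, 35.0],
    "GENE2": [60.0, 66.0, 63.0],
    "GENE3": [48.0, 52.0, 50.0],
}

fc = log2_fold_changes(tumor, normal)
up, down = dysregulated(fc)
print(up, down)
```

With these toy values GENE1 is flagged as up-regulated in tumor, GENE2 as down-regulated, and GENE3 as unchanged.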

Cell lines and transcriptomics
Almost a decade has passed since cell lines were first employed for transcriptomic studies related to cancer. The transcriptome of HeLa cells was first characterized in 2008, but the genomic sequence information was based on the human reference genome [19]. For expression studies, however, cell lines have been in use for considerably longer, as mentioned earlier. One point that must be taken into consideration is that transcriptomic studies with microarrays, like some high-throughput sequencing strategies, are purely expression studies [61]. These studies are based on the central dogma and require the sequences of the genes under study. Transcriptomic analysis with NGS also covers components not covered by the central dogma, such as non-coding RNA, miRNA and siRNA, and does not necessarily require a reference gene. Up to 2012 there were 3,359 publications on A431, a model cell line for epidermoid carcinoma, and over 1,200 published articles on U251MG, a commonly used glioblastoma cell line; 35% of the articles associated with the osteosarcoma Medical Subject Headings (MeSH) term in the PubMed database used U2OS, an osteosarcoma cell line. These studies became feasible owing to the availability of genomic, transcriptomic and proteomic data on human cancer cell lines [62]. Similarly, multiple cell lines and comparable data are now available for the study of oral, brain, ovarian, gastric, liver and prostate cancer and almost all other types of cancer known to occur in humans. Some cell lines from mouse models have also been considered for such studies [63][64][65][66][67].
A large set of 675 human cancer cell lines has been studied in parallel for comprehensive transcriptome features, including gene expression, mutations, gene fusions and expression of non-human sequences. That single study reported 2,200 gene fusions, 1,435 of which consist of genes not previously reported in gene fusions. To identify and resolve genomically similar cell lines, SNP genotyping data were also compared by clustering, and after one representative cell line was selected from each group of genomically related cell lines, 610 distinct cell lines were retained [68]. For such studies, large sets of molecular profiles are available for both tumor samples and cell lines. In The Cancer Genome Atlas (TCGA), the genomes and expression profiles of at least 500 tissue samples per tumor type are being comprehensively characterized. Likewise, the Broad-Novartis Cancer Cell Line Encyclopedia (CCLE) contains genomic profiles of around 1,000 cell lines that are used as models for various tumor types, which helps one appreciate the contribution of cell lines to the field of cancer [69,70].
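The step of grouping genomically similar cell lines from SNP genotypes and keeping one representative per group could look roughly like the sketch below. It is only an illustration under stated assumptions: the concordance measure, the 0.8 threshold, the single-linkage grouping, the rule for picking a representative and the cell line names are all hypothetical and are not the procedure of the cited study [68].

```python
def concordance(a, b):
    """Fraction of SNP positions with identical calls, ignoring missing calls."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def group_similar(genotypes, threshold=0.8):
    """Group cell lines linked by pairwise concordance >= threshold.

    Uses a small union-find structure, so grouping is single linkage:
    a line joins a group if it is similar to any one member of it.
    """
    names = sorted(genotypes)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if concordance(genotypes[a], genotypes[b]) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


def representatives(groups):
    """Keep one line per group (here simply the alphabetically first name)."""
    return [g[0] for g in groups]


# Hypothetical genotype calls: 0, 1 or 2 copies of the alternate allele.
genotypes = {
    "LINE_A": [0, 1, 2, 2, 0, 1, 1, 0, 2, 0],
    "LINE_A_subclone": [0, 1, 2, 2, 0, 1, 1, 0, 2, 1],
    "LINE_B": [2, 0, 0, 1, 1, 2, 0, 2, 1, 1],
    "LINE_C": [1, 2, 1, 0, 2, 0, 2, 1, 0, 2],
}
groups = group_similar(genotypes)
kept = representatives(groups)
print(groups)
print(kept)
```

Here the two LINE_A entries agree at nine of ten positions and collapse into one group, so three distinct lines are retained from four inputs.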

Need, suitability and problems with cancer cell lines for cancer studies
Along with ethical issues, the low availability of samples from human cancers would have made it backbreaking, if not impossible, to obtain the large amount of data that has been generated using cell lines. Human specimens taken directly from the operating room are difficult to obtain and have limited value for biochemical and molecular comparisons between tumor and normal tissues. Besides the cell type of interest, these tissues contain varying amounts of other cells, including lymphocytes, blood vessels and stromal fibroblasts, which interfere with comparisons. Attempts to dissociate the tissue and isolate the particular cell type of interest usually yield very few cells for study [71]. Cultures derived from normal human cells cannot grow indefinitely, because normal human cells have a limited lifespan in culture and almost never spontaneously immortalize (in contrast to rodent cells). Consequently such cultures can only be used over a limited period until they senesce. The uncontrolled growth of cells that gives rise to a tumor created the misconception that, because cancers seem to have unlimited growth potential in patients, the cells are easy to culture and have limitless growth potential in the laboratory. However, cell lines change in culture, no longer retain the tumor heterogeneity present in the primary cancer and do not contain the relevant components of the tumor microenvironment. On the other hand, a period of stable phenotypic and genotypic properties makes them efficient for studying mechanisms of tumorigenesis and evaluating therapy, although as the cells proliferate they show little or no evidence of tissue-specific differentiation, which makes them different from their tissue of origin [72,73].
However, a striking observation was made for eight cancer types of the NCI-60 panel (a panel of 60 cancer cell lines): all of the cell lines, whether grown in vitro or in vivo, bore more resemblance to each other, regardless of the tissue of origin, than to the clinical samples that they are supposed to model [22,74,75]. The aim of that work was to study the mechanism of clinical anticancer drug resistance (multidrug resistance, MDR) relevant to established cell lines, and it emphasized the need for new in vitro cancer models. This established similarity suggests that such cell lines could be used in a compensatory manner. Apart from a few such clinical failures, where cells do not resemble the tissue from which they originate, cancer cell lines are in vitro models that provide thorough insight into cancer through successful genomic, proteomic and transcriptomic studies, together with xenograft studies, and they help develop new ways to understand the biology of cancer while remaining reliably available over a long time.

Discussion
In vitro studies are widely accepted as a way to understand human cancer biology. The development of the first reported cell line from cancerous tissue led to the belief that cells grow indefinitely outside the human body. It was natural for cancer researchers to assume that this would help them comprehend the disease fully. The considerable difference between the microenvironment of cells inside and outside the human body was realized only after long-term culture and analysis of the cells. Since the disease was first recognized, the scientific community has made continuous efforts to find a cure, but so far only early-stage cancer is curable. There are exceptions, in which the conditions under which the disease spreads are not the same. This degree of complexity necessitates thorough investigation to find a solution.
At present, molecular-level analysis of any disease or process is possible with an accuracy that helps us understand how or why a particular problem occurs and helps find a permanent or partial solution. The aim of studies from microarrays to advanced transcriptomics has been to find such solutions for cancer. Thousands of successful attempts have been made with human cancer cell lines to develop new therapies and medicines, in which genomics, proteomics and metabolomics also play a crucial role along with transcriptomics. The large collections of data and the databases developed through omics technologies are helpful in providing appropriate future directions. Although in vivo studies also provide invaluable data, much of the data is obtained from in vitro experiments.

Conclusion and perspective
Human cancer cell lines are widely used in basic and applied research to find a cure for cancer. Technologies have advanced from histopathology to molecular biology for the clear identification of different cancers. Human cell lines developed from normal and cancerous tissues are used for a better understanding of the disease, which may lead to its successful treatment. Cancer is a very complex condition, but the precision of its study has greatly increased with the advent of new technologies for transcriptomic studies, the most advanced and precise of which today is RNA sequencing. Several biomarker genes, proteins, miRNAs, siRNAs etc. have been found to be associated with different types of cancer, but debate persists because of differing findings by various researchers. Several comparative molecular studies have achieved better results in identifying different cancers and their stages, which strongly suggests that comparative molecular studies of different stages of cancer may reduce the ambiguity of the findings and thus lead to unambiguous results.