Relationship between E6-E7 lineage sequences, viral loads, and integration of HPV16 in women with atypical squamous cells of undetermined significance (ASCUS) pap smears

HPV16 DNA associated with histologic high-grade precancer lesions (cervical intraepithelial neoplasia grade 2 or 3 [CIN2 or CIN3]) is sometimes found in women with atypical squamous cells of undetermined significance (ASCUS) Pap smears, the mildest form of cellular abnormality in a cervical smear. In a population of patients with ASCUS Pap smears, this study evaluated the prevalence of different HPV16 lineages (or isolates) and, stratified by the HPV16 isolate found (primarily either EUR-350T or EUR-350G), the physical status, total viral load, and integrated viral load of HPV16 DNA. In smears of women diagnosed with ASCUS and infected with EUR-350G HPV16 isolates (sublineage A1), both the physical status of HPV16 and the distribution of integrated HPV16 DNA viral loads differed from those in a population infected with EUR-350T HPV16 isolates. The mean and median percentages of integrated HPV16 DNA were higher, and the distribution of log10 integrated HPV16 viral load was more homogeneous, in the population infected with EUR-350G HPV16 isolates than in the population infected with EUR-350T HPV16 isolates. No difference was found in total HPV16 viral load between these two isolates. Here, we investigate E6-E7 sequencing as an easy technique to detect women with a greater proportion of integrated HPV genome, and we suggest an adaptation of the current protocol for monitoring women with ASCUS Pap smears.

Correspondence to: Valerie Giordanengo, CHU de Nice Hôpital Archet 2-151 route Saint-Antoine de Ginestière CS 23079 06 202 Nice Cedex 3, France, Tel: 04 92 03 61 84, E-mail: giordane@unice.fr


Introduction
Mucosal Human Papillomaviruses (HPVs) play a crucial role in the development of malignancies. These viruses are responsible for 99.7% of cervical cancers [1] and are classified as High-Risk (HR) or Low-Risk (LR), based on their association with cervical cancer [2,3]. HPV infection is extremely frequent, most often transient and asymptomatic, and is usually cleared in 9-15 months; only a small fraction of women infected with a HR-HPV develop precancerous lesions that progress to cervical carcinoma. Persistent HR-HPV infection (especially HPV16) has been well established as the central cause of cervical cancer, and integration of HPV DNA into the host cell genome is believed to be essential for malignant transformation. However, persistent infections with certain variants of HPV16, the genotype found in approximately half of all cervical cancers [4], differ in cancer risk. These variants diverge in their biological properties; therefore, they may become important risk factors in cervical cancer because of possible differences in pathogenicity [5][6][7][8][9].
HPV DNA integration into the host cell genome is believed to be essential for malignant transformation [25][26][27]. Correlations between the progress of cancer and the physical state of the HPV genome have been shown in different studies; episomal HPV seems to be predominant in the early stages of CIN, while the integrated virus is more frequently detected in high-grade CIN and Squamous Cell Carcinoma (SCC) [27,28]. Nevertheless, a study by Peitsaro et al. supports the idea that integrated HPV can also be found in low-grade lesions (< CIN2) [22].
Another marker of carcinogenic HPV16 infection, high viral load, has been associated with a risk of developing CIN2 or CIN3 lesions [29][30][31][32], whereas other reports showed no correlation [33,34]. Carcopino et al. demonstrated, in cervical smears collected before colposcopic examinations in women aged 30-40, significant correlations between high HPV16 viral loads and the presence of high-grade squamous intraepithelial lesions [35]. Another study, with a prospective cohort design, reported that higher HPV16 viral loads were associated with modest increases in the incidence of cytological abnormalities and were predictive of incident CIN2/3+ [36].
The goal of our study was to describe the prevalence of different HPV16 lineages and isolates in women with Atypical Squamous Cells of Undetermined Significance (ASCUS) Pap smears living in the southeast of France. We also aimed to quantify HPV16 physical status and HPV16 DNA viral loads in this population, stratified by the HPV16 isolates found. Lastly, based on this study, we propose an adaptation of the current protocols for monitoring patients with ASCUS Pap smears.

E6-E7 nucleotide sequences analysis
After a first step of cytological analysis and HPV genotyping, samples diagnosed both as ASCUS (by the Pathology Laboratory of the University Hospital of Nice) and HPV16-positive with PapilloCheck® HPV DNA chips (by the Laboratory of Virology of the University Hospital of Nice) were further amplified to sequence the region containing the E6 and E7 genes (nt 83-858). HPV16 DNA nucleotide positions were numbered according to the published sequence of the reference clone [37]. The E6/E7 nucleotide sequences (nt 83-858) were aligned against the prototype HPV16 sequence (NCBI Reference Sequence: NC_001526.2) using the BioEdit Sequence Alignment Editor. Sequence differences between the E6-E7 genes of these HPV16 genotypes are summarized in Table 1.
The four major lineages of HPV16 were found in the studied population, and a maximum likelihood tree of the E6/E7 sequences was constructed to confirm the presence of the two principal isolates (EUR-350T prototype NC_001526.2 or EUR-350G) (Figure 1). The large majority of samples (95%) contained HPV16 sequences belonging to sublineage A1, and were identical either to the prototype EUR-350T (37.7%) or to the EUR-350G isolate (57.6%) (Table 1). The number of women infected with EUR-350T or EUR-350G during these four years showed no important variation of distribution (Figure 2). Sequence analyses of sublineage A1 showed substitutions in 10 nucleotides located between positions 109 and 856 in the E6-E7 sequence, with no changes in predicted amino acid sequences for EUR-350T isolates (2 silent mutations), and with a change for 4 EUR-350G isolates (six amino acid changes: I52V, L83V, I52L, V53G, A46T, E89L). The non-European HPV16 variants (lineages B, C and D) were found in five ASCUS samples (Table 1).

Table 1. Sequence alterations, relative to the E6-E7 ORF of the reference HPV16 sequence (referred to as the prototype HPV16 sequence): each letter represents the nucleotide change at that position. Lower case letter = nucleotide change at that position without an amino acid change. Upper case letter = nucleotide change at that position with a resulting amino acid change.

NC_001526.2 T G C G G A T T A C T G A T G A T A G T T T A
In the "amino acid" column, the letter preceding the amino acid position refers to the reference HPV16 sequence, and the letter after it refers to the substitution. Bold letters indicate an EUR-350G isolate with other amino acid changes.

Loubatier C (2018) Relationship between E6-E7 lineage sequences, viral loads, and integration of HPV16 in women with atypical squamous cells of undetermined significance (ASCUS) pap smears
In women infected with non-European HPV16 variants (lineage B, C or D), patient age ranged from 20 to 55 years, with a mean age of 31.0 years. In women infected with the EUR-350T or EUR-350G variants, patient age ranged from 23 to 57 years (mean age 35.7 years) and from 19 to 69 years (mean age 35.1 years), respectively (Table 1).

HPV16 DNA status and HPV16 variants
We analyzed the physical status of HPV16 in the cervical smears that were diagnosed as ASCUS. The physical state of HPV16 was determined using qPCR that targeted the HPV16 genes E6 and E2, because the E2 gene is frequently lost during viral integration. Thus, E2 and E6 genes are present in equivalent amounts in cells with only episomal HPV genomes, whereas in cells with 100% integrated HPV genomes, the E6 gene is present but the E2 gene is absent [38]. The percentage of integrated HPV DNA was expressed as the ratio of the integrated viral load ([number of E6 DNA copies] - [number of E2 DNA copies] per cell) to the total number of E6 DNA copies per cell, multiplied by 100. An undetectable E2 gene defined fully integrated forms, as detected in the SiHa cell control.
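The ratio described above can be illustrated with a minimal sketch. The function name `percent_integrated` and the clamping of negative differences to zero (for the case where measurement noise gives E2 > E6) are assumptions made for illustration, not details taken from the study protocol.

```python
def percent_integrated(e6_copies: float, e2_copies: float) -> float:
    """Percentage of integrated HPV16 DNA from E6 and E2 copy numbers per cell.

    Integrated load = E6 - E2; percentage = integrated load / E6 * 100.
    """
    if e6_copies <= 0:
        raise ValueError("E6 copy number must be positive")
    # Assumption: if noise yields E2 > E6, treat the sample as fully episomal.
    integrated = max(e6_copies - e2_copies, 0.0)
    return 100.0 * integrated / e6_copies


print(percent_integrated(10.0, 10.0))  # 0.0   (E2 = E6: episomal only)
print(percent_integrated(10.0, 0.0))   # 100.0 (E2 undetectable, SiHa-like)
print(percent_integrated(10.0, 6.0))   # 40.0  (mixed episomal and integrated)
```

The three calls correspond to the purely episomal, fully integrated, and mixed physical states.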
Because this analysis assumes that integration of HPV disrupts the E2 or E1 genes, it may underestimate integration. However, some studies suggest that (1) the most common pattern of deletion lies in the region of the E2 ORF corresponding to the protein "hinge" region [39,40], and (2) it is only the loss of regulatory episomal E2 in cells that confers a growth advantage on selected cells [41,42].
In this study, the population infected with non-European HPV16 variants presented a higher percentage of integration (mean = 63.3%) than the population infected with European HPV16 variants (mean = 31.9%); 2 samples in lineage D and 1 sample in lineage C were 100% integrated. In the population infected with sublineage A1 isolates, which were then classified into EUR-350T and EUR-350G, we observed a higher mean percentage of integration in EUR-350G (36.3%) than in EUR-350T (25%), a significant difference between these two populations (p < 0.05, Student's t-test) (Table 1). In women infected with the EUR-350T isolate, we observed a higher number of samples without any integrated HPV genome (19 cases) than in women infected with the EUR-350G isolate (5 cases) (chi-square [χ²] test, p < 0.001 [p = 5.6 × 10⁻⁶]). Interestingly, box plot analysis (Figure 3) confirmed this difference in HPV16 DNA physical status between these two populations. In the population infected with the EUR-350T isolate, the median percentage of integration was 15.70%, with a 75th percentile of 39.56% and a 25th percentile of zero. By contrast, in the population infected with the EUR-350G isolate, the median percentage of integration was higher (39.80%), with a more homogeneous distribution of values (75th percentile = 46.45%, 25th percentile = 32.39%).

Total or integrated viral load and HPV16 variants
We analyzed the physical status of HPV16 by calculating the total viral load (number of E6 DNA copies per cell [c/c]) and the integrated viral load ([number of E6 DNA copies] - [number of E2 DNA copies] per cell), normalized for cell number by quantification of β-globin DNA; Table 1 shows the mean total viral load and the mean integrated viral load in each population. The mean total HPV16 viral loads in cervical smears from women diagnosed with ASCUS were 35 c/c and 16 c/c in the populations infected with the EUR-350T and EUR-350G HPV16 isolates, respectively. The mean total viral load was 32 c/c in the population infected with non-European HPV16 isolates, and this population was very heterogeneous for this marker. The mean integrated viral load was similar in the populations infected with the EUR-350T and EUR-350G variants (7 c/c and 6 c/c, respectively), and was higher in the population infected with the non-European HPV16 variants (16 c/c). The associations between HPV16 viral loads and European or non-European HPV16 isolates are shown in Table 1.
The distribution of log10-transformed HPV16 viral load per million cells (total and integrated) in the population infected with EUR-350T or EUR-350G HPV16 isolates is shown in Figure 3. The log10 total viral loads in cervical smears infected with the EUR-350T or EUR-350G HPV16 isolates were similar, ranging 0-8.76 (median = 6.41) and 0-8.50 (median = 6.33), respectively. An analysis of the distribution of log10-integrated viral load per million cells showed a higher median in cervical smears from women infected with EUR-350G HPV16 isolates (5.70) than in cervical smears from women infected with EUR-350T HPV16 isolates (0.00). The box plot analysis, in particular, demonstrated that the distribution of integrated viral load was more heterogeneous in the population infected with the EUR-350T isolate (75th percentile = 5.95, 25th percentile = 0.00) than in the population infected with the EUR-350G isolate (75th percentile = 6.62, 25th percentile = 5.06).
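As a reading aid for the log10 scale used above, the following minimal sketch converts between copies per cell and log10 copies per million cells. The function names are hypothetical, and reporting an undetectable load as 0 on the log scale is an assumption suggested by the reported ranges and percentiles, not a stated rule of the study.

```python
import math

CELLS_PER_MILLION = 1_000_000


def log10_per_million_cells(copies_per_cell: float) -> float:
    """log10 of the viral load expressed per million cells."""
    # Assumption: an undetectable load is reported as 0 on the log scale.
    if copies_per_cell <= 0:
        return 0.0
    return math.log10(copies_per_cell * CELLS_PER_MILLION)


def copies_per_cell_from_log10(log10_load: float) -> float:
    """Inverse conversion: log10 copies per million cells to copies per cell."""
    return 10 ** log10_load / CELLS_PER_MILLION


print(log10_per_million_cells(1.0))                # one copy per cell
print(round(copies_per_cell_from_log10(6.41), 2))  # a log10 value of 6.41, in copies per cell
```

Under this reading, a log10 value of 6 corresponds to one copy per cell, so a median of 6.41 corresponds to roughly 2.6 copies per cell.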

Discussion
Infection with high-risk Human Papillomavirus (HPV) is necessary for the development of a cervical lesion, but only a fraction of precursor lesions progress to cancer. Additional factors are likely to increase the probability of this progression occurring. For example, the percentage of HPV16 DNA integration, as a marker for high-grade cervical lesions, was demonstrated in some studies [23], while the increased frequency of non-European HPV16 variants in invasive lesions was described by Tornesello et al. [43], suggesting greater oncogenicity for the non-European HPV16 variants [44][45][46].
Ours is the first study on ASCUS cytology and HPV16 infection in France in which the viral lineages or isolates, plus the physical state and viral loads of HPV16 (total and integrated), have been characterized. We analyzed a population categorized according to the detected HPV16 isolate. Lineage A accounts for a large proportion of HPV16 in many regions, and the EUR-350T/EUR-350G ratio varies considerably by world region [47]. We first analyzed the HPV16 sequences belonging to sublineage A1, which are further divided into the EUR-350T and EUR-350G isolates. The EUR-350G isolate was 1.5 times more frequent than the EUR-350T isolate in this cohort of women, and this ratio (EUR-350T/EUR-350G) remained stable over the 2012-2015 period, as in other French studies [48]. A small fraction of these lineages contained co-mutations that were previously described [49,50].
Some studies showed higher HPV16 viral loads in high-grade lesions [51], and other studies showed that integrated viral load versus total viral load is an interesting marker with which to describe cervical disease progression [52]. For HPV16 DNA, we analyzed the means of viral load and integrated viral load in cervical smears from women diagnosed with ASCUS. The mean HPV16 integrated viral load was similar in the two populations infected with EUR-350T or EUR-350G. Interestingly, we identified a higher median of log10-integrated viral load per million cells and, even more notably, a more homogeneous distribution of log10-integrated viral load per million cells in cervical smears from women infected with EUR-350G compared to women infected with EUR-350T.
Following this first observation, we analyzed the physical state of HPV16 DNA in these two populations. Detection of integrated HPV might be a promising marker for cervical disease progression, because integration of HPV already occurs at an early stage of the disease [22,53]. Regarding the percentages of HPV16 DNA integration in the population infected with EUR-350G vs. EUR-350T, we demonstrated that the mean (36.3% vs. 25.0%, respectively) and, to a greater extent, the median (39.80% vs. 15.70%, respectively) were higher in the population infected with EUR-350G (with a significant difference in percentage of integration between these two populations, p < 0.05, Student's t-test), as described in a study from the south of Poland [54]. Integration of HPV into the human genome often causes a lack of the E2 repressor protein, consequently inducing the up-regulation of the HPV oncogenes E6 and E7 [23,55] with an increased stability of these mRNAs [40]. The analysis of box plots showed more homogeneous values of percentage of integration in the population infected with EUR-350G vs. EUR-350T. Furthermore, we determined that there was a significantly smaller number of completely episomal HPV16 genomes in this EUR-350G population. These findings suggest a lower risk of persistence and progression in women with ASCUS Pap smears who were infected with the EUR-350T isolate [8,15,20]. However, our technique may underestimate the percentage of integration if the E2 gene is intact after integration, for example in the case of tandem-repeated integrated HPV genomes induced by the rolling circle replication mechanism [56,57]. But HR-HPV DNA is frequently integrated into the chromosome of cervical cancer cells and is occasionally found amplified as tandem-repeated integrated HPV genomes, as typically observed in CaSki cells [21,40]. On the other hand, the detection of HPV integration sites can be more relevant to the treatment monitoring of patients [58].
Furthermore, the analysis of these two markers (integrated HPV16 viral load and percentage of HPV16 DNA integration) in the few samples where a non-European HPV16 variant was found showed values that were much higher than in the samples infected with a European variant. This result suggests that non-European HPV16 variants are more oncogenic than European variants, and is in agreement with the findings of previous studies [5,44,59,60] demonstrating that non-European HPV16 variants were significantly more likely than European variants to cause persistence and a CIN3+ evolution [8,14,16,61] and to induce a proliferative phenotype in an organotypic epithelial model [45,46], probably due to 1) modification of the immunogenic properties of the virus [46,62,63], 2) stimulation of cell migration and cell invasion [64] and 3) viral DNA integration into the host genome [45,46].
In France, the guidelines consider the HPV test only as a possible option for managing ASCUS Pap smears. This pilot study highlights the potential predictive benefits of sequencing E6-E7 for the early identification of women with a higher risk of CIN3+ evolution, specifically in an HPV16-infected patient population with ASCUS Pap smears. The identification of the EUR-350G isolate (easily detectable by short region sequencing) in women with ASCUS Pap smears should be considered as a risk indicator, and should constitute an alert signal for the medical staff [54]. This pilot study should be followed up by a larger study to consolidate this result.

Collection of samples
Samples were obtained from women during gynecological visits in the Department of Obstetrics and Gynecology of the Archet Hospital (Nice, France) during 2012-2015. This study was carried out with approval from the local ethical committee, « Comité de Protection des Personnes » 2011-002954-28, with written informed consent from all women. Cervical smears were collected using a cytobrush. Cervical cells were placed in PreservCyt® solution (Cytyc), and were then transferred to the Pathology Laboratory for cytological screening and classification. In the University Hospital of Nice, if the results of routine Pap Tests indicated a diagnosis of ASCUS, the remainder of the sample was tested using PapilloCheck HPV DNA chips; samples positive for HPV16 were sequenced and evaluated for HPV16 physical status.

DNA isolation for PCR or HPV genotyping
For DNA isolation, 1 mL of cell suspension was centrifuged at 2000 × g for 5 min, and the cells were then resuspended in 500 µL PBS (1x). For extraction of HPV DNA, a total nucleic acid automated extraction method was used (EasyMag®, Biomérieux), according to the manufacturer's instructions. DNA was extracted using the Generic 2.0.1 program, with onboard lysis protocol. Then, 50 µL of NucliSENS easyMAG magnetic silica was added and thoroughly mixed with the sample. Elution was performed in 25 μL NucliSens Extraction Buffer 3.

PapilloCheck® HPV DNA chips
HPV genotyping was performed using PapilloCheck® HPV DNA chips (GBO). After extraction, a 350-bp fragment of the E1 ORF, an external PCR control, and a DNA fragment of the human ADAT1 gene were amplified with specific primers (in the presence of 5 μL of each purified DNA). PCR was performed for 40 cycles at 95 °C for 30 s, 55 °C for 25 s, and 72 °C for 45 s, then 15 cycles at 95 °C for 30 s and 72 °C for 45 s. The denatured amplification products were then hybridized to complementary DNA probes on the chip, at room temperature, in a humid atmosphere, for 15 min. Arrays were then washed for 10 s in washing solution I, 60 s in washing solution II at 50 °C, and 10 s in washing solution III; arrays were then spun dry. Slides were scanned and analyzed using the CheckReport™ software package (GBO). The HPV genotyping results are available only if the CheckReport™ software validates the amplification quality of the controls (external PCR control and PCR of the human ADAT1 gene). The oligonucleotide microarray detects 15 HR HPV and 11 LR HPV (HPV 16, 18, 31, 33, 35, 39, 45, 51, 52, 53, 56, 58, 59, 66, 68; and HPV 6, 11, 40, 42, 43, 44/55, 53, 70, 73, 82, respectively). The PapilloCheck® DNA chip contains 28 probes (24 HPV probes and 4 control probes: orientation, hybridization, external PCR, and ADAT1), each in 5 replicate spots.

E6-E7 HPV sequence and phylogenetic analysis
HPV16-positive samples were further amplified with specific sets of oligonucleotides designed to amplify the ORF of E6-E7 (Table 2). PCR amplification of HPV16 E6-E7 was performed with the Invitrogen® ThermalAce DNA Polymerase kit: 3 µL of purified DNA, 5 µL of 10x buffer, 100 ng of each primer (HPV16E6vF [5′-3′] and HPV16E7vR [5′-3′]), and 1 µL Taq polymerase. PCR was performed for 40 cycles at 95 °C for 1 min, 51 °C for 1 min, and 74 °C for 1 min. PCR products were electrophoresed in a 1% agarose gel and were visualized by staining with ethidium bromide. HPV16 variants were identified by sequencing 3 µL of E6-E7 PCR product using the Big-Dye Terminator ready reaction kit (Perkin-Elmer), with the same primers used for initial amplification. Sequencing was run for 10 min at 96 °C, followed by 25 cycles at 96 °C for 10 s, 50 °C for 5 s, and 60 °C for 4 min. Sequencing results were analyzed (after precipitation) using an ABI PRISM 3100 genetic analyzer (Perkin-Elmer) and Sequence Navigator™ software. Only nucleotide changes that were verified as occurring on both strands were accepted.
The variants were identified and numbered using the prototype sequence (HPV16, NCBI Reference Sequence: NC_001526.2) belonging to the European lineage. Sequence alignments and multiple alignments of E6-E7 were performed using the BioEdit Sequence Alignment Editor, and the phylogenetic tree was constructed using the MAFFT software application, v6.846.

Quantitative real time-PCR (qPCR)
PCR amplifications were performed on an Applied Biosystems StepOne Plus Real-Time PCR system, using TaqMan Fast Universal Master Mix (Life Technologies). The amplification conditions were 95 °C for 20 s, followed by 40 cycles of 95 °C for 1 s and 60 °C for 20 s. The primers and probes were described previously [22], and real-time PCR was performed for simultaneous amplification of two HPV16 genes: E2 and E6 (Table 2). Each sample was normalized using β-globin in separate reactions; the β-globin gene was used to correct HPV copy numbers for the number of cells in the cervical swab material. The final primer and probe concentrations, in a total volume of 10 µL, were 0.9 µM and 0.25 µM, respectively. A volume of 2.5 µL of target DNA from cervical smears was added to the reaction mixture.
Three standard curves were obtained by amplification of serial dilutions, from 70 million to 700 copies, of the HPV16 E2 and E6 plasmids (Taq-amplified E2 or E6 PCR product cloned into the pCR™2.1 vector) and the β-globin plasmid (Taq-amplified β-globin PCR product cloned into the pCR™2.1 vector). There was a linear relationship between the threshold cycle values and the log of the copy number over the entire range of dilutions. HPV16 E6 and E2 viral loads were expressed as the number of HPV16 copies per cell equivalent. The integration of HPV16 into the human genome was detected by subtracting the copy number of E2 (episomal DNA) from the total copy number of E6 (episomal and integrated DNA combined).
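The quantification step can be sketched as follows: a straight line is fitted to threshold cycle (Ct) against log10 copy number, and unknown samples are read off that line. This is only an illustration under stated assumptions. The Ct values are synthetic (generated from an idealized line, not measured), the function names are hypothetical, and the conversion of β-globin copies to cell equivalents assumes two β-globin copies per diploid cell, which the text above does not specify.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Read a copy number off the fitted standard curve."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_cell(hpv_copies, beta_globin_copies):
    """HPV copies per cell equivalent.

    Assumption: two beta-globin copies per diploid cell.
    """
    cells = beta_globin_copies / 2.0
    return hpv_copies / cells


# Ten-fold dilution series from 70 million to 700 copies, as in the text.
dilutions = [7e7, 7e6, 7e5, 7e4, 7e3, 7e2]
# Synthetic Ct values from an idealized line (illustration only).
cts = [40.0 - 3.32 * math.log10(c) for c in dilutions]

slope, intercept = fit_standard_curve(dilutions, cts)
e6_copies = copies_from_ct(25.0, slope, intercept)
print(round(slope, 2), round(intercept, 2))
print(copies_per_cell(e6_copies, beta_globin_copies=2000.0))
```

The same fit would be repeated for each of the three targets (E2, E6, β-globin), so that every target is quantified against its own curve.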
Different physical states (episomal, integrated, or both) were identified by the percentage of integrated HPV DNA, expressed as the ratio of the integrated viral load to the total copy number of E6. SiHa cells, which represent cells containing fully integrated HPV, were included in each run to determine the detection limits for integrated HPV16. All experiments were performed in duplicate, with

Statistical analysis
The difference between HPV16 sublineage groups in the number of samples without any integrated HPV genome was tested using the chi-square test. Quantitative viral data from populations infected with different HPV16 sublineages were compared using Student's t-test. P values < 0.05 were considered significant.
Statistical analyses were conducted using the Easy Med Stat © software package.
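For readers who wish to reproduce this kind of comparison, a minimal sketch of a chi-square test on a 2 × 2 table is given below. It is not the Easy Med Stat implementation. The function name is hypothetical, the counts are placeholders rather than study data, and the absence of a continuity correction is an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity
    correction) for the 2 x 2 table [[a, b], [c, d]].

    Returns (chi2, p_value).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, chi2 is the square of a standard normal
    # variable, so the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Placeholder counts (NOT study data):
# rows = isolate group, columns = (no integrated genome, some integration).
chi2, p = chi_square_2x2(10, 20, 20, 10)
print(round(chi2, 3), p < 0.05)
```

In this layout, each row is an isolate group and each column is a physical-status category, so the test asks whether the proportion of fully episomal samples differs between groups.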

Funding Information
This work was supported by institutional grants from CHU and Conseil Général 06.