Role of HIV RNA Structure in Recombination and Speciation:
Romping in Purine A, Keeps HTLV Away
Donald R. Forsdyke
ABSTRACT
Keywords: Base composition; Purine-loading; Reproductive isolation; Speciation;
Stem-loops
1. Recombination, Rejuvenation and Conservation of RNA Structure
At its most fundamental level, sex is recombination
(Forsdyke, 2007; Michod et al., 2008). The 'romp in the hay' that can precede
the meeting of paternal and maternal human gametes is an elaborate, albeit
necessary, preliminary to the final meiotic meeting of parental genomes in the
gonads of their offspring, where recombination occurs. The early microscopists
who witnessed this final meeting described it as a "conjugation of the
chromosomes" that was necessary for a rejuvenating "interchange
of substances" (Montgomery, 1901):
"The conjugation of the chromosomes in the synapsis stage may be considered the final step in the process of conjugation of the [parental] germ cells. It is a process that effects the rejuvenation of the chromosomes; such rejuvenation could not be produced unless chromosomes of different parentage joined together, and there would be no apparent reason for chromosomes of like parentage to unite."
In contrast to humans, retroviral sex is quite
elementary (Temin, 1991). Yet, like humans, retroviruses are diploid. This
diploidy in HIV-1 can be heterozygous due to the viral strategy of mutation to
near oblivion, so countering host defences. The degree of heterozygosity in two
HIV genomes that are copackaged within an infectious retrovirus particle, if
below the sequence difference threshold above which recombination is inhibited
(see Section 7), will allow viral rescue by recombination. Thus, within the next
host cell, the two partially crippled genomes repeatedly recombine to generate a
rejuvenated form that will successfully colonize the vulnerable population of
host cells (usually T4 lymphocytes; Smyth et al., 2012).
Apart from this, HIV genomes literally 'romp' in the purine A. The highly stable A-richness of certain lentivirus genomes (Kypr and Mrazek, 1987) was recently reviewed by Kuyl and Berkhout (2012). Noting the growing evidence that "RNA genome structure, and not the encoded proteins, is the most decisive factor that triggers HIV conservation" (Forsdyke, 1995; Watts et al., 2009; Simon-Loriere et al., 2010; Snoeck et al., 2011; Sanjuan and Borderia, 2011), they suggested that A-richness would facilitate the reverse transcription stage of the viral life cycle by ensuring that such RNA structures are of low stability (since A-U bonds are relatively weak). Indeed, as might be expected, lowering the A-content of codons without changing the encoded amino acids can impede cDNA synthesis (Keating et al., 2009). As for RNA structure itself, it was held that this would confer a selective advantage by providing "packaging signals." Thus, the viral nucleocapsid packaging protein could be considered a "sensor of nucleotide composition." But, paradoxically, "genes -- from multiple sources (e.g. bacterial, viral, and human) are tolerated by -- HIV-based vectors, suggesting that nucleotide composition of the insert -- is not critical for packaging" (Kuyl and Berkhout, 2012).
There have long been other views on these issues. Viral structure has been related to viral recombination and speciation (Forsdyke, 1995, 1996; Forsdyke, 2001). Viral A-richness has been related to the phenomenon of purine-loading (Forsdyke and Mortimer, 2000). These are the subjects of this paper. In keeping with current conventions (Forsdyke, 2011a), "high A%" implies a high frequency of the base A. "High AT%" implies a high frequency of the combination of bases A and T, without specifying which of the two bases makes the major contribution. The same considerations apply to high values for GC% and AG% ("purine loading").
2. Purine-loading
The A-bias in HIV-1 is a characteristic of coding regions, but not of the non-coding long terminal repeats (LTRs). This parallels the selective "purine-loading" (AG% enrichment) of the encoding (mRNA synonymous) strands of the exons of genes of many biological species, including man. Since purines do not pair with purines, this would militate against self RNA-self RNA interactions, and a purine-loaded virus would be less likely to form base-paired RNA duplexes with host RNAs. Preventing dsRNA formation should be advantageous to the virus, since segments of double-stranded RNA (dsRNA) can provide alarm signals to the host, so alerting immunological defences (Cristillo et al., 2001). Base pairing being largely entropy-driven, purine-loading should be high in thermophiles, as is indeed found (Lao and Forsdyke, 2000; Lambros et al., 2003). Purine-loading also provides a rationale for interruption of proteins by low complexity segments corresponding to purine-rich codons (Xue and Forsdyke, 2003; Tian et al., 2011). This implies function at the nucleic acid level. Thus, Muralidharan et al. (2011), from studies of a P. falciparum protein with an asparagine repeat sequence (corresponding to purine-rich codons AAY), concluded that "the asparagine repeat is dispensible for protein expression, stability and function." A similar observation has been made for a purine-rich repeat within the mRNA encoding the EBNA-1 protein, which normally maintains the Epstein-Barr virus (EBV) in a deeply latent state with its many pyrimidine-rich mRNAs unexpressed. In this way EBV should evade host immune defences (Cristillo et al., 2001). Irrespective of whether avoidance of dsRNA is the correct adaptive explanation, purine loading is a fact. A-loading by HIV-1 could be an extreme expression of this (Cristillo et al., 2001).
Purine-loading can influence amino acid composition and, since the extra purines are generally located at third codon positions (the main site for synonymous mutations), it is likely that purine-loading has driven amino acid composition, rather than the converse (Mortimer and Forsdyke, 2003). The purines would be expected to occupy, and are indeed found in, the loop regions of the highly structured HIV genome (Forsdyke and Bell, 2004; Watts et al., 2009). Indeed, Hemert et al. (2013) confirm for HIV-1 RNA that the percentage of A nucleotides is particularly low in double-stranded structures (21%) compared to the single stranded parts (79%), and suggest that "the virus may have adopted a particular genome architecture to adapt to cellular defense mechanisms." Furthermore, Zanini and Neher (2013) note that most synonymous (non-amino acid changing) mutations disrupt base pairing in RNA stems, so "hinting at a direct fitness effect of these stem-loop structures."
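The composition measures used in this paper (A%, GC%, AT%, AG%) and the third codon positions where purine-loading is said to concentrate can be illustrated with a minimal sketch. The function names and the toy sequence are hypothetical and serve only as an illustration; they are not taken from any of the cited studies.

```python
from collections import Counter

def base_percentages(seq):
    """Return A%, C%, G%, T% and the combined GC%, AT% and AG% of a sequence."""
    seq = seq.upper().replace("U", "T")          # treat RNA and DNA alike
    counts = Counter(b for b in seq if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T/U")
    pct = {b: 100.0 * counts[b] / total for b in "ACGT"}
    pct["GC"] = pct["G"] + pct["C"]
    pct["AT"] = pct["A"] + pct["T"]
    pct["AG"] = pct["A"] + pct["G"]              # purine content
    return pct

def third_position_bases(coding_seq):
    """Return the third base of every complete codon of an in-frame sequence."""
    return coding_seq[2::3]

# Hypothetical A-rich toy sequence (three codons), for illustration only.
toy = "AAAGAAACA"
composition = base_percentages(toy)
third_composition = base_percentages(third_position_bases(toy))
```

Here an AG% above 50 in the mRNA-synonymous strand would correspond to what the text calls purine-loading; comparing `composition` with `third_composition` shows how much of the bias sits at third codon positions.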
Values for AG% and GC% are reciprocally related, and
purine-loading is achieved mainly by replacing C with A, leaving T and G
relatively constant (so avoiding runs of Ts and Gs that can be detrimental).
Thus, there can be conflict between 'AG pressure' and 'GC-pressure' (Mortimer
and Forsdyke, 2003; Forsdyke, 2011a). To fully appreciate why HIV-1 is A-loaded,
not G-loaded, we must understand the role of GC%.
3. Two Species in One Cell Differ Greatly in GC%
Consider three examples of pairs of viral species, where members of each pair might occupy the same host cell. (1) An early report
on species differences in GC% (Wyatt, 1952) noted that, while there is no clear
relationship between the GC% values of virus and host, two insect viruses
(polyhedral and capsule) with a common larval host (spruce budworm) differ
dramatically in their GC% values (38% versus 51%). In this example it is
uncertain whether these viruses could in fact have shared a common cell within
their larval host. (2) However, Schachtel et al. (1991) noted for humans that two neurotropic alphaherpesviruses also differ dramatically in GC% (46% versus 68%). The differences are dispersed, being present in both non-coding and coding regions and, in the latter case, mainly affect the third bases of codons (consistent with the close similarity in amino acid sequences of the viral proteins). But, as in the case of purine-loading (Section 2), the differences are sometimes sufficient to change amino acids (e.g. lysine to arginine). It was proposed that the differences are adaptive in that there would be less competition for host resources:
"More specifically, it is proposed that HSV1 and VZV avoid competition for host resources such as nucleic acid precursors and aminoacyl-tRNAs through divergent base composition and codon usage. This may enable these two closely related viruses to coexist in the same host species and even to multiply simultaneously in the same cells."
(3) A similar explanation was advanced by Bronson and
Anderson (1994) who noted that HIV-1 is extremely AU-rich, and HTLV is extremely
GC-rich. They supposed that the intracellular environment would consist of a
variety of metabolic "ecological niches." Two viruses that coinfected the same
host cell would avoid competition with each other by making different metabolic
demands on NTP pools. There is, however, another interpretation.
4. Coinfectants either Blend or Speciate
The wide GC% differences between coinfecting viruses
can be seen as part of the speciation process (Forsdyke, 1995, 1996, 2001). Two
species of virus derived from a common ancestral species, which coexist
synchronously within the same host cell (sympatry), have the opportunity to
recombine ('blending inheritance'). But their reproductive isolation would then
be lost and they would mutually destroy each other as independent species. To
the extent that retaining their differentiation as independent species is
advantageous, there would have been a strong
selective pressure on their nucleic
acids to evolve to prevent recombination. If differences in GC% could achieve
this (Section 7), then they would have been mutually driven to extreme GC%
poles. If HIV-1 were the first to A-load, then it would have been driven to the
low GC% pole. HTLV would have been driven to the high GC% pole. Since, to arrive
at high GC% values, organisms preferentially replace A with C (Section 2), HTLV
would have foregone the advantages of purine-loading and adopted an alternative
evolutionary strategy (Cristillo et al., 2001). With its 'romp' in the A, HIV-1
recombinationally drove HTLV-1 away! Like the GC-rich EBV (Section 2), HTLV-1 is deeply
latent and many carriers are symptom-free. Also like EBV, the duplex HTLV-1
provirus avoids transcribing its many pyrimidine-rich RNAs (controlled by the 5'
LTR region), by expressing a protein (HBZ) with a role analogous to that of the
EBNA-1 protein. HBZ mRNA is transcribed from the antisense strand as a
purine-rich RNA species (controlled by the 3' LTR; Cook et al., 2013).
5. FORS-D Analysis and SHAPE illuminate RNA Structure
New technologies have greatly assisted the
understanding of HIV RNA structure. Le et al. (1988) and Le and Maizel (1989) used novel
energy-minimization folding algorithms to show that RNA regions involved in
intra-molecular base pairing tend to be more evolutionarily stable. This stability
of nucleic acid folding into stem-loop conformations is expected to be
influenced by base composition, being low in low GC% organisms such as HIV-1
(since there is less opportunity for pairing between Gs and Cs). However, GC%
values tend to characterize whole genomes or large genome sectors. Base order is
a character that critically affects local structure. A method of dissecting out
the base order-dependent component of the folding energy - "folding of
randomized sequence difference" (FORS-D) analysis - revealed many evolutionarily
conserved structural elements in the HIV RNA genome; these were separated by
less structured variable regions likely to be under positive Darwinian selection
(Forsdyke, 1995; Snoeck et al., 2011; Zanini and Neher, 2013).
These observations were confirmed and extended by
"selective 2'-hydroxyl acylation analysis by primer extension" (SHAPE; Watts et
al., 2009). While certain conserved structures corresponded to established
regulatory elements, others corresponded to the peptide linkers between the
protein domains of polyproteins. The results were interpreted in terms of "another level of genetic code" so that
"higher ordered RNA structure directly
encodes protein structure." Recombination was not mentioned. Others agreed,
stating that "this novel component of the genetic code represents the strongest
determinant of conservation" (Snoeck et al., 2011). Again, there was no
suggestion of a relationship to recombination.
6. Recombination Depends on Structure and Species
If the degree of heterozygosity of two HIV-1 genomes, copackaged within an infectious retrovirus particle, is insufficient to generate sequence differences of a magnitude that inhibits recombination and fosters speciation (Section 7), then, by virtue of similarities in sequence and structure, the two genomes can generate recombinants with properties more advantageous than those of the originating parental genomes (Simon-Loriere et al., 2011). But HIV-1 itself, by virtue of its high mutation rate, explores sequence space in years, to an extent that would take its host millions of years. Thus, as sequence differences increase, a differentiation of HIV-1 into "quasispecies" (up to approx. 15% intrasubtype variation), "subtypes" (approx. 15% - 30% intersubtype variation) and "groups" (30% intergroup variation), is recognized. How efficiently can recombination have checked this variation? Can differences in recombination between members of these various categories, although mechanistically copy-choice (template switching without strand breakage), guide our understanding of speciation in more complex organisms where recombination is likely to involve DNA strand breakage (Forsdyke, 1995, 1996)?
Recombination in HIV-1 begins with the formation of RNA
dimers by means of complementary loop-loop "kissing" interactions between "dimer
initiation sequences" (DIS) that are part of stem-loop secondary structures in
the Ψ (packaging signal) region of the genome. A similar DNA
"kissing" may be involved in recombination in higher organisms (Forsdyke, 1996;
Danilowicz et al., 2009).
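The requirement for complementary loop-loop pairing can be made concrete with a minimal sketch. It assumes Watson-Crick pairing only (G-U wobble pairs are ignored) and perfect antiparallel complementarity over the whole loop; the function names are hypothetical, and the six-base palindromes used in the example are illustrative inputs rather than sequences quoted in this paper.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the antiparallel Watson-Crick complement of an RNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(rna.upper()))

def can_kiss(loop_a, loop_b):
    """True if two loops are perfectly complementary when aligned antiparallel."""
    return loop_a.upper() == reverse_complement(loop_b)

def is_self_complementary(loop):
    """True if a loop can pair with an identical copy of itself (a palindrome)."""
    return can_kiss(loop, loop)

# Two illustrative palindromic loops: each pairs with its own copy,
# but the two do not pair perfectly with each other.
homodimer_ok = is_self_complementary("GCGCGC") and is_self_complementary("GUGCAC")
heterodimer_ok = can_kiss("GCGCGC", "GUGCAC")
```

On this simplified view, two genomes carrying the same palindromic loop can initiate a homodimer, whereas genomes with different palindromes lack full loop complementarity and would need help from other parts of the genome to dimerize.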
A lack of base complementarity between HIV-1 DIS loops decreases recombination,
but does not eliminate it. This suggests that other parts of the genome can
assist dimer formation. Upon transfer to a new host cell, copy choice
recombination occurs between the colocalized genomes in HIV-1 homodimers or
heterodimers (Onafuwa-Nuga and Telesnitsky, 2009;
Nikolaitchik et al., 2011). Since a species is defined by its recombinational
(reproductive) isolation relative to other species (Forsdyke, 2001), it is
important to note that, albeit rare, recombination can occur between members of
different HIV-1 groups (Takehisa et al., 1999). This defines them as belonging
to the same species. On the other hand, although HIV-1 and HIV-2 can coinfect an
individual, no recombinants have been detected, so they are different species.
HIV-1 intrasubtype recombination is more frequent than intersubtype (intragroup)
recombination, which is more frequent than intergroup recombination (Chin et
al., 2007). Thus, after introduction of synonymous substitutions, it was found
that 5%, 9%, and 18% sequence differences corresponded, respectively, to 35%,
74% and 95% decreases in recombination frequency (Onafuwa-Nuga
and Telesnitsky, 2009). Others, with a different assay, found 9% and 18%
sequence differences corresponded to 67% and 90% decreases in recombination
(Nikolaitchik et al., 2011). These values are consistent with previous FORS-D
studies (Section 7), which postulated that HIV-1 secondary structure is
conserved because, in addition to regulatory roles, it plays a critical role in
recombination (Forsdyke, 1995). Consistent with this, Simon-Loriere et al.
(2011) note: "Strong disparities [in recombination] were observed for comparable degrees of sequence identity in conserved regions, indicating that -- parameters other than the level of sequence identity modulate -- recombination. -- [R]egions of the genomic RNA with a high proportion of residues involved in the formation of secondary structure contained significantly more [recombinational] breakpoints. The extent of RNA structure along the HIV genome seems to provide us with a relatively accurate picture of the pattern of recombinant genomes generated by the mechanism of recombination."
7. Base Order Conserves Structure
As the differences between two subtypes increase, it becomes evident that conservation (low substitution rate) corresponds to regions where bases are ordered to support the formation of higher ordered local structure (i.e. the base order-dependent component of stem-loop potential is maximized). This is shown in Figure 1 where subtype HIVSF2 differs by 455 dispersed substitutions (4.68% difference) from the reference sequence (HIVHXB2). Here structural stability in 200 base moving windows (expressed in negative kilocalories/mol), is plotted with the corresponding substitution frequency for each window. Where the base order-dependent folding stability is high, substitutions are low. Where base order-dependent folding stability is low, substitutions are high. In other words, base conservation associates with stable RNA structure.
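The separation of folding energy into base composition-dependent and base order-dependent components can be sketched as follows. The sketch assumes that the order-dependent component (FORS-D) is the folding energy of the natural sequence minus the mean folding energy of shuffled versions with the same composition. The energy function here is a deliberately crude stand-in (minus the maximum number of nested Watson-Crick pairs), not a thermodynamic model; in practice a real folding program would be supplied through the `energy` argument. The window step, the number of shuffles and all names are assumptions.

```python
import random

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

def toy_energy(seq, min_loop=3):
    """Stand-in 'energy': minus the maximum number of nested Watson-Crick
    pairs (Nussinov-style dynamic programming). Not a thermodynamic model."""
    n = len(seq)
    if n == 0:
        return 0.0
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                     # base j left unpaired
            for k in range(i, j - min_loop):           # base j paired with base k
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return -float(best[0][n - 1])

def fors_d(seq, energy=toy_energy, n_shuffles=10, rng=None):
    """Return (natural energy, mean shuffled energy, difference).
    The mean shuffled energy reflects base composition only; the
    difference reflects base order."""
    rng = rng or random.Random(0)
    natural = energy(seq)
    bases = list(seq)
    shuffled = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)                             # same composition, random order
        shuffled.append(energy("".join(bases)))
    mean_shuffled = sum(shuffled) / n_shuffles
    return natural, mean_shuffled, natural - mean_shuffled

def windowed_fors_d(seq, window=200, step=50, **kwargs):
    """Apply fors_d to successive windows; the step size is an assumed value."""
    return [(start, fors_d(seq[start:start + window], **kwargs))
            for start in range(0, len(seq) - window + 1, step)]
```

With energies expressed as negative values, a negative difference means that the natural base order supports more structure than expected from composition alone, and a positive difference means that base order works against structure.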
This relationship is better seen when the two values within each window are plotted together. The reciprocal relationship does not hold for the base composition-dependent component (Fig. 2b), but does hold for the base order-dependent component (Fig. 2c). Although the points are widely scattered, approximately 10% of the decline in the base order-dependent component of the folding energy is accounted for by increasing substitutions (r² = 0.098). This result, together with the results of studies with other HIV-1 pairs whose sequences differed more or less than in this case, is summarized in Figure 3 (adapted from Table 1 of Forsdyke, 1995).
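The per-window comparison described above can be sketched in two steps: count substitutions in each window of two pre-aligned sequences, then fit a straight line to the paired window values and report the slope and the coefficient of determination. The function names, the step size, and the choice of which quantity goes on which axis are assumptions for illustration; the original analysis may have differed in these details.

```python
def window_substitutions(ref, other, window=200, step=50):
    """Count mismatched positions per window in two aligned, equal-length sequences."""
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to equal length")
    counts = []
    for start in range(0, len(ref) - window + 1, step):
        a = ref[start:start + window]
        b = other[start:start + window]
        counts.append(sum(1 for x, y in zip(a, b) if x != y))
    return counts

def linear_fit(x, y):
    """Ordinary least-squares slope, intercept and r-squared for paired values."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("one of the series has no variance")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

An r-squared near 0.1, as reported in the text, would mean that about a tenth of the variance in one window value is accounted for by the other, which is compatible with widely scattered points and a significant slope when many windows are compared.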
When differences between
HIV-1 genomes are low (e.g. 0.77%; 75 substitutions) a correlation between
stable structure and conservation is difficult to demonstrate (slope value
0.142, which is not significantly different from zero). However, when
differences are intermediate (4.68%; 455 substitutions; see Figs. 1 and 2) the
correlation is significant (slope value -0.34; P < 0.001). Thus, in a sequence
window where bases are conserved (low substitutions), the bases are likely to
contribute positively, by virtue of their ordering, to RNA structure.
Conversely, in a sequence window where base order is variable (high
substitutions) the bases do not contribute to RNA structure (or can sometimes
contribute negatively; i.e. positive FORS-D values). With higher differences between genomes (8%, 12%),
slope values decline and are of marginal significance. This implies that the
additional substitutions are now entering windows corresponding to conserved
regions, where the substitutions can change both base order and composition, so
modify RNA structures that, when substitutions were less, would have supported
intra-species recombination. Thus, there is a difference threshold above which
recombination, with its associated rejuvenating effects (Section 1), begins to
fail.
8. Base Composition and Phylogenetic Analysis
Above the threshold the
potential to follow a new evolutionary path - speciation potential - increases.
In this circumstance base composition plays a more crucial role, a very small
fluctuation in GC% being able to substantially change structure so that 'kissing' interactions fail and speciation can
initiate (Forsdyke, 1998, 2007). Once species are established, in phenotypically
more complex organisms GC% values may then turn to other roles, but in viruses
that have not undergone extensive phenotypic adaptation this seems not to occur,
and 'echoes' of the originating GC% differences may remain (Section 3; Forsdyke,
2001). Such differences have been found useful in the phylogenetic analyses of
retroviruses (Bronson and Anderson, 1994). Likewise, for influenza viruses
Sampath et al. (2007) reported that different "evolving virus species" can be
differentiated on the basis of GC% differences: "Base composition derived
clusters inferred from this [phylogenetic] analysis showed 100% concordance to
previously established clades." Analyses of influenza virus RNA structures show
that, as in the case of HIV-1, conserved RNA structure is in potential conflict
with other functions. For example, Moss et al. (2011) found for influenza virus
that "RNA structural constraints lead to suppression of variation in the third
(wobble) position of amino acid codons." While GC% differences usually do
not suffice for phylogenetic analysis in more complex organisms, palindrome
frequencies can be informative (Lamprea-Burgunder et al., 2011). Higher ordered
nucleic acid structure depends on appropriately placed, complete or incomplete,
palindrome-like inverted repeats of distinctive base order and composition. The
frequency of short palindromes has low intra-species variance, but high
inter-species variance. Mutations sufficient to generate the large variances are
sometimes cryptic, in that there are no obvious phenotype differences (Forsdyke,
2013). Yet, such palindrome frequencies can be used to distinguish species.
9. Gene Definition and Recombination
It is suggested that the
potential to adopt a higher order structure relates to recombination (Forsdyke,
1995; Simon-Loriere et al., 2010, 2011). Conserved structure may influence where
initial strand-switching occurs when there is copy-choice recombination, and
where initial crossovers occur in conventional recombination (Forsdyke, 1996;
Smyth et al., 2012). Recombination is also linked to the seemingly endless
debate on how to define a gene (Forsdyke, 2009). There is an apparent
discrepancy between the gene as defined by biochemists and the gene defined by
G. C. Williams and R. C. Dawkins as a "selfish" element that is able to resist
recombinational disruption. Evidence that final recombinational crossovers are
preferentially located close to gene boundaries brings the two definitions into
close correspondence (Forsdyke, 2011b). Studies of the HIV-1 genome further
support this.
A study of intergroup recombination by Takehisa et al.
(1999) revealed that:
"Breakpoints appeared mostly near
the boundaries of the respective genes. The high frequency of recombination that
occurs only near the beginning or end of the respective genes seems to reflect a
common adaptive strategy for recombination." As noted above (Section 5), SHAPE analysis indicates
that conserved structure corresponds to the interdomain regions of various
polyproteins (Watts et al., 2009). Further application of SHAPE led
Simon-Loriere et al. (2010), to conclude that:
"Junctions between genes are
enriched in structured RNA elements and are also preferred sites for generating
functional recombinant forms. These data suggest that RNA structure-mediated
recombination allows the virus to exchange intact genes rather than arbitrary
subgene fragments, which is likely to increase the overall viability and
replication success of the recombinant HIV progeny."
Smyth et al. (2012) agreed:
"As junctions between genes are
enriched with RNA structure -- one could argue that the HIV genome has evolved
to exchange intact genes as genetic units, rather than as random fragments of
the genome, which should increase the chances of recreating a viable virus."
10. Conclusions
Having evolved the strategies both of extreme
purine-loading and of extreme mutation to evade host defences,
in extremis only HIV-1 characters that
support recovery from mutation (i.e. diploidy, recombination) need to be
conserved. Thus the observation of high conservation of RNA secondary structure
in HIV-1 that is particularly dependent on local base order, indicates a
critical role for structure in recombination. However, when mutational
differences exceed a certain threshold, changes in base composition (GC%) can
impair structure, and hence impair recombination. Many observations can be
explained on this basis, including that failure of recombination (known as
'reproductive isolation') that can lead to speciation (Forsdyke, 2013).
Mechanisms of viral speciation may be of heuristic value for studies of
speciation in more complex organisms.
Acknowledgements
My AIDS studies were supported by the American Foundation for AIDS Research and the Medical Research Council of Canada. Queen's University hosts my evolution education webpages where some of the cited references may be found (http://www.queensu.ca/academia/forsdyke/evolutio.htm).
References
Bronson, E.C., Anderson, J.N., 1994. Nucleotide composition as a driving force in the evolution of retroviruses. J. Mol. Evol. 38, 506-532.
Chin, M.P.S., Chen, J., Nikolaitchik, O.A., Hu, W.-S., 2007. Molecular determinants of
HIV-1 intersubtype recombination potential. Virol. 363, 437-446.
Cook, L.B., Elemans, M., Rowan, A.G., Asquith, B., 2013. HTLV-1: persistence and
pathogenesis. Virol. 435, 131-140.
Cristillo, A.D.,
Mortimer, J.R., Barrette, I.H., Lillicrap, T.P., Forsdyke, D.R., 2001.
Double-stranded RNA as a not-self alarm signal: to evade, most viruses
purine-load their RNAs, but some (HTLV-1, Epstein-Barr) pyrimidine-load.
J. Theor. Biol.
208, 475-491.
Danilowicz, C., Lee, C.H., Kim, K., Hatch, K., Coljee, V.W., Kleckner, N.,
Prentiss, M., 2009.
Single molecule detection of direct, homologous, DNA/DNA pairing.
Proc. Natl. Acad. Sci. USA
106,
19824-19829.
Forsdyke, D. R., 1995. Reciprocal relationship between stem-loop potential and
substitution density in retroviral quasispecies under positive Darwinian
selection.
J. Mol. Evol.
41,
1022-1037.
Forsdyke, D.R., 1996. Different biological species 'broadcast' their DNAs at
different (G + C)% 'wavelengths.' J. Theor. Biol. 178, 405-417.
Forsdyke, D.R., 1998. An alternative way of thinking about stem-loops in DNA. A
case study of the G0S2 gene.
J. Theor. Biol.
192, 489-504.
Forsdyke, D.R., 2001. The Origin of Species Revisited. McGill-Queen's University
Press, Montreal.
Forsdyke, D.R., 2007. Molecular sex: the importance of base composition rather
than homology when nucleic acids hybridize.
J. Theor. Biol.
249, 325-330.
Forsdyke, D.R., 2009. Scherrer and Josts' symposium. The gene concept in 2008.
Theory
Biosci.
128,
157-161.
Forsdyke,
D.R., 2011a. Evolutionary Bioinformatics. 2nd edition, Springer, New
York, pp. 241-247.
Forsdyke, D.R., 2011b.
The selfish gene revisited: reconciliation of Williams-Dawkins and conventional
definitions.
Biol. Theory
5,
246-255.
Forsdyke,
D.R., 2013. Base composition, speciation, and barcoding. Trends Ecol. Evol.
28,
73-74.
Forsdyke, D.R., Bell, S.J., 2004. Purine-loading, stem-loops, and Chargaff's
second parity rule: a discussion of the application of elementary principles to
early chemical observations.
Appl. Bioinf.
3, 3-8.
Forsdyke, D.R., Mortimer, J.R., 2000. Chargaff's legacy.
Gene
261,
127-137.
Hemert, F.J. van, Kuyl, A.C. van der, Berkhout, B. 2013. The A-nucleotide
preference of HIV-1 in the context of its structured RNA genome. RNA Biol.
10,
1-5.
Keating, C.P., Hill, M.K., Hawkes, D.J., Smyth, R.P., Isel, C., Le, S.-Y., et al., 2009.
The A-rich RNA sequences of HIV-1 pol are important for the synthesis of viral
cDNA. Nucleic Acids Res. 37, 945-956.
Kuyl,
A.C. van der, Berkhout, B., 2012. The biased nucleotide composition of the HIV
genome: a constant factor in a highly variable virus. Retrovirol.
9, 92.
Kypr, J., Mrazek, J., 1987. Unusual codon usage in HIV.
Nature 327, 20.
Lambros, R.J., Mortimer, J.R., Forsdyke, D.R. 2003. Optimum growth temperature
and the base composition of open reading frames in prokaryotes.
Extremophiles
7, 443-450.
Lamprea-Burgunder, E., Ludin, P., Maser, P., 2011. Species-specific typing of
DNA based on palindromic frequency patterns. DNA Res. 18, 117-124.
Lao, P.J., Forsdyke, D.R., 2000. Thermophilic bacteria strictly obey Szybalski's
transcription direction rule and politely purine-load RNAs with both adenine and
guanine.
Genome Res.
10, 228-236.
Le, S.-Y., Chen, J.H., Braun, M.J., Gonda, M.A., Maizel, J.V., 1988. Stability of
RNA stem-loop structure and distribution of non-random structure in human
immunodeficiency virus (HIV-1). Nucleic Acids Res. 16, 5153-5168.
Le, S.-Y., Maizel, J.V., 1989. A method for assessing the statistical
significance of RNA folding.
J. Theor. Biol.
138,
495-510.
Michod,
R.E., Bernstein, H., Nedelcu, A.M., 2008. Adaptive value of sex in microbial
pathogens. Inf. Genet. Evol. 8, 267-285.
Montgomery, T.J., 1901.
A study of the chromosomes of the germ
cells of metazoa.
Trans. Am. Phil. Soc.
20,
154-236.
Mortimer,
J.R., Forsdyke, D.R., 2003. Comparison of responses by bacteriophage and
bacteria to pressures on the base composition of open reading frames. Appl.
Bioinf. 2, 47-62.
Muralidharan, V., Oksman, A., Iwamoto, M.,
Wandless, T.J., Goldberg, D.E., 2011. Asparagine repeat function in
Plasmodium falciparum protein assessed via a regulatable fluorescent
affinity tag. Proc. Natl. Acad.
Sci. USA 108, 4411-4416.
Moss, W.N.,
Priore, S.F., Turner, D.H. 2011. Identification of potential conserved RNA
secondary structure throughout influenza A coding regions.
RNA
17, 991-1011.
Nikolaitchik, O.A., Galli, A., Moore, M.D., Pathak, V.K., Hu, W.-S., 2011. Multiple barriers to recombination between
divergent HIV-1 variants revealed by a dual-marker recombination assay. J. Mol.
Biol. 407, 521-531.
Onafuwa-Nuga, A., Telesnitsky, A., 2009.
The remarkable frequency of human immunodeficiency virus type 1 genetic
recombination. Microbiol. Mol. Biol. Rev. 73, 451-481.
Sanjuan, R., Borderia, A.V., 2011. Interplay between RNA structure and protein evolution in
HIV-1. Mol. Biol. Evol. 28, 1333-1338.
Schachtel, G.A., Bucher, P., Mocarski, E.S., Blaisdell, B.E., Karlin, S. 1991.
Evidence for selective evolution on codon usage in conserved amino acid segments
of alphaherpesvirus proteins. J. Mol.
Evol. 33, 483-494.
Sampath, R.,
Russell, K.L.,
Massire, C., Eshoo,
M.W., Harpin, V., Blyn,
L.B., et al., 2007. Global surveillance of emerging influenza virus
genotypes by mass spectrometry.
PLOS One
2(5), e489.
Simon-Loriere, E., Martin, D.P., Weeks, K.M., Negroni, M., 2010. RNA structures
facilitate recombination-mediated gene swapping in HIV-1. J. Virol.
84,
12675-12682.
Simon-Loriere, E., Rossolillo, P., Negroni, M., 2011. RNA structure, genomic
organization and selection of recombinant HIV. RNA Biol. 8, 280-286.
Smyth, R.P., Davenport, M.P., Mak, J., 2012. The origin of genetic diversity in HIV-1. Virus
Res. 169, 415-429.
Starck, S.R., Cardinaud, S., Shastri, N., 2008. Immune surveillance obstructed by viral mRNA. Proc. Natl. Acad. Sci. USA 105, 9135-9136.
Snoeck, J., Fellay, J., Bartha, I., Douek, D.C., Telenti, A., 2011.
Mapping of positive selection sites in the HIV-1 genome in the context of RNA
and protein structural constraints. Retrovirol. 8, 87.
Takehisa, J., Zekeng, L., Ido, E., Yamaguchi-Kabata, Y., Mboudjeka, I., Harada, Y., Miura, T.,
Kaptue, L., Hayami, M., 1999. Human immunodeficiency virus type 1 intergroup
(M/O) recombination in Cameroon. J. Virol. 73, 6810-6820.
Tellam, J.T., Lekieffre, L., Zhong, J., Lynn, D.J., Khanna, R., 2012. Messenger RNA sequence
rather than protein sequence determines the level of self-synthesis and antigen
presentation of the EBV-encoded antigen, EBNA1. PLOS Path. 8, e1003112.
Temin, H.M., 1991. Sex and recombination in retroviruses. Trends Genet. 7, 71-74.
Tian, X., Strassmann, J.E., Queller, D.C., 2011. Genome nucleotide composition shapes variation in simple sequence repeats.
Mol. Biol. Evol. 28, 899-909.
Watts,
J.M., Dang, K.K., Gorelick, R.J., Leonard, C.W., Bess, J.W., Swanstrom, R.,
Burch, C.L., Weeks, K.M., 2009. Architecture and secondary structure of the
entire HIV-1 genome. Nature 460, 711-716.
Wyatt, G.R., 1952. The nucleic acids of some insect viruses.
J. Gen.
Physiol. 36, 201-205.
Xue, H.Y., Forsdyke, D.R., 2003. Low complexity segments in Plasmodium falciparum proteins are
primarily nucleic acid level adaptations.
Mol. Biochem. Parasitol.
128, 21-32.
Zanini, F., Neher, R.A., 2013. Deleterious synonymous mutations hitchhike to
high frequency in HIV env evolution.
arXiv:1303.0805v1 [q-bio.PE].
END NOTE (Oct 2013)
As may be deduced from the introduction, this review was triggered by what was deemed an unsatisfactory review of the base composition of HIV-1 by Kuyl and Berkhout (2012), which seemed to imply that it had dealt with the subject comprehensively. As indicated in the above copy deposited in arXiv, my review was initially submitted to the Journal of Theoretical Biology in 2012. However, it was rejected, after much delay, on the advice of an anonymous reviewer who appeared to have an inordinate admiration for the work of Berkhout (who is not known to me personally). This prompted my arXiv deposition in May 2013. The latter was essentially the same as the 2012 version, except for the citations of a new paper from the Berkhout group and an arXiv review by Zanini and Neher. Subsequently, my review was revised and submitted to Virology, whose editors declined to review it. There then followed a series of negative presubmission enquiries (BioEssays, BMC Biology, eLife, Genetics, PLOS Pathogens). Eventually an editor, David Ojcius, encouraged submission of a full paper to Microbes and Infection, which was duly submitted (9th Aug) and accepted 25th October. A comparison of the final version with the above 2012 original as deposited in arXiv will reveal a different title and numerous references to new works published in 2013. But the message remains the same and, sadly, is presented in what some may consider a less entertaining fashion.
This page was established May 2013 and was last edited on 11 November 2020 by Donald Forsdyke.