Mol 2 Net Interdependence of Influenza HA and NA and Possibilities of New Reassortments

The influenza virion is characterized by two surface proteins, hemagglutinin (HA) and neuraminidase (NA). Changes in their surface antigenic sites have given rise to several subtypes, H1 to H16 for the hemagglutinin and N1 to N9 for the neuraminidase, and each influenza strain is identified by these subtypes, e.g., H5N1, H7N9. Of the 16 x 9 possible combinations, only certain ones, such as H1N1, H3N2 and H5N1, are observed to proliferate in the wild. This interdependence of the HA and NA subtypes has been noticed and experimentally demonstrated, but its underlying cause and systematics have remained unknown. We have hypothesized that the base distribution characteristics of the HA and NA constitute a coupling between them. We estimate the coupling strength by measuring the distance between the end-points of the graph radii of the two genes in a graphical representation scheme. We found that this distance was characteristic of each subtype combination, and that forced combinations with a different HA or NA subtype led to widely different values which, by our hypothesis and the experimental findings of Zhang et al, imply unstable combinations. This hypothesis implies that, given a stable subtype of pathogenic influenza, we can use the coupling constants to estimate which other subtype combinations could emerge through reassortment. Thus, for the H5N2 strain that took epidemic form in North America in 2015, we have calculated the consequences of altering the NA component. We found that only the H5N4, H5N6 and H5N9 combinations could match the coupling strength of the H5N2, implying that the next epidemics could arise from these combinations rather than from other subtypes of H5. This allows for more focused monitoring of emerging flu strains for epidemic potential.


Introduction
Influenza is a widely prevalent seasonal viral infection that afflicts several million people annually, with fatalities that range into the tens of thousands. The influenza virus is of three main types, Influenza A, B and C, of which Influenza A is the most prevalent and affects birds, mammals and humans. The influenza A virus has a segmented negative-strand RNA genome, with eight distinct pieces of nucleic acid coding for 11 proteins. Two of these, hemagglutinin (HA) and neuraminidase (NA), are surface-situated glycoproteins responsible for virus endocytosis and progeny elution. On the basis of their antigenic segments several subtypes of these proteins have been identified, 16 for the HA and 9 for the NA, enabling the viral strains to be characterized as H1N1, H3N2, H7N9, etc.
It has been observed that while 16 x 9 combinations of the HxNy variety of viral strains are possible, only certain combinations seem to predominate in the wild [1], as can be seen from the database population of the various flu strains [2]. The reason why one subtype of HA depends on a specific subtype of NA, and vice versa, is not known. That such an interdependence exists has been verified by the experiments of Zhang et al [3], who swapped different subtypes to show that outside the preferred subtype combinations the strains were not stable or adequately infective; this has also been observed in other subtype combinations [4].
This issue is quite critical. The evolution of different strains of influenza can lead to the emergence of epidemic and pandemic strains that affect very large numbers of host species. Examples include the Spanish Flu of 1918 (subsequently identified as an H1N1 subtype), where the human death toll exceeded 20 million; the Swine Flu (H1N1) of 2009, where several million people were infected and over 25,000 died within one year; the H5N1 bird flu that apparently surfaced around 1997 and the H7N9 flu of 2013, whose containment necessitated the culling of millions of chickens; and the recent H5N2 avian flu epidemic in North America (2015), which has led to the culling of millions of poultry and farm birds. Such strains arise through genetic shift and drift, of which reassortments among the various genes, especially the glycoproteins, are among the most important processes; these can occur when two flu strains infect a single cell [5,6]. Such possibilities require incessant monitoring of evolving subtypes and combinations worldwide, a rather daunting but inescapable task. Taking the interdependence of HA and NA into account can help to reduce this task to more manageable proportions, since monitoring one of them, say HA, automatically accounts for the associated NA that can retain the infectivity. Quantifying the interdependence of HA and NA can therefore be instrumental in this enterprise.
Our study of this HA-NA interdependence [2] led us to hypothesize a coupling between the two glycoproteins arising out of their base distribution and composition characteristics, which are best visualized in a graphical representation. Our research showed that the major influenza subtype combinations had very distinct coupling strengths and that interchange between different subtype components compromised such strengths, an outcome in keeping with Zhang et al's experimental results [3]. As a consequence of this model, we considered the current H5N2 avian epidemic in the USA and could forecast two possible reassortment products that could conceivably fuel new epidemics [7]. This report very briefly summarizes our methodology and these results and observations.
http://sciforum.net/conference/mol2net-1

Results and Discussion
Graphical representation of the HA and NA sequences of H1N1, as described in the Methods section and shown in Fig. 1, depicts the base distributions of the two genes. Our hypothesis of a coupling between the two genes is predicated upon the assumption that the base distributions are mutually dependent; indeed, Hu [8] has shown that mutational changes in one lead to a coupled mutational change in the other. We quantify this interdependence by the distance between the end-points of the graph radii of the two plots, as defined in the Methods section.
The point is that if the coupling between the HA and the NA were characteristic of the specific subtype, implying co-ordinated change of some kind, then irrespective of the genetic shifts in the two gene sequences, the coupling factor as measured by the distance between the graph radii should remain constant within a reasonable limit. As shown in Table 1, taking a large sample of the H1N1 strains we found that the coupling factor worked out to 39.49±8.19. Similar analyses for other strains of recent interest showed similar trends. Table 1 lists the results of our analysis with all viral strains in our database, showing the coupling factors for each individual viral strain and also an average for each flu subtype where an adequate number of strains was available. We notice that this average is different for each subtype, with a reasonably low standard deviation, implying that each flu subtype has a specific coupling strength, which we may refer to as its characteristic value.
For the coupling to be characteristic of each flu subtype, replacing one gene with another variety should produce a quite different result for the coupling factor. Indeed, when we replaced the H1 sequence of A/South Dakota/01/2011(H1N1) with an H5 sequence from A/duck/France/05066b/2005(H5N1), the coupling factor changed from 33.97 to 12.03; exchanging only the NA between the two strains changed the coupling factor to 69.72, implying a gross lack of compatibility between the HAs and NAs of the two strains. We found this effect in a wide variety of the samples tested, as shown in Table 2. We note that the HA exchange produced less dramatic or insignificant effects than the NA exchange, since the HA sequences are more homologous across all HA subtypes than the NA sequences are across the NA subtypes. Taking typical examples of each subtype of HA and NA, we find that, in terms of the composition of the four bases a, c, g, t, the standard deviation from the average composition values is <3% for a and t (a: 2.6%, t: 2.8%) and <4% for g and c (g: 3.7%, c: 3.8%) for the HA, whereas these figures are >3% for a and t (a: 3.7%, t: 9.2%) and >6% for g and c (g: 6.7%, c: 6.2%) for the NA. The wide variation of the NA subtypes and the comparatively smaller variation of the HA subtypes are also evident in the graphical representations shown in Ref 2.
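The composition comparison above can be illustrated with a minimal sketch. It assumes the quoted figures are standard deviations, in percentage points, of each base's share across one representative sequence per subtype; the text does not say whether a population or sample standard deviation was used, so the population form is an assumption here. The function names and the toy sequences are hypothetical and are not data from this study.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

BASES = "acgt"

def base_composition(sequence: str) -> Dict[str, float]:
    """Percentage of a, c, g, t in a sequence; other symbols are ignored."""
    seq = sequence.lower().replace("u", "t")  # assumption: RNA u counted as t
    counts = {b: seq.count(b) for b in BASES}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no a, c, g or t")
    return {b: 100.0 * counts[b] / total for b in BASES}

def composition_spread(sequences: List[str]) -> Dict[str, Tuple[float, float]]:
    """Per-base (mean, population standard deviation) across sequences."""
    comps = [base_composition(s) for s in sequences]
    return {
        b: (mean([c[b] for c in comps]), pstdev([c[b] for c in comps]))
        for b in BASES
    }

# Toy sequences only, to show the call pattern.
for base, (avg, sd) in composition_spread(["aatt", "aaat"]).items():
    print(f"{base}: mean {avg:.1f}%, sd {sd:.1f}")
```

Run over one representative sequence per HA subtype and then per NA subtype, a calculation of this kind would give the per-base spreads being compared in the paragraph above.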
These results of our analyses, summarized in Tables 1 and 2, show that forced exchanges between the HA and NA of the flu strains often lead to coupling factors widely different from the characteristic values. Our observations tie in neatly with the experimental results of Zhang et al [3], who carried out HA and NA exchanges within a set comprising a 1918 pandemic H1N1, a 2009 pandemic H1N1 and an HPAI H5N1, and found that the NA exchanges led to a significant decline in influenza infectivity whereas the effect was comparatively much smaller when the HAs were exchanged. They attributed this to a lack of "matching patterns" between the NA of the H5N1 and the HAs of the H1N1s in the experiment. Combining the earlier observation that only a few subtypes are found in the wild, which may imply low stability of the other subtypes, with the wide variation in the computed coupling values and Zhang et al's results, we may infer that such "forced" subtypes will not yield stable or efficiently infective strains.
This leads us to an interesting prognosis. This year has witnessed a sudden epidemic of highly pathogenic avian influenza (HPAI) H5N2 infection among poultry and farm-raised birds such as chickens and turkeys in the North American west and mid-west, leading to the death of millions of birds through infection or culling [9]. While the virus has not affected humans yet, strict monitoring is being done to ensure adequate warning in case the virus develops human-to-human transmission ability [10]. At the same time a watch has to be kept on the possibility that the virus could undergo reassortments and give rise to new subtypes, though one does not know which of the possible subtypes could also be highly pathogenic.
Our analysis provides a guideline here. Once we know that the H5N2 is an HPAI virus, we can forecast which of the possible reassortants have the potential to be stable and possess pathogenic ability [7]; it is pertinent to note here that, according to the USDA, the current H5N2 subtype is itself a combination of Asian HPAI H5 and North American N2 [11,12]. We accessed all North American H5N2 gene sequences available at the time, i.e., around mid-May 2015, and determined that the magnitude of the coupling for these strains, as measured through the delta-gR, was 38.58±1.46. In assessing possible reassortments from these strains we are mindful of the fact that Asian H5 is a highly pathogenic virus that, in its H5N1 bird flu form, caused a high level of human fatalities at a mortality rate of 1 per 2 infections [13], and that there is a continuing fear that the virus may mutate to a form capable of human-to-human transmission, leading to a new pandemic [14]. Taking such an HA as one component of the possible reassortants, we tried combinations with all subtypes of the NA available from typical flu sequences (Table 3). Taking a cue from the results given in Table 1, where the standard deviation of the coupling values among the strains of a flu subtype is 18.85% on average (range: 7.65%-29.12%), we can look for those HA-NA combinations that lie within this range. The results shown in Table 3 indicate that only combinations of the H5 with a possible N4, N6 or N9 fall within this coupling value range and therefore could be the new HPAI to evolve from reassortments of the H5N2 with other flu subtypes. Our research showed that flu subtypes with these varieties of the neuraminidase have been reported in various places in North America, indicating that it is possible for reassortments of the HPAI H5N2 with these subtypes to take place. While monitoring for genetic shifts and drifts of the H5N2 in North America, close attention may therefore be given to the development of H5N4, H5N6 and H5N9, if any, of which H5N9 may bear extra scrutiny since a hitherto benign-to-humans H7N9 strain in China suddenly developed a mutation in 2013 that led to human fatalities. Such focused scrutiny might reduce the monitoring overhead to some extent by concentrating on the more potent possibilities. The same exercise can be done with other HA subtypes but, as we have seen, the flu subtypes are more sensitive to changes in the NA.

Methods
The 2D graphical representation method used here [16] is a simple device to visualize the base distribution of any DNA/RNA sequence. On a 2-dimensional Cartesian co-ordinate system, each of the four cardinal directions is identified with one of the four bases; our preferred assignment is adenine with the negative x-direction, cytosine with the positive y-direction, guanine with the positive x-direction and thymine with the negative y-direction. The query sequence is plotted starting from the origin and moving one step in the direction indicated for each base in turn. This traces out a curve that reflects the distribution of bases along the sequence. Fig. 1 is an example of two gene sequences, of the HA and NA, plotted on the same graph.
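The plotting rule just described can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' own code: the names `STEPS` and `walk` are hypothetical, and the handling of uracil and of ambiguity codes is an assumption not specified in the text.

```python
from typing import List, Tuple

# Direction assigned to each base, as described in the text:
# adenine -> -x, cytosine -> +y, guanine -> +x, thymine -> -y.
STEPS = {"a": (-1, 0), "c": (0, 1), "g": (1, 0), "t": (0, -1)}

def walk(sequence: str) -> List[Tuple[int, int]]:
    """Return the (x, y) point reached after each base, starting from the origin."""
    x, y = 0, 0
    points = []
    for base in sequence.lower():
        if base == "u":
            base = "t"  # assumption: RNA uracil is plotted as thymine
        if base not in STEPS:
            continue  # assumption: ambiguity codes such as 'n' are skipped
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        points.append((x, y))
    return points

# Made-up six-base fragment, not a real HA or NA sequence.
print(walk("ATGGCA"))
# -> [(-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (0, 0)]
```

Passing the returned points to any plotting library and joining them in order would give a curve of the kind shown in Fig. 1.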
Quantitative assessments of the different sequences, e.g., descriptors of the two sequences in Fig. 1, can be made as a first approximation by defining the weighted centre of mass (µx, µy) of a sequence as

$$\mu_x = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \mu_y = \frac{1}{N}\sum_{i=1}^{N} y_i,$$

where (xi, yi) represent the co-ordinates of the ith base and N is the total number of bases. We define the distance from the origin to the centre of mass as the graph radius gR:

$$g_R = \sqrt{\mu_x^2 + \mu_y^2}.$$

Then the difference between two sequences can be represented by the distance between the end points of the graph radii of the two sequences as

$$\Delta g_R = \sqrt{(\mu_x - \mu_x')^2 + (\mu_y - \mu_y')^2},$$

where (µx, µy) and (µx', µy') are the centres of mass of the two sequences. We use ΔgR (delta-gR) as an indicator of the coupling between the two sequences, as explained earlier.
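A minimal sketch of these descriptors follows. It assumes that the centre of mass is the arithmetic mean of the plotted co-ordinates and that the coupling measure is the Euclidean distance between the two centres of mass; both readings follow the wording above but should be treated as assumptions. The function names are hypothetical.

```python
import math
from typing import Tuple

# Same base-to-direction assignment as in the graphical representation.
STEPS = {"a": (-1, 0), "c": (0, 1), "g": (1, 0), "t": (0, -1)}

def centre_of_mass(sequence: str) -> Tuple[float, float]:
    """Mean (x, y) of the points visited by the 2D walk of the sequence."""
    x = y = 0
    sum_x = sum_y = 0
    n = 0
    for base in sequence.lower().replace("u", "t"):  # assumption: u treated as t
        if base not in STEPS:
            continue  # assumption: ambiguity codes are skipped
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        sum_x += x
        sum_y += y
        n += 1
    if n == 0:
        raise ValueError("sequence contains no a, c, g or t")
    return sum_x / n, sum_y / n

def graph_radius(sequence: str) -> float:
    """Distance from the origin to the centre of mass (gR)."""
    mu_x, mu_y = centre_of_mass(sequence)
    return math.hypot(mu_x, mu_y)

def delta_gr(seq_a: str, seq_b: str) -> float:
    """Distance between the end points of the two graph radii (delta-gR)."""
    ax, ay = centre_of_mass(seq_a)
    bx, by = centre_of_mass(seq_b)
    return math.hypot(ax - bx, ay - by)

# Toy inputs: "GG" visits (1,0),(2,0); "CC" visits (0,1),(0,2).
print(graph_radius("GG"))    # -> 1.5
print(delta_gr("GG", "CC"))
```

With real data, `seq_a` and `seq_b` would be the HA and NA gene sequences of one strain, and the value returned by `delta_gr` would play the role of the coupling factor.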
More discussion of the properties of gR and ΔgR can be found in the earlier papers and related documents.
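Returning to the screening step described in the Results and Discussion, candidate HA-NA combinations can be filtered once their delta-gR values are computed. The sketch below assumes that "within this range" means within a relative tolerance of the reference coupling value; that reading, the function names and the toy candidate values are all assumptions for illustration. Only the reference value 38.58 and the 18.85% figure come from the text.

```python
from typing import Dict, List

def within_tolerance(candidate: float, reference: float, tolerance_pct: float) -> bool:
    """True if candidate lies within +/- tolerance_pct percent of reference."""
    return abs(candidate - reference) <= reference * tolerance_pct / 100.0

def screen(candidates: Dict[str, float], reference: float, tolerance_pct: float) -> List[str]:
    """Names of candidates whose coupling value falls inside the tolerance window."""
    return [
        name
        for name, value in candidates.items()
        if within_tolerance(value, reference, tolerance_pct)
    ]

REFERENCE = 38.58      # delta-gR reported in the text for North American H5N2
TOLERANCE_PCT = 18.85  # relative standard deviation quoted from Table 1

# Hypothetical delta-gR values for placeholder candidates, NOT taken from Table 3.
toy_candidates = {"candidate_A": 41.0, "candidate_B": 12.0, "candidate_C": 69.0}
print(screen(toy_candidates, REFERENCE, TOLERANCE_PCT))  # -> ['candidate_A']
```

A stricter or looser window could be tried by substituting the lower or upper end of the quoted 7.65%-29.12% range for `TOLERANCE_PCT`.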

Conclusions
In this brief report we have discussed the observed interdependence of hemagglutinin and neuraminidase in influenza A subtypes, which appears to restrict the proliferation of influenza subtypes to a few combinations although theoretically a much larger number should be possible. We believe the origins of this phenomenon must lie in the base distribution and composition of the related sequences; the observations of Hu [8] on HA and NA mutations show that mutations in one sequence appear to regulate mutational changes in the other. To quantify this phenomenon we have hypothesized a coupling of the two sequences, with a coupling factor that is characteristic of the related subtypes. Our investigations into real sequences of several different subtypes using graphical representation techniques yielded specific numbers for each flu subtype within a reasonable tolerance level; forced replacement of one gene with another subtype led to different coupling strengths, and the change was more dramatic in the case of NA exchanges [2].
Since the influenza genome is known to undergo genetic shift to new reassortants quite frequently due to its inherent segmented structure, this interdependence of the HA and NA serves to restrict such reassortants to a reduced subset of possible stable pathogenic varieties. Our methodology, described above and reported previously in Ref 2, allows us to compute such possible subtypes of the influenza. In response to the recent epidemic of H5N2 influenza among North American poultry, we have made a prognosis on this basis of possible new pathogenic reassortants that may arise out of the current epidemic [7]. This has important consequences for the monitoring of influenza strains and mutations, since it allows attention to be focused on the subtypes more likely to be pathogenic. On a larger scale, our approach provides an opportunity worldwide to compute and monitor the evolution of highly pathogenic influenza viruses.

Table 1.
Coupling factors of HA-NA interdependence. The last two columns provide summary data for each major flu type indicated in the first column.

Table 2.
Coupling factors for forced matches between HA and NA for different strains and subtype combinations.

Table 3.
Hypothetical combinations from 2015 North American H5N2: coupling an H5 with different NA subtypes.
Note: ss - short stalk; ls - long stalk; Delta-gR is the coupling factor (Reproduced by permission from Current Comput Aided Drug Design, Ref 7).