Posts

Showing posts from January, 2025

Boltz-1 H5N1 HA mutations

This paper by Lin et al., from the Dec. 5, 2024 issue of Science, showed how a single mutation, a glutamine-to-leucine change (Q226L) in the H5N1 HA amino acid sequence, could switch HA binding specificity from avian to human receptors. The specificity was enhanced when combined with an asparagine-to-lysine (N224K) change at a nearby position. Since I have been playing around with Boltz-1, I thought it would be interesting to see what Boltz-1 would show about the protein structure with these changes. I could not find the unmodified amino acids at the positions listed because, rather than using sequential site numbering for positions (initial Met would be position 1), the authors use reference site numbering, called H3 numbering. This scheme puts sites into alignment with the ectodomain structure of the H3 subtype of HA. Figure 1 from the Lin et al. paper shows the subsequence "SQVNGQRG" where the target amino acids are found. I located this subsequence in the HA sequence. >>>...
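As a rough illustration of the lookup described in this excerpt, here is a minimal Python sketch that finds the "SQVNGQRG" subsequence in a protein sequence and applies the two substitutions. It is not the code from the post. The offsets inside the motif are an assumption: since N224 and Q226 are two residues apart in H3 numbering, the N at offset 3 and the Q at offset 5 are taken to be the target sites. The function names and the toy sequence are hypothetical, and the toy sequence is not the real HA sequence.

```python
MOTIF = "SQVNGQRG"  # subsequence quoted from Figure 1 of Lin et al.
# Assumed mapping from H3 site number to 0-based offset inside the motif.
H3_OFFSETS = {224: 3, 226: 5}


def locate_sites(seq, motif=MOTIF, offsets=H3_OFFSETS):
    """Return {h3_number: 0-based index in seq} for each site in the motif."""
    start = seq.find(motif)
    if start == -1:
        raise ValueError(f"motif {motif!r} not found")
    return {h3: start + off for h3, off in offsets.items()}


def apply_mutations(seq, mutations, sites):
    """Apply mutations such as 'Q226L', checking the wild-type residue first."""
    residues = list(seq)
    for mut in mutations:
        wild, h3, new = mut[0], int(mut[1:-1]), mut[-1]
        idx = sites[h3]
        if residues[idx] != wild:
            raise ValueError(
                f"expected {wild} at H3 site {h3}, found {residues[idx]}"
            )
        residues[idx] = new
    return "".join(residues)


# Toy sequence for illustration only; NOT the real HA sequence.
toy = "MEKIVLLL" + MOTIF + "MEFFWTIL"
sites = locate_sites(toy)
mutant = apply_mutations(toy, ["Q226L", "N224K"], sites)
print(sites)
print(mutant)
```

Checking the wild-type residue before substituting guards against the numbering mix-up described in this excerpt: if the offsets are wrong for a given sequence, the function raises an error instead of silently mutating the wrong position.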

Boltz-1 Democratizing Biomolecular Modeling

Boltz-1 is pretty cool. Boltz-1 is an open-source deep learning model for predicting biomolecular structures based on their sequences. According to the developers, Boltz-1 achieves AlphaFold3-level accuracy. They have released training and inference code, model weights, datasets, and benchmarks under the open-source MIT license. They're democratizing biomolecular modeling. You can read the introductory paper here and a press article about it here. I just downloaded Boltz-1 two days ago, so this will not be an in-depth look into Boltz-1. Maybe that will come later. Right now, I just wanted to try it out. Downloading and installing Boltz-1 was easy; clone the GitHub repo and you are ready to go. I used the reference H5N1 HA amino acid sequence that I used for this post. I extracted the sequence from the GenBank file for the Influenza A virus (A/cattle/Texas/24-008749-003/2024(H5N1)), accession PP755589. #!/usr/bin/env python # -*- coding: utf-8 -*- """ aa_from_gb.py - extract ...
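The script in the excerpt above is cut off, so here is a separate minimal sketch of one way to pull a protein sequence out of a GenBank flat file using only the Python standard library. It is not aa_from_gb.py. It assumes the record stores the protein in a `/translation="..."` qualifier, as GenBank CDS features normally do. The function names and the tiny example record are hypothetical, and the example is not the PP755589 entry.

```python
import re


def translations_from_genbank(text):
    """Return every /translation value found in GenBank flat-file text."""
    proteins = []
    for match in re.finditer(r'/translation="([^"]+)"', text):
        # Qualifier values wrap across lines; drop whitespace to rejoin them.
        proteins.append(re.sub(r"\s+", "", match.group(1)))
    return proteins


def to_fasta(name, seq, width=60):
    """Format one sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# Tiny made-up record fragment for illustration only.
example = '''     CDS             1..30
                     /product="example protein"
                     /translation="MEKIVLLLAI
                     VSLVKS"
'''
proteins = translations_from_genbank(example)
print(to_fasta("example", proteins[0]))
```

A regular expression is enough for a single-record file like this one. A dedicated GenBank parser would be the safer choice for files with many features or unusual qualifiers.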

H5N1 Part 2

https://en.wikipedia.org/wiki/Hemagglutinin_(influenza)#/media/File:Flu_und_legende_color_c.jpg In a previous post, I looked at the countries posting H5N1 HA sequences to GISAID and the host species that the samples were drawn from. This time I want to look at the variation in the data to identify possible mutations. It has been reported that a glutamine-to-leucine mutation at residue 226 of the HA amino acid sequence shifts host receptor specificity from avian to human. To get a sense of the current mutation landscape as reported to GISAID, I aligned 3,432 sequences from the period January 1, 2024 to December 12, 2024 using MAFFT. The sequences were aligned to a reference sequence, the PP755589 HA sequence from GenBank. See also this GitHub page. time mafft --6merpair --maxambiguous 0.05 --preservecase --thread -1 --addfragments data/gisaid_epiflu_sequence_HA_2024_12_12.fasta data/HA_reference.fasta > data/gisaid_HA_2024_12_12_aln.fa...
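One way to look at variation in an alignment like this is to tally the differences from the reference at each reference position. The sketch below shows the idea in plain Python. It is not the analysis from the post. It assumes the reference is the first record in the aligned FASTA and that gaps are written as "-". The function names and the toy alignment are hypothetical.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def substitutions(records, skip="-"):
    """Count differences from the first record, keyed by (ref_pos, ref, alt)."""
    ref = records[0][1].upper()
    counts = Counter()
    for _, seq in records[1:]:
        pos = 0  # 1-based position in the ungapped reference
        for r, a in zip(ref, seq.upper()):
            if r != "-":
                pos += 1
            # Ignore columns where either side is a skipped symbol (gaps).
            if r != a and r not in skip and a not in skip:
                counts[(pos, r, a)] += 1
    return counts


# Toy alignment for illustration only; the first record plays the reference.
toy = """>ref
ATG-CAA
>s1
ATGACAA
>s2
ATG-CTA
>s3
ATG-CTA
"""
counts = substitutions(read_fasta(toy))
for (pos, ref_base, alt_base), n in sorted(counts.items()):
    print(f"{ref_base}{pos}{alt_base}\t{n}")
```

Positions are counted on the ungapped reference, so insertions in other sequences do not shift the numbering. For nucleotide data, passing `skip="-N"` would also ignore ambiguous bases.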

H5N1

The H5N1 strain of avian influenza (bird flu) is a significant health concern. Although human infections of H5N1 are relatively rare, the effect on bird populations has been devastating in some areas. In addition to birds, H5N1 infections have been found in domestic cattle, cats, goats, and other mammals. So far, human cases appear to come from contact with infected animals. Although there are cases of undetermined origin, effective human-to-human transmission hasn't been established. H5N1 is a subtype of the influenza A virus. It's a negative-sense RNA virus. I wanted to get a picture of H5N1 genomics. In particular, I am interested in what is happening in the hemagglutinin (HA) protein. The viral genome consists of eight RNA segments: PB1, PB2, PA, HA, NP, NA, M, and NS. See H5N1 genetic structure for more details and references. HA is found on the surface of the viral envelope. It is responsible for bind...