Scientists in Germany and the US have used AlphaFold, the artificial intelligence (AI) system developed by Google’s DeepMind, to predict the most topologically complex knot ever found in a protein. Their complete analysis of the data produced by AlphaFold also revealed the first composite knots in proteins: topological structures containing two separate knots on the same string. If the predicted protein knots can be recreated experimentally, that would help to verify the accuracy of AlphaFold’s predictions.
Proteins can fold to form complex topological structures. The most intriguing of these are protein knots – shapes that would not disentangle if the protein were pulled from both ends. Peter Virnau, a theoretical physicist at Johannes Gutenberg University Mainz, tells Physics World that there are currently around 20 to 30 known knotted proteins. These structures, Virnau explains, raise interesting questions around how they fold and why they exist.
A protein’s shape can be closely linked with its function, but while there are a few theories on the functionality and purpose of protein knots, there is little hard evidence to back them up. Virnau says that knots might help to keep proteins stable, for instance by making them particularly resistant to thermal fluctuations, but these are open questions. While protein knots are rare, they also appear to be highly conserved by evolution.
“If a knotted protein exists, for example, in yeast, there is a high likelihood that it is also knotted in the corresponding protein in humans,” Virnau explains. “So, these are structures that have been around for hundreds of millions of years.”
A long-standing problem in protein knot research has been finding and identifying protein knots. While complex protein structures have been determined experimentally in the laboratory, this can be challenging and time-consuming. Recently, DeepMind developed an AI system known as AlphaFold that it claims can predict protein structures with incredible speed and precision. The deep-learning system was trained on a large database of known protein structures and their amino acid sequences. Given a protein’s amino acid sequence, known as its primary structure, it predicts the protein’s three-dimensional structure. Its training is based around evolutionary, physical and geometric constraints of protein structures.
AlphaFold has predicted several hundred thousand protein structures, most of which have not yet been catalogued. In this latest work, published in Protein Science, Virnau and his colleagues searched AlphaFold’s databank for previously unknown complex protein knots. They discovered nine new knots. These included the first 7₁-knot, a knot with seven crossing points that is the most topologically complex knot ever found in a protein.
The researchers also found several six-crossing composite knots. These each contain two trefoil knots, which are knots with three crossings. They also discovered two previously unknown knots with five essential crossings, a 5₁-knot and a 5₂-knot.
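In standard knot-table notation, the leading number of a knot's name is its crossing count and the subscript is an index among knots with that count. The crossing counts quoted above can be checked with a minimal sketch, written here in Python. The function names and the underscore spelling of the subscript (for example `7_1`) are illustrative assumptions, not part of the published study.

```python
# Illustrative sketch only: these helper names are hypothetical and do not
# come from the researchers' analysis.

def crossings(knot: str) -> int:
    """Return the crossing number encoded in a knot name such as '7_1'."""
    return int(knot.split("_")[0])


def composite_crossings(*components: str) -> int:
    """Crossing count of a composite knot, assuming crossing numbers add.

    Additivity is proven for alternating knots, which covers every knot
    named in this article, but it remains a conjecture for knots in general.
    """
    return sum(crossings(k) for k in components)


# Knot types mentioned in the article.
found = {
    "trefoil + trefoil": composite_crossings("3_1", "3_1"),
    "5_1": crossings("5_1"),
    "5_2": crossings("5_2"),
    "7_1": crossings("7_1"),
}

for name, n in found.items():
    print(f"{name}: {n} crossings")
```

Two trefoils give the six crossings of the composite knots, while the seven-crossing knot exceeds every other entry.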
The team is now working with biochemist Todd Yeates, at the University of California, Los Angeles, to produce the proteins identified by AlphaFold in the laboratory and confirm that they form the predicted topological structures. “I’m quite confident that we will be able to confirm these structures experimentally,” says Virnau.
If these topologically challenging structures can be created experimentally it would show that AlphaFold is working as expected and provide confidence in its predictions of less complex protein shapes. “The protein knots might only be a minor aspect of this, but it may nevertheless serve as a validation of these tools in general,” Virnau explains.
In the future, it might be possible to use these AI tools for protein engineering. Proteins could be designed to contain knots and other complex structures that equip them for specific tasks, although this is at least a few years away.