While active in laboratory research in the Biochemistry Department, I never actually worked in the field of protein structure. However, after training in information technology I was encouraged by my colleague James Milner-White, a structural biologist, to collaborate with him in this area. A major interest of James is small hydrogen-bonded three-dimensional protein motifs, and he was keen to have a database which included those that he had identified. We decided to achieve this by first building a relational database which would model the structural features of proteins, and then populating it with motifs selected by SQL queries. This approach has resulted in a database that also allows one to examine other aspects of protein structure, and we have employed it particularly in studies of the ends of α-helices.
‘Motivated Proteins’ is a Java web application which allows users access to these small three-dimensional motifs from the database. It has a range of features which are described in more detail on the Motivated Proteins website and in a publication (BMC Bioinformatics (2009) 10:60). There are two aspects of it that I find particularly satisfying. One is the way in which the user can view retrieved motifs in the context of the three-dimensional structure of the protein. The other is the way in which the user is helped to retrieve motifs in the first place: a battery of techniques is employed to reduce the likelihood that a query will draw a complete blank.
‘Structure Motivator’ is a Java desktop application which also uses the motif database, this time in an embedded form. Its function, however, is to allow comparisons and selections to be made from all the instances of any specific motif type, which is done through an interactive display of dihedral angles. Although the motifs are held in memory in a different manner from their representation in the database, the user effectively makes rapid and repeated complex SQL queries on the database (without, of course, being aware of doing so). The application has quite a rich feature set, which is described on the Structure Motivator web page, and includes the ability to use external files representing any aspect of protein structure that the user may wish to consider as a ‘motif’. We have used this to good effect ourselves in studies of helix structure (below).
It would seem inherently unlikely that James and I could discover anything new about α-helices, given that they were not an aspect of protein structure with which James had been particularly concerned, and that my own knowledge of them hardly extended beyond the level of undergraduate textbooks. However, as often happens in science, new insights into a problem arise from approaching it with new or different methodologies. In our case querying the database was the first aspect of our ‘different’ approach, made possible because the database had also been populated with secondary structure elements. The second aspect was the availability of an early version of ‘Structure Motivator’. We selected helix components by direct SQL queries of the database, and then used the software to examine and manipulate them further.
What, then, did we discover? In brief, we showed that the marked differences found in the conformations of the N-terminal and C-terminal ends of α-helices emerge naturally from the structures adopted by their constituent overlapping individual α-turns (Proteins (2011) 79:1010–1019). Furthermore, if one assumes that helices are initiated from a single α-turn and grow by successive addition of further α-turns, our results suggest that extension in the C to N direction would be energetically preferred to extension in the N to C direction, even though the opposite has generally been assumed to be the case.