An Algorithmic Approach to Determining the Spatial Configuration of a Protein

Grade Level at Time of Presentation

Senior

Major

Computer Science & Mathematics (premed)

Minor

-

Institution

Eastern Kentucky University

KY House District #

81

KY Senate District #

34

Department

Mathematics & Statistics

Abstract

Determining the three-dimensional structure of a protein remains a major challenge. The ability to determine a protein's spatial configuration is highly desirable because it offers great insight into the protein's function. Current mainstream experimental methods include X-ray crystallography and NMR spectroscopy. Using NMR results, the primary objective is to obtain a solution in which the distances between atoms in the computed structure are as close as possible to the experimentally measured distances; this is referred to as the "Molecular Distance Geometry Problem". Once a true structure has been determined, it is uploaded to an online protein database, the Protein Data Bank (PDB).
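One common way to write the Molecular Distance Geometry Problem is sketched below. The notation is an assumption made for illustration and does not come from the presentation itself: $x_i \in \mathbb{R}^3$ is the position of atom $i$, $E$ is the set of atom pairs with a known distance, and $d_{ij}$ is the experimental distance for the pair $\{i, j\}$.

```latex
% Exact form: find positions that reproduce every known distance.
\text{find } x_1, \dots, x_n \in \mathbb{R}^3
\quad \text{such that} \quad
\lVert x_i - x_j \rVert = d_{ij}
\quad \text{for all } \{i, j\} \in E .

% Least-squares form, used when the measured distances are noisy:
\min_{x_1, \dots, x_n \in \mathbb{R}^3}
\sum_{\{i, j\} \in E}
\bigl( \lVert x_i - x_j \rVert - d_{ij} \bigr)^2 .
```

The second form matches the phrase "as close as possible": a structure with objective value zero reproduces every experimental distance exactly.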

Our approach began by collecting statistical data from the PDB and coupling it with NOE (nuclear Overhauser effect) data, which give distances between atoms that are not chemically bonded, in order to define distances within the target protein more accurately. We then implemented a "Branch-and-Prune" artificial intelligence (AI) algorithm that considers various physical and chemical assumptions when constructing a solution. Without this consideration the computational time is exponential; by requiring that these assumptions be validated, however, the algorithm can decide when to stop constructing a potential solution and discard it. The novelty of our approach lies in a tolerance for violations of these assumptions. This tolerance is controlled by assigning weights of importance to the assumptions and "balancing" those weights to produce optimal solutions. Allowing this kind of tolerance let us produce a larger number of solutions, on the premise that a solution containing some degree of violation is still a candidate for being a true solution. The algorithm will be presented, and a previously determined protein structure from the PDB will be used as an example to demonstrate its accuracy and performance.
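The idea of pruning against a weighted violation budget can be illustrated with a minimal Python sketch. It is not the presented algorithm. It assumes a simplified setting in which each atom has exact distances to its three predecessors, so that intersecting three spheres yields at most two candidate positions, and in which every other assumption is a distance restraint whose violation is multiplied by a weight. The names `sphere_intersections`, `branch_and_prune`, `restraints`, `weights`, and `budget` are all hypothetical.

```python
import math

def _sub(a, b): return tuple(x - y for x, y in zip(a, b))
def _add(a, b): return tuple(x + y for x, y in zip(a, b))
def _scale(a, s): return tuple(x * s for x in a)
def _dot(a, b): return sum(x * y for x, y in zip(a, b))
def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sphere_intersections(p1, r1, p2, r2, p3, r3):
    """Points at distance r1, r2, r3 from p1, p2, p3 (zero, one, or two).
    Assumes p1, p2, p3 are distinct and not collinear."""
    d = math.dist(p1, p2)
    ex = _scale(_sub(p2, p1), 1.0 / d)
    v = _sub(p3, p1)
    i = _dot(ex, v)
    ey_raw = _sub(v, _scale(ex, i))
    j = math.sqrt(_dot(ey_raw, ey_raw))
    ey = _scale(ey_raw, 1.0 / j)
    ez = _cross(ex, ey)
    x = (r1 * r1 - r2 * r2 + d * d) / (2 * d)
    y = (r1 * r1 - r3 * r3 + i * i + j * j) / (2 * j) - i * x / j
    z2 = r1 * r1 - x * x - y * y
    if z2 < -1e-9:
        return []                      # the three spheres do not meet
    z = math.sqrt(max(z2, 0.0))
    base = _add(p1, _add(_scale(ex, x), _scale(ey, y)))
    if z < 1e-9:
        return [base]                  # spheres touch in a single point
    return [_add(base, _scale(ez, z)), _sub(base, _scale(ez, z))]

def branch_and_prune(start, n, exact, restraints, weights, budget):
    """Depth-first search over atom positions.

    start      -- coordinates of the first three atoms (assumed known)
    exact      -- {(i, k): distance} for every k >= 3 and i in k-3 .. k-1
    restraints -- (i, k, target, kind) tuples with i < k (hypothetical format)
    weights    -- {kind: weight}; a larger weight tolerates less violation
    budget     -- a branch is discarded once its weighted violation exceeds this
    """
    solutions = []
    coords = list(start)

    def extend(penalty):
        k = len(coords)
        if k == n:
            solutions.append((penalty, list(coords)))
            return
        candidates = sphere_intersections(
            coords[k - 1], exact[(k - 1, k)],
            coords[k - 2], exact[(k - 2, k)],
            coords[k - 3], exact[(k - 3, k)])
        for c in candidates:           # branch: at most two positions
            p = penalty
            for i, j, target, kind in restraints:
                if j == k:             # restraint becomes checkable at atom k
                    p += weights[kind] * abs(math.dist(coords[i], c) - target)
            if p <= budget:            # prune: drop branches over budget
                coords.append(c)
                extend(p)
                coords.pop()

    extend(0.0)
    return solutions
```

With `budget` set to infinity nothing is pruned and the search enumerates up to 2^(n-3) structures, which is the exponential behaviour described above. A small budget discards a partial structure as soon as its accumulated weighted violation is too large, while a moderate budget keeps structures that violate some restraints slightly, so they remain candidates for the true solution.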

This document is currently not available here.

