Andrew Grove
Advisor: Dr. Dan Raftery
CS 497 Presentation
Wednesday, April 23, 2003
Goal: To use computer-based pattern matching techniques to identify and remove known compounds from NMR data so that any remaining compounds can be identified. This could be used to speed up clinical trials.
During this semester my goal was to work with a chemistry graduate student to reduce noise in the NMR data. We looked at three options for this task: examining the local extrema of the data, wavelet transformation, and statistical modeling. NMR (Nuclear Magnetic Resonance) is the process of exposing a small amount of solution (10-20 μL) to a very high-intensity magnetic field. This helps to determine the structure and concentration of the molecules in the solution.
The first method we tried was singling out the local extrema in the graph. I was able to write a MATLAB function that takes the data points and a desired threshold and reduces the data. This does make a nice graph, but it is not a very good statistical model since it neglects many of the points surrounding the local extrema.
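The idea of keeping only local extrema can be illustrated with a minimal Python sketch. This is not the original MATLAB function: the name `local_extrema`, the sample values, and the way the threshold is applied (keep an extremum only if its absolute value is at least the threshold) are all assumptions made for illustration.

```python
def local_extrema(values, threshold):
    """Return (index, value) pairs for interior local extrema.

    Assumption: an extremum is kept only when its absolute value is
    at least `threshold`; the original function may have used a
    different rule.
    """
    kept = []
    for i in range(1, len(values) - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        is_max = cur > prev and cur > nxt
        is_min = cur < prev and cur < nxt
        if (is_max or is_min) and abs(cur) >= threshold:
            kept.append((i, cur))
    return kept


# Hypothetical intensity values, not real NMR data.
spectrum = [0.1, 0.3, 2.5, 0.4, 0.2, -1.8, 0.0, 0.6, 0.1]
peaks = local_extrema(spectrum, 1.0)
print(peaks)
```

Every point that is not an extremum is discarded, which is why this kind of reduction loses the shape of the signal around each peak.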
Then, at the direction of Dr. Raftery, we tried to use wavelet transformations to reduce the data in a mathematical application called R. We spent most of the semester trying to learn how to use R and how to apply wavelets for our purposes. Unfortunately, we were unable to figure out how to use this type of transformation. However, by the time we came to that conclusion, Dr. Raftery had another suggestion.
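For readers unfamiliar with the general idea, wavelet-based noise reduction can be sketched as: transform the signal, zero out small detail coefficients, and transform back. The Python sketch below uses a single-level Haar transform with hard thresholding. It is not what we attempted in R; the function names, the choice of the Haar wavelet, and the threshold value are assumptions, and the input length is assumed to be even.

```python
import math


def haar_forward(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, rebuilding the interleaved samples."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out


def denoise(signal, threshold):
    """Hard-threshold the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    detail = [d if abs(d) > threshold else 0.0 for d in detail]
    return haar_inverse(approx, detail)


# Hypothetical values: small wiggle in the first pair, a sharp peak in the second.
smoothed = denoise([1.0, 1.2, 5.0, 1.0], 0.5)
print(smoothed)
```

In this example the small difference between the first two samples is treated as noise and averaged away, while the large jump around the peak is preserved.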
Now we are working on using a statistical model (e.g. a linear or non-linear model) to reduce the data. We have just started working on this method, so not much has been completed so far. However, we feel there is more promise since there is an abundance of information available on standard statistical models.
After we, or some future team, figure out the best way to reduce the data, we will then remove the spectral lines in the graph that correspond to known compounds. The compounds left will then need to be identified. This process will then be automated in hopes of significantly reducing the time required for clinical testing of drugs by pharmaceutical companies.
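The removal step described above could look roughly like the following Python sketch. It assumes the reduced data is a list of (position, intensity) peaks and that each known compound contributes lines at known positions; the function name `remove_known_lines`, the tolerance, and all numbers are hypothetical and not taken from real spectra.

```python
def remove_known_lines(peaks, known_positions, tolerance=0.02):
    """Drop every peak lying within `tolerance` of a known line position.

    `peaks` is a list of (position, intensity) pairs; what remains
    are the peaks that still need to be identified.
    """
    remaining = []
    for position, intensity in peaks:
        matches_known = any(abs(position - k) <= tolerance for k in known_positions)
        if not matches_known:
            remaining.append((position, intensity))
    return remaining


# Hypothetical peak list and known line positions (illustrative only).
peaks = [(1.33, 4.0), (2.05, 1.5), (3.21, 2.2)]
known_positions = [1.32, 3.20]
unknown = remove_known_lines(peaks, known_positions)
print(unknown)
```

A fixed tolerance is the simplest matching rule; a real pipeline would likely need to account for overlapping lines and shifts in peak position between samples.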