
October 2007

October 26, 2007

Fringe Benefits and Knowledge Management

Last week I blogged about Phil Keyes' and Anthony Macherone's applications of NMR software towards automated structure confirmation.

A few months back, I pointed you to Steve Coombes' workflow when working with ACD/Structure Elucidator.

Phil had a very nice section in his presentation about the "fringe benefits" he was able to derive outside of the main goal of the project, "Automated Structure Verification".

Specifically, Phil pointed to a couple of fringe benefits:

1) A spectral database is grown as a result of the automated structure confirmation. This database is heavily searchable and can be used as a resource within the company. Building the database is part of the workflow. No extra work needs to be done.

2) The software provides an assignment starting point. In running the verification algorithm, the software automatically attempts to assign multiplets in the 1D and 2D spectra, provides feedback on the quality of those assignments, and makes it easy to edit them:

[Image: assignment view from Phil Keyes' presentation]

Anthony Macherone also mentioned automatically storing data in a searchable database as an additional benefit to conducting automated structure confirmation in his presentation.

On a different application, Steve Coombes spoke a lot about the additional benefits he receives out of ACD/Structure Elucidator.

In this presentation Steve really stresses the knowledge management angle from Structure Elucidator. Sure, the software can help elucidate the chemical structure of unknowns, but it also supports the ability to store the knowledge you gain from working on your data.

In Steve's opinion, this is what separates ACD/Labs software from many other packages out there: the "ability to extract the information and knowledge for further use."

It's not just the ability to build databases with structures and spectra. The key is the ability to assign that data electronically and store it in a searchable database. That's knowledge.

And of course by retaining that knowledge through electronic assignments, you can share that knowledge with the software by training the predictions and improving elucidation and verification performance. 

I'd like to thank these guys for teaching me a nice "marketing" lesson. It's not always about the main application of the software. Always be on the lookout for "fringe benefits".

October 25, 2007

How Accurate are Experimental Chemical Shifts?

Several months ago, I asked, "How Accurate Should NMR Predictions Be?"

Today, I ask how accurate and consistent are actual experimental chemical shifts?

In many ways, this post probably should have preceded the one I linked above because in reality, before a discussion about prediction accuracy can begin, the topic of experimental accuracy needs to be addressed.

Experimental accuracy matters from two perspectives. First, it matters for a chemical shift database that is used to produce the predictions (hence ACD/Labs' Purgatory Database). Second, it matters when judging the accuracy of a predicted chemical shift against an experimental one: when the two disagree, how can we determine where the inaccuracy occurred?

In the process of producing an experimental NMR spectrum, there are many variables that can affect a chemical shift and that are not always carefully controlled. They include, but are certainly not limited to:

  • Concentration of the sample
  • Temperature of the probe
  • Equilibration time in the probe
  • Solvent type
  • Residual water content of the solvent
  • pH of the sample (if aqueous media)
  • Digitization of the spectrum
  • Shimming and phasing inaccuracies
  • Choice of reference standard

Of course, many of these factors can significantly affect the chemical shift of the peaks in the spectrum. How much they are affected is sometimes hard to measure, but as an example, consider the range of entries in our database for the shift of the methyl group protons in toluene. All of the following have been published in peer-reviewed journals.

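[Table: database entries for the chemical shift of the toluene methyl protons, one per source listed under References]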

Of course, the deviations in these shifts are primarily due to the fact that each chemical shift was recorded in a different solvent. The reason for adding 8 sources for toluene to our database is so that we can attempt to take the solvent into account when solvent-specific prediction is performed.

But as mentioned, solvent is not the only variable that can affect how an experimental chemical shift is recorded.

Thoughts?

References:

1. Prog. Nucl. Magn. Reson. Spectrosc., 1996, v. 28, p. 161 (Toluene-d8)

2. Zh. Org. Khim., v. 12, p. 275 (Tetrahydrofuran-d8)

3. J. Org. Chem., 1997, v. 62, p. 7512 (Chloroform-d; 300 MHz; 24 C)

4. J. Org. Chem., 1997, v. 62, p. 7512 (Acetone-d6; 300 MHz; 24 C)

5. J. Org. Chem., 1997, v. 62, p. 7512 (Dimethylsulfoxide-d6; 300 MHz; 24 C)

6. J. Org. Chem., 1997, v. 62, p. 7512 (Benzene-d6; 300 MHz; 24 C)

7. J. Org. Chem., 1997, v. 62, p. 7512 (Acetonitrile-d3; 300 MHz; 24 C)

8. J. Org. Chem., 1997, v. 62, p. 7512 (Methanol-d4; 300 MHz; 24 C)

 

October 19, 2007

Meet Quindolinocryptotackiene

Tony over at ChemSpider takes us on a trip down memory lane to one of the most successful stories surrounding Computer Assisted Structure Elucidation (CASE).

It is also the best example of achieving symbiosis between a spectroscopist (in this case Gary Martin) and software (ACD/Structure Elucidator) I have ever seen.

He is referring to "Solving a structure computationally after 10 years of human effort" that was presented by Gary and Tony at the ASP Meeting in 2003 (It's a long presentation but skip to slide 44 to get to the meat of the presentation).

There is also a publication on this story.

Tony's purpose for resurrecting this story is as follows:

Now, we THINK we have it elucidated correctly. However, we would like to confirm it. Synthesis of the molecule in question, further NMR data generation and a crystal structure would help finish this work fully. This is a call to organic chemists to participate in a hobby project. Anybody want to help? We guarantee a publication etc. The structure is shown below. Contact me at antonyDOTwilliamsATChemspiderDOTcom. Thanks!

Hopefully someone is willing to step up to the plate.

I'd also like to take this opportunity to, once again, point out that CASE is not simply about piling a bunch of data into a piece of software and getting the answer out the other end. Sure, this is possible, but the result usually benefits when an experienced spectroscopist works with the software and shares their knowledge of the existing chemistry. I think Gary's story is a perfect example of that.

That being said, Gary's case, along with comments I have received in the past from Dr. Shaun Tennant (another Structure Elucidator user), shows that the software offers an unbiased approach that will propose some things that the spectroscopist simply might not think about. Knowledge can sometimes be your enemy.

October 18, 2007

Applications of Automated Structure Verification with NMR Software- Part 2

Yesterday I blogged about how Phil Keyes has applied automated structure verification at Lexicon Pharmaceuticals to help validate compound registrations in an open access environment.

Links to the latest performance statistics of our automated structure verification solution for both 1D 1H and combined 1D 1H and 2D HSQC structure verification can be found in the previous post.

As promised, today I will highlight the application of automated structure verification that Anthony Macherone has employed at ASDI.

Anthony works in a high-throughput environment where more than 1000 compounds are directed to 1D 1H NMR analysis per week. Based on this workload, he has implemented a very nice workflow in his laboratory. In his presentation, Anthony mentioned that in his line of work, the ultimate goals are to:

  1. Maximize instrument efficiency
  2. Maximize throughput
  3. Be cost effective

Sounds like some pretty good goals to me. How Anthony is able to achieve this is of course the really interesting part.

Anthony describes his workflow in three phases, the pre-game, middle-game, and end-game. In the pre-game he uses proprietary software (not ACD/Labs) to screen the compounds and "bin" them into appropriate analytical techniques. In doing so he does not have to run a full battery of analytical data on every compound that is screened. In the middle-game, he automates the sample preparation and acquisition using well-plates and the help of robots.

The end-game is where Anthony employs ACD/Labs software. Once the data is acquired, he applies a custom macro to automatically:

  1. Attach chemical structures to appropriate FID files
  2. Process the data (FT, phasing, baseline correction, and integration)
  3. Run the ACD/Labs automated structure verification algorithm (Provide a red light/green light data assessment)
  4. Store the data in a searchable database

Following the data acquisition and analysis, Anthony only needs to manually evaluate the ambiguous or questionable results (i.e., the red light data).
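To make the red light/green light idea concrete, here is a minimal sketch of the triage step only. It is not Anthony's macro and not ACD/Labs code: the `VerificationResult` class, the 0 to 1 `match_score`, the 0.8 threshold, and the sample IDs are all hypothetical, invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class VerificationResult:
    sample_id: str
    match_score: float  # hypothetical 0-1 agreement score, not an ACD/Labs value


def traffic_light(result, pass_threshold=0.8):
    """Return "green" for a confident match and "red" for anything else."""
    return "green" if result.match_score >= pass_threshold else "red"


def manual_review_queue(results, pass_threshold=0.8):
    """Collect the IDs of the samples a spectroscopist still needs to look at."""
    return [r.sample_id for r in results
            if traffic_light(r, pass_threshold) == "red"]


# Made-up plate of three samples, for illustration only.
plate = [
    VerificationResult("A01", 0.95),
    VerificationResult("A02", 0.42),
    VerificationResult("A03", 0.88),
]
print(manual_review_queue(plate))
```

The point of the sketch is simply that the expert's time goes to the red light list rather than to the whole plate.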

Make sure to check out Anthony's presentation for more details regarding the advantages of these phases, time-savings, accuracy, etc.:

Anthony Macherone- High-Throughput NMR Analysis: The End Game

Again, I would like to thank both Phil Keyes and Anthony Macherone for sharing their applications at our New Jersey User Meeting last week.

October 17, 2007

Applications of Automated Structure Verification with NMR Software- Part 1

Several posts back I pointed you to a couple of articles ACD/Labs were involved in with regards to automated structure verification.

I have pointed to these articles, but I have spent little time talking about them. I will now.

For those new to this idea, it involves using software to automatically confirm the consistency between a chemical structure and an NMR spectrum using NMR prediction. Lee Griffiths from AstraZeneca has done excellent work over the years in this field. Lee was kind enough to present at our European User's Meeting last year to share a summary of his approach towards automated structure verification using 1D 1H and 13C, and 2D HSQC data. This presentation can be downloaded here.

In addition, by doing a simple search for "Griffiths" on the Magnetic Resonance in Chemistry webpage, you'll find a whole bunch of relevant articles.

We initially published a validation on the performance of automated structure verification using just 1D 1H NMR data. We then proceeded to publish again recently to compare that to the performance of a combined verification approach using 1D 1H and 2D HSQC data.

As a result of these and other studies, much of the focus of late by ACD/Labs has been on the performance of automated structure verification using 1D 1H and 2D HSQC NMR data.

These publications along with posters we presented at SMASH and ENC on this topic should give you a general idea about the performance and accuracy of this approach.

I am not going to discuss the performance of this approach today but rather focus on the real-world applications and performance in an industrial setting.

Last Thursday I was in New Brunswick, New Jersey at our New Jersey User's Meeting where I was blown away by two terrific presentations by our guest speakers, Phil Keyes from Lexicon Pharmaceuticals and Anthony Macherone from ASDI.

Two different applications in two different environments. I'll talk about Phil's today, and Anthony's tomorrow.  Phil's is interesting as he is setting up a really cool system to significantly improve how analytical data is handled in an open access environment, and further to validate Lexicon's compound registration database.

In my opinion, the crucial thing to point out here is the evolution of the open access environment from a more traditional analytical services setup. It used to be that NMR spectroscopists would run and handle all the analytical data for compounds that a chemist produced, verify their structures for them, and give them the thumbs up or thumbs down. In this environment, spectroscopists were getting a look at the data from all compounds entering the registration database. In an open access environment this is no longer the case. While NMR spectroscopists certainly still see lots of this data, and will likely see a compound's data at some point during its pharmaceutical R&D life cycle, the reality is that there are still going to be some incorrectly or questionably verified structures in a company's registration database that will go on for further testing. Somewhere along the way in the evolution of open access NMR, it became OK for compounds to get registered without being approved by an analytical expert. Of course, these aren't being registered blindly; chemists are approving them, and in most cases they are more than qualified to do so and are doing a good job. However, I have yet to talk to an NMR spectroscopist who has NOT seen compounds registered incorrectly.

My point, of course, is not to pick on chemists here. Sometimes these mistakes are unavoidable and the data LOOKS right. Sometimes there is nothing in the 1H NMR spectrum or the LC-MS that suggests that anything different is present. The key is to better identify when these instances arise in the registration database. Can an automated structure verification solution with NMR software replace and outperform the QC of a chemist for good in an open access environment? No, not right now anyway.

However, the key statement is in Phil's presentation:

"Integrating a system to perform automated compound verification provides value by highlighting compounds for which structural data is complex and subject to interpretation."

Sure there are going to be false positives and false negatives with an automated approach. The question is, if 50 out of 1000 compounds being registered by chemists are incorrect, is there value in automated software highlighting 40 of them?

False negatives can be annoying because they require the spectroscopist to do unnecessary work on a sample that was correct all along. But other times they might point out the need to run more experiments to prove that it is indeed the right structure. Ideally ALL of the data gets manually evaluated, but in the age of open access NMR, where chemists outnumber spectroscopists 100:1 in some organizations, this is clearly no longer feasible. But is there a balance here? While it isn't feasible to manually evaluate the data for, say, 1000 compounds, would it be feasible to manually evaluate the 300 of the 1000 samples that software has highlighted as complex or subject to interpretation?
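The numbers in those two questions are easy to put side by side. The sketch below is only arithmetic on the hypothetical figures used in this post (1000 registrations, 50 incorrect, 40 of the incorrect ones highlighted, 300 highlighted in total). Treating them as a single scenario is my assumption, and the function name `triage_summary` is made up for the example. These are not measured performance statistics.

```python
def triage_summary(total, incorrect, incorrect_flagged, total_flagged):
    """Summarize a hypothetical automated-verification triage."""
    return {
        # Fraction of all registrations that are wrong.
        "error_rate": incorrect / total,
        # Fraction of the wrong structures that the software highlights.
        "errors_caught": incorrect_flagged / incorrect,
        # Wrong structures that slip through unflagged.
        "errors_missed": incorrect - incorrect_flagged,
        # Correct structures that still land on the spectroscopist's desk.
        "correct_but_flagged": total_flagged - incorrect_flagged,
        # Share of the workload that still needs manual evaluation.
        "review_fraction": total_flagged / total,
        # Chance that a highlighted sample really is wrong.
        "flag_precision": incorrect_flagged / total_flagged,
    }


summary = triage_summary(total=1000, incorrect=50,
                         incorrect_flagged=40, total_flagged=300)
for name, value in summary.items():
    print(f"{name}: {value:.3g}")
```

Under these assumed numbers, the spectroscopist would look at 30% of the samples and catch 80% of the incorrect registrations, at the cost of reviewing 260 samples that were fine all along.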

Phil's and Anthony's presentations will be available on the ACD/Labs website shortly, but you, my readers, get advance access to them.

Phil Keyes- Validating Compound Registrations with Automated NMR Verification in Open Access

For those who want to do some advance reading on the topic of tomorrow's blog entry:

Anthony Macherone- High-Throughput NMR Analysis: The End Game

October 15, 2007

Another NMR Blog

I'd like to point out another NMR blogger to all my readers out there.

Glenn Facey, Facility Manager at the University of Ottawa, has created a blog specifically for his NMR users. While he is doing this to provide a resource to the University's students, I think there are some nice tips and tricks in there specifically for NMR data acquisition and processing. There are a couple of housekeeping posts that are irrelevant for those not attending the university, but other than that it is a very useful resource for beginner NMR users and students at other academic institutions.

Check it out here:

http://www.u-of-o-nmr-facility.blogspot.com/

I should point out that Glenn's work is an EXCELLENT application of blogging. I think that all instrument facility managers at academic institutions should have a blog. While we are at it, the same can be said for industry (blogs can be internal as well). It's a great place to talk about instrument maintenance and downtime (avoid the 2-3 emails a week you get from the facility manager) but more importantly to offer the NMR users tips and tricks over time. Students generally only get one in-depth training session with their NMR spectroscopist; a blog offers the ability to provide students with a running commentary from the NMR expert. Contrary to popular belief, students probably aren't going to read the instruction manual, and while spectroscopists put some work into creating a cheat sheet for them, these generally get lost in the bottomless pile of data that graduate students are collecting. You leave one in the instrument room as a resource? Go check, I bet it isn't there anymore :).

Here's another NMR Facility Blog run by Tim Burrows at the University of Toronto:

http://www.chem.utoronto.ca/facilities/nmr/NMRBlog/

There's no reason NOT to do this. It's dead easy. If you can email, you can blog.

Glenn has provided a nice standard to build upon:

[Image: April_enc_2007_014]

P.S. Glenn actually taught our very own application scientist, Arvin Moser, everything he knows ;)

If you are a facility manager for an academic institution and you blog, please let me know. I will be sure to mention you on here at some point. I think you can provide a great resource not only for your students, but for the entire academic community!

October 09, 2007

Metabonomics Anyone?

For those of you out there who do Metabonomics (or want to), you might be interested in learning more about what ACD/1D NMR Processor can do to help you process and analyze your data. Here are some things you should consider when evaluating software for this use:

  1. Very good processing algorithms are required. I think that anyone will tell you that proper phasing and baseline correction are crucial in order to get accurate and reliable statistics from the dataset. Many Metabonomics scientists worldwide have adopted our phasing and baseline correction algorithms and are very pleased with the results. We specifically recommend the “Auto Simple” routine for phasing and the “Spectrum Averaging” algorithm for baseline correction.
  2. How do you analyze and collect your data? This process typically involves slicing a spectrum into arbitrary regions and passing the data to a statistical package for examination. This "bucketing" generally involves creating "buckets" or "bins" that are of equivalent size, meaning that all spectra share the bucket divisions in exactly the same ppm locations. Unfortunately, when spectra are split this way, the results can be very ambiguous, as oftentimes one peak can be split into two regions for statistical analysis, or a peak can migrate to another region in separate spectra. Intelligent bucketing makes better decisions on bucket division locations by overlaying the entire group of spectra and making divisions based on minima in a way that does not split peaks that belong in specific regions. This more sophisticated bucketing routine produces more meaningful statistics and requires less manual work.
  3. Finally, it’s likely that you are analyzing large batches of data in your studies, so you need software that can handle and automatically process these large batches quickly and accurately. You can do this quickly and reliably in ACD/1D NMR Processor.
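To illustrate the bucketing problem described in point 2, here is a toy sketch. It is not the Intelligent Bucketing algorithm in ACD/1D NMR Processor; it is a simplified stand-in that moves each uniform bucket edge to the lowest point of the summed spectra within a tolerance window. The function names, the `looseness` parameter, the Lorentzian peak shape, and the two synthetic spectra (one peak at 0.48 ppm and the same peak shifted to 0.51 ppm) are all assumptions made for the example.

```python
def lorentzian(x, center, half_width=0.02):
    """Synthetic peak shape used only to build toy spectra."""
    return 1.0 / (1.0 + ((x - center) / half_width) ** 2)


def uniform_edges(start, stop, width):
    """Bucket edges at fixed spacing, identical for every spectrum."""
    n = round((stop - start) / width)
    return [start + i * width for i in range(n + 1)]


def adjusted_edges(ppm, summed, edges, looseness):
    """Move each interior edge to the lowest point of the summed spectra
    within +/- looseness ppm of its uniform position.
    Keep looseness below half the bucket width so edges cannot cross."""
    result = [edges[0]]
    for edge in edges[1:-1]:
        window = [(y, x) for x, y in zip(ppm, summed)
                  if abs(x - edge) <= looseness]
        result.append(min(window)[1] if window else edge)
    result.append(edges[-1])
    return result


def bucket_sums(ppm, intensities, edges):
    """Sum the intensities that fall into each bucket [lo, hi)."""
    sums = [0.0] * (len(edges) - 1)
    for x, y in zip(ppm, intensities):
        for i in range(len(sums)):
            is_last = i == len(sums) - 1
            if edges[i] <= x < edges[i + 1] or (is_last and x == edges[-1]):
                sums[i] += y
                break
    return sums


# Two toy spectra: the same peak, shifted slightly between samples.
ppm = [i / 100 for i in range(151)]  # 0.00 to 1.50 ppm
spectrum_a = [lorentzian(x, 0.48) for x in ppm]
spectrum_b = [lorentzian(x, 0.51) for x in ppm]
summed = [a + b for a, b in zip(spectrum_a, spectrum_b)]

uniform = uniform_edges(0.0, 1.5, 0.5)
adjusted = adjusted_edges(ppm, summed, uniform, looseness=0.2)

for label, edges in (("uniform", uniform), ("adjusted", adjusted)):
    print(label, "edges:", edges)
    print("  A:", [round(s, 2) for s in bucket_sums(ppm, spectrum_a, edges)])
    print("  B:", [round(s, 2) for s in bucket_sums(ppm, spectrum_b, edges)])
```

With the uniform edges, the division at 0.5 ppm cuts through the peak, so most of its intensity lands in the first bucket for spectrum A and in the second bucket for spectrum B. Once the division is moved away from the peak, both spectra report it in the same bucket, which is the behavior you want before handing the data to a statistical package.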

More information about Intelligent Bucketing can be found here, here, and here (this one is a tutorial on how to do it in the software).

To get a nice overall description of how to prepare the 1D NMR Spectra of Biofluids, go here.

Finally, anyone who is doing this type of work is probably interested in software that can perform Principal Component Analysis. ACD/Labs software does not do this. For that you should probably look into either:

Umetrics. More info can be found at their website here:

http://www.umetrics.com/default.asp/pagename/Software_OverView/c/1

Pirouette from InfoMetrix. More information here:

http://www.infometrix.com/software/pirouette.html

For those of you doing a lot of work on metabolites and biofluids, you'll definitely want to learn more about Chenomx software.

October 02, 2007

PubChem for Newbies

A couple of reasons for this post.

One, since I have many readers who are chemists or doing lots of chemistry on a daily basis, I think this resource about PubChem will be useful to you (There's also an InChI for Newbies for those interested).

Two, for those of you who didn't already know about it, I wanted to introduce you to a fine blog that is taking place over at Depth First. I follow Rich Apodaca's blog religiously and find it very engaging reading and a great resource.

Be sure to check it out.

Also, Version 11 of our software is coming very soon. I am hoping to share some of the new and exciting features included in the newest version shortly, so stay tuned. (One of them involves PubChem, as a matter of fact!)