Reporting

April 02, 2008

Linking to Meaningful Data in an ELN World

In a previous post, I asked the question: why do paper spectra continue to persist in chemistry?

Of course there is the next challenge, as Rich Apodaca points out on his Depth-First Blog in an earlier post:

The previous article in this series, suggested that the same dynamic applied to the compilation, management, and sharing of spectral data by chemists. More to the point:   

... cheminformatics has failed to deliver an inexpensive, robust, and truly usable solution to the problem of compiling, managing, and sharing spectral data for scientists of average computer skills. ...

To be sure, there are tools that address parts of the problem. But no solution addresses them all and that's why scientists and publishers resort to using obviously inferior solutions like PDFs.

Whether or not organizations and groups are resorting to inferior solutions is up for debate because it of course depends on the expectations of the end user. But his comments definitely struck a chord with me.

So the next question is:

"What is the best way to connect my analytical data to my ELN records TODAY?"

By far, the most common way that I have seen organizations connect the analytical data from our software to ELNs is via PDF.

But as Rich mentions in yet another post, for people who are looking to build on experiments or model or compile the results, static PDF images are practically useless.

I couldn't agree more.

So why do organizations choose this route?

The three biggest reasons I have heard are:

  1. File size limitations in the ELN
  2. The lack of a standard and supported analytical data format that is generic, open, lockable, and widely supported for years to come.
  3. Currently, PDF is more controlled for legacy support than analytical data.

As a result, PDF is the only reasonable approach for many, and it is certainly better than not connecting to a record of the data at all. 

I think the key is for vendors to work horizontally and combine their strengths to deliver, as Rich suggests:

an inexpensive, robust, and truly usable solution to the problem of compiling, managing, and sharing spectral data for scientists of average computer skills.

But the file format remains an issue.

Work by the ASTM E13.15 Committee towards a universal analytical data file format has been ongoing for the past 5-6 years. This file format is called AnIML (Analytical Information Markup Language), the developing XML standard for analytical chemistry data. Most vendors support the general direction of ASTM E13.15 towards a universal data format for analytical data.

A final note on the role of MEANINGFUL data in an electronic world. When I refer to meaningful data, I am referring to knowledge gained and stored in an actual data file as opposed to a static PDF. One of the unique features that ACD/Labs has maintained over the years is the ability to electronically assign NMR data to chemical structures to truly capture not only the data but the knowledge gained from the experiment. I think not leveraging this knowledge is an awful shame, especially in an electronic world, but I think it will come.

Right now, while it is common for NMR spectroscopists to assign their data electronically, it is very rare to find a group of chemists, in the pharmaceutical industry for example, who routinely use their processing tools to assign their data. Why?

  1. They might not have the right software tools
  2. It is not required. In fact, in some cases I have learned that it is forbidden. Why spend the time it takes to assign the data if it is not required or permitted?

A static PDF is indeed proof that an experiment was run, but does it contain information that supports a proof of the proposed structure? Where is the knowledge that was gained from this exercise?

I think 1D NMR Assistant significantly reduces the amount of time it takes to electronically assign a spectrum so now it is just a matter of finding an easy way to tie this assigned analytical data to the ELN.

I think there is a real opportunity here.

What are your thoughts?

Would you prefer electronic data over PDFs?

Is simply raw or processed data enough?

How important is maintaining the knowledge gained from the experiment (i.e. assignments)?

Thanks to Rich for the multiple inspirations for this and previous posts.

February 21, 2008

Let's See your Printed Spectra Do This!- Part 2.

Back to the chemists with their ELN who continue to resort to their paper spectra.

Is it just an old habit?

No. I think it is something else. In fact, I think there are two major (and completely understandable) reasons why some chemists continue to resist the complete transition to the electronic world:

  1. One of our users, whom I really enjoy speaking with when I get the chance, once told me, "The only way you are going to get chemists to fully adopt these tools is when they can access and interact with the spectra JUST as fast as they can do it on paper." I think it's a great point. For some chemists, having to sit in front of a new (or old) piece of software to try to get their data out can be a daunting task. But I think significant strides have been made in this regard. With the transition to open access, NMR spectroscopists have done a nice job (in conjunction with software vendors) of automating processing and creating software macros to provide chemists with access to fully processed spectra. In addition, I think that the usability of NMR software for manual processing (see Shortcut Mode, for example) has greatly improved over the years, but there is of course still work to do.
  2. I think the other reason is that the benefits associated with electronic handling of NMR data have historically not been convincing enough, or not communicated clearly enough to the user. For example, I have beaten the multiplet report topic to death on here, but I am continually amazed by the number of chemists who have had our software for a long time and aren't aware of this feature.

But perhaps easy access to data from your desk and formatted multiplet reports are not enough for a chemist to let go of the paper and embrace the electronic world. In fact, I am convinced they aren't. After all, an NMR spectrum is a means to an end. Sure, the delivery of the results in a readable form in a convenient place is essential, and it has increased productivity in open access environments, but the bottom line is that there is a reason the chemist ran an NMR experiment. In most cases that reason is to determine if their compound's proposed structure is consistent with their spectrum. Historically, NMR processing software has not provided any assistance with regard to data interpretation.

I am hoping that ACD/1D NMR Assistant addresses this challenge and finally convinces the chemist to let go of their paper spectra for good and fully embrace the electronic world. To see how 1D NMR Assistant helps chemists interpret and assign their 1H NMR spectra check out my previous vlog postings (with video) here and here.

February 20, 2008

Are You Attached to the Paper Printouts of Your Spectra?-Part 1

This is the topic I will be presenting on at our annual ENC Symposium on March 9, 2008. If you happen to be attending this conference in Asilomar, check out the agenda and register here.

I mentioned this before, but I follow the Depth-First Blog authored by Rich Apodaca rather closely and I highly recommend it.

I mention it again because Rich had a very interesting post a few weeks back inspired by a discussion about Electronic Lab Notebooks (ELNs) that took place on Derek Lowe's In the Pipeline Blog (which is another blog I follow frequently and highly recommend!)

Fascinating posts and discussion for sure!

Rich notes:

The wasteful process of entombing valuable scientific data often begins with the paper lab notebook, so the subject of ELNs should be of great interest to anyone involved in creating, using, or reprocessing chemical information.

Why do paper notebooks continue to persist in chemistry?

The issue is complex, but in my view stems from the lack of a truly usable and affordable tool. Although the term "tool" may suggest software, it actually involves a much more complex beast consisting of hardware, software, an ergonomic hardware/software user interface, and a computer network. In chemistry, the problem is compounded by the centrality of chemical structures and the inability of most generic ELN products to capture or use them. Given these constraints, and the costs associated with creating and marketing general-purpose products designed to work within them, it's not surprising that many organizations decide to roll their own ELN. And it's even less surprising that many others decide sticking with paper is a better option - at least for now.

I think this is a good argument, and I want to add to it with the question, "Why do paper spectra continue to persist in chemistry?"

Personally, I am amazed by the number of times I have encountered groups who are already using ELNs, but are still routinely using paper spectra. Sure, when the time comes to attach their PDFs (Rich has shared his feelings on this format as well, but that's an entirely different discussion!) to their notebook records, they will extract the electronic file, but until that point many chemists will walk back to the instrument room, pick up their data printout, study it, and then toss it in the recycling bin or on their bench!

For example, I was recently visiting with some chemists in the pharmaceutical industry who were providing me with feedback on our products and discussing their workflow. They mentioned that while they religiously use the desktop NMR processing software for viewing, processing, analysis, interpretation, and reporting to their ELN right on their laptops, in their lab...many chemists from their group still make the walk to the instrument room for their piece of paper instead.

Paperless environment?

Old habits, I guess.

Why use a piece of paper when you can access the data electronically, meaning you can zoom in and out on regions of the spectra to take a closer look? I guess in most cases you don't really need to do that, but in the many labs where the data can be accessed electronically on a laptop in your lab, why make the walk back to the instrument room for that piece of paper?

What are your thoughts? Are you still attached to the paper printout? If so, why?

I'll share my opinion in part 2 of this topic tomorrow.

January 24, 2008

What's New in Version 11- Improved Multiplet Analysis

In case you didn't know, version 11 of ACD/Labs software was released in November of 2007.

While there are several updates to the software over the course of a year, a major new version of each of our NMR products is released on an annual basis.

Over the next couple of weeks on this blog, I will try to mix in some new features and improvements that have been implemented in version 11.

First up, I'll talk a bit about improvements in the automated multiplet analysis algorithm in ACD/1D NMR Assistant, ACD/1D and 2D NMR Processor, ACD/1D and 2D NMR Manager, and ACD/1D and 2D NMR Expert.

For those unfamiliar with this process, automated multiplet analysis refers to the use of a software algorithm that automatically characterizes coupling patterns and extracts coupling constants from multiplets in an NMR spectrum.

The most popular application of this feature is that it provides an incredibly fast way of generating a formatted multiplet report for patents and publications:

1H NMR (400 MHz, DMSO-d6) δ ppm 2.34 (dd, J=15.97, 8.06 Hz, 1 H) 2.64 (dd, J=16.05, 5.35 Hz, 1 H) 3.81 (tt, J=7.71, 5.33 Hz, 1 H) 4.47 (d, J=7.47 Hz, 1 H) 4.87 (d, J=5.13 Hz, 1 H) 5.68 (d, J=2.20 Hz, 1 H) 5.88 (d, J=2.34 Hz, 1 H) 6.58 (dd, J=8.13, 1.98 Hz, 1 H) 6.68 (d, J=8.06 Hz, 1 H) 6.71 (d, J=1.91 Hz, 1 H) 8.
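
Because a report like this follows a regular pattern (shift, then pattern, coupling constants, and proton count in parentheses), it can be turned back into structured data fairly easily. Here is a minimal Python sketch of that idea. To be clear, this is not how ACD/Labs software works internally; the `Multiplet` class, the `ENTRY` regular expression, and the `parse_report` function are all hypothetical names made up for illustration, and the pattern only handles entries written in the style shown above.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Multiplet:
    """One entry of a formatted multiplet report (hypothetical structure)."""
    shift_ppm: float
    pattern: str                      # e.g. "d", "dd", "tt", "m"
    couplings_hz: Tuple[float, ...]   # empty when no J values are reported
    protons: int


# Assumed entry layout: "2.34 (dd, J=15.97, 8.06 Hz, 1 H)"
ENTRY = re.compile(
    r"(?P<shift>\d+\.\d+)\s*\("
    r"(?P<pattern>[a-z. ]+?)"
    r"(?:,\s*J=(?P<j>[\d., ]+?)\s*Hz)?"
    r",\s*(?P<n>\d+)\s*H\)"
)


def parse_report(text: str) -> List[Multiplet]:
    """Extract every multiplet entry found in a report string."""
    multiplets = []
    for match in ENTRY.finditer(text):
        j_text = match.group("j")
        couplings = tuple(float(x) for x in j_text.split(",")) if j_text else ()
        multiplets.append(
            Multiplet(
                shift_ppm=float(match.group("shift")),
                pattern=match.group("pattern").strip(),
                couplings_hz=couplings,
                protons=int(match.group("n")),
            )
        )
    return multiplets


report = (
    "1H NMR (400 MHz, DMSO-d6) 2.34 (dd, J=15.97, 8.06 Hz, 1 H) "
    "2.64 (dd, J=16.05, 5.35 Hz, 1 H) 3.81 (tt, J=7.71, 5.33 Hz, 1 H)"
)
for multiplet in parse_report(report):
    print(multiplet)
```

The point of the sketch is simply that a text report still carries recoverable numbers, whereas an image of a spectrum does not.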

So how well does this automated algorithm perform in Version 11?

Significant development efforts were put into improving the automated routine for Version 11 and I will present the results of a direct comparison to version 10 of the software.

For this study, two different data sets each consisting of the 1H NMR spectra of 30 samples were studied (~250 multiplets in total).

Test Set 1- A set of 30 spectra (~250 multiplets) with reasonably good signal to noise:

[Image: Good]

We ran the automated multiplet analysis routine on all 30 spectra and got the following results in Version 10:

[Image: V10magoodsn]

In version 10, 53% of the multiplet patterns in the spectra for the 30 samples were correctly defined, 32% were undefined, leaving 15% that were incorrectly defined. Multiplets that are termed undefined are simply given an "m" designation for multiplet. In standard practice and in manual analysis, the "m" designation is often assigned by users to multiplets with unresolved peaks, often due to strong coupling.

Version 11 Results:

[Image: V11magoodsn]

As you can see, while the number of correctly defined multiplets increased by 5%, the most notable observation is the reduction of incorrectly defined multiplets from 15% to 1%.
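
For anyone who wants to see the arithmetic behind those Test Set 1 numbers, here is a small Python sketch. It rests on two assumptions of mine: that "increased by 5%" means five percentage points, and that the three categories always add up to 100%. Under those assumptions the Version 11 undefined share is inferred as the remainder rather than read from the chart.

```python
# Version 10 percentages as quoted in the text for Test Set 1.
v10 = {"correct": 53, "undefined": 32, "incorrect": 15}

# Version 11: "correct" up by 5 points (assumed reading), "incorrect" down to 1%.
v11 = {"correct": v10["correct"] + 5, "incorrect": 1}

# Assumption: the three categories sum to 100%, so "undefined" is the remainder.
v11["undefined"] = 100 - v11["correct"] - v11["incorrect"]

# Relative reduction in incorrectly defined multiplets between versions.
relative_drop = (v10["incorrect"] - v11["incorrect"]) / v10["incorrect"]

for category in ("correct", "undefined", "incorrect"):
    print(f"{category:>9}: v10 {v10[category]:>2}%  v11 {v11[category]:>2}%")
print(f"incorrect assignments reduced by {relative_drop:.0%} relative to v10")
```

Read this way, most of the multiplets that were previously mislabeled end up in the undefined "m" bucket rather than being given a wrong pattern.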

Test Set 2- A set of 30 spectra (~250 multiplets) with lower signal to noise:

[Image: Poor]

Version 10 Results:

[Image: V10mapoorsn]

Version 11 Results:

[Image: V11mapoorsn_2]

As you can see in the second study, we did sacrifice some correctly defined multiplets (6%), but we were able to reduce the number of incorrectly defined multiplets significantly from 14% to 3%.

So how does this improved accuracy impact the end user?

First of all, this will result in significant time savings with more accurate automated multiplet analysis and report creation for spectroscopists' and chemists' patents, publications, etc.

But perhaps of more scientific relevance, improved automated multiplet analysis heavily impacts the performance of automated structure verification in ACD/1D and 2D NMR Expert, as well as the structure verification algorithm included in ACD/1D NMR Assistant.

How much impact are we talking about here?

Stay tuned, it will be the topic of my next post.

If you are currently using Version 11, try it out, and share your results in the comments section.

December 14, 2007

Spectrum-to-Structure Integration - What is it?

One of the unique features that ACD/Labs software has embraced over the years is something called spectrum-to-structure integration.

It's the idea of not only attaching a chemical structure to a piece of analytical data but to also assign that data to pieces of the chemical structure.

This topic gives me a good opportunity to show you one of the new additions in version 11 of ACD/1D NMR Processor (and ACD/1D NMR Assistant) and to perhaps emphasize my point.

I have blogged about the ability to create a multiplet report before, but now in version 11, enhancements have been added. It is now possible to select from a list of pre-formatted templates for multiplet reports based on official journal and patent formats:

[Image: Jnatp]

Furthermore, you can also create a report of a different format defined by you, the user:

[Image: Userdef_2]

To get back on topic let's look at a formatted multiplet report for the Journal of Natural Products:

1H NMR (400MHz, DMSO-d6) δ = 10.40 (1H, br. s., H-13), 8.49 (1H, t, J = 5.5 Hz, H-10), 7.63 (2H, d, J5, 3,6, 2 = 8.7 Hz, H-5, 3), 6.54 (2H, d, J6, 2,5, 3 = 8.7 Hz, H-6, 2), 5.69 (2H, s, H-7), 3.58 (2H, q, J = 6.0 Hz, H-11), 3.04 - 3.25 (6H, m, H-16<''>, 14<''>, 12, 16<'>, 14<'>), 1.22 (6H, t, J = 7.2 Hz, H-17, 15)

The thing I really like about the Journal of Natural Products report is that it includes the assignments in their multiplet report (see assigned atom labels in bold).

This is a nifty time-saving feature, but more on my point regarding spectrum-to-structure integration, notice that the software assigns an experimental multiplet. Not a peak in a multiplet nor a region of the spectrum.

Within ACD/1D NMR Processor or ACD/1D NMR Assistant, users can assign experimental multiplets, meaning that assigning an atom to the spectrum associates it with the multiplet and all of its properties (including chemical shift, coupling constants, integration values, etc.).
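
To make the idea concrete, here is a rough Python sketch of what "assigning a multiplet" could look like as a data structure. This is only my illustration, not the actual ACD/Labs data model: the `AssignedMultiplet` class and its `jnp_style` method are hypothetical names. The atom labels live on the same record as the shift, pattern, coupling constants, and integral, so a report with assignments falls out of the record directly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AssignedMultiplet:
    """A multiplet plus the atoms assigned to it (hypothetical structure)."""
    shift_ppm: float
    pattern: str
    couplings_hz: List[float]
    protons: int
    atoms: List[int] = field(default_factory=list)  # atom numbers in the structure

    def jnp_style(self) -> str:
        """Format the entry roughly in the Journal of Natural Products style above."""
        parts = [f"{self.protons}H", self.pattern]
        if self.couplings_hz:
            j_values = ", ".join(f"{j:.1f}" for j in self.couplings_hz)
            parts.append(f"J = {j_values} Hz")
        if self.atoms:
            parts.append("H-" + ", ".join(str(a) for a in self.atoms))
        return f"{self.shift_ppm:.2f} ({', '.join(parts)})"


# Two entries taken from the report above.
amide = AssignedMultiplet(8.49, "t", [5.5], 1, atoms=[10])
methyls = AssignedMultiplet(1.22, "t", [7.2], 6, atoms=[17, 15])

print(amide.jnp_style())
print(methyls.jnp_style())
```

Had the assignment been attached to a single peak or to a region of the spectrum instead, the coupling constants and integral would have to be looked up somewhere else before a report like this could be written.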

This of course becomes incredibly important when considering software for automated structure verification. ACD/Labs' solution is not just based on chemical shift prediction, but also on observed and predicted multiplet characteristics, and integration information.

It is these characteristics that are important in the implementation of the algorithms used in tools such as ACD/1D NMR Assistant, ACD/1D NMR Expert, and ACD/2D NMR Expert.

When evaluating different NMR software packages, keep an eye out for true spectrum-to-structure integration.

November 14, 2007

New Product Time! Introducing ACD/1D NMR Assistant

I apologize for not posting much lately, but things have been pretty busy in the ACD/Labs NMR world.

I have alluded to this moment in previous posts for a while now, and I am happy to unveil a new NMR product to be offered by ACD/Labs.

ACD/1D NMR Assistant.

So first and foremost, what is it?

I have talked a lot about how synthetic chemists currently use ACD/Labs software and blogged about the benefits and key features in the software.

The bottom line for us was that while we have been successful in selling ACD/1D and 2D NMR Processor to chemists and students in industry and academic institutions, we believed that there was still quite a bit of work to be done to design a tool for the chemist.

The truth is, ACD/1D NMR Processor has been around for quite some time. From the very beginning, we were developing a product with the NMR Spectroscopist in mind. Naturally, during the evolution of this product many sophisticated and advanced features have been added and as a result the software can sometimes be viewed as bloated, and overly complex for a novice or non-expert user.

In recent years we have worked hard on continuing to add advanced features but to also try and simplify things. At the end of the day, we made the decision to go in a different direction and create a separate product. This way we can continue to develop ACD/1D NMR Processor with the Spectroscopist in mind, and build ACD/1D NMR Assistant with the synthetic chemist in mind.

So as a result, we developed ACD/1D NMR Assistant and it is now finished and available.

How is ACD/1D NMR Assistant different than ACD/1D NMR Processor?

  1. Ease of Use- ACD/1D NMR Assistant includes all of the features available in ACD/1D NMR Processor; we have just de-emphasized some of those features in the software in an effort to greatly simplify the toolbars and interface. I believe that we reduced the learning curve significantly. One example is that upon file import an FID will automatically get FT'ed, phase corrected, and baseline corrected. In addition, the software will look for solvent and water signals automatically and darken them out. Because it includes Shortcut Mode from NMR Processor, users can peak pick, integrate, and characterize multiplets with one simple click and drag over each multiplet.
  2. Assignment Assistance- The big improvement added to ACD/1D NMR Assistant is that the software can now provide the user with feedback on potential assignments. When a structure is proposed users can hover over a multiplet of interest and the software will provide real-time feedback as to what the best assignments are. It does so by highlighting atoms in a structure with a green, yellow, and red color scheme.
  3. Structure Verification- Whether a user simply wants to check their own assignments, or ask the software to provide feedback on the consistency between a proposed structure and an experimental spectrum, the software provides this capability via one simple button click. If the software deems a spectrum-structure pair as inconsistent, it provides direct feedback on which part (or parts) of the spectra should be looked at closer and what the specific problem is.

The ultimate motivation behind this product was the simple idea that "an NMR spectrum is a means to an end."

While there are several software packages that can process NMR data and print out spectra for evaluation, there is nothing available that really helps users evaluate and interpret an NMR spectrum and its relationship with a chemical structure.

That is until now, IMHO.

I think the development of ACD/1D NMR Assistant changes all that.

One more thing...

This product wasn't just built by a few software-oriented people within ACD/Labs. We have been speaking with current customers and chemists for years on how we can improve our offering. I personally have spent a very considerable portion of the last 2 years speaking with chemists, students, and spectroscopists who support chemists, discussing what was lacking in our current offerings and how we can improve our software for these users. Further, I spoke with people who have never used ACD/Labs software and discussed their desires and expectations for an NMR package that would suit their needs.

Finally, after the first stage of new development we sent the software to groups of chemists in 5 different pharmaceutical organizations in the US and had them evaluate the software. These groups ranged from chemists who currently used our 1D NMR Processor, to chemists who had never used it. In doing this we hoped we would be able to appropriately gauge the acceptance and learning curve of the new software. This wasn't beta testing...this was MARKET testing. We are very grateful to these chemists who provided valuable feedback that was implemented in the final version of the software. They were able to point out some very obvious things that our software-focused minds simply overlooked for years.

Following these evaluations, we took the feedback we received and performed another round of development + evaluation to optimize our offering.

And through those exercises, we reach today. A finished product in hand that I am very excited about.

Over the next few weeks, I will be highlighting different features and workflows in the software to educate you on how it works and what it does.

For those of you who want a sneak peek right now, go over to the ACD/Labs website and view the movie that shows the software in action. If you like what you see, just fill out the form at the bottom of the page and you can get a free evaluation copy to try yourself.

Click here to go to the 1D NMR Assistant page.

I hope you like it!

July 18, 2007

Electronic Lab Notebook (ELN) Integration

If you have followed ACD/Labs for years you may have come across the following older press releases at one point or another:

http://www.acdlabs.com/clients/pr_cs1002.html

http://chembionews.cambridgesoft.com/Articles/Default.aspx?articleID=269

Over the last few years, it seems that more and more organizations are making the decision to deploy an ELN. The choices are many, and not limited to Cambridge Soft of course (to name a few others). I thought I would take this opportunity however to share with you some of the integration that is possible between Cambridge Soft's E-Notebook and ACD/Labs tools.

While many organizations are busy evaluating different ELN offerings and understanding how they will work in their environment, some are not thinking about how these packages will work with their analytical data. Will a chemist simply be responsible for copying and pasting valuable pieces of a report into their notebooks? Should there be a link to the live or archived data? Should users be able to access their data right in the ELN interface? There are many possibilities, and I will highlight one that is available thanks to ACD/Labs SpecX technology and Cambridge Soft's ELN.

For an explanation of what exactly SpecX can do, go here:

http://www.acdlabs.com/products/glob_sol_lab/activex/specx/

As far as demonstrating the integration, I'll just add a couple of screenshots to whet your appetite:

A screenshot of a 1H NMR and MS spectrum. Note you can freely zoom in or zoom out on different regions of the spectra (click images to enlarge):

[Image: Eln_image2]

By right clicking on the spectrum, you can explore several other options:

[Image: Eln_image3]

For example, you can access all of the data tables available in the ACD/Labs Processor such as tables of assignments, peaks, multiplets, annotations, etc.

[Image: Eln_image5_2]

Perhaps the coolest feature is if you select the edit option in the drop down menu. This will launch the file in the ACD/1D NMR Processor application so that the user can make modifications to their spectra.

You can also store both the fully processed data as well as the raw data. Of course these can be easily processed and updated in the ACD/1D NMR Processor as well:

[Image: Eln_image7]

Here's a short video (no audio) download of the CS and SpecX integration in action (my Japanese readers will be pleased to see the interface is in Japanese! I apologize to all others; it's all I have at the moment).

Of course, there are many other options for tying your analytical data handling in with your ELN practices, and I will post more as the months go by. In the meantime, my colleague Andrew Anderson presented at our ENC User Meeting back in April on "E-Lab Notebook Integration- Leveraging ACD/Labs Tools to Enhance a Paperless Environment". Check it out!

Do you already have a good solution in your organization? Trying to find one?  Feel free to share your stories, challenges, and expectations in the comments section of this post.

Thanks to Arvin and Furuta-san for the images and video!

July 10, 2007

Where is the Quality? Assumptions vs. Assurance

As I mentioned before in this blog, I have spent the last year and a half visiting with medicinal and process chemists, as well as NMR spectroscopists who support chemists in an open access environment. I’ve been trying to understand how NMR software is being used in these environments and how things can be improved in this world.

Of course the convenience of having NMR software at their desk or in their lab is a major bonus. J-Coupler and the ability to produce a multiplet report in journal format for patents and publications is always a huge hit.

A few weeks ago, I blogged about how chemists use NMR and I also questioned the information (or lack thereof) in archived data, and spectra in reports.

I am going to riff a bit about what I have learned about compound registration. In the great majority of medchem groups I have spoken with, a compound is synthesized, an NMR and sometimes an LC-MS is run, and then a registration form is filled out. This registration form consists of a check box where the chemist selects which experiments were run and a statement such as, “the structure is consistent with the data collected.”

My question is, what happens with this data? Generally this data will be pasted in a chemist’s lab notebook, or maybe even in their ELN if the organization is that far along. But for the most part this is just data. Whether it’s a printout or an electronic file, it’s generally just the data. Maybe there is a structure attached, and some chemists will diligently assign their spectra to their structure on their printout.

But the issue I see here is that there is no consistency. There is no mandate at the top that is telling the chemists to provide support of a proof of structure. A 1H NMR and mass value cannot provide a full proof of structure…but how exactly is a simple statement like, “the structure is consistent with the spectrum” validated? Certainly it’s not validated from a paper printout of a spectrum.

I hope people realize that my intention is not to call out chemists. The chemists are not the problem here. Their job at this particular stage is to make compounds and fill out a registration form (forgive me for being overly simplistic). Their job is not to fully assign their data and prove that at least there is nothing in the data that suggests an inconsistency. Again, a 1H NMR is not going to be a proof of structure. But it was run for a reason, and that reason should be validated.

I’ve talked to many chemists who are very diligent about their handling of NMR data. They will use our software and fully assign their data to a chemical structure and ensure that there is no inconsistency. They do this not because they have to, but for their own peace of mind, and because they believe it is the responsible thing to do. But again, the problem is consistency. For every chemist who is diligent about their NMR data handling, there are several more who don’t do it. In fact, there are perhaps some who run the NMR but don’t look at it. The identity of their compound is confirmed by a mass value, and in the environment they are working in that is perfectly acceptable. I personally don’t blame them for not spending the extra couple of minutes to fully assign their data. It’s not their job!

Maybe I am out of place…maybe I am misunderstanding the environment or the reasoning behind not taking this direction. Perhaps this just isn’t the right time to bring this issue up. But in the age of electronic lab notebooks, to me, it seems like a perfect time to discuss and implement this.

But in an industry that is so heavily regulated, it blows my mind that a chemist's decision to register a compound is based on good faith.

Perhaps an expert in this area can clear things up for me and put me in my place. I am simply trying to espouse the benefits of our software. I believe assigning NMR data electronically to structures has major benefits.

P.S. There's another benefit. Imagine chemists in an open access environment fully assigning all of their NMR data and then sending it off to an NMR spectroscopist. This will create a purgatory database for the NMR spectroscopist to evaluate. Once the data is quality checked, the spectroscopist can build a database of all this information and, in turn, vastly improve the NMR prediction capabilities within the organization.

June 19, 2007

Archive This

Maybe this is sufficient:

[Image: Http___wwwacdlabs]

I disagree. I think this is the way to go:

[Image: Http___wwwacdlabs2]

If you are currently interested in archiving or reporting NMR spectra, I think you should read the following application note. Especially if you have an electronic lab notebook (ELN):

Archiving NMR Data in an Electronic Lab Notebook World