This sliding window approach elegantly resolves the issue of chimeric spectra by adding a crucial dimension for computational deconvolution. It is a sophisticated advancement that significantly boosts the reliability and depth of untargeted metabolomics.
Back to the future with sliding MS2 windows on the ZenoTOF 8600 system
What I'm going to be talking about today, as mentioned, is introducing the new sliding MS2 on the SCIEX ZenoTOF 8600 system and the way in which we're taking advantage of it. The inspiration for the title, "Back to the Future," you'll see in a little while: it goes back to some experiments that we originally proposed over a decade ago but were never able to do until the launch of this instrument. So we're very excited now to be doing some of these experiments for the first time.
The talk today is going to contain two parts. First, I'm going to provide an overview of the workflows in metabolomics and try to point out where some of the gaps and rate-limiting steps are. Then, in part two, I'm going to discuss how we've deployed the ZenoTOF 8600 system to try to overcome those gaps and improve the metabolomics workflows.
So we'll go ahead and start here with the metabolomics workflows. I do want to start by mentioning one overview concept, and that is what I mean when I say the word metabolomics. I know that this word can mean different things to different people, so I'd like to provide a little bit of a visualization of how we think of the chemical universe in my lab. We divide the chemical universe into macromolecules and small molecules. The macromolecules are going to be your genomics, your genetic content, your proteins. On the right, we have small molecules, and I'm going to use the word metabolomics to refer to all of those small molecules. So it's not just water-soluble metabolites. When I say metabolomics, what I mean is profiling water-soluble metabolites as well as lipids as well as chemical exposures. So everything I talk about is going to be broadly applicable to all those categories.
So we'll start with just an overview of the workflow here, of how we go about performing metabolomics or lipidomics or chemical exposomics. There's nothing terribly original here, but it starts with sample preparation.
So we want to isolate those small molecules. Then we're going to analyze them by using mass spectrometry. The workflow we're going to be talking about today utilizes LC-MS, liquid chromatography-mass spectrometry. You can do these metabolomics experiments with other technologies like GC-MS or NMR, but LC-MS tends to be the preferred platform because of the expansive chemical coverage that you achieve.
After you generate the data, the data files are quite large and you can't analyze them manually. So, we do rely on software for data processing and that usually results in several thousand different signals, unique signals.
Sometimes in metabolomics we refer to those as features and there's a range there because the number of signals that you get does largely depend upon the context of the experiment.
After you generate the signals, you don't want to just stop there. There's not a whole heck of a lot you can do with signals alone. You want to be able to structurally identify those signals with a high level of confidence. So take those signals, convert them into IDs.
That does require doing MS/MS experiments, or fragmentation experiments, and matching that MS/MS data to MS/MS data in databases or from authentic chemical standards. Now you can see here that the arrows on the top are color-coded green.
The arrow on the bottom is color-coded red, and there are corresponding traffic lights on the right of the slide. This is to denote that the first series of steps is relatively easy to do. I think most labs around the world can now carry out sample prep, LC-MS, and feature detection.
But the current bottleneck, and what still represents the rate-limiting step of untargeted metabolomics, is converting those signals into identifications.
Now, just as a little background to indicate how serious a challenge this is, I wanted to show you some real data. These data are now quite old. We published them over a decade ago, but the situation really hasn't changed that much. You can see that in this scenario we're looking at human plasma, and we're measuring on the order of about 25,000 features. The number that we can identify is just this small sliver here off to the right. So the point is you're measuring a lot of signals that you're not able to chemically identify. Now, you might say, well, humans are really complex, maybe that's specific to humans. But we see the exact same thing for much simpler systems. These are simply cultured E. coli cells that were analyzed, and you can see we get effectively the same kind of result: lots and lots of signals that are measurable, but a very small fraction that we can chemically identify with a high level of confidence.
It raises the question: what are these signals here in the red? We're measuring these things, but what are they? Measuring something that you can't identify is not particularly useful in the context of biology. So we'd like to think about how we can move forward and try to identify what these signals are.
Now, to that end, we and many other groups around the world have spent quite a lot of time over the last decade or so trying to understand what these signals could be. In that vein, what I'd like to do on this slide is develop a picture of what we've come to think of as the road map of an untargeted, LC-MS-based metabolomics data set. As I say, this isn't just work that we've done; it builds on research from labs all over the world. The first thing we want to do is recognize that the signals you measure can either be of biological origin or of non-biological origin. What I mean by that is that there are signals that are derived from contaminants or from artifacts. Sometimes the words contaminants and artifacts are used interchangeably. We prefer not to equate the two. We think of contaminants as real molecules that show up in your data set but are not derived from the sample itself. They're coming from other places, perhaps impurities in the solvents, perhaps plasticizers in the vials that you're using, but they're not coming from the sample, so they're not biologically relevant. The other source of non-biological signals is artifacts. Artifacts are not real molecules, but they are denoted as peaks in the data processing workflows. These can arise from things like electronic noise, or they can be small deviations in your chromatographic baseline that the peak detection software falsely says are real signals.
The biological signals also come in a couple of different flavors. We have those biological signals that are redundant, and we have those that are unique. What I mean by redundant is that one metabolite, lipid, or chemical exposure can show up as many different signals. The main sources of that redundancy, sometimes referred to as degeneracy, are shown here. A compound can show up as an adduct with different salts: sodium, potassium, ammonium, etc. There are naturally occurring isotopes. Metabolites can stick to one another and form oligomers, and metabolites can break into smaller pieces as they enter the mass spectrometer.
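To make the redundancy concrete, here is a minimal sketch of how a single compound fans out into several features through adducts and isotopes. The adduct list, the choice of glucose as the example, and the function name `redundant_signals` are illustrative assumptions, not something taken from the talk; oligomers and in-source fragments are left out.

```python
# Monoisotopic m/z shifts (Da) for common singly charged positive-mode adducts.
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}
C13_SPACING = 1.003355  # mass difference between 13C and 12C, in Da


def redundant_signals(neutral_mass):
    """List the m/z values one neutral compound can produce (hypothetical helper).

    Only adducts and the first 13C isotope peak are modeled here.
    """
    signals = {}
    for name, shift in ADDUCT_SHIFTS.items():
        mz = neutral_mass + shift
        signals[name] = round(mz, 4)
        signals[name + " 13C1"] = round(mz + C13_SPACING, 4)
    return signals


glucose = 180.06339  # monoisotopic mass of C6H12O6, chosen as an example
signals = redundant_signals(glucose)
for name, mz in sorted(signals.items(), key=lambda kv: kv[1]):
    print(f"{name:14s} {mz:.4f}")
```

Even with this short adduct list, one compound already accounts for eight features, which is the kind of degeneracy the road map attributes to redundant signals.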
The unique signals can be those that are identified. These are going to be your TCA cycle metabolites, your glycolytic intermediates, your amino acids, your fatty acids, long-chain fatty acids, cholesterols, etc. And then we have those that are unidentified.
These fall into two different categories. There are those that are unidentified because they are truly novel molecules. These are really exciting molecules that people have never reported before. They're not in the literature, they're not in the databases, and they're not in the textbooks yet. And then we have those that are unidentified because they are chimeric.
Now, if this is a word you haven't heard before, we'll spend a couple of minutes talking about what chimeric means in a moment. But this is a real impediment to identifying signals in untargeted metabolomics.
In our experience, when we tried to annotate all of these different signals in various kinds of data sets over many years, this is the relative distribution of each of those signals that we generally see. Again, this is on average, and there are exceptions. But you can see that we measure a lot of contaminants and artifacts, and there's a lot of redundancy. When you look at the things that we actually care about, the novel molecules and the known molecules down here, you can see that they are a very small fraction of the total number of signals that you're measuring.
What I'd like to zoom in on today, and spend the rest of the talk discussing, is the category of signals that I've boxed here, called chimeric. Your initial reaction might be, yeah, but it's so small compared to the artifacts and contaminants up here. But it actually is quite significant when you look at its size compared to the novel and the known signals. So we want to compare that bar not to what's at the top but rather to what's underneath it.
Okay. So, without further ado, what do I mean by chimeric spectra? When you're doing metabolomics, as I alluded to earlier, you have to collect fragmentation data to support structural identifications. To do that, we collect a precursor in the mass spectrometer and we fragment it. In this particular example, I'm showing you three different precursors, and they're all color-coded: some precursors are blue, some are green, and some are red. What we'd like to do is isolate just one precursor in the fragmentation cell and then break it apart. That means that all the fragments we measure in that MS/MS experiment come from the blue precursor. When we match it to a database, which is MS/MS data that was derived from a pure authentic standard, you can see they match quite well. But what happens when we fail to isolate just a single precursor in the fragmentation cell and instead isolate two or more precursors? In that scenario, not only are we measuring fragments from the blue precursor, but we're also measuring fragments from the red precursor. Whenever you have fragments from multiple precursors, we call those MS/MS data chimeric. And you can see the problem: when you have chimeric data, they do not match the database, because the databases were built from authentic standards, pure compounds.
So there's no chimeric data in the databases.
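A small sketch can show why a chimeric spectrum fails to match a library entry. It scores spectra with a plain cosine similarity, which is a common choice but is an assumption here, since the talk does not say which scoring function is used. The fragment m/z values and intensities are invented for illustration, and peaks are matched by exact m/z rather than within a tolerance.

```python
import math


def cosine(spec_a, spec_b):
    """Cosine similarity between two spectra given as {fragment m/z: intensity}."""
    keys = set(spec_a) | set(spec_b)
    dot = sum(spec_a.get(k, 0.0) * spec_b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical library spectrum of the "blue" compound (pure standard).
library_blue = {56.05: 40.0, 84.04: 100.0, 130.05: 60.0}
# Hypothetical fragments of a co-isolated "red" compound.
red = {44.05: 80.0, 72.08: 100.0, 101.07: 50.0}

pure = dict(library_blue)      # only the blue precursor was isolated
chimeric = dict(library_blue)  # blue and red were isolated together
for mz, intensity in red.items():
    chimeric[mz] = chimeric.get(mz, 0.0) + intensity

pure_score = cosine(pure, library_blue)
chimeric_score = cosine(chimeric, library_blue)
print(f"pure vs library:     {pure_score:.2f}")
print(f"chimeric vs library: {chimeric_score:.2f}")
```

The blue fragments are all still present in the chimeric spectrum, yet the score drops well below the pure case because the extra red fragments have no counterpart in the library entry.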
So let's start our discussion with data-independent acquisition, or DIA, a technology that was pioneered by SCIEX, called SWATH, 15 years ago or so. What you're doing here is using wide MS/MS isolation windows and scanning across the entire m/z space, one window after the other. So you do an MS1 scan and then you do all these MS2 scans that are partitioned across the m/z space. I know there are a lot of iterations of SWATH, it has evolved, and there are much more sophisticated ways of doing it, but this is one of the earlier iterations. You can see that generally that works. It takes about 3 seconds to do, and that constitutes one cycle.
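As a rough sketch of how such a cycle is laid out, the snippet below partitions an m/z range into contiguous fixed-width windows and divides the quoted 3-second cycle across the scans. The 25 m/z width and the 3-second cycle come from the talk; the 100 to 1000 m/z range and the equal time per scan are assumptions made only for illustration.

```python
def fixed_windows(mz_start, mz_end, width):
    """Return contiguous (low, high) isolation windows covering [mz_start, mz_end)."""
    windows = []
    low = mz_start
    while low < mz_end:
        windows.append((low, min(low + width, mz_end)))
        low += width
    return windows


# Assumed scan range of m/z 100-1000; 25 m/z is the window width quoted in the talk.
windows = fixed_windows(100.0, 1000.0, 25.0)
n_scans = 1 + len(windows)   # one MS1 scan plus one MS2 scan per window
cycle_time_s = 3.0           # cycle time quoted in the talk
per_scan_ms = 1000.0 * cycle_time_s / n_scans  # assumes equal time per scan

print(f"{len(windows)} MS2 windows, {n_scans} scans per cycle")
print(f"about {per_scan_ms:.0f} ms per scan")
```

Under these assumptions every precursor falls into exactly one window per cycle, so anything else within the same 25 m/z slice is fragmented along with it.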
Now, when you do SWATH, we acknowledge that you're going to get a lot of chimeric spectra because you're using these really wide MS/MS isolation windows of 25 m/z. But how do we quantify how much of the MS/MS data is chimeric? To do that, we're going to use a technique that was first put forward by Rick Dunn's group back in 2017. Essentially, you measure everything in a scan. This is the targeted precursor, and these are the co-isolated analytes. Then you take the size of the blue and divide it by the size of the blue plus the size of the red. So it's essentially the fraction of blue that you have compared to the total.
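The ratio described here, target intensity over target plus co-isolated intensity, can be sketched in a few lines. This is a simplified illustration rather than the published implementation: the function name `precursor_purity`, the rectangular window, the 0.005 m/z target tolerance, and the peak list are all assumptions.

```python
def precursor_purity(target_mz, ms1_peaks, window_width, target_tol=0.005):
    """Fraction of the isolated MS1 intensity that belongs to the target precursor.

    ms1_peaks is a list of (m/z, intensity) pairs from the MS1 scan. The window
    is treated as an ideal rectangle centered on target_mz (an assumption).
    """
    half = window_width / 2.0
    in_window = [(mz, i) for mz, i in ms1_peaks if abs(mz - target_mz) <= half]
    total = sum(i for _, i in in_window)
    target = sum(i for mz, i in in_window if abs(mz - target_mz) <= target_tol)
    return target / total if total else 0.0


# Hypothetical MS1 peaks: the target at 300.10 plus two neighbors.
peaks = [(300.10, 8000.0), (300.45, 2000.0), (310.20, 6000.0)]

purity_narrow = precursor_purity(300.10, peaks, window_width=1.0)
purity_wide = precursor_purity(300.10, peaks, window_width=25.0)
print(f"1 m/z window:  purity = {purity_narrow:.2f}")
print(f"25 m/z window: purity = {purity_wide:.2f}")
```

In this toy peak list the wide window admits a second neighbor and the purity falls from 0.8 to 0.5, but note that even the 1 m/z window is not fully pure.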
When we do that with DIA, the kind of SWATH workflow that I showed you earlier, using plasma and a 30-minute run with HILIC separation, what you end up seeing is that the vast majority of what you're measuring is extensively chimeric.
Again, that's not surprising, because we're using a 25 dalton isolation window and a lot of stuff is going to sneak through.
You're not going to get just one precursor if you're using a wide isolation window. So this has been, in many ways, a turnoff for the metabolomics community, because you've got a lot of chimeric data and there's a lot of computational burden in trying to figure out what to do with it. So many groups have transitioned to using DDA, data-dependent acquisition. There, instead of using large isolation windows, you're using narrow isolation windows, most commonly 1 m/z. We do the same experiment: plasma, a 30-minute HILIC separation, again at 1 m/z, and these are the results that you get. This is the most important slide I'm going to show you today, because it demonstrates how much chimeric data you get even when you're using a 1 m/z isolation window and a 30-minute HILIC separation.
Okay, the y-axis here is the amount of MS/MS data, the number of MS/MS spectra.
The x-axis here is the percent of contamination.
You can see that, yes, there's not a ton that is extensively contaminated, but there are a lot of spectra that are chimeric, and that is going to contaminate your data.
So what do we do? There are a couple of ways to solve this. One way is to use variable retention times. This is a technique that relies on the fact that your precursors are going to have slightly different retention times. Because your precursors have slightly different retention times, your fragments will have slightly different retention times, and you can therefore deconvolute them.
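The idea can be sketched by correlating each fragment's elution profile with each precursor's profile and assigning the fragment to the best match. This is an illustrative toy, not the algorithm used by any particular software package: the Gaussian peak shapes, the 0.1-minute apex offset, and all names are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length intensity traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def gaussian_peak(times, apex, width=0.05, height=1.0):
    """Idealized chromatographic peak (assumed Gaussian shape)."""
    return [height * math.exp(-((t - apex) ** 2) / (2 * width ** 2)) for t in times]


times = [5.0 + 0.01 * i for i in range(61)]  # retention times in minutes

# Two co-isolated precursors whose apexes differ by 0.1 min (hypothetical).
precursors = {
    "blue": gaussian_peak(times, 5.25),
    "red": gaussian_peak(times, 5.35),
}
# Fragment traces follow the elution profile of their own precursor.
fragments = {
    "frag_84": gaussian_peak(times, 5.25, height=0.6),
    "frag_72": gaussian_peak(times, 5.35, height=0.4),
}

assignment = {
    name: max(precursors, key=lambda p: pearson(trace, precursors[p]))
    for name, trace in fragments.items()
}
print(assignment)
```

When the two apexes coincide, the correlations become indistinguishable and the assignment is arbitrary, which is one way to see why this approach depends on the chromatography.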
This has been published by several groups, but most notably MS-DIAL is employing this technology. There are some challenges associated with using retention time deconvolution. It doesn't always work, and it does depend upon using the right chromatography.
There are also ways in which you can do back-end deconvolution. Instead of trying to deconvolute your experimental data, you can take the MS/MS data in the databases and convolute it to reconstruct your experimental data. So you're not deconvoluting; you're convoluting your reference data. That's a technique that we published a few years ago. It turns out that the database-assisted deconvolution and the retention time deconvolution are highly complementary. Sometimes the retention time deconvolution works, and sometimes the database-assisted deconvolution works, and you can see that they each have unique scenarios where they work.
But the key thing here, and I'm going to wrap up by talking about how we've inserted the ZenoTOF 8600 system into this workflow, is this: although we've seen that about half of our spectra are chimeric, and we can solve a lot of those chimeric data by using retention time deconvolution and database-assisted deconvolution, currently there are no tools to solve this portion here, this 21%. That means these are molecules that we can't currently identify. So what do we do with those? How do we solve those?
Well, this is the inspiration for the title of the talk. Over 10 years ago, we published this workflow that takes advantage of sliding MS/MS isolation windows. At the time, it was just sort of a cute idea, because there were no instruments that could perform it at scale. But essentially the idea is this: when you do MS/MS, this yellow line here represents an MS/MS window. You can see that we're isolating two precursors and getting red and blue fragments down here in the MS/MS data. If you move that window to the left and then to the right, you can see that you get different proportions of the blue and the red precursor. Here you get a lot of blue but little red. Here you get a lot of red and a little blue. So if you can slide your MS/MS window, it gives you another dimension to the MS/MS data that allows you to deconvolute. And so we said, if we could do these sliding MS/MS windows, that would be really cool. We could deconvolute the data.
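A toy simulation may help show why the window position acts as an extra dimension. The sketch below assumes an idealized rectangular window, so a precursor is either fully in or fully out, whereas the talk describes proportions that change gradually; it is not the published method. The precursor masses, fragment lists, 1 m/z width, and 0.1 m/z step are all hypothetical.

```python
def sliding_scans(precursors, centers, width):
    """Simulate MS2 scans: each scan sums fragments of precursors inside its window."""
    scans = []
    for c in centers:
        scan = {}
        for p in precursors:
            if abs(p["mz"] - c) <= width / 2.0:
                for frag_mz, intensity in p["fragments"].items():
                    scan[frag_mz] = scan.get(frag_mz, 0.0) + intensity
        scans.append(scan)
    return scans


def infer_precursor_mz(scans, centers, frag_mz):
    """Midpoint of the window centers in which a fragment was observed."""
    seen = [c for c, scan in zip(centers, scans) if scan.get(frag_mz, 0.0) > 0.0]
    return (seen[0] + seen[-1]) / 2.0 if seen else None


# Two hypothetical precursors 0.45 m/z apart, co-isolated by a static 1 m/z window.
precursors = [
    {"mz": 300.13, "fragments": {56.05: 40.0, 84.04: 100.0, 130.05: 60.0}},  # "blue"
    {"mz": 300.58, "fragments": {44.05: 80.0, 72.08: 100.0, 101.07: 50.0}},  # "red"
]
centers = [round(299.0 + 0.1 * i, 1) for i in range(31)]  # window slides in 0.1 steps
scans = sliding_scans(precursors, centers, width=1.0)

# Group fragments by the precursor m/z their appearance profile points to.
all_frags = sorted({mz for scan in scans for mz in scan})
groups = {}
for frag in all_frags:
    key = round(infer_precursor_mz(scans, centers, frag), 2)
    groups.setdefault(key, []).append(frag)

for precursor_mz, frags in sorted(groups.items()):
    print(f"precursor near {precursor_mz}: fragments {frags}")
```

Fragments that appear and disappear together as the window moves are grouped under the same precursor, and the inferred precursor m/z is accurate to within half the step size in this idealized setting. A fragment shared by both precursors would need more than this simple first-and-last rule.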
But again, there was no instrument that could do this fast enough at scale. It worked really well in our cute little examples, and I'll just flip through these: methionine worked really well, valine, [inaudible], a couple of examples. But again, we couldn't do it at scale. That all changed with the ZenoTOF 8600 system.
As shown here, this is G, Tristan, Kevin, and Matthew. What I'm going to show you here, just very briefly, is data from Tristan, who spearheaded the experiments I'm going to show you. I've already shown you how SWATH works; the white lines here are the isolation windows. There are so many cool different ways that you can run the ZenoTOF 8600 system, and I don't have time to go through them all now, but there are just a million different modes that you can use.
I think that there are so many opportunities for innovation and creativity with respect to metabolomics workflows, but I'm just going to focus on this one application, called ZT Scan, that they've implemented.
You use a 5.1 m/z isolation window, but you're incrementing that window by 0.1 m/z. So each one of those little white lines there is a different MS/MS scan, and they're largely overlapping.
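The arithmetic behind "largely overlapping" can be checked with a short sketch. The 5.1 m/z width and 0.1 m/z increment are the values quoted in the talk; the scan range, the half-open window convention, and the example precursor are assumptions. Positions are kept in integer tenths of an m/z unit to avoid floating-point edge effects.

```python
WIDTH_TENTHS = 51  # 5.1 m/z isolation window, in units of 0.1 m/z
STEP_TENTHS = 1    # 0.1 m/z increment between consecutive windows


def windows_containing(precursor_tenths, first_low_tenths, n_windows):
    """Indices of windows [low, low + width) that contain the precursor."""
    hits = []
    for i in range(n_windows):
        low = first_low_tenths + i * STEP_TENTHS
        if low <= precursor_tenths < low + WIDTH_TENTHS:
            hits.append(i)
    return hits


# Assumed scan range: window lower edges from m/z 100.0 to 999.9.
first_low = 1000
n_windows = 9000

hits = windows_containing(3001, first_low, n_windows)  # precursor at m/z 300.1
overlap = 1 - STEP_TENTHS / WIDTH_TENTHS

print(f"precursor is isolated in {len(hits)} consecutive scans")
print(f"adjacent windows overlap by {overlap:.1%}")
```

With these settings each precursor is sampled by 51 consecutive, slightly shifted windows, and that set of shifted views is the extra dimension the deconvolution draws on.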
And so essentially what you're doing is what we said we would have loved to do 10 or more years ago but couldn't. Now it's possible. When you do the ZT Scan and then do the deconvolution, you're getting MS/MS data that looks just like DDA data, but you're getting it across the entire MS/MS window. When we take this and apply it to human plasma, so this is human plasma with a 16-minute HILIC separation in negative mode, and we do DDA (and we're just using one database here, NIST, and not going through extensive cataloging, and these numbers are for just one mode, so keep that in mind), you can see that when we add the ZT Scan in, we're getting double the number of...
The last slide I have for you is that there are several other benefits to the ZenoTOF 8600 system. Because of the Zeno trap technology, you do get better sensitivity, which you're going to hear about from others throughout the summit today. This is an example of phenylalanine. You can see the peak here is relatively low intensity. When we turn the Zeno trap on, it goes much higher. You can also see that, in addition to CID fragmentation data, it is possible to collect EAD, which in some cases is going to be much richer fragmentation data, as you can see here for the peaks shown. We're also getting weekly CVs that are less than 4%, which is really outstanding. If you're going to do large studies, that's critical. You heard earlier about the importance of quantitation, and that's really the key to it.
So with that I'll wrap up. Thank you for allowing me to present, and thanks again to Tristan. I just want to offer one closing thought. If someone asked what the perfect MS/MS workflow for metabolomics is, you might say, well, if I could get MS/MS data in one dalton bins across the entire m/z range, that would be ideal.
But as I've shown you, if you do that, 50% of your data are going to be chimeric. That's going to be a problem, because you can't deconvolute all that chimeric data. So one could argue that you'd actually be better off doing sliding windows, where you get lots of MS/MS data but you have an opportunity to deconvolute it. That's exactly what the ZT Scan is and what SCIEX has developed with this new instrument. Thanks so much for the opportunity to present here today, and I'm happy to take a question if we have time.