NMRphasing is an R package designed for the correction of phase errors in NMR data. While its primary focus is on 1D NMR data, it can also be employed for 2D and 3D NMR data by processing one 1D NMR data file at a time.
First of all, please ensure that you have the R package MassSpecWavelet, which is a dependency for NMRphasing. You can install MassSpecWavelet using the following example code:
if (!requireNamespace(“BiocManager”, quietly = TRUE)) install.packages(“BiocManager”) BiocManager::install(“MassSpecWavelet”)
NMRphasing is available on CRAN, and we can directly install it in R:
install.packages("NMRphasing")
Alternatively, let’s proceed to install NMRphasing from GitHub.
devtools :: install_github(repo = "ajiangsfu/NMRphasing",force = TRUE)
## if you do not have old versions of NMRphasing, please remove force = TRUE
The input data for NMRphasing can be in one of four formats:
After installing NMRphasing, you can load a subset of example data from our multiple metabolite spike-in experiment using the following code snippet:
library(NMRphasing)
data("fdat")
str(fdat)
## 'data.frame': 5891 obs. of 2 variables:
## $ frequency_domain: cplx -252643-294983i -221414-311592i -189411-330984i ...
## $ ppm : num 4 4 4 4 4 ...
In the code above, we load the dataset ‘fdat’ from the package. ‘fdat$frequency_domain’ represents the complex vector used in the following illustration, while ‘freq’ is utilized for plotting purposes. If your NMR data uses ppm values, you can substitute ‘ppm’ accordingly.
## in order to make comparison, absorption part can be extracted
fdat$Observed_Absorption = Re(fdat$frequency_domain)
library(ggpubr)
## Warning: package 'ggpubr' was built under R version 4.4.2
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 4.4.2
p1 = ggplot(fdat, aes(x = ppm, y = Observed_Absorption)) +
geom_line() + theme_bw() + labs(y = "Observed Absorption") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
p1
The observed absorption spectrum without phase error correction looks very bad.
In this package, there are currently nine NMR phase error correction methods implemented. They are as follows: “NLS”, “SPC_DANM”,“MPC_DANM”, “SPC_EMP”, “MPC_EMP”,“SPC_AAM”,“MPC_AAM”, “SPC_DSM” and “MPC_DSM”.
This is our new shrinkage method, which is the fastest phase error correction method since it does not involve any optimization step.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "NLS")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
The default setting for ‘withBC’ is TRUE, which tests for baseline bias based on spline regression on the lowess line. If the maximum of adjusted squared r is greater than 0.2, baseline correction is performed with modified polynomial fitting.
If you set ‘withBC’ as FALSE, then no baseline bias will be tested and corrected. The example code is:
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "NLS", withBC = FALSE)
The “SPC_DANM” method employs the traditional single linear model approach for phase error correction but incorporates a new optimization function designed to minimize the difference between absolute area and net area.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "SPC_DANM")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
This approach utilizes our new multiple linear model for phase error correction along with our optimization function designed to minimize the difference between absolute area and net area.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "MPC_DANM")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
This is an old phase error correction method based on the traditional single model approach targeting on an existing optimization function: entropy minimization with negative peak penalty.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "SPC_EMP")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
This is to apply our new multiple linear model approach for phase error correction based on an existing optimization function: entropy minimization with negative peak penalty.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "MPC_EMP")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
This is an old phase error correction method based on the traditional single model approach with an existing optimization function: absolute area minimization.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "SPC_AAM")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
This is to apply our new multiple linear model approach for phase error correction based on an existing optimization function:absolute area minimization.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "MPC_AAM")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
This is an old phase error correction method based on the traditional single model approach with an existing optimization function: summation minimization on dispersion.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "SPC_DSM")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
This is to apply our new multiple linear model approach for phase error correction based on an existing optimization function: summation minimization on dispersion.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "MPC_DSM")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
This is a phase error correction method based on the traditional single model approach with an optimization function: absolute summation minimization on dispersion.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "SPC_ADSM")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
This is to apply our new multiple linear model approach for phase error correction based on an optimization function: absolute summation minimization on dispersion.
fdat$Phased_Absoprtion = NMRphasing(specDatIn = fdat$frequency_domain, method = "MPC_ADSM")
p2 = ggplot(fdat, aes(x = ppm, y = Phased_Absoprtion)) +
geom_line() + theme_bw() + labs(y = "Phased Absoprtion") +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(), axis.line = element_line(colour = "black"),
axis.title.x = element_text(size = 6),
axis.title.y = element_text(size = 6))
ggarrange(plotlist = list(p1,p2),labels = c("Before","After"),font.label = list(size = 6),nrow = 2, ncol=1)
Version: 1.0.6 Authors@R: c( person(“Aixiang”, “Jiang”, role = c(“aut”, “cre”, “cph”), email = “aijiang@bccrc.ca”, comment = c(ORCID = “0000-0002-6153-7595”)) ) Maintainer: Aixiang Jiang aijiang@bccrc.ca Depends: R (>= 4.3.0),stats Suggests: knitr, rmarkdown, ggpubr VignetteBuilder: knitr Imports: baseline, splines, MassSpecWavelet Description: There are three distinct approaches for phase error correction, they are: a single linear model with a choice of optimization functions, multiple linear models with optimization function choices and a shrinkage-based method. The methodology is based on our new algorithms and various references: Binczyk et al. (2015) doi:10.1186/1475-925X-14-S2-S5 Chen et al. (2002) doi:10.1016/S1090-7807(02)00069-1 de Brouwer (2009) doi:10.1016/j.jmr.2009.09.017 Džakula (2000) doi:10.1006/jmre.2000.2123 Ernst (1969) doi:10.1016/0022-2364(69)90003-1 Liland et al. (2010) doi:10.1366/000370210792434350 License: MIT + file LICENSE Encoding: UTF-8 LazyData: true RoxygenNote: 7.2.3