With MergeNeighboringPeaksParam, refineChromPeaks() should merge partially or completely overlapping peaks. The current implementation can, however, fail to merge completely overlapping (nested) peaks, depending on the order of the peaks in chromPeaks():
## Define overlapping/nested chrom peaks
pks <- rbind(
c(666.0693, 666.0693, 666.0693, 31.779, 27.59, 35.968, 1, 1, 1, 1, 1),
c(666.0713, 666.0683, 666.0747, 31.181, 27.59, 39.559, 2, 2, 2, 2, 1),
c(666.0693, 666.0693, 666.0693, 31.779, 27.59, 36.968, 3, 3, 3, 3, 1))
colnames(pks) <- c("mz", "mzmin", "mzmax", "rt", "rtmin", "rtmax", "into",
"intb", "maxo", "sn", "sample")
rownames(pks) <- c("A", "B", "C")
In this example, the first ("A") and third ("C") peaks lie completely within the m/z and rt range of the second ("B"). Peak merging should therefore report only the second peak, not the first or third. The internal function that merges peaks is xcms:::.merge_neighboring_peak_candidates().
## define the remaining data required for the function
pkd <- data.frame(ms_level = rep(1L, 3), is_filled = rep(FALSE, 3))
x <- list(cbind(mz = c(), intensity = c()),
cbind(mz = c(), intensity = c()),
cbind(mz = c(), intensity = c()))
rt <- c(30.5, 31.5, 32.5)
Running this function with these data results in
> xcms:::.merge_neighboring_peak_candidates(x, rt, pks, pkd)
$chromPeaks
mz mzmin mzmax rt rtmin rtmax into intb maxo sn sample
A 666.0693 666.0693 666.0693 31.779 27.59 35.968 1 1 1 1 1
B 666.0713 666.0683 666.0747 31.181 27.59 39.559 2 2 2 2 1
$chromPeakData
ms_level is_filled
1 1 FALSE
2 1 FALSE
So both "A" and "B" are reported, although "A" lies completely within "B".
We get the expected result if we change the order of the input peaks to B, A, C:
> xcms:::.merge_neighboring_peak_candidates(x, rt, pks[c(2, 1, 3), ], pkd)
$chromPeaks
mz mzmin mzmax rt rtmin rtmax into intb maxo sn sample
B 666.0713 666.0683 666.0747 31.181 27.59 39.559 2 2 2 2 1
$chromPeakData
ms_level is_filled
1 1 FALSE
The function iterates over the peaks, and if a peak lies completely within the rt range of another, only the bigger one is retained. For this to work, overlapping peaks need to be ordered so that larger peaks come first. The function does order the peaks, but only by "rtmin": idx <- order(pks[, "rtmin"]). In cases like our example, where "rtmin" is the same for all peaks, they are processed in their original order: A, B, C. Peak B is not completely within peak A, so peak A is retained. Peak C is completely within peak B, so of these two only peak B is reported. The result therefore contains both A and B.
To address this, the peaks should be ordered not only by "rtmin" but also by "rtmax", such that among peaks with the same "rtmin" the bigger peak (the one with the larger "rtmax") comes first.
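The effect of the proposed ordering can be illustrated with a minimal sketch. Note that drop_nested() below is a hypothetical, simplified stand-in written for this illustration and is not the code of xcms:::.merge_neighboring_peak_candidates(): it only looks at the rt range and drops a peak if it lies completely within a peak that was kept earlier. The tie-breaking call order(pks[, "rtmin"], -pks[, "rtmax"]) is one possible way to implement the suggestion, not necessarily the fix that should go into xcms.

```r
## rt ranges of the three example peaks (values copied from the example above)
pks <- rbind(
    A = c(rtmin = 27.59, rtmax = 35.968),
    B = c(rtmin = 27.59, rtmax = 39.559),
    C = c(rtmin = 27.59, rtmax = 36.968))

## Hypothetical helper, NOT the xcms implementation: walk through the peaks in
## the order given by `idx` and drop each peak whose rt range lies completely
## within the rt range of a peak that was already kept.
drop_nested <- function(pks, idx) {
    keep <- integer()
    for (i in idx) {
        nested <- any(pks[keep, "rtmin"] <= pks[i, "rtmin"] &
                      pks[keep, "rtmax"] >= pks[i, "rtmax"])
        if (!nested)
            keep <- c(keep, i)
    }
    rownames(pks)[keep]
}

## Current ordering: ties in "rtmin" keep the input order (A, B, C)
idx_current <- order(pks[, "rtmin"])
res_current <- drop_nested(pks, idx_current)
res_current
## [1] "A" "B"

## Proposed ordering: ties in "rtmin" are broken by decreasing "rtmax" (B, C, A)
idx_proposed <- order(pks[, "rtmin"], -pks[, "rtmax"])
res_proposed <- drop_nested(pks, idx_proposed)
res_proposed
## [1] "B"
```

With the tie-break on "rtmax", the largest peak among those sharing an "rtmin" is always visited first, so the outcome no longer depends on the row order of the input matrix.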