Package 'CovidMutations' reference manual

Title:	Mutation Analysis and Assay Validation Toolkit for COVID-19 (Coronavirus Disease 2019)
Description:	A feasible framework for mutation analysis and reverse transcription polymerase chain reaction (RT-PCR) assay evaluation of COVID-19, including mutation profile visualization, statistics and mutation ratio of each assay. The mutation ratio is conducive to evaluating the coverage of RT-PCR assays in large-sized samples <doi:10.20944/preprints202004.0529.v1>.
Authors:	Shaoqian Ma [aut, cre] , Yongyou Zhang [aut]
Maintainer:	Shaoqian Ma <[email protected]>
License:	GPL-3 | file LICENSE
Version:	0.1.3
Built:	2025-02-07 05:01:21 UTC
Source:	https://github.com/MSQ-123/CovidMutations

Calculate the mutation detection rate using different assays

Description

This function uses well-established assay information to detect mutations at different SARS-CoV-2 genomic sites. The output is a series of figures presenting the mutation profile for a specific assay, plus a figure comparing the mutation detection rate across the primer binding regions.
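As a rough illustration of the idea behind a mutation detection rate, the base R sketch below counts the samples that carry at least one mutation inside an assay's detection range and divides by the total sample number. The column names `sample` and `rpos`, the range coordinates, and the percentage scaling are assumptions made for this sketch; it is not the package's implementation.

```r
# Hypothetical mutation table: one row per mutation event.
# The column names `sample` and `rpos` are assumptions for this sketch.
mutations <- data.frame(
  sample = c("s1", "s1", "s2", "s3", "s4"),
  rpos   = c(28290, 15440, 28300, 100, 28350),
  stringsAsFactors = FALSE
)

# Hypothetical assay with a single detection range on the reference genome
assay_start <- 28287
assay_end   <- 28358
totalsample <- 1000

# Samples carrying at least one mutation inside the detection range
in_range <- mutations$rpos >= assay_start & mutations$rpos <= assay_end
mutated_samples <- unique(mutations$sample[in_range])

# Mutation ratio expressed as a percentage of all samples
mut_ratio <- length(mutated_samples) / totalsample * 100
```

A sample is counted once even if it has several mutations in the range, which is why `unique()` is applied before counting.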

Usage

AssayMutRatio(
  nucmerr = nucmerr,
  assays = assays,
  totalsample = totalsample,
  plotType = "barplot",
  outdir = "."
)

Arguments

`nucmerr`	Mutation information containing group list (derived from "nucmer" object using "nucmerRMD" function).
`assays`	Assays dataframe including the detection ranges of mutations.
`totalsample`	Total sample number, total cleared GISAID fasta data.
`plotType`	Figure type for either "barplot" or "logtrans".
`outdir`	The output directory.

Value

Plot the selected figure type as output.

Examples

data("nucmerr")
data("assays")
Total <- 1000 ## Total Cleared GISAID fasta data, seqkit
outdir <- tempdir()
#Output the results
AssayMutRatio(nucmerr = nucmerr,
              assays = assays,
              totalsample = Total,
              plotType = "logtrans",
              outdir = outdir)

Assays for mutation detection using different primers and probes

Description

These assays include the primer detection ranges in which mutations may occur.

Usage

data(assays)

Format

A dataframe with 10 rows and 7 columns.

References

Kilic T, Weissleder R, Lee H (2019) iScience 23, 101406. (PubMed)

Examples

data(assays)

A list of places in China

Description

The list is used for replacing some original city names with "China" in order to make the downstream analysis easier.

Usage

data(chinalist)

Format

A dataframe with 31 rows and 1 column.

Source

This data was created by Zhanglab at Xiamen University.

Examples

data(chinalist)

Mutation annotation results produced by "indelSNP" function

Description

A dataframe that can be used for downstream analysis, such as descriptive mutation statistics.

Usage

data(covid_annot)

Format

A dataframe with 49821 rows and 10 columns.

Source

https://www.gisaid.org/

Examples

data(covid_annot)

Detection of co-occurring mutations using double-assay information

Description

The detection of SARS-CoV-2 is important for preventing outbreaks and managing patients. The real-time reverse-transcription polymerase chain reaction (RT-PCR) assay is one of the most effective molecular diagnostic strategies for detecting the virus in clinical laboratories. Using two assays together is more accurate and practical for detecting samples with co-occurring mutations.
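The notion of co-occurring mutated samples can be sketched with plain set operations: given the samples that carry a mutation in each assay's binding regions, the overlap is their intersection. The sample identifiers below are made up, and this is only an assumption about how the three regions of a Venn diagram could be counted, not the function's actual code.

```r
# Hypothetical sample IDs with at least one mutation in each assay's regions
samples_assay1 <- c("s1", "s2", "s3", "s5")
samples_assay2 <- c("s2", "s3", "s4")

# Samples mutated in both assays (the overlap of a two-set Venn diagram)
co_occurring <- intersect(samples_assay1, samples_assay2)

# Samples mutated in only one of the two assays
only_assay1 <- setdiff(samples_assay1, samples_assay2)
only_assay2 <- setdiff(samples_assay2, samples_assay1)

# Counts for the three Venn regions
venn_counts <- c(
  assay1_only = length(only_assay1),
  both        = length(co_occurring),
  assay2_only = length(only_assay2)
)
```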

Usage

doubleAssay(nucmerr = nucmerr, assay1 = assay1, assay2 = assay2, outdir = ".")

Arguments

`nucmerr`	Mutation information containing group list (derived from "nucmer" object using "nucmerRMD" function).
`assay1`	Information on the first assay, containing the primer and probe locations; see the format of the assays provided as example data, e.g. data(assays); assay1 <- assays[1,].
`assay2`	Information on the second assay, in the same format as the first assay.
`outdir`	The output directory. If NULL, the plot is printed in RStudio.

Value

Plot three figures in a single panel, including two results of assays and a "venn" plot for co-occurring mutated samples.

Examples

data("nucmerr")
data("assays")
assay1 <- assays[1,]
assay2 <- assays[2,]
#outdir <- tempdir()
doubleAssay(nucmerr = nucmerr,
            assay1 = assay1,
            assay2 = assay2,
            outdir = NULL)

"GFF3" format gene position data for SARS-Cov-2

Description

This "GFF3" data is used for counting the mutations in each gene in virus samples.

Usage

data(gene_position)

Format

A dataframe with 26 rows and 10 columns.

Source

https://www.ncbi.nlm.nih.gov/

Examples

data(gene_position)

"GFF3" format annotation data for SARS-Cov-2

Description

This "GFF3" data is used for annotating the effects of mutations in virus samples.

Usage

data(gff3)

Format

A dataframe with 26 rows and 10 columns.

Source

https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=2697049

Examples

data(gff3)

Global mutational events profiling of proteins

Description

This function visualizes the global protein mutational pattern in the SARS-CoV-2 genome.

Usage

globalProteinMut(
  covid_annot = covid_annot,
  outdir = ".",
  figure_Type = "heatmap",
  top = 10,
  country = "global"
)

Arguments

`covid_annot`	The mutation effects provided by "indelSNP" function.
`outdir`	The output directory.
`figure_Type`	Figure type for either "heatmap" or "count".
`top`	The number of variants to plot.
`country`	Choose a country to plot the mutational pattern or choose "global" to profile mutations across all countries. The default is "global".
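To make the interplay of `country` and `top` concrete, here is a minimal base R sketch that optionally restricts an annotation table to one country and then keeps the most frequent variants. The helper `top_variants`, the column names `country` and `variant`, and the variant label format are all hypothetical; the real "covid_annot" layout and the function's internals may differ.

```r
# Hypothetical annotation table; the column names are assumptions
annot_toy <- data.frame(
  country = c("USA", "USA", "USA", "Italy", "USA"),
  variant = c("S:D614G", "S:D614G", "N:R203K", "S:D614G", "NSP12b:P314L"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count variants and keep the `top` most frequent ones
top_variants <- function(annot, country = "global", top = 10) {
  # "global" keeps every row; otherwise keep only the chosen country
  if (country != "global") {
    annot <- annot[annot$country == country, , drop = FALSE]
  }
  head(sort(table(annot$variant), decreasing = TRUE), top)
}

usa_top    <- top_variants(annot_toy, country = "USA", top = 2)
global_top <- top_variants(annot_toy, country = "global", top = 1)
```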

Value

Plot the selected figure type as output.

Examples

data("covid_annot")
outdir <- tempdir()
# make sure the covid_annot is a dataframe
covid_annot <- as.data.frame(covid_annot)
globalProteinMut(covid_annot = covid_annot,
                 outdir = outdir,
                 figure_Type = "heatmap",
                 top = 10,
                 country = "USA")

Global single nucleotide polymorphism (SNP) profiling in virus genome

Description

This function visualizes the global SNP pattern in the SARS-CoV-2 genome.

Usage

globalSNPprofile(
  nucmerr = nucmerr,
  outdir = ".",
  figure_Type = "heatmap",
  country = "global",
  top = 5
)

Arguments

`nucmerr`	Mutation information containing group list (derived from "nucmer" object using "nucmerRMD" function).
`outdir`	The output directory.
`figure_Type`	Figure type for either "heatmap" or "count".
`country`	Choose a country to plot the mutational pattern or choose "global" to profile mutations across all countries. The default is "global".
`top`	The number of mutational classes to plot.

Value

Plot the selected figure type as output.

Examples

data("nucmerr")
outdir <- tempdir()
globalSNPprofile(nucmerr = nucmerr,
                 outdir = outdir,
                 figure_Type = "heatmap",
                 country = "global",
                 top = 5)

Provide effects of each single nucleotide polymorphism (SNP), insertion and deletion in virus genome

Description

This function annotates the mutational events and indicates their potential effects on the proteins. Mutational events include SNPs, insertions and deletions.

Usage

indelSNP(
  nucmer = nucmer,
  saveRda = FALSE,
  refseq = refseq,
  gff3 = gff3,
  annot = annot,
  outdir = "."
)

Arguments

`nucmer`	An object called "nucmer": mutation information derived from the "nucmer.snps" variant file by "seqkit" software and "nucmer SNP-calling" scripts. Before being processed by the "indelSNP" function, the nucmer object should first be transformed by the "mergeEvents" function.
`saveRda`	Whether to save the results as an ".rda" file.
`refseq`	SARS-CoV-2 genomic reference sequence.
`gff3`	"GFF3" format annotation data for SARS-CoV-2.
`annot`	Annotation list of genes (corresponding proteins) from the "GFF3" file, built by "setNames(gff3[,10],gff3[,9])".
`outdir`	The output directory.
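The `annot` argument is a named vector built with base R's `setNames()`, where column 10 supplies the values and column 9 supplies the names. The toy table below only mimics a 10-column layout; the gene identifiers in column 9 are invented for illustration and do not come from the packaged "gff3" data.

```r
# Toy stand-in for a 10-column GFF3-like table (contents are hypothetical)
gff3_toy <- data.frame(matrix("", nrow = 3, ncol = 10), stringsAsFactors = FALSE)
gff3_toy[, 9]  <- c("gene1", "gene2", "gene3")  # invented identifiers
gff3_toy[, 10] <- c("NSP1", "S", "N")           # protein names

# Values come from column 10, names from column 9
annot <- setNames(gff3_toy[, 10], gff3_toy[, 9])

# Look up the protein for a given identifier
protein_for_gene2 <- annot[["gene2"]]
```

A named vector like this acts as a simple lookup table from identifier to protein name.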

Value

Write the result as ".csv" file to the specified directory.

Examples

data("nucmer")
# Remove variants with ambiguous IUPAC codes
nucmer <- nucmer[!nucmer$qvar %in% c("B","D","H","K","M","N","R","S","V","W","Y"), ]
nucmer <- mergeEvents(nucmer = nucmer) ## This will update the nucmer object
data("refseq")
data("gff3")
annot <- setNames(gff3[,10],gff3[,9])
outdir <- tempdir()
indelSNP(nucmer = nucmer,
         saveRda = FALSE,
         refseq = refseq,
         gff3 = gff3,
         annot = annot,
         outdir = outdir)

Batch assay analysis for the last five nucleotides of primers

Description

Mutation counts and types in the last five nucleotides of any reverse transcription polymerase chain reaction (RT-PCR) primer.
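The sketch below shows one way to locate the last five nucleotides (the 3' end) of a primer on the reference genome and count the mutations that fall there. It assumes primers are given as start and end coordinates on the reference plus strand, so that a reverse primer's 3' end maps to its lowest coordinate; the helper `last_five` and the example coordinates are hypothetical.

```r
# Hypothetical helper: reference positions of a primer's last five nucleotides.
# Assumes `start` <= `end` are plus-strand reference coordinates.
last_five <- function(start, end, strand = c("forward", "reverse")) {
  strand <- match.arg(strand)
  if (strand == "forward") {
    (end - 4):end        # 3' end of a forward primer is its highest coordinate
  } else {
    start:(start + 4)    # 3' end of a reverse primer is its lowest coordinate
  }
}

fwd_positions <- last_five(100, 120, "forward")
rev_positions <- last_five(200, 222, "reverse")

# Count mutations (by reference position) hitting the forward primer's 3' end
rpos <- c(95, 117, 119, 150, 201)
n_fwd_hits <- sum(rpos %in% fwd_positions)
```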

Usage

LastfiveNrMutation(
  nucmerr = nucmerr,
  assays = assays,
  totalsample = totalsample,
  figurelist = FALSE,
  outdir = "."
)

Arguments

`nucmerr`	Mutation information containing group list (derived from "nucmer" object using "nucmerRMD" function).
`assays`	Assays dataframe including the detection ranges of mutations.
`totalsample`	Total sample number, total cleared GISAID fasta data.
`figurelist`	Whether to output the integrated plot list for each assay.
`outdir`	The output directory. If figurelist = TRUE, the figure is output in the R session.

Value

Plot the mutation counts (last five nucleotides for each primer) for each assay as output.

Examples

data("nucmerr")
data("assays")
totalsample <- 1000
outdir <- tempdir()
LastfiveNrMutation(nucmerr = nucmerr,
                   assays = assays,
                   totalsample = totalsample,
                   figurelist = FALSE,
                   outdir = outdir)

Merge neighboring events of single nucleotide polymorphism (SNP), insertion and deletion.

Description

This is the first step in handling the "nucmer" object; afterwards, the effects of mutations can be analysed using the "indelSNP" function.
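To give a feel for what merging neighboring events can mean, the base R sketch below collapses runs of consecutive reference positions within the same sample into a single event. The column `sample` is an assumption, `rpos`, `rvar` and `qvar` follow the column names used in the examples, and the merging rule shown here is a simplification that is not taken from the package source.

```r
# Toy events: three adjacent deleted bases in s1, plus two isolated SNPs
events <- data.frame(
  sample = c("s1", "s1", "s1", "s1", "s2"),
  rpos   = c(100, 101, 102, 500, 101),
  rvar   = c("A", "C", "G", "T", "C"),
  qvar   = c(".", ".", ".", "C", "T"),
  stringsAsFactors = FALSE
)
events <- events[order(events$sample, events$rpos), ]

# A new run starts when the sample changes or the position is not previous + 1
n <- nrow(events)
new_run <- c(TRUE, diff(events$rpos) != 1 | events$sample[-1] != events$sample[-n])
run_id <- cumsum(new_run)

# Collapse each run into one event anchored at its first position
merged <- do.call(rbind, lapply(split(events, run_id), function(run) {
  data.frame(
    sample = run$sample[1],
    rpos   = run$rpos[1],
    rvar   = paste(run$rvar, collapse = ""),
    qvar   = paste(run$qvar, collapse = ""),
    stringsAsFactors = FALSE
  )
}))
```

In this toy input the three adjacent rows of sample s1 become one event, so five rows collapse into three.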

Usage

mergeEvents(nucmer = nucmer)

Arguments

`nucmer`	An object called "nucmer": mutation information derived from the "nucmer.snps" variant file by "seqkit" software and "nucmer SNP-calling" scripts.

Value

An updated "nucmer" object.

Examples

#The example data:
data("nucmer")
#options(stringsAsFactors = FALSE)

#The input nucmer object can be made by the commented code below:
#nucmer<-read.delim("nucmer.snps",as.is=TRUE,skip=4,header=FALSE)
#colnames(nucmer)<-c("rpos","rvar","qvar","qpos","","","","",
#"rlength","qlength","","","rname","qname")
#rownames(nucmer)<-paste0("var",1:nrow(nucmer))

# Remove variants with ambiguous IUPAC codes
nucmer <- nucmer[!nucmer$qvar %in% c("B","D","H","K","M","N","R","S","V","W","Y"), ]
nucmer <- mergeEvents(nucmer = nucmer) ## This will update the nucmer object

Plot mutation counts for certain genes

Description

After the mutations are annotated, this function plots the counts of mutational events for each gene in the SARS-CoV-2 genome.
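Counting mutational events per gene amounts to checking which gene interval each mutation position falls into. The sketch below uses a small table with the "Gene", "Start" and "Stop" columns; the three rows and the mutation positions are illustrative values chosen for this example, not a substitute for the packaged "gene_position" data.

```r
# Small gene position table with Gene / Start / Stop columns (illustrative rows)
gene_pos <- data.frame(
  Gene  = c("ORF1ab", "S", "N"),
  Start = c(266, 21563, 28274),
  Stop  = c(21555, 25384, 29533),
  stringsAsFactors = FALSE
)

# Hypothetical reference positions of observed mutations
rpos <- c(241, 3037, 14408, 23403, 28881, 28882)

# For each gene, count the positions that fall inside [Start, Stop]
counts <- vapply(seq_len(nrow(gene_pos)), function(i) {
  sum(rpos >= gene_pos$Start[i] & rpos <= gene_pos$Stop[i])
}, integer(1))
names(counts) <- gene_pos$Gene
```

Positions outside every listed interval, such as 241 here, are simply not counted for any gene.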

Usage

MutByGene(nucmerr = nucmerr, gff3 = gff3, figurelist = FALSE, outdir = ".")

Arguments

`nucmerr`	Mutation information containing group list (derived from "nucmer" object using "nucmerRMD" function).
`gff3`	"GFF3" format gene position data for SARS-CoV-2 (the "GFF3" file should include columns named "Gene", "Start", "Stop").
`figurelist`	Whether to output the integrated plot list for each gene.
`outdir`	The output directory. If figurelist = TRUE, the figure is output in the R session.

Value

Plot the mutation counts figure for each gene as output.

Examples

data("nucmerr")
data("gene_position")
outdir <- tempdir()
MutByGene(nucmerr = nucmerr, gff3 = gene_position, figurelist = FALSE, outdir = outdir)
# If figurelist = TRUE, the recommended figure display size (in pixels) is width = 1650, height = 1300

Plot mutation statistics for nucleotides

Description

Visualization for the top mutated samples, average mutational counts, top mutated position in the genome, mutational density across the genome and distribution of mutations across countries.

Usage

mutStat(
  nucmerr = nucmerr,
  outdir = ".",
  figure_Type = "TopMuSample",
  type_top = 10,
  country = FALSE,
  mutpos = NULL
)

Arguments

`nucmerr`	Mutation information containing group list (derived from "nucmer" object using "nucmerRMD" function).
`outdir`	The output directory.
`figure_Type`	Figure type, one of: "TopMuSample", "AverageMu", "TopMuPos", "MutDens", "CountryMutCount", "TopCountryMut".
`type_top`	For figures involving "top n" ("TopMuSample", "TopMuPos", "TopCountryMut"), "type_top" specifies the number of objects to display.
`country`	For figures that use country as groups ("CountryMutCount" and "TopCountryMut"), "country" should be TRUE.
`mutpos`	If the figure type is "TopCountryMut", "mutpos" can specify a range of genomic positions (e.g. 28831:28931) to plot.
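The combination of a position range and a "top n" cutoff can be sketched in base R as a filter followed by a frequency table. The data frame and its column names `country` and `rpos` are assumptions for illustration; only the range 28831:28931 is taken from the argument description.

```r
# Hypothetical mutation records; the column names are assumptions
muts <- data.frame(
  country = c("USA", "USA", "China", "India", "USA", "India"),
  rpos    = c(241, 28881, 28881, 28900, 28931, 29000),
  stringsAsFactors = FALSE
)

mutpos   <- 28831:28931  # genomic window of interest
type_top <- 2            # number of countries to keep

# Keep only mutations inside the window, then rank countries by count
window <- muts[muts$rpos %in% mutpos, , drop = FALSE]
top_countries <- head(sort(table(window$country), decreasing = TRUE), type_top)
```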

Value

Plot the selected figure type as output.

Examples

data("nucmerr")
outdir <- tempdir()
mutStat(nucmerr = nucmerr,
        outdir = outdir,
        figure_Type = "TopCountryMut",
        type_top = 10,
        country = FALSE,
        mutpos = NULL)

Mutation information derived from "nucmer" SNP analysis

Description

The "nucmer.snps" variant file is obtained by processing the SARS-CoV-2 sequences from the GISAID website (complete, high coverage only, low coverage exclusion, Host=human, Virus name = hCoV-19) with "seqkit" software and "nucmer" scripts. The example data is downsampled from the complete data of 2020-06-14.

Usage

data(nucmer)

Format

A dataframe with 5000 rows (mutation sites) and 14 columns.

Source

https://www.gisaid.org/

Examples

data(nucmer)

Preprocessed "nucmer.snps" file using "nucmerRMD" function

Description

A dataset containing group information extracted from the "nucmer" object by the "nucmerRMD" function in order to best describe the results.

Usage

data(nucmerr)

Format

A dataframe with 4982 rows (downsampled mutation sites) and 10 columns.

Source

https://www.gisaid.org/

Examples

data(nucmerr)

Preprocess "nucmer" object to add group information

Description

Manipulate the "nucmer" object to make the analysis easier.

Usage

nucmerRMD(nucmer = nucmer, outdir = ".", chinalist = chinalist)

Arguments

`nucmer`	An object called "nucmer": mutation information derived from the "nucmer.snps" variant file by "seqkit" software and "nucmer SNP-calling" scripts.
`outdir`	The output directory.
`chinalist`	A list of places in China, used for replacing some original city names with "China" in order to make the downstream analysis easier.
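The role of `chinalist` can be pictured as a simple lookup-and-replace on a location field. The sketch below uses a short character vector of place names and a made-up `country` vector; the packaged "chinalist" is a one-column dataframe, and how the function extracts the location from sample names is not shown here.

```r
# Hypothetical subset of Chinese place names
chinalist_toy <- c("Wuhan", "Beijing", "Shanghai")

# Hypothetical location labels parsed from sample names
country <- c("USA", "Wuhan", "Italy", "Beijing")

# Replace any listed place with the country-level label "China"
country[country %in% chinalist_toy] <- "China"
```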

Value

Saving the updated "nucmer" object.

Examples

data("nucmer")
data("chinalist")
#outdir <- tempdir() specify your output directory
nucmerr<- nucmerRMD(nucmer = nucmer, outdir = NULL, chinalist = chinalist)

Plot the mutation statistics after annotating the "nucmer" object by "indelSNP" function

Description

Basic descriptions for the mutational events.

Usage

plotMutAnno(results = results, figureType = "MostMut", outdir = ".")

Arguments

`results`	The mutation effects provided by "indelSNP" function.
`figureType`	Figure type for: "MostMut", "MutPerSample", "VarClasses", "VarType", "NucleoEvents", "ProEvents".
`outdir`	The output directory.

Value

Plot the selected figure type as output.

Examples

data("covid_annot")
# make sure the covid_annot is a dataframe
covid_annot <- as.data.frame(covid_annot)
#outdir <- tempdir() specify your output directory
plotMutAnno(results = covid_annot,figureType = "MostMut", outdir = NULL)

Plot the most frequent mutational events for proteins in the SARS-CoV-2 genome

Description

Plot the most frequent mutational events for proteins selected. The protein name should be specified correctly (only for SARS-CoV-2).

Usage

plotMutProteins(
  results = results,
  proteinName = "NSP2",
  top = 20,
  outdir = "."
)

Arguments

`results`	The mutation effects provided by "indelSNP" function.
`proteinName`	Proteins in the SARS-CoV-2 genome, available choices: 5'UTR, NSP1~NSP10, NSP12a, NSP12b, NSP13, NSP14, NSP15, NSP16, S, ORF3a, E, M, ORF6, ORF7a, ORF7b, ORF8, N, ORF10.
`top`	The number of objects to display.
`outdir`	The output directory.

Value

Plot the mutational events for selected proteins as output.

Examples

data("covid_annot")
# make sure the covid_annot is a dataframe
covid_annot <- as.data.frame(covid_annot)
#outdir <- tempdir() specify your output directory
plotMutProteins(results = covid_annot,proteinName = "NSP2", top = 20, outdir = NULL)

SARS-CoV-2 genomic reference sequence from NCBI

Description

This reference sequence is derived from a "fasta" file, preprocessed by the "read.fasta" function (refseq <- read.fasta("NC_045512.2.fa", forceDNAtolower = FALSE)[[1]]). It is used for annotating mutations in virus samples.

Usage

data(refseq)

Format

"SeqFastadna" characters.

Source

https://pubmed.ncbi.nlm.nih.gov/32015508/

Examples

data(refseq)
