loadSQMlite

R Documentation

Load tables generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py into R.

Description

This function takes the path to the output directory generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py and returns a SQMlite object. The SQMlite object will contain taxonomic and functional profiles, but no detailed information on ORFs, contigs or bins. In exchange, it will have a much smaller memory footprint. A SQMlite object can be used for plotting and exporting, but it cannot be subset.

Usage

loadSQMlite(tables_path, tax_mode = "allfilter")

Arguments

tables_path

character, tables directory generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py.

tax_mode

character, which taxonomic classification should be loaded? SqueezeMeta applies the identity thresholds described in Luo et al., 2014. Use allfilter for applying the minimum identity threshold to all taxa (default), prokfilter for applying the threshold to Bacteria and Archaea, but not to Eukaryotes, and nofilter for applying no thresholds at all.
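As a rough sketch of how the three tax_mode values might be compared, the snippet below loads the same tables directory once per mode and counts the genera retained in each case. The path is a placeholder, and the code assumes that the tables directory contains profiles for all three modes; it does nothing if SQMtools is not installed or the directory does not exist.

```r
# Placeholder path (assumption): replace with your own tables directory.
tables_path <- "Hadza/results/tables"

# The three values documented for tax_mode.
modes <- c("allfilter", "prokfilter", "nofilter")

if (requireNamespace("SQMtools", quietly = TRUE) && dir.exists(tables_path)) {
  # Load the same tables once per taxonomic filtering mode.
  profiles <- lapply(modes, function(m) {
    SQMtools::loadSQMlite(tables_path, tax_mode = m)
  })
  names(profiles) <- modes

  # Number of genera (rows of the genus abundance matrix) under each mode.
  n_genera <- sapply(profiles, function(x) nrow(x$taxa$genus$abund))
  print(n_genera)
}
```

Since each stricter mode applies the identity thresholds to more taxa, comparing the row counts is one simple way to see how much the filtering affects a given dataset.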

Value

SQMlite object containing the parsed tables.

The SQMlite object structure

The SQMlite object is a nested list which contains the following information:

lvl1          lvl2                lvl3           type              rows/names     columns  data
$taxa         $superkingdom       $abund         numeric matrix    superkingdoms  samples  abundances
                                  $percent       numeric matrix    superkingdoms  samples  percentages
              $phylum             $abund         numeric matrix    phyla          samples  abundances
                                  $percent       numeric matrix    phyla          samples  percentages
              $class              $abund         numeric matrix    classes        samples  abundances
                                  $percent       numeric matrix    classes        samples  percentages
              $order              $abund         numeric matrix    orders         samples  abundances
                                  $percent       numeric matrix    orders         samples  percentages
              $family             $abund         numeric matrix    families       samples  abundances
                                  $percent       numeric matrix    families       samples  percentages
              $genus              $abund         numeric matrix    genera         samples  abundances
                                  $percent       numeric matrix    genera         samples  percentages
              $species            $abund         numeric matrix    species        samples  abundances
                                  $percent       numeric matrix    species        samples  percentages
$functions    $KEGG               $abund         numeric matrix    KEGG ids       samples  abundances (reads)
                                  $bases         numeric matrix    KEGG ids       samples  abundances (bases)
                                  $tpm           numeric matrix    KEGG ids       samples  tpm
                                  $copy_number   numeric matrix    KEGG ids       samples  avg. copies
              $COG                $abund         numeric matrix    COG ids        samples  abundances (reads)
                                  $bases         numeric matrix    COG ids        samples  abundances (bases)
                                  $tpm           numeric matrix    COG ids        samples  tpm
                                  $copy_number   numeric matrix    COG ids        samples  avg. copies
              $PFAM               $abund         numeric matrix    PFAM ids       samples  abundances (reads)
                                  $bases         numeric matrix    PFAM ids       samples  abundances (bases)
                                  $tpm           numeric matrix    PFAM ids       samples  tpm
                                  $copy_number   numeric matrix    PFAM ids       samples  avg. copies
$total_reads                                     numeric vector    samples        (n/a)    total reads
$misc         $project_name                      character vector  (empty)        (n/a)    project name
              $samples                           character vector  (empty)        (n/a)    samples
              $tax_names_long     $superkingdom  character vector  short names    (n/a)    full names
                                  $phylum        character vector  short names    (n/a)    full names
                                  $class         character vector  short names    (n/a)    full names
                                  $order         character vector  short names    (n/a)    full names
                                  $family        character vector  short names    (n/a)    full names
                                  $genus         character vector  short names    (n/a)    full names
                                  $species       character vector  short names    (n/a)    full names
              $tax_names_short                   character vector  full names     (n/a)    short names
              $KEGG_names                        character vector  KEGG ids       (n/a)    KEGG names
              $KEGG_paths                        character vector  KEGG ids       (n/a)    KEGG hierarchy
              $COG_names                         character vector  COG ids        (n/a)    COG names
              $COG_paths                         character vector  COG ids        (n/a)    COG hierarchy
              $ext_annot_sources                 character vector  (empty)        (n/a)    external databases

If external databases for functional classification were provided to SqueezeMeta or SqueezeMeta_reads via the -extdb argument, the corresponding abundance, tpm and copy number profiles will be present in SQM$functions (e.g. results for the CAZy database would be present in SQM$functions$CAZy). Additionally, the extended names of the features present in the external database will be present in SQM$misc (e.g. SQM$misc$CAZy_names). Note that results generated by SqueezeMeta_reads will contain only read abundances, but not bases, tpm or copy number estimations.
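To illustrate how this nested structure can be navigated, the sketch below builds a small toy list with the same layout as part of a SQMlite object and indexes into it. The toy object, its counts and its function names are made up for illustration; it is not produced by loadSQMlite and is not a valid object for the SQMtools plotting functions.

```r
# Toy stand-in for part of a SQMlite object (hypothetical data).
samples <- c("S1", "S2")
kegg_abund <- matrix(
  c(10, 0, 5,   # read counts in sample S1
    20, 3, 7),  # read counts in sample S2
  nrow = 3,
  dimnames = list(c("K00001", "K00002", "K00003"), samples)
)

toy <- list(
  functions = list(KEGG = list(abund = kegg_abund)),
  misc = list(
    samples = samples,
    # Named character vector: names are KEGG ids, values are (made-up) names.
    KEGG_names = c(K00001 = "toy function A",
                   K00002 = "toy function B",
                   K00003 = "toy function C")
  )
)

# Navigate lvl1 -> lvl2 -> lvl3 with `$`.
kegg <- toy$functions$KEGG$abund

# Rank KEGG ids by their total read abundance across samples.
top_ids <- names(sort(rowSums(kegg), decreasing = TRUE))

# Translate ids into names through the lookup vector stored in $misc.
top_names <- toy$misc$KEGG_names[top_ids]
print(top_names)

# List which functional classifications are present; in a real object,
# external databases (e.g. CAZy) would also be listed here if they were used.
available <- names(toy$functions)
print(available)
```

The same indexing pattern should carry over to a real SQMlite object, for example replacing toy with the value returned by loadSQMlite.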

See Also

plotBars and plotFunctions will plot the most abundant taxa and functions in a SQMlite object. exportKrona will generate Krona charts reporting the taxonomy in a SQMlite object.

Examples

## Not run:
## (outside R)
## Run SqueezeMeta on the test data.
## /path/to/SqueezeMeta/scripts/SqueezeMeta.pl -p Hadza -f raw -m coassembly -s test.samples
## Generate the tabular outputs.
## /path/to/SqueezeMeta/utils/sqm2tables.py Hadza Hadza/results/tables
## Now go into R.
library(SQMtools)
# "Hadza/results/tables" is the directory containing the tables generated by sqm2tables.py.
# Note that this is not the whole SQM project, just the directory containing the tables.
# It would also work with tables generated by sqmreads2tables.py or combine-sqm-tables.py.
Hadza = loadSQMlite("Hadza/results/tables")
plotTaxonomy(Hadza)
plotFunctions(Hadza)
exportKrona(Hadza, 'myKronaTest.html')

## End(Not run)