loadSQMlite

R Documentation

Load tables generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py into R.

Description

This function takes the path to the output directory generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py and returns a SQMlite object. The SQMlite object contains taxonomic and functional profiles, but no detailed information on ORFs, contigs or bins. In exchange, it has a much smaller memory footprint. A SQMlite object can be used for plotting and exporting, but it cannot be subsetted.

Usage

loadSQMlite(tables_path, tax_mode = "allfilter")

Arguments

tables_path

character, tables directory generated by sqm2tables.py, sqmreads2tables.py or combine-sqm-tables.py.

tax_mode

character, which taxonomic classification should be loaded? SqueezeMeta applies the identity thresholds described in Luo et al., 2014. Use allfilter to apply the minimum identity threshold to all taxa (default), prokfilter to apply the threshold to Bacteria and Archaea but not to Eukaryotes, and nofilter to apply no thresholds at all.
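As a rough illustration of how the three tax_mode values might be compared, the sketch below loads the same tables directory once per mode and counts the genera retained in each case. The path "Hadza/results/tables" is a placeholder, and the block does nothing unless SQMtools is installed and that directory exists.

```r
# Sketch only: "Hadza/results/tables" is a placeholder path, not a real dataset.
tables_path <- "Hadza/results/tables"
tax_modes <- c("allfilter", "prokfilter", "nofilter")

# Guard so the block is harmless when SQMtools or the tables are not available.
if (requireNamespace("SQMtools", quietly = TRUE) && dir.exists(tables_path)) {
  projects <- lapply(tax_modes, function(mode) {
    SQMtools::loadSQMlite(tables_path, tax_mode = mode)
  })
  names(projects) <- tax_modes
  # Number of genera (rows of the genus abundance matrix) under each mode.
  print(sapply(projects, function(p) nrow(p$taxa$genus$abund)))
}
```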

Value

SQMlite object containing the parsed tables.

The SQMlite object structure

The SQMlite object is a nested list which contains the following information:

| lvl1 | lvl2 | lvl3 | type | rows/names | columns | data |
|---|---|---|---|---|---|---|
| $taxa | $superkingdom | $abund | numeric matrix | superkingdoms | samples | abundances |
| | | $percent | numeric matrix | superkingdoms | samples | percentages |
| | $phylum | $abund | numeric matrix | phyla | samples | abundances |
| | | $percent | numeric matrix | phyla | samples | percentages |
| | $class | $abund | numeric matrix | classes | samples | abundances |
| | | $percent | numeric matrix | classes | samples | percentages |
| | $order | $abund | numeric matrix | orders | samples | abundances |
| | | $percent | numeric matrix | orders | samples | percentages |
| | $family | $abund | numeric matrix | families | samples | abundances |
| | | $percent | numeric matrix | families | samples | percentages |
| | $genus | $abund | numeric matrix | genera | samples | abundances |
| | | $percent | numeric matrix | genera | samples | percentages |
| | $species | $abund | numeric matrix | species | samples | abundances |
| | | $percent | numeric matrix | species | samples | percentages |
| $functions | $KEGG | $abund | numeric matrix | KEGG ids | samples | abundances (reads) |
| | | $bases | numeric matrix | KEGG ids | samples | abundances (bases) |
| | | $tpm | numeric matrix | KEGG ids | samples | tpm |
| | | $copy_number | numeric matrix | KEGG ids | samples | avg. copies |
| | $COG | $abund | numeric matrix | COG ids | samples | abundances (reads) |
| | | $bases | numeric matrix | COG ids | samples | abundances (bases) |
| | | $tpm | numeric matrix | COG ids | samples | tpm |
| | | $copy_number | numeric matrix | COG ids | samples | avg. copies |
| | $PFAM | $abund | numeric matrix | PFAM ids | samples | abundances (reads) |
| | | $bases | numeric matrix | PFAM ids | samples | abundances (bases) |
| | | $tpm | numeric matrix | PFAM ids | samples | tpm |
| | | $copy_number | numeric matrix | PFAM ids | samples | avg. copies |
| $total_reads | | | numeric vector | samples | (n/a) | total reads |
| $misc | $project_name | | character vector | (empty) | (n/a) | project name |
| | $samples | | character vector | (empty) | (n/a) | samples |
| | $tax_names_long | $superkingdom | character vector | short names | (n/a) | full names |
| | | $phylum | character vector | short names | (n/a) | full names |
| | | $class | character vector | short names | (n/a) | full names |
| | | $order | character vector | short names | (n/a) | full names |
| | | $family | character vector | short names | (n/a) | full names |
| | | $genus | character vector | short names | (n/a) | full names |
| | | $species | character vector | short names | (n/a) | full names |
| | $tax_names_short | | character vector | full names | (n/a) | short names |
| | $KEGG_names | | character vector | KEGG ids | (n/a) | KEGG names |
| | $KEGG_paths | | character vector | KEGG ids | (n/a) | KEGG hierarchy |
| | $COG_names | | character vector | COG ids | (n/a) | COG names |
| | $COG_paths | | character vector | COG ids | (n/a) | COG hierarchy |
| | $ext_annot_sources | | character vector | (empty) | (n/a) | external databases |
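To make the nested layout described above more concrete, here is a minimal sketch that builds a small mock list with the same element names and walks it with `$`. The object `mock`, the sample names and all numbers are invented for illustration; a real SQMlite object would come from loadSQMlite. How the mock percentages are derived (from column totals) is also an assumption of this sketch.

```r
# Mock object mimicking a small part of the SQMlite layout (all values made up).
samples <- c("S1", "S2")
abund <- matrix(c(80, 20, 30, 70), nrow = 2,
                dimnames = list(c("Bacteroidetes", "Firmicutes"), samples))

mock <- list(
  taxa = list(
    phylum = list(
      abund = abund,
      # Assumption for this sketch: percentages relative to column totals.
      percent = 100 * sweep(abund, 2, colSums(abund), "/")
    )
  ),
  total_reads = c(S1 = 100, S2 = 100),
  misc = list(project_name = "mock", samples = samples)
)

# Rows are taxa and columns are samples, as in the structure described above.
firmicutes_s2 <- mock$taxa$phylum$abund["Firmicutes", "S2"]
percent_totals <- colSums(mock$taxa$phylum$percent)
print(firmicutes_s2)
print(percent_totals)
```

The same `object$taxa$<rank>$abund` access pattern would apply to any rank listed in the structure, with `<rank>` replaced by the element name.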

If external databases for functional classification were provided to SqueezeMeta or SqueezeMeta_reads via the -extdb argument, the corresponding abundance, tpm and copy number profiles will be present in SQM$functions (e.g. results for the CAZy database would be present in SQM$functions$CAZy). Additionally, the extended names of the features present in the external database will be present in SQM$misc (e.g. SQM$misc$CAZy_names). Note that results generated by SqueezeMeta_reads will contain only read abundances, but not bases, tpm or copy number estimations.
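The paragraph above can be sketched with another mock list: iterate over the external databases named in `$misc$ext_annot_sources` and check which profile tables are actually present before using them. The feature ids, the descriptive names and the abundances below are placeholders, and the mock deliberately contains only `abund`, as would be expected for SqueezeMeta_reads output.

```r
# Mock object: "CAZy" stands in for a database supplied via -extdb (values made up).
samples <- c("S1", "S2")
cazy_abund <- matrix(c(5, 1, 2, 8), nrow = 2,
                     dimnames = list(c("GH13", "GT2"), samples))

mock <- list(
  functions = list(CAZy = list(abund = cazy_abund)),
  misc = list(
    ext_annot_sources = "CAZy",
    CAZy_names = c(GH13 = "placeholder name for GH13",
                   GT2 = "placeholder name for GT2")
  )
)

expected_tables <- c("abund", "bases", "tpm", "copy_number")
available <- list()
for (db in mock$misc$ext_annot_sources) {
  tables <- mock$functions[[db]]
  # Keep only the tables that exist; reads-only results lack bases, tpm and copy_number.
  available[[db]] <- intersect(expected_tables, names(tables))
  # Extended feature names live in $misc under "<database>_names".
  feature_names <- mock$misc[[paste0(db, "_names")]]
  print(feature_names[rownames(tables$abund)])
}
print(available)
```

Checking `names()` before indexing avoids silently working with `NULL` when a table such as `tpm` is absent.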

See Also

plotTaxonomy and plotFunctions will plot the most abundant taxa and functions in a SQMlite object. exportKrona will generate Krona charts reporting the taxonomy in a SQMlite object.

Examples

## Not run:
## (outside R)
## Run SqueezeMeta on the test data.
/path/to/SqueezeMeta/scripts/SqueezeMeta.pl -p Hadza -f raw -m coassembly -s test.samples
## Generate the tabular outputs!
/path/to/SqueezeMeta/utils/sqm2tables.py Hadza Hadza/results/tables
## Now go into R.
library(SQMtools)
Hadza = loadSQMlite("Hadza/results/tables")
# Where Hadza/results/tables is the tables directory generated by sqm2tables.py.
# Note that this is not the whole SQM project, just the directory containing the tables.
# It would also work with tables generated by sqmreads2tables.py or combine-sqm-tables.py.
plotTaxonomy(Hadza)
plotFunctions(Hadza)
exportKrona(Hadza, 'myKronaTest.html')

## End(Not run)