loadSQMlite

R Documentation

Load tables generated by `sqm2tables.py`, `sqmreads2tables.py` or `combine-sqm-tables.py` into R.

Description

This function takes the path to the output directory generated by `sqm2tables.py`, `sqmreads2tables.py` or `combine-sqm-tables.py` and parses it into a SQMlite object. The SQMlite object contains taxonomic and functional profiles, but no detailed information on ORFs, contigs or bins. In exchange, it has a much smaller memory footprint than a full SQM object. A SQMlite object can be used for plotting and exporting, but it cannot be subsetted.

Usage

loadSQMlite(tables_path, tax_mode = "allfilter")

Arguments

`tables_path`	character, tables directory generated by `sqm2tables.py`, `sqmreads2tables.py` or `combine-sqm-tables.py`.
`tax_mode`	character, which taxonomic classification should be loaded? SqueezeMeta applies the identity thresholds described in Luo et al., 2014. Use `allfilter` for applying the minimum identity threshold to all taxa (default), `prokfilter` for applying the threshold to Bacteria and Archaea, but not to Eukaryotes, and `nofilter` for applying no thresholds at all.

Value

SQMlite object containing the parsed tables.

The SQMlite object structure

The SQMlite object is a nested list which contains the following information:

lvl1	lvl2	lvl3	type	rows/names	columns	data
$taxa	$superkingdom	$abund	numeric matrix	superkingdoms	samples	abundances
		$percent	numeric matrix	superkingdoms	samples	percentages
	$phylum	$abund	numeric matrix	phyla	samples	abundances
		$percent	numeric matrix	phyla	samples	percentages
	$class	$abund	numeric matrix	classes	samples	abundances
		$percent	numeric matrix	classes	samples	percentages
	$order	$abund	numeric matrix	orders	samples	abundances
		$percent	numeric matrix	orders	samples	percentages
	$family	$abund	numeric matrix	families	samples	abundances
		$percent	numeric matrix	families	samples	percentages
	$genus	$abund	numeric matrix	genera	samples	abundances
		$percent	numeric matrix	genera	samples	percentages
	$species	$abund	numeric matrix	species	samples	abundances
		$percent	numeric matrix	species	samples	percentages
$functions	$KEGG	$abund	numeric matrix	KEGG ids	samples	abundances (reads)
		$bases	numeric matrix	KEGG ids	samples	abundances (bases)
		$tpm	numeric matrix	KEGG ids	samples	tpm
		$copy_number	numeric matrix	KEGG ids	samples	avg. copies
	$COG	$abund	numeric matrix	COG ids	samples	abundances (reads)
		$bases	numeric matrix	COG ids	samples	abundances (bases)
		$tpm	numeric matrix	COG ids	samples	tpm
		$copy_number	numeric matrix	COG ids	samples	avg. copies
	$PFAM	$abund	numeric matrix	PFAM ids	samples	abundances (reads)
		$bases	numeric matrix	PFAM ids	samples	abundances (bases)
		$tpm	numeric matrix	PFAM ids	samples	tpm
		$copy_number	numeric matrix	PFAM ids	samples	avg. copies
$total_reads			numeric vector	samples	(n/a)	total reads
$misc	$project_name		character vector	(empty)	(n/a)	project name
	$samples		character vector	(empty)	(n/a)	samples
	$tax_names_long	$superkingdom	character vector	short names	(n/a)	full names
		$phylum	character vector	short names	(n/a)	full names
		$class	character vector	short names	(n/a)	full names
		$order	character vector	short names	(n/a)	full names
		$family	character vector	short names	(n/a)	full names
		$genus	character vector	short names	(n/a)	full names
		$species	character vector	short names	(n/a)	full names
	$tax_names_short		character vector	full names	(n/a)	short names
	$KEGG_names		character vector	KEGG ids	(n/a)	KEGG names
	$KEGG_paths		character vector	KEGG ids	(n/a)	KEGG hierarchy
	$COG_names		character vector	COG ids	(n/a)	COG names
	$COG_paths		character vector	COG ids	(n/a)	COG hierarchy
	$ext_annot_sources		character vector	(empty)	(n/a)	external databases
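
As a rough illustration of how this nested list is navigated, the sketch below builds a small stand-in list by hand and reads values from it with `$`. The object `mock`, its taxa, sample names and counts are all invented for the example; a real object would come from `loadSQMlite`, and the way percentages are derived here is only an assumption.

```r
# Hand-built stand-in that mimics part of the documented layout.
# It is NOT a real SQMlite object: taxa, samples and counts are invented.
samples <- c("S1", "S2")
abund <- matrix(c(30, 70, 10, 90), nrow = 2,
                dimnames = list(c("Bacteroidetes", "Firmicutes"), samples))

mock <- list(
  taxa = list(
    phylum = list(
      abund   = abund,
      # Percentages computed from column totals, for illustration only.
      percent = 100 * sweep(abund, 2, colSums(abund), "/")
    )
  ),
  total_reads = c(S1 = 100, S2 = 100),
  misc = list(project_name = "mock", samples = samples)
)

# Each level is reached with `$`, following the lvl1/lvl2/lvl3 columns.
phylum_percent <- mock$taxa$phylum$percent
firmicutes_s2 <- phylum_percent["Firmicutes", "S2"]

# Rows are taxa and columns are samples, so per-sample totals use colSums().
reads_per_sample <- colSums(mock$taxa$phylum$abund)
```

The same access pattern applies to the functional tables, for instance `$functions$KEGG$tpm` in the layout above.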

If external databases for functional classification were provided to SqueezeMeta or SqueezeMeta_reads via the `-extdb` argument, the corresponding abundance, tpm and copy number profiles will be present in SQM$functions (e.g. results for the CAZy database would be present in SQM$functions$CAZy). Additionally, the extended names of the features present in the external database will be present in SQM$misc (e.g. SQM$misc$CAZy_names). Note that results generated by SqueezeMeta_reads will contain only read abundances, but not bases, tpm or copy number estimates.
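
Because the available profiles depend on how the tables were produced, it can help to check what is present before using it. The sketch below does this on hand-built stand-ins for the `$functions` element; `has_profile` is a hypothetical helper, not part of SQMtools, and the matrices contain invented values.

```r
# Invented values; only the list layout follows the description above.
m <- matrix(1:4, nrow = 2, dimnames = list(c("f1", "f2"), c("S1", "S2")))

# Stand-in for a project with an external database ("CAZy") supplied via -extdb.
functions_full <- list(
  KEGG = list(abund = m, bases = m, tpm = m, copy_number = m),
  CAZy = list(abund = m, tpm = m, copy_number = m)
)

# Stand-in for SqueezeMeta_reads results: read abundances only.
functions_reads <- list(KEGG = list(abund = m))

# Hypothetical helper: is profile `count` available for database `db`?
has_profile <- function(functions, db, count) {
  db %in% names(functions) && count %in% names(functions[[db]])
}

# Names of the profiles present for each database.
profiles_full  <- lapply(functions_full, names)
profiles_reads <- lapply(functions_reads, names)

cazy_has_tpm  <- has_profile(functions_full, "CAZy", "tpm")
reads_has_tpm <- has_profile(functions_reads, "KEGG", "tpm")
```

With a real object, the same check could be applied to its `$functions` element before requesting tpm or copy number values.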

Examples

## Not run:
## (outside R)
## Run SqueezeMeta on the test data.
/path/to/SqueezeMeta/scripts/SqueezeMeta.pl -p Hadza -f raw -m coassembly -s test.samples
## Generate the tabular outputs!
/path/to/SqueezeMeta/utils/sqm2tables.py Hadza Hadza/results/tables
## Now go into R.
library(SQMtools)
Hadza = loadSQMlite("Hadza/results/tables")
# Where Hadza/results/tables is the tables directory inside the SqueezeMeta output directory (Hadza).
# Note that this is not the whole SQM project, just the directory containing the tables.
# It would also work with tables generated by sqmreads2tables.py or combine-sqm-tables.py.
plotTaxonomy(Hadza)
plotFunctions(Hadza)
exportKrona(Hadza, 'myKronaTest.html')

## End(Not run)