*******
loadSQM
*******

======= ===============
loadSQM R Documentation
======= ===============

Load a SqueezeMeta project into R
---------------------------------

Description
~~~~~~~~~~~

This function takes the path to a project directory generated by
`SqueezeMeta <https://github.com/jtamames/SqueezeMeta>`__ (whose name is
specified in the ``-p`` parameter of the SqueezeMeta.pl script) and
parses the results into a SQM object. Alternatively, it can load the
project data from a zip file produced by ``sqm2zip.py``.
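
As a minimal sketch of the two input types, the snippet below loads a project
from a directory and from a zip file. Both paths are placeholders rather than
real locations, and the guards are only there so that nothing runs when
``SQMtools`` or the files are unavailable.

.. code:: R

   # Placeholder paths: replace with your own project directory and zip file.
   project_dir <- "/path/to/project_dir"
   project_zip <- "/path/to/project_dir.zip"

   sqm_from_dir <- NULL
   sqm_from_zip <- NULL
   if (requireNamespace("SQMtools", quietly = TRUE)) {
     if (dir.exists(project_dir)) {
       # Parse the results of a finished SqueezeMeta run.
       sqm_from_dir <- SQMtools::loadSQM(project_dir)
     }
     if (file.exists(project_zip)) {
       # Load the same kind of data from a zip produced by sqm2zip.py.
       sqm_from_zip <- SQMtools::loadSQM(project_zip)
     }
   }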

Usage
~~~~~

.. code:: R

   loadSQM(
     project_path,
     tax_mode = "prokfilter",
     tax_source = "contigs",
     trusted_functions_only = FALSE,
     single_copy_genes = "MGOGs",
     load_sequences = TRUE,
     engine = "data.table"
   )

Arguments
~~~~~~~~~

+----------------------------+------------------------------------------------------+
| ``project_path``           | character, a vector of project directories generated |
|                            | by SqueezeMeta, and/or zip files generated by        |
|                            | ``sqm2zip.py``.                                      |
+----------------------------+------------------------------------------------------+
| ``tax_mode``               | character, which taxonomic classification should be  |
|                            | loaded? SqueezeMeta applies the identity thresholds  |
|                            | described in `Luo et al.,                            |
|                            | 2014 <https://pubmed.ncbi.nlm.nih.gov/24589583/>`__. |
|                            | Use ``allfilter`` for applying the minimum identity  |
|                            | threshold to all taxa, ``prokfilter`` for applying   |
|                            | the threshold to Bacteria and Archaea, but not to    |
|                            | Eukaryotes, and ``nofilter`` for applying no         |
|                            | thresholds at all (default ``prokfilter``).          |
+----------------------------+------------------------------------------------------+
| ``tax_source``             | character, source data used for the taxonomy tables  |
|                            | present in ``SQM$taxa``, either ``"orfs"``,          |
|                            | ``"contigs"``, ``"bins"`` (GTDB bin taxonomy if      |
|                            | available, SQM bin taxonomy otherwise),              |
|                            | ``"bins_gtdb"`` (GTDB bin taxonomy) or               |
|                            | ``"bins_sqm"`` (SQM bin taxonomy). Default           |
|                            | ``"contigs"``.                                       |
+----------------------------+------------------------------------------------------+
| ``trusted_functions_only`` | logical. If ``TRUE``, only highly trusted functional |
|                            | annotations (best hit + best average) will be        |
|                            | considered when generating aggregated function       |
|                            | tables. If ``FALSE``, best hit annotations will be   |
|                            | used (default ``FALSE``). Will only have an effect   |
|                            | if ``project_path`` is not a zip file, and           |
|                            | ``project_path/results/tables`` is not already       |
|                            | present.                                             |
+----------------------------+------------------------------------------------------+
| ``single_copy_genes``      | character, source of single copy genes for copy      |
|                            | number normalization, either ``"RecA"`` (COG0468,    |
|                            | RecA/RadA), ``"MGOGs"`` (COGs for 10 single copy and |
|                            | housekeeping genes, Salazar, G *et al.*, 2019),      |
|                            | ``"MGKOs"`` (KOs for 10 single copy and housekeeping |
|                            | genes, Salazar, G *et al.*, 2019) or ``"USiCGs"``    |
|                            | (KOs for 15 single copy genes, Carr *et al.*, 2013,  |
|                            | Table S1). For ``"MGOGs"``, ``"MGKOs"`` and          |
|                            | ``"USiCGs"``, the median coverage of a set of single |
|                            | copy genes will be used for normalization. Default   |
|                            | ``"MGOGs"``.                                         |
+----------------------------+------------------------------------------------------+
| ``load_sequences``         | logical. If ``TRUE``, contig and orf sequences will  |
|                            | be loaded in the SQM object. Setting it to ``FALSE`` |
|                            | will reduce memory usage. Default ``TRUE``.          |
+----------------------------+------------------------------------------------------+
| ``engine``                 | character. Engine used to load the ORFs and contigs  |
|                            | tables. Either ``"data.frame"`` or ``"data.table"``  |
|                            | (significantly faster if your project is large).     |
|                            | Default ``"data.table"``.                            |
+----------------------------+------------------------------------------------------+
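
To illustrate what the ``single_copy_genes`` argument controls, here is a toy
sketch of copy number normalization in base R. It assumes that copy numbers
are obtained by dividing each function's coverage by the per-sample median
coverage of the chosen single copy genes. All names and numbers are made up,
and this is not the package's actual implementation.

.. code:: R

   # Toy coverages (made-up values): rows are functions, columns are samples.
   fun_cov <- matrix(
     c(10, 20,
       30, 60,
        5, 40),
     nrow = 3, byrow = TRUE,
     dimnames = list(c("fun_a", "fun_b", "fun_c"), c("S1", "S2"))
   )

   # Toy coverages of three hypothetical single copy genes in the same samples.
   scg_cov <- matrix(
     c( 8, 18,
       10, 20,
       12, 25),
     nrow = 3, byrow = TRUE,
     dimnames = list(c("scg_a", "scg_b", "scg_c"), c("S1", "S2"))
   )

   # Per-sample median coverage of the single copy gene set.
   scg_median <- apply(scg_cov, 2, median)

   # Divide every column (sample) by its single copy gene median.
   copy_number <- sweep(fun_cov, 2, scg_median, "/")

With ``"RecA"`` a single gene is used, so under the same assumption the median
step would reduce to the coverage of that one gene.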

Value
~~~~~

SQM object containing the parsed project. If more than one path is
provided in ``project_path`` this function will return a SQMbunch object
instead. The structure of this object is similar to that of a SQMlite
object (see ``loadSQMlite``) but with an extra entry named ``projects``
that contains one SQM object per input project. SQM and SQMbunch objects
will otherwise behave similarly when used with the subset and plot
functions from this package.
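
A minimal sketch of the multi-project case follows. The paths are
placeholders, and the guard only prevents the call from running when
``SQMtools`` or the inputs are missing.

.. code:: R

   # Placeholder inputs: one project directory and one zip file.
   project_paths <- c("/path/to/project_A", "/path/to/project_B.zip")

   bunch <- NULL
   n_projects <- 0L
   if (requireNamespace("SQMtools", quietly = TRUE) &&
       all(file.exists(project_paths))) {
     # More than one path is given, so a SQMbunch object is expected back.
     bunch <- SQMtools::loadSQM(project_paths)
     # The `projects` entry holds the individual SQM objects.
     n_projects <- length(bunch$projects)
   }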

Prerequisites
~~~~~~~~~~~~~

Run `SqueezeMeta <https://github.com/jtamames/SqueezeMeta>`__! An
example call for running it would be:

| ``/path/to/SqueezeMeta/scripts/SqueezeMeta.pl``
| ``-m coassembly -f fastq_dir -s samples_file -p project_dir``

The SQM object structure
~~~~~~~~~~~~~~~~~~~~~~~~

The SQM object is a nested list which contains the following
information:

+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
| **lvl1**         | **lvl2**               | **lvl3**          | **type**    | **rows/names** | **columns** | **data**    |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
| **$orfs**        | **$table**             |                   | *dataframe* | orfs           | misc. data  | misc. data  |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$abund**             |                   | *numeric    | orfs           | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$bases**             |                   | *numeric    | orfs           | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (bases)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$cov**               |                   | *numeric    | orfs           | samples     | coverages   |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$cpm**               |                   | *numeric    | orfs           | samples     | covs. /     |
|                  |                        |                   | matrix*     |                |             | 10^6 reads  |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tpm**               |                   | *numeric    | orfs           | samples     | tpm         |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$seqs**              |                   | *character  | orfs           | (n/a)       | sequences   |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax**               |                   | *character  | orfs           | tax. ranks  | taxonomy    |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax16S**            |                   | *character  | orfs           | (n/a)       | 16S rRNA    |
|                  |                        |                   | vector*     |                |             | taxonomy    |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax_abund**         |                   | See         |                |             |             |
|                  |                        |                   | SQM$taxa    |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$markers**           |                   | *list*      | orfs           | (n/a)       | CheckM1     |
|                  |                        |                   |             |                |             | markers     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
| **$contigs**     | **$table**             |                   | *dataframe* | contigs        | misc. data  | misc. data  |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$abund**             |                   | *numeric    | contigs        | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$bases**             |                   | *numeric    | contigs        | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (bases)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$cov**               |                   | *numeric    | contigs        | samples     | coverages   |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$cpm**               |                   | *numeric    | contigs        | samples     | covs. /     |
|                  |                        |                   | matrix*     |                |             | 10^6 reads  |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tpm**               |                   | *numeric    | contigs        | samples     | tpm         |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$seqs**              |                   | *character  | contigs        | (n/a)       | sequences   |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax**               |                   | *character  | contigs        | tax. ranks  | taxonomy    |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax_abund**         |                   | See         |                |             |             |
|                  |                        |                   | SQM$taxa    |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$bins**              |                   | *character  | contigs        | bin.        | bins        |
|                  |                        |                   | matrix*     |                | methods     |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
| **$bins**        | **$table**             |                   | *dataframe* | bins           | misc. data  | misc. data  |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$length**            |                   | *numeric    | bins           | (n/a)       | length      |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$abund**             |                   | *numeric    | bins           | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$percent**           |                   | *numeric    | bins           | samples     | percentages |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$bases**             |                   | *numeric    | bins           | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (bases)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$cov**               |                   | *numeric    | bins           | samples     | coverages   |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$cpm**               |                   | *numeric    | bins           | samples     | covs. /     |
|                  |                        |                   | matrix*     |                |             | 10^6 reads  |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax**               |                   | *character  | bins           | tax. ranks  | taxonomy    |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax_abund**         |                   | See         |                |             |             |
|                  |                        |                   | SQM$taxa    |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax_gtdb**          |                   | *character  | bins           | tax. ranks  | GTDB        |
|                  |                        |                   | matrix*     |                |             | taxonomy    |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax_abund_gtdb**    |                   | See         |                |             |             |
|                  |                        |                   | SQM$taxa    |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
| **$taxa**        | **$superkingdom**      | **$abund**        | *numeric    | superkingdoms  | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$percent**      | *numeric    | superkingdoms  | samples     | percentages |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$phylum**            | **$abund**        | *numeric    | phyla          | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$percent**      | *numeric    | phyla          | samples     | percentages |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$class**             | **$abund**        | *numeric    | classes        | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$percent**      | *numeric    | classes        | samples     | percentages |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$order**             | **$abund**        | *numeric    | orders         | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$percent**      | *numeric    | orders         | samples     | percentages |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$family**            | **$abund**        | *numeric    | families       | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$percent**      | *numeric    | families       | samples     | percentages |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$genus**             | **$abund**        | *numeric    | genera         | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$percent**      | *numeric    | genera         | samples     | percentages |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$species**           | **$abund**        | *numeric    | species        | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$percent**      | *numeric    | species        | samples     | percentages |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
| **$functions**   | **$KEGG**              | **$abund**        | *numeric    | KEGG ids       | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$bases**        | *numeric    | KEGG ids       | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (bases)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$cov**          | *numeric    | KEGG ids       | samples     | coverages   |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$cpm**          | *numeric    | KEGG ids       | samples     | covs. /     |
|                  |                        |                   | matrix*     |                |             | 10^6 reads  |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$tpm**          | *numeric    | KEGG ids       | samples     | tpm         |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$copy_number**  | *numeric    | KEGG ids       | samples     | avg. copies |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$COG**               | **$abund**        | *numeric    | COG ids        | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$bases**        | *numeric    | COG ids        | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (bases)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$cov**          | *numeric    | COG ids        | samples     | coverages   |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$cpm**          | *numeric    | COG ids        | samples     | covs. /     |
|                  |                        |                   | matrix*     |                |             | 10^6 reads  |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$tpm**          | *numeric    | COG ids        | samples     | tpm         |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$copy_number**  | *numeric    | COG ids        | samples     | avg. copies |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$PFAM**              | **$abund**        | *numeric    | PFAM ids       | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (reads)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$bases**        | *numeric    | PFAM ids       | samples     | abundances  |
|                  |                        |                   | matrix*     |                |             | (bases)     |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$cov**          | *numeric    | PFAM ids       | samples     | coverages   |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$cpm**          | *numeric    | PFAM ids       | samples     | covs. /     |
|                  |                        |                   | matrix*     |                |             | 10^6 reads  |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$tpm**          | *numeric    | PFAM ids       | samples     | tpm         |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$copy_number**  | *numeric    | PFAM ids       | samples     | avg. copies |
|                  |                        |                   | matrix*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
| **$total_reads** |                        |                   | *numeric    | samples        | (n/a)       | total reads |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
| **$misc**        | **$project_name**      |                   | *character  | (empty)        | (n/a)       | project     |
|                  |                        |                   | vector*     |                |             | name        |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$samples**           |                   | *character  | (empty)        | (n/a)       | samples     |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax_names_long**    | **$superkingdom** | *character  | short names    | (n/a)       | full names  |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$phylum**       | *character  | short names    | (n/a)       | full names  |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$class**        | *character  | short names    | (n/a)       | full names  |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$order**        | *character  | short names    | (n/a)       | full names  |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$family**       | *character  | short names    | (n/a)       | full names  |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$genus**        | *character  | short names    | (n/a)       | full names  |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  |                        | **$species**      | *character  | short names    | (n/a)       | full names  |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$tax_names_short**   |                   | *character  | full names     | (n/a)       | short names |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$KEGG_names**        |                   | *character  | KEGG ids       | (n/a)       | KEGG names  |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$KEGG_paths**        |                   | *character  | KEGG ids       | (n/a)       | KEGG        |
|                  |                        |                   | vector*     |                |             | hierarchy   |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$COG_names**         |                   | *character  | COG ids        | (n/a)       | COG names   |
|                  |                        |                   | vector*     |                |             |             |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$COG_paths**         |                   | *character  | COG ids        | (n/a)       | COG         |
|                  |                        |                   | vector*     |                |             | hierarchy   |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
|                  | **$ext_annot_sources** |                   | *character  | (empty)        | (n/a)       | external    |
|                  |                        |                   | vector*     |                |             | databases   |
+------------------+------------------------+-------------------+-------------+----------------+-------------+-------------+
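
The lookup vectors in ``SQM$misc`` are named character vectors, so they
can be indexed by name to translate identifiers. The following is a
minimal sketch on a hand-built stand-in list, not a real SQM object:
the identifiers ``K_A``/``K_B`` and all of the values are illustrative
assumptions, and only the field names come from the table above.

.. code:: R

   # Hypothetical stand-in for SQM$misc (all values are made up for illustration).
   misc <- list(
     KEGG_names = c(K_A = "illustrative function A",
                    K_B = "illustrative function B"),
     tax_names_long = list(
       phylum = c(ExamplePhylum = "k_ExampleKingdom;p_ExamplePhylum")
     )
   )

   # Index by name to translate ids into descriptive names (order follows the query).
   kegg_lookup <- misc$KEGG_names[c("K_B", "K_A")]
   print(kegg_lookup)

   # The per-rank vectors map short taxon names to full names.
   phylum_long <- misc$tax_names_long$phylum["ExamplePhylum"]
   print(phylum_long)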

If external databases for functional classification were provided to
SqueezeMeta via the ``-extdb`` argument, the corresponding abundance
(reads and bases), coverage, TPM and copy number profiles will be
available in ``SQM$functions`` (e.g. results for the CAZy database would
be stored in ``SQM$functions$CAZy``). Additionally, the extended names
of the features in each external database will be available in
``SQM$misc`` (e.g. ``SQM$misc$CAZy_names``).
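
As a rough sketch of how these external-database results might be
accessed, the snippet below builds a small mock list that mimics the
layout described above. The feature ids ``feat1``/``feat2``, the sample
names and all numbers are illustrative assumptions. It also assumes that
``ext_annot_sources`` holds the database names and that a ``tpm`` matrix
is stored for external databases as it is for ``SQM$functions$KEGG``.

.. code:: R

   # Mock object mimicking the described layout (not real SqueezeMeta output).
   SQM <- list(
     functions = list(
       CAZy = list(
         tpm = matrix(c(10, 0, 5, 20), nrow = 2,
                      dimnames = list(c("feat1", "feat2"), c("S1", "S2")))
       )
     ),
     misc = list(
       CAZy_names = c(feat1 = "illustrative family A",
                      feat2 = "illustrative family B"),
       ext_annot_sources = "CAZy"  # assumption: names of the external databases
     )
   )

   db <- SQM$misc$ext_annot_sources[1]
   # Use [[ ]] so the database name can be supplied as a string.
   tpm <- SQM$functions[[db]]$tpm
   # Most abundant feature across samples, then its extended name.
   top_feature <- names(sort(rowSums(tpm), decreasing = TRUE))[1]
   top_name <- SQM$misc[[paste0(db, "_names")]][top_feature]
   print(top_name)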

Examples
~~~~~~~~

.. code:: R

   ## Not run: 
   ## (outside R)
   ## Run SqueezeMeta on the test data.
   ## /path/to/SqueezeMeta/scripts/SqueezeMeta.pl -p Hadza -f raw -m coassembly -s test.samples
   ## Now go into R.
   library(SQMtools)
   Hadza = loadSQM("Hadza") # Where Hadza is the path to the SqueezeMeta output directory.

   ## End(Not run)

   data(Hadza) # We will illustrate the structure of the SQM object on the test data
   # Which are the ten most abundant KEGG IDs in our data?
   # Rank all KEGG IDs by total TPM, drop the "Unclassified" entry, then keep ten.
   KEGGtpm = sort(rowSums(Hadza$functions$KEGG$tpm), decreasing=TRUE)
   topKEGG = head(setdiff(names(KEGGtpm), "Unclassified"), 10)
   # Which functions do those KEGG IDs represent?
   Hadza$misc$KEGG_names[topKEGG]
   # What is the relative abundance of the Negativicutes class across samples?
   Hadza$taxa$class$percent["Negativicutes",]
   # Which information is stored in the orf, contig and bin tables?
   colnames(Hadza$orfs$table)
   colnames(Hadza$contigs$table)
   colnames(Hadza$bins$table)
   # What is the GC content distribution of my metagenome?
   boxplot(Hadza$contigs$table[,"GC perc"]) # Not weighted by contig length or abundance!