********* subsetFun ********* .. container:: ========= =============== subsetFun R Documentation ========= =============== .. rubric:: Filter results by function :name: subsetFun .. rubric:: Description :name: description Create a SQM or SQMbunch object containing only the ORFs with a given function, and the contigs and bins that contain them. .. rubric:: Usage :name: usage .. code:: R subsetFun( SQM, fun, columns = NULL, ignore_case = TRUE, fixed = FALSE, trusted_functions_only = FALSE, ignore_unclassified_functions = FALSE, rescale_tpm = FALSE, rescale_copy_number = FALSE, recalculate_bin_stats = FALSE, allow_empty = FALSE ) .. rubric:: Arguments :name: arguments +----------------------------------+----------------------------------+ | ``SQM`` | SQM or SQMbunch object to be | | | subsetted. | +----------------------------------+----------------------------------+ | ``fun`` | character. Pattern to search for | | | in the different functional | | | classifications. | +----------------------------------+----------------------------------+ | ``columns`` | character. Restrict the search | | | to the provided column names | | | from ``SQM$orfs$table``. If not | | | provided the search will be | | | performed in all the columns | | | containing functional | | | information (default ``NULL``). | +----------------------------------+----------------------------------+ | ``ignore_case`` | logical Make pattern matching | | | case-insensitive (default | | | ``TRUE``). | +----------------------------------+----------------------------------+ | ``fixed`` | logical. If ``TRUE``, pattern is | | | a string to be matched as is. If | | | ``FALSE`` the pattern is treated | | | as a regular expression (default | | | ``FALSE``). | +----------------------------------+----------------------------------+ | ``trusted_functions_only`` | logical. If ``TRUE``, only | | | highly trusted functional | | | annotations (best hit + best | | | average) will be considered when | | | generating aggregated function | | | tables. If ``FALSE``, best hit | | | annotations will be used | | | (default ``FALSE``). | +----------------------------------+----------------------------------+ | ` | logical. If ``FALSE``, ORFs with | | `ignore_unclassified_functions`` | no functional classification | | | will be aggregated together into | | | an "Unclassified" category. If | | | ``TRUE``, they will be ignored | | | (default ``FALSE``). | +----------------------------------+----------------------------------+ | ``rescale_tpm`` | logical. If ``TRUE``, TPMs for | | | KEGGs, COGs, and PFAMs will be | | | recalculated (so that the TPMs | | | in the subset actually add up to | | | 1 million). Otherwise, | | | per-function TPMs will be | | | calculated by aggregating the | | | TPMs of the ORFs annotated with | | | that function, and will thus | | | keep the scaling present in the | | | parent object (default | | | ``FALSE``). | +----------------------------------+----------------------------------+ | ``rescale_copy_number`` | logical. If ``TRUE``, copy | | | numbers with be recalculated | | | using the median single-copy | | | gene coverages in the subset. | | | Otherwise, single-copy gene | | | coverages will be taken from the | | | parent object. By default it is | | | set to ``FALSE``, which means | | | that the returned copy numbers | | | for each function will represent | | | the average copy number of that | | | function per genome in the | | | parent object. | +----------------------------------+----------------------------------+ | ``recalculate_bin_stats`` | logical. If ``TRUE``, bin | | | abundance, quality and taxonomy | | | are recalculated based on the | | | contigs present in the subsetted | | | object (default ``FALSE``). | +----------------------------------+----------------------------------+ | ``allow_empty`` | (internal use only). | +----------------------------------+----------------------------------+ .. rubric:: Value :name: value SQM or SQMbunch object containing only the requested function. .. rubric:: See Also :name: see-also ``subsetTax``, ``subsetORFs``, ``subsetSamples``, ``combineSQM``. The most abundant items of a particular table contained in a SQM object can be selected with ``mostAbundant``. .. rubric:: Examples :name: examples .. code:: R data(Hadza) Hadza.iron = subsetFun(Hadza, "iron") Hadza.carb = subsetFun(Hadza, "Carbohydrate metabolism") # Search for multiple patterns using regular expressions Hadza.twoKOs = subsetFun(Hadza, "K00812|K00813", fixed=FALSE)