Generates a heatmap of amplicon data by using sample metadata to aggregate samples and taxonomy to aggregate OTUs.

amp_heatmap(data, group_by = "")

Arguments

data

(required) Data list as loaded with amp_load.

group_by

(recommended) Group the samples by a categorical variable in the metadata. If NULL then all samples are shown.

facet_by

Facet the samples by a categorical variable in the metadata.

normalise

(logical) Transform the OTU read counts to be in percent per sample. (default: TRUE)

tax_aggregate

The taxonomic level to aggregate the OTUs. (default: "Phylum")

tax_add

Additional taxonomic level(s) to display, e.g. "Phylum". (default: "none")

tax_show

The number of taxa to show, or a vector of taxa names. (default: 10)

tax_class

Converts a specific phylum to class level instead, e.g. "p__Proteobacteria".

tax_empty

How to show OTUs without taxonomic information. One of the following:

  • "remove": Remove OTUs without taxonomic information.

  • "best": (default) Use the best classification possible.

  • "OTU": Display the OTU name.

order_x_by

A sample or vector to order the x-axis by, or "cluster" for hierarchical clustering by hclust.

order_y_by

A taxonomy group or vector to order the y-axis by, or "cluster" for hierarchical clustering by hclust.

plot_values

(logical) Plot the values on the heatmap or not. (default: TRUE)

plot_values_size

The size of the plotted values. (default: 4)

plot_legendbreaks

A vector of breaks for the abundance legend, e.g. c(1, 10, 20).

plot_colorscale

The type of scale used for the coloring of abundances, either "sqrt" or "log10". (default: "log10")

plot_na

(logical) Whether to color missing values with the lowest color in the scale or not. (default: TRUE)

measure

Calculate and display either "mean", "max" or "median" across the groups. (default: "mean")

min_abundance

All values below this value are given the same color. (default: 0.1)

max_abundance

All values above this value are given the same color.

sort_by

Sort the heatmap by a specific value of the "group_by" argument, e.g. "Treatment A".

normalise_by

A variable or a specific sample in the metadata to normalise the counts by.

scale_by

Scale the abundances by a variable in the metadata.

color_vector

Vector of colors for the colorscale, e.g. c("white", "red").

round

Number of digits to show with the values. (default: 1)

textmap

(logical) Return a data frame to print as raw text instead of a ggplot2 object. (default: FALSE)

plot_functions

Return a 2-column grid plot instead, showing known functional information about the Genus-level OTUs next to the heatmap. When using this feature, make sure that either tax_aggregate is set to "Genus" or that tax_add contains "Genus". (default: FALSE)

function_data

A data frame with functional information about genus-level OTUs in each column. If NULL the data("MiF") dataset will be used. (default: NULL)

functions

A vector with the functions to be displayed. (default: c("MiDAS", "FIL", "AOB", "NOB", "PAO", "GAO"))

functions_point_size

Size of the plotted points in the function grid. (default: 5)

rel_widths

A vector with the relative widths of the heatmap and function grid when plot_functions = TRUE. (default: c(0.75, 0.25))
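To make the interplay of group_by, measure and tax_show more concrete, the following is a minimal base R sketch of the idea, not ampvis2's actual implementation. The matrix abund, the grouping vector plant and the helper summarise_groups are hypothetical names with made-up numbers; the sketch only assumes that samples are summarised per group with the chosen measure and that taxa are then ranked by their mean abundance.

```r
# Hypothetical per-sample abundances in percent (made-up numbers):
# rows are taxa, columns are samples.
abund <- matrix(
  c(40, 10,
    20, 30,
    60,  5,
    10, 50),
  nrow = 2,
  dimnames = list(c("Proteobacteria", "Chloroflexi"),
                  c("S1", "S2", "S3", "S4"))
)

# Hypothetical metadata variable assigning each sample to a group.
plant <- c(S1 = "East", S2 = "East", S3 = "West", S4 = "West")

# Summarise each taxon within each group using the chosen measure
# (mean, median or max), mirroring the idea behind group_by and measure.
summarise_groups <- function(m, groups, measure = mean) {
  sapply(split(colnames(m), groups[colnames(m)]), function(samples) {
    apply(m[, samples, drop = FALSE], 1, measure)
  })
}

grouped_mean <- summarise_groups(abund, plant, measure = mean)
grouped_max  <- summarise_groups(abund, plant, measure = max)

# Rank taxa by mean abundance across groups and keep the top one,
# similar in spirit to tax_show = 1.
top_taxa <- head(names(sort(rowMeans(grouped_mean), decreasing = TRUE)), 1)

print(grouped_mean)
print(top_taxa)
```

Passing median or max as measure changes only the per-group summary; the ranking step here always uses the mean, which is an assumption of this sketch.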

Value

A ggplot2 object, or a data frame if textmap = TRUE.

Preserving relative abundances in a subset of larger data

By default, some plotting functions (for example amp_heatmap and amp_timeseries) automatically normalise the raw read counts in the abundance matrix, i.e. transform them to percentages per sample. After subsetting, the relative abundances shown are therefore calculated from the remaining taxa only and exclude any removed taxa. To preserve the original relative abundances, set normalise = TRUE when subsetting with the amp_subset_taxa and amp_subset_samples functions, and then set normalise = FALSE in the plotting function. The OTU counts are then transformed to relative abundances before the subset, and the plotting function skips its own transformation, as in the example below.

data("MiDAS")
subsettedData <- amp_subset_samples(MiDAS,
                                    Plant %in% c("Aalborg West", "Aalborg East"),
                                    normalise = TRUE
                                    )
amp_heatmap(subsettedData,
            group_by = "Plant",
            tax_aggregate = "Phylum",
            tax_add = "Genus",
            normalise = FALSE
            )
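The effect of the order of operations can be illustrated without the package. The following base R sketch uses a made-up count matrix and a hypothetical helper to_percent; it is not how ampvis2 is implemented, it only shows why normalising before a taxa subset gives different percentages than normalising after it.

```r
# Toy read-count matrix (made-up numbers): rows are taxa, columns are samples.
counts <- matrix(
  c(50, 30, 20,
    10, 60, 30),
  nrow = 3,
  dimnames = list(c("TaxonA", "TaxonB", "TaxonC"), c("S1", "S2"))
)

# Hypothetical helper: convert each sample (column) to percent of its total reads.
to_percent <- function(m) sweep(m, 2, colSums(m), "/") * 100

keep <- c("TaxonA", "TaxonB")

# Normalise first, then subset: percentages still refer to the full community.
before_subset <- to_percent(counts)[keep, ]

# Subset first, then normalise: percentages are inflated because they are
# computed from the kept taxa only and sum to 100 within each sample.
after_subset <- to_percent(counts[keep, ])

print(before_subset)
print(after_subset)
```

In this toy data TaxonA makes up 50 % of sample S1 when normalised before the subset, but 62.5 % when normalised afterwards, because TaxonC's reads no longer count towards the total.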

Accessing detailed raw data

The complete raw data used to generate any ggplot can always be accessed with ggplot2_object$data when the plot is saved as a ggplot2 object. To extract only the values shown on a particular heatmap, set textmap = TRUE, which returns a "textmap" version of the heatmap as a data frame; see the examples.

Examples

#Load example data
data("AalborgWWTPs")

#Heatmap grouped by WWTP
amp_heatmap(AalborgWWTPs, group_by = "Plant")

#Heatmap of 20 most abundant Genera (by mean) grouped by WWTP, split by Year,
#values not plotted for visibility, phylum name added and colorscale adjusted manually
amp_heatmap(AalborgWWTPs,
            group_by = "Plant",
            facet_by = "Year",
            plot_values = FALSE,
            tax_show = 20,
            tax_aggregate = "Genus",
            tax_add = "Phylum",
            color_vector = c("white", "red"),
            plot_colorscale = "sqrt",
            plot_legendbreaks = c(1, 5, 10)
            )

#Heatmap with known functional information about the Genera shown to the right
amp_heatmap(AalborgWWTPs,
            group_by = "Plant",
            tax_aggregate = "Genus",
            plot_functions = TRUE,
            functions = c("PAO", "GAO", "AOB", "NOB")
            )

#A raw text version of the heatmap can be printed or saved as a data frame with textmap = TRUE:
textmap <- amp_heatmap(AalborgWWTPs,
                       group_by = "Plant",
                       tax_aggregate = "Genus",
                       plot_functions = TRUE,
                       functions = c("PAO", "GAO", "AOB", "NOB"),
                       textmap = TRUE
                       )
textmap
#>                          Aalborg East Aalborg West PAO GAO AOB NOB
#> Tetrasphaera                 5.478941     6.842382 POS NEG NEG NEG
#> Trichococcus                 7.098638     3.028602  NT  NT  NT  NT
#> Candidatus Microthrix        2.769444     6.503052 NEG NEG NEG NEG
#> Rhodoferax                   3.361297     2.403885 NEG  NT  NT  NT
#> Rhodobacter                  1.847766     2.967430  NT  NT  NT  NT
#> Candidatus Promineofilum     1.317348     3.028949 NEG NEG NEG NEG
#> Dechloromonas                1.220824     2.503898 VAR NEG  NT  NT
#> Candidatus Defluviifilum     1.824867     1.502981 NEG NEG NEG NEG
#> Propionicimonas              1.229790     1.726184  NT  NT  NT  NT
#> Fodinibacter                 1.382478     1.530972  NT  NT  NT  NT