Load packages

library("ampvis")

Load data

data(DNAext_1.0)

Subset to the relevant dataset

All samples are rarefied to 25,000 reads, and only OTUs that are seen at least 10 times in a single sample (10 / 25,000 reads, i.e. 0.04 %) are kept for further ordination analysis.

storage <- subset_samples(V13, Exp.storage == "YES") %>%
  rarefy_even_depth(sample.size = 25000, rngseed = 712) %>%
  filter_taxa(function(x) max(x) >= 10, TRUE)
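
To make the abundance filter concrete, here is a minimal base R sketch of the same rule on a toy count table. The table `otu` and its values are made up for illustration and are not part of the dataset.

```r
# Toy OTU table (hypothetical values): rows are OTUs, columns are samples.
otu <- matrix(c(12, 3,   0,
                 4, 9,   2,
                 0, 0, 250),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("OTU_a", "OTU_b", "OTU_c"),
                              c("s1", "s2", "s3")))

# Keep an OTU if its maximum count in any single sample is at least 10,
# the same rule as function(x) max(x) >= 10 in the filter_taxa() call.
keep <- apply(otu, 1, function(x) max(x) >= 10)
filtered <- otu[keep, , drop = FALSE]
rownames(filtered)  # "OTU_a" "OTU_c"

# The threshold expressed as a relative abundance at 25000 reads per sample.
threshold_pct <- 10 / 25000 * 100
threshold_pct       # 0.04
```

OTU_b is dropped because it never reaches 10 reads in any one sample, even though it is present in all three.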

Figure S2A: Overall differences between samples using PCA

PCA with square root transformed OTU abundances. The effect of storage method is tested using the envfit function in vegan (permutation test).

pca <- amp_ordinate(data = storage,
                    plot.color = "Storage",
                    plot.point.size = 3,
                    plot.theme = "clean",
                    envfit.factor = "Storage",
                    envfit.show = FALSE,
                    output = "complete")
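
As a rough illustration of what a PCA on square root transformed counts involves, here is a sketch using base R's `prcomp` on simulated counts. This is not the ampvis implementation. The `counts` matrix is hypothetical, and centering without scaling is an assumption.

```r
set.seed(1)

# Hypothetical count table: 6 samples (rows) by 5 OTUs (columns).
counts <- matrix(rpois(30, lambda = 20), nrow = 6,
                 dimnames = list(paste0("sample", 1:6), paste0("OTU", 1:5)))

# The square root transformation dampens the influence of very abundant OTUs.
sqrt_counts <- sqrt(counts)

# PCA on the transformed table (centered, not scaled; an assumption here).
pca_sketch <- prcomp(sqrt_counts, center = TRUE, scale. = FALSE)

# Fraction of the total variance explained by each principal component.
explained <- pca_sketch$sdev^2 / sum(pca_sketch$sdev^2)

# Sample coordinates on the first two components, i.e. what gets plotted.
pca_sketch$x[, 1:2]
```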

Plot the PCA. The samples appear to show some grouping by storage method.

pca$plot +
  theme(legend.position = "none")

ggsave("plots/S2A.eps", width = 55, height = 55, units = "mm")

The model reports a p-value of 0.01, hence there is a significant overall effect of storage method.

pca$eff.model
## 
## ***FACTORS:
## 
## Centroids:
##                    PC1     PC2
## Storage24h.20C  2.4467  0.2418
## Storage24h.4C  -2.2132 -2.1112
## StorageDirect  -0.2335  1.8694
## 
## Goodness of fit:
##             r2 Pr(>r)   
## Storage 0.5802   0.01 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation: free
## Number of permutations: 999
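
The permutation p-value above can be understood with a small base R sketch. It follows the general idea of fitting a factor onto ordination scores and is not vegan's exact code. The scores and groups are simulated, and `factor_r2` is a hypothetical helper.

```r
set.seed(42)

# Hypothetical 2D ordination scores for 9 samples in 3 groups of 3.
group <- factor(rep(c("A", "B", "C"), each = 3))
scores <- cbind(PC1 = rnorm(9), PC2 = rnorm(9))

# Goodness of fit for a factor: 1 - (within-group SS / total SS),
# summed over both ordination axes.
factor_r2 <- function(scores, group) {
  total <- sum(scale(scores, scale = FALSE)^2)
  within <- sum(sapply(split(as.data.frame(scores), group), function(g) {
    sum(scale(as.matrix(g), scale = FALSE)^2)
  }))
  1 - within / total
}

observed <- factor_r2(scores, group)

# Shuffle the group labels 999 times and recompute r2 each time.
n_perm <- 999
permuted <- replicate(n_perm, factor_r2(scores, sample(group)))

# Share of permutations at least as extreme as the observed value;
# the +1 counts the observed labelling itself.
p_value <- (sum(permuted >= observed) + 1) / (n_perm + 1)
```

With 999 permutations, the smallest p-value this scheme can return is 1 / 1000 = 0.001.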

Figure S2B: Cluster analysis of beta diversity using Bray-Curtis

The Bray-Curtis dissimilarity index is used as an alternative method to test for significant groupings in the dataset.

beta <- amp_test_cluster(data = storage, 
                         group = "Storage", 
                         method = "bray", 
                         plot.color = "Storage", 
                         plot.label = "Storage",
                         plot.theme = "clean")
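
For reference, the Bray-Curtis dissimilarity between two samples is the sum of absolute count differences divided by the total count of both samples. A minimal sketch in base R, where `bray_curtis` is a hypothetical helper and the vectors are made-up counts:

```r
# Bray-Curtis dissimilarity between two count vectors over the same OTUs.
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

a <- c(10, 0, 5)
b <- c(5, 5, 5)

bray_curtis(a, b)                      # 10 / 30 = 0.3333333
bray_curtis(a, a)                      # 0: identical samples
bray_curtis(c(10, 0, 0), c(0, 5, 5))   # 1: no shared OTUs
```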

Using adonis we also see a small but significant effect of storage method, as the p-value is 0.01.

beta$adonis
## 
## Call:
## adonis(formula = test_formula, data = sample) 
## 
## Permutation: free
## Number of permutations: 999
## 
## Terms added sequentially (first to last)
## 
##           Df SumsOfSqs   MeanSqs F.Model     R2 Pr(>F)   
## Storage    2  0.015723 0.0078617  1.3397 0.3087   0.01 **
## Residuals  6  0.035211 0.0058685         0.6913          
## Total      8  0.050935                   1.0000          
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
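
The R2 and F.Model columns follow directly from the sums of squares in the table. The arithmetic, using the printed (rounded) values, is:

```r
# Values copied from the adonis table above.
ss_storage <- 0.015723
ss_resid   <- 0.035211
ss_total   <- 0.050935
df_storage <- 2
df_resid   <- 6

# R2: share of the total sum of squares attributed to Storage.
r2 <- ss_storage / ss_total            # about 0.309

# Pseudo-F: mean square of Storage over the residual mean square.
f_model <- (ss_storage / df_storage) / (ss_resid / df_resid)  # about 1.34
```

Small deviations in the last digit compared to the table come from rounding of the printed inputs.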

Clustering the data, however, shows no distinct grouping by storage method.

beta$plot_cluster +
  theme(legend.position = "none")

ggsave("plots/S2B.eps", width = 60, height = 55, units = "mm")
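
A cluster analysis of this kind can be sketched in base R by building a Bray-Curtis distance matrix and passing it to `hclust`. The counts are simulated, and the linkage method ("average") and the number of clusters are assumptions, not necessarily what `amp_test_cluster` uses.

```r
set.seed(7)

# Hypothetical count table: 6 samples (rows) by 4 OTUs (columns).
counts <- matrix(rpois(24, lambda = 15), nrow = 6,
                 dimnames = list(paste0("sample", 1:6), paste0("OTU", 1:4)))

# Pairwise Bray-Curtis dissimilarities between all samples.
n <- nrow(counts)
bc <- matrix(0, n, n, dimnames = list(rownames(counts), rownames(counts)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    bc[i, j] <- sum(abs(counts[i, ] - counts[j, ])) /
      sum(counts[i, ] + counts[j, ])
  }
}

# Hierarchical clustering on the dissimilarities (average linkage assumed).
tree <- hclust(as.dist(bc), method = "average")

# Cut the tree into 3 groups and see which samples end up together.
clusters <- cutree(tree, k = 3)
clusters
```

If storage had a distinct effect, samples with the same storage method would be expected to fall into the same branch of the tree.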

Figure S2C: Variance compared to time-series data

storage_time <- subset_samples(V13, Exp.storage == "YES" | Exp.time == "YES") %>%
  rarefy_even_depth(sample.size = 25000, rngseed = 712) %>%
  filter_taxa(function(x) max(x) >= 10, TRUE)

Looking at the data using PCA, it seems that we can no longer separate the time points within 2 weeks.

amp_ordinate(data = storage_time, 
             plot.color = "Date", 
             plot.point.size = 3,
             plot.theme = "clean"
             ) +
  scale_color_discrete(name = "Sampling date") +
  theme(legend.key.height = unit(3, "mm"))

ggsave("plots/S2C.eps", width = 90, height = 55, units = "mm")

Figure S2D: Using clustering to estimate classification resolution

beta_time <- amp_test_cluster(data = storage_time,
                              group = "Storage",
                              method = "bray",
                              plot.color = "Date",
                              plot.label = "Storage",
                              plot.theme = "clean")

beta_time$plot_cluster +
  theme(legend.position = "none")

ggsave("plots/S2D.eps", width = 60, height = 55, units = "mm")