library("ampvis")

data(DNAext_1.0)

## Subset to the relevant samples

All samples are subset to 17.000 reads and then only OTUs which are seen at least 10 / 17000 times in a single sample is kept for further ordination analysis.

rep <- subset_samples(V13, Exp.Biorep == "YES") %>%
rarefy_even_depth(sample.size = 17000, rngseed = 712) %>%
filter_taxa(function(x) max(x) >= 10, TRUE)

## Figure 1A: PCA colored by sampling site

PCA with square root transformed OTU abundances. The effect of sampling from different tanks is tested using the envfit function in vegan (permutation test).

pca <- amp_ordinate(data = rep,
plot.point.size = 3,
plot.theme = "clean",
envfit.show = F,
output = "complete"
)

It looks like there is no significant groupings.

pca$plot + scale_color_discrete(name = "Sampling site") + theme(legend.position = "none") + xlim(-5,5) ggsave("plots/Fig1A.eps", width = 55, height = 55, units = "mm") The model reports a p-value of 0.529, hence there is no effect of sampling from different tanks. pca$eff.model
##
## ***FACTORS:
##
## Centroids:
##                      PC1     PC2
##
## Goodness of fit:
##              r2 Pr(>r)
## Permutation: free
## Number of permutations: 999

## Cluster analysis of beta diversity using Bray-Curtis

The Bray-Curtis dissimilarity index is used as an alternative method to test for significant groupings in the dataset.

beta <- amp_test_cluster(data = rep,
group = "Bio.Rep",
method = "bray",
plot.theme = "clean")

Using adonis we do not find a significant effect either as the p-value is 0.111.

beta$adonis ## ## Call: ## adonis(formula = test_formula, data = sample) ## ## Permutation: free ## Number of permutations: 999 ## ## Terms added sequentially (first to last) ## ## Df SumsOfSqs MeanSqs F.Model R2 Pr(>F) ## Bio.Rep 3 0.025534 0.0085112 1.132 0.29801 0.111 ## Residuals 8 0.060148 0.0075185 0.70199 ## Total 11 0.085682 1.00000 Clustering the data show a similar trend, no grouping by sampling site. beta$plot_cluster +
theme(legend.position = "none")

## Variation at OTU level

### Subset to the relevant samples

rr <- subset_samples(V13, Exp.Biorep == "YES") %>%
rarefy_even_depth(sample.size = 32000, rngseed = 712) 

## Calculate 95% confidence interval

The 95% confidence interval for each OTU is calculated in order to compare the variation as a function of sequencing depth. The confidence interval is expressed as % of the mean count in order to compare across sequencing depth. The population standard deviation is estimated from the 11 samples and the confidence interval is calculated using 3 samples.

data <- data.frame(OTU = rownames(otu_table(rr)@.Data), otu_table(rr)@.Data) %>%
melt(id.vars = "OTU", value.name = "Count", variable.name = "Sample") %>%
group_by(OTU) %>%
summarise(Mean = mean(Count),
pRR = sd(Count)*qt(0.975,df = n()-1)/mean(Count)*100,
pCI3 = (sd(Count)*qt(0.975,df = n()-1)/sqrt(3))/mean(Count)*100)

## Figure Fig1B: 95% confidence interval as function of mean OTU count

ggplot(data, aes(x = Mean, y = pCI3)) +
geom_point(size = 1) +
geom_smooth(se = F, color = "darkred", size = 1) +
scale_x_log10(limits=c(1,2000), breaks = c(1, 10, 100, 1000)) +
scale_y_continuous(limits=c(1,200), breaks = c(0, 20, 50, 100, 150, 200)) +
xlab("Mean count") +
ylab("95% CI as % of mean count") +
#geom_hline(y=15, linetype = "dashed", color = "darkred") +
#geom_vline(x=10, linetype = "dashed", color = "darkred") +
theme(legend.position = "none",
text = element_text(size = 8, color = "black"),
axis.text = element_text(size = 8, color = "black"),
axis.text.y = element_text(hjust = 1),
plot.margin = unit(c(0,0,0,0), "mm"),
axis.line = element_line(color = "black"),
panel.grid.minor = element_blank(),
panel.grid.major = element_line(color = "grey95"),
axis.ticks = element_line(color = "black"),
axis.ticks.length = unit(1, "mm"),
panel.background = element_blank()
)

ggsave("plots/Fig1B.eps", width = 55, height = 55, units = "mm")