Figure 1 Sampling

Load packages

library("ampvis")

Load data

data(DNAext_1.0)

Subset to the relevant samples

All samples are subset to 17.000 reads and then only OTUs which are seen at least 10 / 17000 times in a single sample is kept for further ordination analysis.

rep <- subset_samples(V13, Exp.Biorep == "YES") %>%
  rarefy_even_depth(sample.size = 17000, rngseed = 712) %>%
  filter_taxa(function(x) max(x) >= 10, TRUE)

Figure 1A: PCA colored by sampling site

PCA with square root transformed OTU abundances. The effect of sampling from different tanks is tested using the envfit function in vegan (permutation test).

pca <- amp_ordinate(data = rep, 
             plot.color = "Add.Label", 
             plot.point.size = 3,
             plot.theme = "clean",
             envfit.factor = "Add.Label",
             envfit.show = F,             
             output = "complete"
             )

It looks like there is no significant groupings.

pca$plot +
  scale_color_discrete(name = "Sampling site") +
  theme(legend.position = "none") +
  xlim(-5,5)

ggsave("plots/Fig1A.eps", width = 55, height = 55, units = "mm")

The model reports a p-value of 0.529, hence there is no effect of sampling from different tanks.

pca$eff.model

## 
## ***FACTORS:
## 
## Centroids:
##                      PC1     PC2
## Add.LabelTank1-A  1.9160 -1.0045
## Add.LabelTank1-B -0.0376  1.2777
## Add.LabelTank2-A -1.6910 -0.1374
## Add.LabelTank2-B -0.1874 -0.1357
## 
## Goodness of fit:
##              r2 Pr(>r)
## Add.Label 0.259  0.529
## Permutation: free
## Number of permutations: 999

Cluster analysis of beta diversity using Bray-Curtis

The Bray-Curtis dissimilarity index is used as an alternative method to test for significant groupings in the dataset.

beta <- amp_test_cluster(data = rep, 
                         group = "Bio.Rep", 
                         method = "bray", 
                         plot.color = "Add.Label", 
                         plot.label = "Add.Label",
                         plot.theme = "clean")

Using adonis we do not find a significant effect either as the p-value is 0.111.

beta$adonis

## 
## Call:
## adonis(formula = test_formula, data = sample) 
## 
## Permutation: free
## Number of permutations: 999
## 
## Terms added sequentially (first to last)
## 
##           Df SumsOfSqs   MeanSqs F.Model      R2 Pr(>F)
## Bio.Rep    3  0.025534 0.0085112   1.132 0.29801  0.111
## Residuals  8  0.060148 0.0075185         0.70199       
## Total     11  0.085682                   1.00000

Clustering the data show a similar trend, no grouping by sampling site.

beta$plot_cluster +
  theme(legend.position = "none")

Variation at OTU level

Subset to the relevant samples

The samples are subset to 32.000 reads. One of 12 biological replicates is discarded as it only had 17.000 reads.

rr <- subset_samples(V13, Exp.Biorep == "YES") %>%
  rarefy_even_depth(sample.size = 32000, rngseed = 712)

Calculate 95% confidence interval

The 95% confidence interval for each OTU is calculated in order to compare the variation as a function of sequencing depth. The confidence interval is expressed as % of the mean count in order to compare across sequencing depth. The population standard deviation is estimated from the 11 samples and the confidence interval is calculated using 3 samples.

data <- data.frame(OTU = rownames(otu_table(rr)@.Data), otu_table(rr)@.Data) %>%
  melt(id.vars = "OTU", value.name = "Count", variable.name = "Sample") %>%
  group_by(OTU) %>%
  summarise(Mean = mean(Count),
            pRR = sd(Count)*qt(0.975,df = n()-1)/mean(Count)*100,
            pCI3 = (sd(Count)*qt(0.975,df = n()-1)/sqrt(3))/mean(Count)*100)

Figure Fig1B: 95% confidence interval as function of mean OTU count

ggplot(data, aes(x = Mean, y = pCI3)) +
  geom_point(size = 1) +
  geom_smooth(se = F, color = "darkred", size = 1) +
  scale_x_log10(limits=c(1,2000), breaks = c(1, 10, 100, 1000)) +
  scale_y_continuous(limits=c(1,200), breaks = c(0, 20, 50, 100, 150, 200)) +
  xlab("Mean count") +
  ylab("95% CI as % of mean count") +
  #geom_hline(y=15, linetype = "dashed", color = "darkred") +
  #geom_vline(x=10, linetype = "dashed", color = "darkred") +
  theme(legend.position = "none",
        text = element_text(size = 8, color = "black"),
        axis.text = element_text(size = 8, color = "black"),
        axis.text.y = element_text(hjust = 1),
        plot.margin = unit(c(0,0,0,0), "mm"),
        axis.line = element_line(color = "black"),
        panel.grid.minor = element_blank(),
        panel.grid.major = element_line(color = "grey95"),
        axis.ticks = element_line(color = "black"),
        axis.ticks.length = unit(1, "mm"),
        panel.background = element_blank()
        )

ggsave("plots/Fig1B.eps", width = 55, height = 55, units = "mm")

Figure 1 Sampling

Mads Albertsen

19-07-2015

Load packages

Load data

Subset to the relevant samples

Figure 1A: PCA colored by sampling site

Cluster analysis of beta diversity using Bray-Curtis

Variation at OTU level

Subset to the relevant samples

Calculate 95% confidence interval

Figure Fig1B: 95% confidence interval as function of mean OTU count