Introduction

This report documents the binning of a Nitrospira comammox genome from a pilot-scale membrane bioreactor (MBR) used for wastewater treatment. See Daims et al., 2015: Complete Nitrification by Nitrospira Bacteria for further details.

Load the mmgenome package

The metagenome data is analysed using the mmgenome package.

library("mmgenome")

Import data

The Rmarkdown file Load_data.Rmd describes how the data are loaded and can be imported using the mmimport function. However, the preprocessed data can also be downloaded directly from figshare: Daims_MBR. Here we therefore load the preprocessed data from figshare instead.

load("Daims_MBR.RData")
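As a small aside on how load behaves, the sketch below saves a toy object to a temporary .RData file and loads it back into a separate environment. The names d_toy, tmp and env are hypothetical and are not part of the report; the point is only that load restores objects under their saved names and invisibly returns those names, which is a convenient way to check what a downloaded file contains.

```r
# Hypothetical stand-in for the real object; only the list layout is mimicked.
d_toy <- list(scaffolds = data.frame(scaffold = 1:3,
                                     length = c(1200, 3400, 560)))

# Save to a temporary .RData file, then load it into a fresh environment
# so that nothing in the global workspace is overwritten.
tmp <- tempfile(fileext = ".RData")
save(d_toy, file = tmp)

env <- new.env()
loaded <- load(tmp, envir = env)

print(loaded)                    # names of the restored objects
print(names(env$d_toy))          # elements of the restored list
```

Loading into a separate environment first is one way to inspect an unfamiliar .RData file before letting it add objects to the workspace.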

Data overview

The object d contains information on scaffolds and on the essential genes within the scaffolds. For each scaffold the dataset contains the following information: the columns MBR1, Gel1, Gel7, Gel, MBR, Foam and Foam2 contain the coverage of the scaffold in seven different samples; PC1, PC2 and PC3 contain the coordinates of the first three principal components from a PCA of tetranucleotide frequencies; essential contains taxonomic information for each scaffold, based on classification of its essential genes; rRNA16S contains taxonomic information for scaffolds that have an associated 16S rRNA gene.

colnames(d$scaffolds)
##  [1] "scaffold"  "length"    "gc"        "MBR1"      "Gel1"     
##  [6] "Gel7"      "Gel"       "MBR"       "Foam"      "Foam2"    
## [11] "PC1"       "PC2"       "PC3"       "essential" "rRNA16S"
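To illustrate what the PC1, PC2 and PC3 columns represent, here is a minimal base R sketch of a PCA on tetranucleotide frequencies. It runs on randomly generated toy sequences and assumes a centred, unscaled PCA via prcomp; the report does not state the exact procedure used to produce the columns in d$scaffolds, and the helper names random_seq and tetra_freq are hypothetical.

```r
set.seed(1)
bases <- c("A", "C", "G", "T")

# Hypothetical helper: a random sequence with a given base composition.
random_seq <- function(n, prob) {
  paste(sample(bases, n, replace = TRUE, prob = prob), collapse = "")
}

# Six toy "scaffolds": three GC-rich and three AT-rich (made-up data).
seqs <- c(
  s1 = random_seq(2000, c(0.20, 0.30, 0.30, 0.20)),
  s2 = random_seq(2000, c(0.20, 0.30, 0.30, 0.20)),
  s3 = random_seq(2000, c(0.20, 0.30, 0.30, 0.20)),
  s4 = random_seq(2000, c(0.35, 0.15, 0.15, 0.35)),
  s5 = random_seq(2000, c(0.35, 0.15, 0.15, 0.35)),
  s6 = random_seq(2000, c(0.35, 0.15, 0.15, 0.35))
)

# All 4^4 = 256 possible tetranucleotides.
kmers <- apply(expand.grid(bases, bases, bases, bases), 1,
               paste, collapse = "")

# Hypothetical helper: relative frequency of every overlapping 4-mer.
tetra_freq <- function(s) {
  starts <- seq_len(nchar(s) - 3)
  words <- substring(s, starts, starts + 3)
  counts <- table(factor(words, levels = kmers))
  as.numeric(counts) / sum(counts)
}

# One row per sequence, one column per tetranucleotide.
freq <- t(sapply(seqs, tetra_freq))
colnames(freq) <- kmers

# PCA on the frequency matrix; keep the first three components.
pca <- prcomp(freq)
pcs <- pca$x[, 1:3]
colnames(pcs) <- c("PC1", "PC2", "PC3")
print(round(pcs, 4))
```

Sequences with a similar composition end up close together in the PC1 to PC3 space, which is why these coordinates are useful alongside coverage when separating genomes.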

The basic statistics of the full dataset can be summarised using the mmstats function.

mmstats(d, ncov = 7)
##                General Stats
## n.scaffolds        102630.00
## GC.mean                56.70
## N50                  3791.00
## Length.total    301288901.00
## Length.max         913506.00
## Length.mean          2935.70
## Coverage.MBR1           2.86
## Coverage.Gel1           6.33
## Coverage.Gel7           8.12
## Coverage.Gel            1.21
## Coverage.MBR            0.66
## Coverage.Foam           0.84
## Coverage.Foam2          4.05
## Ess.total            5840.00
## Ess.unique            109.00
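For readers unfamiliar with the length statistics, the following base R sketch shows one common way to compute them on a toy table. The data frame toy and the helper n50 are hypothetical, and the N50 definition used here (the length of the scaffold at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total) is the conventional one rather than something confirmed from the mmstats source.

```r
# Hypothetical scaffold table using the same column names as d$scaffolds.
toy <- data.frame(
  scaffold = 1:5,
  length = c(100, 200, 300, 400, 500)
)

# Hypothetical helper: conventional N50 of a vector of scaffold lengths.
n50 <- function(len) {
  sorted <- sort(len, decreasing = TRUE)
  sorted[which(cumsum(sorted) >= sum(sorted) / 2)[1]]
}

toy_stats <- c(
  n.scaffolds  = nrow(toy),
  N50          = n50(toy$length),
  Length.total = sum(toy$length),
  Length.max   = max(toy$length),
  Length.mean  = mean(toy$length)
)
print(toy_stats)
```

The same arithmetic can be checked against the output above: Length.total divided by n.scaffolds is 301288901 / 102630, or about 2935.7, which matches the reported Length.mean.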

Overview plot

The assembly is decent, even though it is from a full-scale sample with a high degree of micro-diversity.