Introduction

This report documents the binning of a Nitrospira genome from the GWW sample. See Daims et al., 2015: Complete Nitrification by Nitrospira Bacteria for further details.

Load the mmgenome package

The metagenome data is analysed using the mmgenome package.

library("mmgenome")

Import data

The Rmarkdown file Load_data.Rmd describes how the data is loaded; the result can be imported using the mmimport function. However, the preprocessed data can also be downloaded directly from figshare: Daims_GWW. Here we import the preprocessed data from figshare instead.

load("Daims_GWW.RData")
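
The rest of this report relies on objects restored by load(), so it can be useful to check what an .RData file contains before using it. The sketch below uses a toy file written to a temporary location rather than Daims_GWW.RData. The object names d and assembly mirror those used later in this report, but their contents here are placeholder values.

```r
# Toy stand-ins for the objects used later in this report (placeholder values).
d <- data.frame(scaffold = c("s1", "s2"), length = c(1200, 3400))
assembly <- c(s1 = "ACGT", s2 = "GGCC")

# Write them to a temporary .RData file and remove them from the workspace.
rdata_file <- tempfile(fileext = ".RData")
save(d, assembly, file = rdata_file)
rm(d, assembly)

# Load into a separate environment so nothing in the workspace is overwritten.
e <- new.env()
loaded <- load(rdata_file, envir = e)

# load() invisibly returns the names of the restored objects.
print(sort(loaded))
```

Loading into a separate environment is a design choice for inspection only; the report itself loads straight into the workspace.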

Extract Nitrospira 2

The second Nitrospira bin can also be separated relatively easily from the other genomes.

p <- mmplot(data = d, 
            x = "HPD", 
            y = "HPF1", 
            log.x = FALSE, 
            log.y = FALSE, 
            color = "essential", 
            minlength = 1000,
            factor.shape = "solid") +
  xlim(0, 50) +
  ylim(50, 600)

# Uncomment to display the plot and define the selection interactively:
# p
# sel <- mmplot_locator(p)

sel <- data.frame(HPD  = c(-0.483, 0.487, 12.9, 25.5, 23.4, 11.7, 2.75),
                  HPF1 = c(210, 292, 307, 292, 209, 148, 148))

mmplot_selection(p, sel)

The scaffolds included in the defined subspace are extracted using the mmextract function.

dA <- mmextract(d, sel)
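
Conceptually, the extraction keeps the scaffolds whose (HPD, HPF1) coordinates fall inside the polygon defined by sel. The base R sketch below illustrates this idea with a ray-casting test. It is not the mmgenome implementation: the function in_polygon and the three toy scaffolds are hypothetical and serve only as an illustration.

```r
# Polygon vertices copied from the selection above.
sel <- data.frame(HPD  = c(-0.483, 0.487, 12.9, 25.5, 23.4, 11.7, 2.75),
                  HPF1 = c(210, 292, 307, 292, 209, 148, 148))

# Hypothetical helper: ray-casting point-in-polygon test.
# Counts how many polygon edges a horizontal ray starting at (x, y) crosses;
# an odd number of crossings means the point lies inside the polygon.
in_polygon <- function(x, y, px, py) {
  n <- length(px)
  inside <- FALSE
  j <- n  # index of the previous vertex; the polygon is closed implicitly
  for (i in seq_len(n)) {
    # Does the edge between vertex j and vertex i straddle the height y?
    if ((py[i] > y) != (py[j] > y)) {
      # x coordinate at which the edge reaches height y
      x_cross <- px[i] + (y - py[i]) * (px[j] - px[i]) / (py[j] - py[i])
      if (x < x_cross) inside <- !inside
    }
    j <- i
  }
  inside
}

# Made-up scaffold coverages, for illustration only.
toy <- data.frame(scaffold = c("a", "b", "c"),
                  HPD  = c(10, 40, 12),
                  HPF1 = c(230, 230, 100))

toy$inside <- mapply(in_polygon, toy$HPD, toy$HPF1,
                     MoreArgs = list(px = sel$HPD, py = sel$HPF1))

# Only scaffold "a" lies inside the selection polygon.
toy[toy$inside, ]
```

Points that fall exactly on a polygon edge are not treated specially in this sketch.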

The mmstats function applies to any extracted object. Hence, it can be used directly on the subset.

mmstats(dA, ncov = 2)
##               General Stats
## n.scaffolds          559.00
## GC.mean               57.10
## N50                10413.00
## Length.total     3703093.00
## Length.max         56929.00
## Length.mean         6624.50
## Coverage.HPD          10.55
## Coverage.HPF1        228.21
## Ess.total            100.00
## Ess.unique            84.00
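
Some of the reported values can be cross-checked with simple arithmetic. The sketch below recomputes the mean scaffold length from the totals reported above, and shows one common way to compute N50 on a made-up vector of lengths. The function n50 is a hypothetical helper and may differ in detail from how mmstats computes the value.

```r
# Values copied from the mmstats output above.
n_scaffolds  <- 559
length_total <- 3703093

# Mean scaffold length = total length / number of scaffolds.
length_mean <- round(length_total / n_scaffolds, 2)
print(length_mean)

# Hypothetical helper: N50 is the length of the scaffold at which the
# cumulative length, summed from the longest scaffold downwards, first
# reaches half of the total length.
n50 <- function(lengths) {
  sorted <- sort(lengths, decreasing = TRUE)
  sorted[which(cumsum(sorted) >= sum(sorted) / 2)[1]]
}

# Made-up lengths, for illustration only.
toy_n50 <- n50(c(100, 200, 300, 400, 500))
print(toy_n50)
```

The recomputed mean agrees with the reported Length.mean of 6624.50.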

Export the scaffolds

Finally, the binned scaffolds are exported to a FASTA file.

mmexport(data = dA, assembly = assembly, file = "GWW_Nitrospira2.fa")