This report documents the binning of a Nitrospira genome from the GWW sample. See Daims et al., 2015: Complete Nitrification by Nitrospira Bacteria for further details.
The metagenome data is analysed using the mmgenome package.
library("mmgenome")
The Rmarkdown file Load_data.Rmd describes the loading of the data and can be imported using the mmimport function. However, the preprocessed data can also be downloaded directly from figshare: Daims_GWW. Hence, here we import the preprocessed data from figshare instead.
load("Daims_GWW.RData")
The second Nitrospira bin can also be separated relatively easily from the other genomes.
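# Plot coverage in HPD against coverage in HPF1, coloured by essential genes
p <- mmplot(data = d,
            x = "HPD",
            y = "HPF1",
            log.x = FALSE,
            log.y = FALSE,
            color = "essential",
            minlength = 1000,
            factor.shape = "solid") +
  xlim(0, 50) +
  ylim(50, 600)
# Interactive step (commented out): show the plot and click to define the selection
#p
#sel <- mmplot_locator(p)
# Polygon vertices recorded from the interactive selection
sel <- data.frame(HPD = c(-0.483, 0.487, 12.9, 25.5, 23.4, 11.7, 2.75),
                  HPF1 = c(210, 292, 307, 292, 209, 148, 148))
mmplot_selection(p, sel)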
The scaffolds included in the defined subspace are extracted using the mmextract function.
dA <- mmextract(d, sel)
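To make the idea of a "defined subspace" concrete, the following base R sketch tests whether points fall inside the polygon given by sel, using a ray-casting test. It is only an illustration of the concept and is not taken from the mmgenome source: the helper point_in_polygon and the toy data frame with its coverage values are hypothetical, and mmextract may well be implemented differently.

```r
# Hypothetical helper (not part of mmgenome): ray-casting point-in-polygon test.
# Returns TRUE if the point (px, py) lies inside the polygon with vertices (vx, vy).
point_in_polygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    # Does the edge from vertex j to vertex i straddle the horizontal line y = py?
    if ((vy[i] > py) != (vy[j] > py)) {
      # x coordinate at which the edge crosses that line
      x_cross <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      # Count only crossings to the right of the point
      if (px < x_cross) inside <- !inside
    }
    j <- i
  }
  inside
}

# Selection polygon as defined in this report
sel <- data.frame(HPD = c(-0.483, 0.487, 12.9, 25.5, 23.4, 11.7, 2.75),
                  HPF1 = c(210, 292, 307, 292, 209, 148, 148))

# Toy scaffolds with made-up coverage values (assumption, for illustration only)
toy <- data.frame(scaffold = c("s1", "s2", "s3"),
                  HPD = c(10.55, 40, 5),
                  HPF1 = c(228.21, 228, 500))

# Test every toy scaffold against the polygon and keep those inside it
keep <- mapply(point_in_polygon, toy$HPD, toy$HPF1,
               MoreArgs = list(vx = sel$HPD, vy = sel$HPF1))
toy[keep, ]
```

Only the first toy scaffold lies inside the polygon; the other two fall outside it in the HPD or HPF1 direction.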
The mmstats function applies to any extracted object. Hence, it can be used directly on the subset.
mmstats(dA, ncov = 2)
## General Stats
## n.scaffolds 559.00
## GC.mean 57.10
## N50 10413.00
## Length.total 3703093.00
## Length.max 56929.00
## Length.mean 6624.50
## Coverage.HPD 10.55
## Coverage.HPF1 228.21
## Ess.total 100.00
## Ess.unique 84.00
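As a reading aid for the statistics above, the sketch below shows how an N50 value is conventionally computed and checks that the reported mean length agrees with the reported total length and scaffold count. The helper n50 and the toy lengths are hypothetical and are not taken from mmgenome; the exact N50 convention used by mmstats is assumed here, not confirmed.

```r
# Hypothetical helper (not part of mmgenome): N50 of a vector of scaffold lengths.
# Assumed convention: the length of the scaffold at which the running total,
# taken from longest to shortest, first reaches half of the total length.
n50 <- function(lengths) {
  sorted <- sort(lengths, decreasing = TRUE)
  sorted[which(cumsum(sorted) >= sum(sorted) / 2)[1]]
}

# Made-up scaffold lengths (assumption, for illustration only)
toy_lengths <- c(100, 80, 60, 40, 20)
toy_n50 <- n50(toy_lengths)
toy_n50

# Consistency check on the reported values: Length.total / n.scaffolds
mean_length <- round(3703093 / 559, 2)
mean_length
```

For the toy lengths the total is 300, so the running total first reaches 150 at the second-longest scaffold. The second value reproduces the Length.mean reported above.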
Finally, the binned scaffolds are exported.
mmexport(data = dA, assembly = assembly, file = "GWW_Nitrospira2.fa")