Harper et al. (2020)

Data processing workflow and supplementary data for:

Harper et al. (2020) Using DNA metabarcoding to investigate diet and niche partitioning in the native European otter (Lutra lutra) and invasive American mink (Neovison vison)

Permanently archived at:

Instructions to set up dependencies for data processing and analyses

To facilitate full reproducibility of our analyses, we provide Jupyter notebooks illustrating our workflow and all necessary associated data in this repository.

Illumina data were processed (from raw reads to taxonomic assignment) using the metaBEAT pipeline. The pipeline relies on a range of open bioinformatics tools, which we have wrapped up in a self-contained Docker image (chrishah/metabeat) that includes all necessary dependencies.

Setting up the environment

To retrieve the scripts and associated data (reference sequences, sample metadata, etc.), start by cloning this repository into your current directory:

git clone --recursive https://github.com/HullUni-bioinformatics/Harper_et_al_2020_mustelid_diet_metabarcoding.git

To use our self-contained analysis environment, you will need to install Docker on your computer. Docker is available for all major operating systems; see the Docker documentation for installation details. On Ubuntu, installing Docker should be as easy as:

sudo apt-get install docker.io

Once Docker is installed, you can enter the environment by typing:

sudo docker run -i -t --net=host --name metaBEAT -v $(pwd):/home/working chrishah/metabeat /bin/bash

This will download the metaBEAT image (if not yet present on your computer) and enter the 'container', i.e. the self-contained environment (NB: sudo may be necessary in some cases). With the above command, your current working directory (as given by $(pwd)) is mounted at /home/working inside the container. In other words, anything you do in the container's /home/working directory appears directly in your current working directory on your local machine, and vice versa.
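Because the command above gives the container the fixed name metaBEAT, running it a second time will fail with a name conflict once the container exists. The following is a minimal sketch for later sessions, not part of the repository: the helper name enter_metabeat is hypothetical, and the docker calls may need a sudo prefix on your system.

```shell
# Sketch: re-enter the existing 'metaBEAT' container, or create it on first use.
# enter_metabeat is a hypothetical helper name, not part of the repository.
# Prefix the docker calls with sudo if your setup requires it.
enter_metabeat() {
    # List all containers (running or stopped) by name and look for an exact match.
    if docker ps -a --format '{{.Names}}' | grep -qx 'metaBEAT'; then
        # Container exists (typically stopped after you left its shell):
        # start it again and attach interactively.
        docker start -a -i metaBEAT
    else
        # First use: same command as in the instructions above.
        docker run -i -t --net=host --name metaBEAT \
            -v "$(pwd)":/home/working chrishah/metabeat /bin/bash
    fi
}
```

Call enter_metabeat from the base of the cloned repository so that the same directory is mounted each time. To start from a fresh container instead, remove the old one first with docker rm metaBEAT.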

Data processing workflow as Jupyter notebooks

Raw Illumina data have been deposited in the NCBI SRA:

Study: SRP270831
BioProject: PRJNA644190
BioSample accessions: SAMN15452877-SAMN15453005 (Library 1) and SAMN15455442-SAMN15455596 (Library 2)
SRA accessions: SRR12168859-SRR12168984 (Library 1) and SRR12176017-SRR12176170 (Library 2)

The sample-specific accessions can be found here. Before following the data processing workflow, you will need to download the raw reads from the SRA; to do so, follow the steps in this Jupyter notebook.
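For orientation, the SRA run accession ranges listed above can be expanded into one accession per line. This is only a sketch, not the authors' download notebook: it assumes every accession inside each range belongs to this study, the helper names expand_range and list_sra_runs are hypothetical, and the commented download loop assumes the NCBI SRA Toolkit (prefetch, fasterq-dump) is installed.

```shell
# Sketch: expand the SRA run accession ranges into one ID per line.
# expand_range and list_sra_runs are hypothetical helper names.
# Assumption: every accession inside each range belongs to this study.

# Print PREFIX followed by each integer from FIRST to LAST (inclusive).
expand_range() {
    local prefix="$1" i="$2" last="$3"
    while [ "$i" -le "$last" ]; do
        printf '%s%d\n' "$prefix" "$i"
        i=$((i + 1))
    done
}

list_sra_runs() {
    expand_range SRR 12168859 12168984   # Library 1
    expand_range SRR 12176017 12176170   # Library 2
}

# Hypothetical download step, assuming the NCBI SRA Toolkit is installed:
# list_sra_runs | while read -r run; do
#     prefetch "$run" && fasterq-dump --split-files --outdir raw_reads "$run"
# done
```

Files downloaded this way are named after the run accession, so they would still need to be renamed to match the sample IDs before running the workflow.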

With the data in place, you should be able to fully reproduce our analyses by following the steps outlined in the Jupyter notebook.

The workflow illustrated in the notebooks assumes that the raw Illumina data is present in a directory raw_reads at the base of the repository structure and that the files are named according to the following convention: 'sampleID-marker', followed by '_R1' or '_R2' to identify the forward/reverse read file respectively. SampleID must correspond to the first column in the file Sample_accessions.tsv here.
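The naming convention above can be checked before starting the workflow. The sketch below is not part of the repository. It assumes that sample IDs and markers contain no dots, that the marker contains no hyphen, and that anything after '_R1' or '_R2' is a file extension beginning with a dot. The helper names sample_id_from_name and check_raw_reads are hypothetical.

```shell
# Sketch: check raw read file names against the 'sampleID-marker_R1/_R2' convention.
# Helper names are hypothetical. Assumptions: no dots in sample IDs or markers,
# no hyphen in the marker, and any extension after _R1/_R2 starts with a dot.

# Print the sample ID encoded in a file name; return 1 if the name does not fit.
sample_id_from_name() {
    local name="${1##*/}"            # drop any leading directory
    case "$name" in
        ?*-?*_R[12] | ?*-?*_R[12].*) ;;   # fits the convention, optional extension
        *) return 1 ;;
    esac
    name="${name%%.*}"               # drop the extension, if any
    name="${name%_R[12]}"            # drop the read-direction suffix
    printf '%s\n' "${name%-*}"       # drop the trailing '-marker'
}

# Report files in DIR whose name is malformed or whose sample ID is missing
# from the first column of the tab-separated TABLE. Returns 1 if any problem is found.
check_raw_reads() {
    local dir="$1" table="$2" f id status=0
    for f in "$dir"/*; do
        [ -e "$f" ] || continue
        if ! id="$(sample_id_from_name "$f")"; then
            echo "name does not fit convention: $f"
            status=1
        elif ! cut -f1 "$table" | grep -qxF -e "$id"; then
            echo "sample ID '$id' not in $table: $f"
            status=1
        fi
    done
    return "$status"
}
```

A typical call from the base of the repository would be check_raw_reads raw_reads path/to/Sample_accessions.tsv, where the path to the accession table is a placeholder to adjust.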
