16S_QIIME2

This is a Snakemake-based 16S QIIME2 pipeline.

Installation

These instructions assume that Miniconda3 is already installed (https://docs.conda.io/en/latest/miniconda.html).

Clone this repository:

git clone https://github.com/scottdaniel/16S_QIIME2.git

Create a conda environment:

cd 16S_QIIME2
conda env create --name qiime2-amplicon-2026.1 --file environment.yml

If this doesn't work, or you're not on a Linux platform, you can install QIIME2 manually by following these instructions: https://library.qiime2.org/quickstart/amplicon

Before running the pipeline, activate the environment (currently based on qiime2-amplicon-2026.1):

conda activate qiime2-amplicon-2026.1

The following software also needs to be installed in the environment you created:
- dnabc (https://github.com/PennChopMicrobiomeProgram/dnabc)
- unassigner (https://github.com/kylebittinger/unassigner)

Required input files for the pipeline

The pipeline needs:

- Multiplexed R1/R2 read pairs (Undetermined_S0_L001_R1_001.fastq.gz, Undetermined_S0_L001_R2_001.fastq.gz)
- A QIIME2-compatible mapping file
  - Tab-delimited
  - The first two columns should be SampleID (or #SampleID) and BarcodeSequence
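
The mapping file requirements above can be checked before a run. The following is a minimal sketch, not part of the pipeline: check_mapping_header is a hypothetical helper, and the sample row and extra Description column in the demo are made up for illustration. It only inspects the header line.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): succeeds only if the first
# line is tab-delimited and its first two columns are SampleID (or #SampleID)
# and BarcodeSequence.
check_mapping_header() {
    awk -F '\t' '
        NR == 1 {
            seen = 1
            ok = ($1 == "SampleID" || $1 == "#SampleID") && $2 == "BarcodeSequence"
            exit    # only the header matters; jump to END
        }
        END {
            if (seen && ok) exit 0
            exit 1  # empty file or unexpected header
        }
    ' "$1"
}

# Demo on a throwaway file; the sample row is illustrative only.
demo_dir="$(mktemp -d)"
printf '#SampleID\tBarcodeSequence\tDescription\n' >  "$demo_dir/test_mapping_file.tsv"
printf 'sampleA\tACGTACGTACGT\tfirst sample\n'     >> "$demo_dir/test_mapping_file.tsv"

if check_mapping_header "$demo_dir/test_mapping_file.tsv"; then
    echo "mapping header looks OK"
else
    echo "mapping header does not match the expected layout" >&2
fi
rm -rf "$demo_dir"
```

A file that uses spaces instead of tabs fails this check, because awk then sees the whole header as a single column.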

Databases required

- QIIME2 classifier (https://library.qiime2.org/data-resources#qiime-2-2024-5-present)
- DADA2 training set (https://benjjneb.github.io/dada2/training.html)

How to run

1. Create a project directory, e.g. ~/16S_QIIME2/test, and put the mapping file, e.g. test_mapping_file.tsv, in it. If you are running on a cluster, stage the data on a scratch drive, e.g. /scr1/username.
2. Edit qiime2_config.yml to suit your project. In particular:
   - all: project: path to the project directory, e.g. ~/16S_QIIME2/test
   - all: mux_dir: the directory containing the multiplexed R1/R2 read pairs, e.g. ~/16S_QIIME2/test/multiplexed_fastq
   - all: mapping: the name of the mapping file, e.g. test_mapping_file.tsv
3. Edit config.yaml for platform-specific settings (currently formatted for SLURM on republica).
4. (Optional) Edit rules/targets/targets.rules to comment out steps you don't need (e.g. #TARGET_PICRUST2).
5. Activate the environment created during installation by entering conda activate qiime2-amplicon-2026.1, cd into 16S_QIIME2, and execute snakemake --profile ./
   - If using sbatch, you can just execute the script ./run_snakemake.bash
   - You can also do a dry run: ./dryrun_snakemake.bash
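
Before launching Snakemake, it can help to confirm that the files named in qiime2_config.yml are where the pipeline expects them. The sketch below assumes the layout described in the steps above (mapping file inside the project directory, read pairs inside mux_dir). check_project_inputs is a hypothetical helper, not a script shipped with this repository, and the demo uses empty placeholder files rather than real reads.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the pipeline).
# Arguments mirror the qiime2_config.yml entries: project, mux_dir, mapping.
check_project_inputs() {
    local project_dir="$1" mux_dir="$2" mapping_name="$3"
    local missing=0
    local path
    for path in \
        "$project_dir/$mapping_name" \
        "$mux_dir/Undetermined_S0_L001_R1_001.fastq.gz" \
        "$mux_dir/Undetermined_S0_L001_R2_001.fastq.gz"; do
        if [ ! -f "$path" ]; then
            echo "missing: $path" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Demo on a throwaway project directory with empty placeholder files.
project="$(mktemp -d)"
mkdir -p "$project/multiplexed_fastq"
: > "$project/test_mapping_file.tsv"
: > "$project/multiplexed_fastq/Undetermined_S0_L001_R1_001.fastq.gz"
: > "$project/multiplexed_fastq/Undetermined_S0_L001_R2_001.fastq.gz"

if check_project_inputs "$project" "$project/multiplexed_fastq" test_mapping_file.tsv; then
    echo "all inputs found; a dry run with ./dryrun_snakemake.bash is a reasonable next step"
fi
rm -rf "$project"
```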

Intermediate steps and corresponding input/output

Demultiplexing

Input

Multiplexed R1/R2 read pairs
QIIME2 compatible mapping file

Output

Demultiplexed fastq(.gz) files
Total read count summary (tsv)
QIIME2 compatible manifest file (csv)
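
To illustrate what a QIIME2-compatible CSV manifest looks like, here is a rough sketch that builds one from a directory of demultiplexed reads. It is not the pipeline's own code. The header sample-id,absolute-filepath,direction is the QIIME2 CSV manifest header for paired-end import; the file naming pattern <SampleID>_R1.fastq.gz / <SampleID>_R2.fastq.gz and the write_manifest helper are assumptions made for this example, and the actual demultiplexed file names may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): print a paired-end CSV
# manifest for every <SampleID>_R1.fastq.gz found in the given directory.
# The directory should be given as an absolute path, since the manifest
# column is absolute-filepath.
write_manifest() {
    local fastq_dir="$1"
    local r1 sample
    echo "sample-id,absolute-filepath,direction"
    for r1 in "$fastq_dir"/*_R1.fastq.gz; do
        [ -e "$r1" ] || continue    # the glob matched nothing
        sample="$(basename "$r1" _R1.fastq.gz)"
        echo "$sample,$r1,forward"
        echo "$sample,$fastq_dir/${sample}_R2.fastq.gz,reverse"
    done
}

# Demo with empty placeholder files; sample names are illustrative only.
demux_dir="$(mktemp -d)"
for s in sampleA sampleB; do
    : > "$demux_dir/${s}_R1.fastq.gz"
    : > "$demux_dir/${s}_R2.fastq.gz"
done
write_manifest "$demux_dir"
rm -rf "$demux_dir"
```

Each sample contributes two rows, one per read direction, so a run with N samples yields a manifest of 2N rows plus the header.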

QIIME2 import

Input

QIIME2 compatible manifest file
Demultiplexed fastq files

Output

QIIME2 PairedEndSequencesWithQuality artifact and corresponding visualization
QIIME2-generated demultiplexing stats

DADA2 denoise

Input

QIIME2 PairedEndSequencesWithQuality artifact

Output

Feature table (QIIME2 artifact, tsv)
Representative sequences (QIIME2 artifact, fasta)

Taxonomy classification

Input

Representative sequences

Output

Taxonomy classification table (QIIME2 artifact, tsv)

Tree building

Input

Representative sequences

Output

Aligned sequence
Masked (aligned) sequence
Unrooted tree
Rooted tree

Diversity calculation

Input

Rooted tree

Output

Various QIIME2 diversity metric artifacts
Faith phylogenetic diversity vector (tsv)
Weighted/unweighted UniFrac distance matrices (tsv)

Unassigner

Input

Representative sequences (fasta)

Output

Unassigner output (tsv) for species level classification of representative sequences

dada2_species

Input

Representative sequences (fasta)

Output

DADA2 species assignments (tsv)
Raw DADA2 data for loading into R (RData format)

vsearch

Input

Representative sequences (fasta)

Output

VSEARCH report (tsv), customized to resemble BLAST results (see config.yml)
Representative sequences that aligned in VSEARCH (fasta)

Updating qiime2

1. Manually install a new version of QIIME2 using conda (https://library.qiime2.org/quickstart/amplicon).
2. Activate the environment and additionally install snakemake, dnabc, and unassigner.
3. Update environment.yml (replace myenv with the name of your new environment):

   conda activate myenv
   conda env export > environment.yml

4. Commit and push your changes (to your own fork) and create a pull request for scottdaniel/16S_QIIME2.
