Software Installation and Management

Installing essential software for Nanopore sequencing

To operate a Nanopore sequencer and analyze sequencing data, three key software tools are required:

MinKNOW: The primary control software for Oxford Nanopore sequencing devices.
EPI2ME: A cloud-based or local analysis platform for processing sequencing data.
Docker: A containerization platform required for running some local bioinformatics workflows.

Below are step-by-step instructions for downloading and installing each tool on macOS and Windows.

Installing MinKNOW

MinKNOW is used to control Nanopore sequencing devices, perform basecalling, and monitor sequencing progress.

macOS Installation

Visit the Oxford Nanopore Downloads Page and choose your device.
Select macOS for Apple silicon or Intel, then click Download. To check which chip your Mac has, choose Apple menu > About This Mac and look at the Chip listed.
Sign in to your Oxford Nanopore account (or create one if you don’t have an account); your download should begin.
Open the downloaded .dmg file and drag the MinKNOW application to the Applications folder.
Open MinKNOW from the Applications folder.
Follow the on-screen setup instructions and allow necessary permissions when prompted.
Restart your computer to ensure all components load correctly.
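
The chip check in the steps above can also be done from Terminal. The snippet below is a minimal sketch: `arch_label` is a hypothetical helper name, and it assumes `uname -m` reports `arm64` on Apple silicon and `x86_64` on Intel Macs. A Terminal session running under Rosetta may report `x86_64` even on Apple silicon, so treat About This Mac as the authoritative answer.

```shell
# Map the machine architecture string to the label used on download pages.
# arch_label is a hypothetical helper, not part of macOS or MinKNOW.
arch_label() {
  case "$1" in
    arm64)  echo "Apple silicon" ;;
    x86_64) echo "Intel" ;;
    *)      echo "unknown ($1)" ;;
  esac
}

# Report the architecture of the current machine.
arch_label "$(uname -m)"
```

If the result is "unknown", the machine is probably not a Mac, and the macOS download does not apply.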

Windows Installation

Visit the Oxford Nanopore Downloads Page and choose your device.
Select the latest MinKNOW for Windows and click Download.
Sign in to your Oxford Nanopore account (or create one if you don’t have an account); your download should begin.
Open the downloaded .exe file and follow the on-screen installation instructions.
Restart your computer if prompted.
Open MinKNOW and ensure it launches properly.

Installing EPI2ME

EPI2ME is a workflow management tool used for analyzing Nanopore sequencing data.

macOS Installation

Visit the EPI2ME Labs download page.
Select macOS for Apple silicon (M1/M2) or Intel, then click Download. To check which chip your Mac has, choose Apple menu > About This Mac and look at the Chip listed.
Open the .dmg file and drag the EPI2ME application to the Applications folder.
Open EPI2ME from the Applications folder.
Sign in with an existing account or create a new account if needed.
Follow the initial setup process to ensure EPI2ME is ready for use.

Windows Installation

Visit the EPI2ME Labs download page.
Download EPI2ME for Windows.
Open the downloaded .exe file and follow the on-screen installation instructions.
Launch EPI2ME and sign in with an existing account or create a new one.
Complete the initial setup and confirm that the software runs correctly.

Tip

On Windows, EPI2ME requires the Windows Subsystem for Linux (WSL); please see these instructions for installation.

Installing Docker

Docker is required to run many EPI2ME workflows locally.

macOS Installation

Visit the Docker website and scroll to find the Download Docker Desktop button. Select Download for Mac for Apple silicon or Intel chip. To check which chip your Mac has, choose Apple menu > About This Mac and look at the Chip listed.
Open the .dmg file and drag the Docker application to the Applications folder.
Open Docker from the Applications folder.
Follow the setup process and allow necessary permissions when prompted.
Once Docker is running, verify the installation by opening a terminal and running:

docker --version

Windows Installation

Download Docker Desktop for Windows.
Open the downloaded .exe file and follow the setup instructions.
During installation, enable WSL 2 (Windows Subsystem for Linux) when prompted. If WSL 2 is not installed, follow the instructions provided by Docker to install it.
Restart your computer when the installation is complete.
Open Docker Desktop from the Start menu.
Verify the installation by opening Command Prompt and running:

docker --version
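
`docker --version` only confirms that the command-line client is installed. As a hedged follow-up for a POSIX shell (macOS Terminal, or a WSL shell on Windows), the sketch below also asks whether the Docker daemon is responding by calling `docker info`. `check_docker` is a hypothetical helper name, not part of Docker, and the sketch assumes `docker info` exits with a non-zero status when it cannot reach the daemon.

```shell
# Check that the docker CLI is on PATH and that the daemon answers.
# check_docker is a hypothetical helper written for this example.
check_docker() {
  if ! command -v docker >/dev/null 2>&1; then
    echo "docker: not found on PATH"
    return 1
  fi
  if docker info >/dev/null 2>&1; then
    echo "docker: CLI found and daemon is responding"
  else
    echo "docker: CLI found, but the daemon is not responding (is Docker Desktop open?)"
    return 2
  fi
}

# Run the check; '|| true' keeps a failed check from aborting a script.
check_docker || true
```

If the daemon is not responding, opening Docker Desktop and waiting for it to finish starting is usually the first thing to try.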

Other software packages for Nanopore sequencing and genomics analysis

Below is a curated list of 40 software tools that are widely used for Nanopore sequencing and genomics data analysis. The selection is based on recent updates to their documentation (within the past two years) and current usage in the field. The tools are grouped by primary function for easier navigation, and each entry includes a brief description and a link to the tool's documentation or repository. Some are more advanced to use (for example, they run from the command line), but you are likely to encounter them or want to learn more about their capabilities.

Quality Control and Preprocessing

FastQC
Performs quality control checks on raw sequencing data in FASTQ format.
MultiQC
Aggregates reports from multiple QC tools into a single summary.
NanoPlot
Generates visualizations and statistics for assessing Nanopore sequencing data quality.
Seqtk
A lightweight tool for processing FASTA/FASTQ files, including format conversion and subsampling.
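
Before reaching for any of these tools, it can help to know what a FASTQ file looks like: each read is conventionally stored as four lines (header, sequence, separator, quality string). As a small sketch under that assumption, the snippet below counts reads with standard `awk`. `count_fastq_reads` is a hypothetical helper, and the two reads in the demo file are made up for illustration. Gzip-compressed files or records that wrap across more than four lines would need a dedicated tool such as Seqtk instead.

```shell
# Count reads in an uncompressed FASTQ file, assuming 4 lines per record.
# count_fastq_reads is a hypothetical helper written for this example.
count_fastq_reads() {
  awk 'END { print NR / 4 }' "$1"
}

# Demo on a tiny two-read file (made-up sequences, for illustration only).
demo="$(mktemp)"
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n' > "$demo"
count_fastq_reads "$demo"   # prints 2
rm -f "$demo"
```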

Basecalling and Signal Processing

Dorado
High-performance basecaller for Nanopore sequencing data.
Bonito
A deep learning-based basecaller for improved accuracy.
Nanopolish
Analyzes raw signal data to refine consensus sequences and detect base modifications.

Genome Assembly

Flye
A de novo assembler optimized for Nanopore and PacBio long reads. Probably the best place to start for assembly with Nanopore reads.
Canu
Assembles genomes using long-read sequencing data.
SPAdes
A versatile genome assembler that supports hybrid assemblies combining short and long reads.
Unicycler
A hybrid assembly pipeline combining short and long reads for high-quality microbial genome assemblies.

Genome Alignment

Minimap2
An efficient aligner for mapping long sequencing reads to a reference genome.
BWA-MEM2
A faster, more efficient version of BWA-MEM for read alignment.
STAR
A fast aligner optimized for RNA sequencing.

Variant Calling and Genome Polishing

Medaka
Polishes consensus sequences generated from Nanopore sequencing data.
Longshot
A variant caller optimized for long-read sequencing.
FreeBayes
A haplotype-based variant caller for detecting genetic variations.
DeepVariant
Uses deep learning to improve variant calling accuracy.
Pilon
Improves genome assemblies by correcting errors with short-read data.
Racon
A consensus module for polishing genome assemblies from long-read sequencing.

Taxonomic and Metagenomic Analysis

Kraken2
A taxonomic classifier that assigns microbial reads to taxa using k-mer-based algorithms.
MEGAN
Enables taxonomic and functional analysis of metagenomic data.
MetaPhlAn
A tool for profiling microbial communities from metagenomic data.

Genome Visualization and Annotation

IGV (Integrative Genomics Viewer)
A high-performance visualization tool for large-scale genomic data.
UCSC Genome Browser
A web-based genome browser for accessing and visualizing genomic annotations.
BEDTools
A toolkit for genome arithmetic and feature comparisons.
SnpEff
Annotates genetic variants and predicts their functional impact.
VCFtools
A collection of utilities for processing and analyzing Variant Call Format (VCF) files.
PLINK
A toolset for population-based genetic association studies.
QUAST
Evaluates and compares genome assemblies.

Workflow Management and Reproducibility

Nextflow
A workflow management system for scalable and reproducible bioinformatics pipelines.
Snakemake
A workflow engine for creating reproducible bioinformatics pipelines.
Bioconda
A Conda channel that simplifies installation of bioinformatics tools.
Galaxy
A web-based platform for running bioinformatics workflows.
CyVerse
A cloud-based data management and computation platform for large-scale genomics projects.

General-Purpose Computational Tools

Docker
A containerization platform for reproducible bioinformatics environments.
Jupyter Notebook
An interactive computing environment for running and sharing bioinformatics code.
RStudio
A popular IDE for R, widely used in bioinformatics and data visualization.
NCBI SRA Toolkit
A command-line toolset for accessing sequencing data from the NCBI Sequence Read Archive.
Jetstream2
A cloud-based computing resource providing access to GPUs and bioinformatics tools.
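
Many of the command-line tools listed above may or may not already be installed on a given machine. The sketch below is one way to check from a POSIX shell. `report_tools` is a hypothetical helper, and the executable names passed to it (`docker`, `minimap2`, `flye`, `NanoPlot`, `seqtk`) are assumptions about how each tool is invoked; adjust them to match your installation.

```shell
# Print whether each named command can be found on PATH.
# report_tools is a hypothetical helper written for this example.
report_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: missing"
    fi
  done
}

# Executable names below are assumptions; edit the list as needed.
report_tools docker minimap2 flye NanoPlot seqtk
```

A tool reported as missing is not necessarily a problem: it may live inside a Conda environment that is not currently active, or inside a container.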

For further reading see Awesome Bioinformatics.

Building more advanced bioinformatics skills – The Carpentries

The Carpentries is a global community dedicated to teaching foundational coding and data science skills through hands-on, interactive workshops. Their lessons are particularly valuable for researchers and educators looking to build computational skills for bioinformatics and genomics. The Software Carpentry and Data Carpentry initiatives offer structured lessons on command-line tools, version control, and programming languages like Python and R. Lessons are self-paced, and in-person workshops are available on every continent!

Additional Reading

The Genomics Data Carpentry lessons provide an excellent introduction to managing and analyzing sequencing data, covering topics such as the UNIX command line, working with FASTQ files, quality control, and genome assembly. These lessons are designed for beginners and offer an accessible entry point for educators and students looking to incorporate bioinformatics into their work.

Comments and discussion

See recent comments or start a discussion on our Slack channel.
