This is an example of a Snakemake workflow that:
Snakemake functionality is provided through a command line tool called byok8s, which lets you do the following (abbreviated for clarity):
    # Create virtual k8s cluster
    minikube start

    # Run the workflow
    byok8s --s3-bucket=mah-s3-bukkit my-workflowfile my-paramsfile

    # Clean up the virtual k8s cluster
    minikube stop
The user provides the Snakemake workflow as a Snakefile, and Snakemake runs its tasks on a Kubernetes (k8s) cluster. The user is also expected to provide that cluster (byok8s = Bring Your Own Kubernetes).
The example above uses minikube to make a virtual k8s cluster, which is useful for testing.
For real workflows, your options for Kubernetes clusters are cloud providers (see the table of cloud providers below).
The Travis CI tests utilize minikube to run test workflows.
This quickstart runs through the installation and usage of byok8s.

Step 1: Set up a Kubernetes cluster with minikube.
Step 2: Install byok8s.
Step 3: Run the byok8s workflow using the Kubernetes cluster.
Step 4: Tear down the Kubernetes cluster with minikube.
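As a rough illustration of how the four steps fit together, the sketch below collects the commands used in this quickstart into one ordered list. The function name `quickstart_commands` is hypothetical and not part of byok8s. The sketch only prints the commands and does not run them, and it leaves out directory changes such as `cd test`.

```python
import shlex


def quickstart_commands(bucket, workflow, params):
    """Return the quickstart commands, in order, as argument lists.

    Hypothetical helper for illustration only; byok8s does not ship it.
    """
    return [
        # Step 1: create the virtual Kubernetes cluster
        ["minikube", "start"],
        # Step 2: install the dependencies, then byok8s itself
        ["pip", "install", "-r", "requirements.txt"],
        ["python", "setup.py", "build", "install"],
        # Step 3: run the workflow on the cluster
        ["byok8s", f"--s3-bucket={bucket}", workflow, params],
        # Step 4: tear the cluster down
        ["minikube", "stop"],
    ]


if __name__ == "__main__":
    # Print each command as a shell-quoted line; nothing is executed.
    for argv in quickstart_commands("mah-bukkit", "workflow-alpha", "params-blue"):
        print(shlex.join(argv))
```

Keeping each command as an argument list makes it straightforward to hand them to `subprocess.run` later if you want to script the whole run.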
For the purposes of the quickstart, we will walk through how to set up a local, virtual Kubernetes cluster using minikube.
Start by installing minikube:
Once it is installed, you can start up a Kubernetes cluster with minikube using the following commands:

    cd test
    minikube start
NOTE: If you are running on AWS, run this command first:

    minikube config set vm-driver none

This sets the VM driver to none, so minikube runs the cluster with the native Docker on the host instead of inside a virtual machine.
If you are running on AWS, the DNS in the minikube Kubernetes cluster will not work, so run these commands to fix the DNS settings (they should be run from the directory containing fixcoredns.yml):

    kubectl apply -f fixcoredns.yml
    kubectl delete --all pods --namespace kube-system
Start by setting up a Python virtual environment, then install the required packages into it:

    pip install -r requirements.txt
This installs the snakemake and kubernetes Python modules. Now install the byok8s command line tool:

    python setup.py build install
After the install finishes, the byok8s command should be available inside your virtual environment.
This command line utility will expect a kubernetes cluster to be set up before it is run.
Setting up a kubernetes cluster will create… (fill in more info here)…
Snakemake will automatically create the pods in the cluster, so you just need to allocate a kubernetes cluster.
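Because byok8s expects the cluster to exist before it runs, it can help to check that the cluster is reachable first. The sketch below is one possible way to do that, assuming `kubectl` is installed and its current context points at your cluster. The function name `cluster_is_reachable`, the injectable `run` parameter, and the 30 second timeout are illustrative choices, not part of byok8s.

```python
import subprocess


def cluster_is_reachable(run=subprocess.run):
    """Return True if `kubectl cluster-info` exits with status 0.

    `run` defaults to subprocess.run and can be swapped out for testing.
    Hypothetical helper; assumes kubectl is configured for your cluster.
    """
    try:
        result = run(
            ["kubectl", "cluster-info"],
            stdout=subprocess.DEVNULL,  # only the exit status is needed
            stderr=subprocess.DEVNULL,
            timeout=30,  # assumed limit; adjust for slow clusters
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # kubectl is not installed, or the cluster did not answer in time
        return False
    return result.returncode == 0
```

A script could call `cluster_is_reachable()` and stop with a clear message before invoking byok8s if it returns `False`.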
Now you can run the workflow with the byok8s command. This submits the Snakemake workflow jobs to the Kubernetes cluster that minikube created.
You should have your workflow in a Snakefile in the current directory. Use the --snakefile flag if it is named something other than Snakefile.
You will also need to specify your AWS credentials via environment variables. These are used to access S3 buckets for file I/O.
Finally, you will need to create an S3 bucket for Snakemake to use for file I/O. Pass the name of the bucket using the --s3-bucket flag.
Start by exporting these two variables (be careful to scrub them from your bash history):

    export AWS_ACCESS_KEY_ID=XXXXX
    export AWS_SECRET_ACCESS_KEY=XXXXX
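A missing credential is easier to spot before the workflow starts than partway through a job. The following is a minimal sketch of such a check, using only the two variable names given above. The helper `missing_aws_vars` is hypothetical and not something byok8s provides.

```python
import os

# The two variables this guide asks you to export.
REQUIRED_AWS_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(env=None):
    """Return the names of required AWS variables that are unset or empty.

    `env` defaults to os.environ; pass a dict to check other settings.
    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_aws_vars()
    if missing:
        print("Missing AWS variables: " + ", ".join(missing))
    else:
        print("AWS credential variables are set.")
```

The check only confirms that the variables are present; it does not verify that the credentials are valid or that they grant access to the bucket.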
Run the alpha workflow with blue params:

    byok8s --s3-bucket=mah-bukkit workflow-alpha params-blue

Run the alpha workflow with red params:

    byok8s --s3-bucket=mah-bukkit workflow-alpha params-red

Run the gamma workflow with red params, etc.:

    byok8s --s3-bucket=mah-bukkit workflow-gamma params-red
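Since each run pairs one workflow with one parameter set, the combinations can be generated rather than typed out. The sketch below builds the byok8s invocations for every pairing, using only the `--s3-bucket` flag and the positional arguments shown above. The function name `byok8s_commands` is hypothetical, and the sketch builds the commands without running them.

```python
from itertools import product


def byok8s_commands(bucket, workflows, params):
    """Return one byok8s argument list per (workflow, params) pair.

    Hypothetical helper for illustration only; nothing is executed here.
    """
    return [
        ["byok8s", f"--s3-bucket={bucket}", workflow, param]
        for workflow, param in product(workflows, params)
    ]


if __name__ == "__main__":
    commands = byok8s_commands(
        "mah-bukkit",
        ["workflow-alpha", "workflow-gamma"],
        ["params-blue", "params-red"],
    )
    for argv in commands:
        print(" ".join(argv))
```

To run the combinations one after another, each argument list could be passed to `subprocess.run(argv, check=True)`, which stops at the first failing workflow.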
(NOTE: May want to let the user specify input and output directories with flags.)
All input files are searched for relative to the working directory.
The last step, once the workflow has finished, is to tear down the Kubernetes cluster. The virtual Kubernetes cluster created by minikube can be torn down with the following command:

    minikube stop
| Cloud Provider | Kubernetes Service | Guide |
|---|---|---|
| Minikube (on AWS EC2) | Minikube | Minikube AWS Guide |
| Google Cloud Platform (GCP) | Google Kubernetes Engine (GKE) | GCP GKE Guide |
| Amazon Web Services (AWS) | Elastic Kubernetes Service (EKS) | AWS EKS Guide |
| DigitalOcean (DO) | DO Kubernetes (DOK) | DO DOK Guide |