# Containers for HPC

## Intro to Docker

**Docker** is a tool that packages applications and their dependencies into containers, ensuring they run the same way on any system. It's useful because it:

1. **Ensures Consistency** across different environments (e.g., development, testing, production).
2. **Isolates Applications**, preventing conflicts and improving security.
3. **Makes Deployment Easy** by packaging everything needed to run an app in one container.
4. **Increases Efficiency** since containers are lightweight and use fewer resources than traditional virtual machines.

In short, Docker simplifies running and deploying applications by making them portable and consistent.

***

{% hint style="warning" %}
The following steps require Docker to be installed and a Docker Hub account. You can skip them if you don't have either.
{% endhint %}

## Download sample code

```
git clone https://github.com/sanjeev-one/Intro-to-Supercomputing-24---Duke-IEEE.git
```

The Dockerfile tells Docker how to set up the container:

```dockerfile
# Use the official Python image with version 3.8
FROM python:3.8-slim

# Set the working directory in the container
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . .

# Install any dependencies specified in requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Run ml_app.py when the container launches
CMD ["python", "ml_app.py"]
```

### Build and Run the Docker Container

#### Build the Docker Image:

In a local terminal, change into the directory that contains `ml_app.py` and run the build. (You may first need to log in with `docker login` using a [Docker Hub](https://hub.docker.com/) account.)

```bash
docker build -t ml-app .
```

#### Run the Container Locally:

```bash
docker run ml-app
```

This will print the model's accuracy and the prediction for the sample flower measurements in your terminal.

#### Push to Docker Hub and Deploy on a VM (skip if doing the workshop)

Log in to Docker Hub (requires a Docker Hub account, see <https://hub.docker.com/account>):

```bash
docker login
```

Tag and push the image (replace `username` with your Docker Hub username):

```bash
docker tag ml-app username/ml-app
docker push username/ml-app
```
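
To avoid editing `username` in each command by hand, the username can be kept in a shell variable. The following is a minimal sketch, not part of the workshop repository: the helper `image_ref` and the placeholder `your-username` are hypothetical names introduced here for illustration.

```bash
# Hypothetical helper (not part of the workshop repo): build a Docker Hub
# image reference of the form <username>/<image>:<tag>.
# The tag defaults to "latest" when the third argument is omitted.
image_ref() {
    printf '%s/%s:%s\n' "$1" "$2" "${3:-latest}"
}

# Placeholder: set DOCKER_USER to your Docker Hub username before running.
DOCKER_USER="${DOCKER_USER:-your-username}"
REF="$(image_ref "$DOCKER_USER" ml-app)"
echo "Image reference: $REF"

# Tag and push only when the docker CLI is available on this machine.
if command -v docker >/dev/null 2>&1; then
    docker tag ml-app "$REF" && docker push "$REF"
else
    echo "docker not found; skipping tag and push"
fi
```

Quoting `"$REF"` keeps the reference intact as a single argument, and the `command -v` guard lets the snippet run harmlessly on a machine without Docker.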

## On the VM, Pull and Run the Image:

You can follow the [Jetstream 2 tutorial](/workshops/jetstream-2-tutorial.md) to connect to a Jetstream 2 VM.

1. **Pull the Docker Image**:

   ```bash
   docker pull dukeieee/ml-app
   ```

   This command downloads the Docker image `dukeieee/ml-app` from Docker Hub to your local system or VM. The image contains the Python ML app and its required dependencies.
2. **Run the Docker Container**:

   ```bash
   docker run dukeieee/ml-app
   ```

   This command starts a container using the downloaded image. The app trains a machine learning model on the Iris dataset and prints the model's accuracy and predictions directly to your terminal.

This approach ensures that the app runs with all necessary dependencies, regardless of the environment, providing a consistent and reproducible setup.

## HPC Specific Variants

### Apptainer

Both Apptainer and Docker are containerization tools, but they have different primary use cases and features:

* **Apptainer**:
  * Formerly known as Singularity, it's designed for high-performance computing (HPC) environments.
  * Focuses on user-level container management, which does not require root privileges.
  * Highly compatible with HPC batch systems and allows seamless integration into shared file systems.
  * Emphasizes security, enabling users to securely run containers without additional system privileges.
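
Because of the rename, a cluster may expose the tool as `apptainer`, as `singularity`, or both (Apptainer installations commonly keep a `singularity` compatibility command). The sketch below checks which one is on your `PATH`; the helper name `container_cli` is hypothetical and only for illustration.

```bash
# Hypothetical helper: print the name of the first container CLI found
# on PATH. Returns non-zero when neither command is installed.
container_cli() {
    for cli in apptainer singularity; do
        if command -v "$cli" >/dev/null 2>&1; then
            echo "$cli"
            return 0
        fi
    done
    return 1
}

if CLI="$(container_cli)"; then
    # Both tools accept --version
    "$CLI" --version
else
    echo "Neither apptainer nor singularity found on PATH"
fi
```

The commands later on this page use the `singularity` name, as it appears on the TAMU clusters.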

***

## Running a container on TAMU FASTER

Authorized ACCESS users can log in using the Web Portal:

{% embed url="<https://portal-faster-access.hprc.tamu.edu>" %}

On a login node:

```bash
srun --nodes=1 --ntasks-per-node=4 --mem=30G --time=01:00:00 --pty bash -i
#(wait for job to start)
```

On a compute node:

```bash
cd $SCRATCH
export SINGULARITY_CACHEDIR=$TMPDIR/.singularity
module load WebProxy
singularity pull hello-world.sif docker://hello-world
singularity pull ml-app.sif docker://dukeieee/ml-app
#(wait for download and convert)
exit
```

**Example on Grace, batch job**

Create a file named `singularity_pull.sh`:

```bash
#!/bin/bash

## JOB SPECIFICATIONS
#SBATCH --job-name=singularity_pull  #Set the job name to "singularity_pull"
#SBATCH --time=01:00:00              #Set the wall clock limit to 1hr
#SBATCH --nodes=1                    #Request 1 node
#SBATCH --ntasks=4                   #Request 4 tasks
#SBATCH --mem=30G                    #Request 30GB per node
#SBATCH --output=singularity_pull.%j #Send stdout/err to "singularity_pull.[jobID]"

# set up environment for download
cd $SCRATCH
export SINGULARITY_CACHEDIR=$TMPDIR/.singularity
module load WebProxy

# execute download
singularity pull hello-world.sif docker://hello-world
singularity pull ml-app.sif docker://dukeieee/ml-app
```

On a login node:

```bash
sbatch singularity_pull.sh

#wait to complete
```

```bash
cd $SCRATCH
ls -lh *.sif   #see the downloaded image files
```
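
If you want a clearer report than a directory listing, a small check can confirm that both images were written. This is a sketch under the assumption that the pull steps above were run in `$SCRATCH`; the helper name `check_images` is hypothetical.

```bash
# Hypothetical helper: report which of the expected image files exist
# in a directory. Returns non-zero if any of them is missing.
check_images() {
    dir="$1"
    shift
    missing=0
    for img in "$@"; do
        if [ -f "$dir/$img" ]; then
            echo "found:   $img"
        else
            echo "missing: $img"
            missing=1
        fi
    done
    return "$missing"
}

# Falls back to the current directory when $SCRATCH is not set.
check_images "${SCRATCH:-.}" hello-world.sif ml-app.sif \
    || echo "Some images are missing; check the job output file"
```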

***

### Interact with container <a href="#interact-with-container" id="interact-with-container"></a>

{% hint style="info" %}
Make sure you are on a compute node:\
`srun --pty --time=00:30:00 --mem=10G --ntasks=1 bash -i`
{% endhint %}

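When a container image file is in place at HPRC, it can be used to control your environment for computational tasks.

Some of these examples use a container image `almalinux.sif` built from <https://hub.docker.com/_/almalinux>, a lightweight image of AlmaLinux, which is a derivative of Red Hat Enterprise Linux. The pull steps above do not download it; the same pattern should work, for example `singularity pull almalinux.sif docker://almalinux`.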

#### Shell <a href="#shell" id="shell"></a>

The shell command allows you to spawn a new shell within your container and interact with it one command at a time. Don't forget to `exit` when you're done.

```
singularity shell <image.sif>
```

Example:

```
singularity shell ml-app.sif
```

#### Executing commands <a href="#executing-commands" id="executing-commands"></a>

The *exec* command allows you to execute a custom command within a container by specifying the image file and the command.

```
singularity exec <image.sif> <command>
```

The command can refer to an executable installed inside the container, or to a script located on a mounted cluster filesystem (see [Files in and outside a container](https://hprc.tamu.edu/kb/Software/Singularity/#files-in-and-outside-a-container)).

Example program installed inside image:

```
singularity exec almalinux.sif bash --version
```

Example executable file `myscript.sh`:

* starts with `#!/usr/bin/env bash`
* has the executable permission set by `chmod u+x myscript.sh`
* is located in the current directory

```
singularity exec almalinux.sif ./myscript.sh
```
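
For a concrete `myscript.sh` that meets the three conditions above, here is a minimal sketch. The script contents are illustrative only; it prints the operating system name, so running it on the host and inside the container should show different results.

```bash
# Create a small example script in the current directory.
# The quoted 'EOF' keeps the shell from expanding anything inside it.
cat > myscript.sh <<'EOF'
#!/usr/bin/env bash
# Print the OS name as seen by whoever runs this script.
if [ -r /etc/os-release ]; then
    . /etc/os-release
    echo "OS: $PRETTY_NAME"
else
    echo "OS: unknown"
fi
EOF

# Give the owner execute permission.
chmod u+x myscript.sh

# Run on the host first for comparison.
./myscript.sh

# Run inside the container when singularity and the image are available.
if command -v singularity >/dev/null 2>&1 && [ -f almalinux.sif ]; then
    singularity exec almalinux.sif ./myscript.sh
fi
```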

#### Running a container <a href="#running-a-container" id="running-a-container"></a>

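Execute the default [runscript](https://sylabs.io/guides/latest/user-guide/quick_start.html#running-a-container) defined in the container. For images pulled from Docker Hub, the runscript is derived from the image's `ENTRYPOINT` and `CMD`, so `ml-app.sif` runs `python ml_app.py`: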

```
singularity run hello-world.sif
```

```
singularity run --pwd /app ml-app.sif
```

{% hint style="info" %}
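`--pwd /app` sets the working directory inside the container. It is needed here because `ml_app.py` lives in `/app` (the Dockerfile's `WORKDIR`), while Singularity otherwise starts in your current host directory.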
{% endhint %}
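
The same `singularity run` command can also be submitted as a batch job instead of being typed on a compute node. The sketch below writes and submits a job script modeled on `singularity_pull.sh`; the file name `run_ml_app.sh` is hypothetical, the resource requests are borrowed from the interactive `srun` hint above, and it assumes `ml-app.sif` is already in `$SCRATCH`.

```bash
# Write a hypothetical job script, run_ml_app.sh.
# The quoted 'EOF' keeps $SCRATCH unexpanded until the job runs.
cat > run_ml_app.sh <<'EOF'
#!/bin/bash

## JOB SPECIFICATIONS
#SBATCH --job-name=run_ml_app   #Set the job name to "run_ml_app"
#SBATCH --time=00:30:00         #Set the wall clock limit to 30 min
#SBATCH --nodes=1               #Request 1 node
#SBATCH --ntasks=1              #Request 1 task
#SBATCH --mem=10G               #Request 10GB per node
#SBATCH --output=run_ml_app.%j  #Send stdout/err to "run_ml_app.[jobID]"

# run the container from the directory that holds the image
cd $SCRATCH
singularity run --pwd /app ml-app.sif
EOF

# Submit the job when Slurm is available (for example on a login node).
if command -v sbatch >/dev/null 2>&1; then
    sbatch run_ml_app.sh
else
    echo "sbatch not found; submit run_ml_app.sh from a cluster login node"
fi
```

When the job finishes, the model's accuracy and prediction should appear in the `run_ml_app.[jobID]` output file rather than in your terminal.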

{% hint style="info" %}
Want to learn more? <https://www.deeplearningwizard.com/language_model/containers/hpc_containers_apptainer/#available-containers-definition-files>
{% endhint %}


