HEAVY.AI Installation using Docker on Ubuntu

Follow these steps to install HEAVY.AI as a Docker container on a machine that uses Ubuntu as the host OS, running either on CPU only or with supported NVIDIA GPU cards.

Preparation

Prepare your host by installing Docker and, if your configuration needs them, the NVIDIA drivers and NVIDIA runtime.

Install Docker

Remove any existing Docker installations and, if you are on GPU, the legacy NVIDIA Docker runtime.

sudo docker volume ls -q -f driver=nvidia-docker \
| xargs -r -I{} -n1 sudo docker ps -q -a -f volume={} | xargs -r sudo docker rm -f
sudo apt-get purge nvidia-docker
sudo apt-get remove docker docker-engine docker.io containerd runc

Use curl to add Docker's GPG key.

sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg \
| sudo apt-key add -

Add the Docker repository to your Apt sources.

sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"

Update your package index.

sudo apt update

Install Docker, the command line interface, and the container runtime.

sudo apt install docker-ce docker-ce-cli containerd.io

Run the following usermod command so that docker commands do not require sudo privileges (recommended). Log out and log back in for the change to take effect.

sudo usermod --append --groups docker $USER

Verify your Docker installation.

sudo docker run hello-world

For more information on Docker installation, see the Docker Installation Guide.
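
The group change made by usermod applies only to new login sessions. The following is a minimal sketch for checking whether the current session already has the docker group; has_group is a hypothetical helper name, not part of Docker.

```shell
# has_group is a hypothetical helper: succeeds if the space-separated
# group list in $1 contains the group name in $2.
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

# `id -nG` prints the groups of the current session, so this reports
# whether the new membership is already active.
if has_group "$(id -nG)" docker; then
    echo "docker group active: docker commands should work without sudo"
else
    echo "docker group not active yet: log out and log back in"
fi
```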

Install NVIDIA Drivers and NVIDIA Container ᴳᴾᵁ ᴼᴾᵀᴵᴼᴺ

Install the NVIDIA driver and CUDA Toolkit by following Install NVIDIA Drivers and Vulkan on Ubuntu.

Install NVIDIA Docker Runtime

Use curl to add NVIDIA's GPG key:

curl --silent --location https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
sudo apt-key add -

Update your sources list:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl --silent --location https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
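
The distribution variable above is built by sourcing /etc/os-release and joining two of its fields. The sketch below shows the same technique on a throwaway sample file so the result is easy to inspect; distro_id is a hypothetical helper name and the sample values are assumptions for illustration.

```shell
# distro_id is a hypothetical helper: prints ID and VERSION_ID from an
# os-release style file, joined without a separator.
distro_id() {
    # Source the file in a subshell so its variables do not leak out.
    ( . "$1" && echo "$ID$VERSION_ID" )
}

# A throwaway sample file stands in for /etc/os-release here.
sample=$(mktemp)
printf 'ID=ubuntu\nVERSION_ID="20.04"\n' > "$sample"
distro_id "$sample"   # prints ubuntu20.04
rm -f "$sample"
```

On the host itself, distro_id /etc/os-release shows the value that ends up in the repository URL.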

Update the package index and install nvidia-container-runtime:

sudo apt-get update
sudo apt-get install -y nvidia-container-runtime

Edit /etc/docker/daemon.json to add the following, and save the changes:

{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
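
After saving the file, a quick check that the default runtime entry is present can catch a misplaced edit. This is a minimal sketch: default_runtime_is_nvidia is a hypothetical helper, and it does a plain text match only, so it does not validate the JSON syntax.

```shell
# default_runtime_is_nvidia is a hypothetical helper: succeeds if the file
# in $1 contains a "default-runtime": "nvidia" entry. This is a text match,
# not a JSON parse.
default_runtime_is_nvidia() {
    grep -Eq '"default-runtime"[[:space:]]*:[[:space:]]*"nvidia"' "$1"
}

# Only inspect the real file when it exists on this machine.
if [ -f /etc/docker/daemon.json ]; then
    if default_runtime_is_nvidia /etc/docker/daemon.json; then
        echo "default-runtime is set to nvidia"
    else
        echo "default-runtime entry for nvidia not found"
    fi
fi
```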

Signal the Docker daemon to reload its configuration:

sudo pkill -SIGHUP dockerd

Check NVIDIA Drivers

Verify that Docker and the NVIDIA runtime work together.

sudo docker run --gpus=all \
--rm nvidia/cuda:11.0-runtime-ubuntu20.04 nvidia-smi

If everything is working, you should see the output of the nvidia-smi command listing the GPUs installed in the system.

HEAVY.AI Installation

Create a directory to store data and configuration files.

sudo mkdir -p /var/lib/heavyai && sudo chown $USER /var/lib/heavyai

Then create a minimal configuration file for the Docker installation.

echo "port = 6274
http-port = 6278
calcite-port = 6279
data = \"/var/lib/heavyai\"
null-div-by-zero = true

[web]
port = 6273
frontend = \"/opt/heavyai/frontend\"" \
>/var/lib/heavyai/heavy.conf
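
The echo command above needs backslash-escaped quotes. As an alternative sketch, a quoted here-document writes the same content without escaping; write_heavy_conf is a hypothetical helper name, and the target directory is passed as an argument.

```shell
# write_heavy_conf is a hypothetical helper: writes the same minimal
# heavy.conf into the directory given as $1.
write_heavy_conf() {
    # The quoted 'EOF' delimiter disables expansion, so the double quotes
    # inside the file need no backslashes.
    cat > "$1/heavy.conf" <<'EOF'
port = 6274
http-port = 6278
calcite-port = 6279
data = "/var/lib/heavyai"
null-div-by-zero = true

[web]
port = 6273
frontend = "/opt/heavyai/frontend"
EOF
}
```

With the directory created as shown earlier, write_heavy_conf /var/lib/heavyai produces the same file as the echo command.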

Ensure that the drive holding your storage directory has sufficient space by running this command.

if test -d /var/lib/heavyai; then echo "There is $(df -kh /var/lib/heavyai --output="avail" | sed 1d) available space in your storage dir"; else echo "There was a problem creating the storage dir"; fi
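
To turn the space check into a pass or fail test, the available space can be compared against a threshold. The sketch below uses the POSIX output format of df; avail_kb and has_space are hypothetical helper names, and any threshold you pass is your own choice, not a documented HEAVY.AI requirement.

```shell
# avail_kb is a hypothetical helper: prints the available space, in KiB,
# on the filesystem that holds the path in $1.
avail_kb() {
    # -P forces the POSIX single-line format; column 4 is "Available".
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# has_space is a hypothetical helper: succeeds if the path in $1 has at
# least $2 KiB available.
has_space() {
    [ "$(avail_kb "$1")" -ge "$2" ]
}
```

For example, has_space /var/lib/heavyai 10485760 succeeds only when at least 10 GiB is available.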

Download HEAVY.AI from Docker Hub and start it in Docker. The image to use depends on the edition (Enterprise, Free, or Open Source) and the execution device (GPU or CPU). The command below uses the heavyai/heavyai-ee-cuda image with GPUs enabled.

sudo docker run -d --gpus=all \
-v /var/lib/heavyai:/var/lib/heavyai \
-p 6273-6278:6273-6278 \
heavyai/heavyai-ee-cuda:latest

Check that the container is up and running with a docker ps command:

sudo docker container ps --format "{{.Image}} {{.Status}}" \
-f status=running | grep heavyai\/

You should see an output similar to the following.

heavyai/heavyai-ee-cuda Up 48 seconds ago 
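
The container may take a moment to appear as running, so a script can poll the check instead of running it once. This is a minimal sketch of a generic retry loop; retry is a hypothetical helper name, and the attempt count and delay in the usage comment are arbitrary.

```shell
# retry is a hypothetical helper: runs the command given after the first
# two arguments up to $1 times, sleeping $2 seconds between attempts,
# and succeeds as soon as the command succeeds.
retry() {
    tries=$1 delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        # Skip the sleep after the final failed attempt.
        [ "$tries" -gt 0 ] && sleep "$delay"
    done
    return 1
}

# Usage (not executed here): poll up to 30 times, 2 seconds apart.
# retry 30 2 sh -c 'sudo docker container ps -f status=running \
#     --format "{{.Image}}" | grep -q "^heavyai/"'
```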

See also the note regarding the CUDA JIT Cache in Optimizing Performance.

Configure Firewall ᴼᴾᵀᴵᴼᴺᴬᴸ

If a firewall is not already installed and you want to harden your system, install ufw.

sudo apt install ufw
sudo ufw allow ssh

To use Heavy Immerse or other third-party tools, you must prepare your host machine to accept incoming HTTP(S) connections. Configure your firewall for external access.

sudo ufw disable
sudo ufw allow 6273:6278/tcp
sudo ufw enable

Most cloud providers use a different mechanism for firewall configuration. The commands above might not run in cloud deployments.

For more information, see https://help.ubuntu.com/lts/serverguide/firewall.html.
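
To confirm from another machine that a port is reachable through the firewall, a TCP connection test can help. The sketch below assumes bash (for its /dev/tcp redirection) and the timeout utility from GNU coreutils are available on the client; port_open is a hypothetical helper name.

```shell
# port_open is a hypothetical helper: succeeds if a TCP connection to
# host $1 on port $2 can be opened within 3 seconds. It relies on bash's
# /dev/tcp redirection and on timeout from GNU coreutils.
port_open() {
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Usage (not executed here), with the example host name used in this guide:
# port_open heavyai.mycompany.com 6273 && echo "port 6273 is reachable"
```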

Licensing HEAVY.AI ᵉᵉ⁻ᶠʳᵉᵉ ᵒⁿˡʸ

If you are on Enterprise or Free Edition, you need to validate your HEAVY.AI instance using your license key. Skip this section if you are on Open Source Edition. ²

  1. Copy your Enterprise or Free Edition license key from the registration email message. If you don't have a license and you want to evaluate HEAVY.AI in an enterprise environment, contact your Sales Representative or register for your 30-day trial of Enterprise Edition here. If you need a Free License, you can get one here.

  2. Connect to Heavy Immerse using a web browser connected to your host on port 6273. For example, http://heavyai.mycompany.com:6273.

  3. When prompted, paste your license key in the text box and click Apply.

  4. Log into Heavy Immerse by entering the default username (admin) and password (HyperInteractive), and then click Connect.

Command-Line Access

You can access the command line in the Docker container to perform configuration and run HEAVY.AI utilities.

You need to know the container-id to access the command line. Use the command below to list the running containers.

sudo docker container ps

You see output similar to the following.

CONTAINER ID        IMAGE                     COMMAND                     CREATED             STATUS              PORTS                                            NAMES
9e01e520c30c        heavyai/heavyai-ee-gpu    "/bin/sh -c '/heavyai..."   50 seconds ago      Up 48 seconds ago   0.0.0.0:6273-6280->6273-6280/tcp                 confident_neumann

Once you have your container ID (9e01e520c30c in the example), you can access the command line using the docker exec command. For example, here is the command to start a Bash session in the Docker instance listed above. The -it switch makes the session interactive.

sudo docker exec -it 9e01e520c30c bash

You can end the Bash session with the exit command.
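
Copying the container ID by hand can be avoided by extracting it from the docker ps output. The following sketch assumes the output was produced with the format "{{.ID}} {{.Image}}" and that only one HEAVY.AI container is of interest; first_heavyai_id is a hypothetical helper name.

```shell
# first_heavyai_id is a hypothetical helper: reads "ID IMAGE" lines on
# stdin and prints the ID of the first container whose image name starts
# with heavyai/.
first_heavyai_id() {
    awk '$2 ~ /^heavyai\// { print $1; exit }'
}

# Demonstration on the sample values from the listing above.
printf '%s\n' "9e01e520c30c heavyai/heavyai-ee-gpu" | first_heavyai_id
# prints 9e01e520c30c

# Usage against a live host (not executed here):
# cid=$(sudo docker container ps --format "{{.ID}} {{.Image}}" | first_heavyai_id)
# sudo docker exec -it "$cid" bash
```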

Final Checks

To verify that everything is working, load some sample data, perform a heavysql query, and generate a Scatter Plot or a Bubble Chart using Heavy Immerse. ¹

Load Sample Data and Run a Simple Query

HEAVY.AI ships with two sample datasets of airline flight information collected in 2008, and a census of New York City trees. To install sample data, run the following command.

sudo docker exec -it <container-id> \
./insert_sample_data --data /var/lib/heavyai/storage

Where <container-id> is the container in which HEAVY.AI is running.

When prompted, choose whether to insert dataset 1 (7,000,000 rows), dataset 2 (10,000 rows), or dataset 3 (683,000 rows). The examples below use dataset 2.

Enter dataset number to download, or 'q' to quit:
#     Dataset                   Rows    Table Name             File Name
1)    Flights (2008)            7M      flights_2008_7M        flights_2008_7M.tar.gz
2)    Flights (2008)            10k     flights_2008_10k       flights_2008_10k.tar.gz
3)    NYC Tree Census (2015)    683k    nyc_trees_2015_683k    nyc_trees_2015_683k.tar.gz

Connect to HeavyDB by entering the following command (you are prompted for a password; the default password is HyperInteractive):

sudo docker exec -it <container-id> bin/heavysql

Enter a SQL query such as the following:

SELECT origin_city AS "Origin",
dest_city AS "Destination",
ROUND(AVG(airtime),1) AS "Average Airtime"
FROM flights_2008_10k
WHERE distance < 175
GROUP BY origin_city, dest_city;

The results should be similar to the results below.

Origin|Destination|Average Airtime
West Palm Beach|Tampa|33.8
Norfolk|Baltimore|36.1
Ft. Myers|Orlando|28.7
Indianapolis|Chicago|39.5
Tampa|West Palm Beach|33.3
Orlando|Ft. Myers|32.6
Austin|Houston|33.1
Chicago|Indianapolis|32.7
Baltimore|Norfolk|31.7
Houston|Austin|29.6
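
Because the results are pipe-delimited, they are easy to post-process with standard tools if you save them to a file or pipe them onward. As a small sketch, the following picks the route with the longest average airtime; longest_airtime is a hypothetical helper name, and the input is assumed to start with the header row shown above.

```shell
# longest_airtime is a hypothetical helper: reads pipe-delimited rows on
# stdin, skips the header line, and prints the row whose third column
# (average airtime) is the largest.
longest_airtime() {
    awk -F'|' 'NR > 1 && $3 + 0 > max { max = $3 + 0; row = $0 }
               END { print row }'
}

# Demonstration on a few rows from the sample results.
printf '%s\n' \
    'Origin|Destination|Average Airtime' \
    'West Palm Beach|Tampa|33.8' \
    'Indianapolis|Chicago|39.5' \
    'Houston|Austin|29.6' | longest_airtime
# prints Indianapolis|Chicago|39.5
```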

Create a Dashboard Using Heavy Immerse ᵉᵉ⁻ᶠʳᵉᵉ ᵒⁿˡʸ ¹

If you installed Enterprise or Free Edition, check that Heavy Immerse is running as intended.

  1. Connect to Heavy Immerse using a web browser connected to your host machine on port 6273. For example, http://heavyai.mycompany.com:6273.

  2. Log into Heavy Immerse by entering the default username (admin) and password (HyperInteractive), and then click Connect.

Create a new dashboard and a Scatter Plot to verify that backend rendering is working.

  1. Click New Dashboard.

  2. Click Add Chart.

  3. Click SCATTER.

  4. Click Add Data Source.

  5. Choose the flights_2008_10k table as the data source.

  6. Click X Axis +Add Measure.

  7. Choose depdelay.

  8. Click Y Axis +Add Measure.

  9. Choose arrdelay.

  10. Click Size +Add Measure.

  11. Choose airtime.

  12. Click Color +Add Measure.

  13. Choose dest_state.

The resulting chart shows, unsurprisingly, that there is a correlation between departure delay and arrival delay.

¹ In the OS Edition, Heavy Immerse Service is unavailable.

² The OS Edition does not require a license key.
