Introduction

GPU-enabled Docker image

  • Use the image from https://hub.docker.com/r/cschranz/gpu-jupyter so that GPUs are recognised inside your environment. It is already in the config file; just select it as the environment.
  • Other TensorFlow images won't detect the GPUs because of CUDA and cuDNN issues.
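
To confirm that the GPUs are actually recognised, a quick check can be run in a notebook cell. This is a minimal sketch, assuming TensorFlow 2.x is installed in the image; the helper name visible_gpus is made up for illustration, and it returns an empty list when TensorFlow is missing or no GPU is detected.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or [] if there are none."""
    try:
        # Assumption: TensorFlow 2.x is available in the running image.
        import tensorflow as tf
    except ImportError:
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
print(f"{len(gpus)} GPU(s) visible to TensorFlow: {gpus}")
```

An empty list on a GPU node suggests the image in use does not match the driver, CUDA, or cuDNN setup.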

Guidelines for a New Installation

Initial Setup

https://docs.aws.amazon.com/eks/latest/userguide/getting-started-console.html

  • Create a new group called admin
  • Assign the AdministratorAccess policy to the group
  • Create a user, enable programmatic and AWS console access, and add the user to the admin group (if required)
  • MFA must be enabled on the root account. Assign a virtual MFA device (Microsoft Authenticator)
  • Delete the root account access key
  • Create additional users and groups as necessary
  • Sign out of the root account
  • Sign in as the IAM user
  • A CloudFormation role is needed to create any of the following components. This role also needs IAM access
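
The group and user steps can also be scripted instead of clicked through in the console. The following is a rough sketch, assuming boto3 is installed and credentials with IAM permissions are configured. The function name create_admin_user is hypothetical, and console passwords, access keys, and MFA are deliberately left out.

```python
# ARN of the AWS-managed AdministratorAccess policy.
ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def create_admin_user(iam, user_name, group_name="admin"):
    """Create the admin group, attach AdministratorAccess, and add a new user.

    `iam` is expected to be a boto3 IAM client, e.g. boto3.client("iam").
    """
    iam.create_group(GroupName=group_name)
    iam.attach_group_policy(GroupName=group_name, PolicyArn=ADMIN_POLICY_ARN)
    iam.create_user(UserName=user_name)
    iam.add_user_to_group(GroupName=group_name, UserName=user_name)
```

With boto3 available, a call would look like create_admin_user(boto3.client("iam"), "some-user"), where the user name is a placeholder. The calls fail if the group or user already exists, so this only fits a fresh account.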

Creating a VPC

https://docs.aws.amazon.com/eks/latest/userguide/create-public-private-vpc.html

  • Add the necessary policies to the IAM role
  • Create a VPC from CloudFormation using the CF template
  • The default template is Public-Private
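
Creating the stack can be sketched in code as well. This example assumes boto3 and a CloudFormation client; create_vpc_stack is a hypothetical helper, and both the stack name and the template URL are left as parameters because they have to come from the AWS guide linked above.

```python
def create_vpc_stack(cloudformation, stack_name, template_url):
    """Create the VPC stack from a template URL and wait until it is complete.

    `cloudformation` is expected to be a boto3 client, e.g.
    boto3.client("cloudformation"). Returns the ID of the new stack.
    """
    response = cloudformation.create_stack(
        StackName=stack_name,
        TemplateURL=template_url,
    )
    # Block until CloudFormation reports CREATE_COMPLETE (raises on failure).
    waiter = cloudformation.get_waiter("stack_create_complete")
    waiter.wait(StackName=stack_name)
    return response["StackId"]
```

The stack outputs, visible in the CloudFormation console, list the subnet and security group IDs needed in the next step.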

Creating the Cluster

https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html#modify-endpoint-access

  • Create the cluster from the EKS console
  • Add the security groups and subnets belonging to the previously created VPC
  • The security group to select has ControlPlaneSecurityGroup in its name and is available in the drop-down
  • Configure cluster endpoint access. It is public by default
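
Endpoint access can be changed after the cluster exists. As an illustration, assuming boto3 and an EKS client, the sketch below uses two hypothetical helpers: endpoint_access_config builds the relevant part of resourcesVpcConfig, and set_endpoint_access submits it.

```python
def endpoint_access_config(public=True, private=False):
    """Build the endpoint-access part of an EKS resourcesVpcConfig."""
    return {"endpointPublicAccess": public, "endpointPrivateAccess": private}


def set_endpoint_access(eks, cluster_name, public=True, private=False):
    """Request an endpoint-access change and return the ID of the update.

    `eks` is expected to be a boto3 client, e.g. boto3.client("eks").
    """
    response = eks.update_cluster_config(
        name=cluster_name,
        resourcesVpcConfig=endpoint_access_config(public, private),
    )
    return response["update"]["id"]
```

The update is asynchronous, so the cluster shows an updating status for a few minutes before the new setting applies.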

Attaching Nodes to the Cluster

Opening a ticket on AWS: By default, AWS does not let you launch GPU instances, so you must open a support ticket to request access.

Nodes do not connect to the cluster: This is caused by a configuration change on the AWS side. MapPublicIpOnLaunch should be set to true on the VPC's public subnets.
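
In the EC2 API, MapPublicIpOnLaunch is an attribute of each subnet, so the fix can be applied per subnet. A minimal sketch, assuming boto3 and an EC2 client; enable_public_ip_on_launch is a hypothetical helper, and the caller must pass only the IDs of the public subnets.

```python
def enable_public_ip_on_launch(ec2, subnet_ids):
    """Set MapPublicIpOnLaunch to true where it is off; return the IDs changed.

    `ec2` is expected to be a boto3 client, e.g. boto3.client("ec2").
    """
    response = ec2.describe_subnets(SubnetIds=list(subnet_ids))
    changed = []
    for subnet in response["Subnets"]:
        if not subnet["MapPublicIpOnLaunch"]:
            ec2.modify_subnet_attribute(
                SubnetId=subnet["SubnetId"],
                MapPublicIpOnLaunch={"Value": True},
            )
            changed.append(subnet["SubnetId"])
    return changed
```

Nodes launched before the change keep their old network settings, so they may need to be replaced before they can join the cluster.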

JupyterHub Configuration:
