Jupyter is installed with most Python modules, and starting it is normally as simple as running a single command. For instance:
jupyter-notebook
However, using it effectively on the cluster is a bit more complicated…
After logging in to your account, check for your preferred Python version as follows:
module avail 2>&1 | grep python
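If the module list is long, the same filtering can be done in Python on captured output. This is only a sketch: it assumes that module avail prints whitespace-separated module names, and the sample text is made up for illustration (only the two Python module names are taken from this guide).

```python
def python_modules(module_avail_output):
    """Return the sorted, unique module names that mention 'python'."""
    names = set()
    for token in module_avail_output.split():
        if "python" in token.lower():
            names.add(token)
    return sorted(names)


# Hypothetical excerpt of `module avail` output, for illustration only
sample = """
---- hypothetical module directory ----
chpc/example/1.0   chpc/python/3.6.0_gcc-6.3.0
chpc/python/anaconda/3-2019.10   another/example
"""

for name in python_modules(sample):
    print(name)
```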
You can now request an interactive compute node on which to create a configuration file and set up authentication. For this exercise, a single core on an interactive node is sufficient. However, if you plan to open multiple notebooks, please request an appropriate number of cores, such as four, using the following command:
qsub -I -P PROJ0101 -q serial -l select=1:ncpus=4:mpiprocs=4
Note:
Some users may require nodes with more memory in order to post-process large data sets. You may use a “fat” node for this purpose. These nodes have 1 TB of memory and 56 cores each. It does not make sense to request an entire fat node and all its memory for a Jupyter task, so it is best to share such a node with other users. If you do not already have access to the bigmem queue, you will need to request this from the CHPC Helpdesk. The request then looks like this:
qsub -I -P PROJ0101 -q bigmem -l select=1:ncpus=4:mpiprocs=4
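The two qsub commands above differ only in the queue name, and the core count appears twice in each. As a rough sketch, a small helper (hypothetical, not part of any CHPC tooling) can assemble the command so that ncpus and mpiprocs always agree:

```python
def interactive_qsub(project, ncpus=1, queue="serial"):
    """Build the argument list for an interactive single-node request."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    resources = f"select=1:ncpus={ncpus}:mpiprocs={ncpus}"
    return ["qsub", "-I", "-P", project, "-q", queue, "-l", resources]


# PROJ0101 is the placeholder project name used in this guide
print(" ".join(interactive_qsub("PROJ0101", ncpus=4)))
print(" ".join(interactive_qsub("PROJ0101", ncpus=4, queue="bigmem")))
```

Print the command and check it by eye before running it; the helper only formats the text and does not submit anything.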
In your interactive node, load the preferred python module as follows:
module add chpc/python/3.6.0_gcc-6.3.0
You are not the only person on the system, so it is important to set up authentication on your notebook so that not everyone gets access to your notebook (and worse – your data).
First you need a configuration file, which is created by passing the --generate-config option to jupyter-notebook as follows:
[USERNAME@cnode0010 ~]$ jupyter-notebook --generate-config
Next you need to generate your password (remember it – you'll need it when you connect later):
python
>>> from notebook.auth import passwd
>>> passwd()
Enter password:
Verify password:
'sha1:f27008fdb0eb:4c2f305d5e230edca16c7059882ba3ba63bee03b'
Your password hash will be different. Obviously use the one in your terminal, not the one shown in this example.
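The string has three colon-separated parts: the algorithm name, a random salt, and the digest. As far as we understand it, older notebook releases compute the digest over the passphrase followed by the salt. The sketch below reproduces that scheme with hashlib alone so you can see why two runs give different strings for the same password. The function names make_hash and check_hash are our own, this is not the library's code, and newer releases may use a different algorithm.

```python
import hashlib
import hmac
import secrets


def make_hash(passphrase, algorithm="sha1"):
    """Build an 'algorithm:salt:digest' string from a passphrase."""
    salt = secrets.token_hex(6)  # 12 hexadecimal characters, new on every call
    digest = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    return f"{algorithm}:{salt}:{digest}"


def check_hash(stored, passphrase):
    """Return True if the passphrase matches a stored 'algorithm:salt:digest' string."""
    parts = stored.split(":")
    if len(parts) != 3:
        return False
    algorithm, salt, digest = parts
    try:
        candidate = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    except ValueError:  # unknown algorithm name
        return False
    # Constant-time comparison of the two digests
    return hmac.compare_digest(candidate.encode("utf-8"), digest.encode("utf-8"))


hashed = make_hash("example passphrase")
print(hashed)
print(check_hash(hashed, "example passphrase"))
```

Because the salt is random, the printed string changes on every run, which is why your hash will not match the one in this guide.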
Exit Python and then change to the directory that holds the “jupyter_notebook_config.py” file:
cd .jupyter/
Now edit the file jupyter_notebook_config.py with your favourite editor.
vim jupyter_notebook_config.py
c.NotebookApp.password = 'sha1:f27008fdb0eb:4c2f305d5e230edca16c7059882ba3ba63bee03b'
Remember to uncomment it, and copy and paste your own hash in.
There might be a cleaner way of doing this… Please let us know if you have one!
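One possible alternative to editing by hand is to let a short Python script write the line for you. This is a sketch only: set_notebook_password is a name we made up, and it assumes that a c.NotebookApp.password line appended to the end of the config file is picked up like any other setting.

```python
import re
from pathlib import Path

# Matches an active (uncommented) password setting, but not password_required etc.
ACTIVE_PASSWORD = re.compile(r"^\s*c\.NotebookApp\.password\s*=")


def set_notebook_password(config_path, password_hash):
    """Replace any active password line in the config file with the given hash."""
    path = Path(config_path)
    lines = path.read_text().splitlines() if path.exists() else []
    # Keep comments and all other settings; drop old active password lines
    kept = [line for line in lines if not ACTIVE_PASSWORD.match(line)]
    kept.append(f"c.NotebookApp.password = {password_hash!r}")
    path.write_text("\n".join(kept) + "\n")
```

You would call it with the path to your own file, for example set_notebook_password(Path.home() / ".jupyter" / "jupyter_notebook_config.py", "your own hash here"). Keep a copy of the original file until you have confirmed that the notebook still starts.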
VERY IMPORTANT: Do not add the lines below to your .ssh/config file on the cluster. If you do, you WILL break any attempt at parallel processing!
Open a terminal on your local workstation and create a .ssh/config file.
touch .ssh/config
If your desktop system runs Windows, a simple way to deal with this is to run a Unix-like environment inside Windows. You can either use Cygwin directly, or start a “Local Terminal” in MobaXterm. From this terminal you can edit the local .ssh/config file as if you were working on a Linux computer.
Find the .ssh directory in the listing of your home directory:
ls -al
… and continue to edit the .ssh/config file on your local machine by adding in these lines:
Host cnode*
    Hostname %h
    User YOURUSERNAME
    ProxyCommand ssh YOURUSERNAME@lengau.chpc.ac.za nc %h 22
    LocalForward 8888 localhost:8888
Host fat*
    Hostname %h
    User YOURUSERNAME
    ProxyCommand ssh YOURUSERNAME@lengau.chpc.ac.za nc %h 22
    LocalForward 8888 localhost:8888
Remember to replace “YOURUSERNAME” with your own username on the cluster.
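Since the username appears twice in each block and the port twice more, it is easy to miss one when editing by hand. The sketch below generates both blocks from a single username and port. The function names are hypothetical, and the layout simply follows the lines shown above.

```python
def ssh_config_stanza(pattern, username, port=8888, login_host="lengau.chpc.ac.za"):
    """Return one Host block that hops through the login node and forwards a port."""
    return "\n".join([
        f"Host {pattern}",
        "    Hostname %h",
        f"    User {username}",
        f"    ProxyCommand ssh {username}@{login_host} nc %h 22",
        f"    LocalForward {port} localhost:{port}",
    ])


def ssh_config(username, port=8888):
    """Return the blocks for both ordinary compute nodes and fat nodes."""
    blocks = [ssh_config_stanza(p, username, port) for p in ("cnode*", "fat*")]
    return "\n\n".join(blocks) + "\n"


# YOURUSERNAME is the placeholder used in this guide
print(ssh_config("YOURUSERNAME", port=8888))
```

Paste the printed text into the .ssh/config file on your local machine only, never on the cluster.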
At this point you may not know which port number to use for LocalForward and localhost. To find out, go back to the interactive node that you opened earlier on the cluster and type:
jupyter-notebook --no-browser
Edit the port number that Jupyter reports into both the LocalForward and localhost entries above.
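If you have saved the startup messages to a file, the port can also be pulled out with a regular expression. This is a minimal sketch: the sample log line is illustrative rather than copied from a real session, and find_port is a name of our own choosing.

```python
import re


def find_port(log_text):
    """Return the port of the first http(s) URL in the text, or None."""
    match = re.search(r"https?://[^\s/:]+:(\d+)", log_text)
    return int(match.group(1)) if match else None


# Illustrative startup message; the exact wording on your system may differ
sample = "[I 10:00:00.000 NotebookApp] The Jupyter Notebook is running at: http://localhost:8889/"
print(find_port(sample))
```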
In your local terminal, ssh directly to the compute node that you are using. Let us assume that it is cnode1234:
ssh cnode1234
You should be prompted for your CLUSTER password, twice. This is because ssh logs in to Lengau first, and then from Lengau it logs in to the compute node. On the first login, MobaXterm will offer to remember your password, and if you allow it to do so it will not be necessary to re-enter it.
You are now ready to roll…
In your browser go to: http://localhost:8888 (note you can only do this for nodes where you currently have a jupyter job running). Your port number may be different. Use the number that you have been assigned.
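If the browser cannot connect, a quick first check is whether anything is listening on the forwarded port on your local machine. The sketch below only tests that a TCP connection can be opened. It does not confirm that Jupyter is at the other end, and port_is_open is a name of our own choosing.

```python
import socket


def port_is_open(port, host="localhost", timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

Running port_is_open(8888) on your workstation while the ssh session to the compute node is open should return True. If it returns False, check the LocalForward line and that the ssh session is still connected.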
To run Jupyter as a batch job instead, the jobscript will look something like:
#!/bin/bash
#PBS -P ERTH1234
#PBS -q serial
#PBS -l select=1:ncpus=8:mpiprocs=8
#PBS -l walltime=08:00:00
#PBS -N Jupyter
#PBS -m abe
#PBS -M YOUR@EMAIL.ADDRESS

module add chpc/python/3.6.0_gcc-6.3.0

JUPYTERPORT=8888 # you could change this too, if you wanted to.

hostname > ~/jupyter.host

jupyter-notebook --port=${JUPYTERPORT} --no-browser
If you submit that job and wait for it to start running then you can check which host the session is running on with:
cat ~/jupyter.host
Then, again on your local machine, you need to connect to the allocated compute node, for example ssh cnode0101. If you are working in Windows, do this from your Cygwin or MobaXterm terminal command line. You will be prompted for your Lengau login password.
Create the following job script:
#!/bin/bash
#PBS -P SHORTNAME
#PBS -q gpu_1
#PBS -l select=1:ncpus=1:ngpus=1
#PBS -l walltime=8:00:00
#PBS -N Jupyter
#PBS -m abe
#PBS -M YOUR@EMAIL.ADDRESS

module purge
module add chpc/python/anaconda/3-2019.10

# Go to your directory
cd /mnt/lustre/users/username/jupyter_notebook

## get tunneling info
XDG_RUNTIME_DIR=""
ipnport=$(shuf -i8000-9999 -n1)
ipnip=$(hostname -i)

## print tunneling instructions to an output file
echo -e "
Copy/Paste this in your local terminal to ssh tunnel with remote
-----------------------------------------------------------------
ssh -N -L $ipnport:$ipnip:$ipnport user@host
-----------------------------------------------------------------
Then open a browser on your local machine to the following address
------------------------------------------------------------------
localhost:$ipnport
------------------------------------------------------------------
" > tunnel.out

## start an ipcluster instance and launch jupyter server
jupyter-notebook --NotebookApp.token='' --no-browser --port=$ipnport --ip=$ipnip
sleep 8h
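A side note on the port: shuf picks a number between 8000 and 9999 at random without checking whether another process on the node already uses it. If that ever causes a clash, one option is to test the port before using it. The sketch below is our own illustration and not part of the script above. It tries random ports in the same range until one can be bound, and there is still a small window in which another process could take the port before Jupyter starts.

```python
import random
import socket


def pick_free_port(low=8000, high=9999, attempts=50):
    """Return a port in [low, high] that could be bound at the time of the check."""
    for _ in range(attempts):
        port = random.randint(low, high)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind(("", port))  # fails with OSError if the port is in use
            except OSError:
                continue
            return port
    raise RuntimeError(f"no free port found in {low}-{high} after {attempts} attempts")


print(pick_free_port())
```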
Submit the job with qsub jupyter.pbs.
Once the job is running, go to the folder that you cd to in the script above and run cat tunnel.out.
The information needed to set up your tunnel will be shown in the line ssh -N -L ….
Copy this line and paste it into MobaXterm if you are a Windows user, or into a terminal if you are a macOS or Linux user. Just be sure to change user to your cluster username and host to lengau.chpc.ac.za.
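The substitution of user and host can also be scripted. The sketch below finds the ssh -N -L line in the text of tunnel.out and swaps in your details. The function name, the sample port and the sample IP address are all made up for illustration.

```python
import re


def tunnel_command(tunnel_out_text, username, login_host="lengau.chpc.ac.za"):
    """Return the ssh tunnel line with user@host replaced, or None if not found."""
    for line in tunnel_out_text.splitlines():
        line = line.strip()
        if line.startswith("ssh -N -L"):
            # Replace the trailing user@host placeholder with real values
            return re.sub(r"\S+@\S+$", f"{username}@{login_host}", line)
    return None


# Hypothetical contents of tunnel.out; your port and IP address will differ
sample = """
Copy/Paste this in your local terminal to ssh tunnel with remote
ssh -N -L 8123:10.0.0.1:8123 user@host
Then open a browser on your local machine to the following address
localhost:8123
"""
print(tunnel_command(sample, "YOURUSERNAME"))
```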