Jupyter / IPython Notebook

Jupyter is installed in most of the Python modules. To make use of it, you would run something like:

jupyter-notebook

However, using it effectively on the cluster is a bit more complicated…

After logging in to your account, check that your preferred Python version is available as follows:

module avail 2>&1 | grep python
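
The 2>&1 in the command above is presumably there because module avail writes its listing to standard error rather than standard output. A minimal sketch of a slightly tidier filter follows, assuming the listing consists of whitespace-separated module names; the function name filter_python_modules is purely illustrative and not part of the module system.

```shell
# Hypothetical helper: read a module listing on stdin and print one
# Python-related module name per line, sorted and de-duplicated.
filter_python_modules() {
    tr -s ' \t' '\n\n' | grep -i 'python' | sort -u
}

# On the cluster this would be used as:
#   module avail 2>&1 | filter_python_modules
# Here it is fed a small sample listing so the sketch runs anywhere.
printf '%s\n' \
    'chpc/gcc/6.3.0   chpc/python/3.6.0_gcc-6.3.0' \
    'chpc/python/anaconda/3-2019.10' | filter_python_modules
```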

Now request an interactive compute node, on which you will create the configuration file and set up security. A single core on an interactive node has proven sufficient for this exercise, but if you are going to open multiple notebooks, request an appropriate number of cores, say 4, as follows:

qsub -I -P PROJ0101 -q serial -l select=1:ncpus=4:mpiprocs=4:nodetype=haswell_reg

Note:

  1. “PROJ0101” is a placeholder project name; use your own research project's shortname, e.g. ERTH0859.
  2. Record the ID of the compute node you are given (cnode…) because you will need to ssh into that particular compute node later.
  3. More advanced interactive compute node settings are described under “Example interactive job request”.

Certain users may require nodes with more memory in order to post-process large data sets. You may use a “fat” node for this purpose. These nodes have 1 TB of memory and 56 cores each. It does not make sense to request an entire fat node and all its memory for a Jupyter task, so it is best to share such a node with other users. If you do not already have access to the bigmem queue, you will need to request this from the CHPC Helpdesk. The request then looks like this:

qsub -I -P PROJ0101 -q bigmem -l select=1:ncpus=4:mpiprocs=4:nodetype=haswell_fat
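
Both requests above differ only in the queue and node type, and both keep ncpus and mpiprocs equal. As a rough sketch, a small function can assemble the command so that the two core counts cannot drift apart. The function name interactive_request is an illustrative assumption, and the function only prints the command rather than submitting anything.

```shell
# Hypothetical helper: print (not run) an interactive qsub request,
# keeping ncpus and mpiprocs equal as in the examples above.
# Arguments: project shortname, queue, core count, node type.
interactive_request() {
    printf 'qsub -I -P %s -q %s -l select=1:ncpus=%s:mpiprocs=%s:nodetype=%s\n' \
        "$1" "$2" "$3" "$3" "$4"
}

# PROJ0101 is the placeholder project name used in this guide.
interactive_request PROJ0101 serial 4 haswell_reg
interactive_request PROJ0101 bigmem 4 haswell_fat
```

Review the printed line, then copy it to the login node's command line to submit it.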

On your interactive node, load your preferred Python module as follows:

module add chpc/python/3.6.0_gcc-6.3.0
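
Before going further it can be worth confirming that the module really did put python and jupyter-notebook on your PATH. The following is a minimal sketch of such a check; require_cmd is a hypothetical helper name, and the check relies only on the standard command -v shell builtin.

```shell
# Hypothetical helper: succeed if a command is on PATH, otherwise
# print a message to stderr and return non-zero.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "not found on PATH: $1" >&2
    return 1
}

# After "module add ...", both of these would be expected to succeed on
# the cluster; elsewhere the messages simply report what is missing.
for tool in python jupyter-notebook; do
    require_cmd "$tool" || true
done
```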

Security

You are not the only person on the system, so it is important to set up authentication on your notebook so that not everyone gets access to your notebook (and worse – your data).

First you need a configuration file. Generate one by passing the --generate-config option to jupyter-notebook as follows:

[USERNAME@cnode0010 ~]$ jupyter-notebook --generate-config

Note

  1. This writes a default config file to /home/USERNAME/.jupyter/jupyter_notebook_config.py, where USERNAME is your own username.
  2. The file “jupyter_notebook_config.py” sits in the hidden directory .jupyter, so use $ ls -a in your home directory to see that directory.

Next you need to generate your password (remember it – you'll need it when you connect later):

$ python
>>> from notebook.auth import passwd
>>> passwd()
Enter password:
Verify password:
'sha1:f27008fdb0eb:4c2f305d5e230edca16c7059882ba3ba63bee03b'

Your password hash will be different. Obviously use the one in your terminal, not the one shown in this example.

The “jupyter_notebook_config.py” file is in $HOME/.jupyter/. Change to that directory with $ cd $HOME/.jupyter/
Now edit the file $HOME/.jupyter/jupyter_notebook_config.py with your favourite editor. Find the c.NotebookApp.password line and set it to your hash:

c.NotebookApp.password = 'sha1:f27008fdb0eb:4c2f305d5e230edca16c7059882ba3ba63bee03b'

Remember to uncomment the line, and paste in your own hash.
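
It is easy to paste the hash but leave the line commented out, in which case no password is in force. A minimal sketch of a check for this follows; password_is_set is a hypothetical helper, and the pattern assumes the line has the same shape as the example above (optionally with a u prefix on the string).

```shell
# Hypothetical helper: succeed if the given config file contains an
# uncommented c.NotebookApp.password line with a non-empty quoted value.
password_is_set() {
    grep -Eq "^[[:space:]]*c\.NotebookApp\.password[[:space:]]*=[[:space:]]*u?['\"][^'\"]+['\"]" "$1"
}

config="$HOME/.jupyter/jupyter_notebook_config.py"
if [ -f "$config" ] && password_is_set "$config"; then
    echo "password is set in $config"
else
    echo "no active password line found in $config"
fi
```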

Starting a notebook inside a job

There might be a cleaner way of doing this… Please let us know if you have one!

VERY IMPORTANT: Do not add the lines below to your .ssh/config file on the cluster. You WILL break any attempt at parallel processing!

Open a terminal on your local workstation and create a ~/.ssh/config file if you do not already have one.

touch ~/.ssh/config

If your desktop system runs Windows, a simple way to deal with this is to run a Unix-like environment inside Windows. You can either use Cygwin directly, or start a “Local Terminal” in MobaXterm. From this terminal you can edit the local ~/.ssh/config file as if you were working on a Linux computer.

Continue to edit the ~/.ssh/config file on your local machine by adding in these lines:

Host cnode*
    Hostname %h
    User YOURUSERNAME
    ProxyCommand ssh YOURUSERNAME@lengau.chpc.ac.za nc %h 22
    LocalForward 8888 localhost:8888
Host fat*
    Hostname %h
    User YOURUSERNAME
    ProxyCommand ssh YOURUSERNAME@lengau.chpc.ac.za nc %h 22
    LocalForward 8888 localhost:8888    

Remember to replace “YOURUSERNAME” with your own username on the cluster.
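
The two stanzas are identical apart from the host pattern, and the username and port each appear twice. As a sketch, they can be generated from one place so the values stay consistent; ssh_stanza is a hypothetical helper, and it only prints text for you to review.

```shell
# Hypothetical helper: print a Host stanza like the ones above.
# Arguments: host pattern, cluster username, port.
ssh_stanza() {
    printf 'Host %s\n' "$1"
    printf '    Hostname %%h\n'
    printf '    User %s\n' "$2"
    printf '    ProxyCommand ssh %s@lengau.chpc.ac.za nc %%h 22\n' "$2"
    printf '    LocalForward %s localhost:%s\n' "$3" "$3"
}

# Print both stanzas for review; append them to ~/.ssh/config on your
# LOCAL machine only, and only once they look right.
ssh_stanza 'cnode*' YOURUSERNAME 8888
ssh_stanza 'fat*' YOURUSERNAME 8888
```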

At this point you may not know which port number to use for LocalForward and localhost. To find out, go back to the interactive node session you opened earlier on the cluster and type:

jupyter-notebook --no-browser

Edit the port number that Jupyter reports into the LocalForward line above, replacing 8888 in both places.

Note

  1. Each session is likely to be assigned a different compute node and port number (i.e. cnode* changes every time you request an interactive compute node session).
  2. Check the LocalForward and localhost port numbers for each new session.
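
Since the port can change between sessions, it may help to pull it out of Jupyter's startup messages rather than read it by eye. The sketch below assumes the startup output contains a URL of the form http://host:port/; the exact wording varies between Jupyter versions, and extract_port is a hypothetical helper name.

```shell
# Hypothetical helper: print the port from the first line on stdin
# that contains an http://host:port/ URL.
extract_port() {
    sed -n 's#.*http://[^:/]*:\([0-9][0-9]*\)/.*#\1#p' | head -n 1
}

# Illustrative line only; real startup output will look different and
# will carry its own token or none at all.
echo '    http://localhost:8889/?token=...' | extract_port
```

On the cluster, one way to use it would be to save the startup messages with jupyter-notebook --no-browser 2>&1 | tee jupyter.log and then, from a second shell on the same node, run extract_port < jupyter.log.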

In your local terminal ssh directly to the compute node that you are using. Let us assume that it is cnode1234:

ssh cnode1234

You should be prompted for your password twice. This is because ssh logs in to Lengau first and then, from Lengau, logs in to the compute node. On the first login, MobaXterm will offer to remember your password; if you allow it to do so, it will not be necessary to re-enter it.

You are now ready to roll…

In your browser go to: http://localhost:8888 (note you can only do this for nodes where you currently have a jupyter job running). Your port number may be different. Use the number that you have been assigned.
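
One possible alternative to editing ~/.ssh/config is a one-off tunnel using the -J (ProxyJump) option, available in OpenSSH 7.3 and later. This is a sketch under two assumptions this guide does not confirm: that your local ssh client supports -J, and that the login node permits the forwarding ProxyJump needs. The helper name tunnel_cmd is hypothetical, and it only prints the command.

```shell
# Hypothetical helper: print (not run) a one-off tunnel command that
# forwards a local port to the same port on the compute node, jumping
# through the login node.
# Arguments: cluster username, compute node, port.
tunnel_cmd() {
    printf 'ssh -N -L %s:localhost:%s -J %s@lengau.chpc.ac.za %s@%s\n' \
        "$3" "$3" "$1" "$1" "$2"
}

# cnode1234 and port 8888 are the example values used in this guide.
tunnel_cmd YOURUSERNAME cnode1234 8888
```

Run the printed command on your local machine; it stays in the foreground (because of -N) until you stop it with Ctrl-C.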

If you would rather run the notebook in a batch job than in an interactive session, the job script will look something like:

jupyter.qsub
#!/bin/bash
#PBS -P ERTH1234 
#PBS -q serial
#PBS -l select=1:ncpus=8:mpiprocs=8
#PBS -l walltime=08:00:00
#PBS -N Jupyter
#PBS -m abe
#PBS -M YOUR@EMAIL.ADDRESS
 
module add chpc/python/3.6.0_gcc-6.3.0
 
JUPYTERPORT=8888  # you could change this too, if you wanted to.
 
hostname > ~/jupyter.host
 
jupyter-notebook --port=${JUPYTERPORT} --no-browser

If you submit that job and wait for it to start running then you can check which host the session is running on with:

cat ~/jupyter.host

Then, again on your local machine, you need to connect to the allocated compute node, for example ssh cnode0101. If you are working in Windows, do this from your Cygwin or MobaXterm terminal command line. You will be prompted for your Lengau login password.
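
Because the job may sit in the queue for a while, ~/jupyter.host will not exist until the job starts. A minimal sketch of a polling loop that waits for the file and then prints the node name is shown below. The name wait_for_host and its retry defaults are illustrative assumptions, and a temporary file stands in for ~/jupyter.host so the example runs anywhere.

```shell
# Hypothetical helper: wait until a host file exists and is non-empty,
# then print its first line. Returns non-zero if it never appears.
# Arguments: file, number of tries (default 30), delay in seconds (default 10).
wait_for_host() (
    host_file=$1
    tries=${2:-30}
    delay=${3:-10}
    i=0
    while [ "$i" -lt "$tries" ]; do
        if [ -s "$host_file" ]; then
            head -n 1 "$host_file"
            exit 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    exit 1
)

# Demonstration with a stand-in file; on the cluster you would pass
# ~/jupyter.host instead.
demo=$(mktemp)
echo cnode0101 > "$demo"
node=$(wait_for_host "$demo" 1 0) && echo "ssh $node"
rm -f "$demo"
```

Note that a jupyter.host file left over from an earlier job would be picked up immediately, so it is safer to delete the old file before submitting a new job.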

Running Jupyter on a GPU node

Create the following job script:

jupyter.pbs
#!/bin/bash
#PBS -P SHORTNAME
#PBS -q gpu_1
#PBS -l select=1:ncpus=1:ngpus=1
#PBS -l walltime=8:00:00
#PBS -N Jupyter
#PBS -m abe
#PBS -M YOUR@EMAIL.ADDRESS
 
module purge
module add chpc/python/anaconda/3-2019.10
 
# Go to your directory
cd /mnt/lustre/users/username/jupyter_notebook
 
## get tunneling info
XDG_RUNTIME_DIR=""
ipnport=$(shuf -i8000-9999 -n1)
ipnip=$(hostname -i)
 
## print tunneling instructions to an output file
echo -e "
Copy/Paste this in your local terminal to ssh tunnel with remote
-----------------------------------------------------------------
ssh -N -L $ipnport:$ipnip:$ipnport user@host
-----------------------------------------------------------------
Then open a browser on your local machine to the following address
------------------------------------------------------------------
localhost:$ipnport
------------------------------------------------------------------
" > tunnel.out
## launch the Jupyter server; the empty token disables token authentication, so make sure a password is set in your config file
jupyter-notebook --NotebookApp.token='' --no-browser --port=$ipnport --ip=$ipnip
sleep 8h

Submit the job with qsub jupyter.pbs

Once the job is running, go to the folder that you cd to in the script above and cat tunnel.out.

The information needed to set up your tunnel is shown in the line starting with ssh -N -L ….

Copy this line and paste it into MobaXterm if you are a Windows user, or into a terminal if you are a macOS or Linux user. Just be sure to change user to your cluster username and host to lengau.chpc.ac.za.
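
The substitution of user and host can also be done mechanically. The sketch below picks the ssh line out of tunnel.out-style text and fills in the login; fill_tunnel_cmd is a hypothetical helper, and the port and IP address in the sample input are made up for illustration.

```shell
# Hypothetical helper: read tunnel.out-style text on stdin and print the
# ssh line with the trailing "user@host" replaced by the given login.
# Argument: cluster username (assumed to contain no "/" or "&").
fill_tunnel_cmd() {
    grep '^ssh -N -L ' | sed "s/user@host\$/$1@lengau.chpc.ac.za/"
}

# Sample input; the real file is written by the job script above.
printf '%s\n' \
    'Copy/Paste this in your local terminal to ssh tunnel with remote' \
    'ssh -N -L 8123:10.0.0.5:8123 user@host' | fill_tunnel_cmd YOURUSERNAME
```

On the cluster this would be run as fill_tunnel_cmd YOURUSERNAME < tunnel.out, and the printed command is then pasted into your local terminal.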

/app/dokuwiki/data/pages/tipsntricks/ipython_notebook.txt · Last modified: 2021/12/09 16:42 (external edit)