Use of the Command Line Interface

GSWHC-B Getting Started with HPC Clusters → USE1-B Use of the Cluster Operating System → USE1.1-B Use of the Command Line Interface

Relevant for: Tester, Builder, and Developer

Description:

You will learn about the book “The Linux Command Line” by William Shotts, which we recommend for learning the command line (basic level).
You will learn to log in remotely with public key authentication, which is not covered in the book (basic level).
You will learn to use text editors (first steps, without the need to consult the book) (basic level).

This skill requires no sub-skills.

Level: basic

Use of the command line interface

Users interact with an HPC system through a command line interface (CLI); graphical user interfaces are rarely available. The command line is used for interactive work and for writing shell scripts that run as batch jobs.

Using a command line interface is a very efficient way to interact with a computer using only the keyboard. This is especially true for Unix-like operating systems such as Linux, which is fortunate, because graphical user interfaces (GUIs) are rarely available on the typically Linux-based cluster systems. In the Unix and Linux world, a program called the shell serves as the command language interpreter. The shell executes commands that it reads from the keyboard or, strictly speaking, from the standard input device (which can also be mapped to a file).

The shell can also run programs called shell scripts, e.g. to automate the execution of several commands in a row.
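As a minimal sketch of what such a script might look like, the following lines run several commands in a row and stop at the first failure. The file names and the temporary working directory are made up for illustration only.

```shell
#!/bin/sh
# Hypothetical example: run several commands in a row.
set -eu                                  # stop at the first error or unset variable

workdir=$(mktemp -d)                     # temporary working directory (illustrative)
cd "$workdir"

printf 'alpha\nbeta\ngamma\n' > input.txt        # create a small input file
sort -r input.txt > sorted.txt                   # sort its lines in reverse order
line_count=$(wc -l < sorted.txt | tr -d ' ')     # count the lines of the result
echo "sorted $line_count lines"
```

Saved to a file such as myscript.sh (a hypothetical name), the script could be started with sh myscript.sh.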

The command line is used far more widely than just in HPC. Accordingly, much more has been written about the command line than about HPC. Instead of writing yet another text on the command line, we refer the reader to the very good book The Linux Command Line by William Shotts. The book can be read online. A PDF version (here: 19.01) is also available.

As a motivation to start reading it, we quote a few lines from the Introduction of version 19.01 of the book:

It’s been said that “graphical user interfaces make easy tasks easy, while command line interfaces make difficult tasks possible” and this is still very true today.

…, there is no shortcut to Linux enlightenment. Learning the command line is challenging and takes real effort. It’s not that it’s so hard, but rather it’s so vast.

And, unlike many other computer skills, knowledge of the command line is long lasting.

Another goal is to acquaint you with the Unix way of thinking, which is different from the Windows way of thinking.

The book also introduces the secure shell (ssh). The secure shell is a prerequisite for using HPC systems, because it is needed to log in and to copy files to and from the system. Since The Linux Command Line does not cover ssh keys, we explain how to use them below. We also mention the very first steps with text editors.

To get access to a cluster system, a terminal emulator program is used to connect remotely to a cluster node (possibly one of a group of such nodes), which authenticates the user and grants access to the actual cluster system. Such login nodes are also called gateway nodes or head nodes.

SSH (Secure Shell) and Windows Equivalents

In the Linux world the ssh (Secure Shell) client program is run in a terminal emulator to log into a remote machine such as a login node (which runs the SSH server counterpart).

The ssh program provides secure, encrypted communication for executing commands on a remote machine, such as building programs and submitting jobs on the cluster system. Windows users can use third-party software like PuTTY or MobaXterm to establish ssh connections to a cluster system. A port of OpenSSH, i.e. a collection of client/server ssh utilities, is also available for Windows and ships with recent versions of Windows.

Another very useful option is to run Linux (e.g. Ubuntu or openSUSE) on a Windows computer in a virtual machine (VM) using a hypervisor like VMware Workstation Player, VirtualBox, or Hyper-V.

As an example, the following shows how to log in to the Hummel cluster at Universität Hamburg (there are two login gateways available: hummel1 and hummel2):

user@your-pc:~$ ssh yourHummelUsername@hummel1.rrz.uni-hamburg.de
Enter passphrase for key '/home/user/.ssh/id_rsa': ****************
       _                                                                _
    __/ \   __/ RRZ HPC Login | Zugang nur mit Berechtigung      \__   / \_
         \_/  \ RRZ HPC login | Access for authorized users only /  \_/    


  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
 *                                                                             *
 *  Hummel HPC Cluster (2015) am RRZ der Universität Hamburg     ____Ô____     *
 *                                                              /|\  |  /|\    *
 * > Interaktive Knoten / Interactive nodes:                   H U M | M E L   *
 *     front1, front2                                              _/ \_       *
 * > Weitere Informationen / Further information:                              *
 *     http://www.rrz.uni-hamburg.de/de/services/hpc.html                      *
 * > Beratung und Hilfe / Support: mailto:hpc@uni-hamburg.de                   *
 *                                                                             *
  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

[yourHummelUsername@login1 14:48:23]~$

Public-key cryptography

For the Hummel cluster, public-key cryptography is used for authentication to increase security in comparison to simple username / password authentication. To access the cluster via ssh, as shown above, the user initially generates a private/public key pair. The public key file is then installed on the cluster by the cluster administrator. The private key file is kept in the home directory on the user’s computer (e.g. ~/.ssh/id_rsa). The private key never leaves the user’s computer, which is a central concept and advantage of public-key cryptography. For more details, see also challenge-response authentication in Wikipedia.

On the user’s computer, access to the private key needs to be protected by appropriate file permissions (only the owner of the private key file shall have permission to read the file) and above all by a passphrase. Note: Everyone who has access to the private key can gain access to the machine on which the public key is installed. Good practice is to use different key pairs for different HPC cluster systems.
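The file permissions mentioned above could be set roughly as follows. This is only a sketch that works on a throw-away directory with empty stand-in files; on a real system the directory would be ~/.ssh, and the variable name demo_ssh_dir is made up for this example.

```shell
# Sketch on a throw-away directory; the real directory would be ~/.ssh.
demo_ssh_dir=$(mktemp -d)/.ssh
mkdir -p "$demo_ssh_dir"
touch "$demo_ssh_dir/id_rsa" "$demo_ssh_dir/id_rsa.pub"   # empty stand-ins for real keys

chmod 700 "$demo_ssh_dir"              # directory: accessible by the owner only
chmod 600 "$demo_ssh_dir/id_rsa"       # private key: owner may read and write, nobody else
chmod 644 "$demo_ssh_dir/id_rsa.pub"   # public key: may be readable by everyone

ls -l "$demo_ssh_dir"                  # check the resulting permissions
```

The ssh client typically refuses to use a private key file that other users can read, so restrictive permissions are also a practical requirement.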

SSH key pairs

The ssh-keygen command is used to generate, manage and convert authentication keys for ssh. The example below generates an RSA key pair with a length of 4096 bits (the minimum length should be 2048 bits); the newer key type ed25519 is also a good choice. Windows users can use the PuTTYgen helper program together with third-party software like PuTTY or MobaXterm to generate key pairs. ssh-keygen and PuTTYgen use different private key file formats. PuTTYgen can convert between these formats via its import and export functionality (e.g. to reuse a private key when a user migrates the local PC from Windows to Linux or vice versa).

user@your-pc:~$ ssh-keygen -t rsa -b 4096

Generating public/private rsa key pair.
Enter file in which to save the key (/home/user/.ssh/id_rsa):
Enter passphrase (empty for no passphrase): ****************
Enter same passphrase again: ****************
Your identification has been saved in /home/user/.ssh/id_rsa.
Your public key has been saved in /home/user/.ssh/id_rsa.pub.  
The key fingerprint is:  
b8:df:d1:14:48:03:00:68:5e:46:9c:1a:b2:b2:d4:f4 user@your-pc  
The key's randomart image is:   
+--[ RSA 4096]----+
|   +oo....o      |  
|. +.=    . o     |
| =o=.     . .    |  
|o.o. E .     .   |  
|o.    . S   .    |
|.      .   o     |  
|      .   . .    |  
|       . . .     |
|        . .      |
+-----------------+

SSH-agent and Windows equivalents

A further advantage of public-key cryptography in connection with the ssh command is the possibility of using the ssh-agent helper program, which holds private keys in memory after they have been unlocked with their passphrase. Once the user has added a private key to the agent with the ssh-add command, the agent can use the key to log into servers without the user having to type the passphrase again:

$ ssh-add $HOME/.ssh/id_rsa
Enter passphrase for key '/home/user/.ssh/id_rsa': ****************
Identity added: /home/user/.ssh/id_rsa (/home/user/.ssh/id_rsa)

$ ssh yourHummelUsername@hummel1.rrz.uni-hamburg.de

This is similar to the idea of Single Sign-On (SSO). Windows users can use corresponding helper programs for third-party software like PuTTY or MobaXterm (e.g. the Pageant tool, which provides the same functionality).

Once the user is logged in to one of the gateway nodes (hummel1 or hummel2), the $HOME and $WORK directories can be accessed, e.g. to transfer data using the scp (secure copy) command.
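A transfer with scp from the user's own computer might be wrapped in two small helper functions, as sketched here. The function names hummel_put and hummel_get and the variable HUMMEL_LOGIN are hypothetical, the login is the placeholder from the example above, and the sketch assumes that the gateway accepts scp connections from your computer.

```shell
# Assumed login, taken from the example above; adjust to your own account.
HUMMEL_LOGIN="yourHummelUsername@hummel1.rrz.uni-hamburg.de"

# Copy a local file to a directory on the cluster (default: remote home directory).
# usage: hummel_put LOCAL_FILE [REMOTE_DIR]
hummel_put() {
    scp "$1" "$HUMMEL_LOGIN:${2:-}"
}

# Copy a file from the cluster into the current local directory.
# usage: hummel_get REMOTE_FILE
hummel_get() {
    scp "$HUMMEL_LOGIN:$1" .
}
```

Defining the functions does not contact the cluster; a call such as hummel_put example.c would then copy example.c to the remote home directory.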

A special security feature of the Hummel cluster is that an additional login from the gateway node to one of two front end nodes (front1 or front2) is required to gain full access – e.g. to build programs and submit jobs – to the cluster environment. This is called a multi-hop login.

The corresponding ssh command is shown below.

[yourHummelUsername@login1 14:48:23]~$ ssh front1

[yourHummelUsername@node001 14:48:33]~$

The front end nodes are two (in principle arbitrary) nodes of the cluster (usually node001 and node002). Using the alias names (front1 and front2) ensures that one connects to a node that provides the front end functionality. In contrast to the other compute nodes of the cluster, the front end nodes are meant e.g. for interactive program development and job submission.

Both ssh commands can also be grouped together in a single ssh command line using ssh connection chaining:

user@your-pc:~$ ssh -t yourHummelUsername@hummel1.rrz.uni-hamburg.de ssh front1

   _                                                                _
__/ \   __/ RRZ HPC Login | Zugang nur mit Berechtigung      \__   / \_
     \_/  \ RRZ HPC login | Access for authorized users only /  \_/    


[yourHummelUsername@node001 14:48:33]~$

The -t option forces the allocation of a terminal so that the connection is interactive. For more information about ssh and its options, see the man page for the ssh command (i.e. man ssh). man is an interface to the on-line reference manuals. For more information about using man pages, see man man.

Further possibilities (for more advanced users) to connect transparently to a front end node are proxy commands and ssh tunnels. Windows users can configure third-party software like PuTTY or MobaXterm to achieve the same convenience.
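One way to set up such a transparent connection with a reasonably recent OpenSSH client (version 7.3 or later) is the ProxyJump option in the client configuration file ~/.ssh/config. The fragment below is only a sketch: the alias hummel-front is made up, and it assumes that the gateway permits connections to be forwarded to front1, which would need to be confirmed for the cluster in question.

```
# ~/.ssh/config (sketch; the alias "hummel-front" is hypothetical)
Host hummel-front
    # front1 is resolved on the gateway, not on your own computer
    HostName front1
    User yourHummelUsername
    # first hop: the gateway node from the login example
    ProxyJump yourHummelUsername@hummel1.rrz.uni-hamburg.de
```

With such an entry, ssh hummel-front would perform both hops in one step, and scp could use the same alias.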

Agent forwarding

Agent forwarding is used to connect to a third HPC system from an HPC system that you have logged into with ssh from your computer. This is needed, for example, for copying files between two remote systems. The advantage of agent forwarding is that no secret information needs to be stored on a remote system (such as a private key) or passed to it (such as the password for the third machine).

Agent forwarding is switched off by default. It is switched on with the -A option and is used as shown in this example:

log into hpc_system1

      user@your-pc$ ssh -A username1@hpc_system1.example.com

from there, log into hpc_system2

      hpc_system1$ ssh username2@hpc_system2.example.com

or copy a file from hpc_system1 to hpc_system2

      hpc_system1$ scp example.c username2@hpc_system2.example.com:
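Instead of typing -A each time, agent forwarding can be enabled for a single host in ~/.ssh/config. The fragment below is a sketch that reuses the placeholder host and user names from the example. Forwarding should be limited to hosts you trust, because users who can bypass file permissions on the remote host can use the forwarded agent while you are connected.

```
# ~/.ssh/config (sketch; host and user names are the placeholders used above)
Host hpc_system1.example.com
    User username1
    # same effect as passing -A on the command line, for this host only
    ForwardAgent yes
```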

Text editors

On an HPC cluster one typically works in terminal mode (text mode, as opposed to graphical mode), i.e. one can use the keyboard, but there is neither mouse support nor graphical output. Accordingly, text editors have to be used in text mode as well. For newcomers this is an obstacle. As a workaround, or for more extensive editing, one can edit files on a personal computer, where a graphical mode is available, and copy them to the cluster when editing is completed. However, copying files back and forth can be cumbersome if edit cycles are short: for example, when testing code on a cluster where only small changes are necessary, using an editor directly on the cluster is very helpful.

Text user interface

Classic Unix/Linux text editors are vi/vim and GNU Emacs. Both are very powerful, but their handling is not intuitive. At a minimum, one should know the keystrokes to quit them:

vi: <esc>:q!<enter> (quit without saving)
vi: <esc>ZZ (save and quit)
emacs: <ctrl-x><ctrl-c> (quit; Emacs asks whether to save modified files)

GNU nano is a small text editor that is more intuitive, in particular, because the main control keys are displayed with explanations in two bars at the bottom of the screen. The exit key is:

nano: <ctrl-x>

Graphical user interface

In addition to text user interfaces, vim and GNU Emacs have graphical interfaces. On an HPC cluster it is possible to use graphical interfaces if X11 forwarding is enabled (see the -X option of the ssh command). Two other well-known text editors that you might find on your cluster are gedit and kate.

However, X11 forwarding can be annoyingly slow if the internet connection is not good enough.

A trick for advanced users is to mount file systems from a cluster with SSHFS. Then one can transparently use one’s favourite text editor on the local computer for editing files on the remote cluster.
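Mounting with SSHFS might look like the following sketch, written as two small helper functions. It assumes that sshfs is installed on the local computer and that the gateway allows SFTP access; the function names, the variable names and the mount point $HOME/hummel are made up for illustration.

```shell
# Assumed values: login taken from the example above, mount point is made up.
CLUSTER_LOGIN="yourHummelUsername@hummel1.rrz.uni-hamburg.de"
MOUNT_POINT="$HOME/hummel"

# Mount the remote home directory under $MOUNT_POINT.
cluster_mount() {
    mkdir -p "$MOUNT_POINT"
    sshfs "$CLUSTER_LOGIN:" "$MOUNT_POINT"
}

# Unmount again (on many Linux systems; on macOS, umount is used instead).
cluster_umount() {
    fusermount -u "$MOUNT_POINT"
}
```

After calling cluster_mount, the remote files appear below the mount point and can be opened with any local editor; cluster_umount releases the mount again.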
