Computing Systems
Our computing systems are almost exclusively Linux based and currently (2026-03) run Fedora 42 and AlmaLinux 9. For more info, please see the following links:

Conditions of Use of the PP computing facilities

  1. Users of the system should make themselves familiar with and obey the University's Policies and Procedures including the General Conditions of Use of Computing and Network Facilities.

  2. Users should protect their account with a well-chosen obscure password and not divulge that password to anyone else, or any Internet Service Provider or other organisation. If you think your account has been used by someone else then you must inform Mark Slater. Particular care must be taken when connecting from a computer outside the group. Change your password reasonably often using the passwd command at any terminal.

  3. Users should use email in a responsible way: avoid unnecessary multiple mailings and don't use email to transmit large files. If you receive junk mail (spam) about any subject, don't reply to it or click on any contained links. There are ways of reporting it locally.

  4. Users should only run software which is provided by the system and software directly associated with their experiments. Any other software should not be imported. You should not run any servers or services. Consult with Mark Slater (MWS) if in doubt. If you need a particular package and it is not provided on our system, then approach MWS who will look into installing it.

  5. You may not disconnect or connect any provided PCs on the network. Computers already on the network must only run the operating system and services agreed with MWS, via the connection point provided for that computer.

  6. Generally, personal laptops do not need to be connected to the wired network. See MWS for special dispensation to connect a laptop or unmanaged device to the wired network.

  7. Users should be aware that the university can monitor all web traffic, including the recording of each end of connections at the very least, and monitors all email traffic, as required by legislation.

  8. Users should not display, download, or store or send, any illegal or offensive material on our computer facilities. University regulations forbid storing of pornographic material on any computers: it's a disciplinary offence. Because browsers have a cache, simply viewing implies storing.

  9. The university reserves the right to do spot-checks on any computer without notice, and on previous form, will remove for several weeks equipment associated with infringements of the rules. (This would have a serious effect on research).

  10. Users may have their own web pages provided through the group web server, if they wish and (for students) if their supervisor agrees. These should be in keeping with the experiments they are involved with. Don't use others' material as this infringes their copyright. Those wishing to have web pages which reflect their personal interests outside their experiment should use an external web service provider, or a social networking site.

  11. Please take care of the equipment provided for you. Avoid marking it or putting stickers on it that might be difficult to remove in the future. Do not move computing items from one PC to another without consulting MWS. If there is a problem with a PC, consult MWS first. Keep equipment well-ventilated to avoid risk of fire.

  12. Drinks, food and cigarettes are bad for computers, particularly keyboards. Please keep them apart. (Smoking is not allowed in university buildings in any case).

Accessing the System

Logging on from a Desktop

You can use your Linux user ID (as given to you by Mark Slater) and password to log on to any of the desktops in the department. As the home areas are mounted from a single server, you should have the same environment and settings no matter which desktop you use.

Connecting via SSH

How you log in to the PP system remotely via SSH depends on your OS: Windows users should use PuTTY; on macOS and Linux, ssh is already provided, so just open a terminal and type 'ssh ...'

As regards which machine to access and where from, for basic terminal access similar to Lxplus, use eprexa.ph.bham.ac.uk, eprexb.ph.bham.ac.uk or eprexc.ph.bham.ac.uk. These are dedicated 'login' nodes that can always be connected to from CERN or on campus. From the login nodes, it's also possible to then log in to the individual desktops.
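
As a rough sketch of the above, the helpers below only print the ssh commands rather than running them. The login node names are the ones listed above; the user ID and the desktop name (mydesktop) are placeholders, and reaching a desktop in one step with OpenSSH's -J (jump host) option is an assumption about your client (it needs OpenSSH 7.3 or later).

```shell
# Print (not run) the ssh command for one of the PP login nodes.
# $1 = your Linux user ID, $2 = login node letter (a, b or c; default a)
pp_login_cmd() (
    user=$1
    node=${2:-a}
    case $node in
        a|b|c) ;;
        *) echo "unknown login node: eprex$node" >&2; exit 1 ;;
    esac
    echo "ssh ${user}@eprex${node}.ph.bham.ac.uk"
)

# Print the command to reach a desktop by jumping through a login node.
# $1 = user ID, $2 = desktop host name (placeholder), $3 = login node letter
pp_desktop_cmd() (
    user=$1
    desktop=$2
    node=${3:-a}
    echo "ssh -J ${user}@eprex${node}.ph.bham.ac.uk ${user}@${desktop}"
)
```

For example, pp_login_cmd alice b prints the command for eprexb, which you can then copy into your terminal.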

If you are a member of staff or PhD student, you can apply for remote access using a VPN client. Go to the Service Desk and request 'Remote Access'. This should allow you to download the VPN client. For more info, see this knowledge base article. After connecting through the VPN, you can log in directly to eprexa/b and also the office desktops.

If you are unable to use the VPN, you can access the login nodes by using a valid grid certificate. Load it in your browser and go here.

Finally, if all else fails, you can email MWS with your IP address and I'll add it ASAP to give you access to the login nodes.

File Access

Again, this depends on the OS. WinSCP is a great graphical solution for Windows. For macOS and Linux, 'sshfs' is a good and freely available solution that allows access to your remote directories over SSH. For Linux, install it using your normal package manager. For Mac, download the pkg from here and the required Fuse package here. After doing this, you can 'mount' your home area (or any other folder you have access to on our system) remotely using:

sshfs user@...:/remote/dir/to/access /local/dir/to/put/it

You can then browse it just like a normal directory. Be aware that you need to keep a constant internet connection for this to work!
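
Since a dropped connection is the usual source of trouble, here is a minimal sketch that prints an sshfs command with reconnection enabled, plus the matching unmount command. The reconnect and ServerAliveInterval options are standard sshfs/ssh options, but the 15 second interval and the function names are just illustrative choices, and whether fusermount3, fusermount or umount is the right unmount tool depends on your machine.

```shell
# Print (not run) an sshfs mount command with automatic reconnection.
# $1 = user@host, $2 = remote directory, $3 = local mount point
pp_sshfs_cmd() (
    remote=$1
    rdir=$2
    mnt=$3
    # reconnect: re-establish the session if the link drops
    # ServerAliveInterval: ssh keep-alive so a dead link is noticed quickly
    echo "sshfs -o reconnect,ServerAliveInterval=15 ${remote}:${rdir} ${mnt}"
)

# Print the unmount command that matches the tools installed locally.
# $1 = local mount point
pp_unmount_cmd() (
    mnt=$1
    if command -v fusermount3 >/dev/null 2>&1; then
        echo "fusermount3 -u ${mnt}"
    elif command -v fusermount >/dev/null 2>&1; then
        echo "fusermount -u ${mnt}"
    else
        echo "umount ${mnt}"      # e.g. on a Mac
    fi
)
```

The local mount point must be an existing empty directory, so create it with mkdir before running the printed command.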

Graphical Access

Due to the slowness of X sessions and the fact that I've seen many problems recently with macOS and XQuartz, I would recommend using VNC if you need graphical access to one of our machines. To do this, first install TigerVNC on your local machine from tigervnc.org or, more directly, from its bintray page here.

Next, choose the machine you want to run VNC on (and I would strongly recommend your own desktop in the department if possible!). When logging in, you will need to 'tunnel' the connection through as the firewall is blocking the normal VNC ports. This is a lot easier than it sounds - for each machine you log in to before the final machine running VNC (e.g. Lxplus and then eprexa), log in using the following:

ssh -4 -L 59nn:localhost:59nn user@machine...

where nn will be the desktop number used below. Note that for PuTTY, it's a little more complicated. There's a decent description here - just change the source port to 59nn and the destination port to localhost:59nn.

For the 'final' connection to eprex or epldt, do the same:

ssh -4 -L 59nn:localhost:59nn user@epldt...

and then from the terminal window you get to, run the following to start a VNC desktop running:

vncserver :nn -localhost

Again, nn is a number between 0 and 99, dropping the leading zero if there is one (pick any number you like - it'll tell you if another VNC server is running on this port!). It's best to stick with the same port number through all tunneled machines but technically, you can tunnel the 59nn from eprex/epldt to any port number > 1024.

You should now be able to start TigerVNC and connect to localhost:59nn. It should ask you for the password that you set up when first running vncserver. This should hopefully present you with a desktop just like one of our machines! You can close this window and log out whenever you want. As long as you follow the above instructions to log back in, your desktop will be kept running just as you left it. The only other thing to note is that the speed with which the desktop starts up seems to be very dependent on the number of other VNC sessions running, so it can take more than 10 minutes on eprexa/b - another reason to use your desktop if you can!
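
To make the 59nn bookkeeping concrete, the sketch below turns a display number into its port (5900 + nn) and prints the tunnel command for one hop. The function names and the example user are made up for illustration; the ssh options are exactly the ones given above.

```shell
# Convert a VNC display number (0-99) into its TCP port, 5900 + nn.
vnc_port() (
    nn=$1
    case $nn in
        [0-9]|[0-9][0-9]) ;;
        *) echo "display must be a number between 0 and 99" >&2; exit 1 ;;
    esac
    nn=${nn#0}                 # drop a leading zero so 08 is not read as octal
    echo $((5900 + ${nn:-0}))
)

# Print the tunnel command to use for each hop, including the final one.
# $1 = display number, $2 = user@host for this hop
vnc_tunnel_cmd() (
    port=$(vnc_port "$1") || exit 1
    echo "ssh -4 -L ${port}:localhost:${port} $2"
)
```

With display 7, vnc_tunnel_cmd 7 alice@eprexa.ph.bham.ac.uk prints the command for that hop; you would then run vncserver :7 -localhost on the final machine and point the viewer at localhost:5907.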

If you have finished using a VNC server, then you can kill it and any associated processes by doing:

vncserver -kill :nn

again, dropping the leading zero if necessary.

Finally, if you start the VNC viewer and see a desktop but can't click on anything, even after waiting a few minutes (it can take some time to start up) then try removing the file:

~/.vnc/xstartup

and, if you have it, the folders:

~/.cinnamon ~/.gnome ~/.gnome2

General System Information

Default Shell

By default, the login shell is Bash. This can be changed by sending a request to MWS. To run commands when you log in, you can put them in the file:

~/.bashrc

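This will be source'd every time you start an interactive shell, which includes logging in if your ~/.bash_profile sources it (the Fedora default). Note that if you want specific things to run only once at login, rather than for every new terminal window/screen window, then use: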

~/.bash_profile

For more info on the order of what scripts are executed and when on login/logout, see this page.

Operating Systems Available

In order to give a decent and up-to-date desktop experience, we use the Fedora distribution, which is the upstream of Red Hat Enterprise Linux (RHEL). However, as a lot of the software we use is tied to the more stable RHEL releases, you can switch to using either a CentOS 7 or AlmaLinux 9 container by doing:

cc7
or
alma9

If there is specific setup you want done when you enter the container, add the commands to the file:

~/.bashrc_cc7
or
~/.bashrc_alma9

and this will be source'd when the shell starts.
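
If you share setup between these files, it can help to check which OS the current shell is actually running in. A minimal sketch, assuming the standard /etc/os-release file is present inside the containers (the function name os_tag is made up):

```shell
# Print a short tag such as "almalinux-9" from an os-release file.
# $1 = file to read (defaults to /etc/os-release)
os_tag() (
    file=${1:-/etc/os-release}
    [ -r "$file" ] || { echo "cannot read $file" >&2; exit 1; }
    # os-release is a list of shell-style VAR=value assignments
    . "$file"
    # Keep only the major version, e.g. 9.4 becomes 9
    echo "${ID}-${VERSION_ID%%.*}"
)
```

You could then branch on the result, for example running experiment setup only when os_tag reports the container you expect.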

Changing Password

To change your password (which you should do on a semi-regular basis!), you can use the command:

passwd

After giving your current password you will be able to enter a new one. Note that it needs to pass the complexity restrictions to be accepted which means it must:

  • Be a minimum of 8 characters long (though the longer the better!)
  • Contain upper and lowercase letters, numbers and punctuation
  • Not include dictionary words or your username
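
The first two rules can be roughly pre-checked locally before running passwd. This is only a sketch of the rules as listed above: it cannot reproduce the server's dictionary or username tests, and the function name is made up. It reads the candidate from standard input so that it does not end up on a command line.

```shell
# Rough check of length and character classes; prints "ok" on success.
# Reads one candidate password from standard input.
check_password_shape() (
    IFS= read -r pw || [ -n "$pw" ] || { echo "no input"; exit 1; }
    [ "${#pw}" -ge 8 ] || { echo "too short"; exit 1; }
    case $pw in *[[:upper:]]*) ;; *) echo "needs an uppercase letter"; exit 1 ;; esac
    case $pw in *[[:lower:]]*) ;; *) echo "needs a lowercase letter"; exit 1 ;; esac
    case $pw in *[[:digit:]]*) ;; *) echo "needs a number"; exit 1 ;; esac
    case $pw in *[[:punct:]]*) ;; *) echo "needs punctuation"; exit 1 ;; esac
    echo "ok"
)
```

Passing this check does not guarantee that passwd will accept the password, since the real restrictions are applied by the system.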

Email

On joining the group you'll get an email address that is based on:

user@hep.ph.bham.ac.uk

This will be set to forward any group email to the forwarding address you gave when the account was created. This forwarding address can be any valid email you'd like, e.g. CERN, University, Gmail, etc. In addition, your email will automatically be added to the appropriate group email lists. If you'd like to change any of these settings, please get in touch with MWS.

There are several email lists available, the most useful ones are (all are @hep.ph.bham.ac.uk):

  • ppgroup, ppstaff, ppstud: the whole group, staff or students excluding ALICE members
  • epgroup, epstaff, epstud: the whole group, staff or students including ALICE members
  • localusers: Users with a registered office in the department
  • e.g. atlasgroup, atlasstaff, atlasstud: Each group (e.g. atlas, lhcb, dune, na62, etc.) has its own group, staff and student lists
  • newstud: PhD students that have joined in the last 12 months
  • uglist: Any currently active undergraduates

To find out which users are on which lists, you can just grep for them:

grep ^epgroup /disk/maildisk/home/mail/share/aliases
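
Going the other way (which lists is a given user on?) can be sketched with awk. This assumes the aliases file uses the common "listname: user1, user2" layout with one list per line, which I have not confirmed here; continuation lines are not handled and the function name is made up.

```shell
# Print the names of all lists whose members include the given user.
# $1 = user name, $2 = aliases file (defaults to the group path above)
lists_for_user() (
    user=$1
    file=${2:-/disk/maildisk/home/mail/share/aliases}
    awk -v u="$user" '
        /^[ \t]*#/ { next }                  # skip comment lines
        {
            colon = index($0, ":")
            if (colon == 0) next             # not a "name: members" line
            name = substr($0, 1, colon - 1)
            n = split(substr($0, colon + 1), members, /[ \t,]+/)
            for (i = 1; i <= n; i++)
                if (members[i] == u) { print name; break }
        }
    ' "$file"
)
```

Unlike a plain grep for the user name, this matches whole member names only, so a user called ann is not reported for a list containing joanna.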

To avoid confusion, note that your University provided email (adfusername@bham.ac.uk or adfusername@student.bham.ac.uk) is accessed through the page mail.bham.ac.uk. From this Outlook web interface, you can manage this email here or get it forwarded somewhere else (Go to 'Settings' -> 'Options' -> 'Inbox and Sweep Rules'). Again though, the login here is your University provided (or ADF) login, NOT your PP group Linux login!

Grid Certificates and Website SSL Access

It is highly likely you will need a Grid Certificate while working in the group. Your experiment may have particular rules for how to go about this, but if not you can request a CERN one from here. Alternatively, you can request a UK one from here. This will involve getting in touch with MWS to show your university ID in order to authorise your request. There is a dedicated page describing the process here: Requesting a Grid Certificate.

Printing and Scanning

To print when in the department, please use the Managed Print Service. The desktops are already set up to print to this using the 'SafeQ' printer option. After sending a print job, go to one of the printers (Coffee area 2nd floor, W333A on the 3rd floor), tap your University Card and give your central university username and password. You should then be able to print your job from the 'Waiting' Queue. You can also use these devices for photocopying or scanning to email. For more info about the service and instructions about how to set it up on a privately owned laptop, please see this KB article. Note that if you find it doesn't work, you may not have given MWS your ADF username - get in touch and it will be updated!

File Storage Information

Home Area

Each user has, by default, 120GB of space in their home area, which is stored on a fast NFS server with redundancy against hard disk failure. Backups (excluding *.root files) are taken every couple of days. This storage is generally meant for code and documents as it's designed for fast access to small files, e.g. compilation of code. If you find you need more space then email MWS. Also, by default all files created in your home area can only be accessed by you. Use the chmod command to allow read access for others if you wish.
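
As an illustration of the chmod step, the sketch below opens one directory tree for reading by everyone else. The function name is made up, and whether you want to include "others" as well as your group is your call.

```shell
# Give group and others read-only access to a directory tree.
# Capital X adds search (execute) permission to directories, and to files
# that are already executable, so plain files do not become executable.
# $1 = file or directory to share
share_readonly() (
    target=$1
    [ -e "$target" ] || { echo "no such path: $target" >&2; exit 1; }
    chmod -R go+rX "$target"
)
```

Note that others can only reach the shared directory if every directory above it, including your home directory itself, also gives them search permission (e.g. chmod go+x on it).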

Mass Storage

In addition to your home area, there is a general mass storage system running the Moose Filesystem which can be accessed from:

/disk/moose

This contains 855TB of storage but only half of that is usable due to two copies of each file being stored for redundancy. There is no other backup procedure so, though it's very unlikely, if we have two disks fail on different servers in quick succession (i.e. before the system has a chance to rebalance) there will be data loss. If you want to store files with even less chance of loss, put them in:

/disk/moose/critical

as 4 copies of these files are kept. Don't abuse this though, as MWS will ask for this area to be cleaned first if space is running low!
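
To get a feel for the cost of the extra copies, the sketch below multiplies the size of a directory by the number of copies kept. It is only an estimate: it assumes du on your local copy reflects the size of the data itself, and the function name is made up.

```shell
# Estimate the raw space (in KB) a directory would occupy once replicated.
# $1 = directory, $2 = number of copies kept (2 by default, 4 under critical)
raw_footprint_kb() (
    dir=$1
    copies=${2:-2}
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; exit 1; }
    used=$(du -sk "$dir" | cut -f1)
    echo $((used * copies))
)
```

So the same data costs twice as much raw space under /disk/moose/critical as it does elsewhere on /disk/moose.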

Under both of these directories are folders for each experiment. Only experiment members have read/write access to these but they can be organised as the experiment wishes.

Scratch Area

On each Linux machine there is a /scratch area which writes to the physical disk of the machine. This can be useful for storing large files that won't fit into /tmp. Note, however, that files here should be considered volatile and copied elsewhere for long term storage. Full or damaged disks and reinstallation of the machine could all result in data loss.

Batch Farm Information

Specifications

The batch system available in the PP group is running HTCondor and consists of 4 nodes with 64 cores per node and 2GB RAM per core. In total there are a maximum of 256 cores available. A fairshare system is in place to stop any single user monopolising the cluster. To get an overview of the whole cluster and see what's currently running, use condor_status.

Submitting and Managing Jobs

There are two principal ways of submitting: using a submission config script or direct submission of a script. The simplest option is just to submit a script using:

condor_qsub [script_name]

This will submit the job and it will wait in the queue until it's matched and then run on the given worker node. Note that you need to give an actual executable script in the above otherwise you'll get an error and the job will go on hold. If you want more control over the submission, including submitting batches of jobs, you should use a condor submission script. A basic version of this is:

Executable = [script_name]
Universe = vanilla
output = [stdout file]
error = [stderr file]
log = [condor log file]
# in MB
request_memory = 500
# in KB
request_disk = 3000000
environment="TESTVAR=mytestvar"
queue

There are many more options to this submission file so please see the condor documentation for more info.
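
For submitting a batch of jobs, one approach is to generate the submission file from a small helper. The sketch below is just one way to do it: the function name and the output file naming are made up, while $(Process) and $(Cluster) are standard HTCondor macros that expand to the job index and the cluster ID.

```shell
# Write a submit file that queues N copies of one executable script.
# $1 = script, $2 = number of jobs, $3 = submit file to create
make_submit() (
    script=$1
    njobs=$2
    out=$3
    # Catch the "not an executable script" mistake before the job goes on hold
    [ -f "$script" ] && [ -x "$script" ] || {
        echo "$script is not an executable file" >&2; exit 1; }
    {
        echo "Executable = $script"
        echo "Universe = vanilla"
        # Each job receives its own index (0, 1, ...) as the first argument
        echo 'arguments = $(Process)'
        echo 'output = job_$(Cluster)_$(Process).out'
        echo 'error = job_$(Cluster)_$(Process).err'
        echo 'log = job_$(Cluster).log'
        echo "request_memory = 500"
        echo "queue $njobs"
    } > "$out"
)
```

After something like make_submit ./job.sh 10 job.sub, the file would be submitted with condor_submit job.sub.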

After you've submitted your job, you can view the status of it and any other jobs you've submitted using condor_q. If you want more information about a job, use condor_q -better-analyze [jobID].

Troubleshooting

If your jobs refuse to run or go immediately on to HOLD, then there are a few ways to find out what the problem is:

  • Use condor_q -better-analyze [jobID] and see if that gives a reason for the failure
  • Look at the produced condor log file. This will record each step of your job's trip through the batch system and should give reasons for any failure
  • If your job has run but then failed, you can check the stderr file as that may also show problems
  • If your jobs are never going to run for whatever reason, please delete them using condor_rm [jobID]

If you've checked these but still can't figure out where the problem is, please get in touch with MWS.
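
When checking the condor log file by hand, the hold reason can be pulled out with a short helper. This sketch assumes the usual user log layout, where a line containing "Job was held" is followed by a line giving the reason; the function name is made up.

```shell
# Print each "Job was held" event from a condor log plus the line after it.
# $1 = condor log file (the one named by "log =" in the submission file)
hold_reasons() (
    log=$1
    [ -r "$log" ] || { echo "cannot read $log" >&2; exit 1; }
    awk '/Job was held/ { print; if ((getline line) > 0) print line }' "$log"
)
```

No output simply means no hold events were recorded in that log.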

Special Servers

epldt001

An older GPU chassis with 2 Tesla P100 cards available for use. It has CUDA 12.4 available (/usr/local/cuda) and has a base OS of AlmaLinux 9. If you log in using ssh as usual and run nvidia-smi you will see the specs of the cards. You should be able (and it is recommended) to use the latest TensorFlow releases (> 2.1.0) with this version of CUDA.

epldt002

This is a more powerful server to be used for software and jobs that can't be parallelised and will require significant computing power. It has Intel Xeon Gold 6152 CPUs providing 88 cores, 128GB RAM and 3.6TB HDD space. Though this is very useful for running large scale compute tasks and is intended to stop people using the login nodes, please use the batch system where possible as there is no fairshare or rationing of the resources and you could easily cause others problems if you blindly run jobs on it.

epldt003

A similar GPU chassis to epldt001 but this is more powerful and has an A100 card available.