passwd command at any terminal.
You can use your Linux user ID (as given to you by Mark Slater) and password to log on to any of the desktops in the department. As the home areas are mounted from a single server, you should have the same environment and settings no matter which desktop you use.
How you log in to the PP system remotely via SSH depends on your OS: Windows users should use PuTTY, while on Mac and Linux ssh is already provided, so just open a terminal and type 'ssh ...'
As regards which machine to access and where from, for basic terminal access similar to Lxplus, use eprexa.ph.bham.ac.uk, eprexb.ph.bham.ac.uk or eprexc.ph.bham.ac.uk. These are dedicated 'login' nodes that can always be connected to from CERN or on campus. From the login nodes, it's also possible to then log in to the individual desktops.
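As a rough sketch of the hostnames above, the following shell helper prints (but does not run) the ssh command for a chosen login node, optionally hopping on to a desktop in one step. The function name pp_ssh_cmd and the desktop argument are made up for illustration. The -J (ProxyJump) option assumes OpenSSH 7.3 or later on your local machine and that the login nodes permit forwarding, which is not confirmed here:

```shell
# Hypothetical helper: print the ssh command for a login node (a, b or c).
# Usage: pp_ssh_cmd USER NODE [DESKTOP]
pp_ssh_cmd() {
    user="$1"
    node="$2"
    desktop="${3:-}"
    case "$node" in
        a|b|c) ;;
        *) echo "node must be a, b or c" >&2; return 1 ;;
    esac
    login="${user}@eprex${node}.ph.bham.ac.uk"
    if [ -n "$desktop" ]; then
        # -J jumps through the login node to reach the desktop in one command.
        echo "ssh -J ${login} ${user}@${desktop}"
    else
        echo "ssh ${login}"
    fi
}

# Example: show the command for logging in to eprexa as 'user'.
pp_ssh_cmd user a
```

Copy the printed command into your terminal, replacing 'user' with your own Linux user ID.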
If you are a member of staff or PhD student, you can apply for remote access using a VPN client. Go to the Service Desk and request 'Remote Access'. This should allow you to download the VPN client. For more info, see this knowledge base article. After connecting through the VPN, you can log in directly to eprexa/b and also the office desktops.
If you are unable to use the VPN, you can access the login nodes by using a valid grid certificate. Load it in your browser and go here.
Finally, if all else fails, you can email MWS with your IP address and I'll add it ASAP to give you access to the login nodes.
Again, this depends on the OS. WinSCP is a great graphical solution for Windows. For Mac and Linux, 'sshfs' is a good and freely available solution that allows access to your remote directories over SSH. For Linux, install it with your normal package manager. For Mac, download the pkg from here and the required Fuse package here. After doing this, you can 'mount' your home area (or any other folder you have access to on our system) remotely using:
sshfs user@...:/remote/dir/to/access /local/dir/to/put/it
You can then browse it just like a normal directory. Be aware that you need to keep a constant internet connection for this to work!
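When you have finished, the mount should be released again. A minimal sketch, assuming the standard FUSE tools (fusermount on Linux, plain umount on macOS). The helper name and the example mount point are hypothetical:

```shell
# Print the command that unmounts an sshfs directory.
# The optional second argument overrides the detected OS name (as from `uname -s`).
sshfs_unmount_cmd() {
    dir="$1"
    os="${2:-$(uname -s)}"
    case "$os" in
        Linux)  echo "fusermount -u ${dir}" ;;
        Darwin) echo "umount ${dir}" ;;
        *)      echo "unsupported OS: ${os}" >&2; return 1 ;;
    esac
}

# Example for a Linux laptop with a (hypothetical) mount point in $HOME/pp_home.
sshfs_unmount_cmd "$HOME/pp_home" Linux
```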
Due to the slowness of X sessions and the fact that I've seen many problems recently with Mac and XQuartz, I would recommend using VNC if you need graphical access to one of our machines. To do this, first install Tiger VNC on your local machine from tigervnc.org or, more directly, from its bintray page here.
Next, choose the machine you want to run VNC on (and I would strongly recommend your own desktop in the department if possible!). When logging in, you will need to 'tunnel' the connection through as the firewall is blocking the normal VNC ports. This is a lot easier than it sounds - for each machine you log in to before the final machine running VNC (e.g. Lxplus and then eprexa), log in using the following:
ssh -4 -L 59nn:localhost:59nn user@machine...
where nn will be the desktop number used below. Note that for PuTTY, it's a little more complicated. There's a decent description here - just change the source port to 59nn and the destination port to localhost:59nn.
For the 'final' connection to eprex or epldt, do the same:
ssh -4 -L 59nn:localhost:59nn user@epldt...
and then from the terminal window you get to, run the following to start a VNC desktop running:
vncserver :nn -localhost
again, where nn is a number between 0 and 99, dropping the leading zero if there is one (pick any number you like - it'll tell you if another VNC server is running on this port!). It's best to stick with the same port number through all tunnelled machines but, technically, you can tunnel the 59nn from eprex/epldt to any port number > 1024.
You should now be able to start Tiger VNC and connect to localhost:59nn. It will ask you for the password that you set up when first running vncserver, and should then present you with a desktop just like one of our machines! You can close this window and log out whenever you want. As long as you follow the above instructions to log back in, your desktop will be kept running just as you left it. The only other thing to note is that the speed with which the desktop starts up seems to be very dependent on the number of other VNC sessions running, so it can take more than 10 minutes on eprexa/b - another reason to use your desktop if you can!
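To make the nn bookkeeping concrete, here is a small sketch that turns a desktop number into the matching port and prints the tunnel and server commands described above. The function names are illustrative only, and the host is passed in as an argument because it depends on which machine you use:

```shell
# Port for VNC desktop :nn is 59nn, with nn zero-padded to two digits.
vnc_port() {
    # Drop one leading zero (e.g. 08 -> 8) so printf does not read it as octal.
    n="${1#0}"
    n="${n:-0}"
    if [ "$n" -lt 0 ] || [ "$n" -gt 99 ]; then
        echo "desktop number must be between 0 and 99" >&2
        return 1
    fi
    printf '59%02d\n' "$n"
}

# Print (not run) the commands for desktop number $1 on host $2.
vnc_commands() {
    nn="${1#0}"
    nn="${nn:-0}"
    port="$(vnc_port "$1")" || return 1
    echo "ssh -4 -L ${port}:localhost:${port} $2"
    echo "vncserver :${nn} -localhost"
}

# Example: desktop 5 uses port 5905 in the tunnel and :5 for vncserver.
vnc_commands 5 user@machine
```

The first printed line is the tunnel command to use for each hop and the second is the command to run on the final machine.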
If you have finished using a VNC server, then you can kill it and any associated processes by doing:
vncserver -kill :nn
again, dropping the leading zero if necessary.
Finally, if you start the VNC viewer and see a desktop but can't click on anything, even after waiting a few minutes (it can take some time to start up) then try removing the file:
~/.vnc/xstartup
and, if you have it, the folders:
~/.cinnamon ~/.gnome ~/.gnome2
By default, the login shell is Bash. This can be changed by sending a request to MWS. To run commands when you log in, you can put them in the file:
~/.bashrc
This will be source'd every time you log in to the system. Note that if you want specific things only for opening a new terminal window/screen window, then use:
~/.bash_profile
For more info on the order of what scripts are executed and when on login/logout, see this page.
In order to give a decent and up-to-date desktop experience, we use the Fedora distribution that is based on Red Hat. However, as a lot of the software we use is tied to the more stable RHEL releases, you can switch to using either a CentOS 7 or AlmaLinux 9 container by doing:
cc7
or
alma9
If there is specific setup you want done when you enter the container, add it to the file:
~/.bashrc_cc7
or
~/.bashrc_alma9
and this will be source'd when the shell starts.
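Since a shell can look much the same inside and outside a container, it can help to print which OS the current shell is running on. A minimal sketch that could go in ~/.bashrc_cc7 or ~/.bashrc_alma9, assuming the container provides the standard /etc/os-release file (the function name is made up for illustration):

```shell
# Print the OS description of the environment this shell is running in.
current_os() {
    if [ -r /etc/os-release ]; then
        # Run in a subshell so the variables from os-release do not leak out.
        ( . /etc/os-release && printf '%s\n' "${PRETTY_NAME:-unknown}" )
    else
        printf 'unknown\n'
    fi
}

echo "Now running on: $(current_os)"
```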
To change your password (which you should do on a semi-regular basis!), you can use the command:
passwd
After giving your current password you will be able to enter a new one. Note that it needs to pass the complexity restrictions to be accepted which means it must:
On joining the group you'll get an email address that is based on:
user@hep.ph.bham.ac.uk
This will be set to forward any group email to the forwarding address you gave when the account was created. This forwarding address can be any valid email you'd like, e.g. CERN, University, Gmail, etc. In addition, your email will automatically be added to the appropriate group email lists. If you'd like to change any of these settings, please get in touch with MWS.
There are several email lists available, the most useful ones are (all are @hep.ph.bham.ac.uk):
ppgroup, ppstaff, ppstud: the whole group, staff or students excluding ALICE members
epgroup, epstaff, epstud: the whole group, staff or students including ALICE members
localusers: Users with a registered office in the department
atlasgroup, atlasstaff, atlasstud: Each group (e.g. atlas, lhcb, dune, na62, etc.) have their own group, staff and student lists
newstud: PhD students that have joined in the last 12 months
uglist: Any currently active undergraduates
To find out which users are on which lists, you can just grep for them:
grep ^epgroup /disk/maildisk/home/mail/share/aliases
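The reverse lookup (which lists is a given user on?) can be sketched in the same way. This assumes the aliases file uses the usual 'listname: member, member' layout, which the grep above suggests but does not confirm. The function name is hypothetical:

```shell
# List the mailing lists whose alias line mentions a given user.
# Usage: lists_for_user USER [ALIASES_FILE]
lists_for_user() {
    user="$1"
    file="${2:-/disk/maildisk/home/mail/share/aliases}"
    # -w matches the username as a whole word; cut keeps the list name before ':'.
    grep -w -- "$user" "$file" | cut -d: -f1
}
```

On one of the group machines you would call it as lists_for_user "$USER". Note that the whole-word match can still produce false positives, for instance when one username forms part of a hyphenated name.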
To avoid confusion, note that your University provided email (adfusername@bham.ac.uk or adfusername@student.bham.ac.uk) is accessed through the page mail.bham.ac.uk. From this Outlook web interface, you can manage this email here or get it forwarded somewhere else (Go to 'Settings' -> 'Options' -> 'Inbox and Sweep Rules'). Again though, the login here is your University provided (or ADF) login, NOT your PP group Linux login!
It is highly likely you will need a Grid Certificate while working in the group. Your experiment may have particular rules for how to go about this, but if not you can request a CERN one from here. Alternatively, you can request a UK one from here. This will involve getting in touch with MWS to show your university ID in order to authorise your request. There is a dedicated page describing the process here: Requesting a Grid Certificate.
To print when in the department, please use the Managed Print Service. The desktops are already set up to print to this using the 'SafeQ' printer option. After sending a print job, go to one of the printers (Coffee area 2nd floor, W333A on the 3rd floor), tap your University Card and give your central university username and password. You should then be able to print your job from the 'Waiting' queue. You can also use these devices for photocopying or scanning to email. For more info about the service and instructions about how to set it up on a privately owned laptop, please see this KB article. Note that if you find it doesn't work, you may not have given MWS your ADF username - send it to MWS and it will be updated!
Each user has, by default, 120GB of space in their home area, which is stored on a fast NFS server with redundancy against hard disk failure. Backups (excluding *.root files) are taken every couple of days. This storage is generally meant for code and documents as it's designed for fast access to small files, e.g. compilation of code. If you find you need more space then email MWS. Also, by default all files created in your home area can only be accessed by you. Use the chmod command to allow read access for others if you wish.
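As an illustration of the chmod step, the sketch below opens a directory tree for reading by everyone else while leaving write access with you. The function name is made up, and which directory you share is up to you:

```shell
# Give group and others read-only access to a directory tree.
# Capital X adds execute (search) permission only to directories
# and to files that are already executable.
share_readonly() {
    chmod -R go+rX -- "$1"
}
```

For example, share_readonly ~/shared would open up a (hypothetical) 'shared' folder. Other users also need search (x) permission on every directory above it, including your home directory itself, which this sketch does not change.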
In addition to your home area, there is a general mass storage system running the Moose Filesystem which can be accessed from:
/disk/moose
This contains 855TB of storage but only half of that is usable due to two copies of each file being stored for redundancy. There is no other backup procedure so, though it's very unlikely, if we have two disks fail on different servers in quick succession (i.e. before the system has a chance to rebalance) there will be data loss. If you want to store files with even less chance of loss, put them in:
/disk/moose/critical
as 4 copies of these files are kept. Don't abuse this though, as MWS will ask for this to be cleaned first if space is running low!
Under both of these directories are folders for each experiment. Only experiment members have read/write access to these but they can be organised as the experiment wishes.
On each Linux machine there is a /scratch area which writes to the physical disk of the machine. This can be useful for storing large files that won't fit into /tmp. Note, however, that files here should be considered volatile and copied elsewhere for long term storage. Full or damaged disks and reinstallation of the machine could all result in data loss.
The batch system available in the PP group is running HTCondor and consists of 4 nodes with 64 cores per node and 2GB RAM per core. In total there are a maximum of 256 cores available. A fairshare system is in place to minimise the monopoly of the cluster by any single user. To get an overview of the whole cluster and see what's currently running, use condor_status.
There are two principal ways of submitting: using a submission config script or direct submission of a script. The simplest option is just to submit a script using:
condor_qsub [script_name]
This will submit the job and it will wait in the queue until it's matched and then run on the given worker node. Note that you need to give an actual executable script in the above otherwise you'll get an error and the job will go on hold. If you want more control over the submission, including submitting batches of jobs, you should use a condor submission script. A basic version of this is:
Executable = [script_name]
Universe = vanilla
output = [stdout file]
error = [stderr file]
log = [condor log file]
# in MB
request_memory = 500
# in KB (roughly 3 GB)
request_disk = 3000000
environment="TESTVAR=mytestvar"
queue
There are many more options to this submission file so please see the condor documentation for more info.
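For submitting a batch of jobs, one common pattern is to queue several copies and use the $(Process) macro to tell them apart. The sketch below writes a trivial executable job script and a matching submission file. All file names are hypothetical and the resource request is just copied from the example above:

```shell
# A trivial job script; condor needs it to be executable.
cat > hello_job.sh <<'EOF'
#!/bin/bash
echo "Job $1 running on $(hostname)"
EOF
chmod +x hello_job.sh

# Submission file queuing 10 copies; $(Process) runs from 0 to 9.
cat > hello_batch.sub <<'EOF'
Executable = hello_job.sh
Universe   = vanilla
arguments  = $(Process)
output     = hello_$(Process).out
error      = hello_$(Process).err
log        = hello_batch.log
# in MB
request_memory = 500
queue 10
EOF

# Then submit on a machine with HTCondor using: condor_submit hello_batch.sub
```

Each job gets its own stdout and stderr file, while the condor log is shared between them.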
After you've submitted your job, you can view the status of it and any other jobs you've submitted using condor_q. If you want more information about a job, use condor_q -better-analyze [jobID].
If your jobs refuse to run or go immediately on to HOLD, then there are a few ways to find out what the problem is:
condor_q -better-analyze [jobID] and see if that gives a reason for the failure
condor_rm [jobID]
If you've checked these but still can't figure out where the problem is, please get in touch with MWS.
This is an older GPU chassis with 2 Tesla P100 cards available for use. It has CUDA 12.4 available (/usr/local/cuda) and has a base OS of AlmaLinux 9. If you log in using ssh as usual and run nvidia-smi, you will see the specs of the cards. You should be able (and it is recommended) to use the latest TensorFlow releases (> 2.1.0) with this version of CUDA.
This is a more powerful server to be used for software and jobs that can't be parallelised and will require significant computing power. It has Intel Xeon Gold 6152 CPUs providing 88 cores, 128GB RAM and 3.6TB HDD space. Though this is very useful for running large scale compute tasks and is intended to stop people using the login nodes, please use the batch system where possible, as there is no fairshare or rationing of the resources and you could easily cause others problems if you blindly run jobs on it.
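A quick way to check the GPU setup after logging in might look like the sketch below. It assumes the CUDA tools live in the conventional bin subdirectory of /usr/local/cuda, which is not stated above, and the function name is made up:

```shell
# Summarise the GPUs if nvidia-smi is available; otherwise say so.
gpu_summary() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi --query-gpu=name,memory.total --format=csv
    else
        echo "nvidia-smi not found: are you logged in to the GPU machine?"
    fi
}

# Make the CUDA tools (e.g. nvcc) visible in this shell.
# Assumption: they are installed under /usr/local/cuda/bin.
export PATH="/usr/local/cuda/bin:${PATH}"

gpu_summary
```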
A similar GPU chassis to epldt001 but this is more powerful and has an A100 card available.