HSPH is using the FAS Research Computing environment to host our data and run analysis. The compute cluster is Odyssey and runs the same OS as the HSPH HPCC environment. If you are curious, do take a look at their list of frequently asked questions (FAQ); a list of the currently installed software is available online as well. Additional software can be installed on request.
Please note that access to the FAS RC environment is currently limited to participants in the HSPH-FAS pilot project. We will open up access to all HSPH researchers as soon as possible.
Requesting an account
Accounts can be requested by filling out a simple signup form. Select Odyssey Cluster as the subject, provide information on your lab affiliation (sponsor) and list HSPH Pilot / Oliver Hofmann in the comments field. All other fields are optional.
Authentication and security
FAS uses a two-factor authentication system similar to what many banks, Google and other services offer. Details are available at OpenAuth. In short, a small program on your PC or smartphone provides you with a number during login, which you need in addition to your password. Client applications exist for Windows, OS X and Linux as well as iPhone, BlackBerry and Android devices.
The only way to access the FAS RC environment is via SSH. Typing your password and security code for every data transfer and shell can quickly become tedious, but OpenSSH supports sharing of authentication sessions.
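Session sharing is configured in the OpenSSH client through the ControlMaster, ControlPath and ControlPersist options. The following is a minimal sketch rather than an official RC recipe: the host alias odyssey, the hostname login.example.edu and the helper name write_mux_config are placeholders chosen for illustration, so substitute the login host you were given.

```shell
# Sketch: append an OpenSSH client stanza that lets later ssh/scp sessions
# reuse one authenticated connection to the same host.
# ASSUMPTIONS: the alias "odyssey" and any hostname you pass in are
# placeholders, not the real cluster address.
#   ControlMaster auto  - reuse an existing master connection or create one
#   ControlPath         - socket file for the shared connection
#                         (%r = remote user, %h = host, %p = port)
#   ControlPersist 4h   - keep the master open 4 hours after the first
#                         session exits
write_mux_config() {
    # $1: client config file to append to (e.g. "$HOME/.ssh/config")
    # $2: login host name
    cat >> "$1" <<EOF
Host odyssey
    HostName $2
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h:%p
    ControlPersist 4h
EOF
}
```

After running something like write_mux_config "$HOME/.ssh/config" login.example.edu, the first ssh odyssey asks for the password and security code; later ssh or scp calls to the same alias should reuse that connection while it stays open.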
Finally, in order to access some of the webservers and other services at FAS you will require an active VPN connection which also allows you off-campus access to your data.
Odyssey is the FAS research computing cluster. If you plan on running computationally intensive jobs, take a look at the queue information and make sure to browse through the extensive list of questions related to performance computing.
Dedicated setups exist for high-memory jobs, I/O-intensive tasks or shared memory algorithms, and you can monitor the current status of the system at all times. Note that it is relatively easy to overload the file storage system. If you are submitting a large number of tasks, it might make sense to throttle the number of concurrently running jobs. If in doubt, mail the very responsive RC Helpdesk and they will get right back to you.
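One way to throttle concurrent jobs under LSF (the scheduler behind bsub) is a job array with a slot limit, written as name[1-N]%M. The sketch below only builds and prints such a command so it can be reviewed first. The helper name array_submit_cmd, the queue name some_queue and the script run_task.sh are assumptions made for illustration, not names taken from the cluster.

```shell
# Sketch: build a bsub command for an LSF job array whose number of
# concurrently running elements is capped.
# ASSUMPTIONS: queue name, job name and script name are placeholders.
array_submit_cmd() {
    # $1 queue, $2 job name, $3 number of tasks, $4 max concurrent, $5 script
    # "name[1-N]%M" defines N array elements with at most M running at once.
    printf 'bsub -q %s -J "%s[1-%s]%%%s" %s\n' "$1" "$2" "$3" "$4" "$5"
}

# Print the command for review instead of submitting it directly.
array_submit_cmd some_queue align 500 20 ./run_task.sh
```

Inside the submitted script, LSF sets the LSB_JOBINDEX environment variable to the element number, which can be used to pick the input for each task.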
Odyssey allows interactive use which is great for exploring new tools or running shell-based sessions (SAS, Matlab, R) without having to submit jobs.
If you would like to use a graphical client you need to enable X11 forwarding. A sample session for using Matlab or SAS would look like this:
1. Log on to the cluster with X11 forwarding enabled
2. Request an interactive shell on Odyssey (bsub -Is -q interact -n 1 bash)
3. Load the SAS or Matlab module (e.g., module load hpc/sas-9.3)
4. Start the shell (
At this point the SAS GUI should pop up.
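If no window appears, it can help to confirm that X11 forwarding is active in the shell where the program is started. The following is a minimal sketch: the function name check_x11 is hypothetical, and it only inspects the DISPLAY variable, which ssh sets on the remote side when it is started with -X or -Y.

```shell
# Sketch: report whether this shell has an X11 display to draw on.
# It checks only that DISPLAY is set; it does not test that the display
# is actually reachable.
check_x11() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "No X11 display found; log in again with ssh -X" >&2
        return 1
    fi
    echo "X11 display available: $DISPLAY"
}
```

Running check_x11 right after logging in, and again inside the interactive shell, shows at which step the display setting is lost.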
Data storage and security
Data can be transferred to and from FAS using secure shell/copy only.
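A transfer with secure copy could be wrapped as in the sketch below. The helper name push_to_cluster, the login user@login.example.edu and the remote directory are placeholders for illustration; use the login host and storage path assigned to your lab.

```shell
# Sketch: copy a local file or directory to the cluster with scp.
# ASSUMPTIONS: the remote login and remote directory are supplied by the
# caller; none of the example values are real cluster addresses.
push_to_cluster() {
    # $1 local path, $2 remote login (user@host), $3 remote directory
    # -r copies directories recursively, -p keeps times and permissions.
    set -- scp -rp "$1" "$2:$3/"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"      # show the command without running it
    else
        "$@"
    fi
}
```

For example, DRY_RUN=1 push_to_cluster results/ user@login.example.edu /path/on/cluster prints the scp command; without DRY_RUN the copy is carried out and prompts for the password and security code unless a shared SSH session is already open.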
HSPH is in the process of purchasing and setting up additional backed-up storage to meet users’ demands. Please feel free to contact Oliver Hofmann about the status of this additional storage. HSPH researchers have access to the RC computing file systems, in particular 1TB of space per lab (/n/hsphsS10/hsphfs1/scratch). All data stored in this hsphS10 environment is redundantly stored and backed up at a second site.
- Backups are a second copy of data. Although all FAS storage hardware has built-in redundancy so that a limited number of disk drive failures and other hardware faults can be tolerated, a backup is required for the data to survive a catastrophic failure of the entire system or facility. The Odyssey cluster consists of several data centers and offers off-site (inter-datacenter) backup. Backups are not accessible from the cluster or campus network; recovery from backups is by request only.
- Checkpoints, also known as snapshots, are like a freeze-frame picture of data at a point in time. You can use checkpoints to undo recent changes to files, recover deleted files, etc. Though checkpoints function much like backups, they are not backups, since the data still exists as only one copy in one place (checkpoints are reconstructed algorithmically, not stored as separate copies). You can access checkpoints as easily as the primary storage.
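Recovering a deleted file from a checkpoint usually amounts to copying it back out of a read-only snapshot tree. The sketch below assumes the snapshot location is known and passed in by the caller; the text above does not say where checkpoints are mounted, so ask the RC Helpdesk for the actual path. The helper name restore_from_checkpoint is hypothetical.

```shell
# Sketch: copy one file back from a read-only checkpoint (snapshot) tree.
# ASSUMPTION: the snapshot root is provided by the caller; its real
# location on the cluster is not given here.
restore_from_checkpoint() {
    # $1 snapshot root, $2 file path relative to that root, $3 destination dir
    if [ ! -f "$1/$2" ]; then
        echo "not found in checkpoint: $1/$2" >&2
        return 1
    fi
    # -p preserves the original timestamps and permissions.
    cp -p "$1/$2" "$3/"
}
```

Copying to a separate destination directory first, rather than over the live file, makes it possible to compare the two versions before replacing anything.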
The FAS environment does not have a default process for handling data with special security requirements. If access to your data needs to be limited in any way, contact the RC Helpdesk before transferring data.