README for LO_roms_user

This repo is a place for user versions of code used to compile ROMS code associated with the git repository LO_roms_source_git.

These notes are written for klone.


Overview: klone is a UW supercomputer in the hyak system. It is what we use for all our ROMS simulations.

Here are examples of aliases I have in my Mac's ~/.bash_profile (equivalent to ~/.bashrc on the linux machines) to quickly get to my machines:

alias klo='ssh [email protected]'
alias pgee='ssh [email protected]'
alias agee='ssh [email protected]'

Note: klone1 is the same as klone.


Tools to control jobs running on klone

There is excellent documentation of the UW hyak system, for example starting here:

https://hyak.uw.edu/docs/compute/start-here

I encourage you to explore the tabs on the left of that page to answer any questions you have.

/gscratch/macc is our working directory on klone because on hyak we are the "macc" group. I have created my own directory inside that: "parker", where all my code for running ROMS is stored.

Here are a few useful commands.

When you have a job running on klone you can check on it using:

squeue -A macc

-A refers to the account (macc in our case). We also have resources in "coenv".

You could also use other command line arguments to get specific info:

-p [compute, cpu-g2, ckpt-g2] for different "partitions", which is a fancy word for the type of computer your job is running on.

-u [a UW NetID] to see info on the job(s) for a specific user

If you want to stop a running job, find the job ID (the number to the left in the squeue listing) and issue the command:

scancel [job ID]

Since your job will typically have been launched by a python driver you will also want to stop that driver. Use "top" to find the associated process ID (PID), and then use the "kill" command.
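For example, the whole sequence might look like this (the job ID, NetID, and PID below are placeholders):

squeue -A macc              # find the job ID of the run you want to stop
scancel 1234567             # cancel that ROMS job
top -u [your UW NetID]      # note the PID of the python3 driver process
kill [PID]                  # stop the driver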


Getting resource info

hyakstorage will give info about storage on klone. Use hyakstorage --help to get more info on command options.

hyakalloc will give info on the nodes we have priority access to.

More specifics about nodes we can use:

-A macc -p compute: These are the original klone nodes. We own 600 cores (15 nodes with 40 cores each). We are allocated 1 TB of storage for each node, so 15 TB total.

-A macc -p cpu-g2: These are the new klone nodes. We own 480 cores (15 "slices" with 32 cores each). Each node consists of 6 slices, so we own 2.5 nodes. The advantage of running on these slices is that it is easier for the scheduler to allocate resources because they are all on one node. They are also faster. Currently 6 slices are reserved for the daily forecast system.

-A coenv -p cpu-g2: We own 320 cores (10 slices with 32 cores each). This is in a separate account because of the history of how they were purchased.

-p ckpt-g2: These are cpu-g2 nodes that are available to anyone in the UW system, no -A account needed. They have proven to be useful even for long, large runs.


Notes on using these resources

The differences among these compute resources have been incorporated into the new driver_roms00.py. You need to specify three flag-value pairs:

-grp [macc, coenv] (not needed if using -cpu ckpt-g2)

-cpu [compute, cpu-g2]

-np [some number]

For example:

python3 driver_roms00.py -grp macc -cpu compute -np 200 [plus other required flags]

For the old "compute" nodes you would want to use -np as some integer times 40, e.g. 40, 80, 200. If you use 200 for example your job will be using 5 nodes, which you can confirm sith squeue.

For the new cpu-g2 nodes you would want to use -np as some integer times 32, e.g. 32, 64, 160, 192. It can be more reliable to have a job on one node, so 192 is a good maximum value.
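For example, a job on the newer nodes using our coenv allocation might look like this (other required flags omitted, as above):

python3 driver_roms00.py -grp coenv -cpu cpu-g2 -np 192 [plus other required flags]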

In general, everyone except Parker should stick to coenv/cpu-g2, macc/compute, or ckpt-g2 unless I specifically give you the okay to work on macc/cpu-g2.


Some resources specific to the coenv nodes

Useful email: [email protected]

  • This mailman list was created to help facilitate communication among all the COENV Hyak owners and users, with an emphasis on using it to:
  • Communicate when there is a job/scheduling issue within the COENV nodes, with the hope that a resolution will be handled by this group.
  • Communicate when your job would benefit from extra nodes/cores beyond what your group has purchased. This requires permission from your fellow COENV node owners and users.
  • Allow eSITS to monitor COENV communications and step in for assistance or resolution as needed or required.

Useful commands: There are some new commands available to College of the Environment Hyak users that should help increase the visibility of our shared Hyak compute resources. Note: to access the commands below, run this script, then log out and log back into your Hyak terminal:

/gscratch/coenv/shared/bin/coenv_install.sh

coenvalloc prints out a summary of the current usage of allocated coenv Hyak resources, broken down by PI Group. By default only the current user's PI Group is shown, but you can view all groups using the '--all' option.

coenvjobs shows a summary of jobs that are currently running under the coenv, broken down by PI group. As with coenvalloc, the default is to only show the current PI group's jobs, but you can see all jobs using the '--all' option.
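For example, after the one-time install the commands can be run from the head node like this:

/gscratch/coenv/shared/bin/coenv_install.sh   # one time, then log out and back in
coenvalloc --all                              # allocation summary for all PI groups
coenvjobs --all                               # running jobs for all PI groups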


Once you have gotten a klone account from our system administrator, you have two directories to be aware of.

First directory: In your home directory (~) you will need to add some lines to your .bashrc using vi or whatever your favorite command line text editor is.

Here is my .bashrc on klone as of 8/29/2025:

# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
	. /etc/bashrc
fi

# User specific environment
if ! [[ "$PATH" =~ "$HOME/.local/bin:$HOME/bin:" ]]
then
    PATH="$HOME/.local/bin:$HOME/bin:$PATH"
fi
export PATH

LODIR=/gscratch/macc/local
NFDIR=${LODIR}/netcdf-ifort
NCDIR=${LODIR}/netcdf-icc
export LD_LIBRARY_PATH=${NFDIR}/lib:${NCDIR}/lib:${LD_LIBRARY_PATH}
export PATH=/gscratch/macc/local/netcdf-ifort/bin:$PATH
export PATH=/gscratch/macc/local/netcdf-icc/bin:$PATH

# New imports 2025.07.14 after Darr compiled newer NetCDF libraries
#LODIR=/gscratch/macc/local
#NFDIR=${LODIR}/netcdf-ifort-4.6.2
#NCDIR=${LODIR}/netcdf-c-4.9.3
#export LD_LIBRARY_PATH=${NFDIR}/lib:${NCDIR}/lib:${LD_LIBRARY_PATH}
#export PATH=/gscratch/macc/local/netcdf-ifort-4.6.2/bin:$PATH
#export PATH=/gscratch/macc/local/netcdf-c-4.9.3/bin:$PATH

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# User specific aliases and functions
LOd=/gscratch/macc/parker/LO/driver
alias cdpm='cd /gscratch/macc/parker'
alias cdLo='cd /gscratch/macc/parker/LO'
alias cdLu='cd /gscratch/macc/parker/LO_user'
alias cdLoo='cd /gscratch/macc/parker/LO_output'
alias cdLor='cd /gscratch/macc/parker/LO_roms'
alias cdLru='cd /gscratch/macc/parker/LO_roms_user'
alias cdLrs='cd /gscratch/macc/parker/LO_roms_source_git'
alias cdLod='cd /gscratch/macc/parker/LO_data'
alias pmsrun='srun -p compute -A macc --pty bash -l'
alias pmsrun2='srun -p cpu-g2 -A macc --pty bash -l'
alias buildit='./build_roms.sh -j 10 < /dev/null > bld.log &'
alias buildit_dev='./build_roms.sh -j 10 -b develop < /dev/null > bld.log &'
alias mli='module load intel/oneAPI'

# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/gscratch/macc/parker/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/gscratch/macc/parker/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/gscratch/macc/parker/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/gscratch/macc/parker/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<

The section of aliases is what I use to help move around quickly. You might want similar aliases, but be sure to substitute the name of your working directory for "parker".

In particular you will need to copy and paste in the section with all the module and export lines. These make sure you are using the right NetCDF libraries.

The conda part was added automatically when I set up a python environment on klone. At this point, however, you DO NOT need to create a new python environment on klone. The one that is already there is enough to do all our model runs.

TO DO: I need to set the base working directory as a variable.

Second directory: The main place where you will install, compile, and run ROMS is your working directory:

/gscratch/macc/[your directory name]. We call this (+) below.

Note: Even though my username on klone is "pmacc" my main directory is "parker". This implies that there is less restriction in naming things on klone compared to apogee and perigee. I don't recall who set up my initial directory. Either David Darr or I did it.


Set up ssh-keygen to apogee

The LO ROMS driver system tries to minimize the files we store on hyak, because the ROMS output files could quickly exceed our quotas. To do this the drivers (e.g. LO/driver/driver_roms00.py) use scp to copy forcing files and ROMS output files from apogee or perigee, where we have lots of storage. Then the driver automatically deletes unneeded files on hyak after each day it runs. To allow the driver to do this automatically you have to grant it access to your account on perigee or apogee, using the ssh-keygen steps described here.

Log onto klone1 and do:

ssh-keygen

and hit return for most everything. However, you may encounter a prompt like this:

Enter file in which to save the key (/mmfs1/home/pmacc/.ssh/id_rsa):
/mmfs1/home/pmacc/.ssh/id_rsa already exists.
Overwrite (y/n)?

Looking HERE, I found out that id_rsa is the default name that it looks for automatically. You can name the key anything and then just refer to it when using ssh etc., like:

ssh [email protected] -i /path/to/ssh/key

In the interests of tidying up I chose to overwrite in the above. When I did this it asked for a passphrase and I hit return (no passphrase).

Then I did:

ssh-copy-id [email protected]

(it asks for my apogee password)

And now I can ssh and scp from klone to apogee without a password, and on apogee it added a key with [email protected] at the end to my ~/.ssh/authorized_keys.

Similarly, on klone there is now an entry in ~/.ssh/known_hosts for apogee.ocean.washington.edu.

So, in summary: for going from klone1 to apogee it added to:

  • ~/.ssh/known_hosts on klone, and
  • ~/.ssh/authorized_keys on apogee

Now I can run ssh-copy-id again for other computers, without having to do the ssh-keygen step.
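A quick way to confirm the key is working (substitute your own NetID; the remote file path is a placeholder):

ssh [your NetID]@apogee.ocean.washington.edu hostname                 # should print the hostname with no password prompt
scp [your NetID]@apogee.ocean.washington.edu:[some remote file] .    # should also work without a password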

Don't worry if things get messed up. Just delete the related entries in the .ssh files and start again. This is a good place to remind yourself that you need to be able to edit text files from the command line on remote machines, e.g. using vi.


Working from (+), clone the LO repo:

git clone https://github.com/parkermac/LO.git

Also clone your own LO_user repo. Note that you do not have to install the "loenv" python environment. All the code we run on klone is designed to work with the default python installation that is already there.
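For example, the LO_user clone would look something like this (the URL is a placeholder for your own GitHub account):

git clone https://github.com/[your GitHub account]/LO_user.git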


Before you start using ROMS you should get a ROMS account. See the first bullet link below.

Places for ROMS info:


Get the ROMS source code

Then put the ROMS source code on klone, again working in (+). Do this using git. Just type this command. This will create a folder LO_roms_source_git with all the ROMS code.

git clone https://github.com/myroms/roms.git LO_roms_source_git

You can bring the repo up to date anytime from inside LO_roms_source_git by typing git pull.


Next, create (on your personal computer) a git repo called LO_roms_user, and publish it to your account on GitHub.

Copy some of my code from https://github.com/parkermac/LO_roms_user into your LO_roms_user. Specifically you want to get the folder "upwelling".

This is the upwelling test case that comes with ROMS. It is always the first thing you should try to run when moving to a new version of ROMS or a new machine.

I have created a few files to run it on klone:

  • build_roms.sh modified from LO_roms_source_git/ROMS/Bin. You need to edit line 173 so that MY_ROOT_DIR is equal to your (+).
  • You may also need to issue the command chmod u+x build_roms.sh so that you have permission to execute that script.
  • upwelling.h copied from LO_roms_source_git/ROMS/Include. No need to edit.
  • roms_upwelling.in modified from LO_roms_source_git/ROMS/External. You will need to edit line 76 so that the path to varinfo.yaml points to (+).
  • klone_batch0.sh created from scratch. You will need to edit line 24 so that RUN_DIR points to (+).

After you have edited everything on your personal computer, push it to GitHub, and clone it to (+) on klone.
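Roughly, that workflow is (paths and the repo URL are placeholders):

# on your personal computer, from inside LO_roms_user
git add upwelling
git commit -m "upwelling test case for klone"
git push
# then on klone, working in (+)
git clone https://github.com/[your GitHub account]/LO_roms_user.git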


Now you are ready to compile and run ROMS (in parallel) for the first time!

Working on klone in the directory LO_roms_user/upwelling, do these steps, waiting for each to finish, to compile ROMS:

srun -p compute -A macc --pty bash -l

The purpose of this is to log you onto one of our compute nodes because in the hyak system you are supposed to compile on a compute node, leaving the head node for stuff like running our drivers and moving files around. You should notice that your prompt changes, now showing which node number you are on. Any user in the LiveOcean group should be able to use this command as-is because "macc" refers to our group ownership of nodes, not a single user. Note that in my .bashrc I made an alias pmsrun for this hard-to-remember command. I also have pmsrun2 to use "-p cpu-g2", the next-generation nodes. Don't use these without checking with Parker; some are reserved for the daily forecast system! It makes no difference which nodes you use to compile on.

Then before you can do the compiling on klone you have to do:

module load intel/oneAPI

I have this aliased to mli in my .bashrc.

Then to actually compile you do:

./build_roms.sh -j 10 < /dev/null > bld.log &

This will take about ten minutes, spew a lot of text to bld.log, and result in the executable romsM. It also makes a folder Build_romsM full of intermediate things such as the .f90 files that result from the preprocessing of the original .F files. I have this aliased as buildit in my .bashrc.

The -j 10 argument means that we use 10 cores to compile, which is faster. Note that each compute node on klone has 40 cores (or 32 cores per slice if you are using the cpu-g2 partition).

On occasion I have a problem where keyboard input (like hitting Return because you are impatient) causes the job to stop. That is why I added the < /dev/null thing to this command.
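While the build is running you can keep an eye on it with standard commands, e.g.:

tail -f bld.log     # watch the compile output as it is written
ls -l romsM         # after it finishes, confirm the executable was created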

>>> IMPORTANT: After compiling is done, DO NOT FORGET TO: <<<

logout

to get off of the compute node and back to the head node. If I forget to do logout and instead try to run ROMS from the compute node it will appear to be working but not make any progress.

Then to run ROMS do (from the klone head node, meaning after you logged out of the compute node):

sbatch -p compute -A macc klone_batch0.sh

This will run the ROMS upwelling test case on 4 cores. It should take a couple of minutes. You can add the < > & things to the sbatch command line so you don't have to wait for it to finish.

If it ran correctly it will create a log file roms_log.txt and NetCDF output: roms_[his, dia, avg, rst].nc
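A couple of quick sanity checks (assuming the NetCDF command-line tools are on your path, which they should be with the .bashrc settings above):

tail roms_log.txt               # look for the normal end-of-run messages
grep -i error roms_log.txt      # check for error messages
ncdump -h roms_his.nc | head    # peek at the header of the history file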


Running things by cron

These are mainly used by the daily forecast but can also be helpful for checking on long hindcasts and sending you an email. See LO/driver/crontabs for my current versions. You can look at these to see examples of the commands I use with the forcing and ROMS drivers.
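If you set one up yourself, the crontab format is the standard five time fields followed by a command; a hypothetical entry (the script name and paths are placeholders, not a real LO driver invocation) might look like:

crontab -e                      # edit your crontab on klone
# min hour day month weekday  command
30 6 * * * python3 /gscratch/macc/[your directory name]/LO/driver/[your checking script]   # cron may need full paths since it does not read your .bashrc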


LO Compiler Configurations

Below we list the current folders where we define LO-specific compiling choices. The name of each folder refers to [ex_name] in the LO run naming system. Before compiling, each contains:

  • build_roms.sh, which can be copied directly from your upwelling folder; no need to edit.
  • [ex_name].h This has configuration-specific compiler flags. You can explore the full range of choices and their meanings in LO_roms_source/ROMS/External/cppdefs.h.
  • fennel.h if this is a run with biology.

NOTE: to run any of these, or your own versions, you have to make the LO_data folder in (+) and use scp to get your grid folder from perigee or apogee.
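A hypothetical version of that step (your directory name, NetID, and the remote grid path are placeholders; check where your grid folder actually lives on perigee or apogee):

mkdir -p /gscratch/macc/[your directory name]/LO_data
scp -r [your NetID]@apogee.ocean.washington.edu:[path to your grid folder] /gscratch/macc/[your directory name]/LO_data/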

NOTE: the ex_name can have numbers, but no underscores, and all letters MUST be lowercase.

Naming conventions: There are no formal naming conventions, but I typically start with a letter like "x", or "xn" if it is for a nested (no tides) case. Then a number like "4" to give some indication of where it is in our development. I typically append "b" if the run includes biology.


CURRENT

x11b

This is the current primary forecast executable, as of 6/22/2025. It is like x10ab but with an updated ROMS version (4.3, as of 2025.06.12), compiled using the -b develop branch (probably not needed if you did git pull recently in LO_roms_source_git), and defining OMEGA_IMPLICIT. This has the Harcourt turbulence improvements as the default advection scheme. I also created a new build_roms.sh, to keep it up to date with the one in the ROMS repo.

x11ab

Like x11b but defining AVERAGES. Intended for daily saves.

xn11b

Like x11b but without tides. Used for nested runs.

y11b

A test of using different compiler flags (part of the bit-reproducibility effort). In build_roms.sh you can just point to a local version of Linux-ifort.mk and make the edits there. This run also defined OUT_DOUBLE in the cppdefs; this proved to be very informative in the exploration of numerical "noise."

x11ecb

Code based on x11ab but incorporating a new carbon module from Kyle Hinson. Part of the ARPA-e mCDR project. 2025.09.25

x11

Just like x11b but without biology. So this just runs the physics. Intended for debugging blowups. 2025.09.27


xa0

Meant for an analytical run. Basically identical to x4b but with the atmospheric forcing set to zero, and biology turned off. This replaces uu1k which did the same thing previously.


OLD

x10ab

Like x4b but with 50% burial of organic particulate N and C in the Salish Sea. It also saves averages. It is designed to run only saving two history files per day and an average file. No PERFECT_RESTART. This was a big step in the development leading from x4b to x11b. It uses new flags in the .h to turn on and off the bgc edits that apply only to the Salish Sea.

x4b

The old default code used for the long hindcast and daily forecast cas7_t0_x4b. The fennel.h code has lines to increase the light attenuation by a factor of three for the Salish Sea. It allows for vertical point sources (like wastewater treatment plants) which requires a more recent ROMS repo (~January 2024). It uses MPDATA for bio tracer advection in the dot_in.

xn4b

Like x4b but without tides, for nested runs.


x4a

Like x4b but with no biology, no perfect restart, and defining AVERAGES. This is for the new experiment with GLORYS forcing (May 2025) but could be used for any physics-only experiment where we want to run fast.


x4

Like x4b but without biology, for testing physics changes like tidal forcing.


x4tf

Like x4 but with the tidal tractive force turned on.


OBSOLETE

Mostly I call these obsolete because they use the somewhat older ROMS we had from svn, and they rely on varinfo.yaml in LO_roms_source_alt.

uu0mb

This is a major step in the ROMS update process.

  • It uses the near-latest version of ROMS.
  • It is meant to be run using driver_roms3.py. Please look carefully at the top of that code to see all the command line arguments.
  • It uses the PERFECT_RESTART cpp flag. This leads to a smoother run and fewer blow-ups. It also means that it no longer writes an ocean_his_0001.nc file, which would be identical to the 0025 file from the previous day. This change is accounted for in Lfun.get_fn_list().
  • It incorporates rain (EMINUSP).
  • It assumes that the forcing was created using driver_forcing3.py. This uses the new organizational structure where forcing is put in a [gridname] folder, not [gridname_tag] or [gtag].
  • See LO/dot_in/cas6_v00_uu0mb for an example dot_in that runs this.

For the bio code:

  • It uses my edited version of the fennel bio code, which I keep in LO_roms_source_alt/npzd_banas.
  • We correct att and opt in the bio code, to match BSD as written.
  • Better atm CO2.

uu0m

This is just like uu0mb except without biology.


x0mb

Like uu0mb but with the rOxN* ratios set back to the original Fennel values (instead of the larger Stock values). Also some changes to the benthic remin: (i) fixed a bug in the if statement to test if the aerobic flux would pull DO negative, and (ii) a simpler handling of denitrification from benthic flux, but ensuring it does not pull NO3 negative.

I introduced a new name here because I had been recycling uu0mb too many times!


x1b

Like x0mb but I edited the bio code to include the "optimum uptake" form of nutrient limitation for NH4. It was already in NO3. Created 2023.04.08.

It is poor design to have the bio code in a separate folder. For example, if I now recompiled x0mb I would get code that reflected x1b. So I am going to put fennel.h in this folder and then set MY_ANALYTICAL_DIR=${MY_PROJECT_DIR} in build_roms.sh.

I am also dropping the "m" for mox. Unless I was running parallel forecasts on both mox and klone (as I was once), there is no reason for it.


x2b

This starts from the fennel.h code in x1b and modifies it so that the benthic flux conforms more closely to Siedlecki et al. (2015), except with the necessary change that remineralization goes into NH4 instead of NO3. Denitrification still comes out of NO3. I also turn off the light limitation in Nitrification.


x3b

An experiment using the fennel.h code from x2b but modifying light attenuation to be what it is in the current forecast. Note that the current forecast has bugs in this part of the code that make it different from Davis et al. (2014) as written.


uu1k

This is much like uu0mb except it drops the cppdefs flags associated with atm forcing and biology. This makes it useful for analytical runs that don't have atm forcing. Note carefully the ANA flags used in the cpp file. Like uu0mb, it makes use of forcing files that use the new varinfo.yaml to automate the naming of things in the NetCDF forcing files (the "A0" sequence).


xn0

Designed to run a nested model. Omits tidal forcing. No biology. Otherwise based on x2b.


xn0b

Designed to run a nested model. Omits tidal forcing. Has biology from x2b. Otherwise based on xn0.

