.. _ConfigWorkflow:

***************************************************
Available Workflow Configuration Parameters
***************************************************

Among other tasks, the setup workflow Python script (``parm/setup_wflow_env.py``) generates a ``land_analysis.yaml`` file that contains all of the settings for the experiment --- user-selected settings from ``config.yaml``, default values, and machine-dependent settings. The script also uses the ``uwtools`` Python package to generate a Rocoto XML file. The ``template.land_analysis.yaml`` file contains all parameters that can ultimately be included in the ``land_analysis.yaml`` file and workflow XML.

``setup_wflow_env.py`` first sets default values for all parameters necessary for the experiment. It then sets machine-specific values based on the user's platform. Next, the user-specified values from ``config.yaml`` are loaded and override any previously set defaults. Finally, the script generates an experiment directory containing the ``land_analysis.yaml`` file and the Rocoto XML.

.. _setup-wflow-script:

.. figure:: https://github.com/ufs-community/land-DA_workflow/wiki/images/setup_wflow_env.png
   :width: 50%
   :alt: Flowchart describing the Land DA setup_wflow_env.py script. First, the script detects the platform/machine the user is working on. Then, it sets default parameters, followed by machine-based parameters. It updates these parameter values based on the information provided in config.yaml and calculates additional parameters as needed. The ush/fill_jinja_template.py script is called to assemble the values from template.land_analysis.yaml and config.yaml into one complete land_analysis.yaml file, which is then converted into a land_analysis.xml file for use by Rocoto.

   Overview of the ``setup_wflow_env.py`` script

.. _wf-attributes:

Workflow Attributes (``attrs:``)
=================================

Attributes pertaining to the overall workflow are defined in the ``attrs:`` section of ``template.land_analysis.yaml`` under ``workflow:``. For example:

.. code-block:: console

   workflow:
     attrs:
       realtime: false
       scheduler: slurm
       cyclethrottle: 24
       taskthrottle: 24

``realtime:`` (Default: false)
   Indicates whether the experiment is a realtime run (true) or a retrospective run (false). Valid values: ``true`` | ``false``

``scheduler:`` (Default: slurm)
   The job scheduler to use on the specified machine. Valid values: ``"slurm"``. Other options may work with a container but have not been tested: ``"pbspro"`` | ``"lsf"`` | ``"lsfcray"`` | ``"none"``

``cyclethrottle:`` (Default: 24)
   The number of cycles that can be active at one time. Valid values: Integers > 0.

``taskthrottle:`` (Default: 24)
   The number of tasks that can be active at one time. Valid values: Integers > 0.

.. _wf-cycledef:

Workflow Cycle Definition (``cycledef``)
==========================================

Cycling information is defined in the ``cycledef:`` section under ``workflow:``. Each cycle definition starts with a hyphen (``-``) and has information on cycle attributes (``attrs:``) and a cycle specification (``spec:``).

.. code-block:: console

   workflow:
     cycledef:
       - attrs:
           group: cycled
         spec: {{ DATE_FIRST_CYCLE }}00 {{ DATE_LAST_CYCLE }}00 {{ DATE_CYCLE_FREQ_HR }}:00:00
       - attrs:
           group: first_cycle
         spec: {{ DATE_FIRST_CYCLE }}00 {{ DATE_FIRST_CYCLE }}00 {{ DATE_CYCLE_FREQ_HR }}:00:00
       - attrs:
           group: cycled_from_second
         spec: {{ date_second_cycle }}00 {{ DATE_LAST_CYCLE }}00 {{ DATE_CYCLE_FREQ_HR }}:00:00

``attrs:``
   Attributes of ``cycledef``. Includes ``group:`` but may also include ``activation_offset:``. See the :rocoto:`Rocoto Documentation <>` for more information.

``group:``
   The group attribute allows users to assign a set of cycles to a particular group. The group tag can later be used to control which tasks are run for which cycles. See the :rocoto:`Rocoto Documentation <>` for more information.

``spec:``
   The cycle is defined using the "start stop step" method, with the cycle start date listed first in YYYYMMDDHHmm format, followed by the end date and then the step in HH:mm:SS format (e.g., ``202501190000 202501220000 24:00:00``). The ``template.land_analysis.yaml`` values are rendered with the user-provided cycle information in the ``config.yaml`` file; ``DATE_FIRST_CYCLE:``, ``DATE_LAST_CYCLE:``, and ``DATE_CYCLE_FREQ_HR:`` are defined in the :ref:`Workflow Entities <wf-entities>` section below.

``date_second_cycle:``
   Start date of subsequent cycle(s), derived in ``setup_wflow_env.py``.

For example, a ``land_analysis.yaml`` file generated by ``setup_wflow_env.py`` on Hercules might be rendered as:

.. code-block:: console

   workflow:
     cycledef:
       - attrs:
           group: cycled
         spec: 202501190000 202501220000 24:00:00
       - attrs:
           group: first_cycle
         spec: 202501190000 202501190000 24:00:00
       - attrs:
           group: cycled_from_second
         spec: 202501200000 202501220000 24:00:00

.. _wf-entities:

Workflow Entities
===================

In the ``land_analysis.yaml`` file, entities are constants that are referred to throughout the workflow using the ampersand (``&``) prefix and semicolon (``;``) suffix (e.g., ``&MACHINE;``) to avoid defining the same constants repetitively in each workflow task. The ``entities:`` section of ``template.land_analysis.yaml`` provides the structure for this section of ``land_analysis.yaml``. The default values for these entities are then set in ``setup_wflow_env.py`` and updated with user-selected values from ``config.yaml``. The resulting ``land_analysis.yaml`` file will include an ``entities:`` section with concrete values for several variables that are used throughout the workflow.
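
The layering described above (defaults first, then machine-specific values, then user settings from ``config.yaml``) can be sketched in a few lines of Python. This is only an illustration of the override order, not the actual logic of ``setup_wflow_env.py``; the function name ``build_entities`` and the sample values are hypothetical.

.. code-block:: python

   def build_entities(defaults, machine_values, user_config):
       """Merge settings so that later sources override earlier ones."""
       entities = dict(defaults)         # 1. start from the default values
       entities.update(machine_values)   # 2. apply machine-specific values
       entities.update(user_config)      # 3. user settings take precedence
       return entities


   # Hypothetical sample values, for illustration only
   defaults = {"SCHED": "slurm", "FCSTHR": "24", "ACCOUNT": ""}
   machine_values = {"MACHINE": "hercules", "queue_default": "batch"}
   user_config = {"ACCOUNT": "epic", "FCSTHR": "12"}

   entities = build_entities(defaults, machine_values, user_config)
   print(entities["ACCOUNT"], entities["FCSTHR"], entities["queue_default"])

In this sketch, a value that appears in more than one source (here ``FCSTHR``) ends up with the user-selected setting, while values the user does not set keep their default or machine-specific value.
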
For example, in a ``land_analysis.yaml`` file created on Hercules based on the ``config.LND.era5.3dvar.ims.DA-fcst.warmstart.yaml`` case, the following entities are defined:

.. code-block:: console

   workflow:
     entities:
       ACCOUNT: "epic"
       APP: "LND"
       ATM_IO_LAYOUT_X: "1"
       ATM_IO_LAYOUT_Y: "1"
       ATM_LAYOUT_X: "3"
       ATM_LAYOUT_Y: "8"
       ATMOS_FORC: "era5"
       BKG_ANAL_EXT_SRC_OPT: "era5land"
       CCPP_SUITE: "FV3_GFS_v17_p8_ugwpv1"
       COLDSTART: "NO"
       COMINgdas: ""
       COMINgfs: ""
       COUPLER_CALENDAR: "2"
       CUSTOM_JEDI_CONFIG_FLAG: "NO"
       CUSTOM_JEDI_CONFIG_PATH: "/work/noaa/epic/UFS_Land-DA_v3.0/inputs/test_base/jedi_yaml"
       CUSTOM_JEDI_CONFIG_PREFIX: "/prefix/of/custom/JEDI/config/file/name"
       DATE_CYCLE_FREQ_HR: "24"
       DATE_FIRST_CYCLE: "2025011900"
       DATE_LAST_CYCLE: "2025012000"
       DATM_STREAM_FN_LAST_DATE: ""
       DCOMINera5: ""
       DCOMINera5land: ""
       DCOMINghcn: ""
       DCOMINgswp3: ""
       DCOMINsmap: ""
       DCOMINsmops: ""
       DO_BKG_ANAL_EXT_SRC: "NO"
       DO_FREE_FORECAST: "NO"
       do_jedi_snow: "YES"
       do_jedi_soil_moisture: "NO"
       DT_ATMOS: "900"
       DT_RUNSEQ: "3600"
       envir: "lnd_era5_3dvar_ims"
       exp_basedir: "/Users/Joe.Schmoe/landda"
       EXP_CASE_NAME: "lnd_era5_3dvar_ims_00"
       FCSTHR: "24"
       FHROT: "0"
       FRAC_GRID: "NO"
       IC_DATA_MODEL: "gfs"
       IMO: "384"
       JEDI_ALGORITHM: "3dvar"
       JEDI_IODACONV_PATH: "/work/noaa/epic/UFS_Land-DA_v3.0/jedi_bundle_hercules/build/lib/python3.11"
       JEDI_PATH: "/work/noaa/epic/UFS_Land-DA_v3.0/jedi_bundle_hercules"
       JMO: "190"
       KEEPDATA: "YES"
       LND_CALC_SNET: ".true."
       LND_IC_TYPE: "custom"
       LND_INITIAL_ALBEDO: "0.25"
       LND_LAYOUT_X: "1"
       LND_LAYOUT_Y: "2"
       LND_OUTPUT_FREQ_SEC: "21600"
       MACHINE: "hercules"
       MED_COUPLING_MODE: "ufs.nfrac.aoflux"
       model_ver: "v3.0.0"
       native_default: "None"
       NET: "landda"
       NPROCS_ANALYSIS: "6"
       NPROCS_FCST_IC: "36"
       NPZ: "127"
       nnodes_forecast: "1"
       nprocs_forecast: "26"
       nprocs_forecast_atm: "12"
       nprocs_forecast_lnd: "12"
       nprocs_per_node: "26"
       OBSDIR: ""
       OBS_GHCN_SNOW: "NO"
       OBS_IMS_SNOW: "YES"
       OBS_SFCSNO: "YES"
       OBS_SMAP: "NO"
       OBS_SMOPS: "NO"
       OUTPUT_FH: "1 -1"
       partition_default: "hercules"
       PTMP: "/Users/Joe.Schmoe/landda/ptmp"
       PY_LOG_LEVEL: "INFO"
       queue_default: "batch"
       RES: "96"
       RESTART_INTERVAL: "12 -1"
       RUN: "landda"
       res_p1: "97"
       SCHED: "slurm"
       SMAP_RAW_WINDOW_SPAN_HALF: "5"
       WARMSTART_DIR: "/Users/Joe.Schmoe/landda/land-DA_workflow/fix/DATA_RESTART"
       WE2E_TEST: "NO"
       WE2E_ATOL: "1e-7"
       WE2E_LOG_FN: "we2e.log"
       WRITE_GROUPS: "1"
       WRITE_TASKS_PER_GROUP: "6"
       HOMElandda: "&exp_basedir;/land-DA_workflow"
       COMROOT: "&PTMP;/&envir;/com"
       DATAROOT: "&PTMP;/&envir;/tmp"
       LOGDIR: "&COMROOT;/output/logs"
       LOGFN_SUFFIX: "_@Y@m@d@H.log"
       PDY: "@Y@m@d"
       cyc: "@H"
       DATADEP_LRST1: "&DATAROOT;/DATA_SHARE/RESTART/ufs_land_restart.@Y-@m-@d_@H-00-00.tile6.nc"
       DATADEP_LRST2: "&WARMSTART_DIR;/ufs_land_restart.@Y-@m-@d_@H-00-00.tile6.nc"
       DATADEP_COLDSTART: "&exp_basedir;/exp_case/&EXP_CASE_NAME;/task_skip_coldstart_@Y@m@d@H.txt"
       DATADEP_DATM1: "&DATAROOT;/DATA_SHARE/RESTART/ufs.cpld.datm.r.@Y-@m-@d-00000.nc"
       DATADEP_DATM2: "&WARMSTART_DIR;/ufs.cpld.datm.r.@Y-@m-@d-00000.nc"
       DATADEP_FREEFCST: "&exp_basedir;/exp_case/&EXP_CASE_NAME;/task_analysis_done_@Y@m@d@H.txt"
       DATADEP_SFC1: "&DATAROOT;/DATA_SHARE/RESTART/@Y@m@d.@H0000.sfc_data.tile6.nc"
       DATADEP_SFC2: "&WARMSTART_DIR;/@Y@m@d.@H0000.sfc_data.tile6.nc"

.. _nco-note:

.. note::

   The workflow entities include certain standard environment variables that are defined in the NCEP Central Operations :nco:`WCOSS Implementation Standards ` document (pp. 4-5). These variables are used in forming the path to various directories containing input, output, and workflow files. For a visual aid, see the :ref:`Land DA Directory Structure Diagram `.

``ACCOUNT:``
   An account where users can charge their compute resources on the specified ``MACHINE``. To determine an appropriate ``ACCOUNT`` field on a system with a Slurm job scheduler, users may run the ``saccount_params`` command to display account details. On other systems, users may run the ``groups`` command, which will return a list of projects that the user has permissions for. Not all of the listed projects/groups have an HPC allocation, but those that do are potentially valid account names.

``APP:``
   Application/configuration to use. Valid values: ``LND`` | ``ATML``.

``ATM_IO_LAYOUT_X:``
   Specifies how many MPI ranks to use in the X direction for input/output (I/O) to the atmospheric component.

``ATM_IO_LAYOUT_Y:``
   Specifies how many MPI ranks to use in the Y direction for input/output (I/O) to the atmospheric component.

``ATM_LAYOUT_X:``
   Number of processes in the X direction per tile for the atmospheric component.

``ATM_LAYOUT_Y:``
   Number of processes in the Y direction per tile for the atmospheric component.

``ATMOS_FORC:``
   Type of atmospheric forcing data used. Valid values: ``"era5"`` | ``"gswp3"``.

``CCPP_SUITE:``
   The physics suite to use in the experiment (only relevant for :term:`ATML` configurations, which have an active atmospheric component).

``COLDSTART:``
   Flag that indicates whether the experiment is a :term:`coldstart` experiment (``"YES"``) or a :term:`warmstart` experiment (``"NO"``).

``COMINgdas:``
   Output from the GDAS model, which can be used as input for a new forecast. See :nco:`WCOSS Implementation Standards ` for information on operational data naming conventions.

``COMINgfs:``
   Output from the GFS model, which can be used as input for a new forecast. See :nco:`WCOSS Implementation Standards ` for information on operational data naming conventions.
``COUPLER_CALENDAR:``
   Coupler calendar. Options: ``no_calendar=0``, ``thirty_day_months=1``, ``julian=2``, ``gregorian=3``, ``noleap=4``

``CUSTOM_JEDI_CONFIG_FLAG:``
   Whether to use a custom JEDI configuration file (``"YES"``) or not (``"NO"``). If this parameter is set to ``"YES"`` in the configuration file ``config.yaml``, the custom input file located at ``CUSTOM_JEDI_CONFIG_PATH`` will be used as the JEDI input file in the *analysis* task. Valid values: ``"YES"`` | ``"NO"``.

``CUSTOM_JEDI_CONFIG_PATH:``
   Path to the custom JEDI configuration file.

``CUSTOM_JEDI_CONFIG_PREFIX:``
   Prefix for the custom JEDI file. For example, if the file were named ``custom_jedi_2026022600.yaml``, then the ``CUSTOM_JEDI_CONFIG_PREFIX`` is ``custom_jedi_``. Note that the YAML file name should include the date for cycling; the prefix is everything before the cycle date.

``DATE_CYCLE_FREQ_HR:``
   Cycling frequency (in integer hours).

``DATE_FIRST_CYCLE:``
   Starting :term:`cycle` date of the *first* forecast in the set of forecasts to run. Format is ``YYYYMMDDHH``.

``DATE_LAST_CYCLE:``
   Starting cycle date of the *last* forecast in the set of forecasts to run. Format is ``YYYYMMDDHH``.

``DATM_STREAM_FN_LAST_DATE:``
   The last date of a warmstart run. Requires a valid date in ``YYYYMMDDHH`` format. This variable is a temporary fix for a bug in the UFS WM. Restart files produced by the :term:`LND` configuration contain a hard-coded :term:`DATM` file list. If the file list does not match the namelist, the warmstart will fail. For example, if the user runs a coldstart forecast from day 1 to day 2, the restart file will contain information for days 1-2. If the user then runs a warmstart forecast for days 3-4 with the restart file from the coldstart, it will fail even if the user puts days 3-4 into the :term:`DATM` input namelist. To resolve this issue, days 1-4 must be added to the namelist of the coldstart even though it only runs for days 1-2.
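
The three date entities above (``DATE_FIRST_CYCLE``, ``DATE_LAST_CYCLE``, and ``DATE_CYCLE_FREQ_HR``) are the inputs to the ``cycledef`` specifications. The following Python sketch shows one way the "start stop step" strings could be assembled from them. It is not taken from ``setup_wflow_env.py``: the function name ``cycledef_specs`` is hypothetical, and the sketch assumes that the second cycle starts exactly one cycling interval after the first.

.. code-block:: python

   from datetime import datetime, timedelta


   def cycledef_specs(date_first, date_last, freq_hr):
       """Return 'start stop step' strings keyed by cycledef group name."""
       fmt = "%Y%m%d%H"  # YYYYMMDDHH, as used by the date entities
       first = datetime.strptime(date_first, fmt)
       # Assumption: the second cycle is one cycling interval after the first
       second = first + timedelta(hours=int(freq_hr))
       step = f"{int(freq_hr):02d}:00:00"
       # Appending "00" adds the minutes field (YYYYMMDDHHmm)
       return {
           "cycled": f"{date_first}00 {date_last}00 {step}",
           "first_cycle": f"{date_first}00 {date_first}00 {step}",
           "cycled_from_second": f"{second.strftime(fmt)}00 {date_last}00 {step}",
       }


   specs = cycledef_specs("2025011900", "2025012200", "24")
   for group, spec in specs.items():
       print(f"{group}: {spec}")

With these inputs, the ``cycled_from_second`` group starts at ``202501200000``, one 24-hour interval after the first cycle.
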
``DCOMINera5:``
   Path to directory containing ERA5 input data files. See :nco:`WCOSS Implementation Standards ` for information on operational data naming conventions.

``DCOMINera5land:``
   Variable used in testing. Unsupported for users at this time.

``DCOMINghcn:``
   Path to directory containing GHCN input data files. See :nco:`WCOSS Implementation Standards ` for information on operational data naming conventions.

``DCOMINgswp3:``
   Path to directory containing GSWP3 input data files. See :nco:`WCOSS Implementation Standards ` for information on operational data naming conventions.

``DCOMINsmap:``
   Path to directory containing SMAP input data files. See :nco:`WCOSS Implementation Standards ` for information on operational data naming conventions.

``DCOMINsmops:``
   Path to directory containing SMOPS input data files. See :nco:`WCOSS Implementation Standards ` for information on operational data naming conventions.

``DO_BKG_ANAL_EXT_SRC:``
   Whether to use an external source file for the analysis. Only relevant when ``CUSTOM_JEDI_CONFIG_FLAG: "YES"``. Valid values: ``"YES"`` | ``"NO"``.

``DO_FREE_FORECAST:``
   Whether to run a :term:`free forecast ` (``"YES"``) or a :term:`DA forecast ` (``"NO"``).

``do_jedi_snow:``
   Whether to perform JEDI snow DA. Valid values: ``"YES"`` | ``"NO"``.

``do_jedi_soil_moisture:``
   Whether to perform JEDI soil moisture DA. Valid values: ``"YES"`` | ``"NO"``.

``DT_ATMOS:``
   The main integration time step of the atmospheric component of the UFS Weather Model (in seconds). This is the time step for the outermost atmospheric model loop and must be a positive integer value. It corresponds to the frequency at which the physics routines and the top-level dynamics routine are called. (Note that one call to the top-level dynamics routine results in multiple calls to the horizontal dynamics, tracer transport, and vertical dynamics routines; see the `FV3 dycore scientific documentation `_ for details.)
``DT_RUNSEQ:``
   Time interval of run sequence (coupling interval) between the model components of the UFS Weather Model (in seconds).

``envir:``
   The run environment. Set to "test" during the initial testing phase, "para" when running in parallel (on a schedule), and "prod" in production. In operations, this is the operations root directory (aka ``$OPSROOT``). For more on NCO-compliant directory structure, see the :ref:`Note on NCO Standards <nco-note>`.

``exp_basedir:``
   The full path to the parent directory of ``land-DA_workflow`` (i.e., ``${BASEDIR}`` in the documentation). The actual value is derived in the ``setup_wflow_env.py`` file.

``EXP_CASE_NAME:``
   A name for the experiment. This variable can be changed to any name the user wants (but note that whitespace and some punctuation characters are not allowed). However, the best names will indicate useful information about the experiment. Each of the sample cases provided sets the experiment name to ``app_[forcing_]starttype_##`` where ``app`` is the configuration (:term:`LND` or :term:`ATML`), ``forcing`` refers to the atmospheric forcing data used (if any), and ``starttype`` indicates either a warmstart or coldstart forecast.

``FCSTHR:``
   Specifies the length of each forecast in hours. Valid values: Integers > 0.

``FHROT:``
   Forecast hour at restart in UFS Weather Model (in hours; set in ``model_configure``).

``FRAC_GRID:``
   Flag for the fractional grid option in UFS_UTILS and the UFS WM. When the fractional grid option (``frac_grid``) was introduced in 2024, some variable names such as snow depth were changed in the WM and UFS_UTILS. However, these variable names were not changed in the Noah-MP land model component. The tile2tile converter uses this flag to switch variable names between JEDI and the land model. When fractional grid is enabled (``FRAC_GRID: "YES"``), two key variable names do not match between JEDI (``sfc_data`` files) and the land model (restart files), and the tile2tile converter must translate between them:

   .. list-table:: Mismatched Variable Names
      :header-rows: 1

      * - Variable name in ``tile2tile_converter``
        - Description
        - Noah-MP (restart)
        - JEDI (``sfc_data``)
      * - swe
        - Snow water equivalent
        - ``weasd``
        - ``sheleg`` / ``weasdl``
      * - snow_depth
        - Snow depth over land
        - ``snwdph``
        - ``snwdph`` / ``snodl``

   In ``pre_anal``, the tile2tile converter creates the ``sfc_data`` files from the restart files for the ``analysis`` task. In ``post_anal``, the tile2tile converter creates the restart files for the warmstart forecast from the ``sfc_data`` and restart files for the ``forecast`` task.

``IC_DATA_MODEL:``
   The name of the model that the initial ``sfc_data`` files are coming from in the ``fcst_ic`` task. Valid values: ``"gfs"`` | ``"gdas"``

``IMO:``
   Number of horizontal grid points in the X direction. Usually a multiple of the resolution (``${RES}``).

``JEDI_ALGORITHM:``
   Data assimilation algorithm selection. Valid values: ``"letkf-oi"`` | ``"3dvar"``

``JEDI_IODACONV_PATH:``
   Path to directory where the libraries of the JEDI IODA converter are located.

``JEDI_PATH:``
   Path to the directory where JEDI is installed. The actual value is set in a machine-specific portion of ``setup_wflow_env.py``.

``JMO:``
   Number of horizontal grid points in the Y direction.

``KEEPDATA:``
   Flag to keep data (``"YES"``) or not (``"NO"``) that is copied to the ``$DATAROOT`` directory during the forecast experiment.

``LND_CALC_SNET:``
   Flag indicating whether to calculate the shortwave radiation internally (``".true."``) or not (``".false."``).

``LND_IC_TYPE:``
   Indicates the source of the initial conditions. Two options are supported: "custom" (i.e., ``C96.initial.tile[1-6].nc``) and "sfc" (i.e., ``sfc_data.tile[1-6].nc``). Valid values: ``custom`` | ``sfc``.

``LND_INITIAL_ALBEDO:``
   Initial mean surface albedo. Valid values: Any number between 0 and 1.

``LND_LAYOUT_X:``
   Number of processes in the x direction per tile for the land model component.
``LND_LAYOUT_Y:``
   Number of processes in the y direction per tile for the land model component.

``LND_OUTPUT_FREQ_SEC:``
   Output frequency of the land model component (in seconds).

``MACHINE:``
   The machine (a.k.a. platform or system) on which the workflow will run. The actual value is provided by the user via the ``-p=MACHINE`` command line argument or derived in ``setup_wflow_env.py`` from other parameters if possible. Currently supported platforms are listed in :numref:`Section %s `. Valid values: ``"ursa"`` | ``"hercules"`` | ``"orion"`` | ``"gaeac6"``

``MED_COUPLING_MODE:``
   :term:`CMEPS` coupling mode. Valid values: ``"ufs.frac"`` | ``"ufs.nfrac.aoflux"``. ``"ufs.frac"`` is used with the active FV3 atmospheric component (e.g., in :term:`ATML` configurations), whereas ``"ufs.nfrac.aoflux"`` is used with the data atmosphere component (e.g., :term:`LND` configurations).

``model_ver:``
   Version number of package in three digits (e.g., v#.#.#); second level of ``com`` directory (see :ref:`NCO Directory Structure Entities <nco-dir-entities>`).

``native_default:``
   Defines raw batch system options/job scheduler commands that Rocoto will use when submitting jobs for a given task (using the ``<native>`` tag). If more than one option is required, they are listed consecutively as a single string. This is a machine-dependent parameter, so default values differ.

``NET:``
   Model name (first level of ``com`` directory structure).

``NPROCS_ANALYSIS:``
   Number of processors for the ``analysis`` task.

``NPROCS_FCST_IC:``
   Number of processors for the ``fcst_ic`` task.

``NPZ:``
   Number of vertical layers in the atmospheric model.

``nnodes_forecast:``
   Number of nodes for the ``forecast`` task.

``nprocs_forecast:``
   Total number of processes for the ``forecast`` task. In general, this is set as :math:`nprocs\_forecast\_lnd + nprocs\_forecast\_atm + (lnd\_layout\_x*lnd\_layout\_y)`.

``nprocs_forecast_atm:``
   Number of processes for the atmospheric component in the ``forecast`` task. Actual default value dependent on ``APP`` (LND or ATML).

``nprocs_forecast_lnd:``
   Number of processes for the land model component (Noah-MP) in the ``forecast`` task.

``nprocs_per_node:``
   Number of processes per node for the ``forecast`` task. Actual default value dependent on ``nprocs_forecast`` and the maximum number of cores available per node.

``OBSDIR:``
   The path to the directory where DA fix files are located. In ``scripts/exlandda_prep_data.sh``, this value is set to ``${FIXlandda}/DA_obs`` unless the user specifies a different path in ``config.yaml``.

``OBS_GHCN_SNOW:``
   Flag to use GHCN snow depth observations. Valid values: ``"YES"`` | ``"NO"``.

``OBS_IMS_SNOW:``
   Flag to use IMS snow depth observations. Valid values: ``"YES"`` | ``"NO"``.

``OBS_SFCSNO:``
   Flag to use SFCSNO snow depth observations. Valid values: ``"YES"`` | ``"NO"``.

``OBS_SMAP:``
   Flag to use SMAP soil moisture observation data. Valid values: ``"YES"`` | ``"NO"``.

``OBS_SMOPS:``
   Flag to use SMOPS soil moisture observation data. Valid values: ``"YES"`` | ``"NO"``.

``OUTPUT_FH:``
   Forecast history file output frequency (when second number is ``-1``, e.g., ``"1 -1"``) or hours at which to write output history files (e.g., ``"6 9 12"``).

``partition_default:``
   Default partition; default set based on ``MACHINE``.

``PY_LOG_LEVEL:``
   Python logging level. Valid values: ``"INFO"`` | ``"DEBUG"`` | ``"WARN"`` | ``"ERROR"`` | ``"CRITICAL"``

``queue_default:``
   Default queue; default set based on ``MACHINE``.

``RES:``
   Resolution of FV3 grid. Currently, only C96 resolution is supported.

``RESTART_INTERVAL:``
   Determines how often the model creates restart files, which are used to continue simulations from a specific point in time. When the second number is ``-1``, the first number refers to the frequency of restart file output (e.g., ``"1 -1"``). Otherwise, the list of numbers indicates specific hours at which to output restart files (e.g., ``"6 9 12"``).
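
``OUTPUT_FH`` and ``RESTART_INTERVAL`` share the same two-mode convention: a frequency when the second number is ``-1``, and an explicit list of hours otherwise. The following Python sketch illustrates how such a string could be interpreted. The helper name ``describe_hours`` and its return format are hypothetical and are not part of the workflow or of the UFS Weather Model.

.. code-block:: python

   def describe_hours(value):
       """Interpret an OUTPUT_FH- or RESTART_INTERVAL-style string."""
       numbers = [int(token) for token in value.split()]
       # A second number of -1 means the first number is a frequency in hours
       if len(numbers) > 1 and numbers[1] == -1:
           return ("every", numbers[0])
       # Otherwise the numbers are the specific hours at which to write files
       return ("at", numbers)


   print(describe_hours("1 -1"))
   print(describe_hours("12 -1"))
   print(describe_hours("6 9 12"))

Under this reading, the sample value ``RESTART_INTERVAL: "12 -1"`` requests restart files every 12 hours, whereas ``"6 9 12"`` requests them only at forecast hours 6, 9, and 12.
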
``RUN:``
   Name of model run (third level of ``com`` directory structure). In general, same as ``${NET}``.

``res_p1:``
   Resolution plus 1 (``${RES} + 1``). Must be an integer value.

``SCHED:``
   The job scheduler to use (e.g., Slurm) on the specified ``MACHINE``. Valid values: ``"slurm"``. Other options may work with a container but have not been tested: ``"pbspro"`` | ``"lsf"`` | ``"lsfcray"`` | ``"none"``

``SMAP_RAW_WINDOW_SPAN_HALF:``
   Half-width (in hours) of the time window used to combine SMAP data files. The SMAP satellite is designed to create a global map every 2-3 days. Each SMAP data file covers a long, narrow area 1000 km wide, and files can overlap. To avoid duplication and cover as wide an area as possible, the data files between ``${PDY}${cyc} +/- ${SMAP_RAW_WINDOW_SPAN_HALF}`` hours are combined after the raw data files are converted into the IODA format in the ``prep_data`` task. The default value is ``5``, which means that 11 hours of data are combined by default. For example, combined data for ``2025011800`` would contain the raw data files from ``2025011719`` to ``2025011805``. To use a single data set, set this parameter to ``0``.

``WARMSTART_DIR:``
   The path to restart files for a warmstart experiment. The actual value set is machine-dependent.

``WE2E_TEST:``
   Flag to turn on the workflow end-to-end (WE2E) test. When ``WE2E_TEST="YES"``, the results files from the experiment are compared to the test baseline files, located by default in ``${BASEDIR}/land-DA_workflow/fix/test_base/we2e_com``. If the results are within the tolerance set (via ``WE2E_ATOL``) at the end of the three main tasks --- ``analysis``, ``forecast``, and ``post_anal`` --- then the experiment passes. Valid values: ``"YES"`` | ``"NO"``

``WE2E_ATOL:``
   Tolerance of the WE2E test. (Set in ``template.land_analysis.yaml``.)

``WE2E_LOG_FN:``
   Name of the WE2E test log file. (Set in ``template.land_analysis.yaml``.)

``WRITE_GROUPS:``
   The number of write groups (i.e., groups of :term:`MPI` tasks) to use.
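
The window arithmetic described for ``SMAP_RAW_WINDOW_SPAN_HALF`` can be checked with a short Python sketch. This is an illustration only: the function name ``smap_window`` is hypothetical, and the sketch assumes one time stamp per hour, which may not match how the ``prep_data`` task actually selects raw files.

.. code-block:: python

   from datetime import datetime, timedelta


   def smap_window(pdy_cyc, span_half):
       """List hourly time stamps from pdy_cyc - span_half to pdy_cyc + span_half."""
       fmt = "%Y%m%d%H"  # ${PDY}${cyc}, i.e., YYYYMMDDHH
       center = datetime.strptime(pdy_cyc, fmt)
       return [
           (center + timedelta(hours=offset)).strftime(fmt)
           for offset in range(-span_half, span_half + 1)
       ]


   window = smap_window("2025011800", 5)
   print(window[0], window[-1], len(window))

For the default half-span of ``5``, the window for ``2025011800`` runs from ``2025011719`` to ``2025011805`` and contains 11 hourly time stamps; a half-span of ``0`` leaves only the cycle time itself.
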
``WRITE_TASKS_PER_GROUP:`` The number of MPI tasks to allocate for each of the ``${WRITE_GROUPS}``. .. _nco-dir-entities: NCO Directory Structure Entities ---------------------------------- Standard environment variables are defined in the NCEP Central Operations :nco:`WCOSS Implementation Standards ` document (pp. 4-5). These variables are used in forming the path to various directories containing input, output, and workflow files. For a visual aid, see the :ref:`Land DA Directory Structure Diagram `. ``HOMElandda:`` (Default: ``"&exp_basedir;/land-DA_workflow"`` ) The location of the :github:`land-DA_workflow <>` clone. ``PTMP:`` (Default: ``"&exp_basedir;/ptmp"`` ) Product temporary (PTMP) experiment output space. This directory is used to mimic the operational file structure and contains all of the files and subdirectories used by or generated by the experiment. By default, it is a sibling to the ``land-DA_workflow`` directory. ``COMROOT:`` (Default: ``"&PTMP;/&envir;/com"`` ) ``com`` root directory, which contains input/output data on current system. ``DATAROOT:`` (Default: ``"&PTMP;/&envir;/tmp"`` ) Directory location for the temporary working directories for running jobs. By default, this is a sibling to the ``${COMROOT}`` directory and is located at ``ptmp//tmp``. ``LOGDIR:`` (Default: ``"&COMROOT;/output/logs"`` ) Path to the directory containing log files for each workflow task. ``LOGFN_SUFFIX:`` (Default: ``"_@Y@m@d@H.log"`` ) The cycle suffix appended to each task's log file. It will be rendered in the form ``_YYYYMMDDHH.log``. For example, the ``prep_obs`` task log file for the Jan. 20, 2025 00z cycle would be named: ``prep_obs_2025012000.log``. ``PDY:`` (Default: ``"@Y@m@d"`` ) Date in YYYYMMDD format. ``cyc:`` (Default: ``"@H"`` ) Cycle time in GMT hours, formatted HH. .. 
_data-entities: Data Location Entities ---------------------------------- ``DATADEP_LRST1:`` (Default: ``"&DATAROOT;/DATA_SHARE/RESTART/ufs_land_restart.@Y-@m-@d_@H-00-00.tile6.nc"`` ) Land model (:term:`Noah-MP`) restart files for the next cycle. ``DATADEP_LRST2:`` (Default: ``"&WARMSTART_DIR;/ufs_land_restart.@Y-@m-@d_@H-00-00.tile6.nc"`` ) Land model (:term:`Noah-MP`) restart files used to initialize a warmstart experiment. ``DATADEP_COLDSTART:`` (Default: ``"&exp_basedir;/exp_case/&EXP_CASE_NAME;/task_skip_coldstart_@Y@m@d@H.txt"`` ) File to skip the cold-start tasks. ``DATADEP_DATM1:`` (Default: ``"&DATAROOT;/DATA_SHARE/RESTART/ufs.cpld.datm.r.@Y-@m-@d-00000.nc"`` ) :term:`DATM` restart files for the next cycle. ``DATADEP_DATM2:`` (Default: ``"&WARMSTART_DIR;/ufs.cpld.datm.r.@Y-@m-@d-00000.nc"`` ) :term:`DATM` restart files used to initialize a warmstart experiment. ``DATADEP_SFC1:`` (Default: ``"&DATAROOT;/DATA_SHARE/RESTART/@Y@m@d.@H0000.sfc_data.tile6.nc"`` ) Surface data (``sfc_data``) restart files for the next cycle. ``DATADEP_SFC2:`` (Default: ``"&WARMSTART_DIR;/@Y@m@d.@H0000.sfc_data.tile6.nc"`` ) Surface data (``sfc_data``) files used to initialize a warmstart experiment. ``DATADEP_FREEFCST:`` "&exp_basedir;/exp_case/&EXP_CASE_NAME;/task_analysis_done_@Y@m@d@H.txt" Data file(s) required to trigger the forecast task in a :term:`free-forecast ` experiment. .. _wf-log: Workflow Log ============== Information related to overall workflow progress is defined in the ``log:`` section under ``workflow:`` .. code-block:: console workflow: log: "&LOGDIR;/workflow.log" ``log:`` (Default: ``"&LOGDIR;/workflow.log"``) Path and name of Rocoto log file(s). .. _wf-tasks: Workflow Tasks ================ The workflow is divided into discrete tasks, and details of each task are defined within the ``tasks:`` section under ``workflow:``. .. 
code-block:: console workflow: tasks: task_jcb: task_prep_data: task_fcst_ic: task_pre_anal: task_analysis: task_post_anal: task_forecast: task_plot_stats: Each task may contain attributes (``attrs:``), just as in the overarching ``workflow:`` section. Instead of entities, each task contains an ``envars:`` section to define environment variables that must be passed to the task when it is executed. Any task dependencies are listed under the ``dependency:`` section. Additional details, such as ``jobname:``, ``walltime:``, and ``queue:`` may also be set within a specific task. The following subsections explain any variables that have not already been explained/defined above. .. _sample-task: Sample Task: Analysis Task (``task_analysis``) ------------------------------------------------ This section walks users through the structure of the analysis task (``task_analysis``) to explain how configuration information is provided to the ``land_analysis.yaml`` file for each task. Since each task has a similar structure, common information is explained in this section. Variables unique to a particular task are defined in their respective ``task_*`` sections based on the structure laid out in ``template.land_analysis.yaml``. Parameters for a particular task are set in the ``workflow.tasks.task_:`` section of the ``template.land_analysis.yaml`` file. For example, settings for the analysis task are provided in the ``task_analysis:`` section of ``template.land_analysis.yaml``. The following is an excerpt of the ``task_analysis:`` section of ``template.land_analysis.yaml``: .. 
code-block:: console workflow: tasks: task_analysis: attrs: {%- if COLDSTART == "YES" %} cycledefs: cycled_from_second {%- else %} cycledefs: cycled {%- endif %} maxtries: 2 envars: ACCOUNT: "&ACCOUNT;" BKG_ANAL_EXT_SRC_OPT: "&BKG_ANAL_EXT_SRC_OPT;" COMINgfs: "&COMINgfs;" COMROOT: "&COMROOT;" COUPLER_CALENDAR: "&COUPLER_CALENDAR;" CUSTOM_JEDI_CONFIG_FLAG: "&CUSTOM_JEDI_CONFIG_FLAG;" CUSTOM_JEDI_CONFIG_PATH: "&CUSTOM_JEDI_CONFIG_PATH;" CUSTOM_JEDI_CONFIG_PREFIX: "&CUSTOM_JEDI_CONFIG_PREFIX;" cyc: "&cyc;" DATAROOT: "&DATAROOT;" DATE_CYCLE_FREQ_HR: "&DATE_CYCLE_FREQ_HR;" DATE_FIRST_CYCLE: "&DATE_FIRST_CYCLE;" DCOMINera5land: "&DCOMINera5land;" DO_BKG_ANAL_EXT_SRC: "&DO_BKG_ANAL_EXT_SRC;" DO_FREE_FORECAST: "&DO_FREE_FORECAST;" do_jedi_snow: "&do_jedi_snow;" do_jedi_soil_moisture: "&do_jedi_soil_moisture;" exp_basedir: "&exp_basedir;" EXP_CASE_NAME: "&EXP_CASE_NAME;" FRAC_GRID: "&FRAC_GRID;" HOMElandda: "&HOMElandda;" JEDI_ALGORITHM: "&JEDI_ALGORITHM;" JEDI_PATH: "&JEDI_PATH;" KEEPDATA: "&KEEPDATA;" LOGDIR: "&LOGDIR;" MACHINE: "&MACHINE;" model_ver: "&model_ver;" NPROCS_ANALYSIS: "&NPROCS_ANALYSIS;" NPZ: "&NPZ;" OBS_GHCN_SNOW: "&OBS_GHCN_SNOW;" OBS_IMS_SNOW: "&OBS_IMS_SNOW;" OBS_SFCSNO: "&OBS_SFCSNO;" OBS_SMAP: "&OBS_SMAP;" OBS_SMOPS: "&OBS_SMOPS;" PDY: "&PDY;" PY_LOG_LEVEL: "&PY_LOG_LEVEL;" RES: "&RES;" res_p1: "&res_p1;" SCHED: "&SCHED;" WARMSTART_DIR: "&WARMSTART_DIR;" WE2E_TEST: "&WE2E_TEST;" WE2E_ATOL: "&WE2E_ATOL;" WE2E_LOG_FN: "&WE2E_LOG_FN;" account: "&ACCOUNT;" command: '&HOMElandda;/parm/task_load_modules_run_jjob.sh "analysis" "&HOMElandda;" "&MACHINE;"' jobname: analysis nodes: "1:ppn=&NPROCS_ANALYSIS;" {%- if native_default is not none %} native: "&native_default;" {%- endif %} walltime: 00:15:00 partition: "&partition_default;" queue: "&queue_default;" join: "&LOGDIR;/analysis&LOGFN_SUFFIX;" {%- if MACHINE == "ursa" %} memory: 32G {%- endif %} dependency: and: taskdep_prep_data: attrs: task: prep_data {%- if CUSTOM_JEDI_CONFIG_FLAG == "NO" %} taskdep_jcb: 
               attrs:
                 task: jcb
             {%- endif %}
             {%- if APP == "LND" %}
             taskdep_pre_anal:
               attrs:
                 task: pre_anal
             {%- else %}
             or:
               datadep_sfc1:
                 attrs:
                   age: 5
                 value: "&DATADEP_SFC1;"
               datadep_sfc2:
                 attrs:
                   age: 5
                 value: "&DATADEP_SFC2;"
             {%- endif %}

When running the ``config.LND.era5.3dvar.ims.DA-fcst.warmstart.yaml`` case on Hercules, the ``analysis`` task in the ``land_analysis.yaml`` file would render as follows:

.. code-block:: console

   task_analysis:
     attrs:
       cycledefs: cycled
       maxtries: 2
     envars:
       ACCOUNT: "&ACCOUNT;"
       BKG_ANAL_EXT_SRC_OPT: "&BKG_ANAL_EXT_SRC_OPT;"
       COMINgfs: "&COMINgfs;"
       COMROOT: "&COMROOT;"
       COUPLER_CALENDAR: "&COUPLER_CALENDAR;"
       CUSTOM_JEDI_CONFIG_FLAG: "&CUSTOM_JEDI_CONFIG_FLAG;"
       CUSTOM_JEDI_CONFIG_PATH: "&CUSTOM_JEDI_CONFIG_PATH;"
       CUSTOM_JEDI_CONFIG_PREFIX: "&CUSTOM_JEDI_CONFIG_PREFIX;"
       cyc: "&cyc;"
       DATAROOT: "&DATAROOT;"
       DATE_CYCLE_FREQ_HR: "&DATE_CYCLE_FREQ_HR;"
       DATE_FIRST_CYCLE: "&DATE_FIRST_CYCLE;"
       DCOMINera5land: "&DCOMINera5land;"
       DO_BKG_ANAL_EXT_SRC: "&DO_BKG_ANAL_EXT_SRC;"
       DO_FREE_FORECAST: "&DO_FREE_FORECAST;"
       do_jedi_snow: "&do_jedi_snow;"
       do_jedi_soil_moisture: "&do_jedi_soil_moisture;"
       exp_basedir: "&exp_basedir;"
       EXP_CASE_NAME: "&EXP_CASE_NAME;"
       FRAC_GRID: "&FRAC_GRID;"
       HOMElandda: "&HOMElandda;"
       JEDI_ALGORITHM: "&JEDI_ALGORITHM;"
       JEDI_PATH: "&JEDI_PATH;"
       KEEPDATA: "&KEEPDATA;"
       LOGDIR: "&LOGDIR;"
       MACHINE: "&MACHINE;"
       model_ver: "&model_ver;"
       NPROCS_ANALYSIS: "&NPROCS_ANALYSIS;"
       NPZ: "&NPZ;"
       OBS_GHCN_SNOW: "&OBS_GHCN_SNOW;"
       OBS_IMS_SNOW: "&OBS_IMS_SNOW;"
       OBS_SFCSNO: "&OBS_SFCSNO;"
       OBS_SMAP: "&OBS_SMAP;"
       OBS_SMOPS: "&OBS_SMOPS;"
       PDY: "&PDY;"
       PY_LOG_LEVEL: "&PY_LOG_LEVEL;"
       RES: "&RES;"
       res_p1: "&res_p1;"
       SCHED: "&SCHED;"
       WARMSTART_DIR: "&WARMSTART_DIR;"
       WE2E_TEST: "&WE2E_TEST;"
       WE2E_ATOL: "&WE2E_ATOL;"
       WE2E_LOG_FN: "&WE2E_LOG_FN;"
     account: "&ACCOUNT;"
     command: '&HOMElandda;/parm/task_load_modules_run_jjob.sh "analysis" "&HOMElandda;" "&MACHINE;"'
     jobname: analysis
     nodes: "1:ppn=&NPROCS_ANALYSIS;"
     walltime: 00:15:00
     partition: "&partition_default;"
     queue: "&queue_default;"
     join: "&LOGDIR;/analysis&LOGFN_SUFFIX;"
     dependency:
       and:
         taskdep_prep_data:
           attrs:
             task: prep_data
         taskdep_jcb:
           attrs:
             task: jcb
         taskdep_pre_anal:
           attrs:
             task: pre_anal

.. _task-attributes:

Task Attributes (``attrs:``)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``attrs:`` section for each task includes the ``cycledefs:`` attribute and the ``maxtries:`` attribute.

``cycledefs:`` (Default: cycled)
   A comma-separated list of ``cycledef:`` group names. A task is run only for cycles whose ``cycledef:`` group name matches one of the groups listed in the task's ``cycledefs:`` attribute. For ``task_analysis:``, the ``cycledefs:`` attribute is set by a conditional statement: if the user is running a coldstart experiment (``COLDSTART == "YES"``), the group name is ``cycled_from_second`` because the model needs time to "spin up" before cycling can begin; otherwise, the group name is ``cycled``.

``maxtries:`` (Default: 2)
   The maximum number of times Rocoto will try to run the task, including resubmissions after a failure.

.. _task-envars:

Task Environment Variables (``envars``)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``envars:`` section for each task reuses many of the same variables and values defined as ``entities:`` for the overall workflow. These values are needed for each task, but setting them individually is error-prone. Instead, a workflow task can reference workflow entities using the ``&VAR;`` syntax. For example, to set the ``ACCOUNT:`` value in ``task_analysis:`` to the value of the workflow ``ACCOUNT:`` entity, the following statement can be added to the task's ``envars:`` section:

.. code-block:: console

   task_analysis:
     envars:
       ACCOUNT: "&ACCOUNT;"

For most workflow tasks, the value set in the ``workflow.entities:`` section should be referenced rather than redefined. For example, the ``MACHINE`` variable must be defined for each task, and users cannot switch machines mid-workflow.
Therefore, users should set the ``MACHINE`` variable in the ``workflow.entities:`` section and reference that definition in each workflow task. For example:

.. code-block:: console

   workflow:
     entities:
       MACHINE: "hercules"
     tasks:
       task_jcb:
         envars:
           MACHINE: "&MACHINE;"
       task_prep_data:
         envars:
           MACHINE: "&MACHINE;"
       ...
       task_forecast:
         envars:
           MACHINE: "&MACHINE;"
       task_plot_stats:
         envars:
           MACHINE: "&MACHINE;"

.. _misc-tasks:

Miscellaneous Task Values
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The authoritative :rocoto:`Rocoto documentation <>` discusses a number of miscellaneous task attributes in detail. A brief overview is provided in this section.

.. code-block:: console

   workflow:
     tasks:
       task_analysis:
         account: "&ACCOUNT;"
         command: '&HOMElandda;/parm/task_load_modules_run_jjob.sh "analysis" "&HOMElandda;" "&MACHINE;"'
         jobname: analysis
         nodes: "1:ppn=&NPROCS_ANALYSIS;"
         walltime: 00:15:00
         partition: "&partition_default;"
         queue: "&queue_default;"
         join: "&LOGDIR;/analysis&LOGFN_SUFFIX;"

``account:`` (Default: ``"&ACCOUNT;"`` )
   An account where users can charge their compute resources on the specified ``MACHINE``. This value is typically the same for each task, so the default is to reuse the value set in the :ref:`Workflow Entities ` section.

   .. note::

      The ``account`` variable (lowercase) is used by the job scheduler (Slurm), whereas the ``ACCOUNT`` variable (uppercase) is an ``envar`` referenced by the workflow scripts (e.g., scripts, jjobs, ush).

``command:`` (Default: ``'&HOMElandda;/parm/task_load_modules_run_jjob.sh "analysis" "&HOMElandda;" "&MACHINE;"'`` )
   The command that Rocoto will submit to the batch system to carry out the task's work.

``jobname:``
   Name of the task/job (default will vary based on the task).

``nodes:``
   Number of nodes required for the task (default will vary based on the task).

``walltime:``
   Time allotted for the task (default will vary based on the task).

``partition:`` (Default: ``"&partition_default;"`` )
   The HPC system partition to run on.
``queue:`` (Default: ``"&queue_default;"`` )
   The batch system queue or "quality of service" (QOS) that Rocoto will submit the task to for execution.

``join:`` (Default: ``"&LOGDIR;/analysis&LOGFN_SUFFIX;"`` )
   The full path to the task's log file, which records output from ``stdout`` and ``stderr``.

Some tasks include a ``cores:`` value instead of a ``nodes:`` value. For example:

``cores:`` (Default: 1)
   The number of cores required for the task.

Some tasks include a ``native:`` value, usually set to ``"&native_default;"``; whether this value is listed is machine-dependent. Some tasks include a ``memory:`` tag, with a default value of ``128G``.

.. _task-dependencies:

Dependencies
^^^^^^^^^^^^^^

The ``dependency:`` section of a task defines what prerequisites (task or data-related) must be met for the task to run. ``task_analysis:`` must be run after the ``prep_data`` task and, when ``CUSTOM_JEDI_CONFIG_FLAG`` is ``"NO"``, after the ``jcb`` task. Additionally, when running the :term:`LND` configuration, it must be run after the ``pre_anal`` task. Therefore, the dependency section lists these task dependencies (``taskdep_*:``). When running in the :term:`ATML` configuration, the ``pre_anal`` task is not required, but one of the two surface data files (``datadep_sfc[1/2]``) is required as a restart file for the next cycle.

.. code-block:: console

   workflow:
     tasks:
       task_analysis:
         dependency:
           and:
             taskdep_prep_data:
               attrs:
                 task: prep_data
             {%- if CUSTOM_JEDI_CONFIG_FLAG == "NO" %}
             taskdep_jcb:
               attrs:
                 task: jcb
             {%- endif %}
             {%- if APP == "LND" %}
             taskdep_pre_anal:
               attrs:
                 task: pre_anal
             {%- else %}
             or:
               datadep_sfc1:
                 attrs:
                   age: 5
                 value: "&DATADEP_SFC1;"
               datadep_sfc2:
                 attrs:
                   age: 5
                 value: "&DATADEP_SFC2;"
             {%- endif %}

For details on dependencies (e.g., ``attrs:``, ``age:``, ``value:`` tags), view the authoritative :rocoto:`Rocoto documentation <>`.
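
As an illustration of the non-``LND`` branch of the dependency template, the fragment below sketches how the ``dependency:`` section of ``task_analysis:`` would be expected to render when ``APP`` is not ``"LND"`` (e.g., the :term:`ATML` configuration) and ``CUSTOM_JEDI_CONFIG_FLAG`` is ``"NO"``. It was derived by hand from the template shown in this section, not captured from an actual experiment, so the exact layout of a generated ``land_analysis.yaml`` file may differ.

.. code-block:: console

   task_analysis:
     dependency:
       and:
         # Task dependencies: both tasks must complete first
         taskdep_prep_data:
           attrs:
             task: prep_data
         taskdep_jcb:
           attrs:
             task: jcb
         # Data dependencies: either one of the two surface files is sufficient
         or:
           datadep_sfc1:
             attrs:
               age: 5
             value: "&DATADEP_SFC1;"
           datadep_sfc2:
             attrs:
               age: 5
             value: "&DATADEP_SFC2;"

In this sketch, the ``and:`` block requires every entry beneath it to be satisfied, while the nested ``or:`` block is satisfied as soon as either ``datadep_sfc1`` or ``datadep_sfc2`` is met.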