Configuration file options¶
This section lists and explains all the options available in the configuration file. Use it as a reference while preparing your own configuration. Each subsection corresponds to the section of the same name in the config file. Some subsections are split further for the sake of clarity, but these additional divisions have nothing to do with the config file syntax itself.
DIAGNOSTICS¶
This section contains the general configuration for the diagnostics. The explanation is divided into two subsections: the first covers the mandatory options that you must specify in every configuration, and the second covers the optional ones.
Mandatory configurations¶
- SCRATCH_DIR:
- Temporary folder for the calculations. Final results will never be stored here.
- DATA_DIR:
- ‘:’ separated list of folders to look for data in. For each folder in the list ($DATA_FOLDER), it will look for files in the paths $DATA_FOLDER/$EXPID and $DATA_FOLDER/$DATA_TYPE/$MODEL/$EXPID
- CON_FILES:
- Folder containing mask and mesh files for the dataset.
- FREQUENCY:
- Default data frequency to be used by the diagnostics. Some diagnostics can override this configuration or even ignore it completely.
- DIAGS:
- List of diagnostics to run. No specific order is needed: data dependencies will be enforced.
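As an illustration of the mandatory options above, the following minimal sketch checks that a configuration defines all of them and expands DATA_DIR into the candidate paths described for that option. It assumes the config file uses an INI-style syntax that Python's configparser can read. The helper names missing_mandatory and data_dir_candidates, as well as every path and value in the example, are hypothetical and are not part of Earth Diagnostics.

```python
import configparser
import posixpath

# Mandatory DIAGNOSTICS options, as listed in this section.
MANDATORY = ("SCRATCH_DIR", "DATA_DIR", "CON_FILES", "FREQUENCY", "DIAGS")

# Hypothetical example: all paths and values are placeholders.
EXAMPLE = """
[DIAGNOSTICS]
SCRATCH_DIR = /scratch/user/diags
DATA_DIR = /data/exp:/data/archive
CON_FILES = /data/con_files
FREQUENCY = mon
DIAGS = diag,opt1,opt2
"""


def missing_mandatory(text):
    """Return the mandatory DIAGNOSTICS options absent from the given text."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names in upper case
    parser.read_string(text)
    if not parser.has_section("DIAGNOSTICS"):
        return list(MANDATORY)
    return [name for name in MANDATORY
            if not parser.has_option("DIAGNOSTICS", name)]


def data_dir_candidates(data_dir, expid, data_type, model):
    """Build the two candidate paths for every folder listed in DATA_DIR."""
    candidates = []
    for folder in data_dir.split(":"):
        candidates.append(posixpath.join(folder, expid))
        candidates.append(posixpath.join(folder, data_type, model, expid))
    return candidates
```

With the example text above, missing_mandatory(EXAMPLE) returns an empty list, while a configuration lacking DIAGS would return a list containing only that name.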
Optional configurations¶
- SCRATCH_MASKS
- Common scratch folder for the ocean masks. This is useful to avoid replicating them for each run on the fat nodes. Default is ‘/scratch/Earth/ocean_masks’
- RESTORE_MESHES
- By default, Earth Diagnostics only copies the mask files if they are not present in the scratch folder. If this option is set to true, Earth Diagnostics will copy them regardless of existence. Default is False.
- DATA_ADAPTOR
- This is used to choose the mechanism for storing and retrieving data. Options are CMOR (for our own experiments) or THREDDS (for anything else). Default value is CMOR
- DATA_TYPE
- Type of the dataset to use. It can be exp, obs or recon. Default is exp.
- DATA_CONVENTION
- Convention to use for file paths and names and variable naming among other things. Can be SPECS, PREFACE, PRIMAVERA or CMIP6. Default is SPECS.
- CDFTOOLS_PATH
- Path to the folder containing the CDFTOOLS executables. Empty by default, in which case the CDFTOOLS binaries must be available on the system path.
- MAX_CORES
- Maximum number of cores to use. By default the diagnostics will use all cores available to them. It is not necessary when launching through a scheduler, as Earth Diagnostics can detect how many cores the scheduler has allocated to it.
- AUTO_CLEAN
- If True, Earth Diagnostics removes the temporary folder just after finishing. If RAM_DISK is set to True, this value is ignored and always treated as True. Default is True.
- RAM_DISK
- If set to True, the temporary files are created in the /dev/shm partition. This partition is not mounted from a disk. Instead, all files are created in RAM, which should improve performance at the cost of much higher memory consumption. Default is False.
- MESH_MASK
- Custom file to use instead of the corresponding mesh mask file.
- NEW_MASK_GLO
- Custom file to use instead of the corresponding new mask glo file
- MASK_REGIONS
- Custom file to use instead of the corresponding 2D regions file
- MASK_REGIONS_3D
- Custom file to use instead of the corresponding 3D regions file
EXPERIMENT¶
This section contains options related to the experiment’s definition or configuration.
- MODEL
- Name of the model used for the experiment.
- MODEL_VERSION
- Model version. Used to get the correct mask and mesh files
- ATMOS_TIMESTEP
- Time between outputs from the atmosphere. This is not the model simulation timestep! Default is 6.
- OCEAN_TIMESTEP
- Time between outputs from the ocean. This is not the model simulation timestep! Default is 6.
- ATMOS_GRID
- Atmospheric grid definition. Will be used as a default target for interpolation diagnostics.
- INSTITUTE
- Institute that made the experiment, observation or reconstruction
- EXPID
- Unique identifier for the experiment
- NAME
- Experiment’s name. By default it is the EXPID.
- STARTDATES
- Startdates to run as a space separated list
- MEMBER
- Members to run as a space separated list. You can just provide the number or also add the prefix
- MEMBER_DIGITS
- Minimum number of digits used to compose the member name. Default is 1. For example, for member 1 the member name will be fc1 if MEMBER_DIGITS is 1 or fc01 if MEMBER_DIGITS is 2
- MEMBER_PREFIX
- Prefix to use for the member names. Default is ‘fc’
- MEMBER_COUNT_START
- Number corresponding to the first member. For example, if your first member is ‘fc1’, it should be 1. If it is ‘fc0’, it should be 0. Default is 0
- CHUNK_SIZE
- Length of the chunks in months
- CHUNKS
- Number of chunks to run
- CHUNK_LIST
- List of chunks to run. If empty, all diagnostics will be applied to all chunks
- CALENDAR
- Calendar to use for date calculation. All calendars supported by Autosubmit are available. Default is ‘standard’
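The member naming rules described for MEMBER, MEMBER_DIGITS and MEMBER_PREFIX can be sketched as follows. This is only an illustration of the documented behaviour, not the actual Earth Diagnostics code. The function names member_name and parse_members are hypothetical.

```python
def member_name(number, prefix="fc", digits=1):
    """Compose a member name: the prefix followed by the zero-padded number."""
    return "{0}{1:0{2}d}".format(prefix, number, digits)


def parse_members(value, prefix="fc", digits=1):
    """Turn a space separated MEMBER value into normalized member names.

    Each item may be a bare number or a number preceded by the prefix.
    """
    names = []
    for token in value.split():
        if token.startswith(prefix):
            token = token[len(prefix):]  # drop the optional prefix
        names.append(member_name(int(token), prefix, digits))
    return names
```

For instance, member_name(1) gives fc1 and member_name(1, digits=2) gives fc01, matching the example given for MEMBER_DIGITS.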
CMOR¶
In this section, you can control how the cmorization process works. All options belonging to this section are optional.
Cmorization options¶
These options control when the cmorization runs and which variables are cmorized.
- FORCE
- If True, launches the cmorization, regardless of existence of the extracted files or the package containing the online-cmorized ones. If False, only the non-present chunks will be cmorized. Default value is False
- FORCE_UNTAR
- If True, unpacks the online-cmorized files regardless of the existence of extracted files. If FORCE is True, this parameter has no effect. If False, only the non-present chunks will be unpacked. Default value is False.
- FILTER_FILES
- Only cmorize original files containing any of the given strings. This is a space separated list. Default is the empty string.
- OCEAN_FILES
- Boolean flag to enable or disable the cmorization of NEMO files. Default is True.
- ATMOSPHERE_FILES
- Boolean flag to enable or disable the cmorization of IFS files. Default is True.
- USE_GRIB
- Boolean flag to enable or disable the cmorization of GRIB files for the atmosphere. If enabled and no GRIB files are present, it will cmorize using the MMA files instead (as if it was set to False). Default is True.
- CHUNKS
- Space separated list of chunks to be cmorized. If not provided, all chunks are cmorized
- VARIABLE_LIST
- Space separated list of variables to cmorize. Variables must be specified as domain:var_name. If none is specified, all the variables will be cmorized
Grib variables extraction¶
These three options are used to configure the variables to be cmorized from the GRIB atmospheric files. Variables must be specified by their IFS code in a comma separated list.
You can also specify the levels to extract using one of the following syntaxes:
- VARIABLE_CODE
- VARIABLE_CODE:LEVEL
- VARIABLE_CODE:LEVEL_1-LEVEL_2-…-LEVEL_N
- VARIABLE_CODE:MIN_LEVEL:MAX_LEVEL:STEP
Some examples to clarify it further:
- Variable with code 129 at level 30000: 129:30000
- Variable with code 129 at levels 30000, 40000 and 60000: 129:30000-40000-60000
- Variable with code 129 at levels between 30000 and 60000 with 10000 intervals: 129:30000:60000:10000, equivalent to 129:30000-40000-50000-60000
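To make the level syntaxes concrete, here is a minimal parsing sketch. It is an assumption about how the syntax could be interpreted based on the examples above, not the parser used by Earth Diagnostics. The function names parse_grib_variable and parse_grib_option are hypothetical, and returning None when no level is given is a choice made for this example only.

```python
def parse_grib_variable(spec):
    """Parse one variable specification into (code, levels).

    levels is None when the specification contains only the variable code.
    """
    parts = spec.strip().split(":")
    code = int(parts[0])
    if len(parts) == 1:
        # VARIABLE_CODE
        return code, None
    if len(parts) == 2:
        # VARIABLE_CODE:LEVEL or VARIABLE_CODE:LEVEL_1-LEVEL_2-...-LEVEL_N
        return code, [int(level) for level in parts[1].split("-")]
    if len(parts) == 4:
        # VARIABLE_CODE:MIN_LEVEL:MAX_LEVEL:STEP, with MAX_LEVEL included
        start, stop, step = (int(part) for part in parts[1:])
        return code, list(range(start, stop + 1, step))
    raise ValueError("Unrecognized variable specification: " + spec)


def parse_grib_option(value):
    """Parse a comma separated list of variable specifications."""
    return [parse_grib_variable(item)
            for item in value.split(",") if item.strip()]
```

Under these assumptions, 129:30000:60000:10000 and 129:30000-40000-50000-60000 both yield code 129 with levels 30000, 40000, 50000 and 60000.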
- ATMOS_HOURLY_VARS
- Configuration of variables to be extracted on an hourly basis
- ATMOS_DAILY_VARS
- Configuration of variables to be extracted on a daily basis
- ATMOS_MONTHLY_VARS
- Configuration of variables to be extracted on a monthly basis
Metadata options¶
The options in this subsection only set the values of the attributes with the same names in the cmorized files.
- ASSOCIATED_EXPERIMENT
- Default value is ‘to be filled’
- ASSOCIATED_MODEL
- Default value is ‘to be filled’
- INITIALIZATION_DESCRIPTION
- Default value is ‘to be filled’
- INITIALIZATION_METHOD
- Default value is ‘1’
- PHYSICS_DESCRIPTION
- Default value is ‘to be filled’
- PHYSICS_VERSION
- Default value is ‘1’
- SOURCE
- Default value is ‘to be filled’
- VERSION
- Dataset version to use (not present in all conventions)
- DEFAULT_OCEAN_GRID
- Name of the default ocean grid for those conventions that require it (CMIP6 and PRIMAVERA). Default is gn.
- DEFAULT_ATMOS_GRID
- Name of the default atmos grid for those conventions that require it (CMIP6 and PRIMAVERA). Default is gr.
- ACTIVITY
- Name of the activity. Default is CMIP
THREDDS¶
For now, there is only one option for the THREDDS server configuration.
- SERVER_URL
- THREDDS server URL
ALIAS¶
This config file section is different from all the others because it does not contain a set of configurations. Instead, in this section the user can define a set of aliases to launch their most used configurations with ease. To do this, the user must add an option named after the desired alias and assign to it the configuration or configurations to launch when this alias is invoked. See the next example:
ALIAS_NAME = diag,opt1,opt2 diag,opt1new,opt2
In this case, the user has defined a new alias ‘ALIAS_NAME’ that can be used to launch the diagnostic ‘diag’ twice, the first time with the options ‘opt1’ and ‘opt2’ and the second time replacing ‘opt1’ with ‘opt1new’.
In this example, configuring the DIAGS as
DIAGS = ALIAS_NAME
will be identical to
DIAGS = diag,opt1,opt2 diag,opt1new,opt2
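The substitution described above can be sketched in a few lines. This is only an illustration of the documented behaviour. The function name expand_diags is hypothetical, and the exact, case-sensitive matching of alias names is an assumption of this example rather than documented behaviour.

```python
def expand_diags(diags, aliases):
    """Replace every alias found in a DIAGS value by its definition.

    diags is the space separated DIAGS value and aliases maps each alias
    name to its space separated list of diagnostic configurations.
    """
    expanded = []
    for item in diags.split():
        if item in aliases:
            # An alias may stand for several diagnostic configurations.
            expanded.extend(aliases[item].split())
        else:
            expanded.append(item)
    return " ".join(expanded)


# Alias definition taken from the example in this section.
ALIASES = {"ALIAS_NAME": "diag,opt1,opt2 diag,opt1new,opt2"}
```

Calling expand_diags("ALIAS_NAME", ALIASES) produces the same string as the expanded DIAGS line shown above, and entries that are not aliases are passed through unchanged.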