
PMIP4 and the CMIP6 DRS

All the attributes have to be defined properly when creating data for the CMIP6 database. The sections below give details about the attributes that are especially relevant for PMIP4.

Some key concepts...

  • attribute: a global attribute (e.g. in a NetCDF file) used to describe the data
  • CV: sometimes the value of a given attribute has to be taken from a predefined set of values, known as a Controlled Vocabulary (CV)
  • DRS = Data Reference Syntax: the DRS is used to identify experiments, simulations, ensembles of experiments, atomic datasets and is used, for example, in file names, directory structures, the further_info_url, and in facets of some search tools
  • facet = a category or attribute you can put a search constraint on, when doing a faceted search

Example: the experiment_id attribute is used in the DRS, and its value has to be chosen from a CV ([piControl, past1000, lgm, …]).
On the IPSL CMIP5 search node, you can put a search constraint on the Experiment facet by clicking on Experiment and then selecting lgm and clicking on Search

CMIP6 official specifications

The following CMIP6 document is still in prep (as of July 14th 2016)

This document specifies all the global attributes that are defined for CMIP6. It also indicates how a subset of those relate to the Data Reference Syntax (DRS) and are used in file names and directory structures. Controlled vocabularies are defined for some global attributes (e.g., source_type and grid_resolution).

Project identification attributes

Project | activity_id | mip_era | Note
CMIP6 | CMIP | CMIP6 |
PMIP4-CMIP6 | CMIP or “CMIP PMIP”?? | CMIP6 | Should we use CMIP or “CMIP PMIP” for PMIP4 experiments that are part of CMIP6? This is confusing.
PMIP4 | PMIP | PMIP4?? or CMIP6?? | Use this for non-CMIP6 experiments, or groups that are not part of CMIP6. Should we use PMIP4 because it is the 4th phase of PMIP, or CMIP6 because we will be using CMIP6 format specifications?

Experiment names

You can find all the referenced experiment names on the es-doc search site: select Project=CMIP6-DRAFT and Type=Experiment.

DECK and historical experiments

FIXME How do we specify that a historical experiment is the true continuation of a past1000 experiment?

We probably need to use parent_experiment_id=past1000 in the files' metadata, as well as parent_activity_id, parent_mip_era=CMIP6, parent_source_id, branch_time_in_parent and the other related parent_* attributes. We can probably also agree on a specific variant_label that will appear in the file names.
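As a rough illustration of the parent_* attributes listed above, the sketch below collects them in a dictionary that could later be written as global attributes. The function name, the inclusion of parent_variant_label, and the example values ("PMIP", "MY-MODEL", the branch time) are assumptions made for this sketch, not agreed PMIP4 settings.

```python
def parent_attributes(parent_activity_id, parent_source_id,
                      parent_variant_label, branch_time_in_parent):
    """Collect the parent_* global attributes of a historical run
    that continues a past1000 run (hypothetical helper)."""
    return {
        # Values stated in the text above
        "parent_experiment_id": "past1000",
        "parent_mip_era": "CMIP6",
        # Values that each group has to supply
        "parent_activity_id": parent_activity_id,
        "parent_source_id": parent_source_id,
        # Assumed here to be one of the "other related parent_*" attributes
        "parent_variant_label": parent_variant_label,
        "branch_time_in_parent": float(branch_time_in_parent),
    }


# Placeholder values, for illustration only
attrs = parent_attributes("PMIP", "MY-MODEL", "r1i1p1f1", 365000.0)
for name, value in sorted(attrs.items()):
    print(f"{name} = {value}")
```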

Reminder: DECK = Diagnostic, Evaluation and Characterization of Klima. More information on the CMIP6 experiments is available in Eyring et al 2017

experiment_id | experiment
amip | Atmospheric Model Intercomparison Project
piControl | Pre-Industrial Control
abrupt-4xCO2 | abrupt quadrupling of CO2
1pctCO2 | 1 percent per year increase in CO2
historical | all-forcing simulation of the recent past

PMIP4-CMIP6 experiments

You can find specific details about the experiments by visiting the PMIP4 experimental design section, or by directly clicking on one of the experiments below

experiment_id | experiment
past1000 | past 1000 years
mid-Holocene | mid-Holocene
lgm | last glacial maximum
lig127k | last interglacial
midPliocene-eoi400 | mid-Pliocene

PMIP4 only experiments

Guidelines for creating new PMIP4 experiment_id values

Proposed PMIP4 experiment_id values

The following suggested PMIP4 experiment_id values should be considered a work in progress until they are validated!

You can find specific details about the experiments by visiting the PMIP4 experimental design section, or by directly clicking on one of the experiments below

experiment_id | experiment | Status
LDv1-LGMspin | Last Glacial Maximum spinup | Work
LDv1-transpin | Transient orbit and trace gases spinup (26-21 ka) | Work
LDv1 | Transient deglaciation (21-0 ka) | Work
FIXME | Early Holocene (9.5 ka? 8.5 ka?) | Work
lig116k ? | Transition from the LIG to the glacial (2 experiments) | Work
FIXME | MIS11 (2 experiments) | Work
FIXME | Transient Holocene (6 ka to 0) | Work
FIXME | Transient LIG (130 ka to 125 ka) | Work
FIXME | DeepMIP | Work
FIXME | More experiments to come… |

Handling groups of simulations

CMIP5 ensemble member

aka r<N>i<M>p<L> or rip

The definitions below have been superseded by CMIP6 specifications, but it is still useful to remember them. They have been copied from:

CMIP6 variant_label

aka r<k>i<l>p<m>f<n> or ripf

PMIP4 and variant_label notes

Reminder: each index in r<k>i<l>p<m>f<n> has to be a strictly positive integer
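The constraint above can be checked mechanically. The following is a minimal sketch, assuming the label is exactly of the form r<k>i<l>p<m>f<n>; the function name and the short dictionary keys are hypothetical and chosen only for this example.

```python
import re

# r<k>i<l>p<m>f<n>, each index written as a decimal integer
VARIANT_LABEL_RE = re.compile(r"r(\d+)i(\d+)p(\d+)f(\d+)")


def parse_variant_label(label):
    """Split a variant_label into its four indices, keyed by their letter.

    Raises ValueError if the label is malformed or if any index
    is not a strictly positive integer.
    """
    match = VARIANT_LABEL_RE.fullmatch(label)
    if match is None:
        raise ValueError(f"not a valid variant_label: {label!r}")
    indices = dict(zip("ripf", (int(group) for group in match.groups())))
    if any(value < 1 for value in indices.values()):
        raise ValueError(f"indices must be strictly positive: {label!r}")
    return indices


print(parse_variant_label("r1i1000p1f211"))
# {'r': 1, 'i': 1000, 'p': 1, 'f': 211}
```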

realization_index r<k>

The long PMIP4 simulations are going to require both a lot of processing power and a lot of storage. It is quite likely that there will be only one realization for a given set of i<l>p<m>f<n>, and that the variant label will always start with r1.

forcing_index f<n>

Depending on available resources, the PMIP4 groups may choose to perform several simulations for the same experiment, using different combinations of forcings. The forcings used will have to be carefully described in the documentation (and in the metadata inside each NetCDF file) and be encoded in the integer value of the forcing_index.

There are several ways to proceed. The easiest way is to let each group choose its own way of numbering the forcing combinations (and document it!), but all groups should try to use a common scheme and associate the same combination of forcings with the same integer.

Sequential numbering scheme

The contact people for each experiment determine which forcing combinations are most likely to be used and associate each of them with a predefined number. If necessary, a group can later ask for a new forcing combination to be registered.

Forcings | f<forcing_index>
Recommended default, or most likely combination, or mandatory simulation | f1
forcing1='on', forcing2='off', etc | f2
Some other combination | fN

Hierarchical numbering scheme

The following scheme will create bigger integers, but the values will be more meaningful

If there are 10 or fewer options for each type of forcing, we can assign a power of 10 to each type, multiply it by the number of the selected option, and add everything up

Tentative example for the lgm experiment:

Power | Forcing | Options
2 | Ice sheet | 1=ICE-6G-C, 2=GLAC-1D
1 | Aerosols | 1=Hopcroft et al, 2=Albani et al
0 | Vegetation | 1=interactive vegetation, 2=interactive carbon cycle, 3=prescribed

Example: GLAC-1D + Hopcroft et al + interactive vegetation = 2 * 100 + 1 * 10 + 1 * 1 = 211, which gives f211
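The arithmetic of the hierarchical scheme can be sketched as follows, using the tentative lgm powers from the table above. The dictionary and function names are hypothetical, and the sketch assumes option numbers between 1 and 9 so that each forcing type occupies exactly one decimal digit and the resulting index stays strictly positive.

```python
# Power of 10 assigned to each forcing type (tentative lgm example)
LGM_FORCING_POWERS = {"ice_sheet": 2, "aerosols": 1, "vegetation": 0}


def encode_forcing_index(options, powers=LGM_FORCING_POWERS):
    """Combine one option number per forcing type into a forcing_index."""
    total = 0
    for name, power in powers.items():
        option = options[name]
        if not 1 <= option <= 9:  # assumption: one decimal digit per type
            raise ValueError(f"{name}: option must be in 1..9, got {option}")
        total += option * 10 ** power
    return total


def decode_forcing_index(index, powers=LGM_FORCING_POWERS):
    """Recover the option number of each forcing type from a forcing_index."""
    return {name: (index // 10 ** power) % 10 for name, power in powers.items()}


# GLAC-1D (2) + Hopcroft et al (1) + interactive vegetation (1)
index = encode_forcing_index({"ice_sheet": 2, "aerosols": 1, "vegetation": 1})
print(f"f{index}")  # f211
print(decode_forcing_index(index))
# {'ice_sheet': 2, 'aerosols': 1, 'vegetation': 1}
```

Decoding is what makes the larger integers "more meaningful": the forcings used can be read back from the variant_label without consulting a registry.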

variant_label constraints for PMIP4 experiments

historical

Historical simulations that are the continuation of a past1000 simulation should use 1000 for the initialization_method, i.e. i1000

past1000

FIXME

mid-Holocene

FIXME

There are at least 2 sensitivity experiments

lgm

FIXME

lig127k

FIXME

There are at least 3 sensitivity experiments

midPliocene-eoi400

FIXME

LD-LGMspin

FIXME

LD-transpin

FIXME

LD

FIXME

Other PMIP4 specific experiments

There may be other experiments listed in Proposed PMIP4 experiment_id values and PMIP4 experiments

FIXME

PMIP4-CMIP6 directory structure and file names

The DRS defines (among other things) how the different attributes will be combined to generate unambiguous directories and file names, in the ESGF distributed database

Directory structure = <mip_era>/
                        <activity_id>/
                          <institution_id>/
                            <source_id>/
                              <experiment_id>/
                                <member_id>/         <== variant_label
                                  <table_id>/
                                    <variable_id>/
                                      <grid_label>/
                                        <version>/

file name = <variable_id>_<table_id>_<experiment_id>_<source_id>_<member_id>_<grid_label>[_<time_range>].nc

For PMIP4, we have sub_experiment_id == none (because we don't use forecast and hindcast), and therefore member_id == variant_label
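The two templates above can be assembled from the attribute values as in the sketch below. The function names and the example values (in particular the source_id "MY-MODEL", the activity_id and the time range) are placeholders chosen for illustration. Only the first activity is kept in the directory path, following the rule quoted in this page for multi-valued activity_id attributes.

```python
from pathlib import PurePosixPath


def drs_directory(attrs, version):
    """Build the DRS directory for one atomic dataset."""
    return PurePosixPath(
        attrs["mip_era"],
        attrs["activity_id"].split()[0],  # "CMIP PMIP" becomes "CMIP"
        attrs["institution_id"],
        attrs["source_id"],
        attrs["experiment_id"],
        attrs["member_id"],  # equal to variant_label for PMIP4
        attrs["table_id"],
        attrs["variable_id"],
        attrs["grid_label"],
        version,  # only known when the data are published
    )


def drs_filename(attrs, time_range=None):
    """Build the DRS file name; time_range is omitted when inappropriate."""
    keys = ("variable_id", "table_id", "experiment_id",
            "source_id", "member_id", "grid_label")
    parts = [attrs[key] for key in keys]
    if time_range:
        parts.append(time_range)
    return "_".join(parts) + ".nc"


# Placeholder attribute values, for illustration only
example = {
    "mip_era": "CMIP6", "activity_id": "CMIP PMIP", "institution_id": "IPSL",
    "source_id": "MY-MODEL", "experiment_id": "lgm", "member_id": "r1i1p1f1",
    "table_id": "Amon", "variable_id": "tas", "grid_label": "gr",
}
print(drs_directory(example, "v20160218"))
# CMIP6/CMIP/IPSL/MY-MODEL/lgm/r1i1p1f1/Amon/tas/gr/v20160218
print(drs_filename(example, "185001-194912"))
# tas_Amon_lgm_MY-MODEL_r1i1p1f1_gr_185001-194912.nc
```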

Used in dir? | Used in file? | Attribute name | Value for PMIP4-CMIP6
Y | N | mip_era | CMIP6, or PMIP4 (?)
Y | N | activity_id | CMIP or PMIP. Note: “CMIP PMIP” becomes CMIP (if multiple activities are listed in the global attribute, the first one is used in the directory structure)
Y | N | institution_id | institution label (IPSL, …)
Y | N | version | vYYYYMMDD (e.g., v20160218), indicating a representative date for the version. Note: the version is not stored in the NetCDF files and not used in the file names, because it is only specified when publishing the NetCDF files (e.g. storing the data in ESGF)
Y | Y | source_id | source label (e.g. the model name/version using only authorized characters)
Y | Y | experiment_id | See the Experiment names section
Y | Y | member_id | PMIP4 does not use sub_experiment_id, so the value of member_id is equal to the variant_label: r<k>i<l>p<m>f<n> (see the CMIP6 variant label section)
Y | Y | table_id | CMOR table label (Amon, …)
Y | Y | variable_id | variable identifier (tas, pr, …)
Y | Y | grid_label | gn: output is reported on the native grid; gr: output is regridded by the modeling group to a “primary grid” of its choosing; gr1, gr2, …: output is regridded to a grid other than the primary grid (which is itself different from the native grid)
N | Y | time_range | the last segment of the file name indicates the time range spanned by the data in the file, and is omitted when inappropriate. The format for this segment is the same as in CMIP5

Examples:

CMIP6 data license and acknowledgement

PMIP data users have to add the following PMIP-specific sentence to their acknowledgement:
PMIP is endorsed by both WCRP/WGCM and Future Earth/PAGES

Data users

The data end users have to follow the Terms of Use and licensing information detailed on the CMIP6: Proper citation and acknowledgement page.

Data providers

The Terms of Use are also detailed in the license global attribute available in each data file created by the providers

Note: you can get the latest version of the license on github.

The “license” attribute should record the following statement (with segments in square brackets optional, and with required, appropriate text entered in place of <*> ): 

“CMIP6 model data produced by <Your Centre Name> is licensed under a Creative Commons Attribution-[NonCommercial-]ShareAlike 4.0 International License (https://creativecommons.org/licenses). Use of the data must be acknowledged following guidelines found at https://pcmdi.llnl.gov/home/CMIP6/CitationRequirements6-0.html. Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file)[ and at <some URL maintained by modeling group>]. The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose. All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law.”

The optional “[NonCommercial-]” segment indicates that institutions may choose to use the non-commercial version of this license by inserting the word “NonCommercial” at this point, but this will significantly limit the use of the data in downstream climate mitigation and adaptation applications. Please do not simply copy the statement above when writing data: some text must be entered, some text is optional, and the square brackets and “<*>” placeholders should not appear in the licensing text.
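One way to avoid leaving brackets or placeholders in the license attribute is to generate the text from a template. The sketch below reuses the wording of the statement quoted above; the function name, its parameters and the example centre name are hypothetical.

```python
LICENSE_TEMPLATE = (
    "CMIP6 model data produced by {centre} is licensed under a Creative "
    "Commons Attribution-{non_commercial}ShareAlike 4.0 International License "
    "(https://creativecommons.org/licenses). Use of the data must be "
    "acknowledged following guidelines found at "
    "https://pcmdi.llnl.gov/home/CMIP6/CitationRequirements6-0.html. "
    "Further information about this data, including some limitations, can be "
    "found via the further_info_url (recorded as a global attribute in this "
    "file){extra_url}. The data producers and data providers make no "
    "warranty, either express or implied, including, but not limited to, "
    "warranties of merchantability and fitness for a particular purpose. "
    "All liabilities arising from the supply of the information (including "
    "any liability arising in negligence) are excluded to the fullest extent "
    "permitted by law."
)


def license_text(centre, non_commercial=False, group_url=None):
    """Fill in the required and optional segments of the license statement."""
    return LICENSE_TEMPLATE.format(
        centre=centre,
        non_commercial="NonCommercial-" if non_commercial else "",
        extra_url=f" and at {group_url}" if group_url else "",
    )


# Placeholder centre name, for illustration only
text = license_text("My Centre", non_commercial=True)
print(text[:120])
```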

Details about the Creative Commons licenses

You can read “The Licenses” section of the “About The Licenses” page if you want to understand the Creative Commons copyright licenses. You can also read the “Six licenses for sharing your work” summary pdf.

More specifically, CMIP6 data will be distributed under the following 2 licenses (each institute has to choose one license)

Abbreviation | Full name | Details
CC BY-SA 4.0 | Attribution-ShareAlike 4.0 International | https://creativecommons.org/licenses/by-sa/4.0/
CC BY-NC-SA 4.0 | Attribution-NonCommercial-ShareAlike 4.0 International | https://creativecommons.org/licenses/by-nc-sa/4.0/

Note: CC buttons and logos are available from the CC site Downloads page.