Attribute Convention for ORCESTRA
Goals
This convention defines a minimal set of global attributes for datasets. The attributes make it possible to:
- Automatically retrieve contact information for any dataset
- Create a summary table of all datasets with some key metadata
For additional metadata not covered by this convention, we recommend following the CF and ACDD conventions.
Global attributes
The following section lists a selection of required and recommended global attributes for datasets. The required attributes are essential for the ORCESTRA Data Browser to render proper landing pages.
If an attribute has no value, we recommend omitting it rather than setting it to an empty string.
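The recommendation above can be applied mechanically before writing a dataset. The following is a minimal sketch, assuming the global attributes are held in a plain Python dict; `drop_unset` is a hypothetical helper name, not part of any ORCESTRA tooling.

```python
def drop_unset(attrs):
    """Return a copy of attrs without keys whose value is None or a blank string."""
    return {
        key: value
        for key, value in attrs.items()
        if value is not None and not (isinstance(value, str) and not value.strip())
    }


# Illustrative input: two attributes carry no information and are dropped.
raw = {
    "title": "BEACH dropsonde dataset (Level 3)",
    "instrument": "",
    "keywords": None,
}
cleaned = drop_unset(raw)
print(cleaned)
```

Dropping the keys, rather than keeping blank values, means a consumer can test for an attribute with a simple membership check.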
Required
| Key | Value |
|---|---|
| `title` | Short phrase or sentence describing the dataset |
| `summary` | Paragraph describing the dataset |
| `creator_name` | Comma-separated list of names |
| `creator_email` | Comma-separated list of emails |
| `license` | |
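A check for the required attributes could look like the sketch below. The key names in `REQUIRED_KEYS` are an assumption: they are taken from the leading entries of the example file in this document. `missing_required` is a hypothetical helper.

```python
# Assumed key names, inferred from the example file in this document.
REQUIRED_KEYS = ("title", "summary", "creator_name", "creator_email", "license")


def missing_required(attrs, required=REQUIRED_KEYS):
    """List required keys that are absent or blank in attrs."""
    return [key for key in required if not str(attrs.get(key) or "").strip()]


# Illustrative input: only the title carries a value.
incomplete = {"title": "BEACH dropsonde dataset (Level 3)", "summary": " "}
print(missing_required(incomplete))
```

Blank strings are reported as missing here, which matches the recommendation to omit attributes rather than leave them empty.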
Recommended
| Key | Value |
|---|---|
| `featureType` | Type of sampling geometry according to CF conventions |
| `project` | Comma-separated list of projects (see below) |
| `platform` | Name of the platform that supported the sensor (see below) |
| `source` | Method of production of the original data (e.g. “radar”, “radiometer”, “CTD”) |
| `history` | Audit trail for modifications to the original data |
| `references` | Comma-separated list of URL/DOI to extended information |
| `keywords` | Comma-separated list of keywords |
| `processing_level` | A textual description of the processing (or quality control) level of the data. |
| `institution` | Institution responsible for the dataset |
| `instrument` | Instrument used to measure the data (may be set in addition to |
| | Comma-separated list of identifiers (e.g. ORCID) |
| `Conventions` | A comma-separated list of the conventions that are followed by the dataset, e.g., ‘ACDD-1.3, CF-1.12’. |
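Several attributes hold comma-separated lists, and retrieving contact information means splitting them. The sketch below assumes that names and emails are paired by position, which the convention does not state explicitly. It also assumes the keys `creator_name` and `creator_email` as used in the example file. `split_list` and `contacts` are hypothetical helpers.

```python
def split_list(value):
    """Split a comma-separated attribute value into stripped, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def contacts(attrs):
    """Pair creator names with emails by position (an assumption of this sketch)."""
    names = split_list(attrs.get("creator_name", ""))
    emails = split_list(attrs.get("creator_email", ""))
    if len(names) != len(emails):
        raise ValueError(f"{len(names)} names but {len(emails)} emails")
    return list(zip(names, emails))


attrs = {
    "creator_name": "Helene Gloeckner, Theresa Mieslinger, Nina Robbins",
    "creator_email": "helene.gloeckner@mpimet.mpg.de, theresa.mieslinger@mpimet.mpg.de, nina.robbins@mpimet.mpg.de",
}
for name, email in contacts(attrs):
    print(f"{name} <{email}>")
```

Because the comma is the separator, a name written as "Surname, Given name" would be split in two; this sketch does not try to detect that case.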
Annotating datasets without native metadata
Our attribute convention is based on the use of global attributes. However, not all data formats support metadata natively (e.g., CSV, PDF). While we strongly recommend converting data to formats that support attributes, a fallback mechanism is provided for externally annotating data of any format.
Each directory containing a file named dataset_meta.yaml is considered a dataset.
This YAML file must include an attributes block that defines the dataset’s global attributes.
All attributes must comply with the ORCESTRA Attribute Convention.
The file may also include an optional extent block, used to provide information that would otherwise be derived from dataset coordinates.
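Because a dataset is identified by the presence of `dataset_meta.yaml`, discovering datasets amounts to a recursive file search. The following sketch uses only the standard library; `find_datasets` is a hypothetical helper, and the temporary directory tree exists only to make the example self-contained.

```python
import tempfile
from pathlib import Path

META_NAME = "dataset_meta.yaml"


def find_datasets(root):
    """Return every directory below root (root included) that holds a metadata file."""
    return sorted(p.parent for p in Path(root).rglob(META_NAME) if p.is_file())


# Build a small throwaway tree to demonstrate the search.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "dropsondes" / "level3").mkdir(parents=True)
    (root / "dropsondes" / "level3" / META_NAME).write_text("attributes: {}\n")
    (root / "notes").mkdir()  # no metadata file, so not a dataset
    found = [d.relative_to(root).as_posix() for d in find_datasets(root)]

print(found)
```

Parsing the file itself requires a YAML library, which is left out here to keep the sketch free of third-party dependencies.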
Currently, the extent block supports the following keys:
| Key | Type | Description |
|---|---|---|
| `temporal` | [string] | Temporal extent in ISO 8601 format |
| `spatial` | [float] | Geospatial extent defined by a GeoJSON Bounding Box |
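A basic plausibility check of the extent block might look as follows. This sketch assumes the key names `temporal` and `spatial` from the example file, that `temporal` is a start/end pair, and that `spatial` is a two-dimensional GeoJSON bounding box in the order west, south, east, north. `check_extent` is a hypothetical helper.

```python
from datetime import datetime


def check_extent(extent):
    """Return a list of problems found in an extent block (empty if none)."""
    problems = []

    temporal = extent.get("temporal")
    if temporal is not None:
        if len(temporal) != 2:
            problems.append("temporal: expected [start, end]")
        else:
            try:
                start, end = (datetime.fromisoformat(t) for t in temporal)
                if start > end:
                    problems.append("temporal: start is after end")
            except (TypeError, ValueError) as err:
                problems.append(f"temporal: {err}")

    spatial = extent.get("spatial")
    if spatial is not None:
        if len(spatial) != 4:
            problems.append("spatial: expected [west, south, east, north]")
        else:
            west, south, east, north = spatial
            # west may exceed east for boxes crossing the antimeridian,
            # so only the value ranges are checked for longitudes.
            if not (-180 <= west <= 180 and -180 <= east <= 180):
                problems.append("spatial: longitude outside [-180, 180]")
            if not (-90 <= south <= north <= 90):
                problems.append("spatial: latitudes invalid or out of order")

    return problems


example = {
    "temporal": ["2024-08-09T14:26:37", "2024-09-28T19:30:47"],
    "spatial": [-59.45647812, 1.29273319, -19.62099838, 22.03603554],
}
print(check_extent(example))
```

`datetime.fromisoformat` accepts only a subset of ISO 8601 on older Python versions, so a stricter or more permissive parser may be preferable in practice.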
Example:
    attributes:
      title: BEACH dropsonde dataset (Level 3)
      summary: This dataset is the Level 3 BEACH dataset. It contains quality controlled
        dropsonde data from the ORCESTRA field campaign with a common altitude dimension.
      creator_name: Helene Gloeckner, Theresa Mieslinger, Nina Robbins
      creator_email: helene.gloeckner@mpimet.mpg.de, theresa.mieslinger@mpimet.mpg.de, nina.robbins@mpimet.mpg.de
      license: CC-BY-4.0
      featureType: trajectoryProfile
      history: Level 1 ASPEN processing with Aspen V4.0.4
      keywords: ORCESTRA, BEACH, Sounding, Dropsondes, Atmospheric Profiles
      platform: HALO
      project: ORCESTRA, PERCUSION, MAESTRO
      references: https://github.com/atmdrops/pydropsonde
      source: dropsondes
    extent:
      temporal: ["2024-08-09T14:26:37", "2024-09-28T19:30:47"]
      spatial: [-59.45647812, 1.29273319, -19.62099838, 22.03603554]
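One of the goals of this convention is a summary table over all datasets. The sketch below assumes that each dataset's attributes are already available as a Python dict, for example after parsing the `attributes` block with a YAML library. `summary_table` and the choice of columns are hypothetical.

```python
def summary_table(datasets, columns=("title", "platform", "project", "license")):
    """Render a list of attribute dicts as a pipe-delimited table."""
    header = "| " + " | ".join(columns) + " |"
    rule = "|" + "---|" * len(columns)
    rows = [
        "| " + " | ".join(str(attrs.get(col, "")) for col in columns) + " |"
        for attrs in datasets
    ]
    return "\n".join([header, rule, *rows])


datasets = [
    {
        "title": "BEACH dropsonde dataset (Level 3)",
        "platform": "HALO",
        "project": "ORCESTRA, PERCUSION, MAESTRO",
        "license": "CC-BY-4.0",
    },
]
table = summary_table(datasets)
print(table)
```

Attributes that a dataset does not define are rendered as empty cells, so datasets with only the required attributes still appear in the table.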
Appendix: Controlled Vocabularies
Project identifiers:
ORCESTRA, BOW-TIE, CELLO, CLARINET, MAESTRO, PERCUSION, PICCOLO, SCORE, STRINQS
Platform identifiers:
EarthCARE, HALO, ATR-42, INCAS KingAir, BCO, CVAO, RV METEOR, INMG, MSG
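The vocabularies above lend themselves to an automated check. The following sketch copies both lists verbatim and assumes the attribute keys `project` and `platform` from the example file. `platform` is split on commas in the same way as `project`, which is harmless for a single value. `unknown_terms` is a hypothetical helper.

```python
PROJECTS = {
    "ORCESTRA", "BOW-TIE", "CELLO", "CLARINET", "MAESTRO",
    "PERCUSION", "PICCOLO", "SCORE", "STRINQS",
}
PLATFORMS = {
    "EarthCARE", "HALO", "ATR-42", "INCAS KingAir", "BCO",
    "CVAO", "RV METEOR", "INMG", "MSG",
}


def unknown_terms(attrs):
    """Map attribute key to the terms that are not in its controlled vocabulary."""
    unknown = {}
    for key, vocabulary in (("project", PROJECTS), ("platform", PLATFORMS)):
        terms = [t.strip() for t in attrs.get(key, "").split(",") if t.strip()]
        bad = [t for t in terms if t not in vocabulary]
        if bad:
            unknown[key] = bad
    return unknown


print(unknown_terms({"project": "ORCESTRA, PERCUSION, MAESTRO", "platform": "HALO"}))
print(unknown_terms({"project": "ORCESTRA, PERCUSSION"}))
```

The comparison is case-sensitive and exact, so a spelling variant such as "PERCUSSION" is reported rather than silently accepted.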