Introduction:
The Research Data Archive (RDA) Dataset Submission system is a web-based interface that enables the following:

  • Allows potential non-NCAR/UCP dataset submitters to submit dataset information online, so that the NCAR Computational and Information Systems Laboratory (CISL) Data Engineering and Curation Section (DECS) team members can evaluate whether the data is appropriate to be archived in the RDA.
  • Informs the potential dataset submitters about the terms and conditions under which the dataset will be evaluated, ingested, and managed.

Motivation:
Through the management provided by the DECS, the RDA has established itself as an archive that specializes in atmospheric and climate datasets.  The RDA's mission statement directs the archive to collect, manage, and preserve scientific datasets for the long term in order to support research in the atmospheric, weather, climate, and related sciences.  Consequently, the RDA's datasets can be discovered, accessed, and used/reused by the scientific, education, and private sector communities.

Scope of RDA Dataset Collection:
The RDA welcomes submission requests for datasets that primarily support weather and climate research.  The datasets will be considered on a case-by-case basis, constrained by available resources (personnel time and infrastructure capacity), relevance to existing data collections in the RDA, and importance for the core weather and climate research sponsored by the NSF.

Process:

  • When using the RDA Dataset Submission system:
    • A potential dataset submitter who does not have an existing RDA account will be asked to create one.
    • Once an account has been created, a new Dataset Submission Form can be filled in by following the instructions provided.
    • After the Dataset Submission Form's mandatory fields have been completed, the Form can be submitted for evaluation.
      • Completion of the optional fields is recommended.
  • Please note that the submission of the Dataset Submission Form does not guarantee the acceptance of the dataset.  
    • Acceptance depends on many factors, including, but not limited to, the appropriateness of the data for weather and climate research, the estimated term of relevance and popularity for the community, the metadata and data standards used to create the dataset, and, importantly, the infrastructure and human resources available to support the dataset in the RDA.
    • For additional details regarding the Appraisal and Selection decision-making process, please refer to the RDA Appraisal and Selection Workflow.

Rights / Terms, Conditions for Use, Collaboration, and Ownership:

  • By choosing to provide dataset information for appraisal and selection considerations, the dataset submitter acknowledges and accepts the UCAR/NCAR Privacy Policy.
  • In the case where the dataset is selected to be ingested into the RDA:
    • The dataset submitter acknowledges and accepts the Terms of Use for UCAR Data Repositories and Copyright Issues outlined by UCAR/NCAR.
    • The dataset submitter agrees to work collaboratively with the RDA DECS team members to ensure the metadata and data are accurately described and transferred to the NCAR infrastructure.
    • The dataset submitter agrees to work collaboratively with the RDA DECS team members to determine and agree upon the appropriate level of curation as described in Description of RDA Dataset Collection Curation Levels.
    • The dataset submitter/provider can be cited by name in the data citation recommended to RDA users.
    • The RDA agrees to host the dataset for a minimum of 5 years.

Data Preservation Policy:

The DECS is committed to providing long-term preservation of, and access to, the digital assets it hosts in the RDA. Guided by community-supported best practices such as the OAIS reference model, DECS staff use digital preservation strategies that adapt to an evolving social and technical environment. Technical details of the digital preservation strategies are provided in RDA Data Security/Resilience Overview. Additionally, all RDA datasets receive the following digital preservation support:

  • A Digital Object Identifier (DOI) that will always point to the dataset homepage.
  • Version control to support provenance tracking of any changes to dataset metadata and/or dataset data files.
  • Secure storage and disaster recovery backups for both the dataset metadata and data files.
  • Migration to new storage media on a regular cycle.
  • Routine MD5 checksum fixity checks to validate the integrity of the data files.
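
As an illustration only, the fixity check named in the last item can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the RDA's actual tooling: the function names md5_of_file and check_fixity, and the manifest layout (a mapping from file path to previously recorded MD5 digest), are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest):
    """Compare each file's current MD5 digest with its recorded value.

    manifest: dict mapping file path -> expected hex digest (hypothetical layout).
    Returns the list of paths that are missing or whose digest has changed.
    """
    failures = []
    for path, expected in manifest.items():
        if not Path(path).is_file() or md5_of_file(path) != expected.lower():
            failures.append(path)
    return failures
```

A scheduled job could call check_fixity on a stored manifest and report any returned paths for restoration from backup; an empty list means every file still matches its recorded checksum.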

To avoid preservation and future usability issues related to nuanced or proprietary data formats, the DECS asks all RDA data submitters to provide data in community-accepted and community-supported formats, such as those listed in NCAR's Climate Data Guide.

In the unexpected event that the RDA would need to cease its activities, DECS staff will follow the plan to preserve data outlined in the RDA strategy to wind down services due to a change in NSF or NCAR’s strategic direction.

Finally, all datasets ingested into the RDA, independent of format, will be maintained according to the basic curation level detailed in Description of RDA Dataset Collection Curation Levels. If enhanced or data-level curation is needed, the dataset submitter will work with DECS staff to determine and define those requirements prior to dataset creation and ingest.

Dataset Withdrawal Policy:

  • Non-observing-system-based datasets (e.g., model output datasets typically larger than 10 TB) may be purged from the RDA after 5 years if:
    • Community usage metrics don't support continued maintenance of the dataset.
    • The dataset has been superseded by an updated version and it is determined that the copy of the old version is no longer of significant research value.
  • The RDA will attempt to contact the original submitter to provide advance notice in the event that a decision has been made to purge a dataset. This will allow the submitter to find an alternative repository or pay for continued archival in the RDA if needed.
  • After a dataset has been purged, a deprecated version of the dataset landing page will remain visible, presenting the dataset title, dataset status information, and contact information for assistance. Additionally, the dataset DOI information will be updated through DataCite so that the DOI will resolve to the "deprecated" dataset landing page if needed.
  • Observing-system-based datasets (e.g., irreplaceable measurements) will be preserved indefinitely if the RDA is the repository of record.

Cost Recovery:

  • There are no costs associated with using the RDA Dataset Submission system.  Potential dataset submitters are encouraged to submit requests if the RDA is viewed as an ideal host for their data.
  • The RDA has resource limits and priority obligations for data support at NCAR.  When possible and appropriate, the RDA can extend its data archiving and access services more broadly.  The potential and the need for data management cost recovery from the dataset submitters/providers are determined early in the Dataset Appraisal and Selection workflow.  No cost and minimal cost are the two most likely scenarios for qualified community reference datasets.  The actual cost structure for each dataset will be determined on a case-by-case basis.
  • Note: NCAR and UCP researchers who need data management repository services to support open access requirements for datasets will be asked to pay for cost recovery according to the current NCAR Data Management Services Rates. Labs may use their no-cost storage space allocations (e.g. /glade/p/…) in place of paying for data management storage services, at the lab’s discretion.

Frequently Asked Questions:

  • Why do I need to create an account?
    • Creating an account gives the dataset submitter access to the RDA's Dataset Submission system.  The Dataset Submission system provides customized RDA data services and support, including the My Submissions dashboard.  The dashboard serves as a centralized location for organizing the dataset submission process, which should make the process more streamlined for the dataset submitter.
  • Why am I asked to provide dataset information for the appraisal and selection process?
    • The fields requested by the Dataset Submission Form provide essential information that helps the RDA DECS team members understand the nature of the dataset.  By providing the relevant dataset information through the Dataset Submission Form, the dataset submitter helps enable the following:
      • DECS team members' evaluation of the dataset's compatibility with the RDA's dataset collection and the dataset's potential contribution to the RDA's overall mission statement.
      • A standardized information input format.
      • A documented, consistent, and fair system for assessing new datasets.
      • Sharing of project-specific knowledge by dataset submitters to improve the metadata and provenance records for a dataset.