Information Management: Data Storage Policies

Data Storage Policies 

CWT IM data storage policies are designed to balance analytical, accessibility, security, and archival considerations for each resource.  Storage specifications for metadata, digital data, and printed data archives are as follows:

  • Metadata

    The metadata describe the physical and logical structure of a data set, as well as the hypotheses, methodology, and researchers responsible for its creation. The primary repository for CWT metadata is the Metabase, a relational database developed using Microsoft SQL Server® 7.0. Our implementation of the Metabase is based on the system originally developed by the GCE LTER and has been built for Coweeta in collaboration with GCE's Information Manager, Wade Sheldon. The Metabase is secured using both network and database security layers, and is accessed primarily through web applications available on the Coweeta LTER Public Web Site and through applications that connect to the database via ODBC. Write access is limited to IM staff who hold ODBC accounts and log in from machines within a single subnet of the University of Georgia network.

    Database files are synchronized between the GCE data management workstation and server, and are regularly backed up to our offsite backup node.

  • Digital Data

    The primary repository for digital data (e.g. submissions, processed data, and archived data) is the CWT IM File Server located at 151 Baldwin Hall, Department of Anthropology, University of Georgia. This server is maintained in a locked laboratory and consists of a RAID 5 array with a hot spare for system failures. Data files are protected by several layers of computer security (using Windows 2000 NTFS access control and TCP/IP firewall software), and are backed up with Bacula to another RAID 5 storage system located in the Curation Vault of the University of Georgia Laboratory of Archaeology. Incremental backups occur daily and a full backup of the system is conducted monthly. We maintain two months' worth of monthly and daily backups. Data loss extending beyond a single day would result only if three drives failed simultaneously on both our file and backup servers. On-site backups for LTER personnel working at the Coweeta Hydrologic Laboratory are currently the responsibility of individual personnel and are managed using external drives and simple backup software, such as Foldermatch. A centralized RAID 1 backup system is planned for the second quarter of 2011.

    As data files become candidates for online access, copies will be transferred to the CWT project server, which is co-located with the file server in the UGA Anthropology Department. Access to these files will be controlled as appropriate by network file security protocols and web-based data access programs available through the Coweeta LTER Public Web Site. CWT IM supports a wide variety of data formats.

  • Printed Data

    At the present time, the CWT IM does not support the curation of printed materials.

  • A NOTE ON DATA GOVERNED BY HUMAN SUBJECTS RULES

    The University of Georgia's Institutional Review Board (IRB) requires that researchers working with human subjects maintain the confidentiality of their informants throughout the research process. Coweeta's Human Subjects Policy recognizes that the standard for confidentiality includes specific archiving and identification procedures. These procedures include the assignment of random identification codes for all research subjects, the use of such identification codes in lieu of personal identifiers on all research materials, and the archiving of documents linking random identifiers to personal identifiers in locations separate from research materials. While it is the responsibility of researchers to ensure their compliance with human subjects research guidelines, Coweeta IM facilitates compliance in the following ways:

    1. Providing separate archive spaces for tables linking personal and randomly assigned identifiers and assigning file permissions for these spaces only to the investigators involved in the research.
    2. Maintaining confidential data on servers that are separate from the public Coweeta LTER data archive.
    3. Classifying all non-confidential data from Human Subjects-related research as Type IV in perpetuity, in order to provide the researcher with the means to ensure that data are used in a manner that is consistent with the consent for data use given by the researcher's informants.
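
The identification procedure described in the note above (random codes in place of personal identifiers, with the linking table kept apart from the research materials) can be illustrated with a minimal Python sketch. This is not Coweeta's actual tooling: the function names `assign_codes` and `deidentify`, the `name` and `subject_code` fields, and the sample records are all hypothetical and chosen only for illustration.

```python
import secrets


def assign_codes(personal_ids, code_bytes=4):
    """Map each personal identifier to a unique random hex code.

    Hypothetical helper: the returned dict is the linking table, which
    would be archived separately from the research materials.
    """
    linking = {}
    used = set()
    for pid in personal_ids:
        if pid in linking:
            continue  # one code per subject, even if listed twice
        code = secrets.token_hex(code_bytes)
        while code in used:  # redraw on the rare collision
            code = secrets.token_hex(code_bytes)
        used.add(code)
        linking[pid] = code
    return linking


def deidentify(records, linking, id_field="name"):
    """Return copies of records with the personal identifier replaced by its code."""
    cleaned = []
    for rec in records:
        row = {k: v for k, v in rec.items() if k != id_field}
        row["subject_code"] = linking[rec[id_field]]
        cleaned.append(row)
    return cleaned


# Illustrative data only; not drawn from any real study.
interviews = [
    {"name": "Informant A", "response": 3},
    {"name": "Informant B", "response": 5},
]
linking_table = assign_codes(rec["name"] for rec in interviews)
research_copy = deidentify(interviews, linking_table)
```

In this sketch, `research_copy` carries only the random codes and could be stored with the research materials, while `linking_table` would be written to a separate archive space whose file permissions are restricted to the investigators involved.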