NIH Data Management
Ensuring Compliance with New Federal Standards
The USC Digital Repository (USCDR) offers a full suite of data management services to help researchers meet new requirements for data preservation, access, and security set by the National Institutes of Health (NIH), the National Science Foundation (NSF), and other federal agencies. The USCDR is collaborating with the USC Office of Culture, Ethics, and Compliance, the USC Office of the Chief Information Security Officer (OCISO), and USC Research and Innovation to help researchers fulfill these requirements, securely share data, and protect human subjects data and other sensitive information.
As outlined by the National Institutes of Health, the critical considerations for choosing an appropriate repository and data management strategy are:
Unique Persistent Identifiers
Unique persistent identifiers (PIDs) allow datasets to be found and cited by other researchers, which helps ensure their long-term value and relevance after studies are completed. The USCDR partners with DataCite to assign a unique digital object identifier (DOI) to research datasets published for discovery and access in the USC Digital Library.
Long-Term Sustainability
Research datasets created with NIH, NSF, and other federal funding must be preserved for as long as they have value to the research community and the public. The USCDR ensures the long-term survival and integrity of research data in its dedicated digital preservation infrastructure on the USC campus with multiple mirrored locations for file redundancy to provide disaster recovery. USCDR offers 1-year, 6-year, or 20-year retention periods in consultation with researchers.
Once data files are ingested into the USCDR’s preservation systems, they are assigned SHA-1 values that are verified through automated fixity checks at 6-month intervals. If the USCDR’s quality-assurance systems detect bit-variance in digital files, they automatically restore the original files from backup copies to ensure no information is lost. Data files are automatically migrated to new storage media every three years or whenever errors are detected on individual pieces of storage media. USCDR’s long-term plans include migrating digital files to new preservation technologies as they become the industry standard.
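As a rough illustration of what a fixity check involves, the Python sketch below recomputes a file's SHA-1 value, compares it with the value recorded at ingest, and restores the file from a backup copy if the two differ. The function names, file layout, and return values are assumptions made for this example; they do not describe the USCDR's actual systems.

```python
import hashlib
import shutil
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(path: Path, recorded_digest: str, backup: Path) -> str:
    """Hypothetical fixity check against a digest recorded at ingest.

    Returns "ok" if the file still matches the recorded digest,
    "restored" if it did not match and was replaced from a backup
    that does match, and "failed" if neither copy matches.
    """
    if sha1_of_file(path) == recorded_digest:
        return "ok"
    if sha1_of_file(backup) == recorded_digest:
        shutil.copyfile(backup, path)  # overwrite the damaged copy
        return "restored"
    return "failed"
```

In this sketch the backup is verified before it is copied, so a damaged backup is never used to overwrite the primary file.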
Metadata
Metadata optimizes discovery, reuse, and citation of datasets, particularly within the research communities where they are most relevant. USCDR metadata librarians provide expertise in the selection of appropriate metadata schema, standards, and controlled vocabularies such as Medical Subject Headings (MeSH) for biomedical research publications.
Data Curation and Quality Assurance
Data curation ensures the accuracy and quality of datasets and makes them discoverable by other researchers. USCDR metadata librarians protect the integrity of the metadata for all research datasets made accessible via the USC Digital Library. They also provide expert advice to help researchers optimize data curation strategies and metadata for the research communities and public audiences that benefit from their studies. USCDR preservation and access systems log all changes to preserve a complete record of activity and ensure the long-term integrity of digital files and metadata for research datasets.
Free and Easy Access
NIH and other federal agency data management requirements ensure that all federally supported research datasets are made broadly accessible to the research community as well as the public. USCDR provides broad, equitable, and open online access to datasets and their metadata through the USC Digital Library, which contributes digital collections to Calisphere, the Digital Public Library of America, and other online resources that draw millions of page views every year. Users can search for and discover published datasets using the Primo search interface on the USC Libraries’ homepage, the USC Digital Library’s search interface, and Google and other search engines.
USCDR will consult with researchers to precisely configure access to fulfill requirements related to confidentiality, human subjects research, and intellectual property rights. All metadata records are publicly accessible online and digital files from research datasets can be made publicly viewable or downloadable via the USC Digital Library. Files can also be made available only to users with log-in credentials or after approval by research teams’ principal investigators or other designated administrators.
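The access options described above can be pictured as a small decision rule. The Python sketch below is only an illustration under stated assumptions: the class names, access levels, and functions are hypothetical and are not taken from the USCDR's systems. It models metadata that is always public and files that are public, limited to logged-in users, or limited to users approved by a principal investigator or other designated administrator.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class FileAccess(Enum):
    """Hypothetical access levels for a dataset's digital files."""
    PUBLIC = "public"
    LOGIN_REQUIRED = "login_required"
    APPROVAL_REQUIRED = "approval_required"


@dataclass
class Dataset:
    title: str
    file_access: FileAccess
    # User IDs approved by the PI or another designated administrator.
    approved_users: Set[str] = field(default_factory=set)


def can_view_metadata(dataset: Dataset, user_id: Optional[str] = None) -> bool:
    """Metadata records are publicly accessible in every case."""
    return True


def can_download_files(dataset: Dataset, user_id: Optional[str] = None) -> bool:
    """Decide file access from the dataset's level and the user's status."""
    if dataset.file_access is FileAccess.PUBLIC:
        return True
    if user_id is None:  # anonymous visitor
        return False
    if dataset.file_access is FileAccess.LOGIN_REQUIRED:
        return True
    return user_id in dataset.approved_users
```

For example, a dataset marked APPROVAL_REQUIRED would still show its metadata record to any visitor, while its files would be available only to the user IDs listed in approved_users.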
Broad and Measured Reuse
Along with broad access, NIH and other federal agency data management requirements ensure that all federally supported research datasets can be readily analyzed, cited, and reused by the research community. The USCDR access infrastructure enables the broad reuse of research datasets, and our metadata librarians provide expertise to help researchers document and share rights, software, research methods, and other critical information to maximize the value of datasets for other researchers.
Clear Use Guidance
In support of compliance with NIH and other federal agency data management standards, USCDR metadata librarians advise researchers on providing transparent and accessible documentation for rights, use guidelines, and procedures for working with datasets. USCDR can assist with the organization of data collections for access in the USC Digital Library, including selecting appropriate rights statements, documenting use guidelines in metadata records and landing pages, and providing other resources to aid with the analysis, citation, and reuse of data from NIH- and other federally funded research studies.
Security and Integrity
In keeping with NIH and other federal agency requirements, the USCDR maintains secure systems and stringent policies to protect human subjects data and other sensitive information. All USCDR systems used to preserve, manage, and provide access to research datasets benefit from a suite of security measures, including multifactor authentication and regular vulnerability testing by USC’s Information Technology Services (ITS) unit and outside security consultants. The USCDR works closely with ITS to leverage our university’s robust and diversified security resources for the protection of research data entrusted to our preservation and access systems. All software and technology vendors that contract with the USCDR are required to meet exacting industry standards for digital security and support HIPAA, PCI-DSS, and other regulatory compliance.
Access controls and user rights and permissions can be precisely configured in the USCDR’s preservation and access systems. For datasets that include sensitive information, the metadata can be made publicly accessible online for easy discovery in keeping with NIH requirements, while access to the digital files can be limited to authorized users. The USCDR consults with researchers to identify appropriate security policies and configure the systems to match the sensitivity of information included in research datasets. USCDR preserves the integrity of original digital data files by preventing unauthorized modifications and tracking all user activity.
Confidentiality
In keeping with NIH requirements, regulatory standards, and best practices, the USCDR’s security measures protect the confidentiality of sensitive data. The USCDR has documented capabilities and a full suite of administrative, technical, and physical safeguards to ensure compliance with applicable confidentiality, risk management, and continuous monitoring requirements for the protection of sensitive data. All access to data in USCDR systems is logged by user ID and IP address, and all changes are logged by the system. The preservation systems protect the integrity of original data files. The access and preservation systems can be configured to support HIPAA and other regulatory compliance.
Common Format
NIH, NSF, and other federal agency guidelines require researchers to employ standard file formats and metadata schema that are supported and in common use within their research communities. Expert personnel at the USCDR assist researchers with identifying the correct digital file types, transcoding files to correct types and specifications, and applying appropriate, consistent, and commonly used schema for metadata records. To ensure the broad relevance of their research datasets, researchers are advised to avoid proprietary formats that create unnecessary barriers to access and reuse.
Provenance
Recording full information about the provenance and history of research datasets significantly enhances their value to the research community in keeping with the goals of NIH, NSF, and other federal agencies. Providing complete, transparent information about dataset provenance allows data to be evaluated by other researchers, cited, and reused. The USCDR has systems and expert personnel in place to help researchers document the origin, chain of custody, and any modifications to research datasets and metadata. USCDR systems log all changes made to metadata records.
Data Retention Policy
Data management requirements at NIH and other federal agencies are intended to preserve research datasets for as long as they have value to the research community and the public. Expert personnel at the USCDR can advise on appropriate retention periods to fulfill NIH, NSF, and other federal agency requirements.
Client Support
The USCDR team offers a wealth of expertise in data management, preservation, and access services to help researchers fulfill the requirements of NIH, NSF, and other federal agencies. The USCDR team is highly responsive and maintains dedicated resources for client support.
Get Started
To consult with our USCDR team about your data management needs, please contact uscdr@usc.edu.