Submitting a Protocol for Existing Data

Submitting a Protocol for Secondary Data



Exempt Category 4 refers to “secondary research uses of identifiable private information or identifiable biospecimens” (45 CFR 46.104(d)(4)).

Exempt Category 4, also known as “Secondary Research,” traditionally covered research involving the collection or study of secondary data, documents, records, pathological specimens, or diagnostic specimens. This category was generally referred to as research using “previously collected data” or “secondary data.” However, with the implementation of the revised Common Rule on January 21, 2019, significant changes were made to Exempt Category 4 to better align with modern research practices:

  1. Expansion to Include Identifiable Data: Under the pre-2018 rule, Exempt Category 4 applied only to research involving secondary data that was either publicly available or recorded in such a manner that subjects could not be identified, directly or through identifiers linked to the subjects. The revised Common Rule has expanded this category to include the secondary research use of identifiable private information or identifiable biospecimens, provided that at least one of the specified conditions is met. These conditions include that the information or biospecimens are publicly available, that the investigator records the information so that subjects cannot readily be identified and does not contact or re-identify them, or that the use is regulated under HIPAA.
  2. Inclusion of Prospective Data Collection: The revised rule now allows for the secondary use of data or biospecimens that will be collected in the future, not just those that are already in existence at the time of the research proposal. This change means that a research study proposing to analyze samples or information that will be collected for clinical purposes in the future could qualify for this exemption if it meets at least one of the applicability provisions outlined in the revised rule.
  3. Prohibition on Re-Identification: If an investigator records information about individuals in a non-identifiable manner, the investigator must not attempt to re-identify or contact the research subjects. This ensures that the privacy of the subjects is maintained even if the data is initially recorded without identifiers.
  4. New Provisions for Broader Exemptions: The revised Common Rule introduces two additional provisions under Exempt Category 4:
    • HIPAA-Regulated Research: The exemption applies when the investigator's secondary use of the identifiable private information is regulated under HIPAA as "healthcare operations," "research," or "public health." It’s important to note that this provision applies only to the secondary use of identifiable private health information, not to biospecimens.
    • Federal Research: The exemption applies when the secondary research is conducted by or on behalf of a federal department or agency, uses data collected or generated by the government for non-research purposes, and the information is subject to federal privacy standards and other requirements specified in the exemption.

The following guide explains how to submit a protocol for secondary data for review by the Teachers College (TC) Institutional Review Board (IRB).

Submission Requirements

All submissions to TC IRB should include a Principal Investigator (PI) survey (submitted through TC Mentor IRB) and a completed IRB application (IRB Application Template). For detailed information on how to submit to TC IRB, please visit our How to Submit page. 

On the IRB Application, several questions ask about “new data collection” or “recruitment efforts.” For these questions, you may respond with “Not Applicable” when submitting an Exempt Category 4 protocol, because no new data will be collected as part of the study parameters.

Researchers should clearly indicate the identifiers that will not be included in the final dataset. 
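
As an illustration only, the following minimal Python sketch shows one way a researcher might document and remove identifiers before building a final dataset. The field names (name, phone, email, student_id) and the sample records are hypothetical and are not prescribed by TC IRB; substitute the identifiers that actually appear in your source data.

```python
# Hypothetical identifier fields (an assumption for illustration);
# replace with the identifiers present in the actual source data.
EXCLUDED_IDENTIFIERS = {"name", "phone", "email", "student_id"}


def strip_identifiers(records, excluded=EXCLUDED_IDENTIFIERS):
    """Return copies of the records without the excluded identifier fields."""
    return [
        {field: value for field, value in record.items() if field not in excluded}
        for record in records
    ]


def excluded_fields_present(records, excluded=EXCLUDED_IDENTIFIERS):
    """List, in sorted order, the excluded fields that appear in the source records."""
    present = set()
    for record in records:
        present.update(excluded.intersection(record))
    return sorted(present)


if __name__ == "__main__":
    # Made-up sample records, not real data.
    source = [
        {"name": "A. Example", "phone": "555-0100", "age": 34, "score": 88},
        {"name": "B. Example", "phone": "555-0101", "age": 29, "score": 92},
    ]
    print("Identifiers excluded from final dataset:", excluded_fields_present(source))
    print("Final dataset:", strip_identifiers(source))
```

The list returned by excluded_fields_present could be copied into the IRB application as the statement of identifiers that will not be included in the final dataset.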

Data Security

Data retrieved from an external institution or individual should be transferred via secure methods and stored in a safe location. When receiving datasets from external institutions, researchers are required to abide by guidelines and policies set in place by the data owners. Failure to do so can result in consequences from both the external institutions and TC IRB. Data protection measures should also be carefully considered. Depending on their specific research needs, researchers are advised to consider preparing supporting documents such as a Data Security Plan or a Data Sharing (or Use) Agreement.

In most cases, the following documents are not required for an Exempt Category 4 TC IRB protocol submission:

  • Consent, Assent, or Parent Permission 
  • Recruitment Materials
  • Study Instruments

What about Prospective Data?

Prospective vs. Retrospective: Prospective studies follow individuals over time, with data collected about them as their characteristics or circumstances change. Retrospective studies analyze data that already exist. The revised rule now allows for the secondary use of data or biospecimens that will be collected in the future, not just those that are already in existence at the time of the research proposal. This change reflects the evolving nature of research, where new data collection may be intertwined with secondary data analysis.

What about Publicly Available Data?

Publicly available data is defined as data that is accessible to the public without any restrictions or the need for permission or authorization. This can include information from public sources such as local telephone directories, publicly available websites, open-use datasets, and other similar resources. It is important to note that student records, which are protected under the Family Educational Rights and Privacy Act (FERPA), do not fall under this category and are not considered publicly available data.

Secondary data can help reduce new data collection burdens, and can also show how data evolves over time. Researchers can consider the following ways to explore secondary data:

  • Techniques to Acquire Secondary Data: Publicly available digital spaces provide ideal resources for accessing secondary data. These sources may include websites, newspapers, blogs, or social media. Secondary data may also include personal archives, notes, journal entries, or texts that were collected in the past.
  • Unexplored Data: Colleagues may possess unanalyzed data sources or data sets. This data may be considered a viable source for analysis. Researchers may consider ways to share data.
  • Possible Triangulation: Secondary data may offer pathways to compare and contrast various data sources to find patterns of difference and discuss their meaning. The researcher may synthesize information from different research sources or compare secondary data over time.

What Research is NOT Exempt Category 4 - Secondary Data?

  • New Data Collection: A researcher wants to understand how critical thinking techniques impact math learning. He will survey and interview participants to answer his research questions. This study involves new data collection and does not qualify for Exempt Category 4. 
  • Linked Identifiers & Potential Follow-Up: A researcher will obtain access to a research database for a secondary study and will record whether or not participants in the study received information about smoking cessation. The entries have linked identifiers (e.g., names, phone numbers, etc.) and will be coded so that she can go back to the research data at a later date to assess health outcomes. Aside from the code, she will only record age, smoking status, blood pressure, and whether participants received the cessation information. Because of the linked identifiers (and potential participant follow-up), this study may not qualify for Exempt Category 4.

Tips for completing an IRB Protocol involving Secondary Data

You can outline the following details in the IRB application or through an uploaded Data Security Plan and/or Data Sharing (or Use) Agreement (templates are available on our website and in Mentor IRB/Documentation).

  1. What is the data? The IRB will need to know if you are using data sets, video recordings, audio recordings, journal entries, photos, transcripts, survey responses, etc. If you are using data sets, the IRB will need to know what data fields you will use and how. Explain the data set. What is it? What content is included in the data set? Does it include education or health-related content? How was it originally collected (e.g., under the oversight of another IRB, not originally for research purposes)?
  2. How will you obtain access to the data? The IRB will need to know if the data are publicly available or if there are restrictions for accessing the data. Is there an affiliated data repository? Is the data privately held? Explain how you plan to access secondary data. What is your data transfer method? Are you engaged in a Data Sharing (or Use) Agreement? Please elaborate on the IRB application or upload supplemental documents with your IRB submission to explain how you will access the data (e.g., Data Sharing (or Use) Agreement).
  3. How many records will you access? Will the data be combined with other data sources? How easy is it to deduce the identities of the participants? The IRB needs to understand the complete picture of the data and the potential to deduce identity which could compromise confidentiality. What are the inclusion/exclusion criteria for the secondary data you are proposing to receive? Are you using every single data point from the previous study/data set? Are you only using certain data points? Does it include personally identifiable information? Is it de-identified? Please elaborate on the IRB application.
  4. Can the participants be linked to their data? The IRB will need to know in what form you will receive the data. Can the data be de-identified? Are the data linked and stripped of identifiers? Who prepared the data for you? Will you merge multiple data sets? What is your data security plan for obtaining, receiving, accessing, storing, sharing, and protecting that data? Will the data be made public (stored in a repository)? Is the data only shared in aggregate? How long will you keep the data?
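
To make the question of how easily identities could be deduced more concrete, the following minimal Python sketch counts how many records share each combination of indirect identifiers. Combinations shared by only a few records are the ones most likely to point to an individual. The field names, the sample records, and the threshold k are assumptions for illustration; they are not a TC IRB standard.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k=5):
    """Return combinations of quasi-identifier values shared by fewer than k records.

    The default threshold k=5 is an illustrative assumption, not an IRB requirement.
    """
    counts = Counter(
        tuple(record.get(field) for field in quasi_identifiers)
        for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


if __name__ == "__main__":
    # Made-up records with hypothetical fields "age" and "zip".
    records = [
        {"age": 30, "zip": "10027", "score": 71},
        {"age": 30, "zip": "10027", "score": 85},
        {"age": 30, "zip": "10027", "score": 90},
        {"age": 61, "zip": "10025", "score": 77},
    ]
    risky = small_groups(records, ["age", "zip"], k=3)
    for combo, n in risky.items():
        print(f"Only {n} record(s) share {combo}; consider coarsening or suppressing.")
```

A check like this can help a researcher describe, in the IRB application, whether fields that remain after de-identification could still single out participants when combined.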