I am working on a large set of electronic medical record (EMR) data that I am de-identifying according to the Safe Harbor guidelines. My colleagues and I are trying to figure out how to handle dates/time points. Our study needs to evaluate change/progress over time, particularly relative to each patient's own baseline and to when programmatic changes were made to their treatment (this is literally the main objective of the study).
In my prior work with non-EMR data, I always computed "time since [date]". For instance, if a participant provided data on 01/01/2022, 01/21/2022, and then again on 03/29/2022, I'd code the time points as "0" days (i.e., baseline), "20" days (time elapsed between 01/01/2022 and 01/21/2022), and "87" days (time elapsed between 01/01/2022 and 03/29/2022). However, it appears that Safe Harbor guidelines *may* not permit this:
(C) All elements of dates (except year) for dates that are directly related to an individual, including birth date, admission date, discharge date, death date, and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older
What are examples of dates that are not permitted according to the Safe Harbor Method?
Elements of dates that are not permitted for disclosure include the day, month, and any other information that is more specific than the year of an event. For instance, the date “January 1, 2009” could not be reported at this level of detail. However, it could be reported in a de-identified data set as “2009”.
Many records contain dates of service or other events that imply age. Ages that are explicitly stated, or implied, as over 89 years old must be recoded as 90 or above. For example, if the patient’s year of birth is 1910 and the year of healthcare service is reported as 2010, then in the de-identified data set the year of birth should be reported as “on or before 1920.” Otherwise, a recipient of the data set would learn that the age of the patient is approximately 100.
Can dates associated with test measures for a patient be reported in accordance with Safe Harbor?
No. Dates associated with test measures, such as those derived from a laboratory report, are directly related to a specific individual and relate to the provision of health care. Such dates are protected health information. As a result, no element of a date (except as described in 3.3. above) may be reported to adhere to Safe Harbor.
May parts or derivatives of any of the listed identifiers be disclosed consistent with the Safe Harbor Method?
No. For example, a data set that contained patient initials, or the last four digits of a Social Security number, would not meet the requirement of the Safe Harbor method for de-identification.
Clearly, just keeping the dates in the file is not an option. What I'm not clear on is whether a method such as the one I outlined above would count as a "derivative" of those dates.
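To make the "time since [date]" coding concrete, here is a minimal Python sketch of the computation I have in mind. The function name `days_since_baseline` and the idea of passing one patient's dates as a list are only illustrative assumptions, not part of any actual pipeline. Plain date subtraction gives offsets of 0, 20, and 87 days for the three example dates. The sketch only shows the arithmetic; it does not settle whether such offsets are a "derivative" under Safe Harbor.

```python
from datetime import date


def days_since_baseline(visit_dates):
    """Return days elapsed from the earliest date to each date, in input order.

    Hypothetical helper for illustration: the baseline is assumed to be
    the patient's earliest recorded date.
    """
    baseline = min(visit_dates)
    return [(d - baseline).days for d in visit_dates]


# The three example dates from the post, for a single participant.
visits = [date(2022, 1, 1), date(2022, 1, 21), date(2022, 3, 29)]
offsets = days_since_baseline(visits)
print(offsets)  # [0, 20, 87]
```

The output file would then carry only the offsets, with the original date columns dropped.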
So, my question is: Does anybody have experience with de-identifying protected health information while retaining information about when patients provided their data, whether in a date format or in a "time since [date of choice]" format?
I have contacted the Office for Civil Rights, but I either reached someone's voicemail or pre-recorded messages, or talked to someone who had no clue what I was even asking about.