What is the patient_uid?

Constructing a statistical linkage key for validating patients across the globe.

As with other practice management platforms, the patient’s name, date of birth (DOB) and sex are entered into SCARD; however, this is only for your own use in identifying the patient. Under no circumstances is the identifiable data shared with any party, and because the data is encrypted per doctor, no other person can access the identifiable data.

When results are contributed to the de-identified pool, the patient’s name and doctor’s reference are replaced with the patient_uid.

The patient_uid works well for validating the vast majority of individuals across an anonymous data collection in realistic scenarios. Ignoring seasonal variation in birth rates, there is less than a 0.3% chance (roughly 1 in 365) of sharing a birthday with any given individual, while the chance of sharing a name is far more variable and culturally dependent.

Constructing the statistical linkage key

There are any number of ways this can be done, but the method we use is a concatenation of four demographic variables:
– The 2nd, 3rd and 5th letters of the individual’s surname;
– The 2nd and 3rd letters of their first name;
– Their entire birth date without punctuation; and
– The person’s gender encoded as either a 1 (male) or 2 (female).
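
As a rough illustration of the four components above, here is a minimal Python sketch. It is not SCARD's actual code: the function name `build_patient_uid`, the upper-casing of letters, the order in which the parts are joined, and the DDMMYYYY date layout are all assumptions, since the post only says the birth date is written "without punctuation". It also assumes both names are long enough to supply every requested letter; short or missing names are handled by the rules under "Additional considerations".

```python
from datetime import date


def build_patient_uid(surname: str, first_name: str,
                      birth_date: date, sex_code: int) -> str:
    """Concatenate the four demographic components into one key.

    Hypothetical helper: assumes names are already cleaned and long enough.
    """
    # 2nd, 3rd and 5th letters of the surname (1-based) are indices 1, 2, 4.
    surname_part = surname[1] + surname[2] + surname[4]
    # 2nd and 3rd letters of the first name are indices 1 and 2.
    first_part = first_name[1] + first_name[2]
    # Assumption: birth date written as DDMMYYYY with no separators.
    dob_part = birth_date.strftime("%d%m%Y")
    # Sex encoded as 1 (male) or 2 (female), as described in the list above.
    return (surname_part + first_part).upper() + dob_part + str(sex_code)


# "Johnson" -> o, h, s; "Michael" -> i, c; 7 March 1980; male
print(build_patient_uid("Johnson", "Michael", date(1980, 3, 7), 1))
# prints OHSIC070319801
```

The resulting key contains only fragments of the name, so it cannot be read back as the patient's name, yet the same person entered by two different doctors should produce the same string.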

Additional considerations

Obviously given the great variety of names available, particularly from an international perspective, you’ll need some fairly robust rules to govern how people record the information. Here are some of the rules we use:
– Do not include apostrophes, hyphens, inflections, dashes or spaces.

– If the first name or surname of the person is not long enough to supply the requested letters (e.g. a surname of fewer than five letters), then the number ‘2’ should be substituted for each missing letter. Each ‘2’ should always occupy the same position that the missing letter would have had within the field.

– If the first name or surname of the person is completely absent, it should be replaced by a string of digits of value ‘9’ to indicate ‘not stated’. The use of ‘not stated’ for this data item should be strongly discouraged but such cases can be easily identified and excluded from any analysis as the alphabetical characters have been replaced with numeric ones.

– Often people use a variety of names, including legal names, married/maiden names, nicknames, assumed names, traditional names etc. Even small differences in recording, such as the difference between MacDonald and McDonald, can make record linkage impossible. To minimise discrepancies in the recording and reporting of name information, recorders should ask for a person’s full ‘surname’. I imagine that this shouldn’t be a major issue in Australia given the near-ubiquitous use of Medicare data.

– In some cultures it is traditional to state the surname first. To overcome discrepancies in recording/reporting that may arise as a result of this practice, recorders should always ask the person to specify their given name and their surname separately.
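
The character-level rules above (stripping punctuation, padding short names with ‘2’, and marking absent names with ‘9’) can be sketched in Python as follows. This is only an illustration under stated assumptions: `clean_name` and `pick_letters` are hypothetical helper names, "inflections" is interpreted here as accents and other diacritics, and letters are upper-cased for consistency.

```python
import unicodedata


def clean_name(name: str) -> str:
    """Drop apostrophes, hyphens, spaces and accents, keeping letters only."""
    # NFKD splits an accented letter into a base letter plus a combining
    # mark; the combining mark is not alphabetic, so the filter removes it.
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if ch.isalpha()).upper()


def pick_letters(name: str, positions: tuple) -> str:
    """Return the letters at the given 1-based positions of a name.

    A completely absent name becomes all '9's ('not stated'); a position
    beyond the end of a short name becomes '2' in that same slot.
    """
    cleaned = clean_name(name)
    if not cleaned:
        return "9" * len(positions)
    return "".join(
        cleaned[p - 1] if p <= len(cleaned) else "2" for p in positions
    )


print(pick_letters("O'Brien", (2, 3, 5)))  # prints BRE
print(pick_letters("Ng", (2, 3, 5)))       # prints G22
print(pick_letters("", (2, 3)))            # prints 99
```

Because ‘2’ and ‘9’ are digits, any key whose name portion contains them can be flagged with a simple check for numeric characters, which is how ‘not stated’ records can be excluded from analysis.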

