Skip to contents

The find_presence function is used to find the presence or absence of a specific feature around a defined index date of a patient's record. It can be used to find presence/absence, number of occurrences, date closest to the index date, or date closest to an index date in days.

Usage

find_presence(
  database,
  tables,
  code_list,
  index_date,
  func,
  index_days_start,
  index_days_end,
  allow_subset = FALSE
)

Arguments

database

database name to use

tables

names of tables to use in a dataset that is a list c("diagnosis"), c("diagnosis","procedure")

  • supported tables: encounter, diagnosis, procedure, medication_ingredient, medication_drug, lab_result, vitals_signs

code_list

user defined table with 3 mandatory columns:

  • mandatory columns

    • feature - the feature the exact code will roll up to and name of the matrix column in the output (letters, numbers, underscores only, no spaces, not case sensitive)

    • code - exact code

    • code_system - RxNorm, LOINC, etc

      • if a user is using the encounter table, code_system must equal "Encounter Type" and code maps to the "type" field in the encounter table

      • if a user enters the value month_year_death in code column and patient in code_system column, the function will use the value in the patient.month_year_death column for all calculations

  • optional columns

    • Supported columns are:

      • qualifier_num - looks across lab_result and vitals_signs num value fields at the same time

      • qualifier_text - looks across lab_result and vitals_signs text value fields at the same time

    • if a user only passes in lab_result or vitals_signs to find_presence, the function checks only that table

    • users can create as many additional columns as they want, but column names must be unique and match the supported column names

    • syntax for qualifying lab numeric values:

      • '<=X': less than or equal to X

      • '<X': less than X

      • '>=X': greater than or equal to X

      • '>X': greater than X

      • '~=X': not equal to X

      • 'X:Y': between X and Y

    • syntax for qualifying lab categorical values

      • the user can enter any string they want - exact match

        • if a user wants to use multiple values for a categorical lab, repeat the row with the same code but different qualifier value

    • if cell is left blank, system skips and assumes no qualification for that code

    • one qualification of a code does not apply to the entire feature; in the case there is more than one code mapped to the feature - every code must be qualified

index_date

user defined table with two columns

  • patient_id

  • index_date

func
  • date - outputs the closest date (as a date type) to the index date of the present code in the feature (NULL if no presence)

    • in the case of a tie, uses before the index date

  • boolean - outputs 1 for presence and 0 for absence

  • relative - outputs an integer value of how many days before or after the index date the closest presence code appeared, negative if before index, positive if after, NULL if absent

    • in case of a tie, uses the before the index date

  • count - number of unique dates a feature occurred in time window defined by the user in relation to the index event

    • start_date is used for encounter

index_days_start

start period relative to index date in days

  • None means anytime

  • negative means before the index date

  • positive means after

index_days_end

end period relative to index date in days

  • None means anytime

  • negative means before the index date

  • positive means after

allow_subset

(optional argument) - checks whether the number of patients in the index event input table is the same as the dataset cohort number - would default to false, but if set to true would warn and continue

  • if TRUE and there are fewer patients in the index_date table, only returns the patient_ids present in index_date_df

Value

  • table with patient_id as the first column and each unique feature value in code list input as subsequent columns

  • each feature finds the function value the user chooses and populates the cells of the table

Details

code list without qualifiers:

|     feature     |   code   | code_system |
|-----------------|----------|-------------|
| lung_transplant | T86.81   | ICD-10-CM   |
| lung_transplant | Z48.24   | ICD-10-CM   |
| lung_transplant | Z94.2    | ICD-10-CM   |
| lung_transplant | T86.81   | ICD-10-CM   |
| lung_transplant | 1006036  | CPT         |
| lung_transplant | 88039007 | CPT         |

code list with qualifiers:

|      feature       |  code   | code_system | qualifier_num |
|--------------------|---------|-------------|---------------|
| blood_pressure_sys | 8460-8  | LOINC       | 100:160       |
| blood_pressure_sys | 8461-6  | LOINC       | 100:160       |
| blood_pressure_sys | 8459-0  | LOINC       | 100:160       |
| blood_pressure_sys | 8450-9  | LOINC       | 100:160       |
| blood_pressure_sys | 87741-5 | LOINC       | 100:160       |
| blood_pressure_sys | 8480-6  | LOINC       | 100:160       |


present_absent_table = find_presence(database='covid_db', tables=c('procedure','diagnosis'), code_list=code_list, index_date=sg_index_date, func='boolean', index_days_start=-365, index_days_end=-1)
head(present_absent_table)

| patient_id | lung_transplant |
|------------|-----------------|
|          1 |               0 |
|          2 |               0 |
|          3 |               0 |
|          4 |               1 |
|          5 |               1 |