
trinetx.find_date API documentation

Function find_date

def find_date(database: str, tables: list, code_list: sql.dataframe.DataFrame, function: str, begin: str = None, end: str = None, patient_table: sql.dataframe.DataFrame = None) -> sql.dataframe.DataFrame

Description


The find_date function finds the date on which a specific feature occurs in a patient's record, typically to help create an index date. It can return the first, last, or a random date of occurrence of that feature.

Inputs


  • database - database name to use

  • tables - list of table names to use from the dataset, e.g. ["diagnosis"] or ["diagnosis", "procedure"]

    • supported tables: encounter, diagnosis, procedure, medication_ingredient, medication_drug, lab_result, vitals_signs
  • code_list - user defined table with 3 mandatory columns:

    • mandatory columns

      • feature - the feature the exact code will roll up to and name of the matrix column in the output (letters, numbers, underscores only, no spaces, not case sensitive)

      • code - exact code

      • code_system - RxNorm, LOINC, etc

        • if a user is using the encounter table, code_system must equal "Encounter Type" and code maps to the "type" field in the encounter table
    • optional columns

      • Supported columns are:

        • qualifier_num - looks across lab_result and vitals_signs num value fields at the same time

        • qualifier_text - looks across lab_result and vitals_signs text value fields at the same time

      • if a user only passes in lab_result or vitals_signs to find_date, the function checks only that table

      • users can create as many additional columns as they want, but column names must be unique and match the supported column names

      • syntax for qualifying lab numeric values:

        • '<=X': less than or equal to X

        • '<X': less than X

        • '>=X': greater than or equal to X

        • '>X': greater than X

        • '~=X': not equal to X

        • 'X:Y': between X and Y

      • syntax for qualifying lab categorical values

        • the user can enter any string; it is compared by exact match

          • to use multiple values for a categorical lab, repeat the row with the same code but a different qualifier value
      • if a cell is left blank, the system skips it and assumes no qualification for that code

      • a qualification on one code does not apply to the entire feature; when more than one code is mapped to a feature, every code must be qualified

  • function - first, last, random

    • first - each feature finds the first occurrence of any of the codes in the feature, as a date

      • works for labs, returns the date of the lab
    • last - each feature finds the last occurrence of any of the codes in the feature, as a date

      • works for labs, returns the date of the lab
    • random - uses a Fortran seed approach to pull a random date from all occurrences of the feature within the relative time from index

    • the result is Null if none of a feature's codes are present for a patient

  • begin (optional argument) - earliest date to look in a patient's record (YYYY-MM-DD)

  • end (optional argument) - latest date to look in a patient's record (YYYY-MM-DD)

  • patient_table (optional argument)

    • a dataframe with a single column called patient_id

    • if this argument is present, it is used instead of the dataset's patient table
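The numeric qualifier syntax listed under code_list above can be read as a small set of predicates on a lab or vitals value. The sketch below is one possible interpretation in plain Python, not the library's implementation: parse_num_qualifier is a hypothetical helper, and treating 'X:Y' as inclusive of both endpoints is an assumption, since the documentation only says "between".

```python
import operator
from typing import Callable, Optional

# Comparison prefixes from the qualifier_num syntax. Two-character prefixes
# come first so '<=5' is not misread as '<' followed by '=5'.
_PREFIXES = [
    ("<=", operator.le),
    (">=", operator.ge),
    ("~=", operator.ne),
    ("<", operator.lt),
    (">", operator.gt),
]


def parse_num_qualifier(spec: Optional[str]) -> Callable[[float], bool]:
    """Turn a qualifier_num string into a predicate on a numeric value."""
    if spec is None or spec.strip() == "":
        # Blank cell: no qualification, so every value passes.
        return lambda value: True
    spec = spec.strip()
    for prefix, compare in _PREFIXES:
        if spec.startswith(prefix):
            bound = float(spec[len(prefix):])
            return lambda value, c=compare, b=bound: c(value, b)
    if ":" in spec:
        low_text, high_text = spec.split(":", 1)
        low, high = float(low_text), float(high_text)
        # Assumption: 'X:Y' includes both endpoints.
        return lambda value: low <= value <= high
    raise ValueError(f"unrecognised qualifier: {spec!r}")


keep = parse_num_qualifier(">=6.5")
print(keep(7.0), keep(6.0))
```

Checking the two-character operators before the one-character ones keeps the parser a simple prefix match without needing a regular expression.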

Returns


A dataframe with a patient_id column and a column for each unique feature value in the code_list input.
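To make the output shape concrete, the following sketch mimics the 'first' and 'last' roll-up in plain Python: one entry per patient, one value per feature, and None where no code for the feature was found. It is an illustration only. roll_up_dates and the Event tuple are hypothetical names, the events are made up, and the real function works on database tables rather than in-memory lists.

```python
from datetime import date
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical event record: (patient_id, feature, date of the matching code).
Event = Tuple[int, str, date]


def roll_up_dates(
    events: Iterable[Event],
    features: List[str],
    patient_ids: List[int],
    function: str = "first",
) -> Dict[int, Dict[str, Optional[date]]]:
    """Return {patient_id: {feature: date or None}} for 'first' or 'last'."""
    if function not in ("first", "last"):
        raise ValueError("this sketch only covers 'first' and 'last'")
    pick = min if function == "first" else max
    # Start every patient/feature cell as None, mirroring the Null result
    # for a feature with no matching codes.
    result = {pid: {name: None for name in features} for pid in patient_ids}
    for pid, feature, when in events:
        if pid not in result or feature not in result[pid]:
            continue  # ignore patients or features outside the request
        current = result[pid][feature]
        result[pid][feature] = when if current is None else pick(current, when)
    return result


# Made-up events for illustration only.
events = [
    (1, "lung_transplant", date(2020, 11, 22)),
    (1, "lung_transplant", date(2020, 11, 20)),
    (2, "lung_transplant", date(2020, 11, 21)),
]
first = roll_up_dates(events, ["lung_transplant"], [1, 2, 3], "first")
last = roll_up_dates(events, ["lung_transplant"], [1, 2, 3], "last")
```

Patient 3 has no events here, so its lung_transplant entry stays None in both results.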

Examples


first_code_table = find_date(database='covid_db', tables=['procedure','diagnosis'], code_list=code_list, function='first')

first_code_table.head()

| patient_id | lung_transplant |
|------------|-----------------|
|          1 | 11/20/20        |
|          2 | 11/21/20        |
|          3 | 11/22/20        |
|          4 | 11/23/20        |
|          5 | 11/24/20        |
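The example above assumes a code_list already exists. As a rough sketch, its rows could be drafted and sanity-checked in plain Python before being loaded into a dataframe. check_code_list_rows is a hypothetical helper, the code and code_system values are placeholders rather than real codes, and how the rows become a sql.dataframe.DataFrame is not covered by this page.

```python
import re
from typing import Dict, List

MANDATORY_COLUMNS = ("feature", "code", "code_system")
# Feature names: letters, numbers, and underscores only, no spaces.
FEATURE_NAME = re.compile(r"[A-Za-z0-9_]+")


def check_code_list_rows(rows: List[Dict[str, str]]) -> List[str]:
    """Return a list of problems found in draft code_list rows."""
    problems = []
    for i, row in enumerate(rows):
        missing = [col for col in MANDATORY_COLUMNS if not row.get(col)]
        if missing:
            problems.append(f"row {i}: missing {', '.join(missing)}")
        feature = row.get("feature") or ""
        if feature and not FEATURE_NAME.fullmatch(feature):
            problems.append(f"row {i}: invalid feature name {feature!r}")
    return problems


# Placeholder rows: two codes rolling up to one feature.
rows = [
    {"feature": "lung_transplant", "code": "PLACEHOLDER_1", "code_system": "PLACEHOLDER_SYSTEM"},
    {"feature": "lung_transplant", "code": "PLACEHOLDER_2", "code_system": "PLACEHOLDER_SYSTEM"},
]
problems = check_code_list_rows(rows)
print(problems or "rows look valid")

# Once the rows are loaded into a dataframe named code_list, a call limited
# to a date window might look like this (begin/end values are illustrative):
# find_date(database='covid_db', tables=['procedure', 'diagnosis'],
#           code_list=code_list, function='first',
#           begin='2020-01-01', end='2020-12-31')
```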