Function
time_to_event
def time_to_event(database: str, tables: list, code_list: sql.dataframe.DataFrame, index_date: sql.dataframe.DataFrame, index_days_start: int, index_days_end: int, exclude_prior_outcome: bool = False, allow_subset: bool = False) ‑> sql.dataframe.DataFrame
-
Description
The time_to_event function quickly generates serial times and serial dates for all patients in a cohort. These calculations can then be used for survival analyses and Kaplan-Meier graphs.
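As a rough illustration of how the output can feed a Kaplan-Meier curve, the sketch below computes a product-limit estimate from (serial_time, status) pairs. The helper name kaplan_meier and the sample rows are hypothetical and are not part of this library; a dedicated survival analysis package would normally be used instead.

```python
from collections import Counter


def kaplan_meier(rows):
    """Product-limit survival estimate (hypothetical helper, not part of this library).

    rows: iterable of (serial_time, status) pairs, where status is
    1 for an observed event and 0 for a censored patient.
    Returns a list of (time, survival_probability), one entry per event time.
    """
    rows = list(rows)
    events = Counter(t for t, s in rows if s == 1)   # events per time point
    leaving = Counter(t for t, _ in rows)            # patients leaving the risk set per time point
    at_risk = len(rows)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = events.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after time t.
        at_risk -= leaving[t]
    return curve


# Made-up rows for illustration only.
curve = kaplan_meier([(5, 1), (8, 0), (12, 1), (20, 0)])
```

Censored patients (status 0) never lower the survival estimate directly; they only shrink the number at risk for later event times.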
Inputs
-
database - database name to use
-
tables - list of names of the dataset tables to use, e.g. ['diagnosis'] or ['diagnosis','procedure']
- supported tables: encounter, diagnosis, procedure, medication_ingredient, medication_drug, lab_result, vitals_signs
-
code_list - user defined table with 3 mandatory columns:
-
mandatory columns
-
feature - the feature the exact code will roll up to and name of the matrix column in the output (letters, numbers, underscores only, no spaces, not case sensitive)
-
code - exact code
-
code_system - RxNorm, LOINC, etc
-
if a user is using the encounter table, code_system must equal "Encounter Type" and code maps to the "type" field in the encounter table
-
if a user enters month_year_death in the code column and patient in the code_system column, the function uses the value in the patient.month_year_death column for all calculations
-
-
-
-
index_date - user defined table with two columns
-
patient_id
-
index_date
-
-
index_days_start - start period relative to index date in days
-
must be greater than or equal to 0
-
positive means after
-
-
index_days_end - end period relative to index date in days
- must be after index_days_start
-
optional arguments:
-
exclude_prior_outcome - defaults to False
-
if set to True, excludes any patients with any codes within the feature of the code_list input that are present before the start of the defined time window (index_days_start)
-
looks at one feature at a time
-
if a user has a code_list with multiple features, the function only looks for the prior outcome of the codes within a single feature to determine exclusion
-
a patient_id can appear within one feature and not another in the output if set to True
-
-
allow_subset - defaults to False; checks whether the number of patients in the index_date input table is the same as the number of patients in the dataset cohort; if set to True, the function warns and continues when the counts differ
-
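The per-feature behavior of exclude_prior_outcome described above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name excluded_pairs, the tuple layouts, and the sample codes are hypothetical, and "before the start of the time window" is taken to mean strictly earlier than index_date plus index_days_start.

```python
from datetime import date, timedelta


def excluded_pairs(code_list, facts, index_dates, index_days_start):
    """Return the (patient_id, feature) pairs to drop (hypothetical helper).

    code_list:   list of (feature, code, code_system) tuples
    facts:       list of (patient_id, code, code_system, fact_date) tuples
    index_dates: dict mapping patient_id to index date
    """
    # Map each (code, code_system) to the features it rolls up to.
    # Feature names are lowercased because they are not case sensitive.
    features_of = {}
    for feature, code, system in code_list:
        features_of.setdefault((code, system), set()).add(feature.lower())

    excluded = set()
    for patient_id, code, system, fact_date in facts:
        if patient_id not in index_dates:
            continue
        window_start = index_dates[patient_id] + timedelta(days=index_days_start)
        # Assumption: "before the start of the window" is a strict comparison.
        if fact_date < window_start:
            for feature in features_of.get((code, system), ()):
                excluded.add((patient_id, feature))
    return excluded


# Illustrative inputs only.
code_list = [
    ("type_1_diabetes", "E10.9", "ICD-10-CM"),
    ("type_2_diabetes", "E11.9", "ICD-10-CM"),
]
facts = [(1, "E10.9", "ICD-10-CM", date(2015, 1, 1))]
index_dates = {1: date(2015, 10, 29)}
dropped = excluded_pairs(code_list, facts, index_dates, index_days_start=1)
```

Here patient 1 is dropped from type_1_diabetes only, so the same patient_id can still appear under type_2_diabetes in the output.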
Returns
-
returns a 6 column dataframe with the following columns:
-
patient_id
-
feature
-
serial_time
-
if any codes within the feature of the code_list input are present in the time window, the row is populated with the difference in days between the index date and the first occurrence of a code within the feature
-
if no codes within the feature of the code_list input are present and the patient’s last fact falls within the time window, the row is populated with the difference in days between the index date and the last fact
-
last fact is the most recent fact of a patient across all tables in the dataset database
- populate the date that is one day after the last fact
-
if last fact is a death date in patient.month_year_death, use the last day of the month
-
if there are facts for the patient after their death date, use the death date (earliest of death date, end of study, last fact)
-
if no codes within the feature of the code_list input are present and the patient’s last fact falls after the time window, the row is populated with the difference in days between the index date and the last date of the time window
-
-
status
-
1 if any codes within the feature of the code_list input are present in the time window
-
0 if
-
no codes within the feature of the code_list input are present and the patient’s last fact falls within the time window
-
no codes within the feature of the code_list input are present and the patient’s last fact falls after the time window
-
-
-
serial_date
- the date (YYYY-MM-DD), other than the index date, used to calculate serial_time
-
index_date
-
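The serial_time and status rules listed above can be approximated for a single patient and feature as follows. This is a simplified sketch, not the library's implementation: the helper name serial_row is hypothetical, the window bounds are assumed to be inclusive, and the last-fact adjustments (one day after the last fact, death-date handling) are assumed to have been applied before last_fact is passed in.

```python
from datetime import date, timedelta


def serial_row(index_date, event_dates, last_fact, index_days_start, index_days_end):
    """Return (serial_time, status, serial_date) for one patient and one feature.

    Hypothetical helper. event_dates holds the dates of all codes in the
    feature; last_fact is the patient's (already adjusted) last fact date.
    """
    window_start = index_date + timedelta(days=index_days_start)
    window_end = index_date + timedelta(days=index_days_end)

    # Assumption: both window bounds are inclusive.
    in_window = [d for d in event_dates if window_start <= d <= window_end]
    if in_window:
        # Event observed: use the first occurrence inside the window.
        serial_date, status = min(in_window), 1
    elif last_fact <= window_end:
        # Censored at the last fact, which falls inside the window.
        serial_date, status = last_fact, 0
    else:
        # Censored at the last date of the window.
        serial_date, status = window_end, 0
    return (serial_date - index_date).days, status, serial_date


# No event, last fact after the window: censored at the window end.
censored = serial_row(date(2009, 7, 16), [], date(2016, 1, 1), 1, 1825)
# Two codes in the window: the first occurrence sets serial_time.
event = serial_row(date(2015, 10, 29),
                   [date(2015, 12, 1), date(2015, 11, 8)],
                   date(2016, 2, 29), 1, 1825)
```

The source does not state what happens when the last fact falls before the start of the window; this sketch simply treats that case like a last fact inside the window.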
Example
    outcomes = time_to_event(database=db,
                             tables=['diagnosis','procedure'],
                             code_list=code_list,
                             index_date=index_date,
                             index_days_start=1,
                             index_days_end=1825,
                             )
    display(outcomes)

| patient_id | feature         | serial_time | status | serial_date | index_date |
|------------|-----------------|-------------|--------|-------------|------------|
| 1          | type_1_diabetes | 123         | 0      | 2/29/16     | 10/29/15   |
| 1          | type_2_diabetes | 123         | 0      | 2/29/16     | 10/29/15   |
| 2          | type_1_diabetes | 1825        | 0      | 7/15/14     | 7/16/09    |
| 2          | type_2_diabetes | 1825        | 0      | 7/15/14     | 7/16/09    |
| 3          | type_1_diabetes | 1825        | 0      | 12/13/14    | 12/14/09   |
| 3          | type_2_diabetes | 1825        | 0      | 12/13/14    | 12/14/09   |
| 4          | type_1_diabetes | 1825        | 0      | 7/15/14     | 7/16/09    |
| 4          | type_2_diabetes | 1825        | 0      | 7/15/14     | 7/16/09    |
| 5          | type_1_diabetes | 386         | 0      | 9/6/14      | 8/16/13    |
| 5          | type_2_diabetes | 386         | 0      | 9/6/14      | 8/16/13    |
| 6          | type_1_diabetes | 386         | 0      | 9/6/14      | 8/16/13    |
| 6          | type_2_diabetes | 386         | 0      | 9/6/14      | 8/16/13    |
-