Function dstats
dstats.Rd
The dstats function is used to find various patient lab value aggregations and calculations around an index date.
Usage
dstats(
database,
tables,
code_list,
index_date,
index_days_start,
index_days_end,
allow_subset = FALSE
)
Arguments
- database
database name to use
- tables
names of tables to use in a dataset that is a list
c("lab_result")
orc("lab_result", "vitals_signs")
supported tables: lab_result and vitals_signs
- code_list
user defined table with 3 mandatory columns:
mandatory columns
feature
- the feature the exact code will roll up to and name of the column in the output (letters, numbers, underscores only, no spaces, not case sensitive)code
- exact codecode_system
- RxNorm, LOINC, etc
optional columns
Supported columns are:
qualifier_num
- looks across lab_result and vitals_signs num value fields at the same time
if a user only passes in lab_result or vitals_signs to dstats, the function checks only that table
users can create as many additional columns as they want, but column names must be unique and match the supported column names
syntax for qualifying lab numeric values:
'<=X': less than or equal to X
'<X': less than X
'>=X': greater than or equal to X
'>X': greater than X
'~=X': not equal to X
'X:Y': between X and Y
syntax for qualifying lab categorical values
the user can enter any string they want - exact match
if cell is left blank, system skips and assumes no qualification for that code
one qualification of a code does not apply to the entire feature; in the case there is more than one code mapped to the feature - every code must be qualified
- index_date
user defined table with two columns
patient_id
index_date
- index_days_start
start period relative to index date in days
None means anytime
negative means before the index date
positive means after
- index_days_end
end period relative to index date in days
None means anytime
negative means before the index date
positive means after
- allow_subset
(optional argument) - checks whether the number of patients in the index event input table is the same as the dataset cohort number - would default to false, but if set to true would warn and continue
if TRUE and there are fewer patients in the index_date_df table, only returns the patient_ids present in index_date_df
Value
Returns a table with the following columns (unique key of the table is patient_id + feature)
patient_id
feature
- each unique feature value per patient IDmin
- the minimum lab value that appeared in time window defined by the user in relation to the index eventmax
- the maximum lab value that appeared in time window defined by the user in relation to the index eventmedian
- the median of all lab values that appeared in time window defined by the user in relation to the index event.If it is even values, take average of the 2.
mean
- the mean of all lab values that appeared in time window defined by the user in relation to the index eventfirst
- the first lab value that appeared in time window defined by the user in relation to the index eventIf multiple values same day, system displays median
last
- the last lab value that appeared in time window defined by the user in relation to the index eventIf multiple values same day, system displays median
Details
= dstats(database='covid_db', tables=c('procedure','diagnosis'), code_list=code_list, index_date=sg_index_date, index_days_start=-365, index_days_end=-1)
dstats_table head(dstats_table)
| patient id | feature | min | max | median | mean | first | last |
|------------|----------------|-----|-----|--------|------|-------|-------|
| 1 | alkaline_phos | 50 | 30 | 33 | 40 | 45 | 50 |
| 1 | aspartate_amin | 50 | 30 | 33 | 40 | 45 | 50 |
| 2 | alkaline_phos | 50 | 30 | 33 | 40 | 45 | 50 |
| 2 | aspartate_amin | 50 | 30 | 33 | 40 | 45 | 50 |
| 3 | alkaline_phos | 50 | 30 | 33 | 40 | 45 | 50 |
| 3 | aspartate_amin | 50 | 30 | 33 | 40 | 45 | 50 |