Notes for Wednesday Biostatistics Clinic

The Biostatistics Clinic on Wednesdays is dedicated to biostatistics applications in surgery, anesthesiology, and emergency and critical care medicine.

2020 November 04

Emily Deaton (Jessica Turnbull), Pediatric Critical Care

  • Previous clinic session 2020 May 13
  • I am expanding my previous project to the NICU (see previous clinic) and to other centers. The NICU project is in data collection. The multicenter project is in design. Mentor confirmed. VICTR voucher request.

2020 September 30

Jeremy Joseph, Plastic Surgery

  • Discussed informally previously, and a separate clinic has been set up with Dr. Higdon for the FitBit abdominoplasty study. This study is in patients undergoing breast reconstruction (DIEP) flap surgery and compares patients who get reminders from the FitBit device with those who do not, to see how long it takes them to return to their baseline number of steps. Abstract: The importance of physical activity and ambulation in prevention of post-operative complications is well-documented. However, strategies to encourage early ambulation continue to develop, with the goal of eliminating these complications and improving patient outcomes. Actigraphy devices have been used in the community to track activity and to encourage individuals to be physically active. These devices are beginning to influence the medical field, including investigations into outcomes in patients across various surgical specialties. In patients undergoing plastic surgery, though, the literature on actigraphy is sparse. A population particularly susceptible to post-operative complications in plastic surgery, namely venous thromboembolism (VTE), is those undergoing breast reconstruction for breast cancer.12 A prior study performed by our group that utilized actigraphy to monitor patients undergoing deep inferior epigastric perforator (DIEP) flaps for reconstruction after breast cancer resection showed that this patient population does not return to their baseline ambulation status as far out as 8 weeks post-operatively. This proposed study is designed to expand on that work in the DIEP flap patient population by examining the role of actigraphy devices in promoting earlier return to baseline ambulation.
A single-institution randomized controlled trial is proposed, with patients undergoing DIEP flap reconstruction randomized into two groups of approximately 10-15 patients each (total of 20-30 patients). The control group will consist of DIEP flap patients who will wear a FitBit HR device for 2 weeks pre-operatively and 8 weeks post-operatively; this group’s devices will not provide any reminders or alerts. Rather, the use of the actigraphy device in this group would be solely for the purposes of monitoring activity levels. The experimental arm will consist of DIEP flap patients who will wear their FitBit HR device for the same duration and timepoints, but this group’s devices will provide alerts and reminders to stay active throughout their post-operative course. The data extracted from the monitoring of the devices from both groups of patients will be used to determine if the use of the alerts in this population shows promise in promoting return to baseline physical activity. If an improvement is identified in the experimental group, as we hypothesize, it would suggest that the use of actigraphy devices may be beneficial in the post-operative course for this patient population. The impact of this could be multifold: reduction of post-operative complications, namely VTE, which in turn leads to improved individual patient outcomes, lower readmission rates, and a fiscal reprieve to the healthcare system due to the prevention of these complications. Mentor confirmed.
  • Dandan Liu and Wu Gong attended the clinic. Because of recruitment feasibility issues, we suggest that the investigator use data from a prior study conducted a year ago as the control arm and assign patients in the current study to the intervention arm. First, we should compare baseline characteristics between groups to check the balance of the study and identify potential confounding variables. Second, we conduct univariate analysis to assess the intervention effect on the outcomes of interest (daily mean heart rate and daily step count collected using Fitbit). A linear mixed-effects model might be used, with the interaction between follow-up day and intervention as the term of interest. The scope of the study is appropriate for a VICTR voucher.

2020 September 23

Jed Maslow, Orthopaedic Surgery

  • We are interested in evaluating the outcomes of revision radial head arthroplasty. The primary outcome of interest is overall survival (repeat surgery) of the implant. Secondary outcomes would include post-operative complications and functional outcomes (patient-reported). The comparative groups include primary radial head arthroplasty and those who undergo resection instead of replacement as a revision procedure. Mentor confirmed.

2020 September 16

Shiayin Yang, Otolaryngology

  • Background: The Nasal Obstruction Symptom Evaluation (NOSE) is a validated quality of life instrument used to assess the severity of nasal airway obstruction and how it affects a patient’s life. It asks patients to rate their feeling of nasal congestion and breathing over the past month. The score ranges from 0 to 100, with 0 as asymptomatic and 100 as most symptomatic. It has been validated for use in measuring outcomes from septoplasty and rhinoplasty and is used by surgeons as a guide to determine treatment options and improvement from surgery [1-2]. Although this survey is routinely used in clinical practice, there is limited data regarding normative values, which limits interpretation of clinical procedures and surgeries. Study Objectives: The objective of this study is to determine normative values of NOSE score amongst the general population and asymptomatic individuals. VICTR voucher request.
  • Two objectives: (1) estimate the distribution of NOSE score; (2) evaluate the effect of age, sex, ethnicity, and location on NOSE score
  • NOSE score ranges from 0-100, with a lower score indicating a better outcome; therefore, we expect a right-skewed distribution with more participants at the lower end. There are several ways to display the results: transform to achieve a normal distribution; describe the proportion of patients falling within certain ranges; or simply describe the whole distribution of the score.
  • Sample size: the international standard uses N=120 to estimate reference ranges for clinical labs. From a modelling perspective, the rule of thumb is 15 participants per d.f. To see the effects of certain clinical factors on NOSE score, for example, Y = age + sex + ethnicity (Caucasian, Black, Hispanic, Asian, other) + location (total 7 regions), need at least 12*15 = 180 participants. This study will have plenty of cases to estimate the NOSE score distribution as well as perform multiple linear regression.
  • Analysis: histogram, descriptive statistics, multivariable linear regression
  • The proposed work fits under the scope of VICTR biostatistics voucher.
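The 12 degrees of freedom in the sample size note can be checked with a short calculation. This is a minimal sketch, assuming age enters the model as a single linear term and each categorical predictor costs (number of levels - 1) d.f.; the function name `model_df` is made up for illustration.

```python
# Rule-of-thumb sample size: about 15 participants per model degree of freedom.
# Assumption: age is one linear term; a categorical predictor with k levels uses k - 1 d.f.

def model_df(n_continuous, categorical_levels):
    """Total d.f. spent by the predictors (intercept not counted)."""
    return n_continuous + sum(k - 1 for k in categorical_levels.values())

# Predictors named in the note: age (continuous), sex, ethnicity (5 groups), location (7 regions)
categorical = {"sex": 2, "ethnicity": 5, "location": 7}
df = model_df(n_continuous=1, categorical_levels=categorical)
min_n = 15 * df

print(f"d.f. = {df}, minimum N = {min_n}")
```

If age were modelled flexibly (for example with a spline), it would use more than one d.f. and the minimum N would rise accordingly.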

2020 September 02

Alexander Langerman, Otolaryngology - Head and Neck Surgery

  • I’m preparing a grant to examine the association of surgical performance assessment (OSATS, SIMPL) with postoperative errors and complications for neck dissection. I wish to discuss a statistics plan for this. Ideally identify a biostatistics collaborator.
  • Dr. Alexander Langerman is preparing a grant submission to evaluate performance (competence) in neck dissection. He intends to build a library of annotated surgery videos for potential deep learning evaluation in the future. Performance could be self-rated, cross-rated (by other attendings), or rated by a third party. The surgery is performed at Vanderbilt roughly 100 times a year. Chris asked about the research significance and the practical significance, and suggested having at least two third-party raters (observers) so that inter-rater reliability can be measured while the relationship between the ratings and clinical outcomes is explored. In addition, Chris suggested that the investigator consider establishing a collaboration plan with the Department, and the investigator happily agreed. Given the current COVID situation, the Biostatistics Department has been overwhelmed by a large volume of resource-demanding clinical research. Therefore, VICTR would not have the capacity to provide direct grant support for this effort before the grant submission deadline at the end of August. However, as a courtesy, Chris will donate several hours to support the statistical writing of the grant submission.

2020 August 26

Barron Frazier, Pediatric Emergency Medicine

  • The goal of our retrospective study utilizing de-identified data is to evaluate the post-intubation performance of a quaternary referral center that specializes in pediatric care. The study has these specific aims:
    • 1. Analyze the relationship between patient characteristics and post-intubation sedation
    • 2. Analyze pre-intubation interventions impact on post-intubation sedation
    • 3. Evaluate if difficult intubations impact post-intubation sedation
    • 4. Assess the impact of post-intubation sedation practices on duration of mechanical ventilation, duration of ICU level of care and overall hospitalization, and survivability.

Just started data collection and REDCap database has been built.

Outcome: appropriate post-intubation sedation, defined as whether sedation is given to the patient within the suggested time frame (calculated using half-life data of the intubation meds). Timing depends on the physician, who makes the decision based on the patient's condition and the meds used. There are many confounding issues. Predictors include long-acting paralytic used for RSI, presence of hypotension before intubation, time to disposition, presence of ED pharmacist, delayed PICU admission, age (better on a continuous scale than categorical), difficult intubation, intubations requiring 2 more attempts, time of arrival, race, type of presentation, history of chronic condition, and neurology/neurosurgery consult.

Suggested grouping these predictors into categories, checking their correlations, and performing descriptive statistics before deciding which predictors should be included in the model.

Patient outcomes include mechanical ventilation days, ICU LOS, hospital LOS, and in-hospital mortality. Consider survival analysis or ordinal outcome analysis treating death as the worst outcome (longest). If an ordinal outcome is analyzed with a proportional odds model, interpreting the result as an odds ratio might be a little challenging. For mortality, if the rate is low, the study may not have the power to fit the model. Also need to consider which adjusting covariates should be included.

The scope of the analysis fits under VICTR voucher.

2020 August 19

Edward Qian, Pulmonary and Critical Care

  • Retrospective analysis of the SMART dataset looking at patients receiving empiric antibiotics. Specifically looking at the comparative effectiveness and renal/neurologic toxicity of cefepime vs piperacillin/tazobactam. Will come prepared with outcome of interest, exposure, and known confounders of interest. VICTR voucher request. Mentor confirmed.
  • The study aims to use retrospective data from the SMART trial to compare cefepime vs piperacillin/tazobactam in relation to MAKE30 and other secondary clinical outcomes. We first need to understand data availability and the patterns of each antibiotic prescription, which might be given multiple times during ED/ICU stays. Then we will assess indication bias, that is, whether some patients might be more likely to be given one antibiotic than the other. Propensity scores or propensity score matching might be used if there is indication bias. The proposed work fits under the scope of VICTR voucher.

2020 August 12

Don Arnold, Pediatrics

  • IV Magnesium is recommended as a second-tier medication for acute asthma exacerbations in children. However, the evidence base for this comprises 5 RCT and a small aggregate sample size. More recent registry-level data suggests that IV Mg does not decrease hospitalizations. Our objective is to determine whether this treatment is associated with decreased hospitalizations and improved lung function in our prospective parent study of 933 children with acute exacerbations. We anticipate using propensity scores to test our hypothesis that Mg does not decrease hospitalizations or improve lung function in comparison with standard treatment without Mg. VICTR voucher request.

  • The study is looking at whether IV Magnesium is associated with outcomes including AAIRS score at 2 hrs, hospitalization, time to Q 4-hr albuterol, and hospital LOS. We propose to perform propensity score matching in the analysis, with matching covariates pre-defined. The proposed work fits under the scope of VICTR voucher.
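To make the matching step concrete, here is a minimal sketch of 1:1 greedy nearest-neighbor matching without replacement. It assumes the propensity scores have already been estimated from the pre-defined covariates (for example, by logistic regression). The function name `greedy_match`, the patient IDs, the scores, and the caliper of 0.05 are all illustrative assumptions, not part of the clinic recommendation.

```python
def greedy_match(treated, controls, caliper=0.05):
    """1:1 nearest-neighbor matching on the propensity score, without replacement.

    treated, controls: dicts mapping patient id -> estimated propensity score.
    Returns a list of (treated_id, control_id) pairs within the caliper.
    """
    available = dict(controls)  # controls not yet used
    pairs = []
    # Fixed processing order (highest score first) keeps the result reproducible
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1], reverse=True):
        if not available:
            break
        # Closest remaining control on the propensity score
        c_id = min(available, key=lambda cid: abs(available[cid] - t_ps))
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is matched at most once
    return pairs

# Hypothetical scores: children who received IV Mg (treated) and those who did not
treated = {"T1": 0.62, "T2": 0.35, "T3": 0.90}
controls = {"C1": 0.60, "C2": 0.33, "C3": 0.40, "C4": 0.10}

pairs = greedy_match(treated, controls)
print(pairs)  # T3 has no control within the caliper and stays unmatched
```

Unmatched treated patients, such as T3 here, drop out of the matched analysis, so covariate balance and the number of patients lost should be reported after matching.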

2020 July 29

Emily Kight, Bioengineering

Previous clinic session 2020 July 08

  • 1. Feedback on the panel test validation plan
  • 2. Test against 20-40 benign vs 20-40 disease age matched women
  • 3. Test against ELISA (more sensitive method but takes 8 hrs)
  • 4. Determine limit of detection
  • 5. Determine discriminatory power (how to do this?)

Leah Brown, Pulmonary/Critical Care

  • The aims of this study are twofold: 1) to assess the effect of gastric acid suppression (both before and during ICU admission) on lung colonization, injury, inflammation, and infection, and 2) to correlate respiratory culture data with incidence of these clinical outcomes. Understanding the dysbiotic consequences of gastric acid suppression on the lung microenvironment will help elucidate markers to predict negative outcomes and facilitate development of preventive interventions.

    • Hypothesis: In critically ill patients, gastric acid reducing agents facilitate the colonization of gut bacteria in the lungs, which may promote (or reflect) inflammation, injury (ARDS), and/or infection (pneumonia, sepsis). Enrichment of the lung microbiome with various gut bacteria may correlate with these clinical outcomes.
    • Likely multiple confounding variables: how to best control for these? General assistance with statistical analysis for project.
    • VICTR Voucher, mentor confirmed

2020 July 08

Matthew Felbinger, Pharmacy/Emergency Medicine

  • I am working on evaluating the effects of a post-intubation sedation protocol for patients that have an endotracheal tube placed in the emergency department and require analgesics/sedatives after tube placement, to maintain comfort with the breathing tube. In short, the protocol directs physicians to select two medications (propofol and fentanyl) and avoid the use of other sedatives (i.e. benzodiazepines). The questions that I’d like to address are:

    • 1. What is the compliance with the protocol? The presumption is that the compliance is high, but deviations are likely. This is mostly QI focused.
    • 2. Does the use of the protocol reduce the amount of benzodiazepines used in the ED for these patients? We hypothesize that the protocol will decrease use of benzodiazepines.
    • 3. If there is a reduction in benzodiazepine use in the ED, does this have an impact on Delirium and coma-free days in the ICU? This is based on the presumption that patients receive the same standard of care once in the ICU, so the only “intervention” would be medications administered in the ED. I’d need help with matching patients and adjusting for covariates to best represent the data to answer this question.
    • Interrupted time series analysis: adjusted proportion of benzo use vs. time; expect to see an overall decrease with a big jump around when the intervention happens
    • Interrupted time series analysis: proportion of protocol adherence vs. time
    • Association between protocol implementation and clinical outcomes: is the medication regimen associated with outcomes? See whether what was done in the ED changes what was done in the ICU. Need to consider the outcome and other factors that may modify/moderate/confound the outcome, as well as baseline clinical factors.
    • VICTR voucher.

Emily Kight, Bioengineering

  • I am working on an early detection panel of biomarker test for ovarian cancer in urine using affordable and low-cost lateral flow assay (LFA) technology. In this order, I plan to: Test the ability to discriminate between benign vs disease state for age matched women, Compare panel test to ELISA (validate same urine samples to ensure ELISA results are the same as LFA results), Test predictive power of panel test in clinical cohort of women with a benign cyst. Determine PPV.
    • For #1, how many of each type do I need to say my test can discriminate?
    • For #2, is it enough to be just as good as ELISA which is longer and more expensive? How do I determine the uncertainty in my measurement?
    • #3 How many times does my test have to be right to be considered predictive? How many women do I need to do predictive testing? Can my panel test be combined with algorithm of patient info to increase PPV? I’d like to compare standard of care vs (standard of care+panel test) vs. (standard of care+panel test+algorithm)
  • Literature research on all biomarkers that were used for decision making and cost of the biomarkers. Pick some most commonly used (i.e. 5) and see whether they can discriminate the disease.
  • Can LFA measure the biomarkers? Need to make sure the biomarkers chosen are clinically meaningful and make biological sense. Use Bland-Altman plot to assess agreement between biomarker measured by LFA and measured by ELISA.
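The Bland-Altman summary behind the suggested plot can be sketched as follows: the bias is the mean of the paired differences, and the 95% limits of agreement are the bias plus or minus 1.96 standard deviations of those differences. The paired LFA and ELISA values below are made-up numbers for illustration only, and `bland_altman` is a hypothetical helper name.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Return (pair means, differences, bias, lower limit, upper limit)."""
    if len(method_a) != len(method_b):
        raise ValueError("measurements must be paired")
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return means, diffs, bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical biomarker concentrations measured on the same urine samples
lfa = [10.2, 15.1, 20.3, 24.8, 30.5]
elisa = [10.0, 15.0, 20.0, 25.0, 30.0]

means, diffs, bias, lower, upper = bland_altman(lfa, elisa)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

The plot itself is `diffs` on the vertical axis against `means` on the horizontal axis, with horizontal lines at the bias and the two limits. Whether the limits are narrow enough is a clinical judgment, not a statistical test.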

2020 June 24

Claci Walls, Pediatric Emergency Medicine

  • We have implemented a curriculum to help improve pediatric airway management and would like to review some of our pre and post survey data. We would like to discuss with the statisticians the best way to analyze this type of data and the appropriate analysis to answer our questions: whether this curriculum improved resident involvement in airway distress situations, and whether it increased the number of airway procedures.

Whitney Gannon, Pulmonary and critical care medicine

  • We performed a randomized controlled trial of different educational methods in teaching ECMO to clinicians. We would like direction on the best way to compare groups and how to address missingness.

2020 June 10

Kimberley Harper, Pediatric Critical Care

  • I have implemented a risk stratified sedation weaning protocol and wish to look at outcomes and differences pre and post protocol. VICTR biostatistics voucher. Mentor confirmed.
  • Hypothesis: implementation of a sedation weaning protocol based on risk stratification will reduce (1) weaning duration (in days, ranges from 1 to 50 days, majority between 10 and 20 days), (2) ICU length of stay.
  • Pre period is from 2019 June to December, post period from December to March 2020.
  • Suggest interrupted time series analysis to see whether the difference is just due to time effect or due to treatment.
  • Outcome ~ time (since day zero) + period (pre/post) + time * period + covariates (duration of sedation,...)
  • The scope of the project falls under VICTR biostat voucher.
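The segmented regression in the model line above can be sketched with a small least-squares fit. This is an illustration only: the data are simulated without noise, the covariates are omitted, and the helper names (`solve`, `fit_ols`, `design_row`) are made up. In practice a statistics package would be used and would also give standard errors.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, y):
    """Ordinary least squares via the normal equations X'X beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def design_row(time, post):
    """Columns: intercept, time, period (0 = pre, 1 = post), time x period."""
    return [1.0, float(time), float(post), float(time * post)]

# Simulated example: 10 pre-protocol and 10 post-protocol time units
times = list(range(20))
period = [1 if t >= 10 else 0 for t in times]
y = [20 - 0.1 * t - 3 * p - 0.2 * t * p for t, p in zip(times, period)]

beta = fit_ols([design_row(t, p) for t, p in zip(times, period)], y)
print([round(b, 3) for b in beta])  # intercept, time slope, period, time x period
```

With time measured from day zero, the period coefficient is the gap between the two fitted lines extrapolated back to time 0. Centering time at the protocol start date would make it the level change at the change point, which is usually easier to interpret.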

Caroline Erickson, School of Medicine

  • The project is a cost analysis of the MIND-USA clinical trial and the research question is: Does treating ICU delirium with haloperidol or ziprasidone affect total ICU costs relative to placebo in adult medical or surgical ICU patients? Questions to address are related to the statistical plan and obtaining a VICTR grant. VICTR biostatistics voucher. Mentor confirmed.
  • Compare ICU costs between two groups. Cost is usually skewed, so a traditional parametric model is not desirable. Suggest using a Cox proportional hazards model (replacing time with cost, with every patient having the event).
  • It is better to analyze both ICU cost and total hospital utilization in case of cost shifting. There are no itemized individual costs, only cost buckets. Can compare overall cost and cost within each bucket. Show the cost curve (survival curve) instead of reporting the median. If cost shifting only occurs in severe patients, the median cost will not show any difference.
  • Competing risks of death, which in this study is the worst outcome.
  • Daily cost ~ daily delirium + daily sepsis. Reference paper on analyzing daily cost:
  • The scope of the project falls under VICTR biostat voucher.
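The point about the cost curve versus the median can be illustrated with a small sketch. The curve here is the empirical proportion of patients whose cost exceeds each value, which is a survival curve with cost in place of time and every patient having the event. The costs are invented numbers, the function names are hypothetical, and the sketch ignores the competing risk of death noted above.

```python
from statistics import median

def prop_above(costs, threshold):
    """Proportion of patients whose cost exceeds the threshold."""
    return sum(1 for c in costs if c > threshold) / len(costs)

def cost_curve(costs):
    """Empirical 'survival' curve of cost, evaluated at each distinct observed cost."""
    return [(c, prop_above(costs, c)) for c in sorted(set(costs))]

# Hypothetical ICU costs (in thousands of dollars); only the costliest patients differ
placebo = [10, 12, 15, 18, 20, 25, 30, 40]
treated = [10, 12, 15, 18, 20, 25, 60, 90]

print(median(placebo), median(treated))                  # identical medians
print(prop_above(placebo, 30), prop_above(treated, 30))  # upper tails differ
for cost, prop in cost_curve(treated):
    print(cost, prop)
```

In this made-up example the medians are equal, yet the share of patients costing more than 30 doubles in the second group, which is the pattern a full curve shows and a median hides.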

2020 May 13

Emily Deaton, Pediatric Critical Care

  • Estimate on biostats services, advice for appropriate statistical analysis for my fellowship scholarly project. Evaluating moral distress in those working within the health care team in the pediatric ICU and pediatric cardiac ICU. Comparing the level of moral distress to the perceived ethical decision making climate found within the unit. Within these goals, we are also evaluating whether knowledge or use of the ethics consult service is a protective factor. VICTR biostatistics voucher. Mentor confirmed.
  • Survey on Moral distress in PICU. Total 53 items, calculated score for each item will be added to get total score. Need to consider if some items are missing.
  • Collection of survey started from end of February and will be completed in June. Covariate of COVID?
  • Challenge: religious beliefs
  • Question: does having experience with the consult service decrease moral distress or increase the perceived ethical decision making climate (as measured by the questionnaire)? Can fit a model with moral distress as the outcome and decision making climate and ethics consult service as predictors.
  • Total enrolled N=75 (38 nurses, 9 faculty, 4 fellows, 11 peds residents, 2 EM residents, 5 therapists, 4 NP/PA). Descriptive statistics of all item scores overall and by groups.
  • The scope of the work fits under VICTR biostat voucher of $5000.

Cindy Hernandez, Urology

  • Reduction of Allogenic Blood Transfusion in Locally Advanced Kidney Cancer. PI: Moses. Additional Study Authors: Hernandez, Dawes/Siegrest, Benson, Idrees, Garrard, Balsara. Background and Rationale: Allogenic blood transfusion is associated with adverse events such as pulmonary and vascular complications, immunosuppression, and cancer recurrence/specific survival. Radical nephrectomies for large tumors (>7cm) and/or tumor thrombi are complex and frequently require allogenic blood transfusions due to acute blood loss from parasitic vessels and thrombectomy with or without IVC reconstruction. There are specific techniques that can be utilized to reduce the need for allogenic blood transfusion, including acute normovolemic hemodilution (ANH), cell saver, and veno-venous (VV) bypass. The utilization of these techniques has not been well described in kidney cancer, highlighting an unmet need in patient care. Hypothesis: Utilization of blood sparing techniques in locally advanced kidney cancer can reduce allogenic blood transfusion. Protocol: Randomization of patients with >=cT2 renal masses to blood sparing vs non-blood sparing intra-operative techniques. Method of blood sparing to be determined based on patient and tumor characteristics. Primary Endpoint: Reduction in allogenic blood transfusion. Secondary Endpoints: Complications (measured by Charlson or Elixhauser), Cancer Recurrence, 3-year Overall Survival, Cost. Exploratory Endpoints: Assessment of circulating tumor cells (CTCs) in preoperative blood specimen, pre- and post-wash cell saver container, and postoperative blood specimen; Measurement of pre- and post-operative inflammatory markers (CRP). Sample size: To be determined by calculating the % of patients receiving allogenic blood transfusion in a contemporary VUMC cohort, as well as from known literature.
We will aim for a 20%/30%/50% reduction in number of units of allogenic blood, which will allow for calculation of a two-sided alpha of 0.05, indicating a significant reduction. Patients will be randomized 1:1, patient (and surgeon?) will be blinded to the study arm. Postoperative blood transfusion will be included in the final tally for each patient. Inclusion: Patients 18 and older with reasonable hepatic, pulmonary, renal and cardiac function, who have >=cT2 renal masses, Hemoglobin >=9 or 10, able to provide consent, can have N1 or M1 disease if they are deemed surgical candidates (including cytoreductive nephrectomy). Exclusion: severe anemia. Questions to address: power analysis for determination of sample size. Is randomization method appropriate? VICTR biostatistics voucher. Mentor confirmed.
  • VUMC 12/17/2015 to 12/17/2019: total 446 nephrectomies (partial and radical); among those, 158 nephrectomies for >= cT2 renal cell carcinoma (20.8% (33) transfused, 14.5% (23) IVC thrombectomy). The eligible cohort is drawn from these 158 and is around 140. Randomize the 140 patients into two groups: blood sparing techniques vs. standard care.
  • Primary outcome: transfusion units received (those not transfused will have number as zero)
  • Proportional odds model can be used to evaluate the difference of units transfused between groups
  • Suggest consult with cancer center for sample size and randomization

2020 May 06

Seth Davis, Otolaryngology

  • Large defects in the oral cavity often require vascularized free tissue transfer for reconstruction. These “free flaps” can atrophy over time, leading to functional deficits. We attempt to quantify this degree of atrophy over time via volumetric measurements of CT scans. We are hoping to perform a power analysis based on prelim data to determine the number of patients needed to generate significance between findings in radiated and non-radiated patients. Mentor confirmed.
  • Two volumetric measurements per patient, approximately at 3 months and 1 year, but not fixed for all patients. Baseline volume is not available and is currently estimated using the fitted slope and BMI. Fairly rapid change during the first few months is expected, and the curve should flatten approaching 1 year. Therefore, two time points are not enough to provide an accurate estimate of the overall trend. We suggest obtaining more data points as well as more patients.
  • Can fit a mixed-effects model with all follow-up volumes as the outcome; the fixed effects include baseline volume and the time by radiation interaction.

2020 April 22

Jo Ellen Wilson, Psychiatry/Center for Critical Illness Brain Dysfunction and Survivorship

  • I would like guidance regarding a stats plan for a CAM-ICU validation project (validated in a new population) - neurocritically ill. Multiple assessments (1 each day) per patient. Reference rater collected DSM-5 level data in a standardized form. Would like assistance with analysis plan. VICTR Biostatistics voucher. Mentor confirmed.
  • VADA is a validated tool for delirium diagnosis (as measured by DSM-5); the aim is to validate VADA in neurocritically ill patients.
  • VADA was collected once a day (if available); CAM was collected twice a day. VADA will be paired to the closest CAM on that day.
  • There will be multiple pairs of delirium (Y/N) per patient. The agreement between the two will be measured and presented as PPV/NPV (or sensitivity/specificity).
  • The scope of the analysis fits under a standard VICTR biostat voucher.
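The agreement measures named above come from a 2x2 table of paired delirium calls. The sketch below uses generic labels, "index test" and "reference standard", because which instrument plays which role is an assumption here. The counts are invented and `accuracy_measures` is a hypothetical helper.

```python
def accuracy_measures(pairs):
    """pairs: list of (index_positive, reference_positive) booleans, one per paired assessment."""
    pairs = list(pairs)
    tp = sum(1 for t, r in pairs if t and r)
    fp = sum(1 for t, r in pairs if t and not r)
    fn = sum(1 for t, r in pairs if not t and r)
    tn = sum(1 for t, r in pairs if not t and not r)

    def ratio(num, den):
        return num / den if den else None  # undefined when the denominator is zero

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }

# Hypothetical paired assessments: 8 true positives, 2 false positives,
# 2 false negatives, 18 true negatives
pairs = ([(True, True)] * 8 + [(True, False)] * 2 +
         [(False, True)] * 2 + [(False, False)] * 18)

result = accuracy_measures(pairs)
print(result)
```

Because each patient contributes several pairs, the pairs are not independent. These point estimates are still valid summaries, but confidence intervals should account for clustering within patient, for example by bootstrapping patients rather than pairs.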

Whitney Gannon, Pulmonary/Critical Care

  • Retrospective cohort analysis of anticoagulation practices in extracorporeal membrane oxygenation. We will examine bleeding complications among usual anticoagulation practices in patients receiving venovenous ECMO. Independent investigator.
  • Patients on ECMO often need blood thinner to prevent clotting, however, it may cause bleeding.
  • Inclusion: patients on ECMO and with no bleeding disorder, no clotting disorder, no recent surgery (so they are on blood thinner only due to use of ECMO)
  • Goal: Whether composite bleeding/clotting is associated with in-hospital death.
  • Enrolled patients from 2016 until now, total N=54 patients. 14 out of 54 died in hospital. 22 had bleeding events. 7 clotting events.
  • Ideal with a sufficient sample: a Cox model of time to death with bleeding and clotting events as two time-varying covariates. However, with this limited number of events (N=14), we suggested performing simple descriptive statistics.
  • Mediation analysis: indicator of bleeding, clotting, their interaction, and disease severity. Need about 60-90 events.

2020 April 8

Kristina Betters, Pediatrics/Critical Care

  • We are evaluating a novel scoring system to observe improvements in patients' strength during a PICU stay. VICTR voucher request, independent investigator.
  • The goal of the project is to evaluate whether PAM score is sensitive to clinical interventions. The analysis will include: (1) Table 1 descriptive statistics of patients' characteristics; (2) evaluate the association between PAM score and physical functional scale; (3) compare PAM score by discharge disposition; (4) fit a linear model (or proportional odds model) of PAM score at discharge in association with other clinical measures, adjusting for initial PAM score.
  • The scope of the analysis fits VICTR biostatistics voucher.

2020 March 11

Jaycelyn Holland, Pediatric Emergency Medicine

  • There are multiple scoring systems for assessment of leadership or ‘non-technical’ skills in crisis scenarios, though none have been studied in a medical student population. We hope to perform an initial evaluation of a BARS (behaviorally anchored rating scale) to determine if it shows good inter-rater reliability in a medical student population. I am hoping to ensure we have an appropriate plan for video review to obtain scores for analysis. I would also appreciate advice on how data can be analyzed (once obtained) to produce educationally relevant data. VICTR voucher request, mentor confirmed.

Monica Bhanot, Dept of Diabetes, Endocrinology, & Metabolism

  • We are analyzing human adipose tissue depots (SQ vs omental) for iron containing macrophages to correlate their differential number with insulin resistance. VICTR voucher request, mentor confirmed.

2020 February 19

Jonathan Siktberg, Unknown

  • Our project is a case series of 653 patients who have been treated for age-related macular degeneration under the step therapy protocol at VEI. We have completed data collection and would like guidance on our data analysis, specifically on the utility of multiple linear regression in our project. Mentor confirmed.

2020 February 12

Melanie Whitmore, Pharmacy

  • My project will analyze the utilization of chemical restraints in agitated patients. We are looking at the association of chemical restraint administration and length of hospital stay, associations with gender, time of year, etc. Would like to review use of statistical tests for appropriateness and analysis.

Belinda Li, Pediatric urology

  • A retrospective review of a combination of clinical data (10 parameters) to determine predictors or patterns in the type of surgery they receive. Mentor confirmed.

2020 February 5

Belinda Li, Pediatric Urology

  • A retrospective review of a combination of clinical data (10 parameters) to determine predictors or patterns in the type of surgery they receive.

Christine Lopez, Internal Medicine/Hepatology

  • Retrospective study which will involve evaluation of changes in liver tests in women with NAFLD over the course of pregnancy and post-partum. We hypothesize there may be a worsening in liver tests during pregnancy. We have applied for a VICTR voucher and need to lay out a more detailed plan for statistical analysis before it can be reviewed.

2020 January 22

Melanie Whitmore, Pharmacy

  • My project will analyze the utilization of chemical restraints in agitated patients. We are looking at the association of chemical restraint administration and length of hospital stay, associations with gender, time of year, etc. Clarification and interpretation of statistical tests run on SPSS.

Samantha Brokenshire, Pharmacy

  • Retrospective, pre-post study evaluating change in diuretic needs, serum potassium and urine output after initiation of spironolactone. Would like assistance in JMP Matched Pair analyses and review of statistical methods. Mentor confirmed.

2020 January 8

Laura Shashy, Pediatrics- Neonatology

  • Fellowship project- pre and post intervention data. Review statistical analysis. Has attended Thursday clinic.
  • VICTR voucher (no biostat)/Mentor confirmed

Melanie Whitmore, Pharmacy

  • I would like to request an appointment with Dr. Chris Lindsell, as he is familiar with my project and has advised me in a previous clinic appointment. My project will analyze the utilization of chemical restraints in agitated patients. We are looking at the association of chemical restraint administration and length of hospital stay, associations with gender, time of year, etc. I have sent a more thorough description via email to Dr. Lindsell. Thanks!

2019 December 18

Rachel Boardman, Pharmacy

  • Retrospective review to compare the use of dual hyperosmotic therapy (hypertonic saline + mannitol) against monotherapy in early management of TBI patients.
  • Retrospective data pulled from EPIC within two years (December 2017-December 2019), N=195 total patients admitted into Vanderbilt emergency department with TBI. Each patient received either hypertonic saline, mannitol, or both. Patients' demographics and trauma related clinical variables at ED presentation were collected.
  • Plan to perform propensity matching on the fluid received.
  • The primary outcome will be GCS at hospital discharge. About 15% of patients died prior to hospital discharge; may assign the worst score to include those patients in the analysis.
  • Li Wang was present at the clinic. A VICTR voucher for biostatistical assistance is appropriate based on the scope of the study.

2019 December 11

Emily Long, Plastic Surgery

  • We performed a preliminary study last year comparing cytokines in peritumoral skin of immune-suppressed vs immune-competent patients with squamous cell carcinoma. Our study was underpowered, and we identified trends, but did not reach statistical significance. We would like to determine how many samples we would need to adequately power a larger study going forward, and take steps to apply for a VICTR grant.
  • VICTR voucher/Mentor confirmed

2019 December 4

Laura Shashy, Pediatrics- Neonatology

  • Fellowship project looking at comparison of Hospital Anxiety and Depression scores in a group of NICU parents who journal vs those who do not journal. Previously attended clinic in September (Thursday).
  • Would like to continue discussion about data analysis.
  • VICTR voucher/Mentor confirmed
  • Total of 100 parents of NICU babies/children. Baseline anxiety and depression scores were available and post scores were also measured. The goal is to compare post scores between intervention and control groups. For some kids, both parents' data were included.
  • Suggest looking at the outcome distribution first to decide whether to use a linear model or a proportional odds model. Since some parents are from the same family, consider a mixed effects model.
  • Expect about 30% missing outcomes. Table 1 should show baseline characteristics overall and among those with and without a missing outcome. Can perform imputation as a sensitivity analysis.

Sneha Patel, Pharmacy

  • Mechanical ventilation is a common strategy to help maintain oxygen saturation and excrete carbon dioxide throughout the lungs of critically ill patients. Spontaneous awakening trials (SATs) and spontaneous breathing trials (SBTs) are utilized daily to facilitate weaning from mechanical ventilation when appropriate. Many patients who have passed the SAT/SBT trial are not extubated. I will be evaluating which steps limit patients from passing a combined SAT and SBT and achieving extubation. We are interested to see if patients who pass an SAT safety screen, SAT, SBT safety screen, and SBT but are not extubated will (1) be more likely to have received heavy sedation on the day prior to the SAT/SBT and (2) be more likely to have a neurological indication for mechanical ventilation. I am requesting support for data analysis discussion and assistance.
  • There will be three main analyses.
  • Analysis of daily failure of SAT in relation to sedations
  • Analysis of daily failure of SBT in relation to sedation among those who pass SAT or do not take SAT
  • Analysis of extubation among those who pass both SAT and SBT
  • All analyses will use mixed models with an autocorrelation structure
  • VICTR voucher/Mentor confirmed.

2019 November 13

Mark Xu, School of Medicine

  • Examining the clinical data of pediatric rhabdomyosarcoma patients, specifically what contributes to changes in vital status.
  • Mentor confirmed

2019 October 23

Christin Giordano, Medicine/Nephrology

  • We are hoping to look at pregnancy and fetal related outcomes in women with chronic kidney disease.
  • VICTR voucher/Mentor confirmed

2019 October 9

John Power, Internal Medicine

  • REDCap clinical database, analyzing with Stata, but questions on how to perform multivariate analysis with a high percentage of missing data

2019 October 2

Alan Makhoul, Plastic Surgery

  • We are working on a case series of 20 patients who will have activity measured for 3-4 weeks before breast reduction surgery and for 4-6 weeks following surgery. We want to know how long it takes for post-operative physical activity to return to the pre-operative baseline.

    Questions for Biostatistician:

    1. Comments: Enrollment has begun (~3 subjects). 20 subjects is likely for feasibility. Could drop early observations for run-in. Recommend spaghetti plots to show data. Analysis will be mostly descriptive. Could include table with row for each subject with age, diagnosis, time to baseline activity, etc. Discussed missing

Megan Wright, Emergency Medicine/SOM

  • Research question: Among adult patients transferred to the VUMC ED from outside hospital EDs in 2018, what patient and hospital factors are associated with potentially avoidable transfers?
  • VICTR voucher, mentor confirmed
  • Advised looking at CDP for chart abstraction. Advised of timeline for application and analysis, deadline is mid December for SAEM.

2019 September 25

Michael Smith, Trauma and Surgical Critical Care

  • We are planning to study our cohort of patients with traumatic brain injury around their patient-centered outcomes. This would probably be a multi-phase study for which we would ultimately seek external funding, so I would like to speak with the statistician team prior to starting anything.

2019 September 4

Jason Cook, Cardiology

  • BNP is elevated in atrial fibrillation, however precise mechanism has not been determined. Our study aims to evaluate structural and functional assessments of the left atrial appendage to determine which factors have the largest impact on changes in BNP.
    • Mentor confirmed/VICTR voucher.
    • VICTR statistician has met with the investigators and discussed the study. A $5000 VICTR voucher is appropriate for the scope of the analysis.

2019 August 28

Wade Brown, APCCM

  • Retrospective analysis of complications by level of experience in MICU intubation
    • Mentor confirmed/VICTR voucher.

Melanie Whitmore, Pharmacy

  • Retrospective assessment of use of chemical restraints for agitation and the impact on length of stay. Questions on data to collect and which statistical tests would match data collected
    • Mentor confirmed.

2019 August 14

Marie Kuzemchak, Cardiac Surgery

  • Our project is looking at warm ischemic time in heart transplant patients and early postoperative graft dysfunction.
    • Mentor confirmed.

2019 August 7

Austin Adair, Pediatric Cardiac Critical Care

  • Review of previous project with biostatisticians to change direction of project given sample size. Attended clinic in June.
  • VICTR voucher.
  • Mentor confirmed.

Taylor Coston, Internal Medicine

  • I aim to answer the question, “Does the choice of balanced crystalloids versus saline for fluid resuscitation affect outcomes among critically ill adults with cirrhosis?” There is currently little evidence to guide the choice of resuscitation crystalloid in these patients. I plan to perform a subgroup analysis of SMART (a pragmatic, cluster-randomized, multiple-crossover trial comparing balanced crystalloids with saline for fluid resuscitation among adults admitted to 5 ICUs at Vanderbilt). I have identified a cohort of 530 patients with cirrhosis using a previously validated ICD-9 algorithm converted to ICD-10. The primary outcome was the proportion of patients that experienced a major adverse kidney event within 30 days - a composite of death, new receipt of renal-replacement therapy, or persistent renal dysfunction. In addition to the primary analysis I plan to perform a sensitivity analysis comparing saline vs LR only, given the decreased ability of cirrhotic patients to metabolize lactate. If data from the BASE trial is available, it would be interesting to compare LR vs plasmalyte in this population as well.
  • VICTR Voucher/ Mentor confirmed.

2019 July 31

Alexandria David, Pharmacy

  • To evaluate if adequate sedation can be achieved using a sedation protocol that focuses on utilizing opioid and dexmedetomidine infusions and minimizing benzodiazepine infusions in mechanically ventilated pediatric patients in a critical care setting.


    • Difference in total cumulative dose and duration of benzodiazepine infusions between control and study group
    • Comparison of the incidence and duration of delirium between control and study group
    • Comparison of total cumulative dose and duration of opioid infusion between control and study group
    • Comparison of utilization of adjunct sedation (ketamine, chloral, etc.) between control and study group
    • Comparison of length of mechanical ventilation, hospital length of stay, and ICU length of stay between control and study group.
We anticipate sample size to be small and would also like to know how to handle missing data.
  • VICTR voucher.
  • Mentor confirmed.

2019 July 24

Joydeep Baidya, Department of Anesthesiology

  • Intrathecal baclofen pumps/catheters are placed in patients with cerebral palsy to treat spasticity. Following treatment, certain patients experience cerebrospinal fluid (CSF) leaks. While most recover as a result of conservative treatment, others require an epidural blood patch (EBP). We are trying to determine factors that predispose patients to have a CSF leak, and subsequently require an EBP.
    Question: There are a few factors that we have determined to be statistically significantly different between the two groups. We would like to conduct linear regression on the data and are looking for advice/suggestions on how to do so.
  • Mentor confirmed.

2019 June 26

Austin Adair, Pediatric Cardiac Critical Care

  • All pediatric patients undergoing OHT during the designated study time who had either a cardiac catheterization, CT-angiogram of the chest, or cardiac MRI as a part of their pre-transplant evaluation will have their Nakata index and McGoon's ratio as measures of pulmonary artery size compared to right ventricular dysfunction, ventilator hours, graft failure, hospital mortality, ICU length of stay, hospital length of stay, and maximum vasoactive infusion score in the first 24 hours after transplant. We need help with the data analysis plan.
  • VICTR voucher.
  • Mentor confirmed.

Lana Boursoulian, Pulmonary

  • Data management needs
  • Mentor not confirmed.

2019 June 12

Ricardo Lugo, Cardiology

  • Determining predictors of a “steam pop” event during radiofrequency ablation in the human ventricle. We have collected data of repeated observations within multiple patients. I would like to request assistance with using the R package for GEE analysis.
  • This is a retrospective study of RFAs conducted in a single center. n~39 patients, ~1,000 ablations. ~17 patients had at least one event (~32 events in all).
  • Could use sandwich estimator for simple linear model.
  • Suggest using graphics highlighting individual data points where possible.
  • Suggest VICTR voucher. This meeting can serve as initial clinic visit for voucher.
  • Mentor confirmed.

2019 May 29

Daniel Sack, Epidemiology

  • Association between completed primary care visits at Shade Tree clinic and visits to the VUMC emergency room, adjusted for zip, CCI, mental health diagnosis, and recent ED visits
  • Mentor confirmed by phone.

2019 May 15

Caroline Eskind, Medicine/Infectious Disease

  • Assessing lung transplant microbiome with 16S rRNA sequencing. First need assistance with analyzing demographic data.
  • Needs statistical help with comparing demographics and clinical factors between groups, as well as analyzing sequencing data.
  • Suggest applying for a VICTR Biostatistics voucher in the amount of $5000.

Ben Fernandes, Pediatric Critical Care Medicine

  • Conducting a prospective pilot study of single ventricle patients in the CVICU. Checking 3 biomarkers prior to discharge after first surgery and readmission dates
  • VICTR Biostatistics voucher, mentor confirmed.

2019 May 8

Janesh Lakhoo, Radiology

  • Comparing calcifications on MR to CT. Need voucher for statistics tests. Planning on paired t-tests at the moment for comparison.
  • For each patient, there will be one CT measure of calcification and three additional MR measures (by three methods). We want to compare each of the MR measures to the CT measure. The primary outcome is the measure of calcification, which is ordinal. Consider using the Bland-Altman method to measure agreement, which plots the difference vs. the average. We might also need to take level-dependent agreement into account. Deming regression can be fit to measure the relationship between CT and MR.
  • Suggest applying for a $5000 VICTR Biostatistics voucher for help with statistical analysis and manuscript preparation.
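  • A minimal sketch of the two suggested methods, assuming the calcification measures can be treated as numeric and that CT and MR have equal measurement-error variance (delta = 1). The function names and the delta default are illustrative assumptions, not something decided in clinic.

```python
import math
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower, upper): mean difference and approximate 95% limits of agreement."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def deming(x, y, delta=1.0):
    """Deming regression of y on x.

    delta is the assumed ratio of error variances (errors in y / errors in x).
    Raises ZeroDivisionError if x and y have zero sample covariance.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    d = syy - delta * sxx
    slope = (d + math.sqrt(d ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    intercept = my - slope * mx
    return slope, intercept
```

    For the Bland-Altman plot itself, the differences would be plotted against the pairwise averages, once per MR method. Because the outcome is described as ordinal, both summaries should be read as approximations.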

2019 April 24

James Law, General Ophthalmology

  • Univariate analysis of oculoplastics data
  • Protocol with no expected funding support, mentor confirmed.
  • Outcomes include number of treatments (continuous) and type (categorical). Want to compare the outcomes by various factors like location (with more than 3 categories). Can use a proportional odds test for number of treatments, and a chi-squared test for categorical outcomes. Suggest applying for a $5000 VICTR voucher for biostat support.

2019 April 10

Chelsea Isom, General Surgery

  • I have a paper that was originally designed to detect a complication rate difference in a cohort. One of my reviewers is requiring that I now perform a multivariable model to look for factors that may be associated with my outcome. I wanted help to perform a power calculation with my current dataset to see if I even have power to detect a meaningful difference in ORs with the data I have.
  • Protocol with no expected funding support, mentor confirmed.

2019 March 06

Kristina Betters, Pediatric Critical Care Medicine

  • We are doing a retrospective large database study looking at utilization of physical and occupational therapy in pediatric critical care patients. We have done some analysis on our own but want to verify with a statistician. Specifically, we are looking at differences in patients and outcomes between centers that use more PT/OT services, and have run multivariate logistic regressions for factors related to development of bed ulcers and discharge to inpatient rehab facility.
  • Protocol with no expected funding support
  • We have discussed the statistical approach for the analysis of the project. We suggest applying for a $5000 VICTR voucher, which would be sufficient for a VICTR biostatistician to complete the analysis and help with the manuscript.

2019 February 27

Christine Helou, OBGYN

  • The objective of this study is to investigate whether implementation of an enhanced recovery after surgery pathway facilitates reduced length of admission for patients undergoing minimally invasive gynecologic surgery. The study will use data retrospectively collected at our institution beginning with the implementation of ERAS in February of 2018. These patients will then be compared to controls matched by surgical procedure and surgeon who underwent surgery in the year prior to protocol implementation at our institution.

    Questions we would like to address relate to study design and controlling for confounding with variations in provider practice, compliance with the protocol, and multiple interventions implemented as part of this protocol

  • Protocol with no expected funding support, mentor confirmed.

Jackson Cabo, Urology

  • We are planning a large, prospective study assessing disparities in post-operative opiate practices. Outcomes will include keeping of leftover opiate medications beyond the post-operative period, as well as safe disposal practices. We seek to determine how these practices vary according to health literacy (as measured by BHLS scores) as well as according to the patient's place of residence. In particular we are interested in assessing how these outcomes vary according to rural/urban status, as well as distance to tertiary care center. We will be assessing these outcomes prospectively in a large cohort of general surgery and urology patients. Particular data on opiate use and disposal will be assessed via telephone interviews, with prescription data verified using the Tennessee CSMD.

    (1) How should we structure our variables and analyses to best assess how disparities in literacy and geographic area may impact opiate use and disposal practices?

    (2) In order to specifically test for how these “disparities-based factors” may impact opiate practices, what demographic variables should we be sure to control for? What sort of analysis would be optimal for asking this question?

  • VICTR Voucher, mentor confirmed.

2019 February 20

Benedicto Fernandes, Pediatric Critical Care Medicine

  • Retrospective chart review looking at the predictive value of BNP and hCRP in the prediction of readmission, heart failure, and failure to thrive in patients with congenital heart disease.
  • VICTR Biostatistics voucher, mentor confirmed.

Marcus Tan, Surgical Oncology

  • I have three tumor subtypes and I’m screening these tumor subtypes for 8-10 different markers to evaluate whether a combination of these markers can separate the tumor subtypes.

    For at least one of the markers, the expression was lost in the tumor subtypes in 60%, 10% and 0%.

    I need help with the power calculation (would 20 tumors of each subtype be sufficient?) and statistical analysis

  • VICTR Biostatistics voucher
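  • A rough way to approach the power question, assuming a pairwise two-sided Fisher exact test at alpha = 0.05 and taking the stated loss-of-expression rates as the true rates. The function names are illustrative, and the sketch ignores multiplicity across the 8-10 markers and the three pairwise subtype comparisons.

```python
from math import comb


def fisher_two_sided_p(a, n1, b, n2):
    """Two-sided Fisher exact p-value for a/n1 vs b/n2.

    Sums the hypergeometric probabilities of all tables (with the same margins)
    that are no more likely than the observed table.
    """
    m = a + b
    total = comb(n1 + n2, m)

    def prob(k):
        return comb(n1, k) * comb(n2, m - k) / total

    p_obs = prob(a)
    lo, hi = max(0, m - n2), min(n1, m)
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


def binom_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def fisher_power(p1, p2, n1, n2, alpha=0.05):
    """Exact power: total probability of all (a, b) outcomes that reject at level alpha."""
    power = 0.0
    for a in range(n1 + 1):
        for b in range(n2 + 1):
            if fisher_two_sided_p(a, n1, b, n2) <= alpha:
                power += binom_pmf(a, n1, p1) * binom_pmf(b, n2, p2)
    return power
```

    Calling fisher_power(0.6, 0.1, 20, 20) gives the power for the 60% vs 10% comparison with 20 tumors per subtype; the 10% vs 0% comparison can be checked the same way and will be much harder to detect.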

2019 February 13

Mack Goldberg, Ob/Gyn

  • I am retrospectively examining estimated blood loss of certain gynecologic procedures that do and do not use a drug called tranexamic acid. Looking for assistance on analyzing data for a case-control study
  • Protocol with no expected funding support
  • Outcomes: blood loss, significant blood loss, transfusion
  • Subjects undergoing D&E.
  • Txa, yes/no (standard dose)
  • 17 cases/control.
  • Wilcoxon test for primary outcome of blood loss
  • Could use propensity score adjustment and use a linear model to predict blood loss using txa and propensity score
  • Suggest dot plot plus boxplot to show all data points by group (txa vs. no txa)
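  • A minimal sketch of the Wilcoxon rank-sum comparison of blood loss between txa and no-txa groups, assuming two independent groups and a normal approximation with tie correction (no continuity correction). With 17 per group an exact p-value from a statistics package may be preferable; the function names here are illustrative.

```python
from statistics import NormalDist


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_rank_sum(x, y):
    """Return (U, z, two-sided p) for group x vs group y.

    Raises ZeroDivisionError if every observation has the same value.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / var_u ** 0.5
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, z, p
```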

2019 February 6

Lindsey Safley, Pharmacy

  • Venous thromboembolism (VTE) is a disorder that includes deep vein thrombosis (DVT) and pulmonary embolism (PE). Historically, patients were admitted to the hospital and treated with intravenous unfractionated heparin (UFH) or subcutaneous low molecular weight heparins (LMWH) as a bridge to therapeutic anticoagulation with an oral vitamin K antagonist (VKA) determined by the international normalized ratio (INR). An outpatient management protocol was implemented at Vanderbilt University Medical Center for low-risk patients who present to the emergency department with VTE to be discharged on apixaban. The purpose of this study is to retrospectively evaluate the safety and efficacy of using apixaban in low-risk VTE patients in an outpatient setting.

  • Applying for VICTR grant and hoping to obtain feedback/results for middle of March. Mentor confirmed.

  • Note: Tight timeline, mid-March for abstract, conference deadline, voucher will take at least two weeks.
  • Process, need analysis plan
  • Outcome: Return in 7/30 days
  • Time-to-event analysis? Do have date of return. This could be a more powerful approach.
  • Data: May want to do an audit of data to confirm data. Can do a second reviewer, and measure concordance between reviewers. ~125 chart reviews. Could review 100% of primary predictors, discrepancies adjudicated. For other predictors, could do a random audit. Randomly select charts for independent reviewers.
  • Next steps, send protocol to Kim Hart and Chris Lindsell, Chris Lindsell will be responsible for analysis. Can list biostat personnel as KSP to use identified data.

Parisa Samimi, OB/GYN

  • I would like assistance with a retrospective review examining the effect of urethral length on post-operative slings.
  • Applying for VICTR grant. Mentor confirmed.

  • Preoperative imaging measures urethral length; shorter length may lead to improved recovery/outcome.
  • Outcome: Failure within one year, binary outcome (yes/no). Time to outcome is problematic, given episodic visits.
  • Also: relationship of length to urinary retention and UTI? How to proceed? Be explicit upfront--pre-specify analysis. Flesh out other hypotheses--are the mechanisms the same or different? If not, can we answer the question? Pre-specification is key.
  • Restricted cubic splines, to see shape of curve and allow shape of relationship to vary.
  • Pr(y): logistic regression. Covariates, weight, height, (not BMI, it can be a poor summary), age?, race?, other?.
  • Next steps: send protocol to Kim Hart and Chris Lindsell, draft manuscript!
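  • A minimal sketch of the restricted cubic spline basis mentioned in the notes, assuming the common truncated-power form that is forced to be linear beyond the outer knots. The knot locations in the test call are hypothetical, and the basis is left unscaled; statistical packages typically normalize it.

```python
def rcs_basis(x, knots):
    """Return [x, s_1(x), ..., s_{k-2}(x)] for a restricted cubic spline with k >= 3 knots.

    These columns would enter the logistic model in place of a single linear
    term for urethral length.
    """
    t = sorted(knots)
    k = len(t)

    def pos3(u):
        # truncated cubic: (u)+ ** 3
        return u ** 3 if u > 0 else 0.0

    span = t[-1] - t[-2]
    basis = [x]
    for j in range(k - 2):
        term = (pos3(x - t[j])
                - pos3(x - t[-2]) * (t[-1] - t[j]) / span
                + pos3(x - t[-1]) * (t[-2] - t[j]) / span)
        basis.append(term)
    return basis
```

    With k knots the predictor uses k - 1 model degrees of freedom, so the number of knots should be pre-specified along with the rest of the analysis.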

2019 January 30

Don Arnold, Pediatrics/Emergency Medicine

  • NIH guidelines recommend %-predicted peak expiratory flow (%-PEF) or forced expiratory volume in 1-second (%-FEV1) measurement in children with acute asthma exacerbations to categorize severity (≥40%, mild-moderate; <40%, severe) and response to treatment. To our knowledge, the validity of %-PEF to predict %-FEV1 as a criterion measure of lung function and response to treatment during childhood asthma exacerbations has not been examined. We sought to examine whether %-PEF predicts %-FEV1 in children during asthma exacerbations.
    Methods: We prospectively studied children aged 5–17 years with acute asthma exacerbations in a pediatric ED. Participants performed PEF and spirometry in accordance with American Thoracic Society (ATS) standards. I anticipate including data from those with spirometry meeting ATS quality criteria in multivariable regression models to examine associations of pretreatment %-PEF with %-FEV1 and proportionate change of %-PEF with proportionate change of %-FEV1 after 2 hours of treatment. Model covariates include age, gender, race and pretreatment severity measured using the validated, 0-16 point (16 most severe) Acute Asthma Intensity Research Score (AAIRS).

    1. Appropriate inferential test(s) to examine these associations, including whether methods to examine for nonlinear associations should be used.
These analyses will be used for:
    1. Manuscript
    2. Preliminary data for R03 or R21 application
  • VICTR Biostatistics voucher, mentor confirmed
  • Have worked with Chris Slaughter in the past, on other projects
  • Have submitted to SAEM (abstract)
  • Data from K23, data previously published
  • Would like voucher
  • Modeling ability of PF to predict FEV

Jeffrey Birnbaum, Pediatrics/Emergency Medicine

  • I am having providers perform Acute Asthma Intensity Research Scores (AAIRS) on patients independently at 2 time points. I would like to look at inter-rater reliability of scores overall and by each subcomponent of score. I do not know if I will need VICTR funding for formal biostats assistance.
  • May want voucher, mentor confirmed.
  • Apply for voucher
  • Kappa, ICC may be appropriate
  • ~20 enrolled, would like sample size calculation, and assistance with analysis plan.
  • IRB exempt, limited data collection
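  • A minimal sketch of an unweighted Cohen's kappa for one categorical subcomponent scored by two raters on the same patients. This is an assumption about how the data are laid out; for the total 0-16 score, a weighted kappa or ICC would likely fit better. The function name is illustrative.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    Raises ZeroDivisionError when chance agreement is 1 (both raters used a
    single, identical category throughout).
    """
    n = len(rater1)
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    expected = sum(c1[c] * c2[c] for c in c1) / n ** 2
    return (observed - expected) / (1 - expected)
```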

2019 January 23

Austin Adair, Pediatric Critical Care Medicine

  • Discuss potential analysis strategies of data set of transport.
  • VICTR Biostatistics voucher, mentor confirmed

Yuxi Zheng, Ophthalmology

  • Outcomes for long-term followup of surgical correction of head positioning (ordinal variable) and strabismus (continuous variable) associated with infantile nystagmus syndrome.
  • Question: I would like help interpreting the results of multiple regression.
  • My mentor was present during first joint meeting with James Law, Cathy Jenkins, and Li Wang on 11/7.

2019 January 16

Jillian Hayes, Pharmacy

  • I am currently working on a project involving the impact of feedback on the prescribing habits of physician assistants and nurse practitioners in the outpatient Vanderbilt Health at Walgreens clinics. We are attempting to determine the best way to analyze the change in prescribing rates over time. We have a few ideas, but wanted confirmation on the best way to do this.
  • Abstract, mentor confirmed

Joshua Bland, VUSM

  • Chart review of mental/behavioral health patients who present to VCH ED over a 1 year period. Data are collected. Outcomes include diagnosis subgroup, disposition, and length of stay. We have performed linear and logistic regressions and would like feedback about how best to refine these analyses to our data.
  • Abstract, mentor confirmed

2019 January 9

Austin Adair, Pediatric Critical Care Medicine

  • Discuss with Li Wang analysis of data pertaining to obesity in single ventricle patients undergoing bidirectional Glenn.
  • VICTR Biostatistics voucher, mentor confirmed

Alexander Hawkins, General Surgery

  • The number one driver for readmission and overall health care utilization in patients with a new ileostomy is dehydration. The early stages of dehydration are difficult to assess. We want to see if the use of at home urine osmolarity testing via dipsticks would decrease health care utilization (the exact outcome measure is up for discussion, I was thinking either a composite count of ED visit, readmission, clinic visit versus a count of days of ED visits, readmission length of stay and clinic visit). I am attaching a brief research overview of the project.

    Biggest question revolves around the randomization. It will be far easier for staff to randomize by month (or even week) rather than by patient. I want to understand the ramifications of this.

    Would also like to discuss power calculations. Current rate of health care utilization is around 25%. We would look to halve that.

  • VICTR Biostatistics voucher, independent investigator
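  • A minimal sketch of the power question, assuming the outcome is treated as a binary "any health care utilization" (25% vs 12.5%), a two-sided alpha of 0.05, and the usual normal-approximation formula for two proportions. Randomizing by month or week behaves like cluster randomization, so the sketch also shows the standard design-effect inflation; the cluster size and ICC passed to it would be assumptions that need justification.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two proportions (individual randomization)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    num = (z_a * sqrt(2 * p_bar * (1 - p_bar))
           + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(num / (p1 - p2) ** 2)


def design_effect(cluster_size, icc):
    """Variance inflation when randomizing clusters (e.g. months) instead of patients."""
    return 1 + (cluster_size - 1) * icc


def n_per_group_clustered(p1, p2, cluster_size, icc, alpha=0.05, power=0.80):
    return ceil(n_per_group(p1, p2, alpha, power) * design_effect(cluster_size, icc))
```

    If the count-based outcome is chosen instead, a different calculation (for example one based on a count model) would be needed.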

2018 December 19

Ryan Hsi, Urology

  • Sample size calculation
  • Planning a prospective RCT on patients undergoing ureteroscopy, randomizing to stent or no stent. The main outcome is 30-day complications. The literature indicates the outcome has a ~15% rate. There have been two somewhat similar studies from 2001 and 2007 to help estimate SD; need help determining the best way to calculate sample size.

2018 December 12

Justin Banerdt, Department of Medicine, Division of Pulmonary and Critical Care Medicine, CIBS Center

  • Our project is a prospective cohort study of delirium incidence and outcomes at a resource-limited referral hospital in Lusaka, Zambia with a high burden of critical illness and HIV. 820 new medical and surgical patients were evaluated for delirium using the brief confusion assessment method (B-CAM). 28-day and 6-month outcomes include survival and functional status. The primary questions we would like to address include: What is the prevalence of delirium and risk factors for delirium in this cohort? Is delirium an independent predictor of mortality and long-term functional impairment in this cohort?
  • Outcome: VICTR Voucher

Sydney Payne, Plastic Surgery

  • Clinical trial investigating traditional surgical follow up vs. no follow-up. Would like to discuss general approach to analysis based on data that we are collecting.
  • Outcome: Protocol with no expected funding support

2018 December 5

Kelli Rumbaugh, Pharmacy/ Surgical ICU

  • We are examining the incidence of acute kidney injury among SICU patients who received vancomycin (+) zosyn, vancomycin (+) cefepime, vancomycin (+) levofloxacin, or vancomycin (+) meropenem. I would like help on the best statistical tests to conduct to detect differences among the four groups.
    • Outcome: VICTR Voucher

Ryan Brown, Allergy, Pulmonary and Critical Care Medicine

  • We are comparing the use of central catheters and vasopressors 6 mo before and 6 mo after the institution of a protocol within the MICU to allow for peripheral administration of vasopressors. Primary outcome is time from ICU admission to vasopressor initiation. Secondary outcomes are safety outcomes. Cohort - patients receiving vasopressors within 24 hours of ICU admission.
    • Outcome: VICTR Voucher
    • Mentor confirmed

2018 November 28

Yuxi Zheng, Ophthalmology

  • Outcomes for long-term followup of surgical correction of head positioning (ordinal variable) associated with infantile nystagmus syndrome

    Question: Best statistical test for evaluation of improvement of head positioning (ordinal variable) post-operatively and over time. Is ordinal logistic regression appropriate? I appreciate your help and guidance!

    My mentor was present during first joint meeting with James Law, Cathy Jenkins, and Li Wang on 11/7.

2018 November 14

Sabina Dang, Otolaryngology

  • Social determinants of health in a population of airway stenosis patients. I would like to address multivariate analysis.
  • Outcome: Other
  • Mentor confirmed

James Law, Vanderbilt School of Medicine; Vanderbilt Eye Institute

  • Attended clinic 11/7, want to attend on 11/14 as follow-up. Mentor will not attend (did attend 11/7)
  • Outcome: Protocol with no expected funding support

2018 November 7

James Law, Vanderbilt School of Medicine; Vanderbilt Eye Institute

  • Have collected data for medical student research project - hoping to present this as a retrospective interventional comparative case series, looking for general advice for analyzing the data.
  • Outcome: Protocol with no expected funding support
  • Mentor confirmed

2018 October 31

Rachel Forbes, Surgery-renal transplant

  • Learning curve for vascular anastomosis based on times for various residents
  • Outcome: Abstract

2018 October 10

Christine Helou, OBGYN

  • Study looking at effect of BMI on endometrial hyperplasia/malignancy in premenopausal women. Can we identify a specific BMI cutoff that would warrant biopsy? Study: retrospective cohort comparing obese premenopausal women who had endometrial sampling to non-obese premenopausal women. Would like to ID cases using SD. Question for clinic: is this feasible? Would like help calculating sample size of cases needed.
  • VICTR voucher

Judd Heideman, Internal Medicine

  • Assistance with development of logistic regression to assess whether periprocedural hypoxemia causes in-hospital mortality, from database of 1000 hospital intubation procedures
  • VICTR voucher

2018 October 3

Svetlana Avulova, Urology

  • Obesity paradox and its impact on cancer related mortality has been previously observed
    • Obesity is associated with aggressive prostate cancer (increases risk of high-grade CaP) and worse long term outcomes
    • In localized prostate cancer, obesity is associated with greater risk of aggressive pathology and worse overall survival
    • In metastatic castrate resistant prostate cancer, obesity is associated with improved overall survival and prostate cancer specific survival (Halabi et al. 2007)
    • In nonmetastatic CRPC, it is associated with reduced all cause mortality (ie improved overall survival) but not prostate cancer specific mortality (Vidal et al. 2018)
    • In metastatic HSPC, BMI is associated with improved OS (Montgomery et al. 2007)
    We aimed to identify if sarcopenia (muscle mass < 5.5 cm2/m2) rather than obesity (BMI ≥ 30kg/m2) is associated with overall survival in men with mPCa/CRPC. Our goal is to elucidate whether muscle mass independently predicts overall survival by utilizing a validated software package provided by the Diet, Body Composition and Human Metabolism Core led by Dr. Silver to measure the muscle mass index. In addition, we will incorporate the Charlson Comorbidity Index in our multivariate analysis as patient comorbidity may predict overall survival regardless of muscle mass or BMI.
  • VICTR voucher

2018 September 26

Rosemarie Dudenhofer, Medicine / Allergy, Pulmonary, and Critical Care

  • We want to assess if there are benefits of positive airway pressure (pap) in the lung transplant population:

    • Primary Endpoint: whether or not pap decreases the incidence of transplant rejection and whether pap improves graft survival
    • Secondary Endpoint: Prevalence of OSA in the lung transplant population
      Questions: how many patients should we plan to enroll in the 3 arms:
      a. no OSA with no pap tx (control)
      b. no OSA with pap tx
      c. OSA with pap tx
      Question: ideal length of follow up to determine endpoints.
  • VICTR voucher

Ehtesham Khalid, Neuromuscular Medicine

  • Retrospective data analysis for IVIG and PLEX treatment

  • VICTR voucher

2018 September 19

Lauren Schmidt, Pharmaceutical Services

  • The objective of this retrospective review is to examine whether intravenous lipid emulsion(s) (ILE) from either propofol or a parenteral nutrition source is associated with adverse effects on outcomes in critically ill patients. We are interested in comparing the incidence of infection, hospital and ICU lengths of stay, ventilator-free days, and mortality rates between patients who received ILE and patients who did not.
  • VICTR voucher

Chi Le, OB/Gyn

  • The project aims to identify drivers of 30-day unplanned readmissions after laparoscopic hysterectomies. I have done some preliminary analysis in R, and we would like to receive some input from Biostatistics. I will send the R code and de-identified data to the clinic email a few days in advance of the meeting.
  • VICTR voucher

2018 September 12

Timothy Hopper, Critical Care Medicine

  • Proof-of-concept for new medical device, seeking to compare device measurements to clinical gold standard for intravascular volume status: PCWP during RH catheterization.
  • Outcome: abstract

Rob Freundlich, Anesthesiology

  • K23 grant application
  • Outcome: grant

2018 August 22

Lindsey Safley, Emergency Medicine/ Pharmacy

  • What are the safety and efficacy outcomes of patients discharged from the ED on apixaban for VTE in a real-world setting?
  • Retrospective study of patients discharged from the ED from June 2016 –June 2018
  • VICTR voucher

Ryan Stark, Pediatric Critical Care

  • We have submitted a small descriptive paper on 15 subjects that suggests a novel finding. The reviewer requested a “power calculation” even though this is not a prospective study. I would like some input on whether they are just asking us to provide an effect size for our finding, and whether there is a way to do a reverse power calculation to estimate effect.
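
One way to read the reviewer's request is as a sensitivity calculation: hold the sample size fixed at 15 and solve for the smallest standardized effect the study could have detected. A minimal Python sketch, assuming a single-group or paired comparison, two-sided alpha of 0.05, 80% power, and a normal approximation. None of these choices come from the notes, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(n, alpha=0.05, power=0.80):
    """Smallest standardized effect (Cohen's d) detectable with n subjects.

    Assumes a one-sample or paired design and a normal approximation,
    which is somewhat optimistic for small n (a t-based calculation
    would give a slightly larger value).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return (z_alpha + z_power) / sqrt(n)


d = min_detectable_effect(15)
print(f"n = 15: minimum detectable d is about {d:.2f}")
```

Reporting the observed effect size with a confidence interval alongside this number is usually more informative than a post hoc power figure computed from the observed effect.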

2018 August 15

Shayan Rakhit, Trauma

  • We are evaluating educational outcomes of a School of Medicine course [ISC: Injury, Repair, Rehabilitation] that primarily deals with trauma. We have Likert scale before/after data for the course as well as numeric and qualitative assessments of the students. We would like to meet in order to develop an analysis plan.
  • VICTR voucher

2018 August 8

Nicholas Kavoussi, Urology

  • Retrospective study of SD data to identify patients who have had kidney stone surgery and determine whether routine imaging is helpful in preventing additional stone events as well as optimal modality.
    PI for project Ryan Hsi met with Dr. Shyr to discuss the above.
  • Have discussed the project. The aim is to look at the association between post-op imaging and 5-year recurrence after kidney stone removal surgery.
  • Recurrence is defined as ER visits with pre-defined CPT codes, or ER visits with documentation of "stone", or surgery including pre-defined CPT codes.
  • Want to compare imaging pattern (frequency and timing) between patients with and without recurrence.
  • $5000 biostatistics voucher is suggested.

2018 August 8

Maya Yiadom, Emergency Medicine

  • I’m looking for the most appropriate way to analyze concordance (or disagreement) in a sample of 69 patients. We collected EKG results (a test) for each of these patients and documented the physician’s clinical interpretation (ResGMO _Read_Final) using a coded rubric (ResGMO _Read_Coded). We then had experts review the same EKGs (VUMC_Attd_Final) and provide an interpretation as part of a quality assurance pilot using the same rubric (Att_Final_Coded). The coding rubric includes 47 unique diagnoses (coded 1-47). From manual assessment in Excel, we found the interpretations disagreed 49.3% of the time, but are looking for a test statistic to replicate this that will give us an estimate of variation (SD) or confidence (CI).
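
The disagreement rate is a simple proportion, so one option is to report it with a binomial confidence interval. A minimal Python sketch using the Wilson score interval. The count of 34 disagreements is inferred from 49.3% of 69 and is an assumption, as is the choice of the Wilson method; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Assumed counts: 34 of 69 interpretations disagreed (34/69 = 49.3%).
disagreements, n_patients = 34, 69
low, high = wilson_interval(disagreements, n_patients)
print(f"Disagreement: {disagreements / n_patients:.1%} "
      f"(95% CI {low:.1%} to {high:.1%})")
```

This describes raw disagreement only. A chance-corrected statistic such as Cohen's kappa would answer a different question and may be unstable with 47 categories and 69 patients.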

2018 August 1

David Isaacs, Neurology

  • Deep brain stimulation (DBS) is an FDA-approved treatment to address uncontrolled motor symptoms in Parkinson’s disease (PD). Vanderbilt University Medical Center is one of the highest-volume DBS centers in the country. From 2007 through October 2017, 265 Parkinson’s disease patients underwent DBS implantation at Vanderbilt, with electrodes placed in one of two anatomical targets: 168 in STN and 97 in GPi. Pre-operatively, each patient is extensively evaluated with a battery of validated motor, cognitive, and mood instruments. The majority of these patients continue following with their Vanderbilt neurologist. In an attempt to capture longitudinal outcomes in this population of interest, we will recruit all PD patients two years or more status post DBS who are receiving regular care at Vanderbilt University Medical Center. Study participants will undergo a condensed evaluation of motor function (Unified Parkinson’s Disease Rating Scale Part III), cognitive performance (Montreal Cognitive Assessment), mood (Beck Depression Inventory), and quality of life (Parkinson’s Disease Questionnaire-39). These results will be compared to baseline measures performed pre-operatively, allowing for assessment of interval change. STN and GPi DBS patients will be analyzed separately. Goals of attending the Biostatistics Clinic are as follows:
    – appropriateness of selected statistical methods, as enrolled patients will have variable follow-up time
    – guidance for developing a regression model to address confounding variables (disease duration, age of surgery, etc)
    – methods for addressing baseline differences between comparison groups (STN and GPi groups)
    – feasibility / reasonableness of VICTR voucher for ongoing statistical assistance
  • Design complete but no enrollment/data collection
  • Has a total of ~250 patients who underwent DBS surgery. Estimated ~150 patients to be enrolled in the study.
  • The goal of the study is to compare long term longitudinal outcomes between two groups.
  • $5000 VICTR biostat support is suggested.

Ashley Nassiri, Otolaryngology

  • We suspect that there are specific proliferative factors that can be identified in vestibular schwannoma histology that can be indicative to future growth rates thus impacting the clinical decision to radiate postoperatively. This project involves a retrospective review of pathology slides (with new staining for proliferative factor) correlated with growth noted on imaging postoperatively.
  • Questions for session: 1) Power analysis for estimate of needed participants & planning for statistical analysis. 2) Stats budgeting for funding request.
  • Tumor cells were not removed completely to avoid nerve damage. Some patients will undergo radiation after surgery. However, radiation can cause complications.
  • The goal of this study is to look for clinical factors that can predict tumor growth, which can help decide whether the patient should be given radiation or not.
  • MRI was performed prior to surgery, immediately after surgery, and 1 year and 2 years after surgery. Growth of the residual tumor can be assessed from MRI scans.
  • The marker of interest is the Ki-67 index. Can fit separate linear models for 1 year and 2 years post surgery.
  • $5000 VICTR biostat support is suggested.

Robert Yawn, Otolaryngology

  • Same patient cohort as above, but look at facial function at 1 year post surgery. Facial function is on a 1-6 scale, where 1 means normal and 6 means complete paralysis.
  • The clinical factor of interest is the time the patient's glucose spent within a certain range. Other confounders are available, including tumor size, age, comorbidity, etc.
  • Suggested ordinal logistic regression model.
  • $5000 VICTR biostat support is suggested.

2018 July 25

Diane Haddad, Surgery

  • Health inequity in the US is seen and documented across all healthcare fields with advances in medicine not experienced by all racial and ethnic groups. Outcomes in critical illness are no exception with documented disparities existing in severity, mortality and hospitalization rates attributed to race, access and poverty. We designed a study, using the BRAIN-ICU cohort to evaluate effects of socioeconomic and insurance status on in-hospital delirium and long-term cognitive impairment. Each participant in the BRAIN-ICU cohort had a socioeconomic score geocoded using census data and zip code. We similarly have data on insurance status and post-hospital disposition. We plan to look at effects of socioeconomic status, insurance and post-critical illness disposition on duration of in-hospital delirium and long-term cognitive impairment. Other covariates we plan to include in our model include age, race, sex, education level, co-morbidities (Charlson, Framingham), disease severity (APACHE, SOFA, sepsis, hypoxemia, coma), benzos/opioid equivalents, EtOH abuse. Our hypothesis is that socioeconomic and insurance status will have no significant effect on in-hospital delirium but a significant effect on LTCI and post-hospital disposition.

2018 July 11

Kala Dixon, Surgery

  • The study will consist of participants undergoing RYGB surgery at the surgical weight loss clinic at Vanderbilt. To cover surgical costs, Vanderbilt accepts Aetna, Cigna, and Blue Cross Blue Shield. Aetna requires pre-op patients to go through an interprofessionally driven program in which they meet with the surgeon/nurse practitioner and dietitian. They meet with this team once a month every 90 days before surgery begins. Blue Cross Blue Shield only requires 6 months of uniprofessional intervention prior to surgery, consisting of only a primary care office visit. Therefore, once IRB approval has been obtained, I will receive that list of patients, contact them, and ask, via telephone, if they would be interested in taking part in a telephone survey (Short Form 36 and a diet habit survey). This initial contact will provide the baseline survey results. Following the telephone survey they will undergo their intervention method, and will be re-contacted right before surgery begins to retake the survey. The goal is to compare the quality of life survey results between the 2 groups: the control group (uniprofessional patients, who are only required to have a pre-op primary care office visit) and the experimental group (those who undergo the interprofessional intervention with a surgeon/nurse practitioner/dietitian). The aim of the project is to see whether the change in quality of life differs between those who undergo Vanderbilt’s interprofessional intervention pre-operatively and those who only undergo uniprofessional intervention pre-operatively.
  • Questions:
  • 1) How to conduct the power analysis to determine the number of participants needed to conduct the project.
  • 2) The current statistical methods that will be proposed to analyze the results are: t-test, multiple linear and logistic regression. Are these the best methods to analyze comparative results and outcome improvements?
  • 3) What statistical data do we need to collect to determine study significance?
  • Design complete but no enrollment/data collection

2018 June 27

James Patrinely, Plastic Surgery

  • Our prospective, randomized controlled trial compared pain outcomes using the visual analog scale (1-10) for two types of injections for trigger finger (steroid + saline vs steroid + lidocaine). We have completed patient enrollment and need recommendations for final data analysis.
  • Study design is non-inferiority.
  • Data have been unblinded and reviewed, final analysis not complete. Data are not normally distributed. Suggest use of median and IQR in table 1.
  • Discussed if there is a need to revise power calculation, given that data collected have a different SD than was estimated. There is no need to revise, but could discuss implications of a larger sample in the manuscript.
  • Suggest Wilcoxon test for primary outcome and Fisher's exact (or chi-square) test for secondary outcome of treatment efficacy.
  • Suggest visual display of data in a strip chart.
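
The Wilcoxon comparison suggested above can be sketched in standard-library Python. This version uses the normal approximation with a tie correction and no continuity correction, so it will differ slightly from exact results in statistical packages. The pain scores are made up for illustration and are not study data; the function and variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def midranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def wilcoxon_rank_sum(x, y):
    """Two-sided Wilcoxon rank-sum test. Returns (U statistic for x, z, p)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = midranks(combined)
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t**3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every observation identical: no evidence of a difference
        return u, 0.0, 1.0
    z = (u - n1 * n2 / 2) / sqrt(var_u)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u, z, p


# Made-up VAS pain scores for illustration only.
steroid_saline = [2, 3, 3, 4, 5, 6, 7]
steroid_lidocaine = [1, 1, 2, 2, 3, 4, 4]
u, z, p = wilcoxon_rank_sum(steroid_saline, steroid_lidocaine)
print(f"U = {u}, z = {z:.2f}, two-sided p = {p:.3f}")
```

Because the design is non-inferiority, the final analysis would also need a confidence interval for the group difference compared against the pre-specified margin; this sketch covers only the rank test itself.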

2018 June 13

Justin Shinn, Otolaryngology

  • "Endoscopic evaluation of critically ill patients who are extubated in the ICU to determine if they have acute laryngeal trauma. Binary outcome, also with ordinal values within acute laryngeal injury. Phone follow up data at 3 months using a voice and breathing survey to begin assessment of longterm outcomes. Data collection is complete"
  • Justin's primary question of interest is describing the incidence of laryngeal trauma in patients who were intubated. We discussed providing an overall estimate as well as among those who were intubated multiple times.
  • We also discussed logistic regression (gives odds ratios) versus a modified Poisson model (gives relative risks) for investigating the risk factors of trauma. We discussed creating a list of clinically important predictors and ranking them in order of importance. The complexity of the model that can be fit is driven by the number of non-trauma patients in this case (~40 without trauma vs ~60 with trauma). The rule of thumb is 1 parameter can be estimated for every 10-20 events. For continuous variables without splines and dichotomous variables, 1 parameter is estimated. If splines are fit with continuous variables, then # knots - 1 are the number of parameters estimated for that variable. For categorical variables with more than 2 levels, # levels - 1 parameters are estimated.
  • For univariate tests of association, the Wilcoxon Rank Sum test is preferred over a t-test for continuous variables. For categorical variables, the Pearson chi-square test is appropriate.
  • For particular types of injury, rather than fitting some kind of model, it was suggested to graphically display or illustrate in tables the distributions.
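
The parameter-counting rule above can be turned into a quick budget check before choosing predictors. A minimal Python sketch, using the approximate counts from the notes (~60 with trauma, ~40 without) and the 10-20 events-per-parameter rule of thumb. The function names and the candidate model are hypothetical.

```python
def parameter_budget(n_events, n_nonevents, events_per_param=(10, 20)):
    """Range of parameters a binary-outcome model can support.

    The limiting sample size is the smaller outcome group; the rule of
    thumb allows 1 parameter per 10-20 observations in that group.
    Returns (conservative, liberal) parameter counts.
    """
    limiting = min(n_events, n_nonevents)
    liberal_ratio, conservative_ratio = events_per_param
    return limiting // conservative_ratio, limiting // liberal_ratio


def parameters_used(n_linear_continuous=0, n_binary=0,
                    spline_knots=(), categorical_levels=()):
    """Count parameters: 1 per linear continuous or binary variable,
    knots - 1 per splined variable, levels - 1 per categorical variable."""
    return (n_linear_continuous + n_binary
            + sum(k - 1 for k in spline_knots)
            + sum(levels - 1 for levels in categorical_levels))


low, high = parameter_budget(n_events=60, n_nonevents=40)
print(f"Budget: {low} to {high} parameters")

# Hypothetical candidate model: one 4-knot spline, one binary variable,
# and one 3-level categorical variable.
needed = parameters_used(n_binary=1, spline_knots=(4,), categorical_levels=(3,))
print(f"Candidate model needs {needed} parameters")
```

With these counts the hypothetical candidate model exceeds the budget, which is why ranking predictors by clinical importance beforehand matters.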

2018 June 6

Jeff Heimiller, Emergency Medicine

  • "We built a cricothyrotomy model that we used to test our EM residents on. We wanted to see if it could differentiate between the novice and more experienced residents."

2018 June 6

Austin Adair, Pediatric Critical Care

  • "We need help in initiating a study involving pre- and post-utilization of electrolyte monitoring using a specific machine for retrospective study. We need help determining effectiveness based off of multiple patient variables."

2018 May 23

Lara Harvey, OB/GYN

  • "VICTR voucher."

2018 May 16

Nicola White, OB/GYN

  • "Comparison of outcomes for women who have primary c sections vs higher order lacerations after delivery. We would like assistance with creating a case matched comparative group."
  • Women with 1st pregnancy are eligible. There are ~10,000 women with C-section and ~500 women with higher order lacerations. Primary outcome is depression scale and want to compare depression between two groups.
  • Chart review is labor-intensive, so they are planning propensity matching on the 500 cases. Will check what information is available for the matching.

2018 April 25

Joy Carroll, Ophthalmology

  • "I have a dataset with pre-operative factors and post-operative outcomes for cataract surgery. I would like to compare several measures and outcomes between a group taking a type of medication and everyone not taking the medication to determine whether the outcomes vary. I used a student T test for a smaller version of this study that looked at one factor but not the overall outcomes. I would like to discuss which statistical tests to use and what programs may be acceptable (i.e. may I continue to use excel, or would I need to learn R?)"

2018 April 18

Luis Huerta, Pulmonary/Critical Care

  • "I am attempting to validate an electronic severity of illness calculator (SOFA score) versus the gold standard of manual chart review. There are 6 components to the SOFA score, and each is on an ordinal scale of 0-4 (for a total of 24 possible points). My primary question for the clinic relates to a secondary analysis: I am attempting to determine the best statistical method to use to compare the agreement between manual and electronic collection of the six individual components, as 1-2 components are likely to be more difficult to collect electronically. Data collection is complete."
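
For ordinal 0-4 component scores, one commonly used agreement statistic is weighted kappa, which penalizes large disagreements more than near misses. The notes do not record which method the clinic recommended, so the following Python sketch is only one possible approach. The scores are invented for illustration and the function name is hypothetical.

```python
def weighted_kappa(rater_a, rater_b, categories, weight="linear"):
    """Cohen's weighted kappa for two sets of ordinal ratings.

    weight="linear" penalizes by |i - j|; "quadratic" by (i - j)^2.
    Raises ZeroDivisionError if expected disagreement is zero (both
    sources always give the same single category).
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def penalty(i, j):
        d = abs(i - j) / (k - 1)
        return d if weight == "linear" else d**2

    obs = sum(penalty(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(penalty(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    return 1 - obs / exp


# Invented scores for one SOFA component (0-4), for illustration only.
manual = [0, 1, 1, 2, 3, 4, 2, 0, 1, 3]
electronic = [0, 1, 2, 2, 3, 3, 2, 0, 0, 3]
kappa = weighted_kappa(manual, electronic, categories=[0, 1, 2, 3, 4])
print(f"Linear weighted kappa: {kappa:.2f}")
```

Running this once per component would show which components agree poorly; a cross-tabulation of manual versus electronic scores for each component is a useful companion to the single summary number.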

James Zhang, Surgery

  • "Our project investigates the incidence of new onset orthostatic intolerance after bariatric surgery. We are also interested in whether onset of these symptoms is related to weight loss."

2018 April 4

Muhammad Aanish Raees, Pediatric Cardiac Surgery

  • "We are applying for a VICTR grant for the project titled: Evaluation of different patching materials for vascular reconstruction in surgical repair of congenital heart disease and clinical outcomes. I would like to discuss the statistical plan for this study, and how many statistical hours this study will require."

2018 April 4

Andrew Medvecz, General surgery

  • Previous clinic notes
  • "The project is a retrospective cohort study evaluating long-term outcomes (readmission in particular) of small bowel obstruction, comparing operative versus non-operative management. We plan to apply to VICTR for biostatistical voucher support as well as data support. We would like biostatistical support on the project and are hoping to meet with biostatisticians that would be associated with the analysis if VICTR funding is granted. In particular, we would like assistance with developing an analysis plan."
  • One suggestion: multi-state models/competing risks, because patients can move between surgical and non-surgical management if they have recurring SBO (which many patients do, in highly varying numbers); could also do survival analysis with recurring events
  • Hospitalizations and surgeries can be recurring events (at different hospitals, though having THA data will be very helpful here)
  • Max of ten years followup on ~15,000 SBOs (2007-2009); estimate 3:1 nonoperative vs operative (no data yet on individual patients, but is coming)
  • This is a challenging statistical problem; Li Wang and Chris Lindsell (VICTR statisticians) will work on analysis plan for VICTR application
  • Second aim: for an individual hospitalization, how likely are patients to be readmitted? other outcomes? This gets extra complicated because of not only bounce-backs, but "bounce-aways" who go to a different hospital

2018 March 21

Katherine Riera, General surgery

  • "Peds trauma QI project. Will be large database before/after QI interventions. Will be patient demographics, outcome measures, and trauma activation information. Also will be Redcap staff survey data. Planning to apply for VICTR funding and needed to come to a clinic prior."
  • Retrospective data: possible to collect data on level I/II trauma cases, 2015-2018 (~1500 total, with ~360 level I); main outcomes are times to events (time in trauma bay, time to intubation, etc), which will need to be extracted by hand. Recommend getting data from roughly the same seasonal time period as prospective data (eg, April-October both years), since trauma rates vary a lot seasonally.
  • Mortality rate is unknown but low
  • Interventions (all done together, "on" clinical staff, not patients): heads up, prebrief, wall poster with relevant components; will begin after IRB approval, then post-intervention data collection will begin a month later (likely ~280 patients in those six months)
  • Times are available both via trauma flowsheet and video recordings
  • 15-question pre-intervention survey to ~150 staff will be done once IRB approves study (before intervention); post survey done later; goal is to describe staff satisfaction with "how things are going" for various types of patients (some questions are VAS; demographics are categorical). Shoot for 90% response rate; might want to do a simple random sample of staff population to be able to better target followup efforts.
  • Send current protocol to; Li (VICTR statistician) can help draft a statistical analysis plan for application
  • Plan to apply for VICTR voucher for biostatistical support. $5000 is suggested.

Josh Latner, Emergency Medicine/Medical Student

  • "Survey of medical students on their opinions/attitudes towards point-of-care ultrasound. I would like to discuss the best ways to create a summary score for a series of Likert-type questions and ways to validate it." (mentor will be attending; data collection completed in REDCap)
  • Creating a summary statistic won't be very helpful here - what does a "score" of 10 mean? Validating an instrument will be a long process; not within the scope of the descriptive study.
  • Response rate about 56%; good to provide information about/compare nonresponders to survey responders (whatever is available) to help determine whether there is likely to be bias in the observed results vs actual views of population.
  • Stacked barcharts can be hard to compare across groups (if comparison is what the goal of the visualization is); consider side-by-side barcharts instead (position_dodge in ggplot2)

2018 March 7

Jackson Cabo, medical student/Urology

  • "I am working on a quality improvement study looking at post-operative disposal of opiates as well as prescribing practices. I have already performed preliminary analysis, consisting primarily of Fisher’s exact tests and Kruskal-Wallis tests. I only require verification that my code in R looks correct as we plan to submit an abstract soon."
  • Rather than binary classification, preserve as much information as possible. eg, rather than "keeper" vs "non-keeper," use "used all of prescription," "disposed of leftovers properly," "kept leftovers."
  • Descriptive statistics for, say, morphine equivalents left over after postop, will give you a lot of bang for your buck.
  • Consider interrupted time series for continuing PDSA interventions.
  • No p-hacking :)

Christodoulos Kaoutzanis, Plastic Surgery

  • "The purpose of this study is to compare the different autologous fat grafting techniques Telfa Rolling and REVOLVE System in patients undergoing postmastectomy breast reconstruction. Data has been collected and analyzed but we would like to do a multivariate analysis to identify risk factors for 2 of our outcomes and we need assistance with that analysis."
  • Primary outcomes = differences in number of times surveillance imaging performed (secondary/exploratory), amount of fat necrosis
  • Techniques are not randomized; Telfa was used most commonly earlier, REVOLVE more common over the last few years, but Telfa still used sometimes (mostly chosen by surgeon preference)
  • All cases are from a single surgeon, which could confound results (surgeons hopefully perform better over time; does REVOLVE look better just because of surgeon experience?). This is a straightforward procedure and the surgeon had prior experience, so hopefully not a huge factor, but definitely a limitation.
  • Data is currently recorded as number of breasts (~188 pre, 131 post) over four years; important to recognize that there are some within-patient factors that will lead to correlation between the same patient's breasts
  • Plan to apply for VICTR voucher; recommend summarizing project (mini-protocol; manuscript shell even better) and contacting Chris Lindsell/Li Wang with VICTR biostats to work on analysis plan prior to voucher application
  • Chris suggested interrupted time series

2018 February 28

Aaron Bolduc, surgery/GI-Lap -- Cancelled

  • "Followup for my statistics results based on the 2/21/18 clinic plan."

2018 February 21

Aaron Bolduc, surgery/GI-Lap

  • "Bowel length study on outcomes for bariatric surgical patients. Does bowel length correlate with height, weight, age, sex, or race? Are the groups of data recorded by two different observers homogeneous?"
  • Four surgeons measured bowel length during separate procedures; total N is currently about 300. Two surgeons account for vast majority of cases.
  • Will not be able to discuss in terms of interrater reliability/agreement, since only one surgeon measured each bowel. Goal is to see if distributions are roughly equivalent between the surgeons.
  • All gastric bypass patients are currently treated the same way - bypassing 150cm (for patients with BMI<45) or 200cm (others) of small bowel. Eventual goal (much later) is to determine whether outcomes would improve if amount of small bowel bypass is specified by patient characteristics. Intermediate goal (later) is to determine whether bowel length is associated with surgery outcomes under current practice. Current goal: is bowel length associated with baseline/demographic characteristics?
  • Suggest multivariable regression model with bowel length (which is normally distributed, wonder of wonders) vs patient characteristics of interest listed above.
  • Also interest in describing characteristics of two surgeons' patient populations; hope is to be able to state that patient characteristics don't differ strongly by surgeon.
  • Plan is to apply for VICTR voucher; this will fit within the 90-hour time frame, and plan is for a manuscript. Strongly suggest creating a data dictionary to go along with deidentified data, including variable definitions, coding (eg 0 = male, 1 = female), other variable info. (No time for VICTR prior to next Thursday's abstract deadline, but feasible for June conference/manuscript.)
  • If there is strong interest in interrater reliability, could do a small subsample (10-20%?) of patients who get measured by both surgeons; this adds time and logistical complexity. Important to make sure each surgeon is blinded to the other's measurements.

2018 February 14

Chelsea Isom, resident, General Surgery

  • "Retrospective chart review trying to answer the question: Do ipsilateral central venous ports in breast cancer increase the risk of complications? Question: I wanted to make sure that I did my power calculations in the PS software correctly."

2018 February 7

Christin Giordano, resident, Department of Medicine

  • "We will be trialing a new rounding method on 2 of the 5 general internal medicine resident teams and measuring burnout as well as percent early discharges and conference attendance (comparing the intervention teams vs. those with usual rounding plans). We can bring our survey questions but one is the Maslach burnout scale which has its own scoring. We would like help developing a statistical plan. Ultimately, we would like to qualify for a voucher for VICTR funding for biostatistics and Dr. Lindsell said this was the first step."
  • Total of ~60 people, 24 in intervention teams and 36 in control teams.
  • Recommend trying to do burnout scale at beginning and end of rotation (two-week period); depending on barriers, may or may not work out well, but if cost is acceptable, doesn't hurt to try and could definitely help when analyzing and discussing results/limitations.
  • Primary analysis should not categorize subscores; this will result in less power and ability to see a difference between groups.
  • Potential confounders: prior rotation (but in such a small group, this could potentially identify people - get as much detail as possible while preserving anonymity)
  • Plan to collect data in REDCap; suggest REDCap clinic to help with survey-specific design questions
  • "Early discharge" will be measured only for patients who are discharged alive from the hospital (patients who are transferred, etc will not count toward this)
  • For burnout scale, it'll depend on the distribution of the data, but nonparametric Wilcoxon test is likely a good idea (or proportional odds logistic regression for multivariable analysis, adjusting for baseline, if possible)
  • This should fit into a standard 90-hour VICTR project

2018 January 24

Jackson Cabo, medical student

  • "From my prior statistics clinic visit: “I am doing a project with Dr. Bailey (Surgical Oncology) regarding healthcare disparities in colorectal cancer and how they may vary depending on setting of care (i.e. type of treating hospital). A Cox Proportional Hazard Regression will be used to estimate the effects of race, income status, and insurance status in the context of hospital facility (type) on overall survival.”
  • This will be my second visit to statistics clinic for assistance with my project. I would like some assistance with preliminary data analysis in order to aid in my preparation for presenting my work at the U54 Cancer disparities symposium at the beginning of next month. I would like assistance with selection of the appropriate statistical test (ideally in R) for a couple of comparisons I am making between median survival depending on race, insurance status, and facility type. I also had some questions as to which tests I should perform to accompany Kaplan-Meier curves. I will attach the document with relevant comparisons to an email and send it to the statistics clinic staff."

2018 January 24

Benjamin Weisenthal, orthopedic resident

  • "I have NSQIP data from 2005-2013. For each year, I would like to pull a certain group of patients with certain ICD-9 codes and then group these patients by CPT codes. I have the NSQIP data in SPSS but just need help extracting the information."

Joshua Arenth, Pediatric Critical Care

  • "Addressing optimal data formatting for analysis re: previously discussed project."

2018 January 17

Andrew Medvecz, general surgery/trauma resident

  • "The project is a retrospective cohort study comparing operative vs nonoperative management of small bowel obstruction. The primary outcomes are readmission with small bowel obstruction and cost associated with all the subsequent admissions. We are applying for VICTR funding and would like to discuss our statistical plan with the biostatistician. Oscar Guillamondegui is the mentor for this project and will be in attendance."
  • Estimated recurrence rate for nonsurgical patients is about 30-35% within two years; data more long-term is hard to ascertain due to usual factors + patient movement between physicians/locations
  • This study will use data from the Tennessee Hospital Association, so is better able to track more patients even between hospitals (exceptions: Indian hospitals, VAs); this data often used for business purposes, but can be helpful for longitudinal studies
  • Uses ICD9/10 to determine reason for admission (we are interested in three, any of which identify this cohort)
  • Plan is to take cohort with first admission for small bowel obstruction in 2007, 2008, 2009; send those patient IDs to THA; THA will provide deidentified dataset of future admissions for any reason, CPT codes and charges for admission-related procedures, as well as death dates and some other info (eg, hospital type)
  • Nonsurgical management should be pretty consistent between patients
  • Analysis approaches for primary outcome could include a cumulative incidence model (Bryan Shepherd has experience with this), Cox models with recurring events [readmissions]; have data on confounders (age, comorbidities)
  • Given the nature of the database, we're hopeful that data will be in good shape from the beginning (CFOs/CEOs use this), sent in a CSV
  • Also interested in cumulative costs over study time, acknowledging issues with charges vs costs, different charges for different patients...
  • For primary outcome plus straightforward cost descriptives, this should fit into a standard 90-hour VICTR voucher. (If cost data gets much more complicated or interesting, might recommend another voucher or talking with health policy folks.)

Mitchell Hayes, medical student

  • "Our question is whether or not cancer detection rates change over time adjusting for the individual radiologists who read the MRIs and individual urologists who performed the biopsies. We would like to create a logistic regression model in R that analyzes cancer detection rate as a function of at least two interaction variables: urologist*time(1) + radiologist*time(2). Time(1) and time(2) are two different continuous variables (time in this case is more like experience with MRI-US fusion biopsy). Urologist and radiologist are two categorical variables. I need assistance determining whether we have sufficient power and analyzing the output in R."
  • Dataset of ~350 patients, January 2015-July 2016, some done with new method and some with previous standard of care; 200 are active surveillance, meaning they've already had one positive biopsy, so final sample size will be ~150 patients with no known signs of cancer
  • There is a recognized learning curve for radiologists with this technology such that cancer detection rate goes up over time; suspect that urologists also have a learning curve
  • Don't currently know for sure whether a patient has cancer or not; best proxy is the fusion biopsy result, but best we can do is "were there more/less positive biopsies," not "did things get more/less accurate", because we don't know which ones are false negatives/false positives
  • Degrees of freedom for logistic regression model: Total df you can reliably fit = 1 for every 10-20 [limiting sample size], where limiting sample size is the minimum of [events, non-events]. Eg, with 100 patients and a 30% positive biopsy rate, limiting sample size is 100 * 0.3 = 30, so number of df you can reliably fit is ~2-3.
  • Recommendation for this study: keep things straightforward; there will always be more questions you want to answer, but trying to do too much will limit the reliability of your results (don't trust a model that's very overfit) as well as the generalizability (this model perfectly predicts results at VUMC for these 18 months, but once a new radiologist or two comes on board, results no longer apply).
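
The degrees-of-freedom rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the arithmetic described in the notes; the function names are made up for this example, and the rule is a rough guide rather than a formal power calculation.

```python
def limiting_sample_size(n_patients, event_rate):
    """Smaller of the expected number of events and non-events."""
    events = n_patients * event_rate
    return min(events, n_patients - events)


def df_budget(n_patients, event_rate):
    """Range of degrees of freedom a logistic model can reliably support.

    Applies the rule of thumb from the notes: 1 df for every 10-20 of the
    limiting sample size. Returns (conservative, liberal).
    """
    m = limiting_sample_size(n_patients, event_rate)
    return m / 20, m / 10


# Worked example from the notes: 100 patients, 30% positive biopsy rate.
low, high = df_budget(100, 0.30)
print(f"df budget: {low:.1f} to {high:.1f}")
```

The same function can be applied to the expected final sample (about 150 patients) once a plausible positive biopsy rate is chosen.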

2017 December 20

Rachel Labianca, Pharmacy resident

  • "I previously attended biostatistics clinic in August in preparation for VICTR grant application for my residency research project on time to antibiotics and open fracture trauma patients. I am in the process of collecting data during December, and would like to attend clinic again to ensure that I am formatting the data most appropriately to be ready for statistical analysis. I also would like to determine more specific statistical endpoints to guide my data collection now that I will have a definitive number of patients and event rates."
  • Primary exposure is amount of time between admission and administration of antibiotics; recommend analyzing this as a continuous exposure vs dichotomizing (<60 vs >=60 minutes), because you lose information and power when you dichotomize (but guidelines state 60 minutes)
  • Estimated total sample size 230-240 with ~12% infection rate; some patients are missing exposure, and these are more likely to come from outside hospitals
  • Some patients die prior to full followup/opportunity to have infection; could consider doing a Cox model with competing risks (outcome = time to infection, censored = survived and never had infection, competing risk = death); collect dates of death and infection to allow for this
  • Resources on using spreadsheets for sharing data: Broman & Woo ( ); "spreadsheets from heaven/hell" (on Dan Byrne's web page: )
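
One way to see why collecting dates of death and infection matters: a competing-risks analysis needs, for each patient, a follow-up time and a status code. The sketch below derives both from dates. The status codes (0/1/2), the function name, and the tie-breaking rule are assumptions made for illustration, not something decided in clinic.

```python
from datetime import date

CENSORED, INFECTION, DEATH = 0, 1, 2


def time_and_status(admission, infection, death, last_followup):
    """Return (days from admission, status code) for one patient.

    infection and death are dates or None. The earliest of infection, death,
    and last follow-up defines the record. On a same-day tie, infection is
    preferred over death, and death over censoring (an assumption).
    """
    candidates = [(last_followup, CENSORED)]
    if death is not None:
        candidates.append((death, DEATH))
    if infection is not None:
        candidates.append((infection, INFECTION))
    priority = {INFECTION: 0, DEATH: 1, CENSORED: 2}
    when, status = min(candidates, key=lambda c: (c[0], priority[c[1]]))
    return (when - admission).days, status


# Hypothetical patient: admitted 1 Dec 2017, infection found 11 Dec 2017.
print(time_and_status(date(2017, 12, 1), date(2017, 12, 11), None, date(2018, 3, 1)))
```

Storing the raw dates in the spreadsheet, rather than only a yes/no infection column, keeps this derivation possible later.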

Gary Owen, Pharmacy resident

  • "Follow up from 11/29 re: assessing pain, agitation, and delirium practices in an international cohort. Would like to solidify a plan for VICTR application."
  • Important limitation/confounder to note with delirium outcome: If delirium screening went up between 2010 and 2016, it's possible that delirium rates could increase due to more screening, rather than a clinically meaningful increase. Look at rates of missingness for this outcome as well as rates of delirium itself.
  • Can look at delirium as a secondary outcome (duration, delirium-free days), but make sure to look at coma and mortality as well to get the whole picture
  • If sticking to descriptive statistics and multinomial logistic regression, this project should fit within 90 hours for a VICTR voucher (adding multistate model would add time to this, if VICTR statistician has not done this before)

Jackson Cabo, medical student

  • "I am doing a project with Dr. Bailey (Surgical Oncology) regarding healthcare disparities in colorectal cancer and how they may vary depending on setting of care (i.e. type of treating hospital). A Cox Proportional Hazard Regression will be used to estimate the effects of race, income status, and insurance status in the context of hospital facility (type) on overall survival. We are looking for assistance in generation of this regression model."
  • Five types of treatment facility; data from 2004-2014, with one record per patient
  • Primary question: are social determinants still independent predictors of survival after taking type of treatment facility (academic, community, etc) into account?
  • Could take fixed or random effect approach (do not do separate models for each facility type); which to use depends on whether you want to make inferences about facility type or merely control for variability/allow different baseline hazards, as well as sample size (need more events [deaths] to reliably estimate fixed parameters for facility types; not yet sure what mortality rate is) (statisticians lean toward fixed effect in this case)
  • Data is already restricted to patients with specific type of colorectal cancer and have adequate followup information; next step is to come up with a list of potential confounders/risk factors, in order of importance
  • Funding is available via medical school research program; suggest talking to Dr Shyr about how to arrange short-term collaboration (maybe with Sam Nwosu?), if possible; if that doesn't work out, could apply for a VICTR voucher, and clinic statisticians believe this would fall within the 90-hour category

2017 December 13

Timothy Olszewski, Research dietitian

  • "Dr. Heidi J Silver is the PI on an IRB-approved retrospective study on sex differences in sarcopenia surgical outcomes. We would like to discuss data that we have analyzed."
  • The cohort of interest includes subjects with surgery for benign tumors or malignant tumors. The analysis will separate out these two groups. If someone had a benign surgery and also a malignant surgery, a covariate indicating prior surgery may be in order. Or only the first surgery should be included. The number of subjects who fall into this category needs to be determined. For subjects who had >= 2 surgeries in one cohort or the other (i.e., benign or malignant), only the first surgery should be considered.
  • Several outcomes are of interest -- dichotomous ones such as ED revisit, rehospitalization, death will be modeled using logistic regression; time to event outcomes will be modeled using Cox proportional hazard models, provided the proportional hazard model assumptions are met.
  • For all models, clinical knowledge should guide the choice of potential confounders to include in the model.
  • This project should fit in a standard 90-hour VICTR voucher. Approved.

Carsen Petersen, Registered dietitian, MS student

  • "I have completed data collection and now need to start data analysis. Dr. Heidi Silver is my mentor on the IRB-approved retrospective study. We are examining the agreement between common prediction equations for estimation of resting energy expenditure (REE) and measured REE using the KORR ReeVue indirect calorimeter in free-living obese adults. Statistical analyses will be completed to understand if age, sex, race, or obesity grade influences the agreement."
  • The first question of interest is the agreement between the measured REE and several already established predictions of calories expended in a day. We discussed the use of Bland-Altman plots to illustrate agreement as well as using different colored points to illustrate the relationship for different sub-groups in their cohort (e.g., men/women, race).
  • In order to assess the relationship between the different factors of interest (e.g., BMI) and measured REE, a linear model with mREE as the outcome and BMI fit with restricted cubic splines could be fit (adjusting for any potential confounders). Plots of mREE by BMI would best illustrate this relationship.
  • This project should fit in a standard 90-hour VICTR voucher. Approved.
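
The quantities behind a Bland-Altman plot are simple to compute. The sketch below, with made-up function names, returns the bias and the conventional 95% limits of agreement (bias plus or minus 1.96 SD of the differences), plus the points that would be plotted. Subgroup coloring and the plotting itself are left to the analysis software.

```python
from statistics import mean, stdev


def bland_altman(measured, predicted):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   = mean of (predicted - measured)
    limits = bias +/- 1.96 * sample SD of the differences
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must be paired")
    diffs = [p - m for m, p in zip(measured, predicted)]
    bias = mean(diffs)
    sd = stdev(diffs)  # n - 1 denominator
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def plot_points(measured, predicted):
    """(x, y) pairs for the plot: x = pair mean, y = predicted - measured."""
    return [((m + p) / 2, p - m) for m, p in zip(measured, predicted)]
```

Running this once per prediction equation gives a comparable bias and pair of limits for each equation against measured REE.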

2017 December 6

Kaitlyn Works, Emergency Medicine

  • "QI for the waiting room: gauging patient expectations before and after implementation of an informative video."
  • Make answer options as clear as possible (eg, if patient would say they expect to wait 2 hours, make options clear about where 2 hours would go)
  • Timing of survey: watch video and take survey after triage; need to think about how to get survey to patients who might go immediately from triage to treatment room, or come in via ambulance
  • Discussed specific changes to survey made during clinic
  • Possibility: X months of baseline data collection; X months of "passive" intervention (play video at intervals in waiting room, patient treatment rooms); X months of "active" intervention (sending patients to a specific room to watch video after triage, eg)

2017 November 29

Gary Owen, Pharmacy/Critical Care Resident

  • "We are using a large, international database of critically ill patients (collected from the international study of mechanical ventilation) to study how clinical practice compares to pain, agitation, delirium guideline recommendations. Our objectives, planned analyses, and questions are below:
  • (1) Objective: Describe current practice and changes between 2010 and 2016 (before and after 2013 guidelines) Key variables (measured daily): RASS scores, amount of sedation/analgesia, daily spontaneous awakenings, occurrence of delirium Analysis: Descriptive stats for years 2010 and 2016, pertaining to amount of analgesia/sedation, depth of sedation, spontaneous awakening, along with outcome of delirium. May compare groups before/after guideline implementation in 2013, as well." Question: options for describing adherence (e.g. how to quantify daily spontaneous awakenings for the group - average, frequency, per pt, etc.). dealing with missing data considering any delirium during admission or day-to-day delirium
  • (2) Objective: Determining which aspects of clinical practice are associated with delirium Key Variables: daily measures as above; facility, region/nation; pt specific factors Analysis: Multivariate logistic regression of delirium vs available covariates. Question: how to consider presence of delirium - any delirium during admission vs daily risk
  • (3) Objective: Identifying factors associated with deviations from guidelines. Key variables: year, location, patient specific Analysis: Multivariate analysis. Question: How to quantify adherence (all or none during admission, day to day, each component of guidelines)."
  • Prospective data collection for a month at each site, first intubation only for all intubated patients
  • First goal: describe changes in clinical practice. Can do this well for sedation/analgesia practices; RASS was only collected in 2016 so can't really look at change for this. SATs may be mandated as part of the study protocol, so much harder to determine whether usual care changed between the two time points (since these practices are prescribed by the study). For medications, most reasonable to describe daily doses (not totals over intubation).
  • Delirium is collected as daily yes/no (no ICDSC/CAM-ICU available). 2016 data has about 25,000 patients (can have >1 day each); if logistic regression is used, limiting sample size = minimum of [delirium, no delirium]; divide this by 10-20 to get number of covariates you can include reliably.
  • Fixed vs random effects: Use a fixed effect if you want to make inferences about a covariate (eg, relationship between age and delirium); use random effects if you want to account for variability (eg, between study sites) without making inferences.
  • Multinomial regression may be a better choice than logistic regression - this looks like separating normal vs delirium vs coma, instead of delirium vs [anything else]
  • Data are clustered by both patient and site; it'll be important to adjust for this (could use sandwich estimation, bootstrapping, and/or random effects)
  • Key questions for next clinic: 1) What descriptives are both most interesting and most available/reliable in dataset? 2) Do you really want to look at adherence risk factors? - this will take lots of time due to need to define "adherence," could be confounded, etc; 3) Logistic (delirium vs anything else) or multinomial (n/d/c, but more complicated) for delirium outcome

2017 November 15

Bradley Kook, Obstetric Anesthesia Fellow

  • "We aim to perform a retrospective chart review assessing variability in timing of nurse administered PRN opioids in post-cesarean patients."
  • Lots of value in a descriptive analysis here, once you have ability to classify patients together
  • Could look at intraclass correlation between nurses and patients...?
  • Can easily get lots of data from EMR
  • Would be interesting to look at years of nursing experience, possibly
  • Defining outcomes is hard
  • Planning to get a VICTR voucher; may come back to help refine outcomes and analysis plan once more data is collected

2017 November 8

Andrew Perez, Medical Student

  • "We are looking at rates of complication-related port removal when patients are neutropenic vs normopenic at the time of port placement."
  • Data already collected on cases with clinically defined neutropenia, including some demographics, whether port was removed due to infection, whether 60-day followup obtained, and reason for incomplete followup (including death)
  • Main question today: how many controls to collect data on (SD is not tenable for this situation - no ready-made variable in SD that says "port removed due to infection")
  • No data available currently on controls other than year of port placement (ie, can't stratify on much); make sure that cases and controls represent roughly the same time frames
  • Other, preferred option if feasible: Collect data on all controls, so you don't have to choose a random sample, and use neutrophil count as a continuous exposure as opposed to a case/control dichotomous variable
  • To help with feasibility, could restrict data (exposure, outcome, confounders) to, say, last five years; risk losing power because you lose events, but you're still gaining some by including neutrophils as continuous; probably still net gain
  • Clinical messaging: think about a figure showing neutrophil value on the X axis and probability of port removal on the Y axis - sends a concise message, whatever the relationship ends up being (recommend modeling with restricted cubic splines)
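
For readers unfamiliar with restricted cubic splines, the sketch below builds the spline columns for a single value using the standard truncated-power construction, which forces the curve to be linear before the first knot and after the last. In practice the statistician would use a tested routine in the analysis software; the function name here is an assumption, and knot locations would be chosen from the data.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x.

    Returns k - 1 columns for k knots: x itself plus k - 2 nonlinear terms.
    Each nonlinear term is zero below the first knot and linear above the
    last knot.
    """
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("need at least 3 knots")

    def cube_plus(u):
        return u ** 3 if u > 0 else 0.0

    scale = (t[-1] - t[0]) ** 2  # keeps columns on roughly the scale of x
    gap = t[-1] - t[-2]
    basis = [x]
    for j in range(k - 2):
        term = (cube_plus(x - t[j])
                - cube_plus(x - t[-2]) * (t[-1] - t[j]) / gap
                + cube_plus(x - t[-1]) * (t[-2] - t[j]) / gap)
        basis.append(term / scale)
    return basis
```

These columns would enter the logistic model in place of a single linear neutrophil term, and the fitted probabilities over a grid of neutrophil values give the figure described above.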

2017 November 1

Rachel Coleman, Endocrinology Fellow

  • "Review of patients behavior regarding their insulin pumps, specifically looking at how their bolus behaviors (for example: how manual bolus events and bolus wizard events affect their a1c). Have data for 224 pump downloads (two weeks of data for each of 224 patients) in a de-identified excel file as well as R file. I would like to discuss which statistical analysis is best and how to run the statistical analysis."
  • Key question is how the manual vs wizard users are different - manual users are probably very different, have been using the pump and doing manual calculations for a long time
  • If more a1c measurements were available, probably want to do some kind of longitudinal analysis - a1c (measured every 3 or 6 months) over time vs manual vs wizard, or a continually varying measure of override vs wizard, since patients can do both
  • If stick to single download per patient that corresponds to one measured a1c, maybe look at % of overrides over previous two weeks vs a1c at that visit; if you stick with the two groups, try various levels of cutoffs vs sticking with only 90%
  • Important to take potential confounders into account: there may be factors that affect both whether the patient decides to override the wizard and the a1c (eg, patients with disease for longer may be better at managing their a1c and more likely to manually override at times). Examples - time since diagnosis, comorbidities, age, etc

Shayan Rakhit, Medical student

  • "The Sequential Organ Failure Assessment (SOFA) score is commonly used to dynamically evaluate a patient’s severity of illness over the course of their ICU stay and contains six components measuring six organ systems. The neurologic SOFA component is the Glasgow Coma Scale (GCS), but the GCS has high inter-rater variability and is not routinely collected in ICUs. For this reason, Vasilevskis et al published (at Vanderbilt in 2016) a validation of a SOFA score utilizing the Richmond Agitation Sedation Scale (RASS), a much more reliable and reported measure of consciousness. Specifically, construct validity was determined with correlation with regular SOFA score and predictive validity was determined with regards to mortality (compared to regular SOFA score).
  • We have access to Respira/4th ISMV, a multi-center cohort across 42 countries. Our aim is to 1) validate (as done previously) the modified SOFA score utilizing RASS against the original SOFA utilizing GCS in this larger, more diverse population; and 2) evaluate if specific patient characteristics, such as direct neurologic injury, and practices, such as light sedation, affect the validity of a modified SOFA utilizing RASS."
  • High rates of missingness, which are probably informative - patients without RASS/GCS are likely to be different
  • Recommend adjusting for site (or country, if site is untenable), in main or sensitivity analysis
  • Data is pretty clean
  • This project should fit in a standard 90-hour VICTR voucher. Approved.

2017 October 18

Paula Smith, Surgery

  • "I am using a large multi-institutional retrospective database to look at the relationship between insurance status and oncologic outcomes in Gastrointestinal Neuroendocrine Tumors. I have done some preliminary work with this data but would like assistance from a biostatistician in gaining a more sophisticated understanding of my data."
  • Outcomes of interest include time to mortality and time to disease relapse.
  • Given that multiple sites are involved, fitting Cox models stratified by site would be most appropriate.
  • For the disease relapse outcome, we discussed treating death as a competing risk and calculating cumulative incidence.
  • She would like to apply for a VICTR voucher; the scope of work, along with any manuscript revisions, will fit easily into the 90 hour time frame.
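
To make the competing-risk idea concrete, here is a minimal sketch of a cumulative incidence calculation (the Aalen-Johansen approach) in plain Python. The status coding (0 = censored, 1 = relapse, 2 = death) and the function name are assumptions for illustration. It does not handle stratification by site or confidence intervals, which the analysis software would provide.

```python
from collections import Counter


def cumulative_incidence(times, statuses, cause=1):
    """Cumulative incidence of one event type in the presence of others.

    statuses: 0 = censored, any other integer = an event type.
    Returns a list of (time, cumulative incidence) at each distinct event time.
    """
    data = sorted(zip(times, statuses))
    at_risk = len(data)
    surv = 1.0  # probability of being free of any event just before t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        counts = Counter()
        while i < len(data) and data[i][0] == t:
            counts[data[i][1]] += 1
            i += 1
        n_here = sum(counts.values())
        n_events = n_here - counts[0]
        if n_events:
            cif += surv * counts[cause] / at_risk
            surv *= 1 - n_events / at_risk
            curve.append((t, cif))
        at_risk -= n_here
    return curve


# Toy data (hypothetical): months to first event for four patients.
relapse_curve = cumulative_incidence([1, 2, 3, 4], [1, 2, 0, 1], cause=1)
```

Unlike one minus a Kaplan-Meier estimate that treats deaths as censored, this estimate does not assume that patients who died could still have relapsed later.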

Tanya Marvi, Medical student

  • "We wrote a paper looking at factors associated with increased blood loss in pediatric scoliosis surgery. We are looking to address some of the reviews from the journal regarding our statistical analysis as we prepare to resubmit the manuscript."
  • The primary concerns were lack of details regarding how the model was determined as well as whether any transformations of the data were necessary to meet linear model assumptions. Step-wise methods were originally used; we discussed how this is not the best approach and how to go back and redo the analysis using clinical knowledge/literature to determine which covariates to include. We also discussed how to check if model assumptions are met and, if not, how to address this.

2017 October 11

Konrad Sarosiek, Plastic Surgery -- Cancelled

  • "We have a large data set of ~60,000 patients who underwent different surgical procedure combinations under the guise of ‘mommy makeover’ and we are looking to see if there is added risk when combining procedures. We are looking to find relative risk & to isolate frequent complications & identify risk factors."

Andrew Perez, Medical Student

  • "We are comparing port removal rates in patients who were neutropenic vs normopenic at the time of port placement. We are looking for a statistician to help us analyze data and I was told this clinic was the place to start getting help with that."
  • 3500 patients with ports placed; patients excluded if there was no neutrophil measurement within two weeks of port placement
  • 184 neutropenic patients out of ~3500; hypothesis is that these patients have higher rates of port removal than normopenic patients
  • Ports could get removed for different reasons
  • Recommend talking to synthetic derivative team to see if they can quickly extract data on all these patients, to save Andy from having to build dataset himself; drawback = deidentified, so no going back and getting extra info later
  • First step: come up with a detailed list of fields you'd want from the EMR/SD - potential confounders, reason for port removal, list of infections, anything you can think of
  • Take that to SD team and see how feasible it would be to get that data
  • When it's time for data collection + analysis, recommend applying for VICTR biostatistics voucher; this should fit within the 90-hour project time frame
  • Recommend data collection in REDCap, pending discussions with IRB

2017 October 4

Breanna Thomas, Meharry

  • Looking at relationship between subjects being of multiple minorities (LGB + racial minority) and anxiety, depression and substance abuse
  • Parent study is longitudinal, plan is to look at a cross section; need to make sure you know how the data (from UM) was subsetted and sent
  • Data is from parent study of 18-59-year-olds; detailed codebook and data info are available online
  • Anxiety, depression, and substance abuse are currently coded in data as yes/no variables with 45-55% prevalence (based on quick glance); make sure you know exactly how these are determined (self-report, questionnaire...?)
  • Potential confounders: age (leave as continuous if possible! - looks like it is categorized in data; ask if raw data is available); employment status; possibly others
  • Come up with complete list of confounders and prioritize them in order of importance

Justin Shinn, Otolaryngology

  • "Data is now collected regarding neck cancer metastasis in those with smaller tongue cancers. Want to compare retrospective group in those who recurred in the neck to those who did not based on pathology results."
  • 15 years of data with about 75 patients with T1/T2 tongue cancers with some followup of neck observation
  • Have some followup out to 60 months; main time period of interest is two years
  • Main outcome of interest is recurrence (particularly in neck); 10 patients died within 5 years, but most recurred prior to death (two died without recurrence)
  • Idea is that doing more neck dissections in certain patients could help prevent recurrence; no one in database had a neck dissection
  • Recommend time to event analyses with outcome = time to recurrence; patients who never have recurrence will be censored at end of followup period (two years?) or last contact
  • With above approach, patients who died without recurrence would be considered under competing risks, but with small sample size and very few of those patients, probably not worth worrying about
  • Multivariable approach = Cox proportional hazards model
  • 36 recurred out of 74 patients
  • Main exposure = tumor depth, measured in mm
  • Possible confounders: margin (positive - did they get it all? - and/or close); T1 vs T2 (use tumor size instead of categorization? - but only 7 patients in this cohort had T2)
  • Write analysis plan a priori and stick to that (can include secondary analyses, but don't push your data too hard - with 36 events, it will be hard to tell much in detail)
  • More data would be available looking at patients with neck dissections, but would be hard/long to get; getting that would allow you to say "what are the chances of recurrence if I do vs don't have the dissection, assuming all other factors [tumor size, etc] are the same?"
  • Software: SPSS is most user-friendly - if you use it, make sure you turn on a log so you can reproduce results; if you go with Stata, UCLA has good examples/docs ( )
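
As a small illustration of how censoring enters a time-to-event analysis, the sketch below computes a Kaplan-Meier estimate of remaining recurrence-free. The function name and the toy data are hypothetical; SPSS or Stata would be used for the real analysis, including the Cox model.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the probability of remaining recurrence-free.

    times:  follow-up time for each patient (e.g. months)
    events: True if recurrence was observed at that time, False if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_here = 0
        n_events = 0
        while i < len(data) and data[i][0] == t:
            n_here += 1
            if data[i][1]:
                n_events += 1
            i += 1
        if n_events:
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_here  # censored patients leave the risk set here
    return curve


# Toy data (hypothetical): five patients, follow-up capped at 24 months.
curve = kaplan_meier([6, 12, 12, 24, 24], [True, True, False, False, False])
```

Patients censored at 24 months or at last contact still contribute to the risk set up to that point, which is why they should stay in the dataset rather than be dropped.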

2017 September 27

Jeffrey Weiner, Pediatric Cardiology

  • "I am evaluating a database with clinical risk factors for post-operative thrombosis in congenital heart surgery. I am evaluating known risk factors (age, weight, severity of disease/surgery, cardiopulmonary bypass time) with genetic data (genotypes for known prothrombotic SNP’s) to see if I can create a novel risk prediction model. I am having trouble (mostly software related as I am new at this), and would love a biostatistician’s insight."
  • Data on ~1000 patients, with 11% prevalence of thrombosis
  • Among other predictors, genotype is of particular interest - 7 SNPs, each with three subtypes; different SNPs have variable prevalence rates
  • Current approach: lump all SNPs into a single yes/no variable; covariates are age, weight, surgery, genotype
  • Planning to get VICTR voucher; data is in REDCap
  • Suggest looking at relationships between the genotypes: if someone with gene X always has gene Y, including both in model can be problematic; also, if very few patients have
  • Complexity of the model will be limited by minimum of (events, non-events) - in this case, roughly 110 patients have thrombosis, so can fit 10-11 degrees of freedom max ("df" is kind of like a variable, but not exactly)
  • Clotting rarely happens in the first few days after surgery; for this reason, could consider time to event model (Cox) where event is time to clot, but would ideally account for a) time-varying covariates (severity of illness, etc) and b) competing risks (if patient dies before having a clot)
  • Abstract deadline in two weeks; for that, recommend current logistic regression model among all patients and only among hospital survivors. Hopefully those results are similar, but if not, emphasize results among survivors, because mortality could be a big source of bias and confounding in this study.
  • We believe this project would fit into the 90-hour VICTR voucher category.

2017 September 20

Pooja Santapuram, Hearing & Speech Sciences

  • "The purpose of this study is to examine the relation between language development and eye gaze patterns to audiovisual speech specifically in infants at risk for autism spectrum disorders (ASD). ASD is a developmental disorder characterized by social and communication deficits in addition to repetitive and restricted behaviors. It is known that infants at 1 year of age who later go on to be diagnosed with autism look at individuals’ faces less frequently (Osterling et al., 2002) and that toddlers (18-24 months) later diagnosed with ASD use fewer vocalizations with speech sounds and greater “atypical vocalizations” when compared to typically developing (TD) toddlers (Plumb & Wetherby, 2013). Yet, ASD is typically not reliably diagnosed until 2-3 years of age. Therefore, characterization of eye gaze patterns to audiovisual speech and vocalizations in high-risk infants may facilitate earlier identification of ASD and may also allow for future studies on potential treatments in this clinical population. Questions I’d like to address are basically how best to approach an analytic plan for this study."
  • For first project, recommend calculating sample size for correlation statistic using precision (confidence interval width). Use Spearman correlation (nonparametric; does not assume that variables are normally distributed).
  • Linear regression with skewed variables: 1) Can run model and check assumptions - not interested so much in individual variables as in whether overall model fits well and meets assumptions. Try RP (residual vs predicted) plots, QQ plots. 2) If assumptions are not met, can try transforming individual variables to improve overall model fit; also recommend using spline terms (or polynomials, if restricted to SPSS). This allows associations to not be straight lines, which is usually more accurate.
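
A precision-based sample size for a correlation can be sketched as follows: pick a plausible correlation and find the smallest n whose 95% confidence interval is no wider than a target. This sketch uses the Fisher z approximation, which is standard for Pearson correlation and only approximate for Spearman (the Spearman interval is typically somewhat wider), so treat the result as a starting point. Function names are made up for this example.

```python
import math


def ci_for_correlation(r, n, z_crit=1.96):
    """Approximate 95% CI for a correlation via the Fisher z transformation."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def n_for_ci_width(r, max_width, n_max=100000):
    """Smallest n whose approximate CI is no wider than max_width."""
    for n in range(4, n_max + 1):
        lower, upper = ci_for_correlation(r, n)
        if upper - lower <= max_width:
            return n
    raise ValueError("no n up to n_max reaches the requested width")


# Hypothetical planning values: expected correlation 0.5, total CI width 0.4.
n_needed = n_for_ci_width(0.5, 0.4)
```

Trying a few expected correlations shows how sensitive the required n is to that guess; intervals are widest when the true correlation is near zero.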

Joshua Arenth, Pediatric Critical Care

  • "Follow up session regarding best approach to log data into REDCap for analysis as discussed at a previous clinic."
  • Previous clinic notes
  • Planning to reapply for expired VICTR voucher to analyze pilot data of provider communication intervention. Discussed a longitudinal REDCap database with one demographic form (filled out at session 1 only) and a questionnaire form, filled out at sessions 1, 2, and possibly 3 (for control group only).
  • Each questionnaire's score will be an integer, 0-11. Recommend a Wilcoxon signed-rank test (nonparametric version of paired t-test).
  • Planning to pitch multicenter trial for this intervention; sample size will be determined once this data is analyzed.
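
For intuition about the paired nonparametric test recommended above, the sketch below computes the Wilcoxon signed-rank sums for paired session scores. It drops zero differences and gives tied absolute differences their average rank, which matters with integer scores from 0 to 11. It stops at the rank sums; the p-value would come from statistical software. The function name and toy scores are hypothetical.

```python
def signed_rank_statistic(before, after):
    """Wilcoxon signed-rank sums for paired scores.

    Zero differences are dropped and tied absolute differences share the
    average rank. Returns (W_plus, W_minus, n_nonzero).
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, len(ordered)


# Toy scores (hypothetical) for five providers at sessions 1 and 2.
result = signed_rank_statistic([3, 5, 4, 6, 7], [5, 5, 6, 5, 10])
```

This only works if each provider's session 1 and session 2 questionnaires can be linked, so the REDCap design needs a stable record ID across sessions.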

2017 September 13

Brian Adkins, Pathology

  • "Allo-antibodies against red cell antigens in pregnant women lead to poor fetal outcomes. As such OB/GYNs follow serial antibody titers. Traditional tube titration is slow and subjective. Automated gel titration is available but testing requires further understanding before clinical implementation. We are trying to figure out sample size and number of tests we should be running to determine clinical cut offs for antibody levels."
  • Suggest weighted kappa to assess agreement between the two methods on the titration level at which each detected antibodies in a sample.
  • Need to get data into a software-readable format; look at the "spreadsheet from heaven" example here.
  • Variables are all categorical (1/2/4/16/etc for levels, 0/1 for differences) so nonparametric tests will likely not help.
  • If VICTR voucher is requested, this will fit under the 90-hour limit.
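
A minimal sketch of a weighted kappa calculation is shown below, assuming linear weights on the ordered titer levels (quadratic weights are another common choice; which to use was not decided in clinic). The function name and the data layout, one tube result and one gel result per sample, are assumptions for illustration.

```python
def weighted_kappa(method_a, method_b, levels):
    """Linearly weighted Cohen's kappa for two ordinal ratings of the same samples.

    levels: the ordered list of possible categories (e.g. titer levels).
    Disagreement weight between categories i and j is |i - j| / (k - 1).
    Raises ZeroDivisionError if either method puts every sample in one level.
    """
    index = {lev: i for i, lev in enumerate(levels)}
    k, n = len(levels), len(method_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(method_a, method_b):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            num += w * observed[i][j]   # observed weighted disagreement
            den += w * row[i] * col[j]  # disagreement expected by chance
    return 1 - num / den
```

Because the weights use the position of each level in the ordered list, a one-dilution disagreement (e.g. 4 vs 8) is penalized less than a three-dilution disagreement (e.g. 2 vs 16).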

Paula Smith, Surgery

  • "I have a data set I am trying to run some stats on using the Stata program and I have questions about the best tests to run and how to make my data set compatible with Stata."
  • General recommendations: Use a do file in Stata to save analysis approach; write analysis plan a priori to define cohort; keep continuous information as much as possible rather than categorizing (eg, if raw BMI data is available, use that rather than categorizing)
  • Planning to submit abstract in October for conference in April; may submit abstract based on analysis already done, and work with VICTR on multivariable regression/competing risks
  • Main research question: Do adrenocortical carcinoma patients with more resection have better/worse survival and risk of recurrence after surgery?
  • Currently analysis does not adjust for confounders or account for competing risk of death in recurrence outcome
  • Currently: used descriptive stats, KM curves; can get logrank p-values for KM curves, but need to look at proportional hazards assumption (do the curves cross? - but look at this in context of how many patients are "left" when they do cross)
  • Possible future recommendations: multivariable Cox proportional hazards model adjusting for potential confounders (age, BMI, surgery type, etc); for recurrence model, may need to use competing risks
  • This would fit into a 90-hour VICTR voucher.
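
For reference, a minimal Python sketch of the Kaplan-Meier estimate behind the curves discussed above. It also reports the number still at risk at each event time, which is the context needed when judging whether crossing curves matter. The function name and the example follow-up data are hypothetical; in Stata the curves would come from the built-in survival commands.

```python
def kaplan_meier(times, events):
    """Return [(time, survival, n_at_risk)] at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_time = [e for tt, e in data if tt == t]
        deaths = sum(same_time)
        if deaths > 0:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv, at_risk))
        at_risk -= len(same_time)  # events and censored both leave the risk set
        i += len(same_time)
    return curve

# Hypothetical follow-up in months; 1 = death observed, 0 = censored
months = [2, 3, 3, 5, 8, 8, 12]
died = [1, 1, 0, 1, 0, 1, 0]
for row in kaplan_meier(months, died):
    print(row)
```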

2017 September 6

Shriji Patel, Ophthalmology

  • "I am conducting an analysis of Medicare Part B Claims Data and would like assistance regarding which statistical methods would be helpful in identifying trends in claims data."
  • Medicare Part B only has five years of data, only summary data available. Might be able to look into time series, but not certain that those methods will be helpful. Recommend good descriptives/visuals.

Nick Dantzker, Orthopedics

  • "Study to establish which radiographic parameters correlate with functional outcome and patient satisfaction in operative distal radius fractures. Need assistance with model for intra/interobserver reliability of radiographic measurements and overall statistical model for project"
  • ~55 wrists (final total could be up to 165, but more likely to be ~60) with injury x-rays pre-op, post-op and long-term; have 7 radiographic parameters on each at three time points (VAS pain scores, radial inclination, etc), as well as three injury ratings (one per patient); some patients have both wrists included
  • Main questions: 1) how good is interrater agreement on measures, often in degrees, and 2) are these measures (at one or both time points) actually predictive of outcomes?
  • Because a few (~4) patients have both wrists included, recommend randomly selecting one wrist from each - otherwise, confounders and outcomes from those patients will be more correlated than outcomes from different patients
  • Interrater agreement: suggest Bland-Altman plots (kappas are for categorical measures) - you don't want to see a pattern (eg, differences in agreement based on true value)
  • Intrarater agreement: suggest repeating measurements on ~15 wrists
  • Could do multivariable regression: outcome = [confounders - age, injury type, etc] + any x-ray info that will always be present + [one measurement, eg radial inclination]; run separate model for each measurement of interest
  • Regression is limited due to sample size - if you fit too many things in your model, it will not be generalizable to any other study (won't replicate), and we anticipate about 60 patients
  • Type of regression model will depend on exact outcome you're looking at
  • Three time points - could include an interaction term, but would likely be underpowered. Could also look at each time point with a separate model.
  • How many models are we talking? 10 measurements; __ outcomes; three time points - lots of models
  • Cohort is limited to a select subset of patients who have all three followup time points, isolated injury, respond to followup question - eg, need to be careful about how you generalize this to a general ortho population (patients with complete followup will be different from patients who are lost to followup)
  • Look at demographics of patients who responded to survey vs those who didn't - these will likely be different, which could bias results
  • Suggest applying for VICTR voucher; this will be <90 hours (typical manuscript project)
  • Feel free to come back for input on REDCap database, further discussion
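
A short Python sketch of the quantities behind the Bland-Altman plot suggested above: the mean difference between two raters (bias) and the 95% limits of agreement. The function name and the example radial inclination readings are hypothetical.

```python
import statistics

def bland_altman(rater1, rater2):
    """Return bias, 95% limits of agreement, and (mean, difference) points to plot."""
    diffs = [a - b for a, b in zip(rater1, rater2)]
    means = [(a + b) / 2 for a, b in zip(rater1, rater2)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
        "points": list(zip(means, diffs)),  # x = pair mean, y = difference
    }

# Hypothetical radial inclination readings (degrees) from two raters
rater_a = [22.0, 18.5, 25.0, 20.0, 23.5]
rater_b = [21.0, 19.0, 24.0, 21.0, 23.0]
print(bland_altman(rater_a, rater_b))
```

Plotting the `points` with horizontal lines at the bias and the two limits gives the usual figure; a trend in the differences across the x-axis is the pattern to watch for.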

2017 August 30

Jennifer Watchmaker, Radiology

  • "I would like to perform I believe an ordinal logistic regression. I have outcome data and also a continuous variable. I would like to know if the continuous variable (obtained pre-procedure) predicts outcome. I would like to also gain a sense of what additional analysis I can do with my dataset. I have 300 procedures worth of data on redcap."
  • 300 procedures (multiple procedures per patient - about 175 unique patients) are being reviewed by radiologists, currently only one read each; recommend having at minimum a random sample of these reviewed by multiple readers to gauge interrater reliability
  • Main outcome is ordinal, ranging from no response (0) to full response (3)
  • Main exposure is NLR (neutrophil:lymphocyte ratio) - hypothesis is that patients with a higher NLR are less likely to have a treatment response
  • Collect info at time of procedure and two months post-procedure; so far have excluded patients who are lost to followup for any reason
  • Recommend a proportional odds logistic regression model due to ordinality of outcome; will need to adjust for the fact that there are multiple procedures per patient
  • Do *not* do univariate testing to determine what covariates to include in the model; rather, decide based on clinical knowledge/literature review what are important potential confounders. Rough estimate is that you could include 10-15 parameters in this model.
  • Examples of how to do ordinal logistic regression are available for both R and Stata; look into clustered sandwich estimation of variance to account for within-patient correlation
  • If not enough time to figure out clustered sandwich estimation before abstract submission, recommend using just first procedure per patient and applying for VICTR voucher
  • Additional reason to get VICTR voucher: likely that NLR has a nonlinear relationship with the outcome, which makes an accurate model more complicated to fit and interpret
  • If VICTR voucher is required, this project will require 90 hours or less
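
To make the proportional odds model above more concrete, here is a small Python sketch of how such a model turns a covariate value into probabilities for the four response categories (0 to 3). The intercepts and the NLR coefficient are made-up values, not estimates from any data; fitting the model and the clustered variance still requires R or Stata.

```python
import math

def expit(z):
    return 1 / (1 + math.exp(-z))

def category_probs(x, intercepts, beta):
    """Category probabilities under a proportional odds model.

    Models P(Y >= j | x) = expit(intercepts[j-1] + beta * x) for j = 1..J.
    `intercepts` must be decreasing so the probabilities are non-negative.
    """
    cumulative = [1.0] + [expit(a + beta * x) for a in intercepts] + [0.0]
    return [cumulative[j] - cumulative[j + 1] for j in range(len(cumulative) - 1)]

# Hypothetical coefficients: a negative beta means higher NLR lowers the odds
# of a better response, by the same odds ratio exp(beta) at every cutoff.
intercepts = [1.0, 0.0, -1.5]
beta = -0.3
for nlr in (2.0, 6.0):
    print(nlr, [round(p, 3) for p in category_probs(nlr, intercepts, beta)])
```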

2017 August 23

Rachel LaBianca, Critical Care Pharmacy

  • "Studying open fracture orthopedic trauma patients to determine whether there is a difference in infection rates for patients receiving antibiotics within 60 minutes of presentation versus those receiving antibiotics >60 minutes from presentation. Would also like to conduct analysis to identify other factors impacting infection rate. Would like assistance with statistical design and help in determining whether VICTR application will be needed for this project."
  • They anticipate approximately 100-200 subjects in their analysis with about 20% infection rate. Their primary outcome is infection rate; therefore, they are somewhat limited on the complexity of the model they can fit without using data reduction techniques.
  • They have identified potential confounders and will put them in order of importance.
  • We discussed the use of splines for the time to antibiotic if that is their primary exposure of interest. We also discussed whether they could fit a model with type of antibiotic, categorized into 3 main categories as their primary exposure of interest.
  • They will return to clinic to discuss setting up the data to ensure a smooth transition to the analysis phase.
  • The analysis is fairly straight-forward although could be a bit more involved if some kind of data reduction technique is used, such as propensity scores. The analysis should easily be completed within 90 hours giving enough additional time for manuscript preparation/revision.

2017 July 26

Joshua Chew, Pediatric Cardiology

  • "We are completing a retrospective study evaluating a new echocardiographic measure (pulmonary pulse transit time; pPTT) in pediatric pulmonary arterial hypertension (PAH). Our cohort includes roughly 20 PAH patients with 2:1 age/sex matched controls. Our initial analysis demonstrated a difference in pPTT between PAH patients and controls. We also saw an association between pPTT and a crude measure of right ventricular function. We are now performing follow-up measurements to obtain an objective measure of RV function (myocardial performance index; MPI). We would also like to explore the relationship of pPTT with hemodynamic data and clinical outcomes in PAH patients over time. The questions we would like assistance with are as follows: 1. What is the most appropriate approach to evaluate the relationship between pPTT and MPI, taking into account that we suspect it may not be linear? 2. Would appreciate recommendations on analysis plan for the longitudinal analysis of pPTT in PAH patients. What sorts of hemodynamic data/outcome measures are most appropriate? How do we deal with different follow-up times and different times between echocardiograms? How do we account for patients being on different therapies during the follow-up time?"
  • We discussed the use of splines to relax the linearity assumption in any models that may be fit. In order to avoid over fitting with linear regression, we follow the rule of thumb of estimating 1 parameter for every 10-20 subjects.
  • We discussed the use of spaghetti plots to illustrate the trajectories of the pPTT over time in the 20 PAH patients, including using colors to indicate those with different therapies or who may have died.
  • As a potential secondary analysis in the 20 PAH patients, a mixed effects model with a random intercept could be used with pPTT as the outcome and time and type of therapy as covariates, including their interaction. The number of classes of therapies will need to be discussed given the small sample size and the potential for overfitting.

Rachel Sosland, Urology

  • "Urinary tract infections may affect as many as one third of patients undergoing intradetrusor onabotulinumtoxinA (BTX-A) injection for medication-refractory overactive bladder (OAB). We have retrospectively collected data on 70 patients undergoing intravesical botox injection in 2016 and seek to identify potentially modifiable risk factors for post-operative UTI in patients with non-neurogenic OAB. Several of these patients have undergone multiple injections. We would like to assess their risk for UTI over time and over multiple different injections. We are seeking statistical support to assist in determining the best way to analyze this data in the same patient over time with multiple injections."
  • We discussed several options to address their question of interest. One potential option is to use Poisson or negative binomial regression with the number of UTIs as the outcome for a given person adjusting for covariates of interest such as where the injection was received (OR or clinic), class of antibiotic received (number of classes will need to be discussed to avoid over fitting), and including the varying follow-up times as an offset.
  • We also discussed how to address the question of risk factors for multiple UTIs. Subsetting on those who had at least 1 UTI, a logistic regression with the outcome indicating whether the subject had > 1 UTI and adjusting for pre-determined covariates would potentially address this.
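
A simplified Python sketch of what the follow-up offset does in the Poisson approach above. With a single two-level covariate (here, injection setting), a Poisson model with log follow-up time as the offset estimates the ratio of the crude rates, so that special case can be computed by hand. The function name and all counts are hypothetical; adjusting for several covariates, or using a negative binomial model, needs regression software.

```python
import math

def rate_ratio(events_a, time_a, events_b, time_b):
    """Rate ratio (group A vs group B) with a 95% Wald confidence interval."""
    rate_a = events_a / time_a
    rate_b = events_b / time_b
    log_rr = math.log(rate_a / rate_b)
    se = math.sqrt(1 / events_a + 1 / events_b)  # SE of the log rate ratio
    return (
        math.exp(log_rr),
        math.exp(log_rr - 1.96 * se),
        math.exp(log_rr + 1.96 * se),
    )

# Hypothetical: 18 UTIs over 250 person-months after clinic injections
# vs 12 UTIs over 300 person-months after OR injections
print(rate_ratio(18, 250, 12, 300))
```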

2017 July 19

Cyrus Adams, Surgery/Urology

  • "We are currently investigating patient factors (demographic and clinical) in a group of adult patients with congenital genitourinary disorders. We currently have a redcap database of ~150 patients who recently presented to the adult clinic meeting these criteria. We are interested in analyzing patient demographic and clinical factors that may be associated with renal dysfunction at the time of presentation to the adult clinic (measured by decreased GFR and/or the presence of hydronephrosis or renal scarring)."
  • Second question: Renal dysfunction measured by GFR - typical cutoff is <60, but we recommend also analyzing with GFR as a continuous variable (allows you to keep all information and not make false dichotomies). Only 9 patients had renal dysfunction when categorized, so definitely recommend keeping that outcome continuous. This outcome is fairly normally distributed, which is helpful.
  • First question: predictors of being followed as pediatric patients. Outcome is determined by MDs via chart review. Have 58 Nos ("non-events") and 93 Yesses ("events"); can reliably fit about six degrees of freedom (roughly equal to six variables) in a logistic regression model.
  • Recommend doing graphs of descriptive statistics, overall and by pediatric followup status.
  • Prioritize potential covariates in order of importance/relevance; consider missingness when doing this (if a variable is clinically important but only measured in the hospital, eg education and health literacy, it is less helpful here).
  • Plan is to apply for VICTR voucher.
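
The degrees-of-freedom estimate above can be reproduced with a few lines of Python. This sketch assumes the 10-20 observations-per-parameter rule of thumb used elsewhere in these notes, applied to the smaller of the event and non-event counts; the function name is hypothetical.

```python
def parameter_budget(n_events, n_nonevents, per_parameter=(10, 20)):
    """Rough range of model parameters a logistic regression can support.

    The limiting sample size is the smaller of events and non-events.
    Returns (conservative, liberal) using 20 and 10 per parameter by default.
    """
    limiting = min(n_events, n_nonevents)
    liberal, conservative = per_parameter
    return limiting / conservative, limiting / liberal

# 93 "events" and 58 "non-events" from the chart review
low, high = parameter_budget(93, 58)
print(f"about {low:.1f} to {high:.1f} parameters")
```

With 58 as the limiting count, the liberal end of the range is just under six parameters, which matches the estimate given above.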

2017 May 17

Sara Nelson, Anesthesiology

  • "I visited about a month ago to talk about our drug cost project. I've fit a model, but have some concerns that the assumptions haven't been met. I would like to meet and talk about the current model and possible alternative options such as random/mixed regression. If possible, it would be great to pull up the analysis in R (I can bring my laptop)."
  • Collinearity: try Hmisc::redun() for a redundancy analysis, or varclus() from same package.
  • Might try a negative binomial model due to distribution of cost outcome. To incorporate random effects, might try lme4::glmer.nb()
  • Available variables: procedure code; attending anesthesiologist (248 unique); in-room provider (CNA/resident; 572 unique); surgeon; surgery start time; duration; ASA class; age; gender; case type (surgical specialty; 9 unique); institution (VUMC/MGH); cost; base relative value units (?); provider team (3500 unique)
  • No data available on specific medications used
  • Is duration a strong surrogate for complexity? Strong enough to leave out case type?
  • Possibility: cost ~ ASA class + age + gender + duration + institution + age*institution + ASA*institution, random effect = attending; adjusting for other variables (case type, etc) is likely an artifact of things like duration and will result in collinearity
  • Adjusting for attending would get at what is likely driving at least part of the cost, which is sedation choice, but no way to tease that out

Tanya Marvi, SOM, Medical student

  • "My project is looking at platelet count in patients with musculoskeletal infection. Patients were categorized into local, disseminated, and complicated infection, and I used an ordinal logistic regression to see if we could predict how they would be categorized based on their day 1 platelet count. Additionally, I used rloess to look at the trend of the platelet counts among the different groups. I want to make sure I am interpreting the results correctly and see what other analysis I should consider."
  • The outcome of interest is a 3-level outcome describing severity of infection in patients who present to the ED and are subsequently admitted. One question of interest is whether there is an association of platelet count and type of infection diagnosed. A proportional odds model was fit with platelet count as the single covariate. We recommended fitting splines to the platelet count (3 knots should be fine, given the sample size), and clinically determining what potential confounders are of importance to include in the model. Proportional odds assumptions should also be checked.
  • Because subjects had differing lengths of stay in the hospital, a Cox model could also be fit with time to discharge as the outcome and adjusting for pre-selected confounders.
  • A second question of interest is whether time-varying platelet count is associated with type of infection. A proportional odds model can be fit with robust standard errors. There should be an option in Stata to request the robust estimates.
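
For readers curious what "fitting splines with 3 knots" adds to the model, here is a Python sketch of a restricted cubic spline basis in the truncated-power form described in Harrell's Regression Modeling Strategies. With 3 knots, platelet count enters the model as two columns: the raw value and one nonlinear term. The function name and knot locations are hypothetical; in practice the basis would come from the statistical package (for example, the rms package in R).

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis for one value x.

    Returns [x, nonlinear terms]; k knots give k - 1 columns in total.
    The fitted curve is linear below the first and above the last knot.
    """
    t = sorted(knots)
    k = len(t)
    scale = (t[-1] - t[0]) ** 2  # keeps the nonlinear terms on the scale of x

    def cube(u):
        return max(u, 0.0) ** 3

    terms = [x]
    for j in range(k - 2):
        term = (
            cube(x - t[j])
            - cube(x - t[k - 2]) * (t[k - 1] - t[j]) / (t[k - 1] - t[k - 2])
            + cube(x - t[k - 1]) * (t[k - 2] - t[j]) / (t[k - 1] - t[k - 2])
        )
        terms.append(term / scale)
    return terms

# Hypothetical knots for day 1 platelet count
knots = [150, 300, 450]
for platelets in (100.0, 250.0, 400.0, 550.0):
    print(platelets, rcs_basis(platelets, knots))
```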

2017 May 10

Sara Nelson, Anesthesiology

  • "We performed a study assessing the variation in anesthetic drug costs. We did so by creating multivariate linear regression models in R. We've received reviewer comments with various suggestions we would like to implement, such as combining two of the models. I believe this will involve creating a nested variable; however, I am struggling with getting this to work in R."
  • Current models have a fixed effect for in-room provider (561 df). Suggest replacing this with a random effect, or possibly a nested random effect (attending -> in-room provider). R package nlme or lme4 (newer) might be good for this.
  • Might also look at interaction between case type and duration.
  • Make sure that costs are the same per patient regardless of insurance.
  • Make sure to check model assumptions, even after outcome is transformed.
  • For all model covariates, show a point estimate + CI for the effect on cost. For age especially (modeled with restricted cubic splines), would be good to also show a visual of age vs cost.
  • Potentially reframe paper as "potential predictors of cost" vs "how much variance in cost can we explain."

Bryan Hill, OB/GYN surgical fellow - CANCELLED

  • "Reporting complications after surgery is important for quality improvement. Two methods of finding complications are: 1) administrative data from diagnosis codes and 2) key-word search from a manual chart review. We suspect the administrative reporting method under-reports complications. Primary aim: determine sensitivity, specificity of the administrative method compared to the manual reporting method. Secondary aim: Determine which risk factors are associated with having a complication. #1: Question for statisticians: would the best way to look at our secondary aim be to create a regression model with the outcome "complication" and variables age, body-mass index, setting (outpatient or inpatient), sling type, attending, anesthesia time, operation time, smoking history, diabetes? #2: Sling type is heavily dependent on attending (they like to choose a particular brand or type). How do we adjust our model for that?"

2017 May 3

David Leverenz, Internal Medicine

  • "We have developed an educational podcast for our internal medicine residency program. We are studying the effects of this project through pre and post-intervention surveys. I would like assistance in the statistical comparison of pre and post-intervention survey results."
  • Emphasize descriptive statistics (pre vs post) over p-values. If p-values are needed, chi square tests for categorical variables and Wilcoxon/Mann-Whitney tests for percentages are useful. Also think about boxplots to show variability in data instead of a single summary statistic.
  • Include measures of variability (interquartile ranges) as well as summary statistics (median).
  • Make sure to address differences in response rate in pre vs post, and describe any differences in patient populations.

2017 April 26

Susan Smith, Critical Care Pharmacy

  • "Purpose of project is to examine the efficacy of a short versus long duration of antibiotics for the treatment of intraabdominal infections. We would like help with our binary logistic regression model."
  • This is a really complex analysis due to immortal time bias, confounding, etc. Option A would be a Cox model for time to treatment failure with primary exposure = daily antibiotic use. This is a complex model to fit for a non-statistician; suggest contacting VICTR to see about stats support for this.
  • Option B... maybe a Kaplan-Meier curve removing patients from N at risk as they go off antibiotics?

Jamie Robinson, Surgery

  • "I would like assistance with a regression analysis looking at factors that may affect survival after portoenterostomy for biliary atresia."
  • Data set has 48 patients with this rare condition; about 20 had the outcome of interest (transplant or death). This limits what can reasonably and reliably be put in a survival model.
  • Consider lag time between being placed on the transplant list and actual transplantation - would be good to describe this.
  • When fitting model, use a Cox proportional hazards model (coxph in R's survival package; outcome will be created with the Surv() function; look at vignettes and/or look for UCLA tutorials). Descriptives and qualitative info will be helpful with a small population.
  • Choose a common followup time - maybe five years, two years? Look at minimum/maximum followup time to determine.

2017 April 12

Nishant Ganesh Kumar, Plastic Surgery/Medical School

  • "Would like to conduct a multi-regression analysis of opiate use and hospital length of stay against other variables being studied in an Enhanced Recovery after Surgery protocol for microsurgical reconstruction."
  • Primary outcomes are hospital length of stay and total opioid use. Hospital LOS has a very skewed distribution; original analysis used linear regression. We recommend checking RP plots and, if assumptions are not met, using either ordinal logistic regression or a Cox proportional hazards model with time to hospital discharge as the outcome.

2017 April 5

Ashley McCallister, Pharmacy

  • "My research project is in the NICU on Vitamin A use. I need help identifying what types of statistical tests should be run on the data."
  • Primary outcome is BPD (yes/no); secondary outcomes are discharge on oxygenation and days on the ventilator
  • Currently patients who died in the NICU are excluded; this will present severe limitations due to confounding, but without statistical support it is complicated to account for death when including all patients
  • For dichotomous variables, can use chi square test (vitamin A exposure vs BPD, eg). For days on the ventilator, use a Wilcoxon rank sum test (like a t-test, but does not assume normality)
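
A minimal Python sketch of the chi-square test mentioned above for a 2x2 table such as vitamin A exposure by BPD. It computes the Pearson statistic without continuity correction and its p-value on 1 degree of freedom. The function name and the counts are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) on 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p

# Hypothetical counts: rows = vitamin A yes/no, columns = BPD yes/no
print(chi_square_2x2(10, 30, 20, 20))
```

With small expected cell counts, Fisher's exact test is usually preferred over this approximation.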

Ida Aka, Clinical Pharmacology

  • "I need help with my sample size calculation for my PPI and SSRI projects. Both projects are looking at CYP2C19 *2 and *17 variants."
  • Extended discussion on how sample size/power can vary depending on genotype proportions in the sample; will need to investigate distributions of both genotypes and outcomes to decide how many patients are feasible to genotype and what kind of tests to use in eventual analysis
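
To illustrate why genotype proportions matter for power, here is a Python sketch using the normal approximation for comparing an outcome proportion between variant carriers and non-carriers. The function name, outcome proportions, and carrier fractions are all hypothetical; the real calculation depends on the outcome type and test chosen for the eventual analysis.

```python
from statistics import NormalDist

def power_two_proportions(n_total, carrier_frac, p_carrier, p_noncarrier, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    carrier_frac is the share of the genotyped sample carrying the variant.
    """
    n1 = n_total * carrier_frac
    n2 = n_total - n1
    se = (p_carrier * (1 - p_carrier) / n1
          + p_noncarrier * (1 - p_noncarrier) / n2) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p_carrier - p_noncarrier) / se - z_crit)

# Same 200 genotyped patients, different carrier fractions (hypothetical)
for frac in (0.5, 0.3, 0.1):
    print(frac, round(power_two_proportions(200, frac, 0.40, 0.20), 2))
```

For a fixed number of genotyped patients, power drops as the variant becomes rarer, because the smaller group dominates the standard error.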

2017 March 23

Kristy Broman, Surgery - No Show

  • "The question I am trying to answer is whether there is a way to compare two incidence ratios. I am using the SEER database and SEER Stat, which has built-in modules for calculating the age-standardized incidence ratio for specific events. The output I get is the total N, the total event number, and the standardized incidence ratio. This is essentially the ratio of observed to expected, but I cannot know how the expected is determined (this is a "black box" within the module). So I want to know if there is a way to essentially compare the already calculated standardized incidence ratios."

2017 March 15

Leslie Fowler, Anesthesiology

  • "Prior to developing a Residents as Teachers curriculum within our department, Dr. Robertson and I sought to gain insight into the teaching perspectives of our residents by administering the Teaching Perspective Inventory (TPI). We administered a follow-up survey to gather information regarding dominant and recessive teaching perspectives."
  • "Our manuscript was accepted for publication with revisions. One reviewer's notes indicated we should consult a statistician to see if raw data can be used for other statistical analysis as well as a T test. Another reviewer's comments stated to consider researching a theoretical framework to base the research design. Should we conduct a T Test with data we collected? Is that the most appropriate? Can the raw data be used for other analysis?"
  • An already validated survey was used to evaluate teaching mode preference in 2nd, 3rd, and 4th year residents. This validated survey converts the raw scores to weighted scores in each of five different teaching modality preferences. Frequencies of primary teaching modalities were computed according to whether the resident planned an academic or private practice career. In order to assess whether there was a difference in the distributions of the frequencies of primary teaching modalities across the academic/private practice groups, we recommended a chi-square test. In addition, we recommended doing a Wilcoxon Rank Sum test using the raw scores across the two groups. Finally, in order to better visualize the distribution of the raw scores, we recommended creating box plots of the raw scores for each group.

2017 March 8

Amol Utrankar, Anesthesiology

  • "I am a medical student working with several members of Department of Anesthesiology on a project examining factors associated with in-encounter mortality among patients who are escalated to the intensive care unit following multiple rapid response team activation events, using a sample of 80 patients from 2016 VU rapid response data. I have several continuous and categorical variables of interest (Sepsis Related Organ Failure Assessment, organ dysfunction by system, age, gender, referring rapid response team, and hours elapsed between rapid response events). I would like to double-check my statistical methods with someone who has more experience in statistical analysis and Stata; I've been using chi-squared tests, Fisher's exact tests, and logistic regressions to assess associations, but want to make sure that I'm applying these tests correctly and setting up my variables properly."
  • There are about 87 subjects in their data with about 20 deaths. They are interested in exploring the association of different risk factors with mortality. We discussed the rule of thumb governing how complex of a logistic regression model could be fit (include roughly one covariate for every 10-20 deaths).
  • They have more covariates of interest to include in the model than degrees of freedom allowed without overfitting. Therefore, we discussed avoiding using univariate analyses to drive model selection. Rather, clinical knowledge and literature reviews should govern which covariates are selected for the model.
  • There are several complications that are of potential interest as covariates. One way of including all of them in the model is simply to create an indicator for whether any complication occurred or not or to sum them up and include the total number of complications.
  • We also discussed ways of displaying the data to help tell the story. One suggestion was to create boxplots and strip charts of the number of hours between the first call to the ICU team to when the patient was elevated to the ICU floor, stratifying by survival status. Points on the graph could be coded by shape and/or color to indicate sex or age or any other categorical variable of interest.
  • A potentially more complicated analysis that would account for variability in the different ICU teams' threshold for elevating a patient to ICU would be to fit the logistic regression model with a random effect. This may not converge due to the small number of ICU teams (~4-5).

2017 March 1

Susan Smith, Critical Care Pharmacy Resident

  • "This is a retrospective study examining the effects of neuromuscular blockers on time to abdominal closure in trauma patients undergoing damage control laparotomy managed with an open abdomen. I would like help determining what type of regression is most appropriate to answer two different questions regarding my data set: 1. Does neuromuscular blockade affect the time to abdominal closure following damage control laparotomy? 2. Does neuromuscular blockade affect the time to goal RASS? For the first question, at least one of the covariates is time-dependent. I also have a few specific questions regarding how to interpret the results from these analyses."
  • Recommend a Cox proportional hazards model for all outcomes (time to...). No need for time dependent covariates (all covariates are baseline).
  • Also recommend including patients who died before primary outcome (abdomen closure) - this will make results more generalizable to all patients who receive this procedure, vs. those who survive (which we can't know when a patient is admitted).
  • For help with SPSS, look for UCLA tutorials on Cox regression and on interpreting its output.

Brian Adkins, Pathology

  • "I am comparing rates of atopy in patients with allergic transfusion reactions. I need help calculating significance."
  • Strongly recommend collecting data for a control group (patients who received a transfusion and did not have an allergic reaction). Might be possible to do this using BioVU. With current data, you can describe the prevalence of allergies among patients who had a transfusion reaction, but can't draw any statistical conclusions about a difference in allergy rates between them and other transfusion patients.

2017 February 22

Sara Nelson, Anesthesiology

  • "We are looking to determine the effect of the pain consult service on mortality and morbidity in rib fracture patients. The protocol for the consult service was implemented in 2013. Our data is from 2010-2015, so I think there needs to be a before and after analysis utilizing matching. Mortality is the primary analysis, there are numerous secondary analyses--pneumonia, respiratory failure, 30-day vent free days, 30-day ICU free days, length of stay and tracheotomy."
  • 1152 patients seen by consult service after implementation; total data set has ~5000 patients, but not all data is available before protocol implementation
  • Raw mortality rates are 7% prior to implementation and 3% after; about 400 deaths in the data set
  • Recommend excluding patients seen after chest service began, but before official protocol implemented - too many unknowns and variables in this group to allow for clean conclusions
  • Main research question: are outcomes different among patients who met screening criteria, or would have met screening criteria, before and after implementation of the screening protocol? -> Need to exclude patients who never would have met screening criteria
  • Recommend Cox model for mortality, proportional odds logistic regression for other continuous outcomes, logistic for pneumonia, etc; limiting sample size is number of events (Cox model) or minimum of events/non-events (logistic)
  • Matching is probably the cleanest way to do this - match on age, number of rib fractures, ISS? Or match on propensity score: create model for propensity of being screened (among patients in post-implementation period) using data available, then use that model to calculate propensity score for all patients in pre- and post-implementation periods and match on that
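
A small Python sketch of the matching step described above, assuming propensity scores have already been calculated for every patient. It pairs each post-implementation patient with the nearest unused pre-implementation patient, within a caliper. The function name, the greedy 1:1 strategy, the caliper width, and the example scores are all assumptions for illustration; dedicated matching software offers better-tested options.

```python
def greedy_match(post_scores, pre_scores, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    post_scores / pre_scores map patient id -> propensity score.
    Returns a list of (post_id, pre_id) pairs; each pre patient is used once.
    """
    available = dict(pre_scores)
    pairs = []
    for post_id, score in sorted(post_scores.items(), key=lambda kv: kv[1]):
        if not available:
            break
        nearest = min(available, key=lambda pid: abs(available[pid] - score))
        if abs(available[nearest] - score) <= caliper:
            pairs.append((post_id, nearest))
            del available[nearest]  # match without replacement
    return pairs

# Hypothetical propensity scores
post = {"p1": 0.30, "p2": 0.55, "p3": 0.90}
pre = {"c1": 0.28, "c2": 0.50, "c3": 0.62, "c4": 0.10}
print(greedy_match(post, pre))
```

Patients with no match inside the caliper are left out, so the matched cohort should be described alongside the unmatched patients to show who the results apply to.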

2017 February 15

Elena Nedelcu, Pathology

  • "I need assistance with choosing the right tests to interpret correlation between variables and outcome, and with performing them."
  • Data was collected on 324 liver transplant patients over three phases: baseline, practice changes, and post-implementation; main exposure is blood utilization, outcomes include LOS, mortality, discharge disposition
  • Potential for mixed effects model - want to account for surgeon
  • With multiple outcomes and end goal of manuscript, statisticians recommend 90-hour VICTR voucher

2017 January 25

Laurie Tucker, Department of Pediatrics - Postponed

  • Follow-up to previous clinic visit.
  • Data looks great. There are a few additional pieces on the list that Laurie is trying to obtain from StarPanel. Notes from previous sessions are below.
  • Clinic statisticians estimate 90 hours for VICTR application.

Johnny Wei, Medical Student/Anesthesiology

  • "I am a 3rd-year medical student who is working with the Department of Anesthesiology on a project investigating demographic and clinical factors associated with post-operative opioid use. In short, we are looking at what factors (i.e. age, sex, type of surgery, etc) are associated with having an opioid or benzodiazepine prescription at various time points in the 12 months after a procedure. I have a rough idea of what types of figures I would like to create, and have already created the initial iterations of them. However, because I am a relative novice regarding biostatistics and using my analysis software (Stata), I’d like to discuss my methodology and my analysis process and see if I’m doing anything inappropriate with my data management or analysis. Most of the tests I have been running are chi-square/Fisher’s and logistic regressions, and I would appreciate advice on the appropriateness or the optimization of these tests. In short, I’m more interested in someone looking over the work and code I have done so far, and seeing if there are any major red flags in my methodology rather than coming up with the analysis plan itself (although advice on the latter would be very much appreciated)."
  • Recommend removing p-values from Table 1, un-collapsing outcome to regain information lost in categorizing, and using Kruskal-Wallis test instead of one-way ANOVA.
  • One resource for sample size determination:

2017 January 18

Niels Johnsen, Urologic Surgery

  • "We are working on a project that attempts to determine predictors of bladder rupture in patients following blunt-trauma pelvic fractures. A prior study was performed at an outside institution with similar (or intended to be similar) methods using a smaller cohort of patients. We chose all bladder rupture patients plus control pelvic fracture patients without rupture (4:1) and have the data on these patients. The hope is to identify clinically significant predictors of bladder rupture based on fracture configurations and then to devise a clinical prediction model to risk-stratify patients who present with pelvic fracture for having bladder ruptures. I have attached the previously published similar study that I'm referring to as a reference and will bring the deidentified database with me on Wednesday."
  • Motivating paper uses univariate variable selection and stepwise backwards selection to create the final model. We do not recommend either of these.
  • We do recommend choosing a pool of potential predictors based on clinical knowledge and available data, prioritizing based on potential clinical importance.
  • 140 bladder rupture cases (minimum event size)
  • Reference - Frank Harrell's Regression Modeling Strategies (chapters on predictive modeling and data reduction)
  • Planning to submit VICTR voucher when mechanisms are available again (check with Lesa Black); we estimate 90 hours to develop and validate the prediction model and prepare manuscript

Joel Musee, Department of Anesthesiology

  • "We have put together a study to examine whether a commonly used perioperative device (Lifebox) can be used to alert clinicians of hypoperfusion. The Lifebox is a pulse oximeter and measures oxygen saturation on extremities. The monitor has a graphical readout made up of 15 bars, with more bars associated with a better signal, a proven surrogate for perfusion. We hypothesized that mean arterial pressures of 55 or less would not lead to a perfusion signal of 5/15 bars. The biggest question is how to best analyze the data to test our hypothesis and also what kind of power we would need for a study like this."
  • Recommend not dichotomizing unless absolutely necessary - above scenario, for example, treats Lifebox measurement of 6 and 15 as exactly the same, which is likely not true
  • Data will be manually collected by staff looking at blood pressure and Lifebox at the same time
  • Likely repeated measurements on each patient (during preop, while administering anesthesia...)
  • Recommend collecting information on other variables - age, amount of sedation at a given measurement, etc; make sure some kind of patient identifier is present
  • Use longitudinal database in REDCap for data collection
  • Recommend fitting a spline (search "restricted cubic splines") - device might be more closely associated with MAP at certain points than others
  • Could be completely separate patient populations - one could be women undergoing C-sections; might be advisable in this case to do subgroup analyses?
  • Will probably apply for VICTR voucher - estimate 90 hours
  • Think about how many patients (in each group?) are feasible to enroll and how many time points could be measured
  • Next step would be to repeat the study in Kenya - possible that we'd see different results if MAP tends to be different, for example

2017 January 11

Chelsea Isom, General Surgery

  • Updates to project.
  • Chelsea has cleaned the data and done some preliminary analysis - project will now probably take ~60 hours to complete. See notes below for details.

Maie El-Sourady, Internal Medicine

  • "I am an attending physician on the Palliative Care consult service and I have been collecting data from the learners that rotate on our service for the past 4 years. They complete a pre-test and a post-test evaluating their comfort level with basic palliative care topics. I would like help doing the statistical analysis with this data. I have several cohorts (medical students, internal medicine residents, visiting fellows, etc) but the test is the same for all of them."
  • Recommended starting with boxplots of raw data to look at differences in overall distribution for pre and post scores on each individual question. Excel doesn't have a template for this - look into SPSS.
  • Possible further investigation could involve collapsing into categories using Cronbach's alpha or similar and looking at differences in pre/post by type of learner (resident/student/other).
  • Recommend asking if there is access to a biostats collaboration plan; if not, and if further analysis is needed, can apply for a VICTR voucher (probably 40 hours).

2017 January 4

Laurie Tucker, Department of Pediatrics - Postponed

  • Follow-up to previous clinic visit.

2016 December 28 - canceled due to holiday

2016 December 21

Chelsea Isom, General Surgery -- Postponed

  • Updates to project.

2016 December 14

Laurie Tucker, Department of Pediatrics -- Canceled

  • Follow-up to previous clinic visit.

Andrew Smith, Pediatrics/Anesthesiology - No show

  • I am currently embarking on a multicenter look at variations in value delivery to critically ill children across congenital cardiac surgical centers across the US, using data merged from two data streams, a clinical registry (Pediatric Cardiac Critical Care Consortium or PC4) and an administrative data set (Pediatric Health Information Service) to try and pull together the numerator and denominator of the value equation… I was wondering who would be able to help me think about how best to look at cost (and value) comparisons from a statistical standpoint, with respect to outcomes including mortality. Specifically, given that some children die relatively soon after surgery, they may not incur substantial cost though one would also argue that they didn’t get the “value” they wanted from their episode of care… I’m thinking about this from a censoring and survival curve/Kaplan-Meier standpoint, but I’m sure it is more complex than that… which is where I think some healthcare economic statistical prowess would come in handy.

Kazeem Oshikoya, Clinical Pharmacology

  • Requesting help with interpretation of a data analysis.
  • Looking at risk factors for composite adverse event (change in BMI, increase in blood sugar, others) among pediatric patients prescribed risperidone for at least four weeks. Observation period is 16 weeks (>=4 weeks of risperidone + additional weeks up to 16). Eventually will look at genetic variants but focusing on this for now.
  • Currently has data on 210 patients; among these, has 45 events. Rule of thumb: the number of parameters a logistic regression model can support is min(events, non-events) divided by 10 to 20. So, with 45 events, can include about 4 (maybe 5) parameters at the liberal end of that rule; any more, and the model will be overfit, meaning it will fit this data set very closely but may give radically different results if applied to a different cohort.
  • Recommend not doing testing on univariate descriptives: for example, might be OK to describe age among patients prescribed risperidone on vs off-label, but don't test this difference. Can be misleading due to presence of confounders. Use clinical judgment/literature to prioritize which covariates should be included in the model.
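
The events-per-parameter rule of thumb above can be checked with a few lines of arithmetic. This is only a sketch: the function name is hypothetical, and the 10-to-20 range is the rule quoted in the notes, not a hard limit.

```python
def parameter_budget(events, non_events, per_param_low=10, per_param_high=20):
    """Rough number of model parameters supported by a binary outcome.

    Uses the rule of thumb min(events, non_events) / (10 to 20).
    Returns (conservative, liberal) whole-number budgets.
    """
    limiting = min(events, non_events)           # the rarer outcome level drives the budget
    conservative = limiting // per_param_high    # 20 outcomes per parameter
    liberal = limiting // per_param_low          # 10 outcomes per parameter
    return conservative, liberal


# Numbers from the notes: 210 patients, 45 events.
conservative, liberal = parameter_budget(events=45, non_events=210 - 45)
print(f"Parameter budget: {conservative} (conservative) to {liberal} (liberal)")
```

Categorical covariates and spline terms use more than one parameter each, so the budget counts degrees of freedom rather than variables.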

2016 November 30

Laurie Tucker, Department of Pediatrics

  • Follow-up to previous clinic visit.
  • Went over spreadsheets of data collected since last visit and made suggestions: data dictionary to indicate what each variable level means; get ICD9 codes in addition to CPT codes, and think about how to cluster ICD9 codes; think about clinical outcome variables to represent general questions of interest (eg, we can model language category vs level of ED triage).
  • Plans to straighten out data issues and come back to clinic 12/14.

Alexander Hawkins, Department of Surgery - had to cancel

  • "Working with patient satisfaction scores and looking at association between disease processes. Would like help with how to interpret scores and adjust for providers, pain scores, etc."

2016 November 23 - canceled due to holiday

2016 November 9

Chelsea Isom

  • "Approximately 20% of patients with colorectal cancer (CRC) present with metastatic disease, most commonly to the liver or lungs. Successful resection of these metastatic foci leads to significant long-term survival. Less commonly, patients present with isolated metastasis to non-regional lymph nodes (NRLN) and little is known regarding the role of resection in these patients. The primary aim of this study is to evaluate the outcomes of patients with CRC who undergo resection of NRLN metastasis. A retrospective cohort study of patients diagnosed with CRC and NRLN metastasis was performed using the Surveillance, Epidemiology, and End Results database (2004-2012). Demographic and clinical factors will be compared for patients who underwent resection of NRLN metastasis and those who had not. Kaplan-Meier and log-rank analysis will be used for survival analysis. Logistic regression analysis will be used to assess factors associated with resection of NRLN metastasis."
  • Data set is one record per patient, 829 patients of interest. Limited data available on potential predictors (registry data). Suggested things to look into: propensity for getting surgery vs not, competing risks, multiple mortality models looking at patients who have had opportunity for at least 1 year of followup and the subset who has had at least 5 years of followup.
  • Estimate about 90 hours for data management, analysis, manuscript writing and revisions.

Mark Clay & Ashley Newell, Pediatrics (Cardiology/Critical Care)

  • "Retrospective project looking at increased BMI as a risk factor for increased resource utilization in patients after Bidirectional Glenn procedure. The data was previously analyzed using a Loess Regression in R software. The data has been edited and we are seeking help with repeat analysis and graph generation."
  • Need to look at model assumptions for prior analyses - for example, residual vs predicted plot. Based on distribution of LOS and ventilator hours, we have concern that model assumptions are violated and therefore the model results would not be reliable.
  • If that's the case, look into perhaps a negative binomial model in R (function glm.nb() in the MASS library).
  • Continue keeping Z score for weight as a continuous variable in the model. Consider adding patient location (followup at VCH vs clinics in other areas) to model, but may need to prioritize covariates: With 109 patients in a linear regression model, can only have ten degrees of freedom (roughly corresponds to covariates) and still trust model results.
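
One quick descriptive check that can inform the choice of a negative binomial model is the ratio of the variance to the mean of a count-like outcome; a ratio well above 1 suggests overdispersion relative to a Poisson model. The sketch below uses made-up length-of-stay values and a hypothetical function name, and it ignores covariates, so it complements the residual plots recommended above and does not replace them.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean of a count outcome.

    Under a Poisson model this ratio is about 1; values well above 1
    suggest overdispersion (a setting where negative binomial models help).
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the ratio is undefined")
    return variance(counts) / m


# Hypothetical lengths of stay in days (illustration only, not study data).
los_days = [3, 4, 4, 5, 5, 6, 7, 9, 14, 31]
ratio = dispersion_ratio(los_days)
print(f"variance/mean = {ratio:.2f}")
```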

Leah Hauser, Otolaryngology

  • "Studying olfactory (smell) dysfunction in CRS. There is some prior evidence that tissue eosinophilia contributes, but this role is controversial. Our 3 major questions are: 1. Does objective olfactory function measured by age/sex adjusted UPSIT score correlate with tissue eosinophil counts?; 2. Is olfactory function in eosinophilic CRS due to tissue eosinophilia or disease severity?; 3. Is the effect of eosinophilia on olfactory function associated with type of CRS (CRS vs CRSwNP)? We think that preliminary data analysis shows that eosinophil counts (column J) correlate moderately with UPSIT score in CRSwNP but not at all in CRS (without NP), but we suspect this may be due to worse disease rather than the eosinophils themselves. We are not sure how to best analyze our data to determine the etiology of olfactory dysfunction."
  • Looked at distribution of outcome (UPSIT scores, raw and adjusted); distribution is bimodal, which makes linear regression problematic. Consider other regression options like proportional odds (aka ordinal) logistic regression for multivariable associations.
  • For univariate associations, Pearson correlations are probably invalid for the same reason; use Spearman (rank) correlations instead.
  • Clinically investigate reason for bimodal distribution.
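
For reference, a Spearman correlation is a Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below shows that calculation with the standard library only; the function names and the example values are hypothetical, and in practice a statistics package would also supply a confidence interval or p-value.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical eosinophil counts and UPSIT scores (illustration only).
eos_counts = [2, 10, 35, 50, 120]
upsit = [38, 36, 30, 22, 12]
rho = spearman(eos_counts, upsit)
print(f"Spearman rho = {rho:.2f}")
```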

2016 October 26

Katie DesPrez, Critical Care?

  • "Retrospective clinical project on ARDS. Briefly, I am interested in understanding whether my correlation between the variable I've called OSI and mortality is valid even in patients who have no blood gas (i.e., in this data set, patients who do not have the variable OI). Preliminarily it does not seem to be, but I'm wondering whether this is because the data is underpowered for that particular analysis."
  • Using Stata to compare non-nested ROC curves:
  • Could also do a model with both oxygenation variables and see whether blood gas version adds additional predictive value after adjusting for pulseox version.
  • Also suggested looking at time to death (vs died/survived), and looking at SSDI for death dates especially for patients discharged to hospice.

Laurie Tucker, Department of Pediatrics

  • "A project looking at the acute health care utilization patterns of non-English speaking patients in comparison to English speaking patients. The study is set up as a retrospective cohort study. We have gathered data from Star Panel, and I would like a bit of help determining the next steps in analyzing the data."
  • First step is to determine who exactly data has already been collected on: patients who were already established as of July 2013, or does it include patients who were born or were established after that date? If the former, everyone should have the same followup time; if the latter, need to deal with different followup times in analysis.
  • Also think of potential confounders for relationship between language group and rate of acute care visits - does one group have higher severity of illness, for example.
  • Recommend coming back to clinic after discussion with data colleagues. Planning to apply for VICTR voucher.

2016 October 19

Billy Cameron, Surgical ICU/Trauma

  • "I am currently working on a project to justify a Nurse Practitioner team in the Trauma division at VUMC. We performed a 12-week pilot, for which we have good data showing decreased length of stay. One of the data points was to compare an acuity scale: Injury Severity Score (ISS) to show that acuity remained pretty even from the previous year before the pilot compared to the pilot period. I am trying to figure out what the statistical significance of the difference is (we are wanting to show that the level of patient acuity according to the ISS was relatively stable). The comparison period prior to the pilot, the ISS was 12.62 (scale of 0-75) for n= 281. For the pilot period, the ISS was 11.99 for n= 332."
  • Preparing for presentation to leadership and want to show that difference in LOS is not due to clinical factors like difference in mortality or ISS between retrospective and pilot period. For these purposes, recommend t-test or Wilcoxon test (depending on distribution of data) comparing ISS scores between the two periods, and chi-square test for proportion of deaths during each time period.
  • For eventual manuscript, will need more advanced analyses; recommended going through VICTR for statistical analysis support.
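
As an illustration of the chi-square comparison of deaths between the two periods, the sketch below computes the Pearson chi-square statistic for a 2x2 table (no continuity correction) and its p-value on 1 degree of freedom. The group sizes 281 and 332 come from the request above; the death counts are invented purely for illustration, and the function name is hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) on 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value


# Hypothetical death counts; only the period sizes (281, 332) are from the notes.
deaths_before, n_before = 14, 281
deaths_pilot, n_pilot = 15, 332
stat, p = chi_square_2x2(deaths_before, n_before - deaths_before,
                         deaths_pilot, n_pilot - deaths_pilot)
print(f"chi-square = {stat:.3f}, p = {p:.3f}")
```

A non-significant p-value here does not by itself show that the periods are equivalent; reporting the two proportions with a confidence interval for their difference is more informative for the leadership presentation.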

Joseph Kuebker, Endourology

  • "We are attempting to design a study to see if the effective dose (radiation exposure) for a particular type of xray we do is comparable to the generally accepted historical average. Specific questions are how many patients we should enroll to detect differences of >10% (if possible) and what tools to use given we are comparing against a generally accepted number and not against an actual group of patients/exams."
  • Main "punch" will be plotting the data to describe it. Try boxplots with raw data, perhaps seeing Example 2 here for guidance in Stata:
  • If a test is absolutely necessary, a one-sample t-test (or nonparametric version, depending on distribution of the data) would be most appropriate. But main value of project will be describing how VUMC patients are dosed compared to current guidelines (single number). Recommend minimum of 20 patients per clinical category (eg, female overweight, male normal weight...).
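
For the sample size question, a rough normal-approximation calculation for a one-sample comparison against a fixed reference value looks like the sketch below. The 30% relative standard deviation is an assumption made up for illustration (the notes give no variability estimate), the function name is hypothetical, and a t-based calculation would give a slightly larger n.

```python
from math import ceil
from statistics import NormalDist


def one_sample_n(delta, sd, alpha=0.05, power=0.80):
    """Approximate n for a two-sided one-sample test of a mean against a
    fixed reference value: n = ((z_{1-alpha/2} + z_{power}) * sd / delta)^2."""
    if delta <= 0 or sd <= 0:
        raise ValueError("delta and sd must be positive")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(((z_alpha + z_power) * sd / delta) ** 2)


# Detect a 10% difference from the historical average, expressed relative to
# that average, ASSUMING a standard deviation of 30% of the average.
n_needed = one_sample_n(delta=0.10, sd=0.30)
print(f"Approximate patients needed: {n_needed}")
```

Because the result scales with (sd / delta) squared, a pilot estimate of the dose variability in each clinical category matters more than any other input.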

2016 October 12

Jessica Grahl, Pharmacy

  • Basic question: whether antimicrobials are associated with an increased risk of delirium among critically ill patients, using an established cohort (BRAIN-ICU) plus additional data collected from StarPanel.
  • Mental status changes daily and can be normal, delirious or comatose. One idea: multinomial regression looking at status "tomorrow" vs antimicrobial use + covariates "today." How to account for repeated measurements within patients? (Jennifer and Rameela have done cluster bootstrapping; this is complicated and takes a while)
  • Simpler idea: take out all comatose days from outcome, only look at delirium vs normal status. This limits what we're able to say from the study, but does simplify analyses.
  • Antimicrobials = antibiotics, antivirals and/or antifungals. Patients are often on >1 of these classes and in the case of antibiotics, could easily be on >1 drug of the same class. Also could be lots of interactions between subclasses and other confounders/modifiers.
  • Depending on the type of analysis chosen, this could be a very time-consuming project that might bump it up to the highest level of VICTR projects. If the logistic regression approach (or something similar) is chosen, estimate 90-100 hours for a typical project with data management, manuscript revisions, etc.

Joseph Kuebker, Endourology

  • "We are attempting to design a study to see if the effective dose (radiation exposure) for a particular type of xray we do is comparable to the generally accepted historical average. Specific questions are how many patients we should enroll to detect differences of >10% (if possible) and what tools to use given we are comparing against a generally accepted number and not against an actual group of patients/exams."
  • Missed clinic

2016 October 5

Debra Braun-Courville, Pediatrics

  • Project is looking at weight gain among adolescents given a specific type of birth control; VICTR application was sent back due to lack of control group. Have access to medical records; suggest matching cases (girls who received a specific type of birth control) to controls (girls who did not receive birth control) as well as possible, using age, race, BMI, and any other available factors.

2016 September 28

Luis Huerta, Pulmonary/Critical Care

  • "I am working on a pilot single center cluster-randomized multiple-crossover trial of contact precautions in the Vandy Medical ICU which has not begun yet, and would like assistance with designing the statistical analysis plan, particularly taking into account the cluster-randomization of the planned trial."
  • Brought up several concerns that have been thought through: seasonality, proximity between patients/presence of infection on floor, how to recruit sites for larger study, different patient populations in different ICUs, admission diagnoses
  • Suggested following up on Frank's email and meeting with him, Dan B, Robert, who have experience with these types of trials

2016 September 21

Ashley Kroeger, Pediatric Critical Care

  • >1000 patients discharged from pediatric cardiac ICU; looking at risk factors for readmission
  • Two different versions of Pediatric Early Warning Signs score: validated, and VCH-specific (extra components)
  • Question 1: is PEWS score, or the difference in PEWS between ICU discharge and floor arrival, predictive of time to ICU readmission?
  • Question 2: do the extra components added at VCH add helpful information when it comes to predicting readmission?
  • Possible analysis: Cox proportional hazards model with outcome = time to readmission (patients who were never readmitted are censored at hospital discharge), covariates = standard PEWS score + score on VCH-specific component + confounders
  • could do two versions of above, one using score at ICU transfer and one using score at floor arrival
  • be wary of multiple hospitalizations per patient - may need to deal with this in analysis or just take first hospitalization per patient
  • estimate ~90 hours for VICTR voucher

Alex Hawkins, Surgery

  • Trying to sort out statistical analysis for three separate cohorts from the NCDB: a group that got radiation pre-op, a group that got it post-op, and a group that never got radiation
  • Question 1 - does neoadjuvant radiation improve rate of R0 resection and/or overall survival?
  • Question 2 - does preop radiation improve rate of survival?
  • Suggest time-varying Cox model to incorporate post-op radiation properly, since patients will start post-op radiation at varying times after surgery (time 0) (including it as a single value would introduce immortal time bias - patients who die faster don't have opportunity to have radiation postop)
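
To make the time-varying idea concrete, the sketch below shows one common data layout for such a model: each patient's follow-up is split at the start of post-op radiation into (start, stop] intervals, so time before radiation is counted as unexposed. The function and field names are hypothetical, and the resulting rows would still need to be passed to Cox regression software that accepts start/stop (counting-process) input.

```python
def split_follow_up(follow_up_time, died, postop_rt_start=None):
    """Split one patient's follow-up (time 0 = surgery) into counting-process
    rows of the form (start, stop, on_radiation, event).

    Time before post-op radiation starts is coded as unexposed, which is
    what avoids immortal time bias.
    """
    if postop_rt_start is None or postop_rt_start >= follow_up_time:
        # Never started post-op radiation during follow-up: one unexposed row.
        return [(0.0, follow_up_time, 0, int(died))]
    rows = []
    if postop_rt_start > 0:
        # Unexposed interval; the patient is alive at its end by construction.
        rows.append((0.0, postop_rt_start, 0, 0))
    # Exposed interval carries the patient's final event status.
    rows.append((postop_rt_start, follow_up_time, 1, int(died)))
    return rows


# Hypothetical patients (months since surgery), for illustration only.
print(split_follow_up(follow_up_time=24, died=True, postop_rt_start=3))
print(split_follow_up(follow_up_time=2, died=True))  # died before any radiation
```

The second patient contributes only unexposed time, rather than being counted as a death in a "never irradiated" group fixed at baseline.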

2016 September 14

Maya Yiadom, Emergency Medicine

  • Questions about how to phrase results of study with a small sample size and trending-but-not-significant results
  • Key suggestions: make sure to compare patients with and without missing data; try to compare 54 sites with data vs national characteristics (region, hospital type, anything else you can get); discuss limitations of sample size/power, potential bias and overfitting - don't overstate results

Alice Hensley, Pediatric Critical Care

  • "I am working on a project looking at multitasking abilities and comparing scores on an online multitasking test to residency milestone assessments, but would like guidance on how best to analyze the data that I will be collecting."
  • Out of 100 residents, currently only have data on 22, so priority #1 is getting people to take the multitasking test as soon as possible (needs to be far enough before six-month evaluation to truly be baseline measurements).
  • Get as much data as possible on demographics and resident characteristics, to be able to compare residents who did and didn't take the test and possibly impute for missingness at the time of analysis
  • Probably use proportional odds logistic regression (outcome is a score 1-5)

2016 September 7

Shaun Mansour, med student/global health

  • won't be here until 12:45; see preliminary info in email

Joshua Arenth, Pediatric Critical Care

  • "I am planning a study evaluating the effectiveness of a curriculum on communication skills in the ICU. I would love to talk with someone about project design and potential statistical analysis requirements."
  • Descriptive statistics will tell most of the story: percentages in each of the three communication styles before and after intervention by intervention group
  • To get p-value, suggest logistic regression: [optimal vs suboptimal style, after intervention] = [style before intervention] + [intervention, yes/no]; odds ratio for intervention is what will tell you whether the intervention group is different from the non-intervention group
  • Spaghetti plot showing change from before to after intervention, with two groups in different colors
  • Estimate about 40 hours for analysis and manuscript through VICTR

2016 August 31

Vance Albaugh & Georgina Sellyn, Surgery

  • Powering a longitudinal study examining cognitive function after surgical (two types) or medical weight loss
  • Could do one/both of the following for a very simplistic approach: paired t-test or two-sample t-test (combined surgery groups vs medical); this should give the "worst case" scenario
  • Feasibility: could easily enroll ~100 patients in each of three groups, cost not a major issue
  • Main interest is whether differences in cognition are seen very early (1 or 3 months) as opposed to the known differences at 12 months after surgery; could do a mixed effects model (longitudinal) for final analysis

Shaun Mansour, medical student/Global Health

  • moved to next week at 12:45

2016 August 17

Chelsea Isom, General Surgery

Uche Anani, Division of Neonatology

  • "I am currently refining my IRB protocol for my mixed method study on clinical decision-making during the perinatal period. I am using a validated survey for my patient population but need some help to determine how big my sample size needs to be to have a power of 80% or p < 0.05."
  • Prospective; planning to administer a decision-making survey to both patients and clinicians (OB/GYN, genetic counselors, and in some cases, neonatologists) to measure amount of struggle patient is having with making pregnancy-related decision. Main question - how well do clinicians predict how much the patient is struggling. Survey outcome is a continuous score ranging 0-100.
  • For descriptive analyses, not much need for power calculations; get as much data as is feasible, then do (for example) scatterplots and correlation statistics comparing genetic counselors to patients and (separately) OB/GYN to patients.
  • Predictors of a closer relationship are also of interest (education, religiosity, health literacy, years of clinician experience, etc). Could do linear regression for this.
  • Suggest contacting Frank Harrell to determine if there is an available collaboration plan with pediatrics; if not, apply for VICTR assistance. End goal is a manuscript; clinic staff believes this would fit in the <90-hour VICTR category.

2016 August 10

Erin Powell, Pediatric Critical Care

  • Followup visit from 7/13
  • Suggest performing Cronbach's alpha to make sure all questions are informative. Also check to see if original instrument has a validated scoring system. If the alpha works out and there is no other scoring system, suggest creating a single score that is the sum of all 15 questions.
  • Suggested model: post score = pre score + experimental/control (or score = pre/post*experimental/control, using an interaction term to determine whether there is a difference between groups, but first option uses fewer degrees of freedom, and N = 17)
  • Goal is to publish both curriculum and a manuscript about the curriculum's efficacy, probably submitted to an educational journal
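
For reference, Cronbach's alpha for k items is k/(k-1) multiplied by (1 minus the sum of the item variances divided by the variance of the total score). The sketch below computes it with the standard library; the function name and the small response matrix are made up for illustration and are not the study data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(r) != k for r in responses):
        raise ValueError("every respondent needs the same number (>= 2) of items")
    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    total_var = variance([sum(r) for r in responses])
    if total_var == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 5 respondents x 4 items on a 1-5 scale.
example = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
]
alpha = cronbach_alpha(example)
print(f"Cronbach's alpha = {alpha:.2f}")
```

With N = 17 and 15 questions, the estimate will be imprecise, so it is best read alongside the item-level distributions.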

Debra Braun-Courville, Pediatrics

  • Followup visit from 2/25
  • Goals: manuscript and conference poster/presentation in March/April 2017 (deadline in November)
  • 282 adolescents (age 12-23) on progesterone-only implanted birth control known to have side effect of weight gain
  • Ideally, see about pulling data from entire population "eligible" to receive this type of contraception (maybe talk to synthetic derivative folks?); that would allow comparison of weight gain/BMI/etc between similar patients who did and did not receive this type
  • If that isn't possible, some ideas...
    • discriminant analysis/risk factor model for weight gain among patients who did get this type
    • separate subgroup analyses among (eg) 12-17yo and 18+yo, because these populations are so different in terms of the outcome
  • For VICTR purposes, actual plan will depend on what data is available, but likely to need 90-100 hours either way for data management, analysis, manuscript writing and revisions

2016 July 27

Susan Dickey, Pharmacy

  • She has questions related to a logistic regression for a retrospective critical care research project regarding the duration of antimicrobial therapy for intraabdominal infections in critically ill surgical patients.
  • 240 patients total who met inclusion criteria, approximately 70 events (event is a composite outcome defining "treatment failure"); excluded transplant patients and those in SI <24 hours
  • Lots of confounding due to ICU/hospital LOS. Suggestions:
  • Look into doing a Cox model for time to event data instead of logistic regression; patients who did not experience event will be censored at hospital discharge
  • Sensitivity analysis only including patients who stayed >=8 days (long enough to potentially be included in the "long" treatment group); this will reduce sample size, but still leaves enough for a reasonable analysis, and would reduce confounding from patients who only stayed a few days
  • Look into whether patients who were in the long antibiotic group but never got vasopressors were withdrawal of care patients

2016 July 20

Melissa Warren, Critical Care Medicine

  • "We have created a new chest x ray scoring system and are seeing how this score can/may be used in patients with critical illness to assess prognostication and outcomes. We have currently scored all of the chest x rays and are analyzing patients in the FACTT database (a former critical care study looking at conservative vs liberal fluid strategies in critical care). I was hoping I could sit down with a biostatistician to discuss which tests would be best to use/how to perform them in SPSS in order to look at the correlation between score/outcomes."
  • We discussed how to show agreement with continuous measures (Bland-Altman plots) and recommended she create plots for the overall score and by component to see if any portion of the score is driving any disagreement. If needed, we also recommended looking by quadrant because of potential issues in the lower left quadrant.
  • Due to the scope and technical complexity of her questions, we recommended that she check with Tatsuki Koyama to see if this work would be covered under a collaboration plan. Otherwise, we encouraged her to apply for a VICTR voucher.
  • Some of the outcomes of interest are time to successful extubation, ventilator-free days, time to death, LOS, etc. We think the scope of the work should take at least 90 hours.
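
The quantities behind a Bland-Altman plot are simple to compute: for each pair of measurements, plot the difference against the mean, then draw the mean difference (bias) and the 95% limits of agreement (bias plus or minus 1.96 standard deviations of the differences). The sketch below assumes two sets of scores for the same x-rays (for example, two readers, which is an assumption about the study setup); the function name and the values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b):
    """Return (points, bias, lower_limit, upper_limit) for paired measurements.

    points are (pair mean, pair difference) tuples for plotting; the limits
    are the usual 95% limits of agreement, bias +/- 1.96 * SD of differences.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need at least two pairs of equal-length scores")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    points = [((a + b) / 2, a - b) for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return points, bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical overall scores from two readers (illustration only).
reader_1 = [10, 12, 14, 16, 20, 25]
reader_2 = [9, 12, 15, 15, 22, 24]
points, bias, lower, upper = bland_altman(reader_1, reader_2)
print(f"bias = {bias:.2f}, limits of agreement = ({lower:.2f}, {upper:.2f})")
```

Running the same function on each score component (or each quadrant) gives the per-component plots recommended above.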

2016 July 13

Erin Powell, Pediatric Critical Care

  • They would like to discuss the data for their project evaluating the effectiveness of a curriculum to teach communication skills to pediatric critical care fellows.
  • We discussed the different measures they are using to evaluate their chosen outcomes and appropriateness of methods suggested to use for analysis (t-tests).
  • We suggested that factor analysis might be most appropriate to answer their questions of interest and that, if this approach is beyond their abilities, they apply for a VICTR voucher.

M. Frances Wright, Medical student

  • They have questions related to their project on blood product utilization during liver transplant surgery.
  • They had several questions related to outliers and what to do with them in the analysis. We highly encouraged them not to remove them from any analysis unless they can prove that these measurements were made in error. We also tried to help them understand analyses done for them previously. We also encouraged them to consider applying for a VICTR voucher if the scope of the future analyses seems beyond their abilities. They were going to check whether a collaboration existed between our two departments.

2016 June 29

Justin Gregg, Urology

  • He has questions regarding sample size calculations.

Jamie Felton, Pediatric Endocrinology

  • "My questions are regarding the best way to statistically analyze a data set from an ELIspot assay."

2016 June 8

Ravi Bamba, Plastic Surgery

  • "I have a dataset that I finished collecting from a previous project. I needed help running my stats but I do not have funding."
  • Investigating risk factors for recurrence of pressure sores in subjects who had a surgical intervention to fix the problem.
  • He has data from 1997 - 2015. We advised that he restrict follow-up to ensure that all have had equal opportunity to have a recurrence observed.
  • We recommended that he use non-parametric tests (Wilcoxon Rank Sum test, e.g.) for the univariate analyses.
  • His primary question of interest is whether there is a difference in time to recurrence between different pre-specified risk factors. We showed him the UCLA website as a resource as to how to fit Cox models in SPSS. We also advised him to organize the covariates of interest from most to least important based on clinical knowledge and literature.

2016 June 1

Ravi Bamba, Plastic Surgery

  • "I have a dataset that I finished collecting from a previous project. I needed help running my stats but I do not have funding."
  • No show for clinic

2016 May 25

Deborah Jacobson, General Surgery - canceled due to OR schedule

  • "We have data including complications/pt/year for 10 years of data and want to see if there is a significant decline in complication rates over time."

Viraj Mehta, Ophthalmology - moved to Thursday clinic

  • "I'm evaluating eye motility outcomes after surgery for orbital floor fractures in children. I have collected all the data, and needed help figuring out the best way to analyze it."

2016 May 18

Vance Albaugh, Department of Surgery

  • "I have a question about powering a clinical research study, as well as some specifics about the data analysis."
  • Looking at gastric bypass patients' glucose tolerance tests at multiple time points after surgery. Plan: give them a regular glucose tolerance test, measure response (every 15-30 minutes, so we get a curve for each test), then a few days later give the same test supplemented with salt. Hypothesis is that salt will make the glucose response worse (higher) initially, but by a year after surgery, the salt/no salt responses will be roughly equivalent. No-salt response will also change over time as patients become less insulin resistant.
  • Of possible interest: DeLong et al, 1988
  • Jeffrey Blume in biostats might be a good resource for AUC curves - this project is especially complex due to longitudinal measurements + AUC measurements
  • Number of patients could be pretty high - this is a relatively easy study to do compared to other gastric bypass studies
  • Describing and plotting this data will likely be as informative as statistical testing, or even more so. Something like one row per patient, one panel per time point, with salt vs no salt at each time point in each panel. 5-10 patients' worth of pilot data would be highly informative for future sample size/analysis discussions.
  • Maybe a mixed effects model along the lines of: glucose response = salt * time + visit * time + covariates (# covariates restricted by sample size)
  • Jackie is looking into latent growth curves
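
A minimal Python sketch of one way to reduce each glucose tolerance curve to a single AUC value (trapezoidal rule) before any modeling or plotting. The function name and all glucose values are made up for illustration; real pilot data would replace them, and this is not the analysis plan itself.

```python
def trapezoid_auc(times_min, glucose_mg_dl):
    """Area under a glucose curve by the trapezoidal rule (mg/dL * min)."""
    if len(times_min) != len(glucose_mg_dl) or len(times_min) < 2:
        raise ValueError("need at least two matched time/glucose points")
    area = 0.0
    for i in range(1, len(times_min)):
        width = times_min[i] - times_min[i - 1]
        if width <= 0:
            raise ValueError("times must be strictly increasing")
        # Each slice is a trapezoid: width times the mean of its two heights.
        area += width * (glucose_mg_dl[i] + glucose_mg_dl[i - 1]) / 2.0
    return area


# Hypothetical curves for one patient at one visit (invented numbers).
times = [0, 30, 60, 90, 120]
no_salt = [90, 150, 130, 110, 95]
with_salt = [90, 170, 145, 115, 95]

auc_no_salt = trapezoid_auc(times, no_salt)
auc_salt = trapezoid_auc(times, with_salt)
print("AUC no salt:", auc_no_salt)
print("AUC with salt:", auc_salt)
print("Within-patient difference:", auc_salt - auc_no_salt)
```

The within-patient difference at each visit is the kind of quantity that could then be plotted over time or used as a model outcome.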

Maya Yiadom, Emergency Medicine

  • I am submitting a K23 proposal and could use help identifying: 1) whether I’ve selected the right study design for my aims; 2) what the right analysis method should be for Aims 2 and 3; 3) how to get an appropriate ED (Aim 2) and patient (Aim 3) sample size for Aims 2 and 3.
  • Could fit one model using patient-level data to answer both aims 2 & 3, including both ED- and patient-level characteristics. This would allow you to get estimates for, say, academic vs non-academic institutions, or patient age, after adjusting for all other factors.
  • Recommend plotting time to diagnosis & time to treatment for any available pilot data to help inform model choice. If outcome is normally distributed and patient-level data is used, a linear mixed effects model could be good (site is random effect).
  • Time to treatment gets very tricky because some patients get treated via medication and some via procedure - procedure inherently has longer time to treatment. Look at distribution of times separately and together - will likely need two separate models to answer treatment question.

2016 May 11

Jordan Rupp, Emergency Medicine

  • "I have a couple quick statistics questions for a small QA study in which I am participating. We will be assessing the lung ultrasound abilities of the emergency medicine residents at the Nepali hospital after a brief 2 week teaching session given by Bales and I in March. I need some help making sure our sample size calculations, etc. are correct."
  • Studying pneumonia; typical gold standard is CT, but not feasible in Nepal
  • Original plan: do chest x-ray (standard, but can take up to six hours) and ultrasound on all suspected pneumonia patients, compare to discharge diagnosis
  • No data from before the class is available, so can't compare pre- and post-training. If that is the true main question of interest, the study needs to involve equivalent providers or sites who didn't get the training course.
  • Possible main question: does doing the ultrasound at presentation add value to the standard chest x-ray, in terms of accurately predicting whether the patient is diagnosed with pneumonia?
  • Need to refine research question before continuing with sample size calculations.
  • If money is available to do CTs on everyone, sample size will depend on expected sensitivity/spec/PPV/NPV and on how wide a margin of error would be clinically acceptable (eg, if we expect something like a point estimate of 80%, would a 95% CI of [70%, 90%] be acceptable?)
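
A rough Python sketch of the precision-based sample size idea in the last point, using the simple normal-approximation (Wald) interval for a proportion. The function name is hypothetical, and the 80% estimate with a margin of 10 percentage points is just the example from the discussion above, not a recommendation.

```python
import math
from statistics import NormalDist


def n_for_proportion_ci(expected_p, half_width, conf_level=0.95):
    """Smallest n so a Wald CI around expected_p has at most this half-width."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return math.ceil(z ** 2 * expected_p * (1 - expected_p) / half_width ** 2)


# Example from the notes: expect about 80%, want roughly [70%, 90%].
n = n_for_proportion_ci(0.80, 0.10)
print("Patients needed in the relevant denominator:", n)
```

For sensitivity, the denominator is patients who truly have pneumonia, so the total number enrolled would need to be larger, depending on the proportion of suspected cases that are confirmed.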

Amelia Maiga, General surgery resident

  • "I have two specific R coding questions for a survival analysis I'm doing on a multi-institutional retrospective cohort of surgically-resected distal cholangiocarcinomas. I've tried stack overflow and perusing the Hmisc source code without success.
  • 1. I am using aregImpute to impute missing covariates, but run into errors when I attempt to include any factor variables with 6 or fewer observations per factor level. When I attempt to specify group=d$site (where site is a factor covariate with 10 levels, one of which only has 6 observations), I get a different error message about not all values of d$site represented in observations with non-missing values of another covariate.
  • 2. I would like to use fit.mult.impute to fit a Cox proportional hazards model utilizing the data imputed by aregImpute, but despite specifying the xtrans object appropriately, I keep getting an error message "imputed=TRUE was not specified to transcan", suggesting that R thinks I intend to use transcan rather than aregImpute to impute the data to fit the model."
  • Suggest an interaction term between log10(followuptime) * death in the aregImpute() (might need to create log variable beforehand)
  • Suggest pooling sites by region in a new variable to use in imputation, and/or include site in analyses, possibly by stratifying (strat(site) in the cph() model formula), which would allow the baseline hazard to differ between sites (still get one HR for each covariate)

2016 May 4

Drew McKown, Pulmonary/Critical Care

  • Hoping to discuss test selection/power calculations prior to IRB submission
  • "The idea is to perform a physiologic assessment of the patient to determine an ideal ventilator setting and then assess if that setting is different from one prescribed by an algorithm."
  • Basic question: There is an ARDSNet algorithm for setting tidal volume based on PEEP. Want to compare the tidal volumes recommended by the algorithm with tidal volumes determined by stress index (measure of how much stress is on the lungs).
  • Recommendation - calculate power/sample size using a paired t-test and the SD of the within-patient differences between ARDSNet and stress index results. However, for the actual analysis, recommend a nonparametric (Wilcoxon signed rank) test, since assumptions for the t-test are unlikely to be met, especially with a small sample size.
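
A minimal Python sketch of a paired-design sample size calculation, using the normal approximation to the paired t-test. The function name, the 1 mL/kg difference, and the SD of 2 mL/kg are all assumptions for illustration; pilot data would supply the real SD of the within-patient differences.

```python
import math
from statistics import NormalDist


def n_pairs_for_paired_test(mean_diff, sd_diff, alpha=0.05, power=0.80):
    """Approximate number of patients (pairs) for a two-sided paired test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) * sd_diff / mean_diff) ** 2)


# Hypothetical inputs: detect a 1 mL/kg mean difference, SD of differences 2 mL/kg.
n_pairs = n_pairs_for_paired_test(mean_diff=1.0, sd_diff=2.0)
print("Approximate patients needed:", n_pairs)
```

The normal approximation runs slightly low compared with a t-based calculation (as in PS software), so it is best treated as a starting point, with a few patients added if the Wilcoxon test will be used for the analysis.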

2016 April 27

Andy Brooks, Center for Human Genetics Research

  • Wants to discuss power calculations for proposed research project
  • No current power methods exist for these genetics methods - gave some suggestions about simulations, etc. Suggest coming to future Tuesday omics clinics for more long-term discussions (except first Tuesday of the month).

2016 April 20

Vance Albaugh, Surgery

  • "I am planning a clinical study and would like my sample size to be reviewed by a biostatistician before I submit for VICTR funding. The study is a randomized, double-blind experiment in human volunteers examining the effects of a drug commonly given to liver failure patients on oral glucose tolerance."
  • First study: ileostomy patients, want to see if they respond differently to placebo vs. treatment given directly to the small intestine. Patients with and without diabetes will likely respond very differently - need to know how to power this (for VICTR application).
  • Most conservative approach - power separately for diabetic and non-diabetic patients. Complication is that there is no pilot data on diabetics; can guess that variance is twice as much for these patients as non-diabetics. For analysis, suggest doing one model with interaction term to get most efficient/accurate treatment effect (AUC = tx + diabetes + tx*diabetes).
  • Second study: bowel length vs. weight loss (and potentially other outcomes) in gastric bypass patients. Suggest longitudinal approach: model with patient ID as a random effect, with weight at each time point as the outcome and baseline weight and bowel length and time as independent variables. Data will be structured with multiple records per patient. Can show results graphically by showing a line over time for patients at (for example) the 25th, 50th, and 75th percentile of bowel length. Allow at least bowel length to have a nonlinear association with the outcome (restricted cubic splines is a popular approach). Could apply for VICTR voucher if this analysis is too complex to do himself.

2016 April 13

Flavio Silva, Orthopedics

  • Project: Scapular and cervical neuromuscular deficits in musicians with and without playing related musculoskeletal disorders (case-control study)
  • Asked for help with regression and descriptive statistics
  • Case-control study - original plan is to match; we suggest using entire cohort and adjusting for confounders (effective sample size of ~70)
  • Outcome: chronic pain; covariates: three test scores (two are closely related, one is less closely related)
  • One test: six unique values (20, 22, 24, 26, 28, 30; he has dichotomized this based on previous literature); neck flexion: number of seconds (less than a minute; normative means are 24 or 38 depending on gender - might dichotomize this); scapular dyskinesis is dichotomous yes/no
  • Suggest looking at Spearman correlation between two neck flexion tests to see how closely related they are - if very closely, might not make sense to include both (adjusting for one would make the other meaningless)
  • Additional analyses use test scores (above) as dependent variables. For linear regression with test with 20, 22... as outcome, need to carefully look at diagnostics to make sure results are reliable. Some guidance for SPSS might be here: If assumptions aren't met, could consider ordinal logistic regression for this outcome.
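
A small Python sketch of the Spearman correlation suggested above, computed as the Pearson correlation of average ranks so that ties (likely with a score that takes only six values) are handled. Function names and the scores are invented for illustration.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


# Hypothetical scores on the two related tests for eight musicians.
test_a = [20, 22, 22, 24, 26, 28, 30, 30]
test_b = [12, 18, 15, 25, 24, 33, 40, 38]
print("Spearman rho:", round(spearman(test_a, test_b), 3))
```

A rho close to 1 would support keeping only one of the two tests in the model.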

Tony Qiu, Anesthesiology

  • "I'm doing a research with anesthesiology department and currently in process getting IRB approval, I have a few questions regarding data analysis part. My question is what model can I use to assess mortality data across different institutions?"
  • 30-day mortality is often used as a quality marker for non-emergent surgeries. Question is whether institutions are "gaming the system" - keeping patients alive long enough to make that marker, then transferring to palliative care, or just not taking more severe cases due to the risk of not making that marker.
  • For IRB application, could do Kaplan-Meier plot for general time to death across all institutions. This might get IRB approval, but one K-M plot will not fully answer the question of whether different institutions are "gaming the system" (keeping patients alive to the 30-day marker and then being less careful).
  • When it comes time for the analysis, might suggest a Cox proportional hazards model with time to death as the outcome, could adjust for potential relevant confounders (severity of case, etc).

2016 March 30

Flavio Silva, Orthopedics - Canceled

  • Project: Scapular and cervical neuromuscular deficits in musicians with and without playing related musculoskeletal disorders (case-control study)
  • Asked for help with regression and descriptive statistics

2016 March 23

Ravi Bamba, Plastic Surgery

  • Working with burn patients, looking for association between age and a) number of cytokines and b) % change in 2nd vs. 3rd degree burn between initial and final assessments.
  • Initial plan was to collect data only on patients <30 and >65 years old, then to look for a difference in the two groups. We recommended collecting data on a spectrum of patients and then looking for an association between patient age and the two outcomes. This will allow the results to be more generalizable and have more power (less loss of information). Hoping to collect data on ~60 patients if time/logistics allow.
  • Also consider what confounders to collect data on and adjust for: possibly total burn surface area, comorbidities, burn mechanism, other clinical factors.
  • Plan to apply for VICTR voucher for both lab funding and statistical support, with end goals of pilot data for a grant and a manuscript. Suggest around 60-75 hours of statistical support for data management, modeling and diagnostics, manuscript writing/editing and revisions.

2016 March 16

Lyly Nguyen, Critical care

  • Comparing burn ICU outcomes from time period before a specific drug was administered for inhalation injury (2002-2008) and after that drug became part of standard care (2008-2014).
  • For univariate comparisons, recommend describing variables using median and interquartile ranges (rather than or in addition to mean/SD) and using Wilcoxon nonparametric tests rather than t-tests.
  • Outcomes include ICU LOS, probably hospital LOS, vent-free days, and pneumonia (ever/never during ICU stay).
  • For continuous outcomes, all of these are highly skewed, so need to transform them before running a linear regression model (see this link for help in SPSS).
  • Number of variables you can put in the model: For continuous outcomes, it's the number of complete cases (no missing data) divided by 10-20. For pneumonia, it's the size of the smaller outcome group (pneumonia or no pneumonia) divided by 10-20.
  • Calculating vent-free days: Pick a common denominator among all patients (say, 28 days). If a patient dies, they automatically get 0. If they survive, they get (28 - number of days on vent in first 28 days of ICU stay; assume not on vent after ICU discharge).
  • Missing data methods in SPSS may not be robust.
  • Limitations/things to be aware of:
  • - Missingness can strongly bias results and affect number of covariates that can be included in the model.
  • - Mortality rate is about 20% and can also bias results - as one example, this may mean that patients with a shorter ICU LOS are actually doing worse (dying earlier) than patients with a longer ICU stay.
  • - Temporal confounding can limit interpretation - can't say that lower pneumonia rates cause fewer vent days, for example, since we don't have timing of either event; treatment effect is also confounded by time. Also clinical care may have changed fairly drastically over the 12-year study period.
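
The vent-free days rule described above can be written as a short Python function. This is a sketch under the stated convention (28-day window, death scores 0, no ventilation assumed after ICU discharge); the function name and the example patients are hypothetical.

```python
def vent_free_days(died_within_window, vent_days_in_window, window_days=28):
    """Vent-free days: 0 if the patient died, else window minus days on vent."""
    if died_within_window:
        return 0
    # Cap at the window length so long ventilation cannot go negative.
    vent_days = min(max(vent_days_in_window, 0), window_days)
    return window_days - vent_days


# Hypothetical patients: (died in window, days on vent in first 28 ICU days).
patients = [(False, 5), (True, 3), (False, 40), (False, 0)]
for died, days in patients:
    print(died, days, "->", vent_free_days(died, days))
```

Because deaths pile up at 0, the resulting variable is usually far from normal, which is one more reason to describe it with medians and interquartile ranges.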

2016 March 9

Oliver Gunter, General Surgery

  • "I have a question regarding a large database study I’m conducting. This is an IRB approved study that I’m trying to finalize for submission for publication. I have some questions regarding possible propensity score matching to eliminate problems I have with differences in patient characteristics."
  • Given that this is survey weighted data, and there is plenty of sample size (N = 186,000 with event rates of 10-15% for the two outcomes), there doesn't seem to be a need for propensity score adjustment, and it could add complications due to survey weighting. Just adjust for individual covariates in the main model.

2016 March 2

Scott Boyd, Surgery

  • Main research question: whether short (24h) vs. long-term (7-day) antibiotic use is associated with a difference in infection rates after a specific type of oral surgery at two sites (retrospective cohort). Most infections concentrated within 30 days of surgery date; infections observed after this tend to be different and in different types of patients.
  • Major limitation: antibiotic use is constant across each site, so antibiotic duration is completely correlated with study center. Suggestion: describe rates of other types of infection at those sites to (hopefully) show that those are similar, so that any association found in this analysis is more likely to be due to antibiotic duration than just study center effect.
  • Suggested logistic regression (outcome = infection, yes/no) and also Cox proportional hazards model, where outcome = time to infection. No patient has >1 infection. In either of these models, effective sample size is ~53 (number of infections), so could adjust for up to five parameters to account for potential confounding.
  • Make sure to discuss in limitations section the idea that despite doing everything we can, it is not possible to completely tease out the association of antibiotics vs. the association of study center and unmeasured confounders that go along with that.
  • Planning future prospective study which will hopefully better address these issues.
  • For VICTR planning purposes, this should fit in a regular 90-hour VICTR project.

2016 February 24

Debra Braun-Courville, Pediatrics

  • She is working on a clinical research project looking at contraceptive usage among adolescents from chart review data and needs guidance regarding the analysis.
  • Recommend doing KM curve for up to 12 months (or whenever a large proportion of patients have data up to this point - majority of patients had device inserted >x months ago).
  • Could do a Cox proportional hazards model with time to removal as outcome (patients with device still in are censored at 12 months, or whatever time point is used), and baseline variables as covariates: age, previous pregnancies, etc.
  • For variables such as bleeding, weight gain, etc which are collected during followup, recommend doing descriptive statistics for reason the device was removed - analysis with these variables is going to be biased due to lots of missingness in the clinical record.
  • Try looking at UCLA's stats web site for examples in SPSS, or apply for VICTR voucher
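
For orientation, a bare-bones Python sketch of the Kaplan-Meier estimate with administrative censoring at 12 months. Here the "event" is device removal and the curve is the estimated proportion with the device still in place. The function names and the follow-up data are invented; in practice SPSS or a survival package would do this and add confidence bands.

```python
def censor_at_horizon(time, event, horizon=12):
    """Anyone followed past the horizon is censored at the horizon."""
    if time > horizon:
        return horizon, 0
    return time, event


def kaplan_meier(times, events):
    """Return [(time, proportion still event-free)] at each event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group everyone whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1 - n_events / at_risk
            curve.append((t, surv))
        at_risk -= n_leaving
    return curve


# Hypothetical chart-review data: months of follow-up, 1 = device removed.
months = [2, 5, 5, 8, 12, 14, 20, 3]
removed = [1, 1, 0, 1, 0, 1, 0, 0]
censored = [censor_at_horizon(t, e) for t, e in zip(months, removed)]
times = [t for t, _ in censored]
events = [e for _, e in censored]
curve = kaplan_meier(times, events)
for t, s in curve:
    print("month", t, "still in place:", round(s, 3))
```

Note that the patient whose device was removed at 14 months counts as censored at 12 months, matching the approach in the notes.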

Dan Wang, Hematology-Oncology fellow

  • "I’m a first year Hem-Onc fellow and am doing an epidemiology project and had a quick question on how to calculate a p-value for comparing two APC (annual percentage change) using data from a SEER-like database (Texas Cancer Registry)."
  • Because the data is very aggregated (rates per year by demographic), not much we can do statistically. Descriptives and figures are probably the best bet.

2016 February 17

Meredith Stocks, Medical student

  • "I am a medical student assisting Dr Sarah Krantz with a project looking at short interpregnancy interval and counseling at antepartum and postpartum appointments. We already have a population of short IPIs but need help setting up our control group. Dr Krantz has done a bit of work regarding the design of the project and I will email specifics closer to the date of the clinic as we are meeting this week to have everything ready."
  • Have 300+ women with short IPIs within five years; need to know how many controls to pull. Ideally, pull all data within same five-year period, but if this isn't feasible due to logistics, try PS software to see how many patients give adequate power.
  • Perform one logistic regression model: short IPI = attendance at antenatal visit + confounders (demographics, provider type, etc). Interpretation: odds ratio for antenatal visit is the ratio of the odds of short IPI for those who attended antenatal visit to the odds for those who didn't, adjusted for all other confounders.
  • Matching is also a possibility; can be more clinically straightforward, but is more work on the front end.

Clint Leonard, Vanderbilt Burn Center

  • "We are a team from the Burn Center currently working on a manuscript entitled "Assessment of Outreach by a Regional Burn Center: Utilization of resources should be part of education for referring providers." We had some questions about analysis and interpretation of our results that we were hoping to discuss with you this Wednesday. "
  • Essentially, we studied all interfacility transports to the Burn Center from Jun 2012 - Jul 2014. For the 623 patients that met our inclusion criteria, we recorded:
             Method of transport (helicopter, airplane, or ambulance)
             Burn Size
             Outside hospital estimate of burn size
             Actual burn size (as determined by our burn attending)
             Burn Mechanism
             Fluid resuscitation data including fluid type, rate, and bolus administration. 
             Intubation status
             Length of stay (both ICU and total hospital) 
            The difference between our estimate of burn size and outside hospitals' estimates
            Trends in fluid resuscitation rates
            Trends in air versus ground transport
  • As this is a retrospective study we are only identifying trends rather than making sophisticated inferences, so the majority of our findings are simple declarative statements such as "Of 143 patients who arrived by air, 18 (13%) and 49 (31%) were discharged from the hospital within 24 and 48 hours, respectively." However, there are a few areas that I would appreciate your input on:
  • Is there a good way to represent the relationship between overestimation of TBSA and overresuscitation? What is the best way to see which demographic factors (age, TBSA, mechanism) affect the likelihood of air vs. ground transport? Similarly, what is the best way to see which demographic factors (age, TBSA, mechanism) affect the likelihood of intubation? - suggested boxplots of delta(TBSA) for each overresuscitation group
  • Related to the above, what information will we glean from a chi square test that we will not get from a logistic regression, and vice versa? Would it be worth it to perform both? - logistic regression allows adjustment for confounders, and gives direction and magnitude of association. Chi square only gives association and does not account for confounding at all.
  • I want to doublecheck the validity (and utility) of making certain statements without controlling for other variables, e.g. "18% of patients who were burned while smoking on O2 died, while all other mechanisms combined had a mortality rate of 4%" - definitely need to adjust for confounders in this case, since patients with this burn mechanism will have inherent differences from overall population. Use logistic regression if you have certain death data on everyone, or if you only have (for example) in-hospital death, could use Cox proportional hazards regression and censor at hospital discharge.
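
To illustrate the chi-square versus odds ratio distinction discussed above, a small Python sketch on a 2x2 table: the chi-square test only says whether an association exists, while the odds ratio with its confidence interval gives direction and magnitude. Both are unadjusted here; adjusting for confounders needs a regression model as recommended. The counts are invented to give roughly 18% vs 4% mortality, and the function names are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value, 1 df.

    Table layout: exposed died = a, exposed survived = b,
                  unexposed died = c, unexposed survived = d.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square, 1 df
    return stat, p_value


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald 95% CI on the log scale."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: 9 of 50 died (18%) vs 20 of 500 died (4%).
stat, p_value = chi_square_2x2(9, 41, 20, 480)
odds_ratio, lower, upper = odds_ratio_with_ci(9, 41, 20, 480)
print("chi-square:", round(stat, 2), "p:", round(p_value, 5))
print("OR:", round(odds_ratio, 2), "95% CI:", round(lower, 2), round(upper, 2))
```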

2016 February 10

Jin Han, Emergency Medicine

  • "I have a cohort of delirious and non-delirious patients (230 patients). I want to preliminarily develop novel subtypes of delirium based upon clinical and biomarker data."
  • Have data on several delirium characteristics (severity, arousal level, etiology) on 228 patients. Have functional outcomes at 6m on ~160, cognitive outcomes on ~110, and mortality rate of ~30%. Interested in risk prediction score for outcomes using delirium characteristics (whether or not patient meets criteria for full delirium) as well as patient characteristics.
  • Planning to submit R01 to develop the full risk score. Suggested VICTR design studio with clinical + statistical experts to figure out how best to use this pilot data in grant submission.

Justin Godown, Pediatric Cardiology

  • The project is development of risk prediction models for placement of a ventricular assist device vs medical management with outcomes of survival to transplant and 1 year post transplant survival in pediatric patients. Considering using propensity matching due to variability within groups.
  • Main goal would be to develop a risk prediction score for mortality, with VAD vs. medical management as a key component
  • Data comes from two databases with a wide variety of cardiac patients; suggest limiting patients included to those who are sick enough where this decision would have to be made.
  • Could do a Cox regression model with time to death = baseline factors + VAD vs. medical management; not sure about getting a risk probability from this, though.
  • Could also do a logistic regression model with, say, one-year mortality as outcome; more straightforward to get a probability, but lose the time information.
  • Propensity scores could be useful here to either match VAD patients with medically managed patients with similar propensity of VAD, or as a data reduction technique if number of events is low. It's possible that neither of these is necessary.

2016 February 3

Jin Han, Emergency Medicine

  • "I have a cohort of delirious and non-delirious patients (230 patients). I want to preliminarily develop novel subtypes of delirium based upon clinical and biomarker data."

Kiersten Brown Espaillat, Emergency Medicine

  • "I am looking for guidance on how to proceed with performing a validation of the bedside swallow screening used for acute stroke patients in the ED, neuro ICU, and neuro care unit."
  • Wants to assess validity of a VUMC-developed bedside swallow assessment compared to video fluoroscopy among stroke patients who can safely have swallow assessment. Have retrospective data on patients who failed swallow screen (thus required fluoroscopy), but no fluoroscopies currently on patients who passed swallow screen.
  • Sample size needed will be determined by confidence interval width that's clinically meaningful - for example, if point estimate is 90% sensitivity but CI goes down to 75%, is that clinically OK, or is that too low? Kiersten will look up the validation study for the only validated tool and use that as a starting point.
  • Swallow tool failure rate is ~15%.
  • VICTR could be a good resource.
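
One way to explore the confidence interval question above is to tabulate the interval at several candidate sample sizes. The Python sketch below uses the Wilson score interval, which behaves better than the simple Wald interval near 90%. The candidate sizes are arbitrary and the function name is hypothetical; n counts patients with fluoroscopy-confirmed swallowing problems, since that is the denominator for sensitivity.

```python
import math
from statistics import NormalDist


def wilson_interval(successes, n, conf_level=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Assume the screen catches 90% of true cases at each candidate size.
for n in (20, 30, 40, 60, 100):
    detected = (9 * n) // 10
    lower, upper = wilson_interval(detected, n)
    print(n, "cases:", round(lower, 3), "to", round(upper, 3))
```

Reading down the output shows how many confirmed cases are needed before the lower bound clears whatever threshold is judged clinically acceptable.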

2016 January 13

Mike LeCompte, Surgery and Critical Care

  • Mike wants further input on his application to VICTR regarding a surgical resident education project.

2015 December 16

Jason Singer, Medical student

  • "I have a few quick questions for biostats clinic related to how to treat data from Likert scales. For example, if the average choice is somewhere between agree and strongly agree or if the average choice is between usually and always, how do I report a mean and standard deviation?"
  • Due to limitations with REDCap on tablets, their survey had to be restructured from using VAS's to categorized answers. Because of this, the type of analysis required to answer their questions of interest has changed.
  • A chi-square test will determine whether the distribution of responses differs between the inpatient and outpatient groups. If there is a significant difference, the proportions of responses at or above a specific cut-off could then be compared between groups.
  • Another alternative is to fit a proportional odds model with the responses as the outcome and group (inpatient versus outpatient) as the main covariate. The advantage to this approach is that confounders can be included in the model. The proportional odds assumption needs to be tested once the model is fit.

2015 December 9

Mike LeCompte, Surgery and Critical Care

  • "I am trying to design a study on surgical resident education techniques and wanted to get some input on my study design and setting up my statistical methods."
  • We would recommend doing the pre-video at the first junior surgical experience to then compare to the post-video after all surgeries are complete. We would also recommend having the same number of assistant surgeon experiences between the two groups if the pre-video is made at the time we recommend.
  • We recommended that he determine what would be clinically meaningful differences in the different outcome measures to help determine what power he would have given his fixed sample size.
  • We think it will require about 40 hours of statistical support to do the analysis and help in manuscript preparation.

Tommy An, Medical Student

  • Question about Stata output for logistic regression

2015 November 25

Renee Hill, Physical Medicine & Rehab

  • Postdoc in Center for Integrative Medicine; interested in seeing whether an 8-week intervention results in reduced emotional distress, and whether this reduction is different depending on level of self-compassion. No control group; all patients received the intervention, and have pre- and post-intervention distress scores.
  • For main question (reduction in scores), could do a paired t-test or (more likely) a Wilcoxon signed rank test, if data is not normally distributed, comparing pre- and post-intervention scores.
  • To see whether the association of pre- and post- scores differs based on level of self-compassion (continuous value), could do a multivariable regression model: post-intervention score = pre-intervention score + self-compassion score + (pre-score * self-compassion). Can also add potential confounders (age, gender, etc) to this model. Linear regression is most common, but need to check distribution of the outcome first - if the outcome is not normally distributed, this model may not be appropriate or reliable. Also check model assumptions, such as residual vs. fitted plots.
  • Secondary question: do post-intervention scores stay stable in the months after the intervention, or rebound? Data available on cohorts from 2012 through 2015, all with different followup times. Could do a regression model with followup (current) score as the outcome, months since completion of intervention as the main exposure, and adjusting for other confounders (anger score, age, etc).
  • Suggested looking into a VICTR voucher (check Starbrite for more info).
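
A minimal Python sketch of the Wilcoxon signed rank test for the pre/post comparison, using the large-sample normal approximation without tie or continuity corrections. For a small cohort, statistical software with exact p-values would be preferable. The function names and distress scores are invented for illustration.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def signed_rank_test(pre, post):
    """Return (W+, z, two-sided p) for paired pre/post scores."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]  # drop zero changes
    n = len(diffs)
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return w_plus, z, p_value


# Hypothetical distress scores before and after the 8-week intervention.
pre = [30, 28, 35, 40, 22, 31, 27, 33]
post = [25, 27, 30, 32, 24, 26, 20, 29]
w_plus, z, p_value = signed_rank_test(pre, post)
print("W+:", w_plus, "z:", round(z, 2), "p:", round(p_value, 3))
```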

2015 November 18

Michael Ghiam, medical student - canceled

  • "Hello, I am a third year medical student working on a epidemiology study using Stata to analyze my data. I am currently working on writing code for my study and I am running into some roadblocks. I was wondering if I could come in on Wednesday or Thursday and get some pointers about which codes would be best to use and if I’m analyzing my data the right way. Please let me know if this is at all possible and what I should provide you with in advance."

Tommy An, medical student

  • "I have attempted a logistic regression in STATA to predict whether a patient has MRSA or MSSA musculoskeletal infection based on presentation data from the emergency department. I need some advice to see if I'm on the right track with my statistical analysis."
  • Logistic regression seems fine. Make sure to note that results will only be generalizable to patients who actually did develop MRSA or MSSA, not patients who developed neither.

Jamie Robinson, Surgery

  • " I am trying to do a Kmeans analysis and having some difficulty with figuring out if what I am getting is what it should look like. "
  • Suggest looking at varclus() function in Hmisc package to cluster variables instead of patients
  • Also could create principal components (using princomp()) before clustering to reduce data to two dimensions before using kmeans

2015 November 11

Nick Kramer

  • " is our project and question: literature review of weight bearing after posterior acetabular fracture. The data available is relatively limited looking at the question we are asking so we are attempting to combine several studies to see if there is a trend for benefits of early vs late weight bearing. We have several questions regarding the best way to do this, if it is possible at all."
  • Clinic statisticians will email the department to ask if anyone has further expertise in this area. This study presents a challenge because there is no common exposure in each study included in the review; rather, each study is a cohort of either early or late walking patients. Raw data/SDs are also available on very few of the studies.

Michael Benvenuti, Orthopedic surgery

  • "I am working on a pilot study to determine the effect of antibiotics on length of stay and culture sensitivity in pediatric musculoskeletal infection. i have done some preliminary analysis using Stata and have a few questions about how to continue."
  • We discussed logistic regression for the main outcome: positive culture = covariates. The number of covariates that can be reliably put in the model is roughly equal to [minimum of positive/negative cultures] / 10-15. So, if the split is 50/50 and there are 120 patients, 60 "events" / 10 = 6 possible covariates in the model.
  • For length of stay, we suggested using a Cox model looking at time to hospital discharge, rather than using LOS as a continuous outcome. (LOS generally has a distribution that is difficult to model.) All covariates for this model are measured at ED presentation.
  • About 30 patients are included in the cohort but have no culture measurement. It is important to look at whether these 30 patients are different from the other 120 - was no culture done because they were sicker/less sick, younger, etc.
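
The rule of thumb for the number of covariates can be written out as a short Python sketch. The function name `max_covariates` and the rounding down are illustrative assumptions; the limit is a rough guide, not a hard cutoff.

```python
def max_covariates(n_positive, n_negative, per_covariate=10):
    """Rough upper bound on covariates for a logistic regression model.

    The limiting sample size is the smaller of the two outcome groups,
    divided by 10 to 15 observations per covariate.
    """
    limiting = min(n_positive, n_negative)
    return limiting // per_covariate

# 120 patients with a 50/50 split of positive and negative cultures
print(max_covariates(60, 60, per_covariate=10))  # 6
print(max_covariates(60, 60, per_covariate=15))  # 4
```

With a less even split the allowance shrinks quickly, because only the smaller group counts.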

2015 November 4

Courtney Baker

  • "In terms of the project, it is a collection of data on intra-operative coagulation factors and transfusion data for pediatric scoliosis patients over the last 3 years. I have asked a number of questions around "what determines/predicts intra-operative blood loss in these cases?" I have done rudimentary (and most likely not entirely correct) statistical analysis between the coagulation factors and the transfusion results. What is needed is a rigorous discussion about employing multivariant statistical analysis on the associations I see. The goal of this data set is to publish some new observations and associations in order to develop a quality improvement protocol AND a more rigorous research project into one or more of these specific associations."
  • Recommend doing multivariable regression models instead of univariate analyses: eg, "end of surgery platelets = baseline platelets + other baseline variables". Specific regression type depends on distribution of outcome; linear regression is appropriate for truly continuous outcomes with wide enough range (eg, 0-100) as long as assumptions are met (eg, residuals are normally distributed). Logistic regression is appropriate for dichotomous outcomes, and something like proportional odds logistic regression would be appropriate for integer values with small ranges (eg, 0-3).
  • Suggested spaghetti plots to describe fibrinogen loss during surgery, one line per patient, with reference line at time of fibrinogen intervention.
  • Models could be fit using restricted cubic splines for continuous covariates, such as baseline fibrinogen.
  • Discussed multiple comparison corrections, which GraphPad applies by default.
  • Use nonparametric tests of association (such as Wilcoxon rank sum test/Mann-Whitney) unless it is known that variable is normally distributed (and even then, nonparametric is a safe choice).
  • Possibly talk with Shirley Liu about doing regression analyses due to ortho collaboration; otherwise apply for 90-hour VICTR voucher.

2015 October 28

Emily Buttigieg, Medical student

  • "I am a medical student (VMS III) working on a research project in the Pediatric Surgery department. My project involves measuring body composition using tissue resistance and reactance measurements and comparing it to standard measurements, BMI, weight and height. I have collected my data and am inquiring as to the best software and approach to analyzing my data. I was hoping to attend next Wednesday’s clinic. Thanks in advance for your help. "
  • About 30 patients total with repeated measurements. Suggested using Spearman correlations on Z-scores from device and BMI calculations, using a) only one measurement per patient and b) all measurements per patient. Bland-Altman plots might also be helpful, and scatterplots of raw data will be very useful.
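
As an illustration of what a Bland-Altman summary involves, here is a minimal Python sketch that computes the mean difference (bias) and approximate 95% limits of agreement. The function name and the example values are hypothetical, and the sketch assumes one paired measurement per patient (option a above), since repeated measurements within a patient are not independent.

```python
from statistics import mean, stdev

def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical Z-scores from the device and from BMI for five patients
device = [0.4, -0.2, 1.1, 0.7, -0.5]
bmi = [0.1, -0.3, 1.4, 0.5, -0.9]
bias, lower, upper = bland_altman(device, bmi)
print(round(bias, 3), round(lower, 3), round(upper, 3))
```

The plot itself puts the average of the two methods on the x-axis and their difference on the y-axis, with horizontal lines at the three values returned here.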

2015 October 21

Mary Bayham, Global Health

  • We seek to describe the burden of fever, diarrhea, and respiratory illness among children aged 6-59 months in Zambézia Province, Mozambique as well as predictors (individual and system level) of health care utilization for these children. The goal of this thesis is to identify significant predictors (individual and system level) of healthcare utilization for children under five with fever, diarrhea, and cough. These findings could inform future planning, policy and interventions in reducing under five morbidity and mortality in Mozambique.

  • Dataset includes 3,800 families; 2,700 children < 5-years-old; and 14 districts (3 of which were oversampled).
  • Suggest reporting descriptive statistics for 3 individual districts and comparing to all other districts combined.
  • Suggest utilizing a multivariable logistic regression model for healthcare utilization - initially without any weighting and once more weighting for district - and comparing results.

Erin Hamilton, Global Health

  • Goal to assess impact of a nutrition education intervention in children.
  • Outcome is Z-score BMI adjusted for age measured at 4 time points. Dataset includes 151 children with Z-score BMI measured at least twice.
  • Planning to use multi-level mixed effects linear regression model adjusted for gender.
  • Suggest utilizing paired Wilcoxon test for change in Z-score BMI between time points (pre vs. post) and box-and-whisker plot of median and IQR at each time point.

2015 October 14

Rachel Hayes, Surgical Sciences

  • "I am struggling to interpret an interrupted time series using logistic regression and proc glm in R. I’ve attached a de-identified data set (date is shifted) and some R code."
  • Difference in interpretation between lrm() and glm() models is due to differences in default anova() tests - anova.rms() uses added last tests, while anova.default() uses sequential tests (eg, time after adjusting for only variables ahead of it in model formula).

2015 October 7

Thomas An, Medical student

  • "I am in Dr. Schoenecker’s lab studying musculoskeletal infection. I have outcomes data and numerous variables for patients with musculoskeletal infection and am hoping to set up a multivariable analysis to predict which variables are most predictive of outcomes."
  • We discussed linear regression and the diagnostics used to evaluate whether its assumptions have been met (residual/predicted plots). We also discussed potential alternatives if the assumptions for linear regression are not met, including transformations, ordinal regression, or negative binomial regression.

Teerayut Tangpaitoon, Department of Urology

  • "Our study is about evaluate outcome of Holmium laser enucleation prostate surgery (HoLEP) compare to HoLEP with concurrent Cystolitholapaxy in same setting(retrospective)."
  • The two groups are completely confounded by presence of bladder stones. Those who received the additional therapy were patients with bladder stones.
  • We discussed t-tests versus Wilcoxon Rank Sum tests. We also discussed using linear regression to adjust for potential confounders.
  • We recommended the UCLA website for help in how to perform the analyses.

2015 September 30

Kelly Maguigan, Critical Care Pharmacy

  • Project involving enteral intolerance in spinal injury patients who are on concurrent vasopressors.
  • Planning to request VICTR voucher for analysis.
  • Retrospective chart review of 80-100 patients from TRACS registry who were on pressors + enteral feeding for at least one hour.
  • Primary analysis: risk factors for either the count of enteral intolerance episodes during the hospital stay (Poisson or negative binomial model?) and/or a daily yes vs. no enteral intolerance outcome, using lagged covariates.
  • Strongly suggest using REDCap for data collection - easier to build databases, especially with assistance from REDCap clinics, and built to be easy for statisticians to export data on the back end.
  • Secondary outcomes (ICU/hospital LOS, etc) are mostly descriptive.
  • For VICTR voucher, we believe this project will fit within the voucher time frame (90-100 hours).

Mike Benvenuti, medical student

  • I have been working with some retrospective data and am not sure how best to represent the effects of antibiotics on culture yield (odds ratio, negative/positive predictive value…) and would like some quick input. I also have another data set and I would like to show that patients have a d-dimer above the clinical threshold following joint replacement and am again not sure how to best show that.
  • ~200 patients. Primary relationship of interest: time of first antibiotic use vs. hospital LOS and/or possible secondary outcome: extension of antibiotic prescription post-hospital stay
  • Issues: immortal time bias - patients who get antibiotics later are, by necessity, in the hospital longer. Could use time-varying Cox model to address this? Looking at secondary, post-hospital outcome would help with this issue but could be a more blunt measure, and may not be that helpful depending on how many patients had antibiotic prescriptions extended.
  • Analysis will be complicated either way - suggest talking to Mario Davidson and/or research immersion course instructors about how to get stats support.
  • Definitely need to adjust for confounders (severity of infection, etc).

2015 September 23

Don Arnold, Pediatric Emergency Medicine

  • Dr. Arnold is applying for VICTR biostatistics support for writing the analysis plan and sample size justification for his grant proposal. I estimate that this will take approximately 20 hours of support.
  • This is a cluster-randomized trial and will have correlation due to providers seeing multiple patients, so we recommend an analysis that will account for that, such as GEE or mixed effects models.

  • "We propose a 2-arm cluster (by clinician) RCT of the Asthma Prediction Rule (APR) in 3 children's hospitals to determine if implementation of the APR electronically will result in a decrease of the "unnecessary hospitalization rate" for children with acute asthma exacerbations from 23% to about 19%. My specific question is around the power calculations I'm doing. For an effect size this small I'm calculating that I need a couple thousand clinicians in each arm, whereas we will likely have at most 60 in each arm. I'm looking to alternative designs or methods to analyze the data."
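
To show how clustering by clinician inflates the required sample size, here is a minimal Python sketch. It uses the standard normal-approximation formula for comparing two proportions and the usual design effect 1 + (m - 1) x ICC. The patients per clinician (30) and the ICC (0.05) are hypothetical placeholders, not values from the proposal, and the sketch is not meant to reproduce Dr. Arnold's own calculation.

```python
import math
from statistics import NormalDist

def n_per_arm(p1, p2, alpha=0.05, power=0.80):
    """Patients per arm for comparing two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

def design_effect(patients_per_cluster, icc):
    """Variance inflation from randomizing clusters of equal size."""
    return 1 + (patients_per_cluster - 1) * icc

# 23% vs. 19% unnecessary hospitalization, ignoring clustering
n_individual = n_per_arm(0.23, 0.19)
print(n_individual)  # 1624

# Hypothetical cluster size and ICC (assumptions, not from the proposal)
patients_per_clinician = 30
icc = 0.05
n_clustered = math.ceil(n_individual * design_effect(patients_per_clinician, icc))
print(n_clustered, math.ceil(n_clustered / patients_per_clinician))
```

The second line of output gives the inflated number of patients per arm and the implied number of clinicians per arm under these assumed values. Both are very sensitive to the ICC, so a defensible estimate of it matters more than the formula.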

Richard Lesperance, Surgery

  • Want to use an ICC to measure interobserver reliability. We recommend giving a confidence interval along with the point estimate.
  • A useful measure for one of the main analyses could be the mean difference (with confidence interval) in length between the anterior and lateral chest wall.
  • "1) For measurements of chest wall thickness among trauma patients, should the mean measurements have standard deviation or 95% CI calculated for them? 2) We have 4 observers performing measurements on a total of 450 CT scans. How do we calculate / report inter-observer reliability? (Cohen’s kappa?) Should all 4 observers rate the same 5 or 10 scans to obtain the measure?"
  • "Detailed background: We are reviewing about 450 CT scans of patients who had a pre-hospital intervention: needle decompression of a tension pneumothorax. This is a traumatic condition where pressure builds up inside the chest (but outside of the lungs), and can kill a patient unless the pressure is drained. Commonly, paramedics are taught to insert a needle through the chest wall to drain the pressure – but there are concerns that the needles may not be long enough. Previously published literature suggests that this procedure may be ineffective in many people due to chest wall thickness – there are several published studies using cadavers, and CT scans of healthy volunteers, that show quite often the chest wall is thicker than the needle size commonly used – if the needle can’t reach the pleural space, the excess pressure cannot be drained.

We are looking at those 450 CT scans and measuring chest wall thickness in 2 locations – the front (where the procedure is traditionally taught, just below the clavicle) and also the side (which is an alternate location that some EMS providers are taught). We are measuring both sides (4 measurements total per CT scan).

Additionally, we are trying to measure the distance from the chest wall to the nearest cardiac structure (if this is too close, maybe the needle can injure something if we just use a longer needle!)

Besides the measurements (which other people have done, just not on actual trauma patients), we also theorize that many of these procedures are being performed unnecessarily. This condition is difficult to study because it is difficult to diagnose pre-hospital, and can be lethal if not effectively treated. The only marker that a patient both needed the needle decompression, and that it was successful, is the subjective EMS report that there was a “whoosh” of air when inserting the needle and the patient “got better”.

Further, we have noticed many patients who received a needle pre-hospital, but when they get to the CT scanner, they have no pneumothorax (air in the pleural space at all) – which means by definition, not only did they not need the procedure at all, but the procedure was completely ineffective. So we will be left with a % of patients who we can prove never had a pneumothorax and even though they had needle decompression attempted, it definitely failed.

We realize that this number is the minimum; in effect these are the patients we can PROVE had both inappropriate and ineffective treatment. The true number is much higher but unknowable.

Statistical questions: 1) Some previous literature reports the mean thickness and standard deviation – others report the mean and a 95% Confidence Interval. Is there a benefit to one way or another? 2) We have 4 people reviewing CT scans. What test should be used to measure inter-observer reliability? Should all 4 observers rate the same scans to obtain the measure? How many samples – 5, 10, more?"
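
On the standard deviation versus confidence interval question, the two answer different questions: the SD describes how much chest wall thickness varies between patients, while the 95% CI describes how precisely the mean is estimated. A minimal Python sketch, using made-up thickness values and a normal approximation that is reasonable for a sample as large as 450 scans (a t quantile would be used for small samples):

```python
import math
from statistics import NormalDist, mean, stdev

def summarize(values, level=0.95):
    """Return the mean, the SD, and a normal-approximation CI for the mean."""
    n = len(values)
    m = mean(values)
    sd = stdev(values)
    se = sd / math.sqrt(n)  # standard error shrinks as n grows; the SD does not
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return {"mean": m, "sd": sd, "ci": (m - z * se, m + z * se)}

# Hypothetical anterior chest wall thickness measurements in mm
thickness_mm = [38, 42, 45, 47, 51, 53, 56, 60]
print(summarize(thickness_mm))
```

Reporting both is often the most useful choice: the SD for comparison with prior literature and the CI for inference about the mean.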

2015 September 16

Jamie Robinson, Biomedical Informatics NLM Fellow

  • "My topic is Surgical Resection for CPAMs. In particular, I believe I need help with regression analysis."
  • We discussed specifics of R code, how to fit splines to a continuous variable and how to report the resulting estimates.

Thomas An, Medical student

  • "I am trying to compare hospital outcomes for MRSA vs. MSSA pediatric infection. I have done some analysis in GraphPad that I am not sure is correct."
  • We discussed non-parametric tests to use (Wilcoxon Rank Sum test, Kruskal-Wallis) as opposed to parametric tests. We also discussed whether a regression that adjusts for potential confounders is appropriate.

2015 September 9

Isa Wismann-Horther and Katie Ryan, OB/GYN

  • The project is a retrospective case control study looking at healthcare system based factors (postpartum counseling on birth control etc.) and their effect on short interval pregnancies. We wanted to meet with you all before we finished our IRB to make sure our methods were using the best possible statistical model of analysis. We were thinking about creating a scale to make a composite score which would put a numerical value to the descriptive data we're looking at (if birth control counseling was documented in the chart before discharge, did they pick a preferred method etc.). We were curious how many charts we would need to look through to power the study.
  • We recommended that the group write out their specific aims in order to keep them as focused as possible.
  • We also recommended that they check out whether a VICTR design studio would be appropriate for more intense statistical support. We also recommended that they check if there is a collaboration between our department and OB-GYN.
  • We discussed what would be an appropriate outcome. We advised against using a scoring system. Rather, we suggested they consider the elements of counseling and treat them either as an ordinal outcome (number of elements covered in counseling) or as a binary outcome (any counseling versus none).

2015 August 26

Carissa Cascio, Psychiatry

  • "If possible, we'd like to review a recent VICTR submission and the pre-review feedback we received regarding our power analyses, assessment of normality, and imputation."
  • Addressed three critiques from VICTR application:
  • Power calculations: initially used difference seen in pilot data; suggest adding power analyses using the smallest effect size that would be clinically meaningful.
  • Assessments of normality (for t-tests): Suggest using nonparametric versions (Wilcoxon) instead of assessing normality. Loss of power is minimal if assumptions are met, and benefit is large if assumptions are not met.
  • Multiple imputation: Attrition is expected to be a major issue, but not sure if data will be missing at random (an assumption of multiple imputation). Suggested doing complete case analyses as primary, and doing multiply imputed analyses as a secondary analysis. Specify beforehand what variables will be used for imputation, and describe the differences between patients with and without missing data.
  • Possibly helpful paper on types of missing data for multiple imputation:

2015 August 19

Maya Yiadom, Emergency Medicine

  • Looking at differences in screening for EKG vs. STEMI incidence for multiple emergency departments for pilot data for a grant submission. Each center has its own screening criteria, and rates of STEMIs vary across centers.
  • Suggested simple descriptive analysis: table with one row per center describing incidence, sensitivity, specificity, NPV and PPV with confidence intervals. Think about describing demographics as well.
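
The per-center quantities in the suggested table all come from one 2x2 table of screening result against STEMI status. A minimal Python sketch, with hypothetical counts and an illustrative function name (confidence intervals for each proportion would be added separately):

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Summary measures from a 2x2 table of screen result vs. true status.

    tp: screened positive, has STEMI    fp: screened positive, no STEMI
    fn: screened negative, has STEMI    tn: screened negative, no STEMI
    """
    total = tp + fp + fn + tn
    return {
        "incidence": (tp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical counts for a single center
for name, value in diagnostic_metrics(tp=8, fp=92, fn=2, tn=898).items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on incidence, they are not directly comparable across centers with different STEMI rates, whereas sensitivity and specificity are.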

Jamie Robinson, General Surgery

  • Wants to look at the difference in costs using PHIS on performing appendectomy + 30 days post op before and after the intervention of a clinical practice guideline.
  • Didn't come to clinic.

Brian Long, Surgery

  • Looking at risk of cancer recurrence in pediatric neuroblastoma patients and whether surgery is helpful in high-risk patients. At the very beginning of the process (database design, etc for a retrospective multicenter study), but looking ahead to requesting a VICTR voucher for statistical analysis. Analysis will likely involve multiple survival models and competing risks, so we recommend the 90-hour VICTR voucher level to accommodate all analyses and data management. Also suggested attending the REDCap clinic for help with database design, and gave some data collection suggestions.

2015 August 12

Brian Long, Surgery

  • Planning a pilot retrospective chart review
  • Brought a proposed variable list for data collection and proposed outcome measures for a retrospective study looking at outcomes after an operation.
  • Research questions of interest are: Which patients with high-risk disease will benefit from surgical resection of the primary tumor? Does complete resection of the primary tumor benefit patients with high-risk disease? Do patients in whom a complete resection is unlikely to be accomplished benefit from partial resection of their tumor?
  • Discussed data needed for survival analysis: cause of death, last date of follow up, etc.
  • Discussed types of survival events
  • We discussed applying for VICTR biostatistics support, but we did not estimate the time required yet.

2015 August 5

Nick Carter, Surgery resident

  • The sample size estimation was completed using the kappa statistic. A total of 48 measurements provides at least 80% power to detect a kappa of 0.8 with a two-sided type I error of 5%.

2015 July 29

Nick Carter, Surgery resident

  • "I am working on a study trying to show non-inferiority of postoperative care provided by community health workers compared to standard postop care with the operating surgeon. We are just getting started in planning the study, and I hoped to discuss study design and power with a statistician."
  • Study will take place in Haiti. Prevalence of wound infection is very low (~5%), so getting enough patients will be difficult. Discussed a validity study, showing sensitivity/specificity/PPV/NPV with confidence intervals, using surgeon in-person assessment as gold standard and both a) surgeon via cell phone picture and b) community health worker assessment as new diagnostic tools. Due to low prevalence, will need a lot of patients to get a "reasonable" (according to clinical judgment) confidence interval for each quantity.
  • Discussed adding "dirty" infection sites as well to raise prevalence to an estimated 10-20% - something to think about.
  • Link to PS sample size software (Windows)
  • Link to VICTR studio page
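
To see why low prevalence drives the sample size, here is a minimal Python sketch of the Wilson score interval applied to sensitivity. Only infected patients contribute to the sensitivity estimate, so at 5% prevalence a study of 200 patients has about 10 of them. The assumed sensitivity of 0.8 and the study sizes are hypothetical, and the Wilson interval is one reasonable choice of interval rather than the method settled on in clinic.

```python
import math
from statistics import NormalDist

def wilson_ci(successes, n, level=0.95):
    """Wilson score confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half

prevalence = 0.05            # approximate wound infection prevalence
assumed_sensitivity = 0.8    # hypothetical value for illustration
for total_patients in (200, 1000, 2000):
    infected = round(total_patients * prevalence)
    detected = round(infected * assumed_sensitivity)
    low, high = wilson_ci(detected, infected)
    print(total_patients, infected, f"{low:.2f} to {high:.2f}")
```

Raising prevalence by including "dirty" sites increases the number of infected patients for the same total enrollment, which narrows the interval for sensitivity.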

2015 July 22

Tracy Marien, Endourology and laparoscopic surgery

  • "I am from the Urologic Surgery department. Basically, my primary question is as follows: I was performing a multi-regression analysis in Stata to assess which factors are associated with passage of ureteral stones. However, it appears that while two factors are significant they also cancel each other out because they are so strongly associated. Is there a way to control for this?"
  • She has ~ 100 patients with kidney stones, some of whom pass the stone without intervention and some that need surgical intervention. Standard of care is to measure the size of the stone in the axial direction with CT scan. Coronal measurements are also made but are not referenced typically. Her hypothesis was to investigate whether both measurements would help predict who requires surgical intervention.
  • We discussed looking at the scatter plot of the axial and coronal measurements and found there to be a strong linear relationship. We advised against fitting both variables in the model of interest. We also advised to pre-specify the model based on literature and clinical knowledge rather than univariate tests. We also discussed the 10:1 or 20:1 ratio for determining the complexity of the model based on the minimum of events/non-events.
  • Because of the strong correlation observed in the scatter plot, we discussed including an aspect ratio variable in the model with either axial or coronal measurement to assess whether this would be a question of interest.

2015 July 15

Jonathan Siktberg and Mayur Patel, TBI

  • Project on diffuse axonal injury; sent PPT slides
  • Team has a good list of aims, models and covariates: POLR model for GOSE score, linear regression for quality of life score - these seem appropriate provided assumptions are met with data. Possibly a Cox proportional hazards model for mortality, if this outcome is of clinical interest.
  • Discussed multiple imputation, since many patients could not be reached for followup (approximately 35%). Recommended doing both multiply imputed and complete case analyses to compare results; multiply imputed analyses may be less biased. Clinical data other than model covariates can be incorporated into imputation.
  • Currently patients are classified as DAI negative or positive, with positive having three possible grades. We recommended doing models with a four-level variable for DAI (0/1/2/3); to address the more typical clinical question of any vs. no shear (DAI positive vs. negative), could redo the model dichotomizing into positive vs. negative.
  • The group plans to apply for a VICTR voucher. Given the possible complexities of multiple models and multiple imputation, we estimate 90 hours.

Jamie Kuck, Division of Allergy, Pulmonary, and Critical Care Medicine

  • "Sepsis patients have high levels of cell-free hemoglobin in their plasma, and these levels are associated with increased risk of mortality. While exploring the possible mechanism of cell-free hemoglobin, I measured levels of oxidized LDL in sepsis patient plasma and found that those patients with high cell-free hemoglobin have low levels of oxLDL, which was a surprise. An endocrinologist suggested that we then look at LDL levels since sepsis patients usually have low amounts, which could explain the low amounts of oxLDL."
  • We suggested a linear regression model like this: oxidized LDL = hemoglobin + total LDL. Prism doesn't seem to be capable of this, so get SPSS and look at examples on UCLA stats web site for instructions.

2015 July 8

Nick Kramer, M3 Meharry Medical College

  • He would like continued input on his project.
  • We discussed how clinical judgment should guide the selection of manuscripts to include in their systematic review. We also discussed how to organize their analysis from the big picture of outcomes by type of fracture (simple versus compound) or type of repair (screws versus plates).

Christopher Brown, Dept Internal Medicine

  • The project is fairly simple, we measure labs once a day or twice a day to follow potassium. I would like the primary end point to be regarding this lab, meaning --if you measure the potassium twice a day, does the potassium stay in the normal range for more time during the patients hospitalization than if you measure it only once a day--. However because I am measuring the value more often in one group than the other I am wondering what the method for accounting for this statistically would be, as there appears to be a sampling bias between the groups. It was suggested to me that this could be accounted for with a "generalized least squares approach" however I do not completely understand how that regression adjustment would help me. In any case, can you tell me if it is possible to compare the time (persondays or personhours) a variable spends between two values (IE potassium between 3.5 and 5.0) when the two groups involved are sampling the variable with different frequencies? (IE I can detect twice as much low or high values in theory twice a day than once a day, so how do I compare these groups)
  • We discussed how to approach analysis of the prospective data: create an indicator for whether the subsequent day's lab was within normal range, treating 1x/day vs. 2x/day as a treatment group variable (standard of care versus intervention). This analysis would require a repeated measures method such as GEE. As secondary descriptives, we discussed calculating the proportion of measurements outside of normal range in each of the groups.
  • We estimate that this analysis will require 40 hours of a VICTR statistician's time.
  • We also discussed the retrospective data abstracted from the medical record and methods of analyzing it. This also will require some type of repeated measures analysis as well as decisions on how to handle repeat hospitalizations per subject. We estimate that the retrospective analysis would require 60 hours of a VICTR statistician's time.

2015 July 1

Kendra Parekh, Assistant Professor Department of Emergency Medicine

  • "I would like to attend a biostat clinic on July 1 to discuss data analysis for a survey that evaluated emergency medical technicians’, nurses’, and physicians’ attitudes toward a new Emergency Medical Services system in Georgetown, Guyana."
  • She has responses from 17 EMTs for one survey and about half MDs and half RNs who filled out the second survey. There were about 10 questions between the surveys that were the same. We recommended using the chi-square test with continuity correction to assess whether there was an association between provider type and responses to the question.
  • We recommended she use graphs to display the data as well as tables with proportions rather than simply relying on p-values from tests of association.
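
For reference, the continuity-corrected chi-square test for a single 2x2 table can be sketched in Python as below. The counts are hypothetical, and the sketch assumes the question is collapsed to two provider groups and a yes/no response, with no empty row or column; larger tables do not use this correction.

```python
import math

def yates_chi2_2x2(a, b, c, d):
    """Chi-square test with Yates continuity correction for the table [[a, b], [c, d]].

    Returns (statistic, p_value). Assumes every row and column total is nonzero.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # The correction subtracts n/2 from |ad - bc|, floored at zero
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail probability of a chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical responses: rows are provider groups, columns are agree / disagree
chi2, p = yates_chi2_2x2(12, 5, 6, 11)
print(round(chi2, 3), round(p, 3))
```

The proportions in each row (here 12/17 and 6/17) are what the recommended tables and graphs would display alongside the test result.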

Alexander Gelbard, Otolaryngology

  • I am looking at disease of unexplained scarring in the trachea. We looked at the biology of the fibrotic tissue response, and then investigated the association with defined respiratory bacteria. Finally we investigated activation of immunologic pathways in the tracheal tissue samples.

    Expt 1. qPCR results (10 experimental vs. 3 controls) - *the controls are 23 pooled donors performed in triplicate.

    Expt 2. PCR results (binary yes/no presence of detectable bacteria) in 10 experimental vs 10 controls.

    Expt 3. In situ hybridization (binary assessments of staining positive/negative) with 10 experimental vs 10 controls, and 10 normals.

    Expt 4. Comparison of % positive cells in transmission electron microscopy immunogold staining.

    Expt 5. Elispot comparison. IFNgamma release in response to antigen specific stimulation. 10 experimental, 10 controls.

    Expt 6. Comparison of immunohistochemistry quantification. 10 experimental vs 10 controls and 10 normals.

    Expt 7. qPCR results (10 experimental vs. 3 controls) - *the controls are 23 pooled donors performed in triplicate.

  • We recommend that he contact the Friday clinic for a question involving the replication of pooled data for the healthy normal group. Otherwise, we recommended non-parametric tests and chi-square with continuity correction, as appropriate.

Daniel Heath Hagaman, Anesthesiology

  • We advised that he get a copy of SPSS rather than using Excel.
  • We suggest chi-square for difference in proportions of forms filled out in VPEC before and after intervention. Include a few months of wash out period after to get a more stable estimate. To look at factors to predict forms being filled out among non-VPEC providers, use logistic regression and include covariates based on clinical knowledge. Ideally, a random effect for provider should be included.

2015 June 24

Laura Wilson, Hearing & Speech Pathology

  • "I am seeking expertise regarding the data analysis plan for a study on school outcomes after sports-related concussion."
  • Survey study of ~120 patients (age 13-17) who were seen for concussions during the last school year, following up on academic performance, special accommodations (extra test time, sitting out gym), and satisfaction with return to school. Hypothesis is that school absences and accommodations are a) correlated and b) both affect academic performance and satisfaction, which are measured on both parents and children.
  • For future studies, mediation or path analyses might be appropriate, but these models will need to be simple due to data and sample sizes. Suggest proportional odds or logistic regression depending on distribution of final outcomes (have multiple levels, but if some levels are not well represented, may make sense to combine). Base effective degrees of freedom on the Ns in the final outcome levels - for logistic, the smaller of the two outcome categories. Do not use univariate analyses to prioritize covariates; rather, develop a priority ranking based on clinical knowledge and importance in hypotheses.

2015 June 17

Trisha Pasricha, 4th year medical student

  • "I am a 4th year medical student who has completed a study at the Vanderbilt Center for Surgical Weight Loss that looks at the correlation between depressive symptoms and BMI/medical co-morbidities in patients who have sought surgical and medical weight loss at the center. I would like some help determining if we have used the correct analysis of our data."
  • Info about the study: Complete data on 38 patients, but only 8 medically treated patients - lots of loss to followup in this group. Therefore, any inferences about treatment will need to be approached from the standpoint of very preliminary research. Due to limited sample size and complex research questions, we can't really put all exposures of interest in the same model.
  • Possible models:
  • post-treatment BDI = (% change BMI) * comorbidities + pre-treatment BDI; this answers the question of whether, among patients equally depressed at baseline, there is any association between % change in BMI and/or comorbidities and post-treatment depression, and whether the association for comorbidities changes based on % change BMI, and/or vice versa.
  • post-treatment BDI = (% change BMI) * medical vs. surgical treatment + pre-treatment BDI; this answers the question of whether, among patients equally depressed at baseline, there is any association between % change in BMI and/or treatment type and post-treatment depression, and whether the association for treatment changes based on % change BMI, and/or vice versa.
  • post-treatment BDI = treatment * comorbidities + pre-treatment BDI; this answers the question of whether, among patients equally depressed at baseline, there is any association between treatment type and/or comorbidities and post-treatment depression, and whether the association for comorbidities changes based on treatment type and/or vice versa.
  • BDI (Beck Depression Inventory) is typically very skewed. Suggested checking histograms but probably using an ordinal logistic regression model (also called proportional odds logistic regression). Trisha is currently using Prism; if Prism can't handle POLR, check into SPSS (helpful link for SPSS:

2015 June 10

Kristy Kummerow, General surgery resident

  • "I am a surgical resident and would like to attend a Biostats clinic to request help doing multiple imputation in Stata. I would prefer tomorrow (Wednesday) if there is still space, or Thursday. Please let me know whether either of these are options."
  • We discussed what should go in the imputation model (outcome to be included) and found links to help with the syntax for fitting the MI model and the final model with the MI results.

Michael Kenes, PGY-2 Critical Care Pharmacy Resident

  • "My study is looking at clinical outcomes of Clostridium difficile infections in neutropenic patients. I have collected all of the data and performed univariate analysis. I am in need of assistance in discussing how to handle patients who died within the study timeframe as well as a regression analysis."
  • Since time to diarrhea resolution was captured in the data, we recommended they use a competing risk analysis to handle the deaths.
  • Due to the low number of events, we discussed pre-specifying the model as opposed to using univariate analyses to drive model selection as well as propensity scores for data reduction.
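
The competing-risk recommendation above can be sketched as a cumulative incidence estimate, which gives the probability of diarrhea resolution by each time point while treating death as a competing event rather than as censoring. This is a minimal sketch, not the analysis done in clinic: the function name, the cause coding (0 = censored, 1 = resolution, 2 = death), and the data are all hypothetical.

```python
def cumulative_incidence(times, causes, cause_of_interest=1):
    """Aalen-Johansen style cumulative incidence for one cause.

    causes: 0 = censored, 1, 2, ... = event types (coding is an assumption).
    Returns a list of (time, cumulative incidence) at each distinct event time.
    """
    data = sorted(zip(times, causes))
    n_at_risk = len(data)
    surv = 1.0  # probability of being free of ALL events just before the current time
    cif = 0.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        same_t = [c for tt, c in data[i:] if tt == t]  # everyone leaving the risk set at t
        d_any = sum(1 for c in same_t if c != 0)
        d_k = sum(1 for c in same_t if c == cause_of_interest)
        if d_any:
            cif += surv * d_k / n_at_risk
            surv *= 1 - d_any / n_at_risk
            out.append((t, cif))
        n_at_risk -= len(same_t)
        i += len(same_t)
    return out

# Hypothetical follow-up times (days) and outcomes: 1 = resolution, 2 = death, 0 = censored
times = [2, 3, 3, 5, 7, 8, 10, 10]
causes = [1, 2, 1, 0, 1, 2, 0, 1]
cif_resolution = cumulative_incidence(times, causes, cause_of_interest=1)
cif_death = cumulative_incidence(times, causes, cause_of_interest=2)
print(cif_resolution[-1], cif_death[-1])
```

Treating deaths as censored in a Kaplan-Meier analysis would overstate the chance of resolution; the cumulative incidence keeps patients who died out of the "could still resolve" pool.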

2015 June 3

Jennifer Hale, Pediatric Pharmacy

  • "Evaluation of a computerized prescriber order entry protocol for pain management and sedation in a pediatric cardiac intensive care unit"
  • We discussed alternate outcomes besides total med dosage, ventilator-free days (necessary to have a common denominator to alleviate bias in differing stays in the ICU).
  • We recommended the Wilcoxon Rank Sum test rather than the t-test for the unadjusted tests with continuous outcomes. We also discussed fitting models to adjust for potential confounding. If normality assumptions are met, then the linear model would be most appropriate; if they are not, then other types of models need to be considered such as the logistic regression or proportional odds model.
  • We also suggested investigating average dose/day as an outcome.
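
As an illustration of the unadjusted comparison recommended above, the Wilcoxon rank-sum test can be sketched with the normal approximation. This is only a sketch under stated assumptions: the dose values are hypothetical, no tie correction is applied to the variance, and for samples this small an exact test from a statistics package would be preferable.

```python
from math import erfc, sqrt

def rank_sum_test(x, y):
    """Wilcoxon rank-sum (Mann-Whitney U), normal approximation.

    Returns (U statistic for x, two-sided p-value). Ties get midranks,
    but the variance is not tie-corrected (a simplifying assumption).
    """
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        i = j + 1
    n1, n2 = len(x), len(y)
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u = r1 - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return u, erfc(abs(z) / sqrt(2))

# Hypothetical average dose/day before and after the order entry protocol
before = [1.2, 3.4, 2.2, 5.1, 4.0]
after = [0.8, 1.1, 2.0, 1.5, 0.9]
u, p = rank_sum_test(before, after)
print(u, p)
```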

Justin Bachmann, Cardiovascular Medicine

  • I’d like to attend the health services research biostatistics clinic today in D2221 MCN. I’m conducting an analysis of the association between self-efficacy and physical activity in a cohort of 2000 patients in the Vanderbilt Coronary Heart Disease Study. Physical activity is characterized as both a continuous (MET-minutes/week) and an ordinal (low, moderate, high) variable. The independent variables include continuous, categorical and ordinal variables. I’m using negative binomial regression as well as ordinal logistic regression and would like to get the statisticians’ thoughts on these models.
  • We suggested investigating whether SAS can run a zero-inflated negative binomial model or not and whether this is necessary given his data. We also discussed the robustness of the proportional odds model when the assumptions are borderline, especially when using the Score test p-value to make the determination.

2015 May 20

Tim Shaver, Biochemistry

  • Per the email: "I am a Biochemistry graduate student working to develop a retrospective study of the correlation of novel gene fusion events with biochemical recurrence following radical prostatectomy. I previously attended the Thursday clinic on May 7 and received some guidance regarding sample size justification for an upcoming VICTR proposal. However, I have encountered some difficulties implementing your advice due to incomplete reporting in our test data set. I would like to bring some of the specific numbers and receive feedback on a new plan for our sample size and power calculations."
  • This study is a case-control study. The main issue is that the pilot data may be inaccurately or incompletely coded, so pilot estimates might be incorrect. We recommended 1) removing patients with no information from the test data set (a sizable number) in order to avoid artificially inflating the denominator; 2) using PS's case-control functionality to calculate the difference in proportions they can detect with the 300 patients they plan to sequence at various levels of recurrence rate among the general population.

2015 May 13

Dikshya Bastakoty, Department of Pathology, Microbiology, and Immunology

  • Per the email: "I am applying for a VICTR grant (part of which involves analysis of human samples for gene expression), and was looking for help with sample size determination based on recommendation by the biostatistician who reviewed my grant."
  • We recommended that she check what alternatives she could detect for each of her 3 primary proteins of interest given the maximum number of subjects she could afford.
  • We also recommended that she review the literature to see if other studies existed that supported the difference she was using for her current power calculation.
  • If the current number she reports is all that can be feasibly recruited and afforded in the time and budget constraints, we recommended she highlight that in the application.

2015 April 29

Jim Jackson and Jo Ellen Wilson, VA Quality Scholar Program

  • Per her email: "I am needing a quote for the amount of time it would require to develop an analysis plan and perform analysis for a proposed project of mine. I will be using this information to submit a request for funding from VICTR."
  • Recommended keeping it to aims 1-4, since we're not sure we have adequate data to answer aim 5. Estimate 90 hours for two manuscripts (data set is very clean, and VICTR statistician is familiar with it). We edited Jo Ellen's aims to reflect discussion of modeling specifics in clinic - Jo Ellen has this document.

2015 April 15

Eric Wise, Department of Surgery

  • Spearman correlation is univariate; multivariable regression adjusts for multiple potential confounders
  • Do NOT use univariate selection to decide which variables to put in models; instead use clinical judgment, literature search, hypothesis to decide which variables to put in - less subject to noise and confounding
  • Model selection (forward, backward, stepwise) is fraught with problems; using above approach -> no model selection algorithm, less likelihood of "data fishing"
  • Kaplan-Meier is univariate; Cox proportional hazards model good for time to event analyses with adjustment

2015 March 18

Alexandra Fish, Center for Human Genetics

  • I have a question regarding approaches to handling sampling zeros. I had previously conducted an analysis in which I used a likelihood ratio test to determine if including interaction terms substantially improved model fit. I am now trying to reproduce that analysis in a new data set, which contains sampling zeros. When I run the analysis, I am getting a p-value for the LRT, but the program is unable to estimate the betas for individual terms. So, I guess my question is - is the LRT appropriate in this situation? Can I trust the p-value? I am uncertain which of the clinic themes is most appropriate for this question.
  • She is investigating whether the interaction of two SNPs is associated with gene expression. She has fit a model with additive and dominant terms for both SNPs and their interaction; however, out of the 5000 subjects, no one is recessive for both SNPs. We were unsure of how best to approach this; however, we recommended against some suggestions she found while searching for an answer, such as simply adding a constant count to all frequencies to avoid cell counts of 0. We recommended she contact Yaomin Xu.

2015 March 11

Candace McNaughton

  • "I’d love suggestions about data manipulation, in preparation for planned analyses. I have received data pulled from the EDW that includes multiple BP and other measures per subject and over time; this data needs to be combined with prescription data (also over time), as well as with a 3rd dataset that includes measures of adherence to the blood pressure medications."

Henry Ooi, Cardiovascular Medicine Heart Failure & Transplant

  • "We have a large dataset in Stata format which we are planning to analyze. There are duplicate entries and we would like advice on how to handle this in Stata without losing data."
For both of today's clients, a couple of links on aggregating/collapsing data in Stata that may be helpful:

UCLA Stata web page with examples
Short example from Indiana University

2015 March 4

Dupree Hatch

  • "I am a Neonatal-Perinatal Fellow with an interest in Patient Safety in the NICU. I am planning to look at unplanned extubations in the context of the NICU with a future study. I have a data set which contains ~80 unplanned extubations in ~60 patients as part of a larger cohort of all infants that have received mechanical ventilation in our unit for the past year. I am hoping to describe the risk factors for infants to have these events. I have a data set which contains time-to event data for all of the infants as well as various demographic and clinical data. My needs for Biostats clinic are: Help with designing the survival analysis with repeated measures for the patients who have had multiple events. Modeling risk of unplanned extubation with postnatal age as the independent variable Quote for how much time and resources would be needed to have formal biostatistics analysis and help with preparation of the manuscript."
  • Ideal scenario would be survival model with competing risks and repeated measures; we are not sure that this exists. Possible solutions: look at time to first unplanned extubation, with death as a competing risk, censoring babies who did not have an unplanned extubation; also look at calculation of ventilator-free days per ARDSNet definition (babies who die get zero VFDs; otherwise, calculated as [time of interest, eg 28 days] - [time on vent or time after unsuccessful extubation, usually defined as extubation followed by death or reintubation within 48 hours]). Possibly look into data reduction techniques like propensity scores, since low number of events (~60) means only 4-6 degrees of freedom included in model, and association between many covariates and time on vent is very likely nonlinear.
  • Current VICTR policies:
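
The ventilator-free days calculation described above can be sketched as follows. This is a minimal illustration of the rule as stated in these notes (zero for patients who die, otherwise the window minus time on the ventilator), not the official ARDSNet code; the function name and patient data are hypothetical, and `days_on_vent` is assumed to already include any time after an unsuccessful extubation.

```python
def ventilator_free_days(days_on_vent, died, window=28):
    """VFDs within `window` days: 0 if the patient died, else window minus vent days (floored at 0)."""
    if died:
        return 0.0
    return max(0.0, float(window - days_on_vent))

# Hypothetical patients: (days on ventilator, died within the window)
patients = [(5, False), (12, True), (30, False), (0, False)]
vfds = [ventilator_free_days(days, died) for days, died in patients]
print(vfds)
```

Because deaths are scored as zero rather than dropped or truncated, short ventilation times caused by early death are not mistaken for good outcomes.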

Lisa Rae

  • "I am in the process of writing a VICTR grant application looking at changes in Plasminogen and the coagulation cascade in burn patients, with primary outcomes of: development of heterotopic ossification (incidence 1-3% of burns) and serum levels of coagulation factors after injury and during subsequent wound healing. My data will include serum lab values, xrays and photos of the healing wounds (time to wound closure)."
  • Recommend primarily descriptive study (looking at relationship between total burn percentage as continuous variable vs inflammatory markers, eg d-dimer); event rate for HO is so low (1-3% in total population) that it's unlikely we could practically enroll enough patients to get a proportion/CI within a reasonable margin of error.
  • VICTR policies:

2015 February 4

Lara Harvey

  • "...a study of BMI and its influence on FSH levels. I have three large deidentified datasets I have obtained from synthetic derivative and would like some help cleaning and compiling the data and entering it into Stata to use."
  • To read in file from SD: "insheet using "filepath"", or Import, ASCII Data Created by Spreadsheet, select file and choose Tab-delimited, OK (using Stata 10)
  • If more complicated data management is needed to find closest BMI/FSH combination, ask synthetic derivative folks where to start (bioinformatics core?) - this would fit in a 35-hour VICTR voucher, but not sure biostats vouchers can be used for data management and graphics.

George DeKornfeld, VCH Pediatric Heart Institute

  • "We are looking at low birth weight infants under 2000 g with complex congenital heart disease. We are attempting to statistically compare the surgical outcomes, in terms of complications experienced, of a group which was treated at initial presentation and a group which was first allowed to mature and grow. Our hypothesis is that it is better to treat earlier. The groups have been divided and the complications have been noted for each patient."
  • Two main outcomes: worst complication (scored per patient) and days on ventilation
  • Days on ventilation is complex due to mortality (average ~32% in population), so will be artificially truncated for patients who die. Need to account for this - look into ARDSNet definition of vent-free days for one approach.
  • Think of potential confounders; birth weight is a major one (include as continuous variable in model). Can adjust for 1-2 confounders in addition to treatment.
  • In SPSS, use ordinal logistic regression (also called proportional odds logistic regression). We caution against Excel for statistics.
  • A VICTR voucher might be helpful if more complicated techniques (propensity score) or a manuscript are desired; see policies here.

2015 January 28

Mitch Odom, 4th year medical student

  • He has a database of a few thousand; baseline testing only (no repeated measures), looking into neurocognitive and symptom scores for young athletes assessed by a computerized testing battery.
  • There are 740 subjects in his data. Each has taken a cognitive assessment (continuous measure ranging from 0 - 100).
  • The primary question of interest is to examine the association between cognitive score, cognitive status (ADHD; LD; ADHD/LD) and hours of sleep the night prior to the exam (categorized as < 7; 7-9; > 9).
  • We discussed any confounders that should be included in the linear regression and whether there should be an interaction between cognitive status and hours of sleep.
  • He might find helpful code hints for SPSS at UCLA's web site:

2015 January 21

Sarah Greenberg, Research Coordinator and Health Policy Fellow

  • Orthopaedic trauma - looking for help with a linear regression for determining complications in long bone fractures

2015 January 14

Donald H Arnold, Pediatrics and Emergency Medicine

  • The analysis is as follows:
    • Pulse oximeter plethysmograph estimate of pulsus paradoxus (PEP) is an electronic measure we have developed to measure the severity of acute asthma attacks.
    • Predictor variable: PEP
    • Primary outcome variable: FEV1, continuous variable
    • Secondary outcome variables: i. Acute Asthma Intensity Research Score (AAIRS), ordinal scored 0 to 16; ii. Airway resistance, continuous
  • Will fit 3 baseline models and 3 change models. Need to finish analysis (and possible manuscript) before April 2015. Apply $2500 VICTR voucher (~40 hours).

Cesar Molina, Orthopedic Trauma

  • Plans to attend for help with a sample size calculation where the prevalence is low.
  • Submitted manuscript and was criticized about small sample size.
  • Identify risk factors for deep infection and non-union in pts with open distal radius fractures. N=62 (only 1 infection); N=54 were followed to be able to get outcome of non-union (4 non-union)
  • Download PS (Power and Sample Size) software for sample size calculation. Choose dichotomous outcome, prospective, two proportions. For example, two groups will be diabetic and non-diabetic, compare prevalence of deep infection between the two groups. Need 140 diabetic and 140 non-diabetic pts to detect a difference between 15% and 5% with 80% power.
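
The two-proportion sample size quoted above can be roughly reproduced with the standard normal-approximation formula. This is a sketch under assumptions: PS may use a different method (for example an exact or continuity-corrected calculation), the two-sided alpha of 0.05 is assumed rather than stated in the notes, and the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size to compare two proportions (two-sided test, equal group sizes)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil(numerator ** 2 / (p1 - p2) ** 2)

# Deep infection prevalence of 15% (diabetic) vs 5% (non-diabetic), 80% power
n = n_per_group(0.15, 0.05)
print(n)
```

The result lands close to the 140 per group given above. With only 1 infection in 62 patients in the existing data, the same function shows how quickly the required sample grows as the assumed prevalences get smaller.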

Carolina Pinzon, Surgery

  • Feasibility study on prevalence of BMP7 protein in two groups of women. Have done experiments in animals, but no data in humans
  • If the proteins can be measured in humans, want to compare the protein levels between groups. Need standard deviation and a clinically meaningful difference to calculate sample size

2014 December 31

Sarah Greenberg, Research Coordinator and Health Policy Fellow

  • Orthopaedic trauma

2014 November 26

Dr. Hernandez, Jennifer Morse

  • Comparing the difference in time between two different endotracheal devices.
  • Patients in operating room. 12/15 succeeded in A, 6/13 in B. Failure was declared when more than 10 minutes was taken.
  • Could do two-step analysis. 1. examine binary success using chi-square test or logistic regression. 2. compare time difference among successful patients using linear regression.
  • Could use all the data (including censored data) and do survival analysis (log-rank test, or Cox proportional hazards model).
  • Data from multiple sites. Note the differences between sites.
  • The total time can be broken into three phases. Could analyze the three periods separately.
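
The first step of the two-step analysis (comparing success, 12/15 in A vs 6/13 in B) can be sketched with a chi-square test. Fisher's exact test is included alongside it as an addition not discussed in clinic, because the counts are small. Function names are hypothetical, and the chi-square version uses no continuity correction.

```python
from math import comb, erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]; returns (statistic, p)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p: total probability of tables no more likely than the observed one."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def prob(x):
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Rows: device A, device B; columns: success, failure
stat, p_chi = chi_square_2x2(12, 3, 6, 7)
p_fisher = fisher_exact_2x2(12, 3, 6, 7)
print(stat, p_chi, p_fisher)
```

With 28 patients the two tests give noticeably different p-values, which is one reason to also consider the survival approach that uses the actual times.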

2014 November 19

Michael Kenes and Joanna Stollings, MICU Pharmacy

  • This study seeks to analyze the natural history of delirium. In order to achieve this, the study will divide patients into two cohorts: those with a continued stop of sedation after spontaneous awakening trial (SAT), and those with sedation restarted after SAT. Based on these two cohorts, we will characterize the time to resolution of delirium after SAT and the time needed to remain delirium-free for 48 hours. Additionally, we will assess the time until reappearance of delirium once sedation is restarted in those patients who become delirium-free following SAT.
  • Given the outcomes and potentially complex data management, we estimate 120 hours for a VICTR voucher for this project.

Silky Chotai, Spine Center

  • I am working on a spine research project and planning to apply for the VICTR grant.
  • Implemented an IT protocol in July 2014 where each part of a standard-of-care protocol is checked off in the EHR system. Want to see if compliance rates and outcomes improved after this IT system was initiated.
  • Part I: Compare compliance rate for standard of care before and after IT system. This lets us determine whether there was any actual improvement in adherence to standard of care before and after IT system was implemented.
  • Part II: Compare outcomes before and after IT system, while adjusting for potential confounders (severity of injury, age, etc - determine based on clinical knowledge). Hypothesis is that protocolizing standard of care will improve patient outcomes.
  • Suggest classifying each piece of the standard of care protocol into three groups: fully compliant with best practice; received best possible care for that patient, even if not ideal standard of care; received care that did not match ideal care for no recorded reason.
  • We estimate 95 hours for this project.

2014 November 12

Wes Self, Tyler Barrett; Emergency Medicine

Examining associations of 2 hour rate control in the Emergency Department with various genetic variants.

2014 November 5

Mary Van Meter, Medical student

I am a 3rd year medical student in the planning phase of a research project that will look at the cost of sterilizing surgical trays within different specialties of gynecology. This project is not really going to be focusing on patient outcomes, and I wasn't sure if I should plan to attend a clinic on Monday because it focuses on cost outcomes of various surgeries, or if I should attend on Wednesday as it involves surgery, or if it even matters at all. I would like to attend the week of November 3-7 if possible.

Kyla Terhune, Department of Surgery

Have not been to a biostats clinic before so new to this, but would like to bring two very simple projects to the clinic:

Both on surgical education:

1) A completed project in which we assessed 50 of our interns (assessment was done by both residents and attendings). Wanted to review the assessments and query the best way to compare the assessment done by residents compared to that by attendings. (I can send data in the morning for review)

- comparing knot tying and suturing skills between two groups ("novices" and R2's)

- interested in evaluating inter-rater reliability (use a Kappa statistic, need to decide if a weighted Kappa is necessary)

- encouraging using two raters rather than three, better interpretability

- if any variables are continuous consider intraclass correlations (should focus on Kappa, Likert variables)

- Need to consider:

- how to define a clinically significant difference

- cautioned on the possibility of high variability in the small data set for the second group (n=8)

- consider randomization within the population of the "novices"

- estimate 40 hours of statistical support
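
The Kappa statistic suggested above for resident vs attending ratings can be sketched as follows, with an optional linear weighting for ordinal (Likert-type) scores. The ratings are hypothetical and the function is an illustration only; a statistics package would also supply confidence intervals.

```python
from collections import Counter

def cohen_kappa(rater1, rater2, categories, weighted=False):
    """Cohen's kappa for two raters; linear disagreement weights if weighted=True."""
    n = len(rater1)
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    observed = Counter(zip(rater1, rater2))
    m1, m2 = Counter(rater1), Counter(rater2)

    def w(i, j):  # disagreement weight between category positions i and j
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    d_obs = sum(w(idx[a], idx[b]) * cnt for (a, b), cnt in observed.items()) / n
    d_exp = sum(w(idx[a], idx[b]) * m1[a] * m2[b]
                for a in categories for b in categories) / n ** 2
    return 1 - d_obs / d_exp

# Hypothetical 1-5 skill ratings of the same 10 interns by a resident and an attending
resident = [1, 2, 3, 3, 2, 1, 4, 4, 5, 5]
attending = [1, 2, 3, 4, 2, 2, 4, 5, 5, 5]
scale = [1, 2, 3, 4, 5]
kappa = cohen_kappa(resident, attending, scale)
kappa_w = cohen_kappa(resident, attending, scale, weighted=True)
print(kappa, kappa_w)
```

The weighted version gives partial credit when raters differ by only one category, which is the consideration behind deciding whether a weighted Kappa is necessary.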

2) Review the study design for an upcoming project that we intend to complete on video assessment of interns and basic technical skills. We have submitted the IRB for approval, and have the basic study plan in mind but wanted to review the study design prior to beginning, and would potentially apply for a VICTR grant with this one.

- also consider using spaghetti plots to compare scores between raters

- estimate 60 hours of statistical support

2014 October 29

Jonathan Schildcrout and Yaping Shi

2014 October 22

Jennifer Morse, Perioperative Clinical Research Institute

  • We created an educational module for 7 fellows who answered daily questions to prepare them for their board exam. We would like to compare their daily quiz participation and scores with their final exam scores. Data to be collected includes:
    Quiz Data
    Number of questions attempted
    Number of questions answered correctly
    Above data broken down by question category

    MCCKAP Data: scores broken down by question category

    Fellowship Data
    Number of hours worked during the specified time period
    Number of procedures performed
    Number of attendances to lectures
    Number of attendances to evaluations

  • Basic plan: write up educational intervention for critical care/anesthesia fellows (daily question emailed to all fellows). Compare pre- and post-intervention board exam scores, overall and by subscores. Suggested basic descriptive statistics for pre- and post-intervention groups (N=7 fellows in current data, with another 14 fellows undergoing intervention now), stripcharts and Wilcoxon test to compare pre- and post-intervention scores.

2014 October 1

Craig Sheedy, Emergency Medicine

  • This is a follow-up visit to the one made September 10. Craig is working with Don Arnold in Pediatric Emergency Medicine as his mentor on a study looking at whether passive oxygenation positively influences O2 sat levels during intubation. They have retrospective data on 44 subjects who were intubated without passive oxygenation. Because this practice is now standard of care at Vanderbilt, a randomized trial cannot be used to study their question of interest.
  • Retrospective data is limited to 44 patients, with main outcome of interest = lowest O2 sat during time it takes to intubate. Outcome is not normally distributed, so for actual analysis, something like Wilcoxon test is more appropriate, but using PS's t-test calculations and retrospective data, we roughly estimate 80% power to detect a difference of 15% saturation with 1:1 ratio (new vs retrospective data), and about 13% with 2:1 ratio.
  • Using lowest saturation during intubation attempt as the outcome, adjusting for age and possibly race and gender, we estimate about 40 hours for analysis.

2014 September 24

Justin Godown, Pediatric Cardiology

  • Email: I have an analysis to be performed involving a multivariable logistic regression looking at risk factors for antibody development after pediatric heart transplantation. I was planning to apply for a VICTR voucher for this project. Do I need to attend a clinic to discuss this prior to applying? How many hours do you usually estimate for a project like this? It should be fairly straightforward.
  • Suggest limiting followup to first five years after transplant, limiting patient population to those transplanted >=5 years ago; this deals with issue of very different followup times (patients in database transplanted from 1987 - 2014). Possible secondary analysis looking at development of antibodies by one year after followup.
  • Covariates include ischemic time, pre-transplant antibody presence, etc. Anticipate ~50 events, so can include only 5 parameters in model.
  • Goal is abstract with a possible manuscript; Justin says data is very clean (stored in REDCap). Estimate 40 hours for manuscript.
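
The link between event count and model size used above (about 50 events, so about 5 parameters) can be sketched as a simple budget. The 10 to 15 events per parameter range is an assumption inferred from the numbers quoted in these notes, not a fixed rule, and the non-event counts below are hypothetical.

```python
def parameter_budget(n_events, n_nonevents, per_param_low=10, per_param_high=15):
    """Rough (conservative, liberal) number of model parameters a binary outcome can support."""
    limiting_n = min(n_events, n_nonevents)  # the smaller outcome category drives the budget
    return limiting_n // per_param_high, limiting_n // per_param_low

print(parameter_budget(50, 150))  # antibody development example; 150 non-events is hypothetical
print(parameter_budget(60, 500))  # unplanned extubation example; 500 non-events is hypothetical
```

Continuous covariates modeled nonlinearly and categorical covariates with several levels each use more than one parameter, so the number of distinct risk factors may be smaller than the budget.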

2014 September 17

Travis Ladner and Eric Wise, Surgery

  • Genetic research in cardiovascular disease using BioVU
  • 100 patients, 20 vasospasms (on initial screen), 40 possible SNPs; thinking of ~40 logistic regression models (vasospasm = SNP + covariates)
  • recommend either Tuesday biostat clinic or VANGUARD clinic in PRB

Justin Godown, Pediatric Cardiology

  • The project is looking at strain (a measurement by echocardiogram) in patients after heart transplant. We want to compare the measurement to a group of normal controls at different time points after transplant. We also want to see how this measurement changes in the setting of rejection or coronary disease.
  • Transplant patients get echos twice a week right after transplant, spaced out to every three months or so eventually
  • Hypothesis is that rejection and coronary disease might be able to be detected earlier via a change in strain measured on echocardiogram
  • "Normals" will be patients who are referred to clinic for murmurs, etc, but have normal echocardiogram; will be matched by age and gender at each time point
  • Question 1: compare strain values in transplant patients to "normals" at transplant, 1 month, 1/3/5 years after transplant; probably estimate 40 hours for this portion
  • Question 2: predict rejection by previous echocardiogram values, time between previous echo and rejection, and interaction term between them for effect modification (logistic regression with repeated measures - use Huber-White sandwich estimation using patient as cluster); estimate about 100 hours for this portion
  • Check with Frank Harrell, Yanna Song and/or Chris Slaughter for possible collaboration plan; otherwise, plans to apply for VICTR
  • Strongly suggest using REDCap for data collection, since it will make data management/cleaning/analysis much easier

2014 September 10

Craig Sheedy, Emergency Medicine

  • "We are trying to start a project in the pediatrics ED and would like to calculate sample sizes needed for the study."
  • Craig is working with Don Arnold in Pediatric Emergency Medicine as his mentor on a study looking at whether passive oxygenation positively influences O2 sat levels during intubation. They have retrospective data on 45 subjects who were intubated without passive oxygenation. Because this practice is now standard of care at Vanderbilt, a randomized trial cannot be used to study their question of interest.
  • We used PS to look at various sample sizes and different ways of approaching the study to determine whether the study will have sufficient power to detect a difference based on the number of subjects that will be feasible to recruit within a year.
  • Craig will discuss with his mentor and both will return to clinic for further evaluation after discussing the results we saw today.

2014 September 3

Silky Chotai, Vanderbilt Spine Center

  • I am a research fellow at the Vanderbilt Spine center, we are working on a grant proposal. I have some questions regarding the biostatistics.
  • Primary aim is predictors of patient-centered outcomes, specifically pain (continuous score from validated scale). However, different surgery types get different pain scales, so cannot include all surgery types in the same model for pain (six types of pain scores). (EQ5D, a QOL score, is used across surgery types.)
  • Secondary aim is to use surgical patients in the spine registry to look for predictors of high direct/indirect/total costs for patients with various surgery types. Start by looking at distribution of cost outcomes/residuals - linear regression might be appropriate, but need to look at distribution to see whether transformation or other model type is necessary.
  • Tertiary outcome: identifying cost "outliers" - factors which determine patients who have unusually high costs.
  • Some patients are included multiple times due to revision surgeries, and all are followed up for pain scores at multiple time points, so repeated measures are important.
  • Strongly advise against using univariate analyses to select potential predictors; this will lead to potentially misleading and/or nonreproducible results. Instead, use clinical knowledge to select potential risk factors and interactions/effect modifications to include in models.
  • Plan is to apply for VICTR voucher.
  • For future projects with smaller sample sizes, discussed data reduction techniques like propensity scores, prioritizing degrees of freedom, etc.
  • Mentor is unable to come on Wednesdays due to OR schedule, so recommend returning on a different day and having him available for discussion.
  • Possibly split into two separate projects, since all the above would easily run >200 hours. Planning to use REDCap for data collection.

2014 August 27

L. Tyson Heller, Jennifer Green, Dr. Rice, not present, Med/Peds

  • Under the umbrella of improving IV access on the general medicine floors in general, we have a proposal for a simultaneous study on the placement of ultrasound-guided peripheral IVs placed by Medicine Housestaff.
  • Design for testing intraosseous catheters against central lines. the catheters can be put in much more quickly than the central lines
  • Discussed whether/how to randomize in previous clinic
  • For patients who have codes, there is about 5-10%
  • Increase in return of spontaneous circulation (y/n)... event note is sent to redcap
  • cerebral
  • Questions about designing a study.
    • What to collect
  • How is time off of iv access measured?
  • other study: would training in ultrasound _ decrease the time off
  • Clinically, being without iv access for more than 4 hours is unacceptable.
  • They are trying to decrease the time between being without access until IV therapy places the IV.
  • Timepoints: time of losing access, IV consult request, time IV placed by IV therapy.
  • Could there be additional variability due to the prioritizing for urgent patients?

Douglas Conway, Vanderbilt Institute for Clinical and Translational Research

  • I would like to attend the Wednesday, 8/27/14 biostats clinic regarding an upcoming RCT. We need some power calculations done to determine the study size needed. We have collected pilot data from around ~70 individuals through a survey. One of the outcomes of the study will hopefully be a significant change in quality of life (QoL), measured by a 29 item instrument/questionnaire taken at the beginning of the trial and several more times throughout. We want to know how many people we will need to enroll to hopefully show significant, powered stats of change. The 29 item instrument is within a larger set of questions that makes up our pilot data. That raw data has been attached as well as a scoring guide generated by the institution that created the instrument. I look forward to meeting with you all tomorrow.

2014 August 20

Ahilan Sivaganesan, MD - Neurosurgery

  • Needs help with database design, data cleaning, and possible extraction of data from StarPanel/Wiz.

Ashly Westrick, MPH - Neurosurgery

  • She is working on a VICTR application for funds for biostat support for a retrospective abusive head trauma study and needs to include the proposed length of time for analysis (cost, etc) in the application.
  • Wants to describe the population. Model with disposition as the outcome, i.e. home with mother/father, home with other family, rehab. Model with death as the outcome (has about 25 deaths).
  • Approximate 60 hours for analysis and manuscript.

2014 July 30

Luke Krispinsky, Pediatric Critical Care Fellow

  • Needs an estimate of time needed for VICTR application.
  • Main outcome is endothelial function post-bypass vs exposures including age, weight, baseline endothelial function, potentially sedation (though sedation is complex because different drugs are used)
  • Secondary analyses: correlate endothelial function with additional outcomes (peak lactate, fluid, vasoactive inotrope score) and correlations between endothelial function and biochemical qualities
  • Eventually look at mortality, time on MV, time to ICU and hospital discharge, but not in the scope of the first manuscript
  • Check into potential collaboration with cardiothoracic surgery (Frank or Hui?); if VICTR voucher is submitted, estimate max of 100 hours for above

Malena Outhay, SOM

  • Gave intervention in trauma education to both medical personnel and laypeople in Mozambique, along with pre- and post-intervention tests; currently have 88 subjects, roughly 50/50 medical vs laypeople
  • Main question: are pre- and post-intervention test scores different, and does that difference depend on medical vs layperson (may be other potential confounders as well, but incomplete data on these)
  • Test scores could be 0-100; actual data ranges from about 13-90% and is pretty normally distributed
  • Potential collaboration with IGH - check with Meridith to see if this project is covered
  • Recommend paired t-test for first question, then linear regression: post-test = pre-test + group, or possibly post-test = pre-test * group (interaction + main effects) if hypothesis is that regression slopes will be different for medical personnel vs laypeople
  • Adjusting for additional confounders difficult due to incomplete data
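
The recommended analysis can be sketched in R roughly as follows. This is only an illustration: the data frame `scores` and its columns (`pre`, `post`, `group`) are hypothetical names, and the values are simulated, not the study data.

```r
## Simulated stand-in for the real data (88 subjects, half medical, half laypeople)
set.seed(1)
n <- 88
scores <- data.frame(
  group = factor(rep(c("medical", "layperson"), each = n / 2)),
  pre   = round(runif(n, 13, 80))
)
scores$post <- pmin(100, round(scores$pre + 10 + rnorm(n, sd = 8)))

## First question: do pre- and post-intervention scores differ overall?
tt <- t.test(scores$post, scores$pre, paired = TRUE)
tt

## Second question: does the post-test score depend on group, adjusting for pre-test?
fit.main <- lm(post ~ pre + group, data = scores)
summary(fit.main)

## Interaction model: lets the pre/post slope differ for medical personnel vs laypeople
fit.int <- lm(post ~ pre * group, data = scores)

## Compare the nested models (this tests the interaction term)
anova(fit.main, fit.int)
```

The interaction model spends one extra degree of freedom, so with 88 subjects it is probably best reserved for the case where different slopes are an actual hypothesis.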

2014 July 23

Calvin Gruss, Department of Anesthesiology

  • Calvin is asking for help in analyzing data from an anesthesiology project comparing the accuracy of a neck circumference estimation with a true neck circumference.
  • Two main questions: is "gold standard" neck circumference measurement repeatable, and is a new method (via digital photo) as reliable as gold standard?
  • Recommend Bland-Altman analysis (limits of agreement) for both, but not with an Excel add-on package; look for resources like SPSS (possibly contact Jonathan Schildcrout or Matt Shotwell for direction, since this falls under the anesthesia collaboration)
  • Possible reference
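
For orientation, a Bland-Altman analysis needs only the paired differences and their standard deviation. The sketch below uses base R; `bland.altman()` is a helper written for this example (not a library function), the variable names are hypothetical, and the measurements are simulated.

```r
## Simulated paired neck circumference measurements in cm (not the study data)
set.seed(2)
tape  <- rnorm(40, mean = 38, sd = 3)            # "gold standard" measurement
photo <- tape + rnorm(40, mean = 0.3, sd = 0.8)  # hypothetical photo-based estimate

## Helper for this example: bias and 95% limits of agreement
bland.altman <- function(a, b) {
  d    <- a - b                          # paired differences
  bias <- mean(d)                        # average disagreement
  loa  <- bias + c(-1.96, 1.96) * sd(d)  # 95% limits of agreement
  list(means = (a + b) / 2, diffs = d, bias = bias,
       lower = loa[1], upper = loa[2])
}

ba <- bland.altman(photo, tape)
c(bias = ba$bias, lower = ba$lower, upper = ba$upper)

## Bland-Altman plot: difference vs mean of the two methods
if (interactive()) {
  plot(ba$means, ba$diffs, xlab = "Mean of methods (cm)",
       ylab = "Photo - tape (cm)")
  abline(h = c(ba$lower, ba$bias, ba$upper), lty = c(2, 1, 2))
}
```

The same helper applies to the repeatability question by passing two repeated gold-standard measurements instead of two methods.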

Austin Adams, ENT Surgery

  • Follow-up questions from last week's clinic. Helped with PS calculations.

2014 July 16

Kelvin Moses, Urologic Surgery

  • wants preliminary results for a grant submission.
  • requesting data from Southern Community Cohort Study (SCCS). Needs power analysis and statistical plan for the data request.
  • applying for VICTR biostats support for funding for this prelim project. Needs estimate.

Alex Seelochan, Anesthesia

  • Original email: My name is Alex Seelochan, and I am currently affiliated with the summer anesthesia program at Vanderbilt. I wanted to request whether I may come to the Wednesday class of the Bio statistics Lab to revise Fisher's Exact Test. Specifically, I do have data to consider. I have attached the following table for your reference. My mentor (Dr. Thomas Austin) and I are trying to do the appropriate analysis and extrapolate patient sample needed for significance. Moreover, I have been using PS.
  • There are two groups, intubated at 20 and 30 centimeters of water; main question is whether post-intubation pressures are different between the two groups. Original plan was to collapse into "in acceptable range" vs. not; we instead recommend keeping these values continuous, graphing data (boxplot + stripchart), and doing a t-test or, more likely, a Wilcoxon rank sum test to compare the two groups.
  • Recommend using SPSS for graphs and analysis rather than Excel.
  • PS can calculate number of patients needed in each treatment group to see a difference of __ assuming one exists.
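
The clinic suggested SPSS for this analysis; for readers who use R instead, the same steps might look like the sketch below. The data frame `cuff`, its column names, and the simulated pressures are all assumptions made for illustration.

```r
## Simulated post-intubation pressures for the two groups (not the study data)
set.seed(3)
cuff <- data.frame(
  group    = factor(rep(c("20", "30"), each = 15)),
  pressure = c(rnorm(15, mean = 22, sd = 4), rnorm(15, mean = 27, sd = 6))
)

## Graph the raw values: boxplot with the individual points overlaid
if (interactive()) {
  boxplot(pressure ~ group, data = cuff,
          xlab = "Group", ylab = "Post-intubation pressure")
  stripchart(pressure ~ group, data = cuff, vertical = TRUE,
             method = "jitter", add = TRUE, pch = 16)
}

## Keep the outcome continuous and compare the two groups
wt <- wilcox.test(pressure ~ group, data = cuff)  # Wilcoxon rank sum test
tt <- t.test(pressure ~ group, data = cuff)       # Welch t-test, for comparison
wt
tt
```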

Austin Adams, ENT Surgery

  • Planning to apply for VICTR voucher - will eventually need estimate of hours
  • Plans prospective RCT comparing two types of intubation, with several outcomes of interest; will collect data via REDCap
  • Will use PS to calculate sample size on primary outcomes: aspiration (yes/no) and patient satisfaction postop; need pilot data/estimates for both quantities (eg, what % do we expect to aspirate in each group, or how satisfied are patients under usual care and how much of a difference would be meaningful - need measures of variability, like standard deviation, in addition to estimates)
  • Also need to figure out how to measure patient satisfaction - visual analog scale, Likert scale, simple satisfied vs. not satisfied? This will affect power calculations
  • Use REDCap to full advantage - take advantage of numeric fields/ranges, dropdowns, etc (this will maximize stats support by minimizing data cleaning time)
  • Talk to Matt Shotwell and Jonathan Schildcrout (PhD biostatisticians) about possible collaboration plan with anesthesiology
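
The notes plan to use PS for the sample size calculation. As a rough cross-check, base R offers `power.prop.test()` and `power.t.test()`. Every input below (aspiration rates, satisfaction difference, standard deviation) is a placeholder, not a pilot estimate from this study.

```r
## Binary primary outcome (aspiration yes/no): placeholder rates of 10% vs 3%
n.aspiration <- power.prop.test(p1 = 0.10, p2 = 0.03,
                                power = 0.80, sig.level = 0.05)

## Continuous outcome (e.g., satisfaction on a 0-100 visual analog scale):
## placeholder difference of 10 points with a standard deviation of 20
n.satisfaction <- power.t.test(delta = 10, sd = 20,
                               power = 0.80, sig.level = 0.05)

ceiling(n.aspiration$n)    # patients needed per group, binary outcome
ceiling(n.satisfaction$n)  # patients needed per group, continuous outcome
```

Changing how satisfaction is measured (visual analog scale, Likert scale, or satisfied vs. not) changes which of these two calculations applies, which is why the measurement choice needs to be settled first.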

2014 July 9

Catherine Bulka, Anesthesiology

  • General question: propofol dosing required for loss of consciousness has been shown to differ by race, but providers are often not considering race in choosing doses; wondering whether this will be improved by educational intervention for providers
  • Concern is that studies which suggest difference between races may not be strong and/or generalizable
  • Also, given VUMC patient population, unlikely that we'd be able to see any differences between races other than white vs. African-American
  • Alternate research question: assuming everyone starts at same loading dose (by weight), do different races require different maintenance doses? Recommend mixed effects approach to account for differences among providers.

Anji Wall, General Surgery/Bioethics

  • Original email: "I am a general surgery resident, with a PhD in bioethics, and am planning to start a project assessing the common ethical issues discussed in MMI conferences. I have a paper survey tool and a coding guide, which I am planning to use for data collection... I would like assistance with determining sample size, format for data collection and the type of analysis to conduct. I do not have research funding or a formal research mentor but will attempt to get funding through VICTR if this is something that you all think would be warranted."
  • Recommend collecting data in REDCap - no need for numeric vs. character coding, etc
  • Main goal is to determine how best to educate surgeons on clinical ethics topics; main question: are ethical issues discussed equally often in morbidity vs mortality cases?
  • For descriptive purposes, plan to start with 100 cases, then use that as pilot data or proceed with morbidity vs mortality comparisons from there

2014 June 18

Jennifer Morse and Emmanuel Okenye, Perioperative Clinical Research Institute

  • "I am assisting one of our summer students with a research study. We are planning on attending the Wednesday clinic together to get some advice on performing an analysis.

The investigator is trying to determine if there is a statistical difference between the time to successful intubation between 2 devices when used on mannequins. There are two sites (Us and San Antonio). At each site, there were 5 anesthetists who each performed 6 trials with each device.

In my initial analysis, it was determined that the total time to success was not normally distributed. The two devices appear to result in different lengths of time but I am unsure what test to use. Mann-U? Can that account for the multiple trials per individual? In addition, there was a large difference between the two sites due to a different mannequin and different experience levels of the anesthetists (This demographic data was not captured)."

  • Data are skewed, but not terribly, so suggest a linear model with sandwich estimation to adjust for within-subject correlation (checked model diagnostics in clinic). Example code:

## Requires the rms package, which provides ols(), robcov(), datadist(), and Predict()
library(rms)

dd <- datadist(mydata); options(datadist = 'dd') ## needed to get predicted values

## Model without interaction, since interaction may be underpowered with 10 subjects
## Fit original model, without accounting for within-subject correlation
mod1.ols <- ols(Total.time ~ Site + device, data = mydata, x = TRUE, y = TRUE)
## Use Huber-White sandwich estimation to account for within-subject correlation
mod1.robcov <- robcov(mod1.ols, cluster = mydata$Subject)
mod1.robcov ## get coefficients, p-values

## Plot predicted times for each site, device held at mode of other variable (eg, predicted times for Vanderbilt and SA held at most frequently tested device)
## (the plotting call was lost from these notes; one likely form is shown)
plot(Predict(mod1.robcov, Site))

## Calculate predicted values for all combinations of site, device; save as data set to use in additional plots
## (the object name was lost from these notes; mod1.pred is a placeholder)
mod1.pred <- Predict(mod1.robcov, Site = c('Vanderbilt', 'San Antonio'), device = c('LMA', 'i-gel'))

## Repeat above for interaction model as sensitivity analysis, replacing "+" with "*" in ols() call
## Repeat also for time for first step in process (similar outcome distribution)

2014 June 11

Stuart Ross, Anesthesia

  • "I'm in the beginning stages of a project with the anesthesia department and I wanted to get some ideas about how to best collect information. Later I'll be getting data from charts here at Vanderbilt, but the goal is to compare patients here with those elsewhere. The gist of it is comparing how patients from various countries/regions differ from those here. What I want to do is take information from however many sources I find and organize it in a way that makes it easy to use/search/etc. Maybe a simple Excel spreadsheet will do, but I wanted to make sure there wasn't an easy way of doing this that I might not be aware of."

2014 June 4

Tom O'Lynnger, Neurosurgery

  • "I’m interested in attending the Wednesday biostats clinic to discuss a project I am conducting about outcomes in pediatric traumatic brain injury after ICU protocol implementation. The main analysis is an ordered logistic regression involving Glasgow Outcome Scale and a second ordered logistic regression involving discharge disposition. I’d also like to predict favorable discharge disposition using logistic regression. I have 129 total patients and have already done an analysis myself (I’m an MPH student in addition to being a resident in neurosurgery) that I believe is accurate but would like to confer with an expert. If possible it’d be great to go over the analysis during the session, though if not, I’d plan on getting VICTR support."
  • We went over Tom's analysis and made some suggestions, including: describing continuous variables with medians and IQRs instead of means and SDs; doing Wilcoxon tests rather than t-tests for descriptive statistics/table 1; removing sex and race from the model to avoid overfitting; combining the single patient discharged to acute care with the patients discharged to rehab; making sure that Stata is coding the outcome variable as expected; producing a boxplot of raw data for before/after and favorable/unfavorable discharge disposition vs GCS.

2014 May 21

Heidi Smith and Natalie Jacobowski, Psychiatry and Anesthesiology

  • Study to describe pediatric delirium in ICU.
  • Want to study relationship between diagnoses of delirium and physicians' use of certain descriptors.
  • Intensivists' description of patients who were diagnosed as having delirium.
  • They have developed a list of areas: agitation,
  • One factor is whether delirium is mentioned in the problem list and in the plan. There is a daily physician note.
  • Some medications could contribute to delirium. This could be reflected in nurse notes or medication record.
  • They have three observation times for each patient. The day prior, the day of, and the day after the day delirium was diagnosed.
  • Discussed need for comparison with patients who were not diagnosed with delirium for the inferences they are interested in.
  • Could select controls based on matching on important patient factors.
  • It is important to consider which day's observation to use for patients who didn't have a diagnosis of delirium. One option is to consider the
  • You could also potentially consider including all observations for the patients.
  • They are considering using VICTR for biostatistical support.
  • We think Jennifer Thompson would be well suited for this project and should give the time estimate.

Ben Mackowiak, Neonatology

  • Acidosis and pulmonary hypertension in neonates.
  • Has experiments on piglets whose pulmonary vessels were exposed to acid in three doses until a certain pH is reached.
  • They have a machine that reads the percent dilation
  • Discussed problems with analysis on percent change. An alternate way to control for the initial size is to use a regression model controlling for the initial size.
  • You can estimate the (absolute) mean difference between the initial and final data. You could do a paired t-test between the baseline and result after the first application of acid.
  • A good approach would be a mixed effects regression model with a fixed effect for dose and a random intercept for subject (pig vessels).
  • Should plot the trajectories and see how linear they are.
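
A minimal sketch of the mixed effects approach described above, assuming the nlme package (one of R's recommended packages) is available. The data are simulated, and the names `long`, `vessel`, `baseline`, `dose`, and `diameter` are hypothetical. Baseline size enters as a covariate instead of being divided out as a percent change.

```r
library(nlme)  # assumption: nlme is installed (it ships with standard R builds)

## Simulated data: 12 vessels, each measured at 3 acid doses (not the study data)
set.seed(4)
n.vessel <- 12
baseline <- rnorm(n.vessel, mean = 100, sd = 10)  # initial vessel size
shift    <- rnorm(n.vessel, mean = 0, sd = 5)     # vessel-specific offset

long <- expand.grid(vessel = factor(seq_len(n.vessel)), dose = 1:3)
idx  <- as.integer(long$vessel)
long$baseline <- baseline[idx]
long$diameter <- long$baseline + shift[idx] - 6 * long$dose +
  rnorm(nrow(long), sd = 3)

## Fixed effects for baseline size and dose; random intercept for each vessel
fit <- lme(diameter ~ baseline + dose, random = ~ 1 | vessel, data = long)
fixef(fit)

## Trajectories per vessel, to judge how linear the dose response looks
if (interactive()) {
  interaction.plot(long$dose, long$vessel, long$diameter, legend = FALSE,
                   xlab = "Dose", ylab = "Diameter")
}
```

Dose is treated as linear here only for brevity; if the plotted trajectories are clearly nonlinear, `factor(dose)` is a simple alternative.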

2014 May 7

Shreyas Joshi, Urology

  • Applying for a VICTR grant and would appreciate assistance powering our study and determining the most appropriate data analysis plan for the study.
  • Overall, 49 patients died.
  • Looking to correlate preoperative sarcopenia with postoperative outcomes in patients undergoing surgery for Renal Cell Carcinoma (RCC).
  • We are using a program that determines the skeletal muscle index on preoperative CT scans to obtain our "preoperative sarcopenia index" variable. Sarcopenia is lack of muscle mass. It is a newer measure of nutrition.
  • It may be nice to have some or all of the scans re-scored to see how reliable it is. You can look at the agreement in the scores. Or, if there is already a study published,
  • Have 250 pre-op CT scans. All patients who get the surgery should have the scan. Whether they have the scans would maybe depend on the referral patterns.
  • So far, we have overall and disease-specific survival data, and we are working on gathering 30/90 day complication rates and hospital-free-days.
  • We hope to be able to power the study for survival (overall or disease-specific) in order to move forward with data analysis.
  • They are applying for biostatistics support, and we estimate that the project and manuscript will require 50 hours of statistician work.

2014 April 30

Calvin Gruss, Anesthesiology

  • Studying the effects of acute hypoxia longitudinally in ~100 healthy subjects. The study has three phases: subjects had hypoxia first, then carboxyhemoglobin/methemoglobin, and finally hypoxia + elevated carboxy/methemoglobin. Each subject had two runs. The number of measurements per subject ranges from 20 to 26. Interested in assessing the relationship between % carboxyhemoglobin and hemoglobin concentration in the blood.
  • Suggest using a mixed effects model. Additional statistical help can be obtained from Dr. Schildcrout, Dr. Shotwell, or VICTR biostat support.

2014 April 2

Christy Goben, PICU

  • Needs quote for VICTR biostat support; study in pediatric critical care with a focus on sedation trends in the PICU over the last decade with delirium impact
  • No delirium screening prior to 2008-2009 (PCAM introduced); now done on children 5yo+
  • Delirium re-education done in 2011, so plan to compare three time periods (no screening, post-screening education, post-re-education)
  • Expect overall use of sedation to decrease over time with possible exception of Precedex/dexmedetomidine
  • Pulling data from StarPanel, ICU only (not after transfer to floor); suggest collecting in longitudinal format rather than summary if feasible
  • Likely to have multiple ICU stays for some patients; will have identifier, so can control for within-patient correlation
  • Plan to describe delirium prevalence, use/dose of several sedatives and antipsychotics (primary goal), and secondarily, correlate/model these vs outcomes (ICU LOS, hospital LOS, time on vent, mortality)
  • For outcomes, need to collect potential confounders as feasible (eg, use of pressors, sepsis, diagnosis/procedure codes, SOI, etc - whatever seems reasonable)
  • Suggest very preliminary estimate of 120 hours, pending discussion on informatics/pulling data

2014 March 26

Luke Krispinsky, PICU

  • Pediatric critical care fellow looking at endothelial dysfunction before and after cardiopulmonary bypass in infants (0-12m) undergoing repairs of congenital heart defects, using iontophoresis and a monitor, made by Perimed, that can quantify distal perfusion
  • Since this is a new machine, suggest taking multiple measurements on same patient/same time to gauge reproducibility - may be restricted by cost ($13/probe)
  • Mainly interested in a) describing change in endothelial function b) seeing how change is associated with outcomes (eg, ICU LOS)
  • Exposures: difference between baseline and lowest (post-surgery) measurement, difference between post-surgery and 24h measurement, and AUC
  • Outcomes: ICU (average 3-14 days, depending on type of surgery and other variables) and hospital LOS (varies widely), inotrope score (need for BP meds), fluid requirements, vent LOS, mortality (<10%) - within defined study period, like 30 days?
  • Next steps: define exposure(s) and outcome(s) of primary interest, get estimates of variability on those outcomes from literature, think about potential confounders like severity of illness
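
The three candidate exposures listed above can each be computed from one patient's series of measurements. In this sketch `auc.trapezoid()` is a helper written for the example (trapezoid rule), and the time points and perfusion values are made up.

```r
## Area under the curve by the trapezoid rule (helper for this example)
auc.trapezoid <- function(time, value) {
  o     <- order(time)
  time  <- time[o]
  value <- value[o]
  sum(diff(time) * (head(value, -1) + tail(value, -1)) / 2)
}

## Hypothetical perfusion readings (arbitrary units) for one patient
hours     <- c(0, 2, 6, 12, 24)   # hours since the baseline measurement
perfusion <- c(50, 20, 25, 35, 45)

drop.from.baseline <- perfusion[1] - min(perfusion)  # baseline minus lowest
recovery.by.24h    <- perfusion[5] - min(perfusion)  # 24h value minus lowest
auc                <- auc.trapezoid(hours, perfusion)

c(drop = drop.from.baseline, recovery = recovery.by.24h, auc = auc)
```

For these made-up values the drop is 30, the recovery is 25, and the AUC is 820 unit-hours.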

2014 March 12

Christy Goben, PICU

  • Needs quote for VICTR biostat support.
  • Study in pediatric critical care with a focus on sedation trends in the PICU over the last decade with delirium impact

Sarah Scott, MD candidate, Director of Pharmacy Shade Tree Clinic

  • Needs a quote for VICTR biostat support.
  • Small study using a cohort of pediatric critical care patients. Her topic is the association of acute kidney injury and mortality in children on ECMO.

Angela Maxwell-Horn, Developmental Medicine

  • Briefly, I am doing a training on developmental screening for pediatric residents when they rotate through my department (developmental medicine). I am going to examine the well-child checks that they do in their continuity clinic both before and after the training to see if their practice changes in how they screen and refer. Another aspect is that I am also contacting their preceptors in the clinic to see if they think the resident does a better job of screening after the training. Currently, there are 26 pediatric interns. I want to know how many patient charts I need to look at before and after the training to make any results significant. Additionally, we are thinking of breaking up the next intern class into two groups: one group would get the in-person training by me, and the other group would watch a video recording online. I am concerned, however, that this will considerably lower the power of my study.
  • Recommended: sampling all relevant charts for each intern from month prior to and month after training; maybe stratified chi square test for # charts with appropriate screening. First step: how many interns are currently doing appropriate screening?
  • Angela will talk to preceptor and finalize outcome variable, get initial idea of how many people are doing screening correctly pre-training.
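
One way to carry out a stratified chi-square test in R is the Cochran-Mantel-Haenszel test, with intern as the stratum. The chart counts below are invented for three hypothetical interns, purely to show the data layout.

```r
## Invented counts: screened (yes/no) x period (before/after) x intern
charts <- array(
  c(4, 6, 8, 2,    # intern A: before yes, before no, after yes, after no
    3, 7, 6, 4,    # intern B
    5, 5, 9, 1),   # intern C
  dim = c(2, 2, 3),
  dimnames = list(screened = c("yes", "no"),
                  period   = c("before", "after"),
                  intern   = c("A", "B", "C"))
)

## Cochran-Mantel-Haenszel test of screening vs period, stratified by intern
mh <- mantelhaen.test(charts)
mh
```

Stratifying by intern keeps each intern's charts compared with that same intern's earlier charts, which matters because charts from one intern are not independent.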

2014 March 5

Kendell Sowards, Instructor in Surgery

  • Has requested feedback on a power calculation.

Eileen Duggan, Pediatric Surgery

  • Questions regarding how to treat missing race and insurance data in large dataset; how to build best model for overall adverse event binary outcome (specifically how to adjust for hospital clustering, how to treat race and insurance variables (i.variable or different variables for each race/insurance status), and choosing the best model); and working with an interaction term in this model.
  • We recommended adjusting for hospital using a random effect in her logistic regression model and that the best way to build her model was through pre-specifying predictors to include based on literature and clinical knowledge.
  • Her data appear to be missing at random, so we recommended that she impute the missing values to avoid biased estimates.
  • We also discussed how best to deal with and present terms with interactions -- always report together, never just report a main effect.
  • Finally, we discussed different ways of coding categorical variables such as race. She had seen in the literature that sometimes it is recorded as a multi-level single variable and other times it is recorded as separate indicator variables. Much depends on her question, and we discussed the different interpretations of the different ways of coding the variables.
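
The two coding schemes are closely related. The notes concern Stata, where the `i.` prefix does the expansion; the short R sketch below shows the same idea with a made-up `race` vector: a single multi-level variable is expanded into one indicator per non-reference level.

```r
## Made-up example values; "White" is set as the reference level
race <- factor(c("White", "Black", "Other", "White", "Black"),
               levels = c("White", "Black", "Other"))

## A single multi-level variable becomes k - 1 indicator columns in the model
X <- model.matrix(~ race)
colnames(X)   # "(Intercept)" "raceBlack" "raceOther"

## Changing the reference level changes what each coefficient is compared to
race2 <- relevel(race, ref = "Black")
X2 <- model.matrix(~ race2)
colnames(X2)  # "(Intercept)" "race2White" "race2Other"
```

Each indicator coefficient is a comparison with the reference level, so hand-built indicator variables give the same fit only when they encode the same set of comparisons.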

2014 February 19

Aaron Benson, Urology, mentor/PI Nicole Miller

  • multivariable logistic regression predicting sepsis
  • an important independent variable (preoperative nephrostomy tube) is omitted(?) because there are no observations of the dependent variable (sepsis) in patients with the preoperative nephrostomy tube (n = 67) and 9 observations of sepsis in patients without a preoperative nephrostomy tube (n = 152). For this test, the results state that the preoperative nephrostomy tube "predicts failure perfectly". My questions are: how might I explain this to reviewers of our manuscript and is there another test that I should use?
  • Having a nephrostomy tube shouldn't have impacted the length of times.
  • Addressed whether there is a time after which everyone gets nephrostomy. The use has increased. This would be something to discuss in the discussion section.
Here are some follow-up questions I sent Dr. Benson, along with his answers:
Q: What is the purpose of the model? Is it to make predictions for patients based on their characteristics? Or to identify the important predictors of sepsis?
A: The study is a retrospective analysis of our percutaneous nephrolithotomy (PCNL) experience. Most of these patients have access to the kidney obtained as part of the PCNL (i.e., no pre-existing nephrostomy tube). Other patients have a pre-existing nephrostomy tube placed ahead of time -- either because of their history of recurrent UTI, pyelonephritis, high infection risk features, etc. or because they presented acutely and had the nephrostomy placed at that time. We are basically trying to determine whether patients with a nephrostomy tube prior to PCNL are less likely to develop post-PCNL sepsis. We are not necessarily trying to identify predictors of post-PCNL sepsis (already lots of data), but rather whether pre-PCNL nephrostomy tube (with renal urine culture and specific antibiotics) may be protective against post-PCNL sepsis. After two manuscript reviews, our journal reviewers are recommending multivariate analysis to make sure that the differences in sepsis rates is not due to other factors.
Q: When you say the variable (nephrostomy tube) is omitted, do you mean that your group decided to exclude the variable (nephrostomy tube) from the model? If so, is that because of the result you are getting?
A: No, I mean that STATA itself is showing the word "omitted" in the row for PCN (nephrostomy tube) where the data should be. We are not omitting the data on PCN because that's the focus of the study.
Q: I'm unsure of what you mean by "For this test, the results state that the preoperative nephrostomy tube "predicts failure perfectly"." Which test? Is it part of the automatic regression output? And where is the "predicts failure perfectly" coming from? The output in Stata?
A: By "this test", I mean logistic regression. The phrase PCN "predicts failure perfectly" is from the STATA output just above/below the results table it produces.
Q: When you say "no observations" of sepsis in patients with the tube, you mean all of the patients with the tube were known to not have sepsis, right? You don't mean whether they had sepsis is unknown/missing?
A: Correct, in the group of patients who had a nephrostomy tube prior to PCNL (n = 67), there were no sepsis cases. In the group that did not have a nephrostomy tube prior to PCNL (n=152), there were 9 sepsis cases. There is not any unknown/missing data for whether patients developed post-PCNL sepsis.
Q: How many other variables did you have in the model, and how many were considered?
A: Unfortunately, I left my jumpdrive at home today or I would have already sent you the STATA file. But, off the top of my head, there are probably 10-12 other variables.
  • We recommended using exact logistic regression. In stata you would use exlogistic.
  • Here is a web page explaining this issue and why you need exact logistic regression: You should be able to try this (you don't need the [fw=] option) in stata and use this to explain your analysis in the methods.
  • We also discussed that the model would be really overfit using 10 variables with only 9 events. We think one variable would be appropriate, but it would also be okay to use 2 variables, maybe nephr. tube and operating time.
Here is some R code:

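## binconf() comes from the Hmisc package
library(Hmisc)
## 2 x 2 table of sepsis by preoperative nephrostomy tube
counts <- matrix(c(143, 67, 9, 0), 
   nrow = 2,
   byrow = TRUE,
   dimnames = list(c("No sepsis", "Sepsis"), 
      c("No tube", "Tube")))
## Confidence intervals (all available methods) for the sepsis proportion in the tube group: 0 of 67
binconf(x = 0, n = 67, method = "all")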
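
If Hmisc is not installed, base R can give a comparable unadjusted summary. The sketch below is an addition to the clinic notes, not part of the original recommendation, and it does not replace the exact logistic regression suggested above. It computes an exact confidence interval for the sepsis proportion in the tube group and runs Fisher's exact test on the same 2 x 2 table.

```r
## Same 2 x 2 table as above, rebuilt so this block runs on its own
counts <- matrix(c(143, 67, 9, 0), nrow = 2, byrow = TRUE,
                 dimnames = list(c("No sepsis", "Sepsis"),
                                 c("No tube", "Tube")))

## Exact (Clopper-Pearson) 95% CI for the sepsis proportion with a tube: 0 of 67
ci.tube <- binom.test(x = 0, n = 67)$conf.int

## Fisher's exact test comparing sepsis between the tube and no-tube groups
ft <- fisher.test(counts)

ci.tube
ft$p.value
```

With zero events in one group, the sample odds ratio is zero, which is the same separation problem that makes Stata drop the variable from ordinary logistic regression.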

2014 February 12

Imani Brown, IGH, MPH candidate

  • Looking at preliminary data from an intervention in HIV+ people in Mozambique. She is interested in assessing what factors are associated with receipt of 9 different messages included in the intervention.
  • Each message is defined as having been 'received' or 'not received' so we suggested logistic regression. We also recommended ranking the predictors of interest so that depending on how many events she has, she can fit a model with the proper number of parameters based on the 10:1 or 20:1 ratio.
  • For those messages with very few events, we recommended descriptive tables and graphs as opposed to tests of association or models.
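
The 10:1 and 20:1 rules translate into a quick parameter budget per message. `max.params()` is a helper written for this example, and the counts in the usage lines are hypothetical.

```r
## Rough upper limit on model parameters from the events-per-parameter rule.
## For a binary outcome the limiting count is the smaller of events and non-events.
max.params <- function(events, nonevents, ratio = 10) {
  floor(min(events, nonevents) / ratio)
}

## Hypothetical message: 30 people received it, 170 did not
max.params(events = 30, nonevents = 170)              # 10:1 rule gives 3
max.params(events = 30, nonevents = 170, ratio = 20)  # 20:1 rule gives 1
```

Having the predictors ranked in advance means the top few can be taken in order once the budget for each message is known.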

2014 January 22

Stephen Humble, Mayur Patel and Patrick Norris, Trauma

  • Planning to perform non-inferiority power calculations applied to paired observations of heart rate variability.
  • Want to describe heart rate variables to hopefully determine a norm for ICU population
  • Suggest spaghetti plots to describe data for now

Heather Kistka, Neurological Surgery

  • Planning to submit VICTR voucher
  • Compared VUMC residency applications in 2007 (N = 148) vs 2012 (N = 191) to determine whether "misrepresentation" has increased among applicants based on Pubmed searches for publications
  • Problems: 1) application changed in that window; 2) various kinds of misrepresentation (existence, author order, peer reviewed vs not); 3) if misrepresentation had multiple types (eg, not peer reviewed and changed author order), only "worst" was recorded
  • Analyses planned: lots of descriptives by year (number/types of misrepresentation, demographics, etc), plus logistic regression model among 2012 applicants looking at risk factors for misrepresentation vs no misrepresentation (ie, "red flags" indicating that application should be closely investigated)
  • Logistic model: 8-9 df (AOA membership, grad degree, board scores [nonlinear?], gender, top 20/non-top 20/foreign med school, # works on CV), works with 84 events in 2012
  • Suggest secondary analysis using a "scale of misrepresentation" as outcome (flagrant vs something getting put in the wrong section) in proportional odds model - lots of ways to look at this (done only on applicants with misrepresentation, looking at "badness" of misrepresentation, or look at worst misrepresentation per applicant...)
  • For above, possibly create weighted score, along lines of DNE*3 + FAO*2 + (MAO + NPR + OPO)*1, for outcome per applicant; if distribution is wacky (probably will be), perhaps take out applicants who didn't misrepresent anything and reduce number of covariates to account for lower N
  • Clinic estimate is 40 hours without secondary analysis, 60 hours with secondary analysis
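
The weighted score proposed above is a one-line computation once each type of misrepresentation is a column. The data frame `apps` is hypothetical; its columns are assumed to hold per-applicant counts (or 0/1 flags) named after the abbreviations in the formula.

```r
## Hypothetical applicants; columns follow the abbreviations in the formula above
apps <- data.frame(
  DNE = c(0, 1, 0),
  FAO = c(0, 0, 1),
  MAO = c(0, 1, 0),
  NPR = c(0, 0, 1),
  OPO = c(0, 0, 0)
)

## Weighted score: DNE*3 + FAO*2 + (MAO + NPR + OPO)*1
apps$score <- with(apps, 3 * DNE + 2 * FAO + (MAO + NPR + OPO))
apps$score   # 0 4 3

## Distribution of the score, to check how skewed it is before modeling
table(apps$score)
```

Because only the "worst" misrepresentation was recorded when an applicant had several types, a score like this may understate the total for some applicants.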

2014 January 8

Catherine Bulka, Anesthesiology

  • Working on a project analyzing geographic variation in hospital billing practices. I have a dataset of hospitals and what they charge for certain orthopedic surgeries. I have aggregated the hospitals to the core-based statistical area level (these are geographic areas designated by the Office of Management and Budget that are based around an urban center of at least 10,000 people and any adjacent areas that are socioeconomically tied to the urban center by commuting). Rather than look at differences in hospital billing practices nationwide since there are so many confounders, I decided to aggregate the data and look at the amount of variation in billing practices within each core-based statistical area because I’m assuming that the socioeconomics/cost of living/overall health of the patient population/any other potential confounders are likely pretty similar within these areas.
  • I’ve calculated the means and standard deviations in what the hospitals in each area bill for the same procedures, but I’m not sure what the best way is to compare these. The data are not normally distributed, so I’m not sure that standard deviation is even the best way to represent variations in the amount billed. Further, some areas have many more hospitals than others – from 2 in one area to 105 in another, which I think should be taken into account. I thought about using ANOVA, but I’m not so much interested in the mean amount billed by the hospitals in each area, since certain areas of the country (California, NYC, Florida) are known to charge more for certain procedures than other areas for economic reasons.
  • How can I best compare the amount of dispersion between many groups (there are > 500 areas that I’d like to compare), while addressing differences in sample size?
  • Have calculated the coefficient of variation (standard deviation/mean * 100) for each core based statistical area, although I am not sure if that's the best metric to show variation. Also not sure how to compare these areas with hypothesis testing.
  • After discussing her project, we suggested she explore a linear mixed effect model as well as further explore some of the geographic representations she had started with somehow including some aspect of the variation of charges by region in addition to reporting mean charges by hospital within a region.
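
For reference, the coefficient of variation per area, alongside the number of hospitals behind it, can be tabulated as below. The data frame `charges` is simulated, and the area labels and column names are hypothetical.

```r
## Coefficient of variation in percent: standard deviation / mean * 100
cv <- function(x) sd(x) / mean(x) * 100

## Simulated, right-skewed charges for three areas of very different size
set.seed(5)
charges <- data.frame(
  area   = rep(c("A", "B", "C"), times = c(2, 10, 30)),
  charge = rlnorm(42, meanlog = 10, sdlog = 0.4)
)

n.hosp  <- tapply(charges$charge, charges$area, length)
cv.area <- tapply(charges$charge, charges$area, cv)

summary.area <- data.frame(area = names(cv.area),
                           n    = as.vector(n.hosp),
                           cv   = as.vector(cv.area))
summary.area
```

A CV based on 2 hospitals is far noisier than one based on 105, so reporting n next to each CV is the minimum; the mixed effects model suggested in clinic is one way to handle the unequal sizes more formally.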

Michael DeLisi, Biomedical Engineering

  • Michael has a project comparing how his intervention to image-guided surgery for minimally invasive eye surgery helps in time to reaching the desired target and the ability to hit the desired target.
  • The study uses 4 skulls with different targets per eye. In each skull, one eye is operated on using standard image-guided methods; the other eye uses the enhancement to the standard methods. Sixteen surgeons were tested, each operating on each of the skulls. The order of skulls for each surgeon was the same but the order of methods of surgery was randomized.
  • Currently, he has tested for differences using t-tests and F-tests.
  • Our recommendation was to use linear and logistic mixed effects models to account for the correlation within surgeon, including method of surgery, skull, and eye (?) as covariates in the model with surgeon as the random effect.

Older Notes

I am looking for guidance on how to proceed with validating the bedside swallow screening used for acute stroke patients in the ED, neuro ICU, and the neuro care unit.
The project is the development of risk prediction models for placement of a ventricular assist device vs. medical management, with outcomes of survival to transplant and 1-year post-transplant survival in pediatric patients. Considering using propensity matching due to variability within groups.
I am planning a clinical study and would like my sample size calculations to be reviewed by a biostatistician before I submit for VICTR funding.
The study is a randomized, double-blind experiment in human volunteers examining the effects of a drug commonly given to liver failure patients on oral glucose tolerance.

Pooja Santapuram

Topic revision: r735 - 07 Oct 2020, HeatherPrigmore
