Introduction

The historically low US Army suicide rate began climbing in 2004 (ref. 1) and has exceeded the civilian rate since 2009.2, 3 Preventive interventions exist to reduce Army suicides,4 including a protocol for outpatients treated by mental health specialists. Based on the 2013 Veterans Administration/Department of Defense (VA/DoD) Clinical Practice Guidelines (CPG) on Assessment and Management of Patients at Risk for Suicide, this protocol calls for comprehensive suicide risk assessments of all patients in treatment for mental disorders followed by interventions for high-risk patients.5 Although the CPG includes recommendations for risk assessment and stratification, no precision medicine prediction scheme was provided. This is an important gap, as previous research shows that clinicians are not good at predicting suicide and that statistical risk models produce better predictions.6, 7 The Army maintains electronic administrative systems that might be used to develop a risk model of this sort for soldier suicides. Two recent epidemiological studies demonstrated that such models can be developed.8, 9 The current report presents a similar precision medicine model to predict suicides among soldiers in outpatient treatment with mental health specialists.

Materials and methods

Sample

Analysis was based on the Historical Administrative Data System (HADS) of the Army Study to Assess Risk and Resilience in Servicemembers (Army STARRS),10 an integrated de-identified data set of Army/Department of Defense administrative data systems (Supplementary Appendix Table 1) for each month in service during the years 2004–2009 of all 975 057 Regular US Army soldiers serving at any time during that time period (32 million person-months), 569 of whom died by suicide. HADS construction and composition are discussed elsewhere.11 We focused initially on soldiers with any outpatient visit having a diagnosis of mental disorders (International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes 290.0–319) or V code indicative of life difficulties often associated with mental disorders (V15.81; V61–62.9; V71.01–71.09), as risk of suicide death was substantially elevated in this segment of the force. Models were built to predict suicide deaths subsequent to these visits using a wide range of HADS predictors. Over 8000 such visits occurred for each suicide death. As it would have been computationally intensive to include all these control visits in the analysis, we selected a probability sample of control visits equal to roughly 100 times the number of suicide deaths and compared values on predictors available at the times of those visits with the values of the same predictors available at the times of visits that occurred before suicide deaths. Control visits were weighted to adjust for their undersampling so that the weighted sum of control visits equaled the population distribution (that is, somewhat more than 8000 times the number of visits followed by suicide deaths). This kind of subsampling and weighting of controls improves the efficiency of estimation without introducing bias into estimates compared with an analysis that included all control visits.12
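The control subsampling and weighting described above can be illustrated with a minimal sketch. It is not the study's code: the function name `sample_controls`, the fixed seed, the toy counts and the use of simple random sampling are all assumptions made for illustration, since the text specifies only that a probability sample of roughly 100 control visits per suicide death was drawn and reweighted to the population.

```python
import random


def sample_controls(control_ids, n_cases, ratio=100, seed=0):
    """Draw a random sample of control visits and attach sampling weights.

    Assumption: simple random sampling without replacement. The weight is
    the inverse of the sampling fraction, so the weighted sum of sampled
    controls equals the number of control visits in the population.
    """
    rng = random.Random(seed)
    n_sample = min(len(control_ids), ratio * n_cases)
    sampled = rng.sample(control_ids, n_sample)
    weight = len(control_ids) / n_sample
    return [(visit_id, weight) for visit_id in sampled]


# Hypothetical toy numbers: 2 case visits and 8000 control visits per case.
controls = list(range(16_000))
weighted_controls = sample_controls(controls, n_cases=2)
total_weight = sum(weight for _, weight in weighted_controls)
```

In this toy setup each of the 200 sampled control visits carries a weight of 80, so the weighted controls reproduce the 16 000 control visits in the hypothetical population while only 200 rows enter the model.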

Predictors

Numerous epidemiological studies have examined predictors of suicide among outpatients13, 14, 15, 16, 17, 18 and military personnel.1, 19, 20, 21, 22, 23, 24, 25, 26 HADS variables operationalized as many of these predictors as possible, organized into six broad categories: sociodemographics, Army career (for example, age at enlistment, occupation, deployment history), characteristics of the index visit, prior clinical factors (for example, inpatient and outpatient mental and physical disorders, prescriptions, suicide attempts), crime codes (victimization and perpetration) and contextual factors (for example, unit-level characteristics, registered weapons). We controlled for year, season and time until next visit to adjust for secular trends in the Army suicide rate and time at risk.

Given that the administrative data were collected for other purposes, we cast a wide net in extracting indicators of target constructs. For example, we examined 23 different categories of psychiatric diagnoses and 15 categories of NDC psychotropic medication codes based on the First Databank (FDB) Enhanced Therapeutic Classification System27 (Supplementary Appendix Tables 2 and 3). Nearly 1000 variables were constructed (Supplementary Appendix Table 4). Missing sociodemographic and Army career data were corrected when possible with nearest neighbor temporal imputations. Remaining missing values and inconsistencies were resolved using rational imputation (for example, a soldier classified as female in 1 month but male in all other months was recoded as male). Details about missing data patterns are available in Supplementary Appendix Table 5.

Analysis methods

Deidentified HADS analysis was approved by the Human Subjects Committees of the Uniformed Services University of the Health Sciences for the Henry M Jackson Foundation (the primary grantee), the University of Michigan and Harvard Medical School. Analysis began with cross-tabulations examining suicide risk in the 12 months after each outpatient visit, distinguishing visits in the general medical and mental health specialty sectors by prior psychiatric hospitalization, gender and deployment status. Model building began by estimating univariate associations of each predictor with suicide using discrete-time survival analysis of suicide death (coded 1) compared with all other outcomes (that is, some other death, a subsequent mental health specialty visit, separation from service, end of the follow-up period, all coded 0). A logistic link function was used to estimate coefficients with proc logistic in SAS 9.3 (SAS, Cary, NC, USA).28 Functional forms of significant nondichotomous predictors were transformed to capture interpretable nonlinearities.

As multivariable associations were unstable, machine learning methods were used to generate stable estimates comparing four different classifiers: naive Bayes29 using the R-package e1071 naiveBayes;30 random forest31 using the R-package randomForest;32 support vector regression33 using the R-package e1071 svm; and elastic net penalized regression34 using the R-package glmnet.35 Hyperparameters were selected to maximize cross-validated sensitivity (that is, the proportion of all observed suicide deaths that occurred among predicted positives) in the 5% of visits with highest predicted suicide risk. Selection of the optimal classifier was based on the same criterion.
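The selection criterion described above, sensitivity within the 5% of visits with highest predicted risk, can be sketched as follows. This is an illustrative helper, not the study's implementation: the function name `top_fraction_sensitivity`, the toy data and the convention that a visit straddling the 5% boundary is counted in full are assumptions. Weights are included because control visits in the analysis carried sampling weights.

```python
def top_fraction_sensitivity(scores, outcomes, weights, fraction=0.05):
    """Weighted share of events that fall in the top `fraction` of visits.

    Visits are ranked by predicted risk. Weights are accumulated until the
    requested fraction of total weight is reached (assumption: the visit
    that crosses the boundary is included in full).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    cutoff = fraction * sum(weights)
    total_events = sum(w * y for w, y in zip(weights, outcomes))
    seen_weight = 0.0
    events_in_top = 0.0
    for i in order:
        if seen_weight >= cutoff:
            break
        seen_weight += weights[i]
        events_in_top += weights[i] * outcomes[i]
    return events_in_top / total_events


# Hypothetical toy data: 20 equally weighted visits, 2 events, one of
# which is the single highest-scored visit.
toy_scores = [20 - i for i in range(20)]
toy_outcomes = [1] + [0] * 9 + [1] + [0] * 9
toy_weights = [1.0] * 20
toy_sensitivity = top_fraction_sensitivity(toy_scores, toy_outcomes, toy_weights)
```

Under cross-validation, a value of this kind would be computed on each held-out fold and averaged, and the hyperparameter setting with the highest average retained.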

Once the best classifier was selected, operating characteristics were examined by comparing predicted probability of suicide death for each sampled person visit to observed suicide death in the entire sample by calculating area under the receiver operating characteristic curve (AUC) and graphing proportional suicide deaths after visits in each ventile (that is, 5%) of visits grouped from highest to lowest predicted probabilities. We then calculated sensitivity (as noted above, the proportion of observed suicides after visits predicted to have high suicide risk) and positive predictive value (suicide rate after visits predicted to have high risk expressed as number of suicides/100 000 person-years) in high-risk ventiles along with specificity (the proportion of visits not followed by observed suicides after visits predicted not to have high suicide risk), negative predictive value (the nonsuicide rate/100 000 person-years after such visits) and AUC (which, in the case of dichotomous predictors, is the mean of sensitivity and specificity). Given the rarity of suicide deaths, we report 1-negative predictive value (that is, suicides/100 000 person-years) rather than negative predictive value. Visit-level estimates were then projected to the person level by aggregating results for selected contiguous 12-month time periods for a probability sample of 100 000 soldiers. Model predictive validity was evaluated by using coefficients estimated in earlier years to predict suicides in later years.
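The operating characteristics defined above can be written out for a single dichotomous classification (high risk versus not). The sketch below is illustrative only: the function name, the argument names and the toy counts and person-years are hypothetical. Positive predictive value and 1-negative predictive value are expressed as rates per 100 000 person-years, following the definitions in the text.

```python
def operating_characteristics(tp, fp, fn, tn, person_years_pos, person_years_neg):
    """Operating characteristics of a dichotomous high-risk classification.

    tp/fn: visits followed by suicide, predicted high risk / not high risk.
    fp/tn: visits not followed by suicide, predicted high risk / not high risk.
    person_years_*: follow-up time after visits in each predicted group.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Suicides per 100 000 person-years after predicted high-risk visits.
        "ppv_rate": 1e5 * tp / person_years_pos,
        # Suicides per 100 000 person-years after all other visits.
        "one_minus_npv_rate": 1e5 * fn / person_years_neg,
        # For a dichotomous predictor, AUC is the mean of the two.
        "auc": (sensitivity + specificity) / 2,
    }


# Hypothetical toy counts, not values from the study.
toy = operating_characteristics(
    tp=20, fp=480, fn=80, tn=9420,
    person_years_pos=2000, person_years_neg=38000,
)
```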

Results

Outpatient visits and suicide by treatment sector, gender, deployment status and time of the 569 suicide deaths of Regular Army soldiers

During 2004–2009, 68 (12.0%) of 569 suicide deaths occurred among the 0.9% of soldiers with psychiatric hospitalizations in the prior 12 months (252.3/100 000 person-years) (Table 1). Another 240 (42.2%) suicides occurred among the 24.5% of soldiers without 12-month psychiatric hospitalization who were outpatients with target diagnoses or V codes (31.7/100 000 person-years). The remaining 261 (45.9%) suicides occurred among the other 74.6% of soldiers (11.3/100 000 person-years). Among the 0.9% with hospitalization, the suicide rate was highest among those seen as outpatients after hospital discharge by both mental health and general medical treatment providers (0.65% of all soldiers; 312.2/100 000 person-years), lowest among those seen only by mental health providers (0.11%; 85.7/100 000 person-years) and intermediate among those seen only by general medical providers (0.06%; 107.0/100 000 person-years). Among the 24.5% having outpatient visits without hospitalizations, the suicide rate was highest among those seen by both mental health and general medical providers (5.1% of all soldiers; 63.9/100 000 person-years), intermediate among those seen only by mental health providers (6.0%; 36.1/100 000 person-years) and lowest among those seen only by general medical providers (13.4%; 17.4/100 000 person-years).

Table 1 The proportions of all soldiers and soldiers who died by suicide who had psychiatric hospitalization and outpatient treatment for mental disorders in the prior 12 months among Regular Army soldiers over the years 2004–2009 (n=975 057)

Given the much higher suicide rate among outpatients seen by mental health providers than exclusively by general medical providers, we focused analysis on the former and distinguished between the 66 suicides with prior 12-month psychiatric hospitalization and the 168 suicides without such hospitalization. The population at risk consisted of 316 686 Regular Army soldiers making 2 950 967 outpatient mental health specialist visits in 2004–2009. Of these visits, 95.8% were made when patients were not deployed (173 suicides; 65.6/100 000 person-years) and the suicide rate after these visits was substantially higher among men than women (75.3 versus 19.6/100 000 person-years), with 94.8% (164 of 173) of suicide deaths after these visits occurring among men. Based on these patterns, we focused analysis on nondeployed men. The majority (61.6%; 101/164) of suicide deaths in this group occurred within 5 weeks of mental health specialist outpatient visits (145.2, 96.3, 123.6, 116.5 and 115.1 suicides/100 000 person-years, respectively, in those weeks), with a 57.4/100 000 person-years rate during the remainder of the first 6 months (28.7% (47/164) of suicide deaths over the 12 months after the index visit) and 31.3/100 000 person-years over the subsequent 6 months. Based on these results, we limited model building to the 26 weeks after the index visit (148 suicides).

Selecting the optimal classifier

Roughly one-third of HADS variables for prior clinical characteristics (244/782 among soldiers with and 178/536 among soldiers without psychiatric hospitalizations) were significant univariate predictors of subsequent suicide. Much smaller proportions of variables characterizing the index outpatient visit (2/46), involvement in crime (2/67) and contextual factors (0/39) were significant. The significant univariate predictors plus 20 sociodemographic and 27 Army career variables were included in multivariable model building. Because many predictors related to psychiatric hospitalization were significant, all analyses were carried out separately among soldiers who had (50 suicides) versus had not had (97 suicides) psychiatric hospitalizations in the prior 12 months. The elastic net classifier outperformed the others, with higher cross-validated sensitivity in the weighted 5% of observations with highest predicted risk among soldiers both with and without prior 12-month psychiatric hospitalizations. Subsequent phases of analysis consequently focused on the elastic net models. In all, 14 predictors were included in this model for soldiers with and 10 for soldiers without prior psychiatric hospitalizations.

Operating characteristics of model-based predictions

The model AUCs for the continuous distributions of predicted probabilities over 26 weeks were 0.72 among soldiers with prior psychiatric hospitalizations, 0.61 among soldiers without prior hospitalizations and 0.66 among both combined. When the same models were applied to suicide deaths in the 5 weeks after the index visits, AUCs increased to 0.75 (prior hospitalization), 0.65 (no prior hospitalization) and 0.69 (both combined). Sensitivity was more than twice the expected value of 5% after visits in the three highest risk ventiles for both 26 weeks and 5 weeks (Figure 1) and either below or only slightly above the expected value in the remaining 17 ventiles, leading us to evaluate operating characteristics of two dichotomous classifications: between the top 1 and other 19 risk ventiles; and between the top 3 and other 17 risk ventiles.
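The ventile comparison described above rests on a simple idea: if predicted risk carried no information, each 5% group of visits would contain about 5% of suicide deaths. A minimal unweighted sketch of that grouping follows. The function name `ventile_event_shares` and the toy data are hypothetical, and the study's own calculation used weighted visits.

```python
def ventile_event_shares(scores, outcomes, n_groups=20):
    """Share of all events in each equal-size group, highest risk first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    total_events = sum(outcomes)
    shares = []
    for g in range(n_groups):
        # Integer boundaries keep group sizes as equal as possible.
        start = g * n // n_groups
        stop = (g + 1) * n // n_groups
        events = sum(outcomes[i] for i in order[start:stop])
        shares.append(events / total_events)
    return shares


# Hypothetical toy data: 100 visits ranked by score, 5 events, 3 of them
# among the 5 highest-scored visits.
toy_scores = list(range(100, 0, -1))
toy_outcomes = [0] * 100
for position in (0, 1, 2, 7, 50):
    toy_outcomes[position] = 1
toy_shares = ventile_event_shares(toy_scores, toy_outcomes)
```

In this toy example the top ventile holds 60% of events against an expected 5%, which is the kind of concentration the bars in a ventile plot are meant to display.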

Figure 1

Proportion of suicide deaths that occurred within 5 and 26 weeks of most recent specialty mental health (MH) outpatient visits within ventiles of visits ranked by predicted suicide risk based on the optimal elastic net penalized logistic regression model, male nondeployed Regular US Army soldiers 2004–2009. The bars show the observed proportions of suicide deaths within 5 weeks of each ventile (5% grouping) of specialty outpatient visits ranked by predicted suicide risk based on the optimal prediction model out of the population of all such visits made by male nondeployed Regular US Army soldiers in 2004–2009.

All calculations of operating characteristics combined soldiers with and without prior hospitalizations (Table 2). Ranges below give the 26-week value first and the 5-week value second. Sensitivity in the top ventile was 22.4–24.0%. Comparable sensitivities were 45.6–48.0% in the top 3 ventiles. Specificity was 94.9% in the lowest 19 ventiles and 84.0% in the 17 lowest ventiles for both time windows. Positive predictive value was 1076.8–1047.6/100 000 person-years in the top ventile and 602.3–605.9/100 000 person-years in the top 3 ventiles compared with 52.9–71.5/100 000 person-years in the remaining 17 ventiles (that is, 1–negative predictive value). AUC was 0.59–0.66.
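As a consistency check, the reported AUC range can be recovered from the reported sensitivities and specificities using the identity that, for a dichotomous predictor, AUC is the mean of sensitivity and specificity. The worked arithmetic below takes the published values at face value; pairing the lower end of the range with the top-ventile classification and the upper end with the top-3-ventile classification is an inference from the arithmetic, not a statement made in the text.

```latex
\begin{align*}
\mathrm{AUC} &= \frac{\text{sensitivity} + \text{specificity}}{2} \\[4pt]
\text{top 1 ventile, 26 weeks:}\quad & \frac{0.224 + 0.949}{2} = 0.5865 \approx 0.59 \\
\text{top 1 ventile, 5 weeks:}\quad & \frac{0.240 + 0.949}{2} = 0.5945 \approx 0.59 \\
\text{top 3 ventiles, 26 weeks:}\quad & \frac{0.456 + 0.840}{2} = 0.648 \approx 0.65 \\
\text{top 3 ventiles, 5 weeks:}\quad & \frac{0.480 + 0.840}{2} = 0.660 = 0.66
\end{align*}
```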

Table 2 Operating characteristics of dichotomous classifications distinguishing soldiers in the top 1 and 3 ventiles of predicted risk of suicide death after mental health specialty outpatient visits made by male nondeployed soldiers based on the optimal logistic models combining soldiers with and without prior 12-month psychiatric hospitalizations

Person-level projections of visit-level results

As person-level inferences cannot be drawn from visit-level results, we drew a representative sample of 100 000 soldiers in service over the study period who did not die by suicide, combined them with all soldiers who died by suicide and generated predicted suicide risk scores based on the coefficients in our best model for each mental health specialty outpatient visit of each soldier in this data set. These visit-level scores were then aggregated to the person level. The nondeployed men with 12-month mental health specialty outpatient visits had an average of 6.1 such visits. Extrapolating to an Army of 500 574 (the average number of nondeployed male soldiers on active duty in the Army over the study period), this would be 60 654 nondeployed men making 368 233 mental health specialty outpatient visits over a typical 12 months (17 629–55 286 visits in the 1–3 highest-risk ventiles). A total of 4.2% of soldiers who made 12-month mental health specialty outpatient visits had visits in the top risk ventile, with a mean of 7.0 such visits and a mean of 10.3 weeks in the highest-risk time interval after such visits. This means that only 573 (that is, 0.042 × 0.121 × 500 574 × 10.3/52) male nondeployed soldiers in an Army of 500 574 nondeployed men would be in the highest-risk group in a typical week. This number increased to 1103 for patients in the highest-risk ventile over 26 weeks and to 3657 for patients in the 3 highest-risk ventiles over 26 weeks.

Validation

Models were reestimated in the 2004–2007 HADS data using the same predictors but allowing the coefficients to differ from the 2004–2009 model. Results were used to predict 2008–2009 suicides. AUC combining soldiers with and without prior psychiatric hospitalizations was 0.67–0.72 predicting suicides within 26–5 weeks of most recent visit. The 26-week sensitivity was 26.7–41.3% for visits in the highest 1–3 risk ventiles. The 5-week sensitivity was 29.8–47.4% for visits in the highest 1–3 risk ventiles. Replication of this validation exercise using coefficients estimated in 2008–2009 to predict suicides in 2010–2012 yielded much weaker results: sensitivities of 13.3–18.1% for 26–5 weeks in the 1 highest ventile and 36.1–27.4% in the 3 highest ventiles.

Model coefficients

The 14 predictors in the model for patients with prior hospitalization included 6 indicators of prior suicidality, 6 of prior inpatient–outpatient depression treatment and 2 of nonaffective psychosis and bipolar disorder treatment, all associated with elevated suicide risk (Table 3, model 1). Odds ratios (ORs) were all relatively modest (OR=1.01–1.32) because of elastic net penalties. Extreme coefficient instability (indicated by high variance inflation factors) occurred, in comparison, when a logistic regression model (model 2) was estimated with the same predictors, although respecification allowed this problem to be addressed in a less complex logistic model (model 3) that retained essentially the same level of overall prediction accuracy (AUC=0.72).
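The shrinkage of odds ratios toward 1 follows from the form of the penalized objective. For reference, the elastic net logistic regression objective as documented for glmnet is sketched below. The symbols are introduced here for illustration and do not appear in the text: $y_i$ is the outcome indicator for visit $i$, $x_i$ its predictor vector, $w_i$ its observation weight, $\lambda \ge 0$ the overall penalty strength and $\alpha \in [0, 1]$ the mix between ridge ($\alpha = 0$) and lasso ($\alpha = 1$) penalties. The values of $\lambda$ and $\alpha$ selected in the study are not stated here.

```latex
\min_{\beta_0,\,\beta}\;
-\frac{1}{N}\sum_{i=1}^{N} w_i
\Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
      - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
\;+\; \lambda\Bigl[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}
      + \alpha\,\lVert\beta\rVert_1\Bigr]
```

Because the penalty grows with the size of every coefficient, the fitted $\beta_j$ are pulled toward 0, and the corresponding odds ratios $e^{\beta_j}$ toward 1, relative to an unpenalized fit. When predictors are highly correlated, the ridge component also tends to spread the effect across them rather than assigning a large coefficient to any single one.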

Table 3 Coefficients in the optimal elastic net and comparable conventional logistic regression models predicting suicide deaths within 26 weeks of mental health specialty outpatient visits made by nondeployed male soldiers with prior 12-month psychiatric hospitalizations

The 10 predictors in the model for patients with no prior hospitalization included one feature of the index visit—whether with a psychiatrist (associated with elevated suicide risk)—along with 3 measures of treatment in the past month (frequency of visits for depression and ill-defined conditions; any inpatient treatment for a physical disorder), 3 measures of treatment in the past 3 months (any for either nonaffective psychosis or personality disorder; number of anticonvulsant prescriptions), 2 measures of treatment in the past 12 months (frequency of outpatient visits for anxiety disorders; any prescription of an alcohol–narcotic abuse treatment agent); and a final measure for whether the soldier was an alleged perpetrator of multiple crimes in the 3 months before the index visit (Table 4). The ORs of these predictors were much more diverse than in the model for soldiers with prior hospitalizations (OR=1.2–8.8), reflecting the weaker associations among predictors (as indicated by the low variance inflation factors in the parallel logistic model).

Table 4 Coefficients in the optimal elastic net and conventional logistic regression models predicting suicide deaths within 26 weeks of mental health specialty outpatient visits made by nondeployed male soldiers without prior 12-month psychiatric hospitalizations

Discussion

Despite the elevated suicide risk of soldiers with mental health specialty outpatient visits, which is consistent with civilian research,36 and the strong performance of our models, suicide was a rare outcome even among high-risk soldiers. This raises the question of whether existing interventions are sufficiently powerful to make targeted preventive interventions cost effective. There is controversy about this question.37, 38, 39 Empirical adjudication would require analyses beyond the scope of this report on competing needs, costs and cost effectiveness of intervention options.40, 41 Our aim was to address a prior question: whether a useful precision medicine model can be developed. We showed that it can. The 5% of visits with highest predicted risk include only 0.1% of soldiers with very high suicide risk (1047.1/100 000 person-years in the 5 weeks after the visit). This is a small enough proportion of individuals accounting for a large enough proportion of suicides to have intervention implications.

Interpretation of model predictors should only be undertaken with caution because machine learning methods maximize model performance at the expense of individual coefficient accuracy. Nonetheless, four observations are noteworthy. First, the vast majority of predictors measured mental disorders found to be important in prior studies of soldier suicides.19, 21, 23, 42 The crime perpetration variable in the model for soldiers without prior hospitalization is consistent with evidence that a high proportion of soldiers who die by suicide had legal problems at the time of death.21

Second, we found that hospitalization for any physical health problem was an important predictor of soldier suicide. Although traumatic brain injury, a widely recognized suicide risk factor,43, 44 was included as a potential predictor, the fact that this composite variable was selected over traumatic brain injury in the predictor set underscores the need for future investigation to focus clinical attention on broader hospitalized physical conditions linked to suicide.

Third, despite previous research consistently finding that soldier suicide is predicted by sociodemographic characteristics indicating disadvantaged social status (for example, young age, unmarried status) and Army career characteristics indicating low status (for example, low rank, demotion),19, 20, 22, 25, 42, 45 no such predictors emerged in our optimal models. No attempt was made to determine whether this was because the clinical variables in our models mediated the effects of sociodemographic and Army career variables, but future investigation of this possibility might provide insights into modifiable targets of preventive interventions.

Fourth, important differences were found between patients with versus without prior psychiatric hospitalization. AUC and concentration of risk were higher in models for those with (AUC=0.72–0.75 for 26–5 weeks, with 28–36% of suicides occurring among 5% of patients with highest predicted risk) than without (AUC=0.61–0.65, with 22–24% of suicides occurring among 5% of patients with highest predicted risk) prior hospitalizations. All but one predictor in the model for patients with hospitalization involved characteristics of outpatient visits before the hospitalization rather than of the hospitalization, with a focus on suicidality, depression, bipolar disorder and nonaffective psychosis. The model for patients without hospitalization, in comparison, included a much wider array of diagnoses, the one with the highest OR being alcohol/drug treatment. Recent inpatient treatment for a physical disorder was also a very powerful predictor in that model. These differences suggest that the causal processes underlying suicide are different for patients with and without psychiatric hospitalization. Further investigation of these differences might provide insights to help customize preventive interventions.

Our analysis was limited by considering a large number of predictors of a small number of suicides, introducing risk of overfitting. We addressed this problem by using cross-validation to select the number of predictors in final models and using penalized regression to select predictors, but residual overfitting might have occurred. We evaluated this by predicting 2008–2009 suicides based on 2004–2007 models and 2010–2012 suicides based on 2008–2009 models. Model stability was very good between 2004–2007 and 2008–2009 but much lower between 2008–2009 and 2010–2012, possibly reflecting changes in Army policies–practices for managing suicide risk as awareness of the rising Army suicide rate increased. The only way to guard against such a possibility going forward would be to update prediction models regularly (for example, annually) and carry out sensitivity analyses of the extent to which predictors change depending on the number of years of prior data used in developing the models.

Another set of limitations involves the administrative data used in our models, which had more missing, inconsistent and possibly erroneous values than data collected for research purposes and lacked indicators of some suicide risk factors documented in the literature. These limitations presumably resulted in reduced model performance. Yet, the models nonetheless had good prediction accuracy that would presumably be improved by increasing data quality (for example, adding predictors based on the checklist the VA/DoD CPG now urges clinicians to use to evaluate suicide risk). A final noteworthy limitation is that we were unable to follow soldiers out of service to predict suicides that occurred after separation. This right censoring is an important limitation for long-term prediction given that soldiers with mental disorders are more likely than others to terminate service.45

It is unclear from the results reported here how much clinical judgment could be enhanced by having access to results of our models, as clinical assessments of suicide risk were not systematically recorded in Army medical records over the years we studied. As noted in the introduction, though, previous studies find that statistical models are much more accurate than clinical judgment of suicide risk,2, 46, 47, 48 consistent with a larger literature showing statistical methods outperform expert judgment in many areas of prediction,6, 7 suggesting that access to predictions based on our models could be of value to clinicians as one element in their evaluation of patient suicide risk.