Main

Cardiovascular diseases (CVDs) represent a major cause of death and socio-economic burden globally. In 2015 alone, there were ~18 million CVD-related deaths worldwide1. Identification and timely treatment of CVD risk factors is a key strategy for reducing CVD prevalence in populations and for risk modulation in individuals. Conventionally, CVD risk is estimated using demographic/clinical parameters such as age, sex, ethnicity, smoking status, family history, history of hyperlipidaemia, diabetes mellitus or hypertension2. Imaging tests such as coronary computed tomography, echocardiography and cardiovascular magnetic resonance (CMR) help further stratify patient risk by assessing coronary calcium burden, myocardial scar burden, ischaemia, cardiac chamber size and function.

Cardiovascular imaging is usually performed in secondary care and is relatively expensive, limiting its availability in underdeveloped and developing countries. An alternative approach to risk stratification is to use the information available from non-cardiac investigations. Retinal microvascular abnormalities such as generalized arteriolar narrowing, focal arteriolar narrowing and arteriovenous nicking have shown strong associations with systemic and cardiovascular diseases such as diabetes mellitus, hypertension and coronary artery disease3,4. Retinal images (including details of principal blood vessels) are now routinely acquired in optometric and ophthalmologic practice and are relatively inexpensive. Retinal images could, therefore, be a potential cost-effective screening tool for cardiovascular disease. Beyond risk prediction, retinal images have also been associated with cardiovascular phenotypes such as left ventricular dimensions and mass4. Poplin et al. showed for the first time that retinal images allowed prediction of cardiovascular risk factors such as age, gender, smoking status, systolic blood pressure and major adverse cardiac events5, driven by anatomical features such as the optic disc or retinal blood vessels. This highlighted the potential for using retinal images to assess risk of cardiovascular diseases.

We explore new ways to extend this line of research by learning a combined representation of retinal images and CMR images to assess cardiac function and predict myocardial infarction (MI) events. This is supported by the work of Cheung et al.6, who highlighted the adverse effects of blood pressure and cardiac dysfunction on the retinal microvasculature. Similarly, Tapp et al.7 established associations between retinal vessel morphology and cardiovascular disease risk factors and/or CVD outcomes using a multilevel linear regressor. That study assessed the relationships between retinal vessel morphometry, blood pressure and arterial stiffness index, furthering our understanding of preclinical disease processes and the interplay between microvascular and macrovascular diseases. Using retinal fundus images and deep learning, Gargeya et al.8 detected diabetes and Qummar et al.9 classified different grades of diabetic retinopathy. These studies demonstrate the efficacy of deep learning techniques in quantifying and stratifying cardiovascular disease risk factors given retinal images. Other studies, such as Pickhardt et al.10, utilised whole-body CT scans and deep learning to predict future adverse cardiovascular events, further supporting the hypothesis that alternative imaging modalities, covering multiple organs, help assess cardiovascular health and predict CVD risk.

As markers of cardiovascular disease often manifest in the retina, images of this organ could help identify future cardiovascular events such as left ventricular hypertrophy or MI. This work proposes a novel method that estimates cardiac indices and predicts incident MI based on retinal images and demographic data from the UK Biobank (UKB). For MI, we only considered incidents that occurred after the retinal image was taken. Our approach uses a multichannel variational autoencoder trained on two channels of information: retinal and CMR images from the same subject. This method combines features extracted from both imaging modalities in a common latent space, allowing us to subsequently estimate relevant quantities from just one channel of information (that is, retinal images) and demographic data. Applied to clinical practice, estimation of cardiac indices from retinal images could guide patients at risk of CVDs to cardiologists following a routine ophthalmic check, or directly predict MI based on retinal and minimal demographic data.

Patient datasets and demographic data

This study used CMR images (end-diastolic short-axis view), retinal images and demographic data from the UKB cohort (under access application no. 11350) to train and validate the proposed method. When this method was developed, 39,705 participants underwent CMR imaging using a clinical wide bore 1.5 T MRI system (MAGNETOM Aera, Syngo Platform VD13A, Siemens Healthcare)11, and 84,760 participants underwent retinal imaging using a Topcon 3D OCT 1000 Mark 2 (45° field-of-view, centred to include both optic disc and macula)12. Only those participants with CMR, retinal images and demographic data were selected to train our proposed method, totalling 11,383 participants.

Of the 11,383 participants, 676 were excluded due to a history of conditions known to affect left ventricular mass; for example, diabetes (336 subjects), past MI (293 subjects), cardiomyopathy (14 subjects) or frequent strenuous exercise routines (33 subjects).

After excluding participants with the conditions above, a deep learning quality assessment method13 was used to select retinal images of sufficient quality according to pre-specified criteria. This quality assessment method was trained and validated on EyePACS14, a well-known public dataset released on the Kaggle platform for automatic diabetic retinopathy detection. Following quality assessment, 5,663 participants were identified as having good quality retinal images. We followed the RECORD statement for reporting observational data, and a STROBE flow diagram showing the exclusion criteria is presented in Fig. 1. Subsequent preprocessing steps for retinal and CMR images (that is, ROI detection15) are presented in Supplementary Section 1.

Fig. 1: STROBE flow diagram for excluded participants.
figure 1

Criteria for excluding participants in this study.

Regarding the demographic data, a combination of variables derived from the patient’s history and blood samples—age, gender, HbA1c, systolic and diastolic blood pressure, smoking habit, alcohol consumption, glucose and body mass index—was also used as input to train and test the proposed method. Although we excluded participants with diabetes, we retained HbA1c because multiple studies have shown a positive correlation between HbA1c and cardiovascular mortality even in subjects without a history of diabetes16,17,18. Furthermore, the authors of ref. 19 showed a strong association between HbA1c and left ventricular mass: a 1% rise in HbA1c level was associated with a 3.0 g increase in left ventricular mass in elderly subjects. All of these variables are summarized in Supplementary Table 2.

Aside from demographic data, we also used the left ventricular end-diastolic volume (LVEDV) and left ventricular mass (LVM) extracted directly from the CMR images. These cardiac indices were computed from the manual delineations20 generated using the commercially available cvi42 post-processing software, and segmentations generated automatically using the method proposed by Attar and colleagues21. More details about how these values were used are outlined in the ‘Experiments and results’ section.

Age-Related Eye Disease Study database

The Age-Related Eye Disease Study (AREDS) was a multicenter prospective study of the clinical course of age-related macular degeneration (AMD) and age-related cataract, as well as a phase-III randomized controlled trial designed to assess the effects of nutritional supplements on AMD and cataract progression22,23. Institutional review board approval was obtained at each clinical site and written informed consent for the research was obtained from all study participants. The research was conducted in accordance with the Declaration of Helsinki. Further information on AREDS and the associated demographic data is included in Supplementary Section 2.

Deep learning approach

Our method is based on the multichannel variational autoencoder (mcVAE)24 and a deep regression network (ResNet50; https://doi.org/10.5281/zenodo.5716142). For the mcVAE, we designed two encoder/decoder pairs, each trained on one of the two data channels (retinal and CMR images) with a shared latent space. The full diagram of the proposed method is presented in Fig. 2. The encoders and decoders are further described in Supplementary Table 3.

Fig. 2: Overview of the proposed method.
figure 2

This system comprises two main components: a mcVAE and a deep regressor network. During Stage I, a joint latent space is created with two channels: retinal and cardiac magnetic resonance. Then, during Stage II, a deep regressor is trained on the reconstructed CMR plus demographic data to estimate LVM and LVEDV. Figure reproduced with permission from UK Biobank.

Antelmi et al.24 highlighted that using a sparse version of the mcVAE ensures that, at convergence, the evidence lower bound generally reaches its maximum when the number of latent dimensions coincides with the true number used to generate the data. Consequently, we used the sparse version of the mcVAE and trained a sparse latent space z for both channels of information. A detailed explanation of how the mcVAE works, and the difference between the mcVAE and a vanilla VAE25,26,27,28, is provided in Supplementary Section 3.
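In simplified form (a sketch following the multichannel formulation of ref. 24, with the sparsity mechanism omitted), the evidence lower bound optimized over the two channels (retinal and CMR) encourages each channel-specific posterior to reconstruct every channel from the shared latent space:

```latex
\mathcal{L} = \sum_{c=1}^{2} \mathbb{E}_{q_c(z \mid x_c)}\!\left[ \sum_{c'=1}^{2} \log p_{c'}\!\left(x_{c'} \mid z\right) \right]
            - \sum_{c=1}^{2} \mathrm{KL}\!\left( q_c(z \mid x_c) \,\middle\|\, p(z) \right),
```

where $q_c$ are the channel encoders, $p_{c'}$ the channel decoders and $p(z)$ the prior over the shared latent variable $z$. The cross terms ($c \neq c'$) are what allow a CMR image to be decoded from a retinal image alone at inference time.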

Once the mcVAE was trained, we used the learned latent space to train the deep regressor (ResNet50). To do that, we used CMR images reconstructed from the retinal images plus the demographic data (Stage II in Fig. 2).

Prediction of incident MI

We evaluate the ability of the proposed approach to estimate LVM and LVEDV from retinal images and demographic data. As an additional experiment, we predict MI using logistic regression in two settings: (1) using the demographic data alone; and (2) using LVM/LVEDV estimated from the retinal images and demographic data, combined with the demographic data. Logistic regression eased interpretability, allowing us to compare the weights/coefficients of the variables towards the final prediction (see Extended Data Fig. 3). To make this comparison, we extracted the cases with MI events from the participants not used to train the system; that is, 73,477 participants out of a total of 84,760 participants with retinal images. Of the 73,477, 2,954 subjects had a previous MI; however, we only consider the cases where MI occurred after the retinal images were taken, which results in 992 MI cases and 70,523 no-MI cases.

Because the data are imbalanced, we randomly resampled the normal cases to match the number of MI cases (992). Past studies29 have highlighted that resampling the majority class is a robust solution when the minority class contains hundreds of cases. Once the majority class was resampled, we performed tenfold cross-validation using logistic regression to predict MI in the two scenarios described above (that is, using demographic data only and using demographic data plus LVM/LVEDV).
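The balancing and cross-validation procedure can be sketched as follows (a minimal illustration with synthetic labels; the helper names and the pure-NumPy fold split are our own, not part of the original pipeline):

```python
import numpy as np

def undersample_majority(labels, rng):
    """Randomly undersample the majority class to the minority-class size."""
    pos = np.flatnonzero(labels == 1)          # MI cases (minority)
    neg = np.flatnonzero(labels == 0)          # no-MI cases (majority)
    keep_neg = rng.choice(neg, size=pos.size, replace=False)
    idx = np.concatenate([pos, keep_neg])
    rng.shuffle(idx)
    return idx

def kfold_indices(n, k, rng):
    """Yield (train, test) index arrays for k-fold cross-validation."""
    order = rng.permutation(n)
    folds = np.array_split(order, k)
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

# Synthetic label vector mimicking the class sizes reported in the text.
rng = np.random.default_rng(0)
labels = np.array([1] * 992 + [0] * 70523)
balanced = undersample_majority(labels, rng)

for train, test in kfold_indices(balanced.size, 10, rng):
    # A logistic regression model would be fitted on `balanced[train]`
    # and evaluated on `balanced[test]` here.
    assert train.size + test.size == balanced.size
```

In practice, a library implementation (for example, scikit-learn's `LogisticRegression` with `StratifiedKFold`) would typically replace these helpers.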

Experiments and results

In this study we jointly trained a mcVAE and deep regressor network on CMR images, retinal images and demographic data from participants in the UKB cohort. In the first experiment, we used manual and automatic delineations of the CMR images as ground truth to estimate LVM and LVEDV from retinal images. The manually delineated images were analysed by a team of eight experts using cvi4230, whereas the automatic delineations were obtained with the method proposed by Attar and colleagues31. The main motivation for this set of experiments is to enable a fair comparison between our system and state-of-the-art methods, as all methods published in the literature that use the UKB cohort are trained on the aforementioned manual delineations. Results of this experiment are presented as Bland–Altman and Pearson’s correlation plots (see Fig. 3a).

Fig. 3: Estimation of LVM and LVEDV using manual and automatic annotations.
figure 3

a, Bland–Altman and correlation plots for LVM and LVEDV estimated using manual annotations on CMR images. b, Bland–Altman and correlation plots for LVM and LVEDV estimated using automatic annotations computed with the method of Attar and colleagues31. GT stands for ground truth, or expert manual measurements. In Case A, we used all the available subjects to train and test our method. The solid line represents the fitted regression line, and the dotted line represents the line of identity.

Figure 3a shows the correlation between the LVM (r = 0.65) and LVEDV (r = 0.45) values estimated using our approach and those manually computed from the CMR images using cvi42. The results of this experiment support earlier clinical findings3,4,32 suggesting that retinal images could potentially be used to quantify cardiac parameters.

Aside from the Bland–Altman and correlation plots, we also compared our proposed method with state-of-the-art cardiac quantification methods based on CMR images (Bai and colleagues33), including the Siemens Inline VF system (see Supplementary Table 4). The Siemens Inline VF system was the first commercially available fully automatic left ventricular analysis tool34; the D13 and E11C versions are currently used as a baseline for comparison against manual delineation30.

Bland–Altman plots and Pearson’s correlation were computed for the participants with automatic annotations for LVM and LVEDV (see Fig. 3b).

Figure 3b shows a considerable correlation between the LVM and LVEDV estimated by the proposed method and the parameters computed with Attar’s algorithm. Training our method with more images reduces the estimation error (see Supplementary Table 4, experiment Exp 2B).
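The agreement statistics underlying these plots can be sketched as follows (a minimal NumPy illustration on synthetic values; the function names and example numbers are hypothetical, not the study's data):

```python
import numpy as np

def bland_altman(est, ref):
    """Bias (mean difference) and 95% limits of agreement between two methods."""
    diff = est - ref
    bias = float(diff.mean())
    half_width = 1.96 * float(diff.std(ddof=1))
    return bias, bias - half_width, bias + half_width

def pearson_r(x, y):
    """Pearson's correlation coefficient."""
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Synthetic example: noisy, slightly biased estimates of a reference value.
rng = np.random.default_rng(1)
ref = rng.normal(90.0, 15.0, 500)          # e.g. reference LVM in grams
est = ref + rng.normal(2.0, 8.0, 500)      # estimate with bias and noise
bias, lower_loa, upper_loa = bland_altman(est, ref)
r = pearson_r(est, ref)
```

A Bland–Altman plot then scatters the pairwise differences against the pairwise means, with horizontal lines at the bias and the two limits of agreement.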

Our approach can estimate LVM and LVEDV from retinal images and demographic data, and can improve the prediction of future MI events. To demonstrate this, we compare MI prediction in two settings: (1) using only demographic data; and (2) using LVM/LVEDV (predicted using our approach) plus demographic data. To do that, we performed tenfold cross-validation with a logistic regression model on subjects not previously used for training (see Fig. 4).

Fig. 4: Cross-validation results for MI prediction.
figure 4

ROC curves obtained for MI prediction using only demographic data (left). Accuracy, 0.66 ± 0.03; sensitivity, 0.70 ± 0.04; specificity, 0.64 ± 0.03; precision, 0.64 ± 0.03; F1 score, 0.66 ± 0.03. ROC curves obtained for MI prediction using LVM, LVEDV (derived from the proposed pipeline) and demographic data (right). Accuracy, 0.74 ± 0.03; sensitivity, 0.74 ± 0.02; specificity, 0.71 ± 0.03; precision, 0.73 ± 0.05; F1 score, 0.74 ± 0.03.

Figure 4 (right) shows a notable increase in the area under the ROC curve when using LVM/LVEDV plus demographics to predict MI.

Aside from predicting MI, we also compared the estimated LVM/LVEDV values between MI and no-MI cases using a t-test. Here the null hypothesis is that the LVM/LVEDV values come from the same distribution, whereas the alternative hypothesis is that they come from different distributions; we consider the distributions different if the P value is less than 0.05. We obtained P values of 1.43 × 10−57 and 2.32 × 10−52 for LVM and LVEDV, respectively, so we rejected the null hypothesis: the LVM/LVEDV values for MI and no-MI cases come from different distributions.
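This comparison can be sketched as follows (a minimal illustration on synthetic, hypothetical values; a normal approximation to the t distribution is used for the P value, which is reasonable at these group sizes, whereas a library routine such as scipy.stats.ttest_ind with equal_var=False would use the exact distribution):

```python
import math
import numpy as np

def welch_t(a, b):
    """Welch's (unequal-variance) t statistic with a two-sided P value.

    NOTE: the P value uses a normal approximation via erfc, adequate for
    large samples; exact t-distribution tails require a stats library.
    """
    se2 = a.var(ddof=1) / a.size + b.var(ddof=1) / b.size
    t = (a.mean() - b.mean()) / math.sqrt(se2)
    p = math.erfc(abs(t) / math.sqrt(2.0))   # two-sided tail probability
    return t, p

# Synthetic, hypothetical LVM-like values for the two groups.
rng = np.random.default_rng(2)
mi = rng.normal(100.0, 20.0, 992)       # MI cases
no_mi = rng.normal(88.0, 18.0, 992)     # no-MI cases
t_stat, p_value = welch_t(mi, no_mi)
```

With groups whose means genuinely differ, the P value falls far below the 0.05 threshold, mirroring the rejection of the null hypothesis reported above.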

Additional experiments evaluating the Fréchet inception distance35 score for reconstructed CMR images, and the impact of training set size, retinal image size and different demographic variables on the proposed algorithm, are presented in Supplementary Section 5.

External validation

Finally, we carried out an external validation using the optimal model identified in the preceding experiments. This validation was conducted on the AREDS dataset using the retinal images and demographic data presented in Supplementary Table 1. As previously mentioned, this dataset comprises 3,010 participants in total: 180 participants with MI events and 2,830 with no-MI events.

We used the mcVAE trained on all 5,663 available retinal images, of size 128 × 128 pixels. The demographic data available in the AREDS dataset differed from that in the UKB, so we trained our method on the metadata available in AREDS: systolic blood pressure, diastolic blood pressure, smoking status, alcohol consumption status, body mass index, age and gender. The demographic variable ‘alcohol consumption’ was converted to a continuous variable, in grams per day consumed. The remaining variables are coded consistently between the two datasets.

As the AREDS dataset was initially selected for its detailed AMD information, we performed three analyses discarding different levels of AMD to show the impact of AMD on MI prediction. The results are presented in Fig. 5 and Table 1.

Fig. 5: ROC curves obtained from the external validation using AREDS dataset.
figure 5

a, ROC curve obtained considering all the AMD cases. b, ROC curve obtained after discarding AMD cases with labels 2 and 3. c, ROC curve obtained after excluding all AMD cases (labels 1, 2 and 3).

Table 1 Results obtained from the external validation using the AREDS dataset. Accuracy, sensitivity, specificity, precision and F1 score were computed to show the impact of AMD on MI prediction

Discussion

This study demonstrates that retinal images and demographic data could be of great value for estimating cardiac indices such as LVM and LVEDV, by jointly learning a latent space of retinal and CMR images. To the best of our knowledge, no previous work has used a multimodal approach with retinal and CMR images to learn a joint latent space and subsequently estimate cardiac indices from retinal and demographic data alone. Our results follow past research demonstrating strong associations between biomarkers in the retina and the heart3,4,32, similar to a recent study in which cardiovascular risk factors such as age, gender and blood pressure were quantified using only retinal images5.

Using the proposed method to estimate LVM and LVEDV, we can assess patients at risk of future MI or similar adverse cardiovascular events at routine ophthalmic visits, enabling referral for further examination. Estimated LVM/LVEDV could also provide insights into pathological cardiac remodelling or hypertension at no extra cost. This means that, if an ophthalmologist keeps a record of those indices for a patient over time, they can refer the patient to a cardiologist for further assessment if a marked increase in LVM or LVEDV is detected. The ophthalmologist could even be bypassed with automated risk detection if patients consented to share their data on the cloud.

Figure 3 shows that our trained model is less accurate at estimating higher LVM and LVEDV values. Two main factors are involved: (1) the proportion of subjects with elevated LVM/LVEDV available for training (with retinal images) is limited; and (2) retinal images do not contain all of the information needed to assess cardiac function.

We chose to predict LVM and LVEDV as an intermediate step rather than directly predicting future MI events because: (1) this ensures that the developed approach is flexible in its clinical application, as it could be used not just to predict MI, but to assess left ventricular function in general; (2) using LVM and LVEDV enhances the explainability of predictions, as evidenced by the analysis of the logistic regression coefficients presented in the Supplementary Information.

In the external validation analyses, we presented detailed data on the relative performance of the algorithm at predicting incident MI according to the presence and severity of AMD in the retinal images. The performance was highest in the absence of AMD and seemed to decrease with the inclusion of individuals with AMD of gradually increasing severity. In its most severe form (that is, neovascular AMD), AMD can cause extensive fibrosis, haemorrhage and exudation across much of the macula; this is likely to obliterate the relevant signals employed by the algorithm for predicting incident MI. Even in less severe forms, such as early and intermediate AMD, substantial alterations to macular anatomy are observed, including drusen and pigmentary abnormalities36, which may partially degrade or interfere with the relevant signals. The most important signals from the retinal images for MI prediction are presumably encoded in the retinal vessels5, and even early and intermediate AMD are accompanied by substantial changes in the quantitative and morphological features of the retinal vasculature37. Overall, the presence of retinal disease such as AMD, particularly in its more severe forms, presumably interferes with the ability of the algorithm to infer characteristics of the systemic circulation from the retinal circulation.

The AUC scores obtained using our approach for the UKB and AREDS populations have to be considered in the context of a second referral setting at an optician/eye clinic, not a primary cardiology clinic. The sensitivity, specificity and precision/positive predictive value (PPV) of our approach at predicting future MI events from retinal images were: (1) 0.74, 0.72 and 0.68, respectively, in the UKB population, when just age and gender were considered as additional demographic variables (representative of the information available in an optician/eye clinic), as highlighted in Supplementary Fig. 7; and (2) 0.70, 0.67 and 0.67, respectively, in the AREDS population after excluding all AMD cases. Established cardiovascular disease risk assessment models (for example, the Framingham Risk Score (FRS), Systematic Coronary Risk Evaluation (SCORE) and Pooled Cohort Equation (PCE))38,39,40,41, used previously to screen populations for atherosclerotic cardiovascular disease, are comparable to our approach in discriminatory capacity, while requiring several additional demographic variables and clinical measurements not readily available at an optician/eye clinic. For instance, in ref. 39 the authors compared FRS, PCE and SCORE in the Multi-Ethnic Study of Atherosclerosis, each achieving an AUC of 0.717, 0.737 and 0.721, respectively, with corresponding sensitivity and specificity ranges of 0.7–0.8 and 0.5–0.6. Similarly, in ref. 38, the sensitivity, specificity and PPV of multiple cardiovascular risk assessment models were compared on the Diabetes and Cardiovascular Risk Evaluation: Targets and Essential Data for Commitment of Treatment study; FRS and PCE sensitivity, specificity and PPV ranged over 0.56–0.78, 0.60–0.78 and 0.12–0.24, respectively, at a 10% risk threshold.
Although the performance of our approach in this study cannot be directly compared with the risk assessment models evaluated in either of the studies above, they provide context for the results obtained on both the UKB and AREDS populations, highlighting its potential for use as a second referral tool at an eye clinic/optician. It is, however, important to note that this is a proof-of-concept study with limitations in study design (detailed in the Supplementary Information), predominantly the limited availability of the multimodal data required for such analyses.

Conclusion

This study presents a system that estimates cardiac indices such as LVM and LVEDV, and predicts future MI events, using inexpensive and easy-to-obtain retinal photographs and demographic data. We used 5,663 subjects from the UKB imaging study—with end-diastolic cardiac magnetic resonance, retinal images and demographic data—to train and test our method. We then used this system to predict MI in subjects who have retinal images but were not used during the training process. We found that using cardiac indices and demographic data together yields improvements in predicting MI events compared with using demographic data alone. Finally, we performed an independent replication study of our method on the AREDS dataset. Although a drop in performance was observed, the discrimination capacity of our approach remained comparable to established CVD risk assessment models reported previously. This highlights the potential for our approach to be employed as a second referral tool in eye clinics/opticians to identify patients at risk of future MI events. Future work will explore genetic data to improve the discriminatory capacity of the proposed approach, and explainable artificial intelligence techniques to identify the dominant retinal phenotypes that help assess CVD risk. This will facilitate fine-grained stratification of CVD risk in patients, a crucial step towards delivering personalized medicine.