Introduction

As a consequence of the worldwide epidemic of obesity, diabetes mellitus, and metabolic syndrome, non-alcoholic fatty liver disease (NAFLD) has become one of the most frequent causes of chronic liver disease [1,2,3,4] with reported prevalence rates of up to 46% [5]. Without appropriate treatment, NAFLD and especially non-alcoholic steatohepatitis (NASH) can progress to fibrosis and ultimately cirrhosis. As a consequence of these developments, NASH has become the second leading etiology of liver disease in adults awaiting transplantation in the United States [6], and a major cause of hepatocellular carcinoma (HCC) [7].

Histological confirmation is considered the gold standard for diagnosis and staging of the disease [8, 9]. Stage of liver fibrosis is of paramount importance as it has been identified as an independent predictor of liver-related (and all-cause) mortality in patients with NAFLD in various studies [10,11,12]. Thus, detection of liver fibrosis is a crucial diagnostic step to stratify the individual risk of patients with NAFLD.

Several non-invasive tools have been introduced that allow assessment of fibrosis stage even without biopsy [9]. Transient elastography is among the most widely used techniques for non-invasive fibrosis assessment, but shows some limitations in morbidly obese patients [13, 14]. Fibrosis scores based on patient characteristics, anthropometric measurements, and laboratory parameters are increasingly used, and are considered a feasible alternative to imaging techniques, especially for exclusion of advanced fibrosis (stage III/IV) [9]. It has been repeatedly demonstrated that non-invasive fibrosis scores accurately predict advanced fibrosis in NAFLD [15,16,17,18]. To our knowledge, however, the value of non-invasive fibrosis scores with respect to body mass index (BMI) has not been evaluated.

The aim of this study was to assess the performance of different non-invasive scoring tools for liver fibrosis in NAFLD patients of different weight classes. For this purpose, we have evaluated non-invasive liver fibrosis scoring tools in a cohort of overweight or moderately obese (class I) NAFLD patients and in a NAFLD cohort with morbid or super (class III) obesity.

Materials and methods

Patients with well-characterized and biopsy-confirmed NAFLD were retrospectively studied at the University Medical Center Hamburg-Eppendorf. All patients underwent biopsy between January 2012 and December 2015. Our patient population consisted of 143 NAFLD patients recruited by the Department of Gastroenterology (conventional cohort), and 225 patients with class III obesity who underwent bariatric surgery and were recruited by the Department of Surgery (morbidly obese cohort).

NAFLD was diagnosed in patients with hepatic steatosis on liver biopsy after exclusion of drug-induced steatosis, excessive alcohol consumption (>210 g/week in men or >140 g/week in women), chronic hepatitis B or C infection, and histological evidence of other concomitant chronic liver disease [15]. Alcohol abuse was excluded by interviewing the patients, and their relatives, if available.

Clinical and laboratory data were collected in the course of routine assessment prior to biopsy. Laboratory testing, including blood count, clinical chemistry, and coagulation parameters, was performed at a median of 51 (interquartile range [IQR] 14–108) days prior to liver biopsy. If necessary, coagulation and hematology tests were repeated shortly before the procedure. Within 48 h prior to biopsy, all patients underwent a physical examination including assessment of body weight and height; BMI was calculated according to the usual formula (BMI = body weight (kg) / [height (m)]²).

Obesity and overweight were defined by BMI ≥ 30 kg/m2 and BMI = 25–29.9 kg/m2, respectively [19]. Diabetes was diagnosed in patients already on anti-diabetic medication and in patients with a fasting glucose ≥126 mg/dl. Impaired fasting glucose (IFG) was defined by fasting glucose levels ≥110 mg/dl. Arterial hypertension was diagnosed in patients already under antihypertensive medication and in patients with a blood-pressure ≥130/≥85 mmHg.
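The definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the label "not overweight" for BMI below 25 kg/m2 is our own placeholder, and we assume that IFG applies to fasting glucose values of at least 110 mg/dl that do not meet the diabetes criterion.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2: weight divided by height squared."""
    return weight_kg / height_m ** 2


def weight_category(bmi_value: float) -> str:
    """Overweight/obesity as defined in the text (BMI 25-29.9 and >= 30)."""
    if bmi_value >= 30:
        return "obese"
    if bmi_value >= 25:
        return "overweight"
    return "not overweight"  # placeholder label, not used in the text


def glucose_status(fasting_glucose_mg_dl: float,
                   on_antidiabetic_medication: bool = False) -> str:
    """Diabetes: on medication or fasting glucose >= 126 mg/dl.
    IFG: fasting glucose >= 110 mg/dl (assumed to exclude diabetes)."""
    if on_antidiabetic_medication or fasting_glucose_mg_dl >= 126:
        return "diabetes"
    if fasting_glucose_mg_dl >= 110:
        return "IFG"
    return "normal"


if __name__ == "__main__":
    value = bmi(weight_kg=95.0, height_m=1.75)
    print(round(value, 1), weight_category(value), glucose_status(115))
```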

This study was approved by the ethics committee of the Hamburg Medical Chamber (WF-042/17). The need for informed consent was waived due to the observational character of the study. A subgroup of our cohort has been previously evaluated with a different aim and methodology (PMID 29316577).

Non-invasive fibrosis assessment

NAFLD Fibrosis Score (NFS) [15], aspartate aminotransferase (AST) to platelet ratio index (APRI) [20], AST/alanine aminotransferase (ALT) ratio, fibrosis 4 (FIB-4) score [21], and BARD score [17] were calculated based on clinical and biochemical parameters.
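For orientation, the five scores can be sketched as follows. The formulas and coefficients are the commonly published versions of each score and are not spelled out in this article, so they should be read as assumptions to be checked against the cited references [15, 17, 20, 21]. The assumed units are age in years, AST and ALT in U/l, platelet count in 10^9/l, albumin in g/dl, and BMI in kg/m2; all function and parameter names are hypothetical.

```python
import math


def ast_alt_ratio(ast: float, alt: float) -> float:
    """AST/ALT ratio."""
    return ast / alt


def apri(ast: float, ast_upper_limit: float, platelets: float) -> float:
    """AST to platelet ratio index: (AST / upper limit of normal) x 100 / platelets."""
    return (ast / ast_upper_limit) * 100 / platelets


def fib4(age: float, ast: float, alt: float, platelets: float) -> float:
    """FIB-4: (age x AST) / (platelets x sqrt(ALT))."""
    return (age * ast) / (platelets * math.sqrt(alt))


def nfs(age: float, bmi: float, ifg_or_diabetes: bool, ast: float,
        alt: float, platelets: float, albumin_g_dl: float) -> float:
    """NAFLD Fibrosis Score with the commonly published coefficients."""
    return (-1.675
            + 0.037 * age
            + 0.094 * bmi
            + 1.13 * int(ifg_or_diabetes)
            + 0.99 * (ast / alt)
            - 0.013 * platelets
            - 0.66 * albumin_g_dl)


def bard(bmi: float, ast: float, alt: float, diabetes: bool) -> int:
    """BARD: BMI >= 28 (1 point), AST/ALT >= 0.8 (2 points), diabetes (1 point)."""
    return (1 if bmi >= 28 else 0) + (2 if ast / alt >= 0.8 else 0) + (1 if diabetes else 0)


if __name__ == "__main__":
    # Hypothetical patient, for illustration only.
    print(fib4(age=50, ast=40, alt=25, platelets=200))   # 2.0
    print(apri(ast=40, ast_upper_limit=40, platelets=200))  # 0.5
```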

Liver biopsy and histology

Liver biopsy was performed by mini-laparoscopy in 140 patients (38%), whereas 225 patients (61%) underwent intraoperative biopsy during bariatric surgery. Three patients (1%) underwent conventional percutaneous biopsy. Mini-laparoscopic biopsy was performed using an 18-gauge biopsy needle as described elsewhere [22]. Surgical liver biopsy (wedge biopsy) was performed during either laparoscopic Roux-Y gastric bypass or sleeve gastrectomy. Liver biopsy was performed when the liver appeared macroscopically abnormal at the time of bariatric surgery [23]. Pathological examination was performed at our central pathology department by experts in liver pathology. Liver fibrosis was evaluated according to the NASH-CRN scoring system: F 0 = no fibrosis; F I = perisinusoidal or portal/periportal fibrosis, F II = perisinusoidal and portal/periportal fibrosis, F III = bridging fibrosis, and F IV = cirrhosis [24]. Fibrosis stages 3 and 4 (F III/IV) were considered as advanced fibrosis. The NAS score was assessed based on the histological criteria of steatosis, ballooning, and inflammation, as described elsewhere [25].

Statistical analyses

Data are presented as median (25–75% IQR) for metric variables or as absolute number (%). Continuous variables were compared using the Mann–Whitney U test, binary variables via χ² analysis or Fisher’s exact test, as appropriate. Factors associated with fibrosis stage were evaluated by Spearman rank correlation and by univariate ordinal regression. Area under the receiver operating characteristic curve (AUROC) analyses were performed to identify cut-off values for different parameters with respect to the presence of advanced fibrosis. In addition, precision-recall curves and ROC curves were used to visualize and assess the predictive abilities of various parameters. Sensitivity, specificity, accuracy, positive and negative predictive values, as well as positive and negative likelihood ratios were calculated according to the usual formulae. Diagnostic odds ratio was calculated as described elsewhere [26]. AUROCs were compared by a non-parametric approach suggested by DeLong et al. [27]. Percent improvement in prediction error was calculated according to the following formula: \(100 \times \frac{\mathrm{AUROC}_{\text{Test-Score}} - \mathrm{AUROC}_{\text{Reference-Score}}}{1 - \mathrm{AUROC}_{\text{Reference-Score}}}\), where Test-Score refers to the first (superior) and Reference-Score to the second (inferior) score. A p value < 0.05 was generally considered statistically significant. SPSS Statistics Version 22 (SPSS Inc., Chicago, IL) and R (RStudio Version 1.2.1335) were used for statistical analyses.
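As an illustration of the "usual formulae" and of the percent improvement in prediction error, a minimal Python sketch is given below. It is not the analysis code used in this study (which was run in SPSS and R); the function names and the example counts are hypothetical, and the diagnostic odds ratio is written in its standard form (TP × TN) / (FP × FN), which we assume corresponds to reference [26]. No guard against empty cells (division by zero) is included.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2x2 diagnostic accuracy measures for a binary cut-off."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


def percent_improvement(auroc_test: float, auroc_reference: float) -> float:
    """100 x (AUROC_test - AUROC_reference) / (1 - AUROC_reference)."""
    return 100 * (auroc_test - auroc_reference) / (1 - auroc_reference)


if __name__ == "__main__":
    # Hypothetical counts and AUROCs, not taken from the study data.
    metrics = diagnostic_metrics(tp=40, fp=10, fn=10, tn=140)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
    print(f"improvement: {percent_improvement(0.90, 0.80):.1f}%")
```

With the hypothetical AUROCs above, the test score closes half of the gap between the reference score and a perfect AUROC of 1, which is what the formula is meant to express.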

Results

Patients

Three hundred sixty-eight patients with NAFLD were identified during the observation-period. Main clinical, histologic, and laboratory characteristics of our NAFLD cohort are illustrated in Table 1. Most of our patients were Caucasian (90%) with a median age of 47 years. A total of 43% of our patients were male. The majority of patients were overweight/obese (16%/77%) with typical underlying conditions suggestive of metabolic syndrome. We observed significant differences between the conventional and the morbidly obese cohort, attributable to different recruitment procedures. Accordingly, 88% (n = 126) of patients showed abnormal aminotransferase levels in the conventional group, compared to 32% (n = 73) in the morbidly obese group.

Table 1 Characteristics of 368 patients with non-alcoholic fatty liver disease.

Factors associated with fibrosis stage

Factors associated with fibrosis stage included age and parameters suggestive of metabolic syndrome (presence of diabetes/IFG and arterial hypertension). Among laboratory parameters, AST and platelet count were associated with fibrosis stage. By contrast, INR and bilirubin were not significantly linked to fibrosis stage. We observed an inverse relationship between albumin levels and fibrosis stage, not reaching statistical significance in our total cohort (p = 0.056). Yet, after adjustment for patient cohort (conventional vs. morbidly obese), there was a significant inverse association between albumin and fibrosis stage. Similarly, we found a significant inverse association between BMI and fibrosis stage, which was completely abolished by adjustment for patient cohort (conventional vs. morbidly obese) and thus probably caused by different recruitment between patients with and without morbid obesity, as discussed below. The detailed analysis of underlying conditions and laboratory parameters with respect to fibrosis stages is shown in Table 2.

Table 2 Demographic, clinical, and laboratory parameters in relation to an ordinal model of fibrosis stage assessed by univariate analysisa.

Performance of non-invasive scoring systems in patients with and without morbid obesity

All five tested scores were significantly associated with fibrosis stage in both patient cohorts (conventional vs. morbidly obese; Table 2). However, AST/ALT ratio and BARD score showed only moderate predictive potential. In our total cohort, FIB-4 score, APRI score and NFS showed the highest AUROC values in prediction of advanced fibrosis (Table 3). The AUROC for FIB-4 score (0.904) was significantly higher than AUROCs of all other scores (p < 0.001 for all). While FIB-4 and APRI score yielded comparable ROC curves in the conventional and the morbidly obese cohort, NFS curves differed considerably between conventional and morbidly obese NAFLD patients, thereby resulting in a lower AUROC in the total cohort. The relation of FIB-4 and NFS to the presence of fibrosis stage III/IV in our NAFLD patients with respect to patient cohort (conventional vs. morbidly obese) is illustrated in Fig. 1A, B. Figure 1B shows that NFS overestimates advanced fibrosis in our morbidly obese compared to conventional NAFLD patients. The prognostic value of ROC-derived threshold values for NFS, APRI, and FIB-4 score for our total NAFLD cohort is shown in Table 4.

Table 3 Receiver operating characteristic analyses of different non-invasive scores in prediction of advanced fibrosis (stage III/IV) in NAFLD.
Fig. 1: FIB-4 and NAFLD Fibrosis Score in prediction of advanced fibrosis.

Association of FIB-4 and NAFLD Fibrosis Score with advanced fibrosis in NAFLD patients with and without morbid obesity.

Table 4 Performance of ROC-derived cut-off values for FIB-4, APRI, and NAFLD Fibrosis score in identification of advanced fibrosis in NAFLD.

NAFLD Fibrosis Score and BMI in morbidly obese NAFLD patients

Based on the aforementioned findings, we hypothesized that the excess in BMI observed in our morbidly obese cohort did not correspond to a relevant increase in the risk of fibrosis III/IV; thus, BMI may be overrepresented in the NFS when applied to morbidly obese patients. Indeed, we observed no correlation between BMI and fibrosis stage in our morbidly obese NAFLD patients (r = 0.051, p = 0.446). Thus, we calculated a modified NFS (NFSmod) with no changes in the basic formula, but with BMI limited to 40 kg/m2. The AUROC analyses for NFSmod in prediction of fibrosis stages III/IV in our conventional and morbidly obese patients are shown in Table 3. Limiting BMI to 40 kg/m2 led to a significant improvement of the NFS’s performance (p < 0.001 between AUROCs). In our total NAFLD population, the predictive value of the NFSmod with regard to presence of advanced fibrosis (stage III/IV) was comparable to APRI, but still inferior to FIB-4 score (p < 0.001, Fig. 2).
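The BMI limitation can be sketched as follows. The sketch assumes the commonly published NFS coefficients (in particular 0.094 per kg/m2 of BMI), which are not restated in this article; the function names and the example patient are hypothetical. Only the capping of BMI at 40 kg/m2 is taken from the description above.

```python
BMI_CAP = 40.0  # kg/m2, the limit applied for NFSmod


def nfs(age: float, bmi: float, ifg_or_diabetes: bool, ast: float,
        alt: float, platelets: float, albumin_g_dl: float) -> float:
    """NAFLD Fibrosis Score with the commonly published coefficients (assumed)."""
    return (-1.675
            + 0.037 * age
            + 0.094 * bmi
            + 1.13 * int(ifg_or_diabetes)
            + 0.99 * (ast / alt)
            - 0.013 * platelets
            - 0.66 * albumin_g_dl)


def nfs_mod(age: float, bmi: float, ifg_or_diabetes: bool, ast: float,
            alt: float, platelets: float, albumin_g_dl: float) -> float:
    """Same formula, but BMI values above the cap are replaced by the cap."""
    return nfs(age, min(bmi, BMI_CAP), ifg_or_diabetes, ast, alt,
               platelets, albumin_g_dl)


if __name__ == "__main__":
    # Hypothetical patient with a BMI of 50.8 kg/m2.
    patient = dict(age=45, bmi=50.8, ifg_or_diabetes=False, ast=30,
                   alt=35, platelets=250, albumin_g_dl=4.2)
    print(round(nfs(**patient) - nfs_mod(**patient), 2))
```

Under these assumed coefficients, a patient with a BMI of 50.8 kg/m2 scores about 1.02 points lower with NFSmod than with the unmodified NFS (0.094 × 10.8), whereas patients with a BMI of 40 kg/m2 or less are unaffected.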

Fig. 2: Performance of non-invasive scores in prediction of advanced fibrosis.

Receiver operating characteristic and precision-recall curves for different non-invasive fibrosis scores in prediction of advanced fibrosis (stages III/IV).

FIB-4 score is the most reliable predictor of advanced fibrosis

FIB-4 score improved prediction error rates as compared to all other non-invasive scores (Supplementary Fig. 1). Accordingly, FIB-4 was identified as the most potent predictor of fibrosis stage III/IV in our total cohort of NAFLD patients, independent of age, sex, BMI and group-membership (conventional vs. morbidly obese). The results of the multivariate logistic regression model are shown in Table 5. Overall, FIB-4 scores >1.0 were strongly associated with presence of advanced fibrosis (OR = 29.1 (95% CI 12.6–67.3), p < 0.001; Table 4). Advanced fibrosis was found in 45% of patients with FIB-4 values >1.0 compared to <3% when FIB-4 was ≤1.0. Moreover, probability of advanced fibrosis was 88% in patients with FIB-4 values >3.0.

Table 5 Multivariate logistic regression model for FIB-4 score in prediction of advanced fibrosis in NAFLD.

Discussion

We were able to demonstrate in a large cohort of NAFLD patients that non-invasive fibrosis scores accurately predict the presence of advanced fibrosis (stage III/IV) in both patients with and without morbid obesity. FIB-4 score, APRI, and NFS were identified as suitable predictors of advanced fibrosis in our NAFLD patients. Current EASL guidelines on management of NAFLD state that non-invasive scores as well as transient elastography are acceptable procedures for the identification of patients at low risk of advanced fibrosis [9]. Accordingly, in our NAFLD patients, we found high negative predictive values for ROC-derived thresholds of FIB-4 and NFS, respectively, with regard to presence of advanced fibrosis.

Assessment of fibrosis stage is of central importance in patients with NAFLD. Studies suggest that fibrosis stage is the strongest single predictor of death in NAFLD [10,11,12], and it has been shown that advanced fibrosis stage is associated with an increased risk of developing HCC [28, 29]. Moreover, the presence of liver cirrhosis is associated with increased perioperative risk [30, 31]. Thus, estimation of fibrosis stage should be part of the routine preoperative assessment in patients with metabolic risk profiles, and especially in morbidly obese patients. Yet, predicting liver-fibrosis in morbid obesity is challenging, as imaging techniques are often insufficient in these patients [13, 32]. Transient elastography is widely being used for non-invasive assessment of fibrosis stage; however, this strategy is hampered by high failure-rates of up to 41% in patients with morbid obesity [13], although use of the XL-probe seems to be more efficient in producing reliable results in obese NAFLD patients [33]. It has been reported that AST, male gender and presence of type 2 diabetes mellitus are associated with NASH, and waist-to-hip ratio, AST and focal necrosis on liver biopsy with fibrosis [34]. However, the applicability of non-invasive scoring systems has not been evaluated in a large cohort of morbidly obese NAFLD patients. Recently, it has been suggested that different thresholds may be required in order to identify patients with advanced fibrosis when these scores are applied to morbidly obese patients [35]. In our NAFLD patients we found that FIB-4 (and also APRI score) could be applied to morbidly obese patients without adjustments. FIB-4 score >1.0 was significantly associated with presence of advanced fibrosis in conventional (OR = 17.3 (95%CI 6.2–48.1), p < 0.001) and morbidly obese (OR = 32.1 (6.7–151.9), p < 0.001) NAFLD patients. 
In addition, negative predictive value was high; thus, advanced fibrosis was virtually excluded if the criterion was not met (Table 4). These findings contrast with a recent study suggesting that both FIB-4 and NFS performed poorly in morbidly obese patients [36]. In terms of identification of advanced fibrosis, AUROCs indicated good discriminative capabilities for both scores in both the conventional and the morbidly obese NAFLD patients (each group analyzed separately).

In contrast to FIB-4, NFS showed only moderate performance when applied to our total NAFLD population. BMI is a central component of the NFS; however, the NFS was derived from NAFLD patients with a mean BMI of 32.2 kg/m2 [15], whereas our morbidly obese group showed a median BMI of 50.8 kg/m2. Thus, one possible reason for the aforementioned phenomenon may be found in an overrepresentation of BMI in the NFS with consecutive overestimation of fibrosis stage in morbidly obese patients. In support of this hypothesis, we found no correlation of BMI with fibrosis stage in our morbidly obese NAFLD cohort. By contrast, BMI correlated significantly with fibrosis stage in our conventional NAFLD cohort (p < 0.05). Thus, we decided to calculate a modified NFS with BMI limited to a maximum value of 40 kg/m2 (NFSmod) in order to overcome the issue of fibrosis-overestimation. Indeed, this modification significantly improved the score’s performance when applied to the total cohort (Table 3). These findings support earlier reports indicating that fat distribution and metabolic syndrome rather than BMI alone are associated with NAFLD [37, 38]. Yet, our hypothesis requires further validation in other morbidly obese NAFLD cohorts.

FIB-4 score was identified as a reliable predictor of advanced fibrosis in our NAFLD patients, independent of group-membership (conventional vs. morbidly obese), age, sex, and BMI (Table 5). FIB-4 was initially developed for assessing fibrosis stage in hepatitis C/human immunodeficiency virus co-infected patients [21]. With the increasing prevalence and impact of NAFLD, this score has been increasingly evaluated with respect to prediction of advanced fibrosis in NAFLD patients [39, 40], and it has become evident that FIB-4 is indeed a valid predictor of advanced fibrosis in NAFLD. In accordance with these findings, we observed a good predictive value of FIB-4 in both our morbidly obese and conventional cohort, with comparable thresholds for predicting advanced fibrosis. Moreover, performance of FIB-4 in predicting advanced fibrosis was significantly superior to all other tested non-invasive scores. Thus, our data suggest that FIB-4 score is a useful and reliable predictor of advanced fibrosis in NAFLD patients with and without morbid obesity.

With the rise of new markers of metabolism such as adipokines/hepatokines, new parameters are available that may contribute not only to a better understanding of the pathophysiology, but also to a more accurate non-invasive assessment of NAFLD progression. In 2019, Canbay et al. proposed a score based on age, gGT, HbA1c, caspase-cleaved cytokeratin 18 fragments (M30), and adiponectin, which predicted presence or absence of NASH with a reasonable performance [41]. Moreover, adipokines and hepatokines have been suggested as potential markers for disease progression and development of HCC in NAFLD/NASH [42]. Yet, these parameters were not routinely assessed in our patients. Further studies will have to clarify the role and potential therapeutic implications of adipokines and hepatokines in NAFLD/NASH patients.

Two main biopsy techniques were used in our patients: needle biopsy and surgical wedge biopsy. In 2006, a study suggested that, compared to needle biopsies, wedge biopsies were associated with higher rates of fibrosis [43], which may be explained by an overestimation of fibrosis due to subcapsular sampling [44, 45]. However, wedge and needle biopsies in the aforementioned study were not performed in the same patients. A small study comparing both techniques in the same patients showed fair to good concordance between wedge and needle biopsies with kappa coefficients ranging from 0.15 to 0.65, but also with a trend for higher fibrosis stages in the wedge samples [46]. Another smaller study, based on a morphometric evaluation, suggested that wedge biopsies were not only appropriate for assessing fibrosis stage, but also showed smaller sampling variability as compared to needle biopsy [47]. Moreover, it has been shown that even if the same biopsy technique was used, considerable variability in the results was observed in bariatric NAFLD patients [48]. Thus, although larger comparative studies are lacking, wedge biopsy appears to be a suitable method for assessing liver fibrosis, and fibrosis assessment results seem to correlate with those obtained by needle biopsy. Yet, sampling variability remains an issue, which underlines the need for validated and reliable non-invasive tools to assess fibrosis stage in NAFLD.

This study has strengths and limitations. To our knowledge, our study represents the first comprehensive evaluation of commonly used non-invasive liver fibrosis scores in a large cohort of NAFLD patients focusing on the potential bias caused by morbid obesity. A strength of our study is that two separately recruited patient cohorts underwent histological scoring at the same central pathology department. The performance of non-invasive liver fibrosis assessments in the conventional cohort was comparable to published results, indicating validity of our center’s histological evaluation. Thus, the different performance of NFS between our cohorts indeed depends on the patients’ BMI. Yet, due to the different recruitment procedures used in the two cohorts, there are significant differences in baseline characteristics that hamper direct comparability of the groups. However, the aim of this study was not to compare both patient groups in terms of their baseline characteristics, but to evaluate the performance of the non-invasive scoring systems in different stages of obesity. This is a retrospective study. However, all data were documented prospectively at the time of diagnosis/treatment in our patient data management system following the departments’ standards, assuring high reliability of the data. Yet, bias inherent to retrospective analyses and residual confounding cannot be entirely excluded.

In conclusion, our study shows that, among commonly used non-invasive scoring systems, FIB-4 score in particular is a useful and accurate predictor of advanced fibrosis. FIB-4 scores >1.0 are highly suggestive of advanced fibrosis in NAFLD patients of all BMI categories. NFS tends to overestimate fibrosis in morbidly obese patients. Future studies should clarify whether limitation of BMI instead of adjustment of thresholds to BMI groups may improve the score’s performance in morbidly obese patients, as suggested by our data. The use of non-invasive scoring systems may help to avoid unnecessary biopsies and identify patients requiring a closer follow-up.