Liver stiffness measurement (LSM) by transient elastography (TE) is a reliable, reproducible, and noninvasive tool for identifying patients with liver fibrosis or cirrhosis [13]. Because the cumulative incidence of HCC rises markedly with increasing LSM, this modality has been widely adopted to assess and stratify HCC risk, with remarkable performance [9, 11, 14]. Given the heavy burden that HBV infection places on healthcare resources and the wide availability of TE in China, we hypothesized that a novel and readily available LSM-based model could enhance surveillance and prevention in patients with CHB who are at high risk of developing HCC. A sizable derivation cohort and two independent external validation cohorts were used to develop and validate the model in this study.
This retrospective study consecutively enrolled patients between January 2013 and December 2023, with follow-up until December 2024. The inclusion criteria were (1) persistent serum HBsAg positivity for > 6 months, (2) complete follow-up data for ≥ 6 months, and (3) at least two follow-up visits. The exclusion criteria were (1) coinfection with hepatitis A/C/D/E virus or human immunodeficiency virus; (2) a history of HCC or other malignancies at enrollment; (3) confirmed HCC, death, or liver transplantation within 6 months of enrollment; (4) loss to regular follow-up; (5) incomplete clinical and/or laboratory data; (6) lack of a verified LSM; or (7) pregnancy or lactation. The external validation datasets were subject to the same inclusion and exclusion criteria (external validation cohort 1 [EV-1]: individuals from Mengchao Hepatobiliary Hospital of Fujian Medical University; external validation cohort 2 [EV-2]: individuals from Beijing Ditan Hospital of Capital Medical University, Nanjing Drum Tower Hospital, and the First Affiliated Hospital of Zhengzhou University). The study complied with the ethical standards of the Declaration of Helsinki and was approved by the Ethics Committee of Hebei Medical University Third Hospital (W2024-004-1). It was conducted and reported in accordance with the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines for the development and validation of prediction models [15].
Clinical data, including age, sex, alcohol use, family history of liver cirrhosis/HCC, comorbidities, and a range of laboratory parameters, were retrospectively collected at baseline. All patients underwent periodic laboratory surveillance, including routine blood chemistry, serum HBV DNA level, and other serologic viral markers, at 3- to 6-month intervals. In addition, serum alpha-fetoprotein (AFP) level and abdominal ultrasound findings were recorded every 6 months to screen for hepatic decompensation or HCC. Liver cirrhosis was diagnosed on the basis of liver histology, clinical features, radiological evidence, or a combination of these. The primary outcome was the development of HCC, established by histological or radiological findings [16].
LSM was performed using a liver TE scanner (iLivTouch, Wuxi HISKY Medical Technologies Co., Ltd., China). Although LSM values correlate highly between operators, interobserver variability in TE is not negligible [17, 18]. In this study, all measurements were performed by experienced operators, each of whom had carried out more than 100 TE examinations. The operators received professional training, performed supervised practical demonstrations, and were assessed before obtaining the training certificate issued by the manufacturer. According to the manufacturer, a result was considered reliable only if at least 10 successful acquisitions were obtained and the interquartile range-to-median ratio was less than 0.3 [19]. LSM data were monitored throughout the study, and all measurements were reviewed by the investigators to ensure continued adherence to the protocol. Patients were scanned in the supine or lateral position with the right arm raised and the hand resting on the head to widen the intercostal spaces. Under ultrasound guidance, the 7th, 8th, or 9th intercostal space between the right anterior axillary line and the midaxillary line was selected as the measurement site.
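The two reliability criteria stated above (at least 10 successful acquisitions and an interquartile range-to-median ratio below 0.3) can be expressed as a simple check. The following Python sketch is illustrative only and is not the device's own algorithm: the function name `is_reliable_lsm`, the input format (a list of stiffness readings in kPa), and the choice of quartile method are assumptions.

```python
from statistics import median, quantiles

MIN_ACQUISITIONS = 10      # minimum number of successful acquisitions (from the text)
MAX_IQR_TO_MEDIAN = 0.3    # upper bound on IQR/median, exclusive (from the text)


def is_reliable_lsm(acquisitions_kpa):
    """Return True if a set of successful readings meets both reliability criteria.

    `acquisitions_kpa` is a hypothetical input: one stiffness value (kPa) per
    successful acquisition. The quartile method ("inclusive") is an assumption;
    the scanner software may compute the interquartile range differently.
    """
    if len(acquisitions_kpa) < MIN_ACQUISITIONS:
        return False
    med = median(acquisitions_kpa)
    if med <= 0:
        return False  # a non-positive median cannot yield a meaningful ratio
    q1, _, q3 = quantiles(acquisitions_kpa, n=4, method="inclusive")
    return (q3 - q1) / med < MAX_IQR_TO_MEDIAN
```

A call such as `is_reliable_lsm([6.0, 6.2, 5.9, 6.1, 6.3, 6.0, 5.8, 6.2, 6.1, 6.0])` passes both criteria, whereas a set with only nine readings, or one with widely scattered values, is rejected.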
Statistical analyses were performed using R software (version 4.3.2; https://www.R-project.org). Patients were randomly split at a 3:1 ratio into the derivation cohort, which was used to screen variables and construct the model, and the internal validation cohort, which was used to verify the results. In addition, data from two independent external validation cohorts were used to test the generalizability and validity of the proposed model. Categorical variables are presented as numbers and percentages. Continuous variables are presented as the mean (standard deviation [SD]) or the median (interquartile range [IQR]), depending on whether they were normally distributed. The predictive performance of the model was assessed using the receiver operating characteristic (ROC) curve.
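As an illustration of the random 3:1 split, the following Python sketch assigns about three quarters of the patients to the derivation cohort and the remainder to the internal validation cohort. The study itself used R; the function name `split_three_to_one`, the identifier list, and the seed value are hypothetical and serve only to make the example reproducible.

```python
import random


def split_three_to_one(patient_ids, seed=2024):
    """Randomly split identifiers into derivation (~75%) and internal validation (~25%).

    `seed` is an arbitrary, hypothetical value; fixing it only makes the
    shuffle repeatable and is not taken from the study.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)   # local generator, leaves global random state untouched
    rng.shuffle(ids)
    cut = round(len(ids) * 0.75)
    return ids[:cut], ids[cut:]


# Usage with made-up identifiers 0..999
derivation, internal_validation = split_three_to_one(range(1000))
```

Every patient lands in exactly one of the two cohorts, so no information from the internal validation cohort is used during variable screening or model construction.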
We applied least absolute shrinkage and selection operator (LASSO) regression for initial feature selection in the derivation cohort [20]. Tenfold cross-validation was used to choose the optimal regularization parameter λ (the value minimizing the binomial deviance was λ-min = 0.01). To balance model complexity against performance, seven variables with nonzero coefficients were retained at λ-1se = 0.03. The proportional hazards assumption was checked using Schoenfeld residuals, and none of the selected variables showed evidence of a time-varying effect. The LASSO-selected variables were then entered into a multivariable Cox proportional hazards regression to identify independent risk factors for HCC, which were visualized in the resulting nomogram.
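The relation between λ-min and λ-1se can be made concrete with a short sketch of the one-standard-error rule as it is commonly defined: λ-min minimizes the mean cross-validated error, and λ-1se is the largest λ whose mean error lies within one standard error of that minimum. The Python code below is a sketch under that assumption and does not reproduce the authors' R workflow; the function name `select_lambdas` and all numbers in the usage lines are made up, chosen only so that the rule returns the λ values reported above.

```python
def select_lambdas(lambdas, cv_mean, cv_se):
    """Return (lambda_min, lambda_1se) from per-lambda cross-validation summaries.

    lambdas : candidate penalty values
    cv_mean : mean cross-validated error (e.g. deviance) for each lambda
    cv_se   : standard error of that mean for each lambda
    """
    if not lambdas or not (len(lambdas) == len(cv_mean) == len(cv_se)):
        raise ValueError("inputs must be non-empty and of equal length")
    lowest = min(cv_mean)
    # If several lambdas tie for the minimum, prefer the largest (sparsest model).
    best = max(
        (i for i, err in enumerate(cv_mean) if err == lowest),
        key=lambda i: lambdas[i],
    )
    threshold = cv_mean[best] + cv_se[best]
    lambda_min = lambdas[best]
    # Largest penalty whose error is still within one SE of the minimum.
    lambda_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= threshold)
    return lambda_min, lambda_1se


# Hypothetical cross-validation summaries (NOT study data)
grid = [0.001, 0.005, 0.01, 0.02, 0.03, 0.05, 0.1]
mean_err = [0.520, 0.505, 0.500, 0.504, 0.509, 0.530, 0.600]
se_err = [0.010] * 7
lam_min, lam_1se = select_lambdas(grid, mean_err, se_err)
```

Choosing the larger penalty trades a small, statistically indistinguishable loss in cross-validated fit for a sparser model, which is the balance between complexity and performance described above.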
We used several validation approaches to evaluate the accuracy of the prediction model across the derivation and validation cohorts. Predictive performance was evaluated and compared in terms of both discrimination and calibration. The discriminative ability of the proposed nomogram was quantified using Harrell's concordance index (c-index) and the time-dependent area under the curve (tdAUC) at 3, 5, and 8 years. Calibration was assessed using calibration curves, which plot estimated against observed probabilities, and the Hosmer-Lemeshow goodness-of-fit test [21]. Additionally, X-tile software was used to define three risk strata for HCC development based on the total nomogram score, categorizing participants into low-, intermediate-, and high-risk groups. Kaplan-Meier curves were constructed for each risk group, and the cumulative incidence of HCC was compared between groups using the log-rank test. A P value of < 0.05 was regarded as statistically significant.
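To clarify what the c-index measures, the following Python sketch computes Harrell's concordance index for right-censored data by direct pair counting. It is a minimal illustration and not the implementation used in the study (which relied on R): the function name `harrell_c_index` and the inputs are hypothetical, pairs with tied follow-up times are simply skipped, and tied risk scores count as half-concordant.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's c-index; a higher risk score should mean an earlier event.

    times       : follow-up time for each subject
    events      : 1/True if HCC was observed, 0/False if censored
    risk_scores : model output, e.g. the total nomogram score
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a comparable pair needs an observed event at the earlier time
        for j in range(n):
            if times[i] < times[j]:  # subject j was still event-free when i had the event
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of subjects by risk. The quadratic pair loop is adequate for illustration; large cohorts would normally use an optimized library routine.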