The Journal of Arthroplasty, Volume 34, Issue 10, 2235 - 2241.e1

Predicting Inpatient Payments Prior to Lower Extremity Arthroplasty Using Deep Learning: Which Model Architecture Is Best?

Karnuta, Jaret M. et al.
Hip Knee


Recent advances in machine learning have given rise to deep learning, which builds models from hierarchical layers and may advance value-based healthcare by better predicting patient outcomes and the costs of a given treatment. The purpose of this study is to compare the performance of 2 common deep learning models, the traditional multilayer perceptron (MLP) and the newer dense neural network (DenseNet), in predicting outcomes for primary total hip arthroplasty (THA) and total knee arthroplasty (TKA), as a foundation for future musculoskeletal studies seeking to utilize machine learning.
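The structural difference between the 2 architectures can be illustrated with a minimal sketch. It assumes that "DenseNet" refers to the densely connected pattern in which each layer receives the concatenation of the input and all earlier layer outputs; the abstract does not specify layer sizes, so the widths, depth, and helper names below are hypothetical and the weights are random rather than trained.

```python
import random


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def linear(weights, bias, x):
    """Affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_in, n_out):
    """Random weights and zero bias (illustrative only, not trained)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_mlp(rng, n_in, width, depth):
    """Plain MLP: each layer sees only the previous layer's output."""
    sizes = [n_in] + [width] * depth
    return [init_layer(rng, sizes[i], sizes[i + 1]) for i in range(depth)]


def build_dense(rng, n_in, growth, depth):
    """Densely connected stack: layer i sees the input plus i earlier outputs."""
    return [init_layer(rng, n_in + i * growth, growth) for i in range(depth)]


def mlp_forward(layers, x):
    hidden = list(x)
    for weights, bias in layers:
        hidden = relu(linear(weights, bias, hidden))
    return hidden


def dense_forward(layers, x):
    features = list(x)
    for weights, bias in layers:
        new_features = relu(linear(weights, bias, features))
        # Concatenate instead of replace: earlier features stay visible.
        features = features + new_features
    return features


rng = random.Random(0)
patient = [0.2, 1.0, 0.0, 0.7, 0.3]  # hypothetical encoded patient features
mlp_out = mlp_forward(build_mlp(rng, 5, 4, 3), patient)
dense_out = dense_forward(build_dense(rng, 5, 4, 3), patient)
print(len(mlp_out), len(dense_out))
```

In this sketch the MLP ends with a representation of 4 values, while the densely connected stack ends with 5 + 3 × 4 = 17 values, because the original input and every intermediate output are carried forward to the final layer.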


Using 295,605 patients undergoing primary THA and TKA from a New York State inpatient administrative database from 2009 to 2016, 2 neural network designs (MLP vs DenseNet) with different model regularization techniques (dropout, batch normalization, and DeCovLoss) were applied to compare model performance on predicting inpatient procedural cost using the area under the receiver operating characteristic curve (AUC). Models were implemented to identify high-cost surgical cases.
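As a rough illustration of the evaluation metric, the sketch below labels cases as high-cost and computes the AUC from predicted scores using the rank-based (Mann-Whitney) formulation. The cost threshold, the example values, and the function names are hypothetical; the abstract does not state how high-cost cases were defined or which software computed the AUC.

```python
def label_high_cost(costs, threshold):
    """Return 1 for cases above the (hypothetical) cost threshold, else 0."""
    return [1 if cost > threshold else 0 for cost in costs]


def roc_auc(labels, scores):
    """AUC via the Mann-Whitney U statistic, using average ranks for ties."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied scores.
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined unless both classes are present")

    rank_sum_pos = sum(rank for rank, (_, label) in zip(ranks, pairs)
                       if label == 1)
    u_statistic = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_statistic / (n_pos * n_neg)


# Hypothetical inpatient costs (USD) and model scores for four cases.
costs = [10000, 30000, 12000, 45000]
labels = label_high_cost(costs, threshold=25000)
scores = [0.10, 0.35, 0.40, 0.80]
auc = roc_auc(labels, scores)
print(labels, auc)
```

An AUC of 0.5 corresponds to chance-level ranking of high-cost versus other cases, and 1.0 to perfect ranking, which is the scale on which the reported values should be read.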


DenseNet performed similarly to or better than MLP across the different regularization techniques in predicting procedural costs of THA and TKA. Applying regularization to DenseNet resulted in a significantly higher AUC as compared to DenseNet alone (0.813 vs 0.792, P = .011). When regularization methods were applied to MLP, the AUC was significantly lower than without regularization (0.621 vs 0.791, P = 1.1 × 10⁻¹⁵). When the optimal MLP and DenseNet models were compared in a head-to-head fashion, they performed similarly at cost prediction (P > .999).


This study establishes that in predicting costs of lower extremity arthroplasty, DenseNet models improve in performance with regularization, whereas simple neural network models perform significantly worse with regularization. In light of the resource-intensive nature of creating and testing deep learning models for orthopedic surgery, particularly for value-centric procedures such as arthroplasty, this study establishes a set of key technical features that resulted in better prediction of inpatient surgical costs. We demonstrated that regularization is critically important for neural networks in arthroplasty cost prediction and that future studies should utilize these deep learning techniques to predict arthroplasty costs.

Level of Evidence

