PT - JOURNAL ARTICLE
AU - Guangning Xu
AU - Geng Deng
AU - Xindong Wang
AU - Ken Fu
TI - Automatic Spline Knot Selection in Modeling Mortgage Loan Default Using Shape Constrained Regression
AID - 10.3905/jsf.2021.1.123
DP - 2021 Jun 04
TA - The Journal of Structured Finance
PG - jsf.2021.1.123
4099 - https://pm-research.com/content/early/2021/06/04/jsf.2021.1.123.short
4100 - https://pm-research.com/content/early/2021/06/04/jsf.2021.1.123.full
AB - In mortgage default modeling, many of the key variables, such as loan age, FICO score, Debt-to-Income ratio (DTI), and Loan-to-House-Value ratio (LTV), have nonlinear relationships with the target default rates. Experienced modelers generally apply a spline transformation with knots to the individual variables. In this article, we introduce the Quantile-based Shape Constrained Maximum Likelihood Estimator (QSC-MLE), which features an automatic spline knot selection in a mortgage default model. QSC-MLE is an enhanced variant of SC-MLE (Chen and Samworth 2016) used in combination with a quantile-based knots set, to effectively process large datasets. QSC-MLE requires generic shape information of the inputs, for example, the monotonicity or convexity of the FICO score, DTI, and LTV to capture any nonlinear effects. We show that the new default model considerably improves the accuracy of the out-of-sample prediction in comparison with the logistic regression and the Cox proportional hazards model. 
Moreover, the model conveniently generates component-wise spline functions, which facilitates the interpretation of the default rate response to the input variables.
TOPICS: MBS and residential mortgage loans, quantitative methods, statistical methods, credit risk management, performance measurement
Key Findings
▪ A mortgage default model using the Quantile-based Shape Constrained Maximum Likelihood Estimator (QSC-MLE), which features automatic spline knot selection.
▪ QSC-MLE constructs shape-constrained spline functions to capture nonlinear effects of model inputs.
▪ The new default model considerably improves the accuracy of the out-of-sample prediction.
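
The quantile-based knot set mentioned in the abstract can be illustrated with a minimal sketch. It assumes that candidate knots are placed at evenly spaced empirical quantiles of an input variable and that each knot contributes one piecewise-linear hinge term. The function names `quantile_knots` and `hinge_basis` and the sample FICO values are hypothetical, chosen for illustration only. This is not the authors' QSC-MLE implementation, which additionally imposes shape constraints and estimates the spline by maximum likelihood.

```python
from statistics import quantiles


def quantile_knots(values, n_knots):
    """Return candidate knots at evenly spaced empirical quantiles.

    Duplicates are dropped so that heavily tied data do not yield repeated knots.
    """
    cuts = quantiles(values, n=n_knots + 1, method="inclusive")
    return sorted(set(cuts))


def hinge_basis(x, knots):
    """Piecewise-linear spline basis: x itself plus one hinge max(0, x - k) per knot."""
    return [x] + [max(0.0, x - k) for k in knots]


# Hypothetical FICO scores, for illustration only (not data from the article).
fico = [580, 600, 620, 640, 660, 680, 700, 720, 740, 760, 780]

knots = quantile_knots(fico, 3)   # three candidate knots at the quartiles
row = hinge_basis(700, knots)     # basis values for a single loan with FICO 700
```

Under this assumed basis, a shape restriction such as monotonicity would translate into sign constraints on the cumulative slopes of the fitted coefficients, and knots whose hinge coefficients are estimated as zero would effectively drop out, which is one way to read "automatic knot selection".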