Nonparametric prediction distribution from resolution-wise regression with heterogeneous data

J Bus Econ Stat. 2023;41(4):1157-1172. doi: 10.1080/07350015.2022.2115498. Epub 2022 Oct 6.

Abstract

Modeling and inference for heterogeneous data have attracted great interest recently due to rapid developments in personalized marketing. Most existing regression approaches are based on the conditional mean and may require additional cluster information to accommodate data heterogeneity. In this paper, we propose a novel nonparametric resolution-wise regression procedure that provides an estimated distribution of the response instead of a single value. We achieve this by decomposing the information in the response and the predictors into resolutions and patterns, respectively, based on marginal binary expansions. The relationships between resolutions and patterns are modeled by penalized logistic regressions. By combining the resolution-wise predictions, we deliver a histogram of the conditional response that approximates its distribution. Moreover, we show a sure independence screening property and the consistency of the proposed method for growing dimensions. Simulations and a real estate valuation dataset further illustrate the effectiveness of the proposed method.
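As a rough illustration of the idea of resolutions, the sketch below expands a value in [0, 1) into its leading binary digits and then turns per-resolution probabilities into a histogram over dyadic bins. It is a minimal sketch under stated assumptions, not the procedure of the paper: the function names `binary_expansion` and `bin_probabilities` are hypothetical, the response is assumed to have already been mapped to [0, 1) (for example by its marginal distribution function), the bit probabilities are supplied by hand rather than estimated by penalized logistic regression, and the bits are treated as independent across resolutions purely to keep the example short.

```python
from itertools import product


def binary_expansion(u, depth):
    """Return the first `depth` binary digits of u in [0, 1).

    The digits satisfy u ~ sum(bits[d] / 2**(d + 1)); each digit plays the
    role of one "resolution" of the response in this sketch.
    """
    bits = []
    for _ in range(depth):
        u *= 2
        bit = int(u >= 1.0)
        bits.append(bit)
        u -= bit
    return bits


def bin_probabilities(bit_probs):
    """Combine P(bit_d = 1) for each resolution into 2**depth bin probabilities.

    Assumption for illustration only: bits are independent across
    resolutions, so each bin probability is a simple product.
    Bins are returned in increasing order of the response.
    """
    probs = []
    for bits in product((0, 1), repeat=len(bit_probs)):
        p = 1.0
        for b, q in zip(bits, bit_probs):
            p *= q if b else 1.0 - q
        probs.append(p)
    return probs


# 0.625 = 1/2 + 0/4 + 1/8, so its first three bits are 1, 0, 1.
bits = binary_expansion(0.625, 3)

# Hand-picked bit probabilities standing in for fitted classifier outputs.
hist = bin_probabilities([0.5, 0.25, 0.5])
```

With a depth of three, `hist` holds eight probabilities that sum to one, one for each interval of width 1/8. In a fuller treatment, each bit probability would depend on the predictors and, in general, on the coarser bits as well.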

Keywords: Binary expansion; Data heterogeneity; Nonparametric statistics; SSANOVA; Sure independence screening.