Linear probability models in information systems research

Shmueli, G. and Chatla, S. (2014) Linear probability models in information systems research. In: 22nd European Conference on Information Systems, 9-11 June 2014, Tel Aviv.

Full text not available from this repository.


Datasets used in IS research are growing quickly because of data available from digital technologies such as mobile, RFID, sensors, online markets, and more. It is not uncommon to see studies using tens or hundreds of thousands, or even millions, of records. Linear regression is among the most popular statistical models in social science research. Linear probability models (LPMs), which are linear regression models applied to a binary outcome, are commonly used in many social science disciplines, despite criticisms of such usage. Surprisingly, LPMs are rare in the IS literature, where logit and probit regression models are typically used for binary outcomes. Whether LPMs provide value or constitute an abuse has been discussed only for specific aspects. A thorough and broad evaluation of their pros and cons for different goals in different scenarios is missing. We carry out an extensive study to evaluate the advantages and dangers of LPMs, especially in the realm of Big Data that now affects IS research, where large samples and many variables are available. We evaluate performance in terms of coefficient estimation as well as predictive power, and we compare performance to alternatives suggested in the literature. We find that the LPM is beneficial for explanatory modeling when the outcome is naturally binary, whereas it is beneficial for predictive modeling when the outcome is binary by dichotomization. In large-sample studies, IS researchers should consider LPMs for coefficient estimation if the outcome is naturally binary, but not if it is dichotomized. For predictive purposes, LPMs should be considered even with small samples. We motivate and illustrate our study using a large dataset on online auctions from eBay.
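
For readers unfamiliar with the term, the following is a minimal sketch of what fitting an LPM involves: ordinary least squares applied to a 0/1 outcome. It is not the authors' code or data. The simulated logistic data-generating process, the coefficient values, the sample size, and the helper names fit_lpm and predict_lpm are all assumptions made for illustration. The sketch also shows one commonly cited criticism, namely that fitted values can fall outside the [0, 1] interval.

```python
import numpy as np


def fit_lpm(X, y):
    """Fit a linear probability model: OLS of a 0/1 outcome on X plus an intercept."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef


def predict_lpm(coef, X):
    """Return fitted values; these are not constrained to lie in [0, 1]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef


# Simulated data (assumption): a naturally binary outcome generated from a
# logistic model with one predictor. None of these numbers come from the paper.
rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * x)))
y = (rng.random(n) < p).astype(float)

coef = fit_lpm(x, y)
fitted = predict_lpm(coef, x)

# Share of fitted "probabilities" that fall outside the unit interval.
share_outside = np.mean((fitted < 0) | (fitted > 1))

# Classification accuracy when thresholding the fitted values at 0.5.
accuracy = np.mean((fitted >= 0.5) == (y == 1))

print("intercept, slope:", coef)
print("share of fitted values outside [0, 1]:", share_outside)
print("accuracy at 0.5 cutoff:", accuracy)
```

Because the model is plain least squares, the slope reads directly as a change in the fitted probability per unit change in the predictor, which is part of the appeal for coefficient estimation; the out-of-range fitted values are the usual price.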

Item Type: Conference or Workshop Item (Paper)
Subjects: Business Analytics
Date Deposited: 05 Nov 2014 07:09
Last Modified: 05 Nov 2014 07:10
