++
HIGHLIGHTS
Machine learning is a deeply statistical science, built largely out of statistics and algebra. It is neither “logistic regression on steroids” nor a panacea for all data-derived difficulties; it is just another tool whose utility, like that of others, greatly depends on how it is used.
Although the learning algorithms themselves are of course important, the importance of high-quality, representative, and voluminous data cannot be overstated.
Machine learning is increasingly being used in surgical settings. This chapter focuses on more traditional aspects of machine learning.
++
Machine learning is neither “logistic regression on steroids” nor a panacea for all data-derived difficulties; it is just another tool whose utility, like others, greatly depends on how it is used. As it and other technologies have graduated into public discourse, experts in domains other than computer science are now trying to work out what all the fuss is about, and it may be difficult to separate the expectation from the truth.
++
Artificial intelligence (AI) is often used as if it were synonymous with machine learning (ML), and vice versa, which is not accurate, at least not yet. Although the specific definitions for these terms are constantly in flux, the consensus is that AI involves any sufficiently complex software that mimics human behavior on some task, which could reasonably include a circuitous flow chart designed by a committee, with branches marked “if” and “otherwise” that one may follow downward toward the output, mechanistically. Thus, AI includes software that may be explicitly instructed by human experts on how to mimic their domain expertise, whereas ML, a subset of AI, forgoes human instruction altogether, to the extent possible, and learns from experience (ie, from data). The nature of this learning is also typically limited to a very specific (or domain-specific) task. In surgery, we might have an ML model that can predict the presence of a hypoxemic event, for example, but it might have nothing to say about the severity of that event or about sepsis, anaphylaxis, or bleeding, unless the data allowed for those aspects to be learned.
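++
The contrast between an expert-instructed rule and a rule learned from data can be made concrete with a small sketch. The Python below is a toy illustration only: the oxygen saturation readings, the event labels, the 90% cutoff, and the function names (`expert_rule`, `learn_threshold`, `learned_rule`) are all hypothetical and chosen for this example; none of it is clinical guidance or a method described in this chapter. The "learning" here is a one-feature decision stump, about the simplest possible learner.

```python
# Toy contrast: an expert-written rule vs. a rule learned from labeled data.
# All values below are synthetic and illustrative, not clinical guidance.


def expert_rule(spo2: float) -> bool:
    """Hand-coded flow-chart branch: 'if SpO2 is below 90, flag an event'."""
    return spo2 < 90.0  # cutoff fixed in advance by a (hypothetical) committee


def learn_threshold(readings, labels):
    """Fit a one-feature decision stump: choose the cutoff with fewest errors."""
    values = sorted(set(readings))
    # Candidate cutoffs are midpoints between consecutive distinct readings.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best_cutoff, best_errors = None, len(readings) + 1
    for cutoff in candidates:
        # Count cases where "reading below cutoff" disagrees with the label.
        errors = sum((r < cutoff) != y for r, y in zip(readings, labels))
        if errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff


# Synthetic training data: SpO2 readings (%) and whether an event was recorded.
readings = [99, 97, 96, 94, 93, 91, 89, 88, 86, 84]
labels = [False, False, False, False, False, True, True, True, True, True]

learned_cutoff = learn_threshold(readings, labels)


def learned_rule(spo2: float) -> bool:
    """Same form as the expert rule, but the cutoff came from the data."""
    return spo2 < learned_cutoff


print(learned_cutoff)     # 92.0 for the synthetic data above
print(expert_rule(91))    # False: the fixed rule does not flag this reading
print(learned_rule(91))   # True: the data-derived rule does
```

Both rules have the same shape; what differs is where the cutoff comes from. Note also that the learned rule can only answer the question encoded in its labels (event or no event). It has nothing to say about severity, or about any other condition, because no such information was present in the training data.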
++
The field of ML has been steadily moving from expert-defined deterministic rules to data-defined statistical patterns, although domain experts still serve their purpose in increasingly niche areas. The various subfields of AI, including computer vision (CV) and natural language processing (NLP), have tended to incorporate considerable domain expertise (eg, in understanding and interpreting issues of syntax or lexicography in language, in the latter), but modern ML has slowly eroded the necessity for such prior knowledge. One reason for this gradual shift, which may have begun as a mere trickle as far back as the 1980s, is the tidal wave of data now being recorded, including in health care. The average clinic or small hospital may have on the order of a petabyte (1024 terabytes) of patient data or more, approximately 80% of which is unstructured (eg, computed tomography [CT] scans, textual ...