
The best books on Machine Learning

recommended by Eric Siegel

The AI Playbook: Mastering the Rare Art of Machine Learning Deployment by Eric Siegel


Machine learning uses data to predict outcomes, explains Eric Siegel, a former professor at Columbia who now advises companies on deploying it in their business. Unlike artificial intelligence, it's a real technology with a proven track record, he says. He recommends practical books on machine learning that are accessible to the layperson and useful to anyone looking to use it in their business or organization.

Interview by Sophie Roell, Editor


Before we get to the books you’re recommending—and apologies for asking something you’ve had to answer hundreds of times—could you explain what machine learning is?

Not at all. I’ve been working on explaining the topic to general audiences for decades, so it’s definitely something I’ve always taken an interest in. The focus area of my books and my teaching and everything is to bridge that gap.

Machine learning learns from data to predict. In established company applications, that means to help target marketing, fraud detection, ads, etc. The title of my first book is Predictive Analytics, which is basically a synonym for machine learning. The subtitle is an informal definition: The Power to Predict Who Will Click, Buy, Lie, or Die. The book on the table today, The AI Playbook: Mastering the Rare Art of Machine Learning Deployment, is my second book.

Most of the media attention these days is on generative AI. It’s still the same underlying technology, but instead of predicting on that level (per customer, or per corporate client, or per satellite that might run out of battery), you’re predicting ‘what should the next word be in this sentence?’ But the core technology—of learning from data to predict—is, broadly speaking, the same.

How does that tie in with artificial intelligence? What’s the difference between AI and machine learning?

Machine learning is a real technology with a proven track record and a true value proposition, whereas AI is the brand we hear about. It’s a subjective word that people use in many different ways, and no one can agree on the definition. Generally, AI conveys at least a little bit of overpromising, that is to say: hype. It’s basically a brand on top of machine learning.

Wow. OK. So in terms of the books you’ve chosen for today—what is it you’re trying to get across with these recommendations?

In general, the focus of my career—what I think is important—is getting value from machine learning. It’s deploying it for concrete, real uses, so that it’s actually helping organizations improve efficiency, or helping consumers, or some combination of the two.

The majority of books are either this highfalutin’ overhype with abstract buzzwords and a lot of wild promises about where we’re headed, or they’re just super technical and meant for data scientists. The in-between land of concrete, practical, meaningful, and yet accessible, entertaining, interesting books—that’s where we need more books, and that’s the main focus I had here. I did include one technical book, that’s my final choice, but my interest is in bridging a gap that’s so often unbridged.

Most new machine learning projects fail to achieve deployment. It’s an organizational failure. There’s a business practice—that’s the ‘playbook’ in the title of my new book—that needs to be followed. It brings together the tech and the business—the quants, the data scientists, the business stakeholders, their clients, their bosses—so that they get on the same page, speak the same language, and collaborate deeply in order to create value by making real change.

“There are a lot of AI hype books…They’ve been driving me bananas for decades”

The only way that what you’ve learned from data can be useful and valuable is if you act on it. That means changing operations, and that’s what I mean by deployment. It’s not just number crunching, but the actual use of models that make predictions. Those predictions then directly inform: Who should I market to? Which credit applicant should I approve? Which ad should I display? Which satellite should I inspect as potentially being low on battery? Should I drill for oil here? Whatever it is—there are a million examples.

That’s what’s so severely lacking. There is no standardized business practice well-known to business professionals. I offer that practice. I call it ‘bizML’ in my book. Two of the books I’ve chosen really are pioneering books in that realm, offering the business perspective, covering the technology and not just core number crunching for its own sake.

Because sure, the science is really cool and that’s why I got into it—that’s why most data scientists get into it in the first place. In my case, it’s been thirty years. I’ve had three decades to recover and realize, ‘Wait a minute. It’s not just the cool number crunching, let’s put the pedal to the metal and get people acting on the output.’

Your book is aimed at management, but your ideas aren’t necessarily just for companies. Could it be for government, or any organization that’s looking to use machine learning?

Yes. The Obama campaign used predictive analytics to target its campaign activities in 2012. That’s a case study I covered in my first book. It’s definitely applicable across sectors, including healthcare and hospitals, in both clinical and healthcare management issues.

Let’s go through the books. First on your list you’ve got Evil Robots, Killer Computers, and Other Myths (2021) by Steven Shwartz. Tell me more.

There are a lot of AI hype books. I didn’t include any of those in my list. They’ve been driving me bananas for decades. Then, there are a bunch of generally less popular myth-buster books. This is my favorite one. There’s one in French that’s called L’Intelligence artificielle n’existe pas. I thought that was hilarious.

This book, I think, does a great job. It's not just a mythbuster; it's a general introduction that helps newcomers, lay readers, and business readers of any background understand the hype and the overpromises and put them in perspective. Of the mythbuster books—of which there are a dozen or so that I'm aware of—this is the one I like best.

He argues that we don’t need to worry about artificial general intelligence or AGI…

Yes, and I say the same thing. AGI is where the computer can do anything a person can do and we basically have virtual humans. I believe that’s either a ghost story or a fairy tale, depending on whether you’re being a doomer or a utopian. It’s the novel Mary Shelley would have written if she knew about algorithms. The computer is not going to come alive. The fact that it’s getting better does not mean it’s taking concrete, proven steps towards general human-level capabilities. These days it’s a lot more seemingly human-like in certain ways which are astounding, but there’s no reason to think that it’s actually approaching human-level capabilities in general, or that it’s going to get a volition of its own.

Isn’t he also making the point that the ‘It’s going to take over the world!’ fear slightly distracts from the actual problems that AI is causing in certain ways?

Yes. He’s not the only one to say that, because there are grave concerns. I’ve published a dozen or so op-eds myself (in the San Francisco Chronicle, the Scientific American blog, and other places) about those concerns. They come up in my second to last choice, The Rise of Big Data Policing, which is about the ethical issues and responsible machine learning.

Whether you're saying AI is so powerful that it's going to kill everybody, or put everyone out of work, or that it's going to make a utopian society because it does everything for us, any of those stories serves the same purpose. It's hype, or 'criti-hype' (a term somebody coined), that's trying to almost deify it.

In general, the hype just serves to increase valuations and stock prices. It certainly does distract from the real, concrete problems that we could have in machine learning deployment, but it’s mostly meant to oversell. I don’t mean that it’s all just cynical—I think a lot of people genuinely believe it. It becomes almost a religious belief.

Do you agree with him that the threat to our jobs has also been exaggerated?

It has certainly been exaggerated: that’s part of the hype. Automation—which is what computers do in general—and optimization are going to make things more efficient and going to change the job market. We’ve been through those shifts many times before in history. It’s important for communities and businesses and governments to help individuals adapt to those changes, but it’s not going to wholesale replace white-collar jobs. It’s a tool that helps with tasks—it’s not like an employee who you can hire and onboard and set off on their own, like you would with a human.

Let’s go on to the next two books: Mining Your Own Business, by Jeff Deal and Gerhard Pilcher and Digital Decisioning, by James Taylor, which you wanted to talk about together.

Yes, these are both more brass tacks, about the business process and business management. They are both pioneering books that really talk about how a company is going to operationalize or deploy this technology, and what it takes from a management, executive, and organizational practice perspective to make those projects work effectively.

When you have technology that's great, that works, that's different from the overall project working and actually achieving deployment. That takes organizational understanding and an unusual collaboration, and it's a very neglected topic. That's why I wrote The AI Playbook.

These two books are very much pioneering in bridging that gap and writing about the technology, not just in an abstract, buzzword way, but in a concrete, specific way, about how value is delivered and how the project needs to be run, in a way that’s understandable to all readers.

That’s critical because right now, we have data scientists operating in a vacuum. They come back and they say, ‘Hey! Look, I made a predictive model.’

The executives say, ‘Oh, that’s interesting. That sounds good.’

Then the data scientist says, ‘So, are you going to use it?’

‘What do you mean, use it? You want me to make a change to the way my company is operating?’

That’s another conversation, and it ends up being a non-starter because that change wasn’t part of the plan from the get-go, but that’s what needs to be put in place. That’s the message of these books—alongside just deepening the understanding of machine learning for people who are not data scientists.

Is there a good example of the way a company or an executive should approach the use of machine learning to support their activities?

In my book, I've got a number of examples, including UPS. It's an established company, more than a hundred years old, that's very much set in its ways. It improved its US delivery of 16 million packages a day using a system that predicts where packages are going to need to be delivered tomorrow, in combination with a system that prescribes driving routes. They save 185 million miles of driving a year, $350 million, 8 million gallons of fuel, and 185,000 metric tons of emissions, only because the leader of the project was really aggressive from the get-go, saying, 'Hey! Look, this isn't just a number-crunching project. We're going to need to change the way shipping centers across the country allocate packages to trucks and then load the trucks overnight for their departures in the morning.'

That change met a lot of resistance from above him, at the executive level, and later, when they went to actually deploy it, from people working on the loading docks, who needed to follow new prescribed assignments of, ‘Put this package here; put this package there. Regardless of where you think this package should go, put it in this other truck.’ There were a lot of trials and tribulations, but relatively speaking, they had really good forethought of what it would take to manage that change.

Change management is an established field unto itself. There's an art to it. Everyone knows it's hard. The problem here is that with machine-learning projects, people don't conceive of them as requiring it. They're not applying the art of change management, because they see it as a technical project rather than what it should be seen as. It needs to be reframed as an operations improvement project that critically uses machine learning as part of achieving its goal.

If somebody’s reading this and they’re an executive in a company, are there businesses you think are really suited to using machine learning, and others, where you can immediately say, ‘Look, everybody’s talking about AI, but actually, in your case, it’s just not relevant’?

If an organization has large enough processes—and that applies to all large organizations, even some small ones—it’s hard to think of a place where there’s not at least a certain potential. The important thing is that they focus on a concrete value proposition rather than just thinking, ‘We’ve got to use AI! Everyone else is doing it!’

If you need to improve a large-scale operation, then the way you’re going to do that is by predicting some outcome. Who’s going to turn out to be a bad debtor? Which item rolling off the assembly line is going to need to be inspected as potentially having a fault?

Business is a numbers game. Most marketing mail is junk mail; most marketing email is spam. We can’t predict like a magic crystal ball, but we can tip the numbers game by predicting better than guessing, and that’s the value proposition. It doesn’t really hinge so much on the sector or the size of the company, but on the size of the operation.

If you’re a really small company, and you’re sending out a marketing catalog once a year during the holiday season to sell candies or gifts or something, but your prospect list is a million, you can learn from the responses you got from last year’s mailing in order to better target this year’s mailing. That’s machine learning. The opportunities abound. Most large companies are using it in certain ways, but its potential is being only partly tapped at this point.
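The arithmetic behind that catalog example can be sketched in a few lines of Python. The segments, names, and response data below are invented purely for illustration: the model "learns" last year's response rate per customer segment, then uses those rates to rank this year's prospects.

```python
from collections import defaultdict

# Last year's mailing outcomes: (segment, responded?) — invented data.
last_year = [
    ("bought_before", True), ("bought_before", False), ("bought_before", True),
    ("browsed_site", False), ("browsed_site", True),
    ("cold_list", False), ("cold_list", False), ("cold_list", False),
]

# "Learn" a response rate per segment from the historical outcomes.
hits = defaultdict(int)
totals = defaultdict(int)
for segment, responded in last_year:
    totals[segment] += 1
    hits[segment] += responded
rate = {s: hits[s] / totals[s] for s in totals}

# Score this year's prospect list and mail the most promising first.
prospects = ["cold_list", "bought_before", "browsed_site", "cold_list"]
ranked = sorted(prospects, key=lambda s: rate[s], reverse=True)
print(ranked[:2])  # the two best prospects to mail first
```

A real deployment would score individual customers with a trained model rather than whole segments, but the numbers game is the same: rank by predicted response and spend the mailing budget from the top down.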

Is one of the big issues you face, when you meet executives, that people both don’t exactly understand what machine learning is and they expect too much from it?

Yes, and that goes hand in hand with the way the word AI is generally used. There is a mismanagement of expectations and over-promising. Even when a business stakeholder has a pretty good idea, e.g. ‘Let’s predict which customers are going to cancel, in order to target retention campaigns’ (i.e. giving the right incentives to those customers to try to keep them around), it turns out they need to get a lot more detail than that, in collaboration with the data scientists, in order to make the project successful.

It’s not just the general gist of what the project is meant to do, but there are a lot of semi-technical details about exactly what’s predicted, and therefore what those probabilities that the system is going to output mean, and how exactly, mechanically, you’re going to use those probabilities. Because integrating probabilities into operations—which sounds a hell of a lot more boring than machine learning or AI—is really what we’re talking about. It’s very practical—if you’re actually trying to put it into practice.
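One concrete mechanism for integrating probabilities into operations is an expected-value threshold: act on a prediction only when the expected payoff of acting beats the cost of acting. A minimal Python sketch, with hypothetical churn numbers:

```python
# A churn model outputs a probability per customer; operations must decide
# whom to target with a retention offer. All numbers here are hypothetical.
OFFER_COST = 10.0        # cost of sending the retention incentive
CUSTOMER_VALUE = 200.0   # revenue retained if we save a would-be churner
SAVE_RATE = 0.25         # fraction of targeted churners the offer retains

# Target a customer only if expected benefit exceeds the offer's cost:
#   p(churn) * CUSTOMER_VALUE * SAVE_RATE > OFFER_COST
threshold = OFFER_COST / (CUSTOMER_VALUE * SAVE_RATE)

predicted = {"ann": 0.08, "bob": 0.35, "carla": 0.62}  # model outputs
to_target = [name for name, p in predicted.items() if p > threshold]
print(threshold, to_target)
```

The semi-technical detail the stakeholders and data scientists must agree on is exactly this: what the probability means, and what mechanical rule converts it into an operational action.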

Let’s go on to the next book you’ve chosen, The Rise of Big Data Policing: Surveillance, Race, and the Future of Law Enforcement, by Andrew Guthrie Ferguson. This is about some of the pitfalls of machine learning.

This is where it gets into what’s important to you ethically and philosophically and politically. In my world, I’ve got some serious concerns about the wholesale automation of discrimination by machine. These decisions that are being driven by computers are often very consequential and have impacts on lives. Who gets access to housing, to credit?  In cases where the predictions directly inform a judge’s sentencing decision or a parole board’s decision, it’s even: How long do you stay in jail as a convicted felon?

There are a couple of levels of problems. One is: does the system make a decision based directly on a protected class, like race or ethnicity or national origin? In general, no. But there are places in the law where that could be allowed, and there are examples of where it takes place.

It turns out that there are a lot of needle-nosed technical experts—even in machine learning ethics—who are proponents of allowing that type of direct discrimination, where the model output by machine learning is permitted access to those protected classes and therefore can base a decision, at least partially, directly on those factors. It could say, ‘Hey! Look, you’re black. We’re going to increase your risk score by seven points.’ That’s literally the kind of thing it could do, and it potentially would do, depending on the data from which it learns.


Getting rid of that problem, ensuring that the model is “colorblind” in that sense and doesn’t have direct access, is, I would say, the bare minimum first step. But that doesn’t eliminate the problem because it turns out you still have what some call ‘machine bias.’ The most famous citation everybody refers to is an article by that name in ProPublica, which talks about predictive policing, where it turns out that there’s a high rate of errors. The system’s always going to make an error, just like humans do. We don’t have a crystal ball or clairvoyance. They can predict better than guessing, potentially better than humans, and potentially less biased than humans.

But, because of the state of the world today and historic unfairness, it turns out that underprivileged groups are going to suffer these costly mistakes that would unjustly keep them in prison longer or lacking approval for a credit application or housing. To put it in technical terms, it’s a higher false positive rate. It’s a model saying, ‘I’ve identified you as a positive member of this high-risk group’ when they didn’t deserve it. It’s going to turn out to make that error proportionally more often with underprivileged groups.
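The false-positive-rate disparity described here is straightforward to measure. A toy check in Python, with invented records: for each group, take only the people who did not actually have the bad outcome, and compute how often the model flagged them as high risk anyway.

```python
# Each record: (group, model_flagged_high_risk, actual_bad_outcome).
# The groups and outcomes are invented for illustration.
records = [
    ("group_a", True,  False), ("group_a", False, False),
    ("group_a", True,  True),  ("group_a", False, False),
    ("group_b", True,  False), ("group_b", True,  False),
    ("group_b", False, False), ("group_b", True,  True),
]

def false_positive_rate(rows, group):
    """Among this group's true negatives, how often were they flagged anyway?"""
    negatives = [flagged for g, flagged, bad in rows if g == group and not bad]
    return sum(negatives) / len(negatives)

for g in ("group_a", "group_b"):
    print(g, false_positive_rate(records, g))
```

With this invented data, group_b's false positive rate is double group_a's: people in group_b who did nothing wrong get wrongly flagged twice as often. That per-group comparison, not overall accuracy, is where this kind of machine bias shows up.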

Those are some of the issues that I think are really important and that I’ve written about in my op-eds. There are a lot of books on this area, and a lot of them I think are really terrific and then always have some point of fault. By process of elimination, I landed on this one book that I think is an exemplar. It addresses at least some of the responsible machine learning, ethical areas without missing the forest for the trees.

Your last book is the Handbook of Statistical Analysis and Data Mining Applications, by Robert Nisbet, John Elder, and Gary Miner. What’s this book for?

This is for the data scientists! How can we have a list of books without having a representative from the major class of books, which is the technical ones? There are a million of them, and a lot of them are great. They’re the bread and butter I grew up on. They’re not for general readers. Among them, this book is very comprehensive. It also has a relatively large dose of business side vantage, not just the number crunching part, so it is unique in that way. And it makes your biceps really big because it’s a huge, thick, heavy book.

Could anyone aspiring to be a data scientist pick it up?

Yes, they could. It’s a technical book, but it starts from ground zero: What does it mean to learn from data? The computer program is automatically finding patterns and formulas and, in that sense, learning from the historical data or the labeled data—what do those patterns look like? Sometimes they’re just if-then business rules, and my book gets into a little of that detail as well.

This book gets into it in a much more technical way, but it does so from the get-go. If you're already a computer programmer, you'd understand it. If you're already relatively mathematically oriented but you know nothing about machine learning, you'd follow it.

If you’re a newcomer and you have no inclination in terms of math or technology, it would be a challenge to understand. It is meant to start from ground zero, but it assumes quantitative aptitude.

Finally, let's say I'm in a state of fear because I've read quite a few books now suggesting AI is going to take over the world. You're now telling me not to worry. How can I evaluate who is right and who is wrong?

You’ve got to be able to tell the difference between the hype and the reality because the hype is very prevalent. If you’re a general reader, it’s about getting a good sense of what this technology is. My two books are meant to ramp people up on the specific way it works and what it’s capable of. So is Evil Robots, Killer Computers, and Other Myths.

In terms of the hype, you could also read an article I wrote in the Harvard Business Review last spring: "The AI Hype Cycle is Distracting Companies." I break down where it's overpromising, why the definition of AI is such a problem, and why it matters. I've also written a follow-up article—that's still forthcoming—that really tries to break down the problems with the argument that we're headed towards AGI. It's a myth, and I list five fallacies in people's thinking that lead them to believe this is true.

The computer is not going to come alive. When you look at what concretely it can do, it’s very cool—and it’s not nearly that scary.


February 4, 2024


Eric Siegel

Eric Siegel is a former Columbia University professor who helps companies deploy machine learning. He is the founder of the long-running Machine Learning Week conference series and its new sister, Generative AI World, the instructor of the acclaimed online course “Machine Learning Leadership and Practice – End-to-End Mastery,” executive editor of The Machine Learning Times, and a frequent keynote speaker. He wrote the bestselling Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, which has been used in courses at hundreds of universities, as well as The AI Playbook: Mastering the Rare Art of Machine Learning Deployment. At Columbia, he won the Distinguished Faculty award when teaching the graduate computer science courses in ML and AI. Later, he served as a business school professor at UVA Darden. Eric also publishes op-eds on analytics and social justice.