Dean Abbott is the Principal at Abbott Analytics and is currently the Bodily Bicentennial Professor in Analytics at the University of Virginia Darden School of Business.
He is an internationally recognized thought leader and innovator in data science who has been solving problems in customer analytics, fraud detection, text mining, and more for over three decades. He is frequently included in lists of the top pioneering and influential data scientists in the world.
Dean is the author of Applied Predictive Analytics (Wiley, 2014, 2nd Edition forthcoming) and coauthor of The IBM SPSS Modeler Cookbook (Packt Publishing, 2013). He is a popular keynote speaker and bootcamp instructor at conferences worldwide and serves on advisory boards for the UC/Irvine Predictive Analytics and UC/San Diego Data Science Certificate programs.
Dean holds a bachelor's degree in computational mathematics from Rensselaer Polytechnic Institute and a master's degree in applied mathematics from the University of Virginia.
You can also learn more about Dean on LinkedIn.
Here’s a quick summary of key takeaways:
Host Ryan Atkinson engages in a captivating conversation with data analytics expert Dean Abbott. They explore the history and evolution of analytics and statistical techniques, emphasize aligning analytics with business objectives, and discuss the art of communicating data-driven stories, as well as how you can become a data analyst.
- Analytics has come a long way over the years, evolving from basic statistical techniques to encompassing advanced algorithms and big data analysis.
- Despite the advancements, foundational principles such as understanding data, validating models, and assessing accuracy remain crucial in the analytics process.
- The availability of vast amounts of data presents both opportunities and challenges. It’s important to focus on collecting relevant data and avoiding biases that can distort analysis.
- Businesses must align their analytics efforts with their overall objectives to ensure actionable insights and drive meaningful results.
- Effective communication of data-driven stories is essential for decision-makers to understand and act upon the insights. It requires presenting data in a visually appealing and easily understandable manner.
- Data analytics holds immense potential for the future, with applications in diverse fields such as healthcare, finance, and marketing. However, ethical considerations and privacy concerns must be carefully addressed.
Listen to the podcast here, or find it wherever you get your podcasts:
Ryan Atkinson: Welcome Dean. So, so excited to have you on.
Dean Abbott: Thanks for having me. I’m really looking forward to the conversation.
Ryan Atkinson: Yes, and you are like a mastermind, a wizard of all things data analytics. You've been in analytics for over 20 years, so I'm curious about the history of analytics and how the world's perception of data analytics has changed since you got started.
Dean Abbott: It's funny, because I didn't start my career planning to go into analytics. I was a computational math undergrad and an applied math grad student; I got a master's. But I was always interested in the analytics side, going back to when I was a kid playing baseball and computing my own batting averages, things like that.
But when I started my career, it was so useful and practical. This was back in the 80s, like 35 years ago. And the core methods that I started learning at that time really persist to this day. So in one sense, not much has changed over the last 30 or 40 years in analytics.
You still have basic statistical methods, and those go back 100 years. Even a lot of the machine learning algorithms that are commonly used today have been around for decades. So a lot of things are the same. There are a few new things, and we could go into some of them, but the primary difference between doing analytics now and then is the size of the data you can analyze.
It's much bigger today because of improvements, primarily in hardware, though there have been software improvements as well. You can just crunch a lot more data now, which is fantastic, because instead of trying to squeeze your analysis into a small size you could load into an Excel spreadsheet, a few thousand rows or something like that, now you can analyze millions and tens of millions of rows, gigabytes of data, all at one time.
Ryan Atkinson: Interesting. And how small was the sample size back in the eighties and nineties? Can you give us a deeper comparison on that?
Dean Abbott: It was not uncommon to be looking at hundreds to a few thousand records.
In fact, here's an interesting story. The University of California San Diego, which is close to where I live, has a supercomputing center, and I kind of crashed a meeting of statisticians there. These were famous statisticians; I don't want to embarrass anybody by naming names, but if you're in statistics, you would know who these people are. And here's the question they were asked, this was in the nineties: what's the largest data set you've ever worked with, the largest number of rows? For some of them, it was a thousand records, or ten thousand records. That gives you context for the way people were working: statistics back in that time was really geared around small sample sizes. How do you wring as much information as possible out of a hundred records, or 500 records? Now, because data sizes are so big, those constraints really aren't applicable anymore, and we can apply other techniques that don't require all the strong assumptions that statistical methods require us to make.
Ryan Atkinson: That's super interesting. It sounds like the data sets are a lot bigger, but as you said, the foundation is still there, and it's still applicable. I'm curious, can you dive into those foundations of statistics and data analytics?
Dean Abbott: Yeah. So, some of the foundational things. Everybody who wants to get into analytics should understand some basic statistics. And if you think you don't understand the math, I'm really not talking about the math of it. The math is interesting, and the way statistics is usually taught is very mathematical, but I'm talking more about the intuition around what happens to the data when you compute these metrics. Take the simplest ones, things like means, the average. What happens to the average value if the shape of your data is weird? One of the core principles in statistics, and there are theoretical reasons for this, is the idea of a normal distribution, or Gaussian distribution.
That normal distribution is there, like I said, for theoretical reasons; that's the way data tends to be distributed. But it's not always like that. If your data isn't that shape, if you've got weird values way out there (you can't see my hand, but outliers), how much do they influence things? The idea of the mean is what's typical, what's in the middle of the distribution. But if your data's shape is odd, then that average value does not represent the middle anymore, and you need to think about what other representations you can provide for the data, and how you can change the data so that it complies with these assumptions.
Statistical methods, whether it's linear regression or principal component analysis or any of these kinds of techniques, assume that the data is normally distributed, and that's what gets people into trouble first, especially with smaller data, where you can have weird distributions. If you don't comply with that shape, all those metrics, like the mean, are not representative of what's going on in the data. So that's one of the things: knowing how you can be fooled by these metrics.
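Dean's point about outliers fooling the mean can be seen in a few lines of Python. This is a toy illustration, not from the episode; the donation amounts are made up.

```python
# One extreme value drags the mean far from the "typical" value,
# while the median (a rank-based metric) barely moves.
import statistics

donations = [50, 60, 55, 45, 65, 50, 10_000]  # one huge outlier

mean = statistics.mean(donations)
median = statistics.median(donations)

print(f"mean={mean:.2f}, median={median}")  # mean=1475.00, median=55
```

Here the mean suggests a "typical" donation of nearly 1,500, which describes no actual donor; the median of 55 is far closer to what most people gave.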
Ryan Atkinson: Interesting. And so when you're analyzing data like this, what are some of the ways to set those benchmarks? How do we check ourselves as we go when we're building out a model or an analysis like this?

Dean Abbott: Yeah. So, and we may revisit this later, so I'll say it now: one of the frameworks I like to use to describe analytics, when you're a professional doing analytics-based projects, is a set of six stages. The framework I use is not unique to me; it's called CRISP-DM, the Cross-Industry Standard Process for Data Mining. The stages are business understanding, data understanding, data preparation (fixing data, cleaning data), modeling, evaluation, and then deployment. So in the early stages, data understanding, that's where we use a lot of these statistical techniques to get the lay of the land.
What does the data look like? What are the characteristics of the data? You compute something like a mean to tell you what's typical, because the question you want to ask at that stage is: does this make sense? Let's say you're a nonprofit (I've done a bunch of nonprofit modeling) and you want to know the average age of the donors to your nonprofit. If the average age is 120 years old, okay, we know that's not realistic. What's going on here? Note to self: look at the data, understand why it's so off, why it's so weird. Or what if the average age is 10 years old? Then there's something else going on. For example, maybe for half of the data you don't have an age.
That's called missing data. And let's say somebody on the database side said, hey, we can't have missing data, let's fill in those missing values with a zero, which shifts the mean down and makes it unrealistic. Sometimes, as the analyst, you don't know; you just get a pile of data, and you don't know all the stuff that's been done to it. So at this stage, you're just trying to understand: what does it look like? Does it make sense? Can I see some weird things in the data that I'll need to fix later? In terms of core fundamentals, that's the first thing we always do.
You should always summarize every data element you're going to use in your analysis to make sure that it makes sense and doesn't have weird things in it. One of my quotes that gets picked up a lot is a flip of a George Box quote. The Box quote is, "All models are wrong, but some are useful." What I usually say is, "All data is wrong, but most is useful," because data always has problems, but usually we can fix it to the point where it's better than nothing. And that's really what we want with data practices: it doesn't have to be perfect, it just has to be better than what we had before.
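A minimal sketch of that data-understanding pass: summarize a field and sanity-check it before modeling. This is illustrative only; the `donor_age` field, the zero-fill scenario, and the numbers are hypothetical.

```python
# Profile one field the way Dean describes: compute summary stats and
# look for signs that something was "done to" the data upstream.
import statistics

def summarize(name, values):
    """Quick profile of a numeric field: does it make sense?"""
    zeros = sum(1 for v in values if v == 0)
    return {
        "field": name,
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "pct_zero": zeros / len(values),
    }

# Suppose half the donor ages were missing and someone on the database
# side filled them in with zero.
ages = [62, 58, 71, 65, 0, 0, 0, 0]
profile = summarize("donor_age", ages)

# The mean is dragged toward zero, giving an unrealistic "average donor
# age", and pct_zero=0.5 is the tell that missing values were zero-filled.
print(profile)
```

Running a profile like this on every column before modeling is exactly the "does this make sense?" check: a 32-year-old average donor in a base of retirees should trigger a look at the raw values.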
Ryan Atkinson: Yeah. I'm curious about the business-practice perspective on the data that companies should be crunching. It might not be possible to give a blanket answer, but is there some sort of framework you think any organization should implement when analyzing their data?
Dean Abbott: Yeah, and that's why I like the CRISP-DM process model I was describing before, because those six stages are common no matter what kind of analysis you do. The first stage, for example, is business understanding: defining the problem you're trying to solve. And one of the biggest misses I see with analytics projects is a disconnect between the business and the analyst. The business doesn't necessarily know how to describe the problem in terms the analyst can understand and use. So what I usually try to get them to do is describe what they want to do in their own language. Let's say this nonprofit wants to know which donors who haven't given in more than a year are recoverable, so that they can try to recover them in a cost-effective way.
Okay, that's great. That's not yet a modeling project I can take on as an analyst, but at least it gives me an idea of what they want. So then the question is, how do we translate that into something quantifiable? For example, in that situation: do we have in the data examples of lapsed donors who have been recovered, and other lapsed donors who have not been recovered? Then we can create a column in our data that says recoverable or not recoverable, and build a model to see if we can understand the difference between the recoverable and unrecoverable ones, so that we can score every donor and identify which subset of donors we should try to recover. You can see there's a whole sequence of steps, but that's the translation between what the business is trying to accomplish and what you can do with analytics.
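That translation step, from business question to a labeled modeling dataset, might look something like the sketch below. The field names, the one-year lapse definition, and the toy records are hypothetical stand-ins, not Dean's actual project code.

```python
# Turn "which lapsed donors are recoverable?" into a labeled dataset:
# filter to lapsed donors, then attach a recoverable/not-recoverable
# target column a model can learn from.
from datetime import date

def label_lapsed_donors(donors, today):
    rows = []
    for d in donors:
        days_since_gift = (today - d["last_gift"]).days
        if days_since_gift > 365:  # business definition of "lapsed"
            rows.append({
                "donor_id": d["id"],
                "days_since_gift": days_since_gift,
                "recoverable": d["gave_again"],  # the target column
            })
    return rows

donors = [
    {"id": 1, "last_gift": date(2021, 5, 1), "gave_again": True},
    {"id": 2, "last_gift": date(2023, 1, 1), "gave_again": False},  # not lapsed
    {"id": 3, "last_gift": date(2020, 8, 15), "gave_again": False},
]
training_rows = label_lapsed_donors(donors, today=date(2023, 6, 1))
print([r["donor_id"] for r in training_rows])  # → [1, 3]
```

From here, a classifier trained on `training_rows` (with many more features, of course) could score every current lapsed donor on their likelihood of being recoverable.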
Ryan Atkinson: What I like about that is getting alignment with the business objectives. The analyst might be thinking, "I don't really know what that means, but I can take these steps to identify the problem and then give you the solution."
Dean Abbott: Yes, and that's always a conversation. The reason I say this is where most projects fail is that the business can't describe the problem in quantitative terms. They can always describe it in qualitative terms, and the analysts, sometimes super smart analysts, think they understand, but they really don't. So I see it so many times: an organization will throw the data over to the analyst and say, hey, do something magical with this data, and they build incredibly accurate, interesting models that solve irrelevant problems, because they don't know what they're really trying to solve.
In the end, the director or the VP says, show us what you've got, and the analyst shows them, and the response is, "That has nothing to do with what I need." "Well, we didn't know what you wanted." That disconnect really bites you down the road. So there has to be a conversation to connect these together, and there has to be regular feedback, especially the first time you're introducing more advanced analytics to the organization, because you're trying to train each other on how to do analytics.
Ryan Atkinson: Yeah, that's what I'm also curious about. Let's say you're aligned on the objectives, you run the models, and it turns out great. This is perfect. But how do you then present that data? Do you use any storytelling techniques? How do you effectively translate this data to someone who might not speak data?
Dean Abbott: I think it's good the way you described it, as storytelling, because the data is telling stories about your business process. So again, if we go back to the nonprofit: what is the data telling us? The story we want to tell is which donors lapsed for some reason we don't understand, but whose heart is really with the organization, and who will come back if you just nudge them a little bit and send them the right message. So that storytelling is important.
That's why defining the problem initially is so important. But then, for the analysts, what should they be communicating back up? It's not just how accurate the models are. There are two things in particular I think the analysts need to play back to the business side. Number one is: how will these analyses or these models change behavior, and how will they ultimately impact the nonprofit? In this case, what's the net revenue I expect when I deploy this model? That's the language the business will really want to hear, not what percent correct classification I have, or what the average donation amount is. They want to know how much revenue this is going to generate. The second thing they'll want to know is what's driving the donors with a high likelihood of being recoverable. What characteristics are there? You could say, hey, it's these particular characteristics, and maybe one of them is a particular zip code; that's really the key.
And they'll look at you and go, wait a second, why does zip code matter? Why is zip code even in this? We want to be careful with things like zip code; we don't want to do redlining. And the analyst goes, oh. But in order to explain why the models are good, you have to describe which characteristics, which attributes, are driving the model's decisions, and that helps give the decision-makers confidence that it's doing a good thing. The ideal is that for about 80% of what you tell them, they say, yeah, of course, I didn't need a machine learning model to tell me that. But then every once in a while you describe something and they go, huh, okay, that makes sense, I wouldn't have thought of that. You want the obvious, of course, and then the not-so-obvious, to show that it's finding something buried in the data, one of those hidden gems they wouldn't have found if they had just tried to do something in an Excel spreadsheet themselves.
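The "speak revenue, not accuracy" point can be sketched in a few lines. All the figures here (gift size, contact cost, the probability scores) are hypothetical, just to show the shape of the translation.

```python
# Translate model scores into the business's language: expected net
# revenue from contacting the top-scored lapsed donors, instead of a
# classification-accuracy number.

def expected_net_revenue(scores, avg_gift, cost_per_contact, top_n):
    """Contact the top_n highest-scored donors; each score is the
    model's estimated probability that the donor is recoverable."""
    top = sorted(scores, reverse=True)[:top_n]
    expected_gifts = sum(top) * avg_gift
    return round(expected_gifts - cost_per_contact * top_n, 2)

scores = [0.9, 0.8, 0.6, 0.4, 0.2, 0.1]  # model output, one per donor
revenue = expected_net_revenue(scores, avg_gift=100.0,
                               cost_per_contact=5.0, top_n=3)
print(revenue)  # → 215.0
```

The same model could be reported as "62% accurate" or as "an expected 215 in net revenue from the next campaign"; Dean's point is that only the second framing answers the question the VP is actually asking.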
Ryan Atkinson: That's super interesting. So you have the model, you had the objective, you were able to present it. But how do you continuously evaluate the performance of the model and ensure it's correct? Maybe "all the time" is a stretch, but how do you continuously update the model and make sure it's right?
Dean Abbott: It's such a good question, it's almost like you've done this before. That's exactly the right question to ask at this point, because a lot of organizations build the model and then just deploy it. In this donation model, to pull the thread even more: you deploy it, and let's say every week you get a new list of lapsed donors (say a lapsed donor is someone who hasn't given in a year), so you get an updated list, you score them all, and you decide which ones to try to recover. A lot of times they deploy it and they set it and forget it, like the commercials.
But the problem is this: over time, maybe the patterns change. Or maybe, once you deploy the model, because you're capturing these lapsed donors and they're coming back and sticking with you, you start to get drift in the primary characteristics of the lapsed donors who remain. Things are just changing, and it could be something over time, like economic conditions changing. In technical terms, we say it's a non-stationary process: it changes over time. Because of that, one of the things that a lot of organizations, maybe most, don't do but should do is this:
you monitor your models. Are they doing what you expect them to do? We know what they should do; as I mentioned before, we should be able to tell the organization the revenue we expect to generate from these models. So you deploy against a list of, say, a hundred thousand lapsed donors, and we should know that we're supposed to be generating an additional, say, 200 grand in revenue from this. So what happens if we're getting less than we expect? The question is why, and the analysts can look at it and say: oh, we're not getting the recovery rate we expected, and it's because these variables are not doing the same things they used to do in the past.
It's time to refresh the models. So you take all the new data you've been getting, rebuild your models, get a new model and a new estimate of how well it will do, and deploy that. The more advanced, more technically savvy companies will do this on a continuous basis: you've got your champion model in production, and you're constantly building a challenger model. When the challenger does better than your champion, you swap them out, so you're automatically auto-tuning the way you've deployed your models. It's not technically difficult to do, but it does take a little extra time and effort to work through. If you can do that, though, your model is always going to be doing well and doing what you expect it to do.
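The monitoring-plus-champion/challenger loop Dean describes can be sketched like this. The recovery-rate numbers and the simple comparison rule are hypothetical; in practice both models would be evaluated on the same recent, held-out data.

```python
# Two separate checks, per Dean's description:
#  1. drift signal: is the deployed (champion) model underperforming
#     the rate we promised the business?
#  2. swap rule: has the retrained challenger overtaken the champion?

def choose_model(champion_rate, challenger_rate, expected_rate):
    """Return (needs_refresh, winner) from recent recovery rates."""
    needs_refresh = champion_rate < expected_rate
    winner = "challenger" if challenger_rate > champion_rate else "champion"
    return needs_refresh, winner

# Champion was deployed expecting 12% recovery, recent weeks show only
# 8%, and a freshly retrained challenger scores 11% on the same data.
needs_refresh, winner = choose_model(
    champion_rate=0.08, challenger_rate=0.11, expected_rate=0.12
)
print(needs_refresh, winner)  # → True challenger
```

Run on a schedule, a check like this is the "set it but don't forget it" piece: the drift signal tells you when to investigate, and the swap rule keeps the better of the two models in production.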
Ryan Atkinson: Yeah, I think that's really cool, the updating that you do. I like how you put it: the challenger model versus the champion model, and continuously updating that. I'm curious, across all the projects you've worked on, has there ever been a time when a challenger model came through and totally blew your mind? Kind of a wide-spectrum question, but has there ever been a moment of, holy smokes, I didn't think this was actually going to challenge it?
Dean Abbott: Maybe not exactly like that. But what I have seen is exactly that kind of scrubbing for drift, where you see conditions changing. This happens a lot with fraud detection, for example. I was building models for over 10 years with the IRS, looking at noncompliance among large corporations, and we would update models because economic conditions change. With something like fraud detection, you're trying to shut down noncompliance, the patterns associated with noncompliance that the auditors are investigating. Once they start doing that, companies get savvy to the triggering events, and if they want to be noncompliant, they'll try to find another way to be noncompliant, and something else opens up. The way I think of it, it's like air in a balloon: you're always going to have noncompliance, but if you squeeze the air in one part of the balloon, it gets pushed into another area. So if there's an easy way for people to avoid paying tax, once you close that loophole down, they'll find a new way to do it. That means you have to continually look at your models and update them so that you find those new patterns of behavior: different line items where people will park money, different vehicles for illegally siphoning off taxable income, or, when tax law changes, some things get shut down and new things, unintended consequences of the tax law, appear.
Ryan Atkinson: Interesting, and I want to stay on that same thread, because you've been the president and founder of Abbott Analytics for about 24 years. Some of the organizations you've worked with: U.S. Navy intel, the Los Angeles Times, Alaska Airlines, IBM, Teradata. You name it, you've probably worked with them. Can you share a favorite project, maybe a passion project, that you worked on with any of these? What led up to it, what you worked on, and the end result?
Dean Abbott: Yeah. One of those I just described, with the IRS, which was fascinating. But another one: you mentioned the U.S. Navy. I was pulled in on a contract with the Navy SEALs for several years, and one of the things they were looking at was trying to identify what characterizes a successful SEAL. Especially in the late 2000s and early 2010s, there were new kinds of needs they had for the SEALs.
But something else happened around that time, too, which was interesting: they had shifted the way they recruited SEALs. It used to be totally through the Navy, through a school, so these were people who had been in the Navy for a couple of years being funneled to the SEALs. Then they started recruiting guys off the street, right out of high school, and they saw rates of success dropping. Now, it's not a surprise that the rates aren't that high, because the standards are so rigorous; it's brutal. But they would have roughly a 25% success rate, something like that, and it dropped well below that, so they wanted to understand why. So we were building models along those lines, trying to identify
the characteristics of the high schoolers and the older guys that led to success. Here's the part that was interesting about it to me. Of course, one of the most important things for all of these individuals is physical fitness, and there are these PST scores: pushups, pull-ups, sit-ups, running time, and swimming time, five characteristics. There are minimum standards, but people who succeeded tended to have better scores: more pull-ups, more sit-ups, better running scores, better swim scores. But a couple of things were interesting. One is that you didn't have to excel in all of them; you had to excel in some. You had to be really good in, say, pull-ups and push-ups, and even if your running scores weren't as good, as long as you excelled in one half, you tended to do better. We were also able to quantify how many more pull-ups, sit-ups, and push-ups you needed as a high schooler compared to someone who was 20 or 21. We never were able to pin down exactly why, but I think the reason is that when you're 18
you're less mature. Everybody is being super stressed; nobody has an easy time making it through training, getting to Hell Week, making it through Hell Week. Because maturity is a factor, we hypothesized, without having the data to prove it, that the high schoolers, being less mature, needed to be stressed less physically than somebody a little older and more emotionally mature who could absorb more of that stress. So you could say, well, they needed another two or three pushups or so compared to a 20-year-old. It's kind of interesting that you could see from the data how those requirements change. Now, these are, of course, influences on average; they don't tell you about every individual. So it's good at predicting in aggregate, but it's not going to be perfect, because there's more going on than just the physical part. The psychological part, the grit and persistence and all of that, is important, and measuring that is a completely different story.
Ryan Atkinson: Yeah, and of course that's my follow-up question. Did you measure anything psychological, or intelligence? And if not, how would you even measure something like that?
Dean Abbott: We did. There's a standard academic score across all these dimensions; everyone had to take those tests, and that wasn't super predictive in and of itself. But something else we were starting to look at, and I know special forces in general have looked at this more since I was doing this years ago, but at the time it was just beginning to gain some traction: there was a researcher named Angela Duckworth at the University of Pennsylvania who was measuring grit. Can you actually measure how much perseverance somebody has? There was a score; you could actually take a test. We were just starting to look at this. If you could measure that level of perseverance, we thought you'd have a better idea of who would make it through. Or, the way I like to think of it: how does that trade off against some of these other characteristics? Because everyone needs to be physically fit,
but maybe somebody with more grit could get by with a bit less physically and still make it through. Sometimes we'd see this anecdotally. When we talked to SEAL trainers, we'd ask, can you tell me something about the people who don't make it through? And one of the things that came back several times is that the super studs in high school didn't necessarily do that well at BUD/S, the training. We hypothesized that's because, if you've never failed and had to overcome failure, it's really hard, because everyone hits the wall and you have to be able to overcome it. Just like what Angela Duckworth found looking at spelling bee champions:
the spelling bee champions weren't the smartest kids. You want to be really smart, say in the top two quintiles of intelligence, but really high on the grit scale. If you were super smart but your grit wasn't really high, it didn't matter how smart you were; you still wouldn't win the spelling bee. So it's interesting. Human psychology is fascinating from that perspective.
Ryan Atkinson: Yeah, that's amazing. I know Angela; she's the author of Grit. I will literally be reading that right away; that sounds incredible. From a quantitative perspective, measuring intelligence has always been things like the ACT, but actually being able to measure grit, I feel, goes a long way.
Because, like you said, someone who's a stud high school athlete goes to BUD/S, but they've never really failed, because they just dominated sports. I think it's a really interesting concept that you can actually measure grit.

Dean Abbott: Yes. And some sports had different impacts. The more individualistic sports, at that time anyway, tended to be more predictive of success: things like MMA and wrestling, and water polo, of course, was good. Kind of the oddball sports, not football and basketball, which a lot more people are involved with. I think part of it is just the mental stress you go through. And also, it's unusual, it's different; it doesn't just draw people into the fold for popularity reasons.
Ryan Atkinson: So we're kind of winding down here. We've talked about what has gone into the modeling projects you've worked on, which I think is super cool. I also want to talk about the future of analytics. For people who are young and wanting to get started in their careers, are there specific industries you're more bullish on for analytics that you think people should consider checking out?

Dean Abbott: That's an interesting question. I don't think there are any particular industries that are great or bad, because there's data everywhere. Every vertical is leveraging data and becoming more data driven. So some of this will depend on the individual: what things do you enjoy doing?
And then once you get into an organization that is data driven, or maybe sometimes it’s a group within the organization that’s more data driven, get in there. Roll up your sleeves and join that team, or find a way to be a part of it, and then try to make your boss and your boss’s boss look super successful with analytics. A lot of times that means biting off small chunks at a time, quick wins, rather than trying to hit a home run with analytics.
And so that’s what I recommend for people just starting off. If they’ve got an analytics background already, that can help. But even if you don’t, if you just love data and love numbers, get attached to those organizations that are becoming more data driven, data informed. It could be more business intelligence groups.
They’re just trying to report and summarize data; take the opportunity while you’re doing that to understand the [00:30:00] “why” question. Because to me, a lot of BI and reporting describes the what, but then the question is why. Why are there changes over time? And to answer that, you have to peel back the onion a layer, look at the data, and try to understand it a little bit better.
Because one of the best analysts I ever knew was a woman who worked for one of the Blues, a Blue Cross Blue Shield, doing fraud. She was just in spreadsheets all the time, but she loved data and she was just poking around. She didn’t know all the machine learning stuff, but she was just looking at data, and then she came to one of my courses.
We started talking about it over lunch, and I loved the way she thought about the data. She was really trying to understand what was going on with some of the fraud. This was, I think, Medicare fraud, and she was seeing weird things in the data. And so she was picking up how to do this at scale,
using machine learning techniques to speed up the process, because that’s really the advantage. If you look at a spreadsheet and there are, say, 10 columns in the spreadsheet you’re analyzing, you can do that by hand. But if there are 1,000, forget it. It’s going to take you forever.
And that’s what machine learning can do: tell you, okay, focus on these, this is where the game is, with whatever technique you’re using. Then you can look at those, try to understand them better, visualize them, do the data storytelling you were talking about before, tell your boss the insights you’re seeing from the data, and help your company become more successful in a data driven way.
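[Editor’s note: Dean’s point about letting a model flag which of a thousand columns matter can be sketched with a quick feature-importance ranking. This is an illustrative example, not from the episode; the synthetic dataset, the random-forest choice, and the top-10 cutoff are all assumptions.]

```python
# Sketch: rank 1,000 columns by importance so an analyst can focus on
# the handful that actually carry signal, instead of eyeballing them all.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic "spreadsheet": 1,000 columns, only the first 5 informative
# (shuffle=False keeps the informative columns at indices 0-4).
X, y = make_classification(
    n_samples=1000, n_features=1000, n_informative=5,
    n_redundant=0, shuffle=False, random_state=0,
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# Rank columns by the model's importance scores; keep the top 10
# for manual review, visualization, and data storytelling.
top10 = np.argsort(model.feature_importances_)[::-1][:10]
print("Columns to focus on first:", sorted(top10.tolist()))
```

The specific model matters less than the workflow: fit something cheap, let it triage the columns, then spend your human attention only on the ones it surfaces.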
Ryan Atkinson: That’s amazing. That’s a great answer. I love the “why is this happening,” narrowing it down and using machine learning to really hone in on that. And then just one last question for you, Dean. You have been amazing, and I have loved hearing about analytics from you and the projects you’ve worked on.
So this has been great. Just one last question: what general career advice would you have for anyone who wants to stand out and do well in their career? [00:32:00]
Dean Abbott: Thank you. I alluded to this a little before. From a career trajectory standpoint, when you’re starting with a company, if you’re interested in data and trying to improve in analytics, I would say there are two prongs.
One is your own self-learning and understanding, and I’ll get to that in a moment. But the second part is just as important: interview your boss, understand who you’re going to be reporting to directly, as much as being prepared for your role. Because I hear such terrible stories about people who position themselves well, but their boss doesn’t.
And I don’t mean just in terms of mentorship, but recognizing talent, giving you freedom and flexibility to do something interesting and creative, while understanding, of course, that you’ve got to deliver for the company. It’s not like being a grad student who can do whatever you feel like doing, but a good boss will help you become better and improve.
So, find a good group to be involved with. And if you’re in a group in the company where you’re being stifled, or there’s a lot of politics going on, and you can move out of that group into another group that’s more affirming and positive and will let you thrive,
that’s tremendously important. And then from your own standpoint: read, read, read, and watch videos too. There’s a science to machine learning, which you can get from any number of books, or you can go on Medium and find some decent material, or take Udemy courses or other things. But as much as that’s valuable, it’s also valuable to talk to people who have done it and rolled up their sleeves, because there’s an art to it
as well as a [00:34:00] science. And the art doesn’t get written down as much, and the art is tremendously valuable: what do you do when you hit this problem or that problem in the data? How can you triage your analysis so you can be effective in delivering that model or that analysis next week, not in a month?
Ryan Atkinson: I love that. Well, Dean, thank you so, so much for joining us. It’s been an awesome episode about all things analytics. We really appreciate it.
Dean Abbott: Oh, really enjoyed it. Thanks so much for having me.