162: Innovating With Data Analytics with Peter Savich
Data Analytics = Things You May Not Know About Might Live In There
Today we delve deep into data: how data can give you insights you never knew about, how the human touch is still important, and whether massive data can answer every question.
Hi, my name’s Chris Kalaboukis, and once again we are coming at you live from deep in the heart of Silicon Valley, California. This is show #162, and today we have a pretty special episode for you. We’re talking to our expert on data analytics, Peter Savich, and he’s going to regale us with many tales of how you can use data analytics to uncover interesting new insights that can help you drive innovation. So Peter, let’s hear a little bit about your background.
Thanks, Chris. I guess my background for this work starts many decades ago. I was a computer scientist in college, and I did that kind of work early on, in my twenties. Then I went to law school in my mid-twenties, and I was a lawyer for about ten years, in intellectual property law.
The best kind.
Yeah, the best kind, right? Innovation and patents and all kinds of different things in intellectual property. After that I jumped into the Internet and had certain business roles. And then I’ve spent the last ten years in consulting, using I guess all three of those skill sets: computer science, law, and business. The one service that we’re talking about here, data analytics, absolutely uses all three, and that will be clear from telling you how it’s come together.
Over the past few years I’ve served a number of clients providing these services, and one class of problems that I’ve dealt with involves invoices. You’ve got a big company, and it’s using outsourcers for certain types of work, and then the outsourcers are submitting invoices. And I was thinking about it.
Why data analytics? Why is it at the top of the client’s mind today? And I realized it’s because in the old world, invoices were submitted on paper, by mail, and somebody in the receiving department would get it in writing, or by fax, remember?
Oh yeah, absolutely right.
And yeah, they get faxed, then they get filed in the paper files, and they sit there and they grow, and then they get moved into a back office when they’re older. And if you had any questions about the invoices and you wanted to do analytics on them...
There was just so much paper. I mean, there was the myth of the paperless office. We’re here today, but it’s just taken forever for us to get away from paper. You’ve experienced it yourself in law offices: there’s still tons of paper, there are still huge boxes of paper, and none of this stuff has been digitized. So there are plenty of opportunities for things to get better there. But you’re right, it’s only recently that we’ve seen a lot of this data even being available digitally, so that you can run any kind of analytics.
Right, and lawyers are great examples, because they’re going to be the most conservative and hidebound of practices, but their own clients have dragged them into the modern world. So one of the classes of invoices I’ve been looking at actually was from law firms, and their clients said to them, “Look, if you’re going to bill us, we want you to use this software that’s sitting on the web, on the cloud, and put it into this format. That’s what we want.”
You’ll put OfficeMax and Staples out of business, because they keep selling those file boxes.
I know, I know. So these lawyers have been dragged into the twenty-first century by their clients, who are saying, “OK, no, I’m not paying you unless you submit it through SaaS.” There’s some work I’ve done for clients where I have my own invoicing system, and it’s SaaS software, and I generate invoices on it. But then my clients say, “No, we’re not paying you unless you use this other stuff.” So then those two systems don’t talk. But I had a lucky stroke recently, because the client said, “Look, I don’t want you to enter text or line items in our invoices, just give us a PDF.” So that’s great: from the system I use, I can export a PDF and just upload it to any other system. This is the world we’re living in now. But lawyers are constrained to use these software systems that are SaaS, right, they’re on the web and you log in. And lawyers have always billed the same way: they put in time entries. “I worked for three point two hours on this matter, and here’s my description of my work.”
Yeah.
And that now being online led to some pretty interesting questions from my client. One theme of it is this. You have people in your company, and they’re working with the outsourcers. So you’d think that their instinct and knowledge would be as good as you need it to be, because they can tell you what’s going on with the outsourcers, right? But I recall a project I had with a certain client involving certain deals this company was doing. The people I was doing the data analytics for were the legal team, who were using the outside lawyers.
Those lawyers were working up the deals, right, so they were the outside counsel who would work on these deals. And there were, I want to say, thirty to fifty deals, something like that, depending on how we counted and which deals we wanted to look at. What was interesting was that billing by these outside firms per deal ranged from, I think the numbers were somewhere around fifteen thousand at the low end, and this is the total amount that the firms billed for the deal, all the way up to three hundred fifty thousand. It’s an order of magnitude higher, and then everything in between. And the client was getting pressure from upper management, saying, “Why are you spending so much money, especially on these deals where you’re bleeding out money like crazy?” So what the manager did was ask the two people who were working with the outside lawyers, who were managing them on these deals, to opine on what they thought were the reasons for the variance in billing. And they wrote a two-page report with, I counted, something like twenty-seven different possible explanations. It was almost like a law school test, where they’re just coming up with every different idea they possibly could. And of course, what does a client do with twenty-seven answers? Well, nothing, right? There’s nothing you can do. So you wonder, actually, whether they gave so many answers on purpose.
And you wouldn’t know which way to go. With that many choices it’s tough to make a decision; the more choices you have, the harder it is to make a decision.
Absolutely right. I can’t really opine on the motivation of the people who did that. In the report it would have been tougher if they’d just said, “Oh, it’s going to be this or that.” But it was comprehensive. What was really interesting, the punch line, was this: I was brought in then to say, “Well, OK, can you use the data and uncover this answer?” So I pulled in the data, did some regression analysis on it and correlations, and we came up with a model in which, I think, five factors best explained the billing. Of these five factors, three were in this two-page report of twenty-seven, so they were right on three. The fourth was arguable, whether they captured it or not; it was kind of like, well, they sort of got that one right. And the fifth they completely missed; it was not in the twenty-seven.
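As a rough illustration of the kind of correlation screening described here, the following minimal Python sketch ranks candidate factors by how strongly they track per-deal billing, then compares them with a list of guessed explanations. It is only a sketch under stated assumptions: the deal records, the factor names (`num_parties`, `weeks_to_close`, `partner_share`, `travel_costs`), and the 0.8 cutoff are all invented for illustration, and it does not reproduce the regression model or the real invoice data from the project discussed.

```python
from math import sqrt

# Hypothetical per-deal records (billing in dollars); every value is invented.
DEALS = [
    {"billed": 15_000, "num_parties": 2, "weeks_to_close": 4, "partner_share": 0.5},
    {"billed": 40_000, "num_parties": 2, "weeks_to_close": 6, "partner_share": 0.2},
    {"billed": 60_000, "num_parties": 3, "weeks_to_close": 5, "partner_share": 0.4},
    {"billed": 90_000, "num_parties": 3, "weeks_to_close": 9, "partner_share": 0.3},
    {"billed": 120_000, "num_parties": 4, "weeks_to_close": 10, "partner_share": 0.5},
    {"billed": 200_000, "num_parties": 5, "weeks_to_close": 14, "partner_share": 0.2},
    {"billed": 280_000, "num_parties": 6, "weeks_to_close": 18, "partner_share": 0.4},
    {"billed": 350_000, "num_parties": 7, "weeks_to_close": 20, "partner_share": 0.3},
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_factors(deals, target="billed"):
    """Rank candidate factors by absolute correlation with the target."""
    ys = [d[target] for d in deals]
    factors = [k for k in deals[0] if k != target]
    scores = [(f, pearson([d[f] for d in deals], ys)) for f in factors]
    return sorted(scores, key=lambda fr: abs(fr[1]), reverse=True)


ranked = rank_factors(DEALS)

# Explanations a (hypothetical) team guessed from intuition.
guessed = {"weeks_to_close", "partner_share", "travel_costs"}
# Factors the data supports, using an arbitrary illustrative cutoff.
supported = {f for f, r in ranked if abs(r) > 0.8}
confirmed = guessed & supported   # intuition and data agree
missed = supported - guessed      # in the data, absent from the guesses

for factor, r in ranked:
    print(f"{factor:15s} r = {r:+.2f}")
print("confirmed by data:", sorted(confirmed))
print("missed by intuition:", sorted(missed))
```

A screen like this only flags candidates. Fitting a regression over the surviving factors, and checking them against business context, would be the natural next step.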
That was an eye-opener for the client. This was the first time they’d actually applied data analytics to this area, and they were like, “Oh crap, we can’t trust the intuition of the people who’ve been working on this.”
Well yeah, it sounds like, basically, if you have twenty-seven reasons, they just sort of sat in a room and brainstormed twenty-seven without really doing any kind of research, or maybe throwing in a little bit of research but not really doing the deep digging that’s required to provide the real answers. And that’s really interesting, that there was so little overlap between the twenty-seven and what you came up with. Very interesting.
And this story is common in all the projects I’ve done. There’s always a disconnect between the human intuition and the data, because as humans we have a recency bias, we have all these biases, right? And the data has no biases. The biases only come from ourselves. Before I jump into something I’m doing with another client, I want to just say one thing about why it’s important that I have all this experience as a lawyer and then as a businessman. Why can’t I just be a computer scientist or a statistician and do data analytics? The reason is that the business questions are what drive the impetus to do the data analytics. We’re not doing this stuff for technical interest or academic purposes. There’s a pressing business question; people’s jobs are on the line, people’s reputations are on the line. But you have to go from the business question that the client has given you, and not just the literal question they give, and understand what is behind it.
Behind that question, who is driving it? Why? What’s the context? Who are the stakeholders in this? Who wins and loses?
Yeah.
For all those questions, you have to have lived in the business world, absolutely, to get it. And this is what you have, Chris, right? When you’re staring at the data and asking, “Well, do I look at it this way, do I look at it like that, what questions do I ask of the data, how do I poke it?”, it’s informed by what you have and what I have, which is experience in the business world. I know that at twenty-two I could have done the statistics required, probably faster than I do now. But I would have been flying blind with the data. I would be coming in like a robot that really didn’t understand the world of humans and how humans work. And so this is why, even though it’s a data analytics service, you can’t do it without the business acumen.
Oh, absolutely. It’s like asking the right questions, knowing which questions to ask. I’ve had startups come to me and go, “Here’s this really great idea that I’m trying to put out there, I’m trying to patent it, and I want to know what to do with it.” And it’s kind of like, hold on a second, let’s add some business filters to this, to see if this thing is actually viable, to see if this thing is actually going to go anywhere. There are just so many things that we can appreciate, where a startup founder who’s new to the business would be like, “Oh yeah, I want to move forward with this idea,” and there are so many things they haven’t thought of that we would be able to provide. It’s just immense.
Yeah, that’s exactly right. And this is where we’ve earned the gray in our hair.
And how valuable that can be in these times, right, where you need to have not just a point skill but a broad-based, diverse array of skills to be able to answer this. One more interesting thing about how we moved from paper to this, and then we have the story of surprise, of data defeating human intuition. There’s one client I’ve worked with where, over the course of a year, I did three different kinds of projects like this for them, and they all had the same effect on this client, which was, “Wow, data analytics is fantastic, it’s telling us stuff we didn’t even guess, and telling us in a really rigorous way.” What they did next was interesting, because I don’t think they would have done this absent, well, anybody coming in perhaps, but certainly me coming in and doing this work with them. What they did at the beginning of twenty seventeen was take a guy in their department who was running one area of their work and make him the data analytics king of the group. Now, he’s not a data analyst, but he loves it. He loves data, and he has creative thoughts about all kinds of things you might want to check with the data, all the way from the invoices and classification systems to the building visitors. They’re in a highly secure building on this company’s campus, and when you come in through the door, there’s not just what I would call a gate with a card where you go through. You’ve got to give your driver’s license, tell them who your company is, I mean everything. They’re highly secure. They all know who you are, right? And they could call the DMV.
It’s, like, extreme.
It’s extreme vetting. And so what’s interesting is that this guy is interested in looking at the visitor data to see if the data can uncover any patterns of who comes when, and where, and how, right? I mean, are competitors sneaking in through here? So the reason I tell this story is to say that in this data analytics world, it’s still new enough that this group didn’t hire a data scientist, because they decided it was really expensive. And the other problem with data scientists is, if you hire a young guy, there’s so much coaching to do with that kid, telling him what you’re interested in looking at and why, and all that. So instead what they did was get a non-data-scientist who is just a creative thinker on problems that they want to solve, and he turns out to be my main point of contact. And there is kind of endless work in that way over time, because you have a person who’s the clearinghouse for this large group for any data questions, and then I happen to be the main person that he calls for the work. Over time I could see him getting a farm system.
So can I ask you a philosophical question about data?
Please do.
I have this theory that if we are able to collect enough data, if we have enough refined data from all different sources... So we have tons of data now, but we don’t really have refined enough data, or enough data. If we can collect enough data and apply it, would we be able to answer almost any question?
Yeah, I mean, in theory, right? In my view it’s a data collection problem. Because, just with the topic that we had in this conversation, we’re graybeards, or at least I’ve got gray in my hair, and you do too.
You and I have intuition born of experience that we’ve forgotten consciously, but it’s subconscious, right, and the subconscious is guiding us through this. So it would have to be a massive trove like that. And you can see in the world that this is starting to be done with web data, which is becoming this massive trove. Right now it’s a nexus between the fears of the security state and the advertisers. But in all of that, I think we’re certainly getting to a singularity, certainly in shopping behavior, like when and how exactly people are going to shop. I think we’re getting close to that, because there’s getting to be enough data to predict down to, I mean, once the watches start really picking up my glucose data and whatever, and they’ve correlated that with what I do with my wallet.
Yeah.
I mean, sure, we’re getting closer and closer to the advertiser being able to call me exactly at the moment that I’ve got a hair trigger on my wallet.
Sure.
I mean, I think we are heading to that in certain areas. But like I’m saying, that area is where the effort is.
Absolutely. I mean, that’s where all the effort of data collection is going right now, right? There’s security, and then there’s Fitbit and trying to get health data. But the idea of someone trying to sell you something and getting you the offer at the right place at the right time, that is the holy grail of advertising.
Yeah, and there’s so much startup work there. And so when I say I do data analytics, people say, “Is that big data?” I say, sure, it doesn’t matter the size, small, big, or whatever. But what is business big data? It’s that data that I’m just talking about. It’s web traffic, to figure out who my next customer is, and when that guy comes onto my site, what do I know about him or her, and how can I...
Advertising, business development, all of this stuff is going to be revolutionized by the amount of data, because we’re all spitting out so much data. We leave these trails in the online world that can so easily be mined. And being able to, like you say, find that exact moment when that person is ready to buy, I can’t wait for that to happen. We’re working on something like that right now. Because it’ll basically declutter the Internet of the messages that are all over it. I mean, ninety percent of what we’re getting on the Internet is sales messages, right? So if we could just figure out some way of targeting the consumer down to the moment that they really want to buy, and only send them a targeted message at that moment, then we could get rid of ninety percent of the spam and other crap that’s on the Internet.
That’d be great, I would love that. It’s true. And it’s going to be an even more subtle issue, because there is a human thing where you might not consciously register your own itchiness to buy, but if you were to see the ad at that time you might be more receptive to it. So you will almost never get rid of the advertising, because that’s one of the dynamics the data can’t pull out: your interest, until you actually see the ad.
And that’s what I was trying to say. If we have enough data, if we have massive amounts of data, data to the point where, oh, I know that guy’s looking at this ad, or I know that guy is walking across the room towards that showroom, see what I’m saying? If the refinement is fine enough, would we be able to? And the reason I’m thinking this is I’m kind of extrapolating from autonomous vehicles, because a lot of the reason why autonomy works so well is that they know so much more than a driver, right? Google’s autonomous vehicle has the ability to instantly tap into not just the sensors in the car but all the sensors that are around: satellite views, traffic, all of this stuff. So Google’s autonomous vehicle, or any autonomous vehicle for that matter, should know more than a typical driver, and that’s why it can be as good as a human, of course, and sometimes better than a human.
I get it, I get it. But what would be missing is motivation. There are constrained rules in driving: the lanes you’re in, the direction, stop signs, I mean everything. There’s a finite set of rules of driving, right? So that is a constrained problem, where, yeah, you could actually reach
a sufficient data place. But human motivation has many orders of magnitude more variability.
I know. It’s just an extrapolation, I’m extrapolating.
That’s right.
Let’s say if we have enough data, then you’re right, we’re going to get to some kind of data singularity, like you call it. And when we get to that data singularity, we’ll know enough. We won’t know everything, but we’ll know enough to be able to do that pinpoint targeting, and not just pinpoint targeting to sell things, which is of course the first thing that we’re going to do. With pinpoint targeting you determine the intent of the individual, so we can sort of reconfigure reality around them. So I’m walking into the living room, and it knows I’m walking into the living room, and it turns the light on before I get there. Even stuff like that. It seems to me that once we hit that data singularity, a lot more cool things are going to happen.
Yeah, I think that’s a fair statement, and like I’m saying, particularly in the constrained areas. It’s interesting, I like the fact that in shopping that effort is being put in, because that is a very tough problem. It’s not like driving; it’s a much tougher problem. And so it’s great, right? It’s a high bar for data.
What I’d love is to have the day when all my shopping is automatically done for me, because the system knows what I want and delivers it exactly when I want it. So if we could just get that, yeah. But I know we’re shooting off into the future here. Maybe you can tell us a little bit about when you’ve had a situation, other than what you talked about earlier, where
there was some sort of gut feel that things were going in one direction, and when you did the data analysis you came back and said, “Wow, this is completely different from what we originally thought.”
Yeah. In terms of complete difference, I haven’t seen complete difference. But for example, a recent project I’m working on involves a classification tree. The classification tree was organically grown by multiple different people, and if you think of it as a fruit tree, it’s got some very sparse branches and some other branches that are laden with so much fruit that the branches have broken over. There was some thought from the people, saying, “Well, can you dive into this data, like who’s classified what and when, and then analyze the tree and see what we might do?” And I guess they had a sense that it was just going to require, back to the analogy of the fruit tree, a few guys with some shears to prune the tree, right? And no, what the data is showing is that we’re going to come in with chainsaws. If you go to Greece, you see the way they do olive trees.
Yeah.
I mean, you cut the tree three feet off the ground. You just cut it at the trunk and it grows back, right? Whereas with an apple tree, and we have apple trees at our place, they’re carefully snipped at the edges. So I guess that’s the closest I’ll say to a surprise, where they thought they had an apple tree but they really have an olive tree.
So there are certain trees that you don’t do extreme trimming on, right?
Oh yeah. With a lot of them, like pear trees, and we have pear trees and apple trees, you can’t just lop off a major branch and then deal with that. There’s a lot to be done. But I was amazed in Greece: the best thing you can do to olive trees is, like, set fire to them. The growth the next year is extraordinary. If I did that to every one of our pear trees, or plum, or persimmon, whatever, they’d be dead. So every tree has its own tolerance for extreme events. And this is just an example. I wouldn’t say it was a reversal, but it was a surprise at how drastic the cuts were that the data was showing were needed.
So conceivably you can do that sort of thing once you dig into the data in more detail. Because you could have your C-suite or senior leadership saying, “OK, we’re going to do X, Y, Z, because this is what we feel we need to do,” and then you go in and do your analysis on the customer data and it’s like, “Wait a minute, they actually want A, B, C.” So that’s conceivable.
Yes, I think that would happen on the customer data. I haven’t done too much on that, because like I’m saying, that is the area, customer data and especially web traffic, that the big data companies are really looking to do.
Yeah.
And I see a lot of, I don’t want to say the word schlock, out there. But what I’m seeing is, for example, there’s a class of data analytics
companies, web-based, and what these companies do is give you a set of tools. You bring your data, you use their software, and they give you the tools so that you can roam through your data and ask it questions. It’s what I do, right? I get your data, but I’m on my machine looking through it and asking questions. And my thought about that is the following. My father-in-law was a house builder, and he built houses for fifty years or something like that. I remember helping him when he was renovating our house when we bought it. We were in the attic and he had two tools with him, but he needed a third tool. I can’t remember if he needed the hammer or the screwdriver or whatever, and I said, “I’ll just run down and get your tool from your chest on the bottom floor.” He said, “No, don’t worry,” and he took this tool, I can’t remember what it was, and he used it like the other tool, and it just worked. It worked perfectly. And I was amazed. I was just like, crap, I would never have figured that out.
That’s interesting, this idea of using the right tool for the job. Because that’s what you always hear, right? You always hear, “OK, use the right tool for the job.” But it sounds like the opposite: you’re actually more effective if you’re an expert. If you’re an expert, you can do all kinds of magic.
So here’s the thing. Just to use the tools on these new sites, you have to know something about data science, right? You have to know how to use them. My father-in-law is retired now, and he’s given me some of his tools, and I can use a couple of them, but the other ones, I don’t even know what they’re called, or how to use them. That’s my clients with regard to their own data, right?
So I don’t know who the customer is for these companies, but their customer is probably somebody who knows data science. And I’m saying, if you know data science, you’re already like me: you’re running Python on your system, or whatever, and you’re doing it yourself. You’re not using a friggin’ web-based point-and-click system. So I’m saying this is early days in data, and I’m kind of like a mercenary coming in there with my tools, like my father-in-law the house builder, and my clients are like, “What the hell are those tools?”
That kind of harkens back to your point earlier about having that business knowledge. I was actually going to go down the path of comparing the differences between human learning and machine learning when it comes to looking at the data. You can apply all those years of business experience and look at the data and go, “I see, I know exactly what’s happening here,” whereas you can spend tons of time doing some machine learning and throw some algorithms at it and you still won’t get the same insights. So we still need that.
True, yeah. There’s this interplay between the experience and the questions. You send the data down this path, and then it either comes up with something interesting or you back up. So there definitely is a driver there, a driver of the tools, and what’s in the driver’s mind is very, very important to answering a question. So we’re not at the point where the driver can be a robot, unless it’s a driver in a car.
So this stuff absolutely requires a human touch, at least for now.
At least for now, yeah.
That’s great. Well, on that note, thank you, Peter, thanks for talking with us. It was a great conversation. I love the philosophical bent myself.
You too. I mean, I love it, and you know that.
Well, thanks a lot. Thank you. Thanks for listening, and we’ll talk to you next time. Until then, don’t forget to thinkfuture.