Measuring the impact of AI on software engineering – with Laura Tacho
What happens with the time saving as a developer? What would that mean for me as a developer?

>> When DORA researched this question, what they found was that many developers were actually feeling less satisfied, because AI is accelerating the parts that they enjoy. And so what was left over was more stuff that they didn't enjoy: the toil, the meetings, the administrative work. It gave me pause when I read that.

>> What is the actual impact of AI tools on software engineering? Laura Tacho is the CTO at DX, a company with a mission to measure developer productivity with data, and one that has been doing so since before AI tools went mainstream in 2022. Today we discuss: why most of the hype in the media about AI gets things wrong thanks to oversimplification, and why the burden is on us engineers to set the record straight. The actual data on the impact of rolling out AI tools for development at companies Booking.com and Workhuman. How developers report that their most time-saving use case is not actually AI code generation, but debugging tricky stack traces, and doing it faster. The paradox of AI tools: how using AI tools to help with coding can make developers less satisfied with our jobs, because we actually like to code. And many more interesting topics. If you're a lead or an engineer interested in data about what works today with AI, and how to stay grounded amid all the hype in the media, this episode is for you. If you enjoy the podcast, please subscribe to it on any podcast platform and on YouTube.

All right, so Laura, welcome to the podcast.

>> Hey Gergely, good to see you again.

>> So to kick off: one thing I hear a lot, and I see it as well, is how AI is, has been, and remains overhyped. When I look at media headlines, some of them frankly sound ridiculous. Some are a bit scary for us as developers, and some just feel over the moon. Let me read you a few. Here's one from Forbes: "Are coders' jobs at risk? AI's impact on the future of programming." From CIO magazine: "AI coding assistants wave goodbye to junior developers." From Gizmodo: "OpenAI just released a coding tool to, quote, help programmers, in brackets, replace their jobs. Software development may never be the same." Now, these aren't publications that software engineers would consider authoritative on tech, but decision makers read Gizmodo and Forbes, and these articles often get forwarded to developers, or you just come across them. What is your take on what we're hearing from mainstream media? And as you're talking with developers and engineering leaders, what are they telling you about these?

>> Yeah, I mean, these headlines are headlines for a reason, right? They get clicks, they get engagement, they're sensational.

>> They're ad-supported media, usually.

>> Absolutely. And I think, in general, media literacy is really important, data literacy is really important, and this is no different. So whenever I see a headline like that, I always trace it back to the money. Who's getting paid? As you said, it's ad-supported media. Who's being covered? Are they a vendor? Are they selling an AI tool? I ask all of these questions. You should be asking all of these questions as well.

>> Are they paying for coverage? Sometimes they are.
And even when they're not paying, they might be paying PR agencies who pitch these ideas to magazines, and the magazines, which need to produce output because they're paid by ads, will often publish these shallow articles that are essentially pre-written for them. That's a fun fact, by the way.

>> I think there's a challenge with AI, which is that simplifying something to the point where it can be understood by someone who doesn't have a background in development can oversimplify it to the point of being incorrect. One recent example I read was actually in the Wall Street Journal, and it talked about how companies have "AI employees," and these AI employees had "line managers," I'm using a lot of air quotes here, and company credentials. Well, sure. You could think about Copilot, or Claude, or any agentic workflow as having a line manager: the person who's dispatching the work, or verifying the work.

>> Whoever instantiated the agent. Yeah.

>> Yeah. Or company credentials: does that mean it has access to commit on GitHub? Does my Dependabot have company credentials because it can open a pull request? So yes, I could see that in a journalistic world that is accurate, but is it really a reflection of what's happening? Is there an engineering manager out there hiring AI agents as employees and giving them company email addresses? That's just not really what's happening. And I think the oversimplification can be really sensational, and everyone wants a piece of the AI hype right now. So those are the things that I think about when I read this. I think it's really unfortunate, because it puts a big burden on engineering leaders to educate their business counterparts, who just don't have the background knowledge and experience to understand what is authentic and what is not, what the real limitations of these tools are, and what is possible now. And it is our job, like it or not, to be the person who can translate and explain that, because ultimately it hurts us and it hurts developers when we don't. Nobody wins in this hype cycle. Developers think, "This is super gimmicky." That's another reason for lower adoption: "This is super gimmicky. There's no way this is going to work as well as it says." They try it once, it gives them spaghetti code, and they decide it's just a load of BS. And then on the other side, the executive side, there are CEOs saying, "Hey, I heard that Microsoft is writing 30% of their code with AI. Why aren't we doing that?" Those headlines suggest that 30% of Microsoft's code that's running in production was authored by AI. That is not at all realistic. We don't have data to support that from any of the companies we work with, and that's hundreds of companies; we've never seen data consistent with that kind of sensational claim.

If you want to build a great product, you have to ship quickly. But how do you know what works? More importantly, how do you avoid shipping things that don't work? The answer: Statsig. Statsig is a unified platform for flags, analytics, experiments, and more, combining five-plus products into a single platform with a unified set of data. Here's how it works. First, Statsig helps you ship a feature with a feature flag or config.
Then it measures how it's working, from alerts and errors, to replays of people using that feature, to measurement of top-line impact. Then you get your analytics, user account metrics, and dashboards to track your progress over time, all linked to the stuff you ship. Even better, Statsig is incredibly affordable, with a super generous free tier, a starter program with $50,000 of free credits, and custom plans to help you consolidate your existing spend on flags, analytics, or A/B testing tools. To get started, go to statsig.com/pragmatic. That is statsig.com/pragmatic. Happy building.

This episode is brought to you by Graphite, the developer productivity platform that helps developers create, review, and merge smaller code changes, stay unblocked, and ship faster. Code review is a huge time sink for engineering teams. Most developers spend about a day per week or more reviewing code or blocked waiting for a review. It doesn't have to be this way. Graphite brings stacked pull requests, the workflow at the heart of the best-in-class internal code review tools at companies like Meta and Google, to every company on GitHub. Graphite also leverages high-signal, codebase-aware AI to give developers immediate, actionable feedback on their pull requests, allowing teams to cut down on review cycles. Tens of thousands of developers at top companies like Asana, Ramp, Tecton, and Vercel rely on Graphite every day. Start stacking with Graphite today for free and reduce your time to merge from days to hours. Get started at gt.dev/pragmatic. That is g for graphite, t for technology, dot dev, slash pragmatic.

>> At Google they made a similar claim a few months ago, I think 25%. And then I talked with someone at Google, and they brought it up: yes, they have all these AI integrations, AI code review, the autocomplete. As usual, and you know this because you also talk with Google, internally they have the whole stack that's available to anyone using Cursor and similar tools, just the internal version, trained on Google's data. And this engineer was saying, "I'm pretty sure they're counting accepted completions as AI-generated." But that just seems weird, because yes, I accept a completion when it makes sense, but I'm reading it and reviewing it. And they said, "We don't know where this number comes from. Who's measuring it?" They weren't even told. And is an accepted completion really AI-generated code? Well, technically yes, but even before AI we could have said a lot of our code was machine-generated, because autocomplete has always been very good at predicting what comes next when you start to type the first two letters of a class name. Was our code machine-generated? Technically, yes. So yes, I agree it's confusing.

>> Yeah. And I think your point about acceptance rate is exactly it. A lot of the studies producing these numbers for headlines are looking at accepted suggestions, and that's just not a great measure of, first of all, the business impact, but also of whether that code even made it to production. There is no straight line between "I accepted the suggestion" and "now it's running in production." So that's quite misleading. We could say that 30% of pull requests are being assisted by AI; I think that is probably true at a majority of companies.
And that's just a really different magnitude of influence than 30% of code being written by AI. We can also say that 100% of PRs at most companies, or at least most larger companies, have been robotically checked, right? The linter has been run on them, static analysis has been run on them. This has been going on for years, and it catches all the obvious things.

>> I mean, can you imagine that headline? "Acme Corp only ships code to production that's been read by robots. Is this the end of software engineering?" We could certainly come up with a sensational headline just to describe CI/CD, as you said.

>> You talk with a lot of engineering leaders, tech leads, those kinds of folks. These days, what are some of the most common questions you get related to AI, and what kind of sentiment are you gathering about AI from these same engineering leaders?

>> Yeah, I think the most common question I get is: what should I be doing? As engineering leaders, we have operated in a space where we can pattern-match on a lot of things. If I'm trying to modernize my CI/CD pipeline, I can go talk to another customer of a different tool, see what they're doing and how they've modernized, and look at their before and after. With AI, we just don't have that, because it's frontier work for everyone. And I think that's very exciting, but also very distressing when you're the one holding the bag of money and you have to figure out where to spend it. That's really tricky. So: what should I do? The other question is: how do I measure it, and how do I prove that I've made the right decisions? Because that is something every engineering leader is being held to account for by their exec team and their board. How are you investing in AI, and can you show me the results?

>> Tapping exactly on this: how can we actually measure the impact of AI? In The Pragmatic Engineer we did some deep dives, one of them with you, on how we figured out all the things that don't work, like lines of code or single metrics, and started to make progress with things like the SPACE framework and later the DevEx framework. What are measurements here that work, or might work, for developer productivity, or for measuring the efficiency of AI? What have you seen work out?

>> Yeah. What's so tricky is that, as you said, developer productivity is a really hard problem on its own, and when we add AI on top of it, we certainly don't reduce complexity. So companies that had invested a lot in understanding developer experience and developer productivity are in a better spot right now to understand the impact, because they have that baseline understanding of how their teams and organization operated before, and then we can take an experimentation-style approach and look at what the impact of AI is. For any leader out there feeling a bit lost in the forest, not quite sure what measurements to even look for when it comes to telling the story about the impact of AI: Abi, the co-founder of DX, and I have just put together a new AI measurement framework, which I'll share on the screen so we can talk through it. This is the DX AI Measurement Framework.
What we recommend, based on our field experience working with hundreds of companies that have used AI from its infancy, the very beginning when AI was a glimmer in everyone's eyes, through to full-scale rollouts where they're seeing some pretty impressive results, is to look at it across utilization, impact, and cost. These are the three areas that, together, will give you a complete picture of how AI is working or not working, what you should do next, and how you can tell the story about impact in your organization.

>> Yeah. Because, in the end, everyone's looking for impact, right? It should result in something tangible. Am I reading this right? As a software engineering organization, either you're building more stuff, building better stuff, or generating more revenue. If it doesn't help with any of those, or some related things, then what am I even doing?

>> Yeah. And I think, as an industry, we've looked for output metrics to quantify that end result, and at the beginning, and actually in a lot of the headlines, it was about the quantity of code that can be produced with AI. But this is really disconnected from everything we know about developer productivity and developer experience. Quantity of code doesn't actually mean business impact. So when we think about measuring the impact of AI, we need to track it across the life cycle, but also stay focused on the end result, which, as you said, is more revenue, reduced cognitive load for developers, a better developer experience, more time to innovate. These things are really important, and focusing on something like acceptance rate by itself isn't going to tell you the whole story.

>> I wonder if we're going to have a bit of a speedrun of what we've learned about developer productivity over, let's say, 20 years, compressed into a few years. I still remember that when the first developer productivity measurement products came out, they started by measuring things like lines of code per developer, and then number of commits per developer on average. The first products said, "This is good, look at this," and then as an industry we started to say that's BS. I'm sorry, but the developer who pushes the most code to production or writes the most lines of code might not be your best developer; they might just be doing boilerplate work, updating frameworks. Actually, they might just be fixing their own bugs because they ship so many of them. We had that conversation maybe 10 years ago, and everyone agreed that lines of code is not the best metric. In fact, some of the best developers sometimes don't even add lines; they delete them. But now we're back here: "Oh yeah, AI generates a lot of lines of code, therefore it must be productive."

>> Yeah. And I think one of my more controversial opinions is that source code is a liability. It sounds controversial, and then when people think about it, they realize that it actually is. And now we're in a world where it is trivially easy to produce a tremendous amount of source code.
And so what does that actually mean for productivity and business impact, when what could have been written in one line is now written in five lines? Do we really want to measure AI impact in terms of lines of code generated? I certainly don't. We don't recommend it. We did not include acceptance rate in our framework, for good reason. I think it does give insight into whether the tools are fit for purpose, but when we're broadly measuring business impact and the impact on developer experience, acceptance rate is just such a tiny part of the story.

>> And by acceptance rate, you mean what percentage, or how many lines, of the suggestions developers accept, the tab suggestion or whatever the AI is spitting out?

>> Yeah. We can use that to figure out whether it's just spitting out spaghetti code: if the suggestions are not accurate, acceptance rate is going to be low, and we can use that as a signal that these tools are not sufficiently robust for the use cases. But if we're just looking at "developers are accepting 95% of the suggestions," that doesn't really tell us anything in terms of: is it increasing their velocity, is it saving them time, is it going to help us innovate faster? Those are the things we actually want to look at, and that's what we've included in this AI measurement framework, not just the granular measurements of acceptance rate or lines of code.

>> I'm glad to hear a more grounded approach, and I guess it helps that you're not an AI vendor; your goal is to figure out what actually works. Now, you did an interesting research case study with Booking.com and how they use AI. What did you find there?

>> Yeah. For Booking, what they really realized was that adoption is the key to getting a better result. They found that the developers who were adopting the tool, going from non-users to periodic but consistent users, and moving the population up from not using it into daily and weekly usage, were where they saw the most benefit. So they made some very concerted, organization-wide efforts around enablement for adoption: things like office hours, workshops, and trainings. And they got their adoption up to 65% of developers using these tools on a weekly or daily basis, which is well above the median, which is 50% industry-wide, while the top quartile is 60%. So they're doing quite well according to that industry benchmark.

>> So, just to pause here so I understand. Booking said: we'd like as many devs as possible to use these tools, GitHub Copilot or whatever other copilot or chat tool they had, let's say on a weekly basis. We're going to do training, leadership will say "please use it, or at least try it out," they did office hours, they had teams on this. And even after doing all of that, about 65% of devs use it weekly, maybe daily, but mostly weekly. So that means 35% are still saying, "No, I'm good. I'm just going to do what I did before." Right?

>> Yeah. For a variety of reasons, right? But what's interesting is that 65% is still above the P75 industry-wide.
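To make the utilization measurement Laura describes concrete, here is a minimal sketch of the calculation in Python. The engineer names, license list, and usage events are invented stand-ins for the real license inventory and telemetry a platform like DX would pull from. The point it illustrates: the same usage data gives a very different adoption picture depending on whether you divide by all engineers or only by those who actually have a license.

```python
from datetime import date, timedelta

# Hypothetical inputs: which engineers hold a license, and per-day tool usage events.
# In practice these come from your license inventory and telemetry, not hard-coded lists.
all_engineers = {"ana", "bo", "chen", "dee", "eli", "fay", "gus", "hana", "ivo", "jun"}
licensed = {"ana", "bo", "chen", "dee", "eli", "fay", "gus"}  # only 7 of 10 have a seat
usage_events = [                                              # (engineer, day they used the tool)
    ("ana", date(2025, 6, 2)), ("bo", date(2025, 6, 3)),
    ("chen", date(2025, 6, 4)), ("dee", date(2025, 6, 5)),
    ("eli", date(2025, 6, 6)),
]

def weekly_active(events, week_start):
    """Engineers with at least one usage event in the 7 days starting at week_start."""
    week_end = week_start + timedelta(days=7)
    return {eng for eng, day in events if week_start <= day < week_end}

wau = weekly_active(usage_events, date(2025, 6, 2))

# Two different denominators tell two different stories:
adoption_all = len(wau) / len(all_engineers)       # share of the whole engineering population
adoption_licensed = len(wau) / len(licensed)       # share of engineers who could even use the tool

print(f"Weekly active users: {len(wau)}")
print(f"Adoption (all engineers):      {adoption_all:.0%}")       # 50% -> roughly the industry median cited
print(f"Adoption (licensed engineers): {adoption_licensed:.0%}")  # 71% -> the gap is a licensing problem, not a skeptic problem
```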
>> But I'm pausing here because I think there are two types of people listening, and it depends on what environment you're in. If you're in a startup environment, or you're just an early adopter, you'll say, "Well, why would you not use it?" Everyone is using it. Not all the time, not for everything, but it's there; I know when to use it and when not to, and I'll use it daily. And then there might be some people saying, "Why would anyone use this?" But it's interesting that at a company that is actively trying to get everyone to use it, it's still only 65%, when they do all these things that a bunch of places don't do: the training, the enablement, the investment, partnering with companies like yours. So what have you learned about that 35%? What is holding them back, or are they right to be skeptical?

>> Yeah. I think the biggest learning for me is that it's not necessarily that these individuals are skeptical Luddites who don't want to use any new technology. Some of it is just that the organization doesn't make a license available to them. They would like to use it, but the licenses aren't available. I'm not suggesting that's the case at Booking, I want to make that clear, but I have seen that pattern repeated in many organizations. So when we think about utilization, in our framework we recommend looking at the number of daily active users or weekly active users. If you use DX to measure that, you can look at it as a percentage of your total population, and then look at where the licenses line up across your population, because it might be that the people who would like to use it can't yet, because the licenses are not available to them.

>> Yeah.

>> Some companies right now, as they're experimenting, will say, "Okay, we're going to get 500 licenses for Copilot, 500 licenses for Cody," and so on. So there's a limited pool to pull from, and there's no scenario where 100% of developers could be using it; they simply haven't invested the money in making licenses available to 100% of their developers. I would say that's a fairly big reason why we don't see 100% adoption. The other thing, and you covered this in your LeadDev talk, is that for certain services, components, and product areas, it's just not that effective, because of the very novel or greenfield nature of the code. We can think about this on a spectrum: one end is writing Terraform files, writing something in YAML in a really well-defined way, and the other end is doing something that no one has ever done before. AI is amazingly good at work that has a lot of structure and pattern to it. But you used the example of that healthcare startup that wanted to remain nameless, because they didn't even want to go on record saying they don't use AI, because it just doesn't work for them. So you can imagine that at a company as big as Booking, or as big as Meta or Dropbox, there are going to be pockets of developers who just aren't well served yet by the tools.

>> Well, and I can also see cases where these tools really fall flat: where you're trying to do something very specific, in a concise or very performance-efficient way, which is usually about understanding the whole structure and making small tweaks.
For example, I was talking with the stable Linux branch maintainer, Greg KH. I asked how they use AI, and he said, well, they use it for tooling, but if you look at every commit to the Linux kernel, it's a few lines, and those lines have been thought about for a long time, and they need to be as concise as possible. Performance matters; all of those things matter. And for those use cases, especially in big companies, I can imagine that if you're on a platform team, say optimizing the P95 performance of your Android app, you might use AI for brainstorming, or here and there, but in the end the changes you make are so small in terms of lines of code, yet so large in impact, and so much of the work is about testing, about wins that haven't been made before, about seeing connections. So I wonder if some of that is also at play.

>> Yeah, I want to show you this, because of exactly that point you made: maybe the biggest gain is not in code generation; you can still use it for brainstorming, you can still use it for error analysis. This is part of the enablement and training that companies can offer in order to increase adoption and increase impact. We did a study of 180-plus companies, and we looked at the developers who were saving a serious amount of time with AI and tried to understand what they were actually doing. Interestingly, code generation, mid-loop code generation, is only the third-highest use case for saving time; stack trace analysis and refactoring existing code were saving more time than mid-loop code generation. This is really important for companies and platform engineering teams to understand, because I think the common idea is, "Well, we give our developers a license for Copilot and then expect them to figure it out," and a lot of us go straight to mid-loop code generation, because that's what is mostly talked about.

>> Yeah, most demos show the most obvious thing, right? That's what I thought about, until we talked.

>> It's the most obvious thing. But things like putting in a hundred lines of a stack trace and asking, "Why is this happening? Give me a diff that would fix this problem"...

>> Or give me four possible ideas.

>> Totally.

>> And then two of them might be things I didn't think about, and now I can go off and research them.

>> Yeah. So there's really no ceiling on the different kinds of use cases where AI can help, especially when it has really good context and understands your codebase thoroughly: code documentation, brainstorming and planning. Unit tests are an area that is really well served by AI, anything that's very well defined. But I was really surprised to see stack trace analysis being the top time-saver, and not mid-loop code generation, because, as you said, that's the most obvious thing.

>> And when you say mid-loop, what does mid-loop stand for?

>> Yeah. Roughly: I can write out the scaffold of whatever function I want to write, give it the input and the output, and then just say, "Finish my function, make it complete this thing for me."
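A rough sketch of what these two use cases look like in practice, with hypothetical code: the `dedupe_orders` scaffold stands in for mid-loop generation (you write the signature, docstring, and an input/output example, then ask the assistant to fill in the body), and `explain_stack_trace` stands in for the stack-trace-analysis prompt. `ask_assistant` is a placeholder for whatever assistant, IDE integration, or API a team actually uses.

```python
# Mid-loop code generation: write the scaffold yourself, let the assistant complete the body.
def dedupe_orders(orders: list[dict]) -> list[dict]:
    """Return orders with duplicate order_ids removed, keeping the most recent by 'updated_at'.

    Input:  [{"order_id": "A1", "updated_at": "2025-06-01"}, {"order_id": "A1", "updated_at": "2025-06-03"}]
    Output: [{"order_id": "A1", "updated_at": "2025-06-03"}]
    """
    raise NotImplementedError  # <- the part you'd ask the assistant to "finish my function"


# Stack trace analysis: paste the raw trace and ask for likely causes, not for new code.
def ask_assistant(prompt: str) -> str:
    """Placeholder: wire this to whichever assistant, IDE integration, or API your team uses."""
    raise NotImplementedError


def explain_stack_trace(trace: str) -> str:
    prompt = (
        "Here is a stack trace from our service. "
        "Give me the four most likely root causes, ranked, "
        "and for the top one suggest a minimal diff:\n\n" + trace
    )
    return ask_assistant(prompt)
```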
>> This is really interesting, because I feel it goes a little bit against the mainstream narrative, even among developers, as you said. What is AI good for? Code generation: it generates code faster, because that's what we see, right? It does spit it out; it is superhuman in terms of speed, you cannot type that fast. But if these are the main use cases, stack trace analysis and refactoring, with code generation further down, and then the other stuff, maybe it suggests that, as you say, we're thinking about it a bit wrong. And I wonder if this also affects the narrative that now anyone can be a developer, that you don't need to be a developer to write software. Because if these tools really help with things like stack trace analysis and refactoring, unless you're a developer, you're not going to use those. So maybe these tools are actually a lot better for experts: professional software engineers who know what they're doing.

>> You know, to me, the sticking point about things like stack trace analysis or refactoring code is that it's about time savings, not interaction with the tool. What I mean by that is: number one, typing speed has never been the bottleneck in development. Now we have all this code generated faster than we can type. That's great. But it still takes me time to review that code, to cognitively make sure that I understand it and that it's accurate. Time to review it. So the time savings: it's not that we're saving time because we don't have to type. A lot of that time just gets reallocated to reviewing, or to other parts of code authoring that aren't typing. For stack trace analysis, we're actually eliminating the toil completely: parsing through this huge output, trying to figure out what's going wrong, and then going spelunking in the code. That is truly a net positive time saving, to say, "Give me four examples," or "What's the most likely cause of this?" I can just leapfrog that whole 45 minutes I would have spent banging my head against my keyboard trying to figure it out. Whereas if I'm using it for code generation, yes, it's faster, but I also have to invest time reviewing that code to make sure it's accurate.

>> Reviewing it, iterating on it. You might refactor it. Oftentimes, as developers, we know that it generates something, but if you have certain coding styles or a particular way of coding, you will tweak it, rewrite it, change it; it gets things wrong, and so on. Okay. So this makes sense. Going back to the first case, where it truly saves time: what happens when it saves time?
Let's say that until now I've usually had to spend a bunch of time analyzing the stack trace, and I was stuck on it for 15 minutes at first; with experience it would have gone down to 10, then five, and now I'm a senior engineer and, boom, I look at it and I know what it is. But what happens with the time saving? As a developer, what would that mean for me, or for my organization? Do I just clock out earlier? Will I now have a bit more space to help out others, to start to think about bigger things instead of the day-to-day, some strategic stuff? Where have you seen it go? Because this is not new, right? Developers saving time: we've seen this with other tools as well. The big question is what the organization thinks: "If my developers each save 10% of their time on average, I can either fire 10% of them" — this is the big evil corporation version — "or I can just not hire, and my productivity goes up 10%." That's what the business thinks, but it's not really what happens, is it?

>> It's definitely not what happens. One thing to keep in mind is that on the very best day, developers are not spending even 80% of their time coding. I think the industry average is something like 25%. There was a study at AWS that found an average AWS engineer only spends about 20% of their time coding. So when we apply AI to the coding tasks, we're only working with 20% of that time to begin with. Then, when we save 10% of that time, it doesn't amount to "we can ship 10 new product lines overnight"; that's just not realistic. [A quick back-of-the-envelope calculation a little further down puts numbers on this.] I think this is where things get a little weird, though. What happens with that time? We would like to think this is going to be really great for developers: they're saving time, they can reinvest that time in tech debt repayment or other things. When DORA researched this question, what they found was that many developers were actually feeling less satisfied, because AI is accelerating the parts that they enjoy, and what was left over was more of the stuff they didn't enjoy: the toil, the meetings, the administrative work. That was an interesting result, and it's from their guide on AI engineering that came out a couple of months ago. It gave me pause when I read it, because I've always had the very strong conviction that AI time savings are not going to come from the coding task. It makes sense that that's the obvious place where we all started, but organizationally, how fast we can create code has never been the bottleneck. It's been everything around it. And now, when we take away or speed up the code-authoring process for people who like to author code, that has some impact.

>> Well, I think, you know, when I was a developer, and then a manager of developers, I'd ask: what is a good day as a developer? An average good day, and again there can be different kinds of days, goes something like this: I come into work, we say hi to people, maybe we talk about something, and usually we don't have meetings. I have a clear goal in mind; it's something challenging that I want to complete, maybe carried over from yesterday, or I'm starting fresh. I get into the zone. I get it together, I get it working, I clean it up. It's working. It's amazing.
I commit it, or I test it, I check it, I put up a pull request, and I'm done. And if this happens at 2 p.m., and it was something challenging that should have taken me eight hours but I got it done in four, I'm really proud of it. It works. Maybe I could clean it up more. And then I help out some people. When I've had a day like that, it's been a good day. And the bad day is the opposite: I get into work, I have this thing that I need to do, and I get interrupted. I go to a meeting, I try to get back to it, and now there's another meeting. I finally get back into it, but now I'm stuck, it's more complex, and I go home, and while I'm falling asleep I'm still thinking about this goddamn thing, and I actually want to open my laptop, and I don't sleep well. You know, we used to do things before AI, like no-meeting days and bunching meetings together, to give people the chance to be in the flow, because there is something about being in the flow for a stretch of time. It's good, right?

>> Yeah.

>> You're a developer; you still build software, or you used to build software, right?

>> Yeah, both. Yeah.

>> Yeah.

>> I think this goes back to that question about measurement and how we measure the impact of AI. This is one area I'm personally very curious about: does AI allow us to manage interruptions better? Does it help developers stay in a flow state? Does it reduce cognitive load? These are numbers that won't necessarily show up in time savings per developer, but will show up in other areas of developer experience. One hypothesis is that with AI tooling, the tax you pay when you have to switch tasks goes down: say you have focus time from 10 to noon, then a meeting from 12:00 to 12:30, and then you can focus for the rest of the day; there's a tax before and a tax after that meeting. Does having an AI coding assistant reduce the amount of time it takes you to get back into flow, because you basically have a body double, a pair programmer so to speak, who is holding context for you and makes it easier to pick up where you left off? This is really difficult to measure systematically with workflow data alone, which is maybe another thing I didn't emphasize when I talked about the measurement framework: combining self-reported metrics with system and workflow metrics is absolutely essential when measuring the impact of AI tools, because AI has an impact on the authoring experience as well, and some of that we cannot observe from our systems. We actually have to talk to developers to figure it out. So, things like change confidence, developer experience measurements, CSAT for the AI tools: those are all really important parts, because we might miss important signals about how AI is impacting the code-authoring experience, or other parts of the software development life cycle, if we're only looking at the workflow tools themselves. We need a more robust, comprehensive way of measuring across the organization.
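To put rough numbers on Laura's earlier point about coding time, here is a quick, illustrative back-of-the-envelope calculation. It uses the roughly 20% coding-time figure she cites from the AWS study and assumes, purely for illustration, a 10% AI speed-up on the coding itself.

```python
# Rough math on where coding-time savings actually land, using illustrative numbers.
hours_per_week = 40
coding_share = 0.20          # AWS study figure cited above: ~20% of time spent coding
ai_speedup_on_coding = 0.10  # assumption for illustration: AI saves 10% of coding time

coding_hours = hours_per_week * coding_share        # 8.0 hours/week spent coding
hours_saved = coding_hours * ai_speedup_on_coding   # 0.8 hours/week saved
share_of_total_week = hours_saved / hours_per_week  # as a share of overall capacity

print(f"Coding hours per week:   {coding_hours:.1f}")
print(f"Hours saved per week:    {hours_saved:.1f}")
print(f"Share of the whole week: {share_of_total_week:.0%}")  # ~2%, not a 10% org-wide productivity jump
```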
>> So let's talk about what is working and what you've seen work. One of the things we previously talked about is how some of the teams that are making pretty good use of AI are starting to make architectural changes to their codebase, and architectural decisions, to make it easier to read for, let's say, agentic models. What have you seen work, and how is this coming along?

>> Yeah, there are two broad things I'll talk about here. One is the architecture itself, and the other is the discoverability of the architecture and of the system, and there's a lot more going on there. On the architecture itself, what I have seen, anecdotally from my own conversations, is leaders recommitting to clean interfaces between services. I would say that's probably the top thing that comes up.

>> Nice. That's a nice thing to come out of this.

>> Great, I love to see it. I think we can use AWS's "everything is an API" as a model here: when your own systems operate like that, the interfaces are so clear and well defined that it becomes easier for agentic models to use your codebase, because the boundaries are more clearly defined.

>> By the way, also for humans, right?

>> Also for humans, yeah. It works really well. And that point, "also for humans," is the interesting point about documentation, because this is the shift I'm seeing more often. I was actually in Amsterdam while you were in Mongolia, otherwise we could have had another steak. While I was in Amsterdam, I did a fireside chat with about 45 other engineering leaders, and very quickly into the Q&A the question was: should we be writing documentation for AI, or for humans? My answer to that question is: yes, both. But here's one thing I've seen pick up, I would say, in the last six weeks. Human documentation often has visual dependencies: it'll be a screenshot of something, and it needs a sort of narrative flow. Whereas for AI, it's really good to have the coding examples, and there can't be visual dependencies; it doesn't work well that way, because developers aren't necessarily going to a documentation page or watching a YouTube tutorial to see how to use your thing. They're interacting in their IDE with an AI assistant and trying to implement it. So documentation needs to be there for the AI, so that the developer gets the information they need in the best way. [A short sketch of what this kind of code-first documentation can look like follows this exchange.] I think companies that are going AI-first, like Vercel and Clerk, for example, have really solid examples of AI-first documentation, and it's a flywheel: they have great documentation, so when a developer is trying to implement something, they actually get a good suggestion and can do it successfully from within their IDE with whatever coding assistant, and then that coding assistant has more data about what actually works, and it just keeps reinforcing itself. For internal development teams, like platform teams, this is a great model to think about: how can you make documentation that gets people the information they need at the moment they need it, which is now in the editor, not necessarily on your documentation page? And for external developers, if you're making a dev tool that's out there in the ecosystem, this is how people are discovering your tool and implementing it.
The way that developers come across tools and start using them is really different now. So that's been the biggest way I've seen companies think about, or already start changing, how they architect their services, and also how they document their services, to make them work better in this AI-assisted coding world.

>> That's interesting. I also like your point that this creates a bunch of opportunities, especially for companies and startups building APIs or things for developers to use. If you make it easy for developers to use, and also a bit friendlier for AI crawlers or any of these tools to ingest, you might get more users later on, or unblock your users. Because, in the end, I'm going to guess that in the future, say two years from now, developers will say, "All right, I want to create a project using this technology." As a developer, you will specify the technology; if not, it'll default to whatever. But if you have too much trouble with a technology, eventually, because there's going to be this learning dynamic, you will choose the technology that, like today, is the one you're familiar with, the one that works, the one where it's easier to get unstuck. So these things will remain important: the technology, the SDK, vendor reliability, maintainability. These things are going to remain important for professional software engineers, which we are and will be.

>> I like what you said about this being good for human beings and also good for AI. There are so many areas of developer experience, the world I operate in, where what's good for the developer and good for the business is a Venn diagram that's basically a circle, and what's good for the AI agent and good for the human being is also a circle, like clearer boundaries between services. I think that's such an interesting space to operate in. It's as if we needed AI as the business kick in the pants: "Hey, we're not going to get as much out of our investment in AI unless we fix this," and now the wallets are open. Before, it was "well, read the manual" and work around bad tooling, but now there's significant financial investment around it. Not that development teams aren't a significant financial investment, but I think the tolerance is different, you know?

>> Yeah, I like your point that this is a good Venn diagram to draw, because when you onboard to a new company, it's always been a problem. Onboarding has just been difficult: the onboarding documentation is out of date, the presentation is out of date, no one tells you how to do things, and then people take a month to get productive. And now you can say, okay, let's update it so our AI tool can also be helpful. But it goes both ways, right? When people onboard, they can read the docs, and they can also turn to the chat agent, which will actually give them accurate information. I love it. I feel there are a lot of these wins, and it's always nice to discover them.
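As a sketch of what the code-first, self-contained documentation style Laura describes can look like, here is a small invented example: an internal helper documented so that the whole page can be dropped into an assistant's context, with a complete runnable usage example inline and no screenshots or references to visuals. The module and its example are hypothetical, not from any company mentioned in this episode.

```python
"""retry.py -- internal helper: retry a flaky call with exponential backoff.

Written so this whole page can be pasted into an AI assistant's context:
no screenshots, no "see the diagram above", a complete runnable example inline.

Example
-------
>>> attempts = []
>>> def flaky():
...     attempts.append(1)
...     if len(attempts) < 3:
...         raise ConnectionError("transient")
...     return "ok"
>>> retry(flaky, tries=5, base_delay=0.0)
'ok'
"""
import time


def retry(fn, tries: int = 3, base_delay: float = 0.5):
    """Call fn(); on exception, retry up to `tries` times with exponential backoff."""
    for attempt in range(tries):
        try:
            return fn()
        except Exception:
            if attempt == tries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
```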
>> So, speaking of wins: as an engineering leader at an organization that is adopting AI, whether there's a mandate, or you want to do it, or both, how can you measure what is working? Earlier you told me about this company called Workhuman, who did something like this: they actually figured out what to measure and how to measure it in a practical way. How did this work? What was the story there?

>> Workhuman, like many other companies, started working with Copilot. It's very accessible to most developers. And what they found was that they knew it was working, they heard from developers that they enjoyed using the tool, but what was really hard was figuring out how to quantify how much it was working, and where it was working. So what Workhuman did was use the metrics in the AI measurement framework, across utilization, impact, and cost, to draw a map of where things are going well and where things are not going well. What they found, looking at developer experience more broadly, was that AI has a good impact on the company because it improves developer experience. And this is, very broadly speaking, the advice I give to every single company when they're trying to figure out how to reason about AI's impact: AI is a tool to improve developer experience. When you improve developer experience, you get better outcomes. It follows like that; it's the same pattern in every single company. AI isn't some magic bullet that's going to solve everything. We're talking about improving developer experience. And Workhuman found an 11% boost in developer experience.

>> And how did they find that? Was it via a survey? Did they measure data? Is it a mix?

>> Yeah, they're measuring a mix. They use DX, so they use the Developer Experience Index, which is our composite metric of 14 research-backed developer experience drivers: everything from incident management to local dev iteration speed, lots of different factors that play a big, important role in the day-to-day work of developers. So they are seeing an 11% gain, and that's all correlated to time savings as well.

>> And just so I understand, for anyone listening: they were measuring these things before, right? And then as they rolled it out, they kept measuring, and now they're seeing a gain. Because you can only see an improvement in something that you measure. If I'm working at a company that never measured anything, I can start measuring now and establish a baseline, but unless I did it before AI, it's going to be a bit harder to tell how much it has improved, because I don't have the data from before.

>> Yeah, exactly. And so my other general piece of advice is: start measuring now. We can pull in historical data to cover some of the gaps; we can look at GitHub and Jira and those tools historically. When it comes to surveying on developer experience and those self-reported measures, just start small. Just get started. Don't wait to hire that other person, don't wait for anything; you're just delaying success. So get started and get a baseline. That's what helped Workhuman figure out where the biggest gains were.

>> And what were the gains they found? You said 11% across the board. What does that mean?

>> Yeah. They measured organization-wide and found that developer experience went up 11% across the whole organization, which was great.
The developers they found were using AI, so we're segmenting here into daily and weekly users versus non-users, had 15% higher velocity. They were able to ship more code to production and get more done than non-users. These numbers are from several months ago, and what they've noticed is that it just keeps compounding and getting higher. So this is a great example: if you and I are developers, we know that these tools are delightful to use most of the time, and we know they make a big difference when it comes to enjoying work and bringing the joy back to development work for a lot of us. But going to your VP or CTO or board and saying, "Well, the developers like to use them," is a hard sell for keeping the wallet open and getting more funding for these tools.

>> Yeah. Because these things cost money, especially now with agentic tools and tokens; they can burn through a lot of money.

>> Yeah, absolutely. That's actually another thing worth discussing here, because a lot of the companies that have been using AI for a longer time, and are a little more mature, were operating in a binary: you have a license or you don't have a license. Right now, one of the biggest measurement challenges in front of us is looking at consumption-based pricing and figuring out: who are the developers with the most to gain, and who has the least to gain, from access to more tokens? What are the use cases with the biggest impact, for example stack trace analysis? Because what I'm hearing from engineering leaders on the ground right now is, "I just don't know how to allocate the buckets of money. Do I give a bigger bucket to senior engineers than junior engineers? Or is it the other way around? Do I give a bigger bucket to junior engineers, because I can get more productivity value from them with AI assistance, and the senior engineers don't need as much?"

>> This is just so interesting, because I feel a little bit of déjà vu. If you remember, the last time in the tech industry that companies spent thousands of dollars per year on developer tools was around 2000 to 2010, when a lot of startups and tech companies spent about $3,000, and up to $5,000 or even $8,000, per year per developer on Visual Studio licenses. And that wasn't just for Visual Studio: it was for the documentation, this was back when documentation on the internet was terrible and that bundled documentation was amazing, and for access to early-release software, so they could use things like SQL Server. It was a huge amount of money, and even startups paid, because they could get an all-in-one development kit for thick clients, so Windows applications, web, database servers, and so on. This lasted for a few years and then kind of died out. But there was a case where almost every company that paid did so because, well, they could have just used open source, but it was slower, and they reasoned: "If we're paying $100K per developer per year, even back then, we'll pay $8K or $10K more per year to make them more productive and get ahead of the competition."
So we've had this before, but now it feels like we're getting back to a point where there will be companies that say, "You know what, let's just bite the bullet and do it, even if we don't have the data." Shopify is doing this; they have no budget limits. And there are companies that say, "I'm not sure. It feels expensive. We're used to something like a $200 per year per developer tool allocation."

>> Yeah. And I think history always repeats itself, right? When you're of a certain vintage, as you and I are, you see these patterns over and over again. And that's been providing me some comfort: I've lived through quite a lot of hype cycles. I was talking with Jesse Adams from Twilio yesterday, who leads their developer platform, and he and I were both in the thick of the Kubernetes hype, so we were comparing notes on how this feels compared to the Kubernetes and container hype. There's a lot that's the same, and a lot that's different. And the hype eventually concludes one way or another; we're not still living in the container hype cycle. Eventually we're going to see some stabilization. Right now we're in this Cambrian explosion of tool sprawl, there's so much unknown, pricing is unknown, and eventually we're going to consolidate and land on something a little more stable, but it's probably going to take us a year or two to get there. But Gergely, I would not be surprised if in 18 months we're spending... yeah, what did you say, $3,000?

>> It was $3,000 to $8,000 per year. So on a monthly basis, that would have been something like $300 or $400, up to around $800, per month per developer on those tools, back in the 2000s.

>> Yeah. We can adjust that for inflation, but my prediction right now, if we think 18 months into the future, is that I don't think it's unrealistic to spend $1,200 to $2,000 US on an agent that can complete tasks autonomously, even if they have to be verified by a human in the loop. And I think there are going to be companies willing to open their wallets, because maybe this allows them to avoid increasing headcount at the previous rate, or it just allows their senior developers to spend more time on more complex work. Which is another thing we've seen in our data at DX. We have a core measurement framework, which is our evergreen, solid foundation for measuring developer productivity; I can show that here for those who are curious to see it. One of the things we look at is speed. And when we think about measuring AI, we see AI as an enabler, and we're going to see its impact on all of the core measurements of productivity and performance. It's not that we need to rethink everything because AI exists; we still need to go back to our fundamentals, understand what performance means, and then see the AI impact on it. Specifically in this speed category, what we've seen is that diffs per engineer increase, so we're able to get more throughput, but the complexity of those diffs increases as well. So AI users are able to work on more complex work and get more of that work through the system to production, which is interesting. Part of me wonders: is that complexity good or bad?
We see, when we triangulate...

>> It also goes back a little bit to source code being a liability, right? We're also seeing diffs increase, and eventually opportunities for bugs will increase. In fact, bugs probably will increase...

>> ...unless you have more thorough testing.

>> You know what, you're very right about that, and actually that's a trend we're seeing already. Let me show you this.

>> I'm speaking about the future here.

>> Predicting.

>> First-principles thinking.

>> This is from DORA's impact-of-AI study that they released a couple of months ago. What they've seen already is that delivery throughput is actually slowing a little bit, because, my hypothesis is, batch size is increasing. And this is the thing about AI: it doesn't change the fundamental physics of things we already understand to be true about software development. Bigger batch sizes are riskier, so we want to keep batch size small. But here's the...

>> Batch size usually being, you know, the diff size.

>> Yeah, exactly: how much work is being shipped, is it a small chunk or a big chunk. And then here's their forecasting: if AI adoption increases by 25%, they actually predict a 7.2% reduction in delivery stability. We can hypothesize about a number of different reasons why that might be; part of it goes back to that fundamental point that bigger changes are riskier.

>> Yeah.

>> AI makes it trivially easy to write very, very big changes all at once. And this is something we kept in mind when we put together the AI measurement framework: I think one of the biggest risks in measuring AI is that when we get tunnel vision on things like lines of code or acceptance rate, we miss the picture on quality, stability, reliability, and maintainability. We can't take short-term gains at the sacrifice of long-term stability; we know that's not a viable strategy. In order to protect yourself, you have to have good measurements in place so that you're seeing all parts of the picture, not just hyper-focusing on the speed gains.

>> Yeah. And as a software company, taking this a bit further, let's put our optimistic glasses on: this thing works, most of the code is good, and we can even do a bit better testing with AI, so quality won't degrade that much, or it might stay the same. What is the best-case outcome? Because people are starting to ask this question, and I'm seeing it on social media. AI coding tools have been around for two and a half, coming up on three years, starting from ChatGPT; AI has been around far longer, but let's take that as the starting point for the sake of it. As an end user, a customer of a company that has invested heavily in AI, be that Google, Microsoft, or a startup, what should you be seeing? Obviously, from the company's perspective, maybe they're doing the same with fewer people. But should you be seeing higher quality, more frequent iteration, better bang for your buck, no price changes, more functionality? What might we see? Or is it just like the cloud, which has largely been a cost exercise: reliability might be higher in some cases, dependability will be good, but end users don't see anything. You don't know whether the service or the company you're using runs their own infrastructure or is in the cloud.
You don't care. The company very much cares, and they can do all sorts of tricks there. >> I think it's all of the above, most likely. As an end user, what I expect is faster time to market. And on the other side, the building side, that's really what we're trying to emphasize, and what a lot of our conversations with executives and engineering leaders have focused on: we're really trying to reduce time to market. >> Yeah. >> So I think this has a lot of implications. Software is usually developed very sequentially right now. We have a roadmap, and maybe we have a Gantt chart of what we're doing now. >> We have a PRD. We have a meeting with all the business stakeholders, because we know it will be expensive to change later, right? >> This is another one of my maybe unconventional, maybe a little off-the-rails opinions, but I think roadmaps are on their way out in the age of AI. I think the companies that are going to win with AI are not the ones that think about things in sequential roadmap form, but the ones that think about it more as experiment portfolios. Rapid experimentation and trying to figure out what delights your customers is going to help companies win. I think the companies that will win are the ones that already have the muscles to do experimentation and A/B tests to figure out how to delight their customers. As an end user, what I don't want is thrashing. And I could see that happening, because now there are fewer reasons to say no to things. There's probably good reason that some of those things weren't built yet and are sitting on your backlog. And now that it's, you know, not trivial, but much easier to build those things, that doesn't mean that as an end user I'm going to find them useful. I don't want to see thrashing and feature bloat. What I do want to see is faster time to market for the things that have already been validated and experimented with, that really do delight end users. The same is true here: I don't care whether the application I'm using is running on Kubernetes or not, or whether it's running in Azure or AWS. I just want the end-user experience to be great. And I don't care whether AI was used in development or not, I just want a great experience as an end user. >> As a company rolling out these AI tools, either as an engineering leader or a tech lead: have you seen a good rollout of these tools? Is there a case study, or a company you might have observed, that actually did a pretty good job of figuring out what to measure, how to roll out, how to deal with things like reliability, those kinds of things? >> Yeah. You know what's wild, Gary, is that I'm seeing that highly regulated industries, financial, insurance, pharma, are having the best results from introducing AI tools. Here's my reason: it's because they have to be so deliberate and structured in rolling out. And what we have found is that structured rollouts get the best results. So it's that whole "slow is smooth and smooth is fast" kind of thing.
So, you know, I've had many conversations with very large banks who are far ahead, I would say, of their tech counterparts at even smaller companies, because they've had to be so intentional with acceptable use policies, with licensing, with budget and finance, and making sure everything is... >> Making sure you're not going to have sensitive PII data leak. >> Absolutely. And the more intentional and structured a rollout is, the higher chance it has of being successful. So that's one thing: structure is everything. One of the companies that actually has a really good story about structured rollout and adoption is Indeed. Indeed.com is a global job site and talent-matching service. They do a lot to help people find careers that are meaningful to them. Indeed is operating at enterprise scale, and one of the things that is very much built into the fabric of what Indeed does is this experimental mindset. One of the ways that showed up in their structured rollout was actually trialing a bunch of different tools and doing a cohort analysis of which tools were working better for which use cases and which individuals. Because, as I said before, we're in kind of a Cambrian explosion when it comes to all the different tools, and it can be very overwhelming for engineering leaders or executive teams to have a bag of cash and want the confidence that they're actually spending that bag of cash on the right stuff. Their wallets are open. So what they did was segment groups, do trials of different tools, and then look at the results comparatively. And from there they figured out, okay, let's go with these two. I've seen other companies take the same approach and end up with just one vendor. But this is a very structured, methodical approach to running a controlled experiment and figuring out which tools are giving the most gains. I think the other thing they did, experimentation-wise, is start from a hypothesis. For example: high-latency code reviews are bad for everyone. They've got teams all over the world. Say you're a developer in Seattle working with a developer in, I don't know, Vienna, Austria, which is where I am. Obviously, if I'm in Seattle and I do something at the end of or in the middle of my day, my coworker in Vienna is probably already asleep or offline, and that just leads to long turnaround times. So they had an idea: what if we could use an AI tool for code review? Maybe it's not going to be perfect or as thorough as a human code review, but at least it could unblock someone early, at the time they need it, and give them preliminary feedback. So it closes the feedback loop, which we know is an important part of developer experience. And does that hypothesis hold up? That's the one thing I really appreciate about their approach: they treat these like experiments, with hypotheses that they're trying to validate or not, rather than just "let's use AI for code review, that certainly sounds like it's going to help." They're taking a really methodical approach. So code review was one of those things to shorten feedback loops. They're also looking at different use cases to figure out which ones are the most impactful.
So unit tests, for example, are a really good use case, and they're figuring out how to spread that across the organization, also having these sorts of organization-wide change initiatives. We can think about adoption. This is a use case that has actually come up a bit in the academic research world: there was a paper about how Google is using AI for migrations, and some of the different tricks involved. I see that pattern a lot; I think I've had probably four conversations about it this week, and it's only Tuesday morning. So using AI for migrations is a really common pattern right now. >> AWS is also doing something like that with their Q Developer Pro, migrating from an older Java version to a newer one; apparently their tool is trained on that. I've heard internally that it's hit or miss, but sometimes it can work really well. So yeah, just a plus-one to migrations. >> Yeah, absolutely. And so they're taking this approach of: let's look for those use cases, run some experiments to validate that they're actually working, and then try to roll them out. >> You know, you mentioned at the beginning how developers can get demotivated if AI is doing the stuff that they like doing. But migrations are something that, speaking for myself, I always hated doing. They always took longer, they were a drag, and I think everyone hated doing them, because you just want to get it over with, and you want to get it right, but... >> Undifferentiated heavy lifting. >> Yes, undifferentiated heavy lifting. The business doesn't care about it, and as a developer you're not going to become a much better developer by doing it. Obviously you're going to be a better professional, but it's not the thing you're excited about. >> Well, can AI reduce the cost of change, or reduce the complexity of change? Indeed, and many other companies that I speak with, are thinking: what if we could proverbially knock at the door of our development team with a PR that includes the migration? It's already been tested, it has an ephemeral environment, it's already been run, and all they have to do is press approve. >> Well, and you review it. But suddenly, reading through the changes that have been made is a lot easier once the migration is done, and so is figuring out whether it matches our coding standards and the modern language features I expect, as opposed to looking everything up. Because with a migration, especially from an old framework version, you have to look up how things used to work, the documentation is hard to find or outdated, and then obviously you want to get the modern version right. But yeah, I wonder if this will be a great use case, because who wants to do it in the first place? >> Here's a top tip, one of my favorites in my arsenal of tips. I think a lot of folks approach a migration by giving the model one file and saying, "Okay, migrate this, upgrade this to the next version." And that's going to give mixed results. A different technique is to actually do one of the migrations by hand, give the diff, or give both files, to your model of choice, and say: give me a prompt that will reproduce this result for subsequent files that match the structure and format of file A. You're going to get a prompt that much more closely matches the actual work that needs to be done, instead of starting with "upgrade this file from version X to version X.2." I guess we can call that prompt engineering.
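That tip can be turned into a small, repeatable workflow. Here is a minimal sketch in Python, assuming a hypothetical complete() function that wraps whichever model or API you use; the file paths and prompt wording are illustrative only, not the exact prompts from DX's guide.

```python
def build_meta_prompt(original: str, migrated_by_hand: str) -> str:
    """Ask the model for a reusable migration prompt, derived from one example
    that was migrated by hand (the technique described above, in code form)."""
    return (
        "Below is a file before and after a framework-version migration that "
        "was done by hand.\n\n"
        "--- BEFORE ---\n" + original + "\n\n"
        "--- AFTER ---\n" + migrated_by_hand + "\n\n"
        "Write a prompt I can reuse to reproduce this same kind of migration "
        "on other files that share the structure and conventions of the BEFORE file."
    )

# Hypothetical usage, assuming a complete(prompt: str) -> str wrapper around
# whichever model you use, and illustrative file paths:
#
#   before = open("src/legacy/UserService.java").read()
#   after = open("src/migrated/UserService.java").read()
#   reusable_prompt = complete(build_meta_prompt(before, after))
#
# Then feed reusable_prompt plus each remaining un-migrated file to the model,
# rather than asking it cold to "upgrade this file to the next version".
```

The point is the same as in the conversation: one hand-done example gives the model far more to anchor on than a bare "upgrade this" instruction.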
We have a lot of tips like that in the Guide to AI-Assisted Engineering, which you can find on DX's website. We went through and interviewed about 180 different companies and power users of AI to figure out things like prompt engineering, great system prompts, recursive adversarial prompting, all of those different techniques. My tip about migrations isn't in there, because that came at a different time. But there are lots of techniques developers can use to get more out of their tools, because it's not as easy as throwing a license at your developers and hoping they figure it out. They really do need support, enablement, and training, and that guide can play a good part in that. >> Amazing. And as a closing question for engineering leads and tech leads who want to stay grounded and do want to use AI tools when it makes sense: how can they get better at avoiding the hype and sticking to what works? What is your advice to them? >> Data beats hype every time. It's of course important to keep an eye on the industry, look for opportunities, look for new and novel approaches, and think about how you might fold them into what you're doing. But ultimately, for AI to work at an organizational scale, it needs to be thought about as an organizational problem, and you need really solid organizational hygiene when it comes to measuring your performance, treating AI as an experiment, and then figuring out what the impact of AI is. So get your baseline measurement as quickly as you can and then start running experiments. Don't expect AI to be a silver bullet, or that every engineer is going to inherently know exactly how to use it, because just like any other tool, we still need training and enablement. When you have the data and you can tell a story around the data, that's also going to protect you from the hype cycle and maybe some unnecessary pressure from inflated expectations in the media. So that's my advice: data beats hype. >> Love it. So to wrap up, I'll just do some rapid questions, if that's okay with you. I ask, and you just shoot back whatever comes. What's a tool, digital or physical, that you love using, and why? >> I have to shout out Granola, because there has never been a tool I've used that has dramatically increased my quality of life as much as Granola has. I'm a bad notetaker and a forgetful person. >> And this is the AI meeting notetaker, right? >> Yeah. So, what Granola does (of course, make sure you get permission from the person you're in a meeting with) is act as an AI notetaker, but what I love about it is that I can take my own sort of disjointed notes and then Granola comes back and fills them in with all of the context that I missed. It's really magical. This was one of the times I thought, "Wow, AI really is amazing." It's been such a transformational tool.
>> What is a book that you would recommend, and why? >> There are two books I want to recommend, if that's okay. The first one is more of a business book, and it's called Write Useful Books by Rob Fitzpatrick. It will get your writing clear and snappy and cut out a lot of the fluff, so it's incredibly useful. It's meant to be skimmed, and you can digest it in about an hour. So, use it. The other book is not a professional book, but I think it's useful nonetheless: it's called Unsavory Truth by Marion Nestle. She's a food scientist, author, and historian, and the book really breaks down food marketing and the food lobby, mostly in the United States, and how everything is kind of constructed. Right now we're talking about AI hype, about media literacy and data literacy. There has been no book that has strengthened my understanding of how deep this all goes, with politics and funding and lobbyists, more than that particular book. It's about food and not about tech, but a lot of the concepts are really transferable, and it's just very fascinating. >> Well, amazing, Laura. It was nice to go a bit deeper into the data and see what actually works and what doesn't. It was really good to have you here. >> Yeah, thanks, Gary. >> I hope you enjoyed this data-driven conversation with Laura. I really like how Laura said that the best way to deal with hype is with data. To check out more data and the detailed case studies that Laura referenced, see the links collected in the show notes below. For more in-depth reading about how we can use AI tools as software engineers in a grounded way, check out The Pragmatic Engineer deep dives, also linked in the show notes. If you enjoy this podcast, please subscribe on your favorite podcast platform and on YouTube; this helps more people discover the podcast, and a special thank you if you leave a rating. Thanks, and see you in the next one!
Summary
Laura Tacho discusses the real impact of AI on software engineering, emphasizing that media hype often misrepresents AI's capabilities. She advocates for data-driven measurement of AI's impact on developer productivity, focusing on utilization, impact, and cost to make informed decisions and avoid the pitfalls of oversimplification.
Key Points
- Media headlines about AI in software engineering are often sensationalized and misleading, oversimplifying complex realities.
- The most common questions for engineering leaders are about what to do and how to measure AI's impact.
- Measuring AI impact requires a framework that looks at utilization, impact, and cost, not just lines of code or acceptance rates.
- AI tools save time primarily by reducing toil, such as debugging stack traces and refactoring code, not just generating code.
- The biggest time savings come from reducing cognitive load and eliminating tedious tasks, not just speeding up coding.
- AI can make developers less satisfied if it accelerates the parts they enjoy, leaving more unenjoyable work.
- Adoption is key; companies like Booking.com saw benefits only when over 65% of developers used AI tools regularly.
- Companies should treat AI as an experiment and measure its impact on developer experience and business outcomes.
- Structured rollouts, like those at Indeed, are more successful than unstructured ones, especially in regulated industries.
- The best way to avoid hype is to use data to measure AI's impact and make decisions based on real-world results.
Key Takeaways
- Focus on measuring AI's impact with a comprehensive framework that includes utilization, impact, and cost to get a complete picture.
- Prioritize use cases that reduce toil and cognitive load, like debugging and refactoring, over simple code generation.
- Use data to make informed decisions and avoid the media hype cycle; data beats hype every time.
- Ensure structured rollouts with clear policies and experimentation to maximize AI adoption success.
- Measure the impact on developer experience and business outcomes to justify AI investments.