How Devin replaces your junior engineers with infinite AI interns that never sleep | Scott Wu (CEO)

howiaipodcast | 7m_xKFqSxTo | Watch on YouTube | Published September 07, 2025
Duration: 41:11 | Views: 5,547 | Likes: 77

Scores: Composite 0.54 | Freshness 0.00 | Quality 0.82 | Relevance 1.00
8,940 words | Language: en | Auto-generated transcript

Devin is async. Once you kick off a Devin session, Devin's going to start working and looking through the code, but you're not expected to be there with it. It's just as if you gave your intern a project and your intern is going and working on it. >> Devin's my favorite intern on my team, and I have infinite of them. Why don't you pick a task that you might bite off for your product and show us how you would work through that end to end? >> I'll say, "Please go research the ChatPRD MCP server." So, this will produce a pull request for us. Often you're running a few of these at once. It's just a nice way to have multiple tasks going and then check in on each of them. >> One of the benefits of this, from a How I AI use case, is you can multi-thread a lot with tools like this, setting two, three, four, five, ten of these going at once on different projects, and not feel like you have to sit there and babysit things. Welcome back to How I AI. I'm Claire Vo, product leader and AI obsessive, here on a mission to help you build better with these new tools. Today is a very special episode for me because we're talking to Scott Wu, CEO and founder of Cognition Labs and the builder of one of my favorite AI products, Devin. We're going to hear about how Scott uses DeepWiki and Devin to kick off well-scoped tasks and get things done, uses Devin as his favorite and most-tagged employee inside of Slack, and how he's making it not weird to bring ChatGPT voice into your meetings. Let's get to it. >> This podcast is supported by Google. Hey everyone, Shishta here from Google DeepMind. The Gemini 2.5 family of models is now generally available. 2.5 Pro, our most advanced model, is great for reasoning over complex tasks. 2.5 Flash finds the sweet spot between performance and price. And 2.5 Flash-Lite is ideal for low-latency, high-volume tasks. Start building in Google AI Studio at ai.dev. >> Scott, thanks for joining How I AI as Devin's number one reply guy on X.
I am really excited about this conversation, and for you to show off how your company uses, and you use, the product that at least makes me very happy, and I'm sure makes lots of software engineering teams out there very happy. So, welcome. >> Thank you so much for having me. I'm honored to be here. Honestly, I'm a big fan of you guys and all the work you do. >> Great. Well, we have lots of stuff to talk about, but what we really want to do is get into how you AI, and in particular how you AI with the products that you've built. I think what's really fun, as somebody who's building AI products, is that it's something you get to use every day and get really good at, but it's also a chance to show some of our listeners and watchers tips and tricks about using the tools that you've built that they may not have thought of so far. So we're getting the expert look into how to AI with the Cognition products. What are you going to show us first, and what are some of your common workflows when you're doing engineering work or trying to move the product forward? >> Yeah, for sure. For us, as a bunch of programmers ourselves, building an AI that can code has got to be one of the coolest things we could spend our time on. I wanted to show a couple of flows of how we use the Devin stack, because there are a few different pieces involved with Slack and Linear. There's the wiki, obviously, then there's Ask Devin, and then there's starting Devin sessions and getting pull requests out of them.
I think there's some real nuance in what the right flows are for working with Devin as an employee, because it really is quite different from a lot of the tools out there, which are much more like an IDE, for example, or a terminal UI. Devin is, I think, first and foremost almost like an engineer on your team. >> Yep, totally. So what are some of the things that you reach for with Devin, and the capabilities that you think really make a difference for you as a software engineer? >> The way that we like to describe it is Devin as a junior engineer. We're working on getting Devin to senior engineer, obviously; we'll get Devin the promotion and everything. But Devin is not going to go and solve some really hard architectural problem, or make some big strategic decision that you're going to make and then execute on for the next month. You probably want to be involved in those as well. Devin can help you with the decision, obviously, by referencing the right things or giving input. But where Devin really shines, one way that we say it, is tasks, not problems. Often, when you have a very clear "here is exactly what we need to go do, here's the task, and here are all the details of what we need," Devin is really great at going and executing that for you, and it makes that much faster. And so naturally, the next question that comes to mind is how you figure out the spec, or the exact task, that you want to do.
And so a lot of the other tools, like the wiki and search, are there for you to be able to ask the right questions about understanding the codebase, or what needs to be done, and then put a task together. In practice, for the use cases we see all the time, probably number one is just crawling through your issue backlog. Whenever an issue comes up, and we have a lot of Slack channels where we talk about issues, we just tag Devin as the first pass on every single one of them. So that's a big one. Someone says, "Oh, we need to go fix this thing in the front end," or "maybe we need to go support this other MCP," for example, which we'll show in a second. And then for a lot of the other engineering-toil use cases, it also does really, really well. Often that's going and doing a version upgrade, or adding documentation throughout your repo, or adding unit tests for a specific thing, or responding to a crash report that just came up and trying to diagnose what went wrong. >> Yep. I love what you said about Devin being a junior engineer. I say Devin's my favorite intern on my team, and I have infinite of them. And I like this idea of scoping tasks, not problems. For people working with AI, even with other AI tools outside the engineering space, really thinking about task-level orientation sets you up for success, or at least a sequence of tasks can be very helpful. So why don't you pick a task that you might bite off for your product, and show us how you would work through that end to end? >> Yeah. Let's do it.
So, as you might know, I'm a huge fan of ChatPRD, and the natural thing that came to mind for me was that we need to integrate ChatPRD's MCP server. So I was looking into how to do that with Devin. The first thing I always go to as an initial step is what we call DeepWiki, which, for any repo, and this is true for public or private repos, lets you come in and get full AI-generated documentation of the repo. In this case, here's the Devin web app repo, appropriately. There's nothing too sensitive here, but it basically explains Devin. It's pulling a lot of this information from the readme, or from understanding the system architecture, and I can search it and pull up different things. So if I want to understand how the MCP marketplace is set up, it'll point out what particular components there are, or what particular files are called here, and I can read up on this and understand exactly how it's set up. And the natural question I might ask is: okay, cool, but just show me where the MCP server list is implemented. This will look through our repo, and Devin at this point has done a lot of work in the Devin web app repo, so it understands it a bit. That helps a lot: Devin builds this representation of the codebase over time, and we can see what's going on here. >> And so you're getting both a natural-language explanation of how the server list is implemented, and, on the right side of this for folks who aren't watching, the actual code snippets and reference files that you can view to really understand the deep layer of the code. So you have a combination of "let me explain how it works" and the nitty-gritty. >> Yep.
A combination of English and code. It's an interesting one: someday it'll probably all be English, but especially now, in this current period, we're really in the era where you, as the engineer, want to be looking at both English and code. You can see here it's giving you the answers of what's going on. In particular, it'll point out: okay, here's our list of all the different marketplace servers that we have. We have an Atlassian MCP, we have a HubSpot MCP, and so on, right? And from here, the natural thing that I'll want to do, which we've found to be a big flow for folks, is to use this to actually produce a prompt for Devin. The whole idea is that now that we're in this context, we know what the questions were and which part of the codebase we're looking at. That gives Devin a lot to start from, and if we have a particular task in mind, we can get that going. So I'll say, "Please go research the ChatPRD MCP server and add it to the list here." What this will do is basically construct a Devin prompt from this. So this has my prompt here, which I just typed in and which is not super refined, but it also has all the detail about the part of the code that we're in, what components we're looking at, and so on. It will generate for me this prompt in Devin that I can just go ahead and use immediately. And you can see here it'll tell you to follow the pattern of existing servers like Atlassian and HubSpot, here's the exact TypeScript structure that's being used, here are the functions you should be looking at, and here's what you should check to make sure that it works.
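The "follow the pattern of existing servers" task described here is essentially appending a new entry to a typed registry. Here is a minimal illustrative sketch in Python (the actual Devin web app presumably uses TypeScript; the `MarketplaceServer` type, the field names, and the example URLs are all hypothetical, not the real code):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarketplaceServer:
    """One entry in a hypothetical MCP marketplace list."""
    id: str
    name: str
    url: str


# Existing entries, following the pattern the generated prompt points at.
MARKETPLACE_SERVERS = [
    MarketplaceServer("atlassian", "Atlassian", "https://example.com/atlassian-mcp"),
    MarketplaceServer("hubspot", "HubSpot", "https://example.com/hubspot-mcp"),
]


def add_server(servers, entry):
    """Return a new list with the entry appended, rejecting duplicate ids."""
    if any(s.id == entry.id for s in servers):
        raise ValueError(f"duplicate server id: {entry.id}")
    return servers + [entry]


# The task from the episode: add ChatPRD next to the existing entries.
servers = add_server(
    MARKETPLACE_SERVERS,
    MarketplaceServer("chatprd", "ChatPRD", "https://example.com/chatprd-mcp"),
)
```

The point of the "check to make sure it works" step in the generated prompt maps to the duplicate-id guard here: a well-scoped task names not just the change but the verification.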
One of the things I want to call out for folks, in terms of a workflow they should think about, is that a lot of people, myself included (sorry, Devin), would have just sent that prompt, which is "add the ChatPRD MCP server to the list." And I do think that one very short but important loop of taking this prompt and turning it into an effective prompt, given the context you know, and then sending that into the task, just saves you a lot of heartache. It feels like extra friction at the time, but I think pretty soon it is, one, going to be the job to be done of the tool itself. Does that loop become invisible, either through these reasoning models or some application layer that manages it? And two, it's just worthwhile for people to do. So when you're thinking about sending a five-word prompt, think instead of saying, "Here's my five-word prompt. Build me a better prompt," and sending that into your system. >> Yeah, for sure. And it's a great call because, as we said, Devin is async, right? From this point onward, the nice thing is that once you kick off a Devin session, Devin's going to start working, looking through the code and reading online about ChatPRD, for example. It's going to do all this, but you're not expected to be there with it. It's going to work on its own, just as if you gave your intern a project and your intern is going and working on it. They could ping you on Slack and ask if there are questions, or you can go take a quick look and see how your intern is doing, but you don't have to be sitting there with Devin for every step of the way. And so one way that we describe it is that for a lot of tasks, there's often a synchronous component and then an asynchronous component, right?
And a lot of what Search and the wiki are for is doing the synchronous part of the task before you do the async part. If you had an intern, for example, would you just send them the five-word Slack message and leave it at that? Maybe sometimes, for something that's super clear. But often what you'd actually do is sit down with them, talk it through for two minutes, and say, "Okay, you know how we have this MCP marketplace?" Then you go and look at it together, you read the particular line of the code, and you say, "Okay, so let's add ChatPRD to this. Just go take a look at how that MCP server is implemented and make sure we add it to the list." And then you hand off there, right? So you have the first two minutes of going back and forth with Devin, your intern, and then as soon as you hit go on the Devin prompt, you're expecting it to be more of an asynchronous thing where you don't have to be in the loop. >> Well, one of the things I want to call out for people who are building AI products out there, like you, like me, is that in these sync products, latency really matters. People get really frustrated with wait times. But if you set up your product to really be this asynchronous modality, you actually buy yourself a lot of user love on waiting time, because there's not that expectation. Just like you would not say, "Hey intern, okay, now go research this other MCP and do a PR for me and come back when it's ready," and then expect the intern to come back immediately, you also, from a product perspective, don't expect Devin to come back immediately.
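The "here's my five-word prompt, build me a better prompt" step described above can be sketched as a simple template that folds the synchronous research (relevant files, patterns to follow, checks to run) into the prompt that gets handed off. This is a hedged illustration only; the `context` dictionary shape is an assumption, not Devin's actual prompt format:

```python
def refine_prompt(short_prompt, context):
    """Expand a terse task prompt with codebase context gathered synchronously.

    context is a dict with optional keys 'files', 'patterns', and 'checks'
    (a hypothetical shape chosen for this sketch).
    """
    lines = [f"Task: {short_prompt}", "", "Relevant files:"]
    lines += [f"- {f}" for f in context.get("files", [])]
    lines += ["", "Follow the pattern of:"]
    lines += [f"- {p}" for p in context.get("patterns", [])]
    lines += ["", "Verify by:"]
    lines += [f"- {c}" for c in context.get("checks", [])]
    return "\n".join(lines)


# The five-word prompt, plus what the research surfaced.
prompt = refine_prompt(
    "Add the ChatPRD MCP server",
    {
        "files": ["marketplace/servers.ts"],
        "patterns": ["the Atlassian and HubSpot entries"],
        "checks": ["the new server appears in the marketplace list"],
    },
)
```

The design point matches the episode: the refined prompt carries the task, the files, the pattern to imitate, and the verification, which is exactly what makes the async handoff safe to walk away from.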
Now, one of the benefits of this, from a How I AI use case, is you can multi-thread a lot with tools like this and set two, three, four, five, ten of these going at once on different projects, and not feel like you have to sit there and babysit things. So I'm wondering: while this is running, do you pop off and go to a meeting, or get a coffee? What has this asynchronous workflow enabled for you? >> For better or for worse, I'm in meetings for a lot of the day, and it's great to be able to just kick these off. You had an issue backlog, or hey, there are these three or four things I was hoping to look at today, right? You kick off each one with Devin, and these go and work asynchronously. It'll make the pull request for you in GitHub, and it'll show you the diff and what work it went through. If it's a front-end change or something like that, it'll send you the screenshots of the before and after. You can see it's going and researching ChatPRD. >> Well, I will say, clearly the SEO on my MCP is not good, but Devin did make my MCP homepage, so it's in the top nav. >> That's funny. >> For me, so it should know. >> Yeah. >> Cool. >> So I think, for sure, often you're running a few of these at once, and like you said, it's just a nice way to have multiple tasks going and then check in on each of them. >> Yep. And so what this is going to do, and maybe we can come back to it later when it's done thinking, is go do research. It's going to find my docs page on the MCP server, the one that Devin made for us, then it's going to pull those docs in, and then you're going to get actual code out of this. So your goal for this is to get a PR, right? >> Yep. So this will produce a pull request for us.
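The kick-off-several-and-check-back pattern both speakers describe can be sketched like this. `start_session` is a stub standing in for whatever call actually launches a session (e.g. an HTTP POST to a session API); the returned session shape is an assumption for illustration, not a real client:

```python
import concurrent.futures


def start_session(task):
    """Kick off one async agent session (stub).

    A real client would send the task prompt to the agent's API and return
    a handle to poll later; here we just return a placeholder record.
    """
    return {"task": task, "status": "running"}


# Several independent, well-scoped tasks fired off at once.
tasks = [
    "Add the ChatPRD MCP server to the marketplace list",
    "Upgrade the Node version in CI",
    "Add unit tests for the notification component",
]

# Launch everything, then go to your meeting and check back on each later.
with concurrent.futures.ThreadPoolExecutor() as pool:
    sessions = list(pool.map(start_session, tasks))
```

The user-experience point from the episode is embodied here: because every task is async, nothing blocks on anything else, and "checking in" is just polling each handle whenever you happen to be free.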
And then from there, I'll be able to review the pull request, and if that looks good, I'll merge it, and then obviously we'll have this out in the next few weeks. >> Amazing. And then your prompts are going to be so much better. And I'm feeling guilty, so I am just going to Slack you the MCP homepage, and you can give that to Devin. >> Sure, sure. You're getting a true live demo here. >> Yes. This is when your intern comes back to you and says, "Hey, I was looking this up and I couldn't find it. Can you point me to where it is?" >> Okay, you have it: chatprd.ai/mcp. >> Okay. >> It has code snippets and everything. >> Okay. Okay, here we go. >> Great. So this is a good example: you've done your research, you used that research to create a better prompt, you used that prompt to kick off a task, and that task is worked asynchronously, the way a more junior engineer would work, including doing research in your code and external to your business. Then it's going to go ahead, with the context of your repo, do a PR, and ship this feature. Otherwise, you would have had to ask somebody to do this. I think about the people you'd have to involve in something like this: you'd have to go find the senior engineer who wrote the MCP server code >> Yeah. >> and say, "Please explain it to me." You'd have to take the time to write out that nice spec of what you want to do, and then you'd have to task it to somebody to actually implement it. So I think you compress that workflow, the time of a team of maybe three people, into about ten minutes to get something done. >> Yeah. No, and I think often a lot of the folks we see who really, really love Devin, and who use it this way especially, are folks who are tech leads or product managers or things like that.
It's a great intersection. On the one hand, you're already used to the flow of figuring out an issue, getting into what's going on there, and then handing off something that says "here's exactly what we need to build," right? And two, the async workflow, for people who are in meetings or have a lot of back-to-back going on, is just a great way to kick off and check in on tasks quickly. Starting things from the web app, or from Slack, for example, is a nice lightweight option if you're not in your IDE all the time. You can start tasks from the IDE as well, obviously. But we see this flow a lot with leads and PMs, basically, who are going back and forth with a lot of things. >> Yeah. One of the things I've been telling people more and more is that as part of your PM onboarding, you should now be giving everybody access to GitHub, which isn't something that typically happens in a lot of product organizations. >> Yeah. >> Giving access to GitHub, giving access to tools like this, because I think it enables product managers to do a lot more. So, while this is running, what I wanted to talk about is this: before we got into the show, you and I were saying you've been just a little bit busy over the last month, doing a few interesting things with the business, in addition to, I'm sure, wanting to build and spend time with the team. So this asynchronous nature, this junior engineer on demand: how do you actually use that day-to-day to stay afloat on top of all the stuff coming at your team? Not the "I have a feature I want to build, let's go build it" flow.
We just saw that flow; but for the reactive stuff in your company, how are you using AI to stay on top of it and keep the velocity high? >> For us, a lot of it is just setting up the right workflows in our Slack and in our org. Devin has knowledge, which means it'll learn your codebase over time as you keep working with it, or you can give it more details about how certain things work. And a lot of it is almost just institutionalizing Devin as the first line of response, is how I would describe it. I can show a few examples. The big thing is to get to the point where, for a lot of these different things that we file, Devin is the first person that gets tagged on all of them. Devin won't be able to do every single thing in one shot on the first try, but often you're working back and forth with Devin: Devin puts up a PR, and if there's some slight touch-up that you have to do at the end, you're able to do that. We have a ton of channels where we talk about issues or various things that we need to build. We have one for all the crashes that come in, one for core infrastructure things that come up, and this one here is for our web app, which is hopefully a little less sensitive. You can see that for basically every single thing folks talk about, we start a Devin session. So it's like, "Hey, can you standardize the font size, spacing, and style for these three levels?" And then we just start the Devin session, and Devin will make the PR. This one gets merged after some back-and-forth feedback here.
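The "tag Devin first in every issue channel" workflow described here boils down to a bot that watches for an @-mention and hands the rest of the message off as the task. Here is a minimal sketch of just the mention-parsing piece; the Slack user ID is made up, and this is an illustration of the pattern, not Cognition's actual integration:

```python
import re

# Hypothetical Slack user ID assigned to the agent's bot account.
AGENT_ID = "U0AGENT1"


def extract_task(message_text):
    """Return the task text if the message starts by @-mentioning the agent.

    Slack delivers mentions in the form '<@USERID> rest of message'; if the
    agent isn't mentioned first, return None and let humans handle it.
    """
    m = re.match(rf"^\s*<@{AGENT_ID}>[,:]?\s*(.+)$", message_text, re.DOTALL)
    return m.group(1).strip() if m else None
```

In a real deployment this would sit behind a Slack Events API subscription, and a matched task would be forwarded to start a session; the channel message thread then becomes the multiplayer back-and-forth described next.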
And so Devin goes and edits. Let's see: Devin made this PR, there were a couple of back-and-forth edits, and then Dave, our engineer, went and merged it. And this is often how we work. Here's another good example: "Hey Devin, can you make it so that when you command-click on a notification, it takes you to that notification in a new tab?" A natural feature, probably one of our users requested it. You just start a Devin session, and Devin will give you this progress update: here's what I'm doing so far, here are the files I'm looking at, and here's what I see. In this case, by the way, its confidence is actually medium, and then someone says, "Oh no, no, you should take a look at this thing instead." One of the cool things I want to point out, too, is that because of this, Devin is a naturally multiplayer experience. We'll often have a few different folks going back and forth, and if somebody else is looking at this issue, or somebody else is the expert on this part of the codebase, they'll give their own input, and Devin will just go back and forth with them as well. So really it's a thread where a group of you are communicating and figuring out how to work on this issue, and Devin is just one of the players in the thread. So Ethan comes into Walden's thread here and says, "Hey, make sure to use a Link element from TanStack Router," and gives that feedback, and then Devin goes and makes that change in the pull request. You can see Devin had an initial version, and then some additional commits where it switched to the Link from TanStack Router instead. >> As an AI founder, you're used to sprinting toward product-market fit, your next round, or that first enterprise contract. But speed isn't enough for AI startups. Buyers expect security, compliance, and transparency from day one.
That's why serious AI startups use Vanta. With deep integrations and automated workflows built for fast-moving AI teams, Vanta gets you audit-ready fast and keeps you secure with continuous monitoring as your models, infra, and customers evolve. AI innovators like LangChain, Writer, and Cursor scaled faster and closed bigger deals by getting security right early with Vanta. Listeners can claim a special offer of $1,000 off Vanta at vanta.com/howiai. >> You know, one of the things that I like about this, and again a shout-out on our use case for folks who are trying to drive more AI adoption in their teams, is that doing this as much as possible in public is really helpful from a learning perspective. One of the experiences I had running the engineering team at LaunchDarkly was that when we started putting Devin and Devin-like agents in public channels, we saw a lot more adoption and upskilling of our team on how to actually talk to these agents and how to get the right outcomes. We were talking earlier, and I was saying I DM Devin all the time. It's because I have no employees, no one to talk to; he's my only buddy. So I DM Devin all the time and we have these side conversations; he's sort of my intern on the side. But in larger organizations, I was very much a "do it in public channels, do it where people can see it" person, because not only does the work get done, and it's nice muscle memory to tag in these tools immediately, but just learning how you use them, what an effective prompt is, and what kinds of things it's good and not good at is really useful for overall engagement with these tools. So I think hiding your AI use is kind of the worst thing you can do in an org. I say do it all in public. >> Yeah.
And there are two sides of it, which is what I was going to say about these multiplayer experiences. There are two benefits, right? One is the knowledge transfer for the agent itself, which I think more and more products are starting to have: one person uses Devin, or this tool or that tool, and that adds to the knowledge of the tool itself, so that a week later, when somebody else runs that session, Devin says, "Hey, I just touched this piece of the code last week. I know exactly what you're talking about. Let me go find that." And the other side is educating the humans: you're showing each other what your experiences are and working with one another in the same flows. And I totally agree. Because of both of those, I think we'll see a lot of experiences in AI productivity get more and more multiplayer. >> Yeah. That's my hope. Okay, before we move on from Devin and your use of it for engineering, I want to get really specific. You'll go, and then I'll go: what are your top five, everybody-can-reach-for-them tasks that Devin can do for you? You pick five categories of tasks, and I'll pick five. >> Okay, sounds good. So, top five. I think number one is miscellaneous front-end fixes; it's amazing for those. Often that whole workflow, for various reasons, like you said, requires getting three different people involved: here's what we're going to do, then you bring in somebody who looks at that code, and there's somebody else who's reviewing. And now, with this, you tag Devin, you explain with a screenshot, "I want to make this button a little bit more round," or "I want to touch up the design here," right?
And it'll go and do that. It'll find the right parts of the code, it'll do the implementation, and it'll also send you the before-and-after screenshots, so you can review it inline right there. That's just a really, really great use case, both because it's verifiable for the agent and because it's verifiable for the human. >> And while you're saying that, I will pull up an example of this. Let me share my screen, which I rarely get to do here; it's very exciting. Always thrilling to share your Slack; as you can see, my only friends are agents. But here's an example I did very recently. I'm working on the ChatPRD homepage, and Devin shoots back to me: here's a new hero image. And I was able to give feedback on that. So this is exactly what you're talking about: make changes, and then get that immediate feedback right in your workflow. >> Yeah. Fixes, new components, changes that you want to make in your front end. It's super nice because, as you're saying, you can do this all inside, basically. So that's probably number one for me. I think number two that comes to mind is version upgrades, migrations, things like that.
So, upgrading your Node version, or getting onto the latest packages, and so on. It's a big time saver; we all have to do it, and somehow these new packages just come out so quickly. But obviously the devil is in the details: the new version will say, "For every instance of this component, we recommend you use this structure instead," and Devin will be able to go through that, do the semantic search, find each of the components, and make the right changes. Number three, I would say, is documentation, a big one as well. We have our Devin docs, for example, our own external docs page, and Devin has written basically the entire thing. DeepWiki itself, obviously, is kind of an extension of that. But even for writing your own docs pages or putting materials together: a lot of what Devin does is going and processing the codebase and understanding that this references that, and here's what this does, and so on. It's a funny one in the sense that it's not strictly a writing-code use case, or isn't always, but it's so closely related that a lot of the same capabilities are really valuable there. I think number four is incident response, actually. We have this set up so that whenever there's a crash, the first-line defender on PagerDuty, basically, is Devin. So Devin gets a page, gets started, and runs a session. Obviously, you probably want a human there too, especially for the big incidents, to make sure of what's going on. But the nice thing is, it's 4:00 a.m. and you're half asleep.
And then you get to your computer and Devin has already written a report: hey, I looked at it, I think it was this change from last week, or from yesterday, and here's exactly where the trace of the error goes. We use that a lot. It's a huge lifesaver for us. And then number five, I would say, is adding testing, a big one for us. It's a very common thing, especially for individual engineers as they're working: you have your whole PR, you built a new feature, and the last thing you have to do before you ship is go add your own unit tests and make sure your thing works, right? The nice thing, again, is that Devin will go and do that. It'll write the tests, run them locally itself, make sure they pass, and then iterate with you to make sure the lint passes, the CI passes, and so on, and just add those for you. >> All right. Well, we're very close. My five are very close, so I love those. To recap, and I'll augment yours with mine: number one, front-end fixes. My particular version of front-end fixes is that I think these AI tools can really help you do polish, really nice interactive user experiences, where you wouldn't normally be able to spend time on them. So any of those little magical moments that you don't want to toil in the front end on, I think it's really good at. Docs, I think, is underrated. I actually have a GitHub Action where every PR that gets opened gets reviewed by Devin and gets the PR description rewritten by Devin, and then after the PR is closed, Devin ships our internal documentation into our repo so that Devin has access to the docs. So I think it's an excellent technical writer.
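The PR automation Claire describes can be sketched as an event router: the event names mirror GitHub's pull-request webhook actions, but the agent calls here are stubbed placeholders, not a real Devin API.

```python
def handle_pr_event(action: str, pr: dict, agent) -> list[str]:
    """Route a pull-request webhook event to the right agent tasks."""
    tasks = []
    if action == "opened":
        # On open: review the code and rewrite the description.
        tasks.append(agent(f"review PR #{pr['number']}"))
        tasks.append(agent(f"rewrite description of PR #{pr['number']}"))
    elif action == "closed" and pr.get("merged"):
        # After merge: ship internal docs for the change into the repo.
        tasks.append(agent(f"update internal docs for PR #{pr['number']}"))
    return tasks

# Stub agent that just records the instruction it was given.
print(handle_pr_event("opened", {"number": 42}, lambda task: task))
# ['review PR #42', 'rewrite description of PR #42']
```

In practice this logic would live in a GitHub Actions workflow triggered on `pull_request` events, with each branch tagging Devin with the corresponding instruction.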
I, too, have Devin as first line of defense for incidents. Devin actually has a Sentry login, logs in to Sentry, goes through all of our open issues, and starts to fix stuff for us. Definitely upgrades. And then the one I didn't hear you say, but which I just think is a more operational and personal benefit, is the 24/7-availability rubber ducking: when you're working on something and you're just like, can you just look at this and see if I'm being crazy? You know, Sunday night, Monday night, Saturday morning, when you really don't want to bother a colleague, I just think having something to rubber-duck with is really nice. So those would be my use cases. Very similar. Okay, Scott, we're going to close with one really high-level use case outside of the Devin ecosystem, which is voice. You were telling me a really interesting ChatGPT voice use case that I hadn't heard before. Do you mind spending a few minutes telling us about that? >> Yeah, for sure. I'm a big fan of voice. We've played around with it a lot; we actually have voice in Windsurf now, as of Wave 11, partially because of that. But in short, the way I'd describe it is: Google itself, 20 or 25 years ago, was basically a better encyclopedia, right? You have all sorts of things you want to look up and pull together, and it got you a faster answer, with more up-to-date information about what was going on. And I almost think of ChatGPT voice as a better Google: you can get an even faster answer, it's fully synchronous, you can do it in the conversation, and then obviously you have all the depth of it being able to go and research and do these other things too.
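The 4:00 a.m. first-responder report both hosts describe can be approximated by correlating the crash's stack trace with recently merged changes. A toy sketch of that triage step, with made-up commit data (a real setup would pull the trace from Sentry and the commits from git):

```python
def suspect_commits(trace_files: list[str], recent_commits: list[dict]) -> list[str]:
    """Rank recent commits by how many files they share with the crash trace."""
    trace = set(trace_files)
    scored = [
        (len(trace & set(commit["files"])), commit["sha"])
        for commit in recent_commits
    ]
    # Highest overlap first; drop commits that touch no file in the trace.
    return [sha for score, sha in sorted(scored, reverse=True) if score > 0]

commits = [
    {"sha": "a1b2c3", "files": ["billing.py", "utils.py"]},
    {"sha": "d4e5f6", "files": ["auth.py"]},
]
print(suspect_commits(["billing.py"], commits))  # ['a1b2c3']
```

This is only the "I think it was this change from last week" half of the report; the agent's value is then walking the trace and proposing a fix.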
What I'll often do is, if I'm in a meeting and we're talking about things, there are always questions that come up. Like yesterday I was in a meeting and we were talking about this: there are so many orgs out there with tons of software engineers, so we were wondering, what are all the companies that have, say, 10,000-plus software engineers, and how many are there in the world? Obviously the big banks out there have tens of thousands of software engineers, and big tech companies, and maybe the Accenture and Infosys category; those are the first ones that come to mind. But what are all these different companies that have that? >> Naturally, in a meeting it's kind of rude to just go on your phone and be totally unresponsive for two minutes. So instead, what I'll often do is pull out ChatGPT and go on voice. It's basically like adding ChatGPT to every conversation. So I'll say, hey, can you please tell us how many companies out there have 10,000-plus software engineers? And then, whether it's voice-to-voice, or voice in and the response back in text, I use both of those modes a lot, but I find it to be a very natural stepping stone. I just find that voice lowers the friction even further, in a way that actually really matters. In the encyclopedia era, if you were going to look something up, it took, I don't know, five minutes or something; you had to go pull the right letter of the alphabet and find the entry. Then Google got it down to like 10 seconds, you know?
And voice is kind of like getting it from 10 seconds down to one or two seconds, where you can just get on instantly and say what you want to say. And that actually matters, I think, for being able to go back and forth, or just for the off-the-cuff questions you want to ask. >> Yeah, I was going to say, you've maybe changed my mind here, because I used to think that voice mode was super socially disruptive, in that it feels so unnatural to talk to it during a meeting. But if you flip it on its head and say, no, this is just another meeting participant that I'm putting into the room, it actually is more socially inclusive. Everybody hears the result, right? You're not Slacking around links while people open them on their laptops and read while somebody is talking; everybody's clued into the synchronous nature of this new information. So, if I had people to be in meetings with, and not to brag, but I have very few meetings, then maybe I will bring ChatGPT into it. Okay. >> Must be nice. Must be nice. >> Oh, man. It's the dream, man. So, quick lightning-round questions, and then we will get you back to your work. First one, and I know it's like picking between your children: the IDE, the terminal, or the agent. What is going to be the form factor to rule AI engineering?
I really think of this in the future as, we call it a coding agent, but a lot of what this becomes is actually just the next generation of the human-computer interface. The way I like to say it is: Tony Stark doesn't have a laptop, right? You don't need one at some point, if you have your Jarvis plugged in and you're going back and forth with your agent, and it goes and does these things for you. And you can imagine that building software is just that: you're not looking at your code, you're looking at your own problem, right? You're looking at your own product, and you're saying, hey, let's make this button rounder, let me add a new thing over here, let's save this, let's ask the user for this and that info, and you're making the changes in real time in your product, and your agent obviously is going and implementing this for you. So it's certainly very agentic, but whether we call it an IDE or an agent or whatever, it really is basically just a different human-computer interface, where you are looking directly at the product rather than having to go through all your code. So I think that's the future version.
Some years out, that is. Today, I'd say a lot of it depends on the cohort. I, for example, am in meetings all the time, unfortunately, and because of that I actually think the Slack agent workflow is a super natural one, or Linear, for example, tagging Devin from Linear. For an engineering IC who gets to code for eight or 10 hours a day, again, must be nice, the IDE is the natural place where a lot of this starts: you'll have these things that run in the background, these asynchronous processes going as you're doing your thing, but the natural place to get started for that is the IDE today, I'd say. >> I also just think what's nice about this era is that the form factor can come to you; you can decide what interface works best for your workflow. Okay. As everybody knows, Devin is my buddy. I am sure you get lots of chats that would give us very good insight into my closing question, which is: when you are frustrated with our sweet, sweet intern Devin, what is your prompting technique? And I know you all monitor this, because when I get frustrated, sometimes I get little credits back, like, you did that wrong, and I get credits back. So I know you see a lot of human language to agents. What is your strategy? What do you find yourself doing in a moment of frustration or being blocked? >> I can give some advice; I can't say I've always followed my own advice to the letter. But a lot of what it looks like, for an agent especially, is this: I think an agent is a little different from a chatbot, in the sense that with a chatbot there's less to go off of, is how I want to say it, right?
With a chatbot, you ask a question, it gives you the wrong answer, and you say, no, that was the wrong answer, and that's all you can really say. With an agent, one of the nice things you can do is go through and look at the whole history of what it was doing, right? We had an example of that just now, where Devin got stuck: I see the ChatPRD page, it doesn't have an MCP server, I'm trying to find documentation on this. And if you scroll through the logs, you'll see what happened: it googled it and found some other things, and that was the issue. So from there, you take that information and you understand, oh, Devin was missing the link to this page, and then you send that. So I think a lot of it with agents is really like pair programming, or pair debugging, with an intern. First you go through and see, okay, here are all the steps you took; then, oh, by the way, I think you missed this one file, which is the downstream reference of this, and that's why there was the bug, or something like that. I think that's the biggest thing that will really move the needle. >> Okay. So review the history, figure out where it went wrong, and then reinstruct. Okay, Scott, this has been so fun. Thank you for showing us. Where can we find you, and how can we be helpful? >> Yeah, for sure. We're Cognition and Devin on Twitter; we officially got the Cognition handle, which is great. And then obviously it's Devin.ai if you'd like to use the product. >> Great. Well, thank you so much, and I appreciate you spending the time with us. >> Cool. Thank you so much for having me. >> Thanks so much for watching.
If you enjoyed this show, please like and subscribe here on YouTube, or even better, leave us a comment with your thoughts. You can also find this podcast on Apple Podcasts, Spotify, or your favorite podcast app. Please consider leaving us a rating and review, which will help others find the show. You can see all our episodes and learn more about the show at howiipod.com. See you next time.

Summary

Scott Wu demonstrates how Devin, an AI agent, functions as an asynchronous junior engineer that can execute well-defined tasks like code changes, documentation, and bug fixes, enabling engineers to multi-task and boost productivity by treating AI as a scalable, always-on team member.

Key Points

  • Devin is positioned as an asynchronous AI intern that executes specific tasks without needing constant supervision.
  • The core workflow involves defining a clear task and using Devin's deep understanding of the codebase to research and generate a pull request.
  • Devin excels at well-scoped tasks like front-end fixes, version upgrades, documentation, crash reporting, and adding unit tests.
  • Using Devin in public channels promotes team learning and knowledge transfer, making AI adoption more effective.
  • AI tools like Devin can be used for real-time research during meetings by asking voice questions to get instant answers.
  • The ideal AI engineering interface may evolve into a direct product manipulation tool, where you describe changes to your app and the agent implements them.
  • When Devin fails, the best approach is to review its history to understand the error and provide targeted feedback.
  • Devin can be integrated into Slack and other tools to serve as a first line of response for issues and tasks.

Key Takeaways

  • Treat AI agents like junior engineers: assign them specific, well-defined tasks rather than open-ended problems.
  • Use AI tools to handle repetitive engineering toil like front-end fixes, documentation, and version upgrades.
  • Leverage AI's asynchronous nature to run multiple tasks simultaneously without constant babysitting.
  • Use public channels to demonstrate AI use, which helps team members learn effective prompting and increases adoption.
  • When an AI agent fails, review its process history to understand the error and provide precise feedback to correct it.

Primary Category

AI Engineering

Secondary Categories

AI Agents Programming & Development AI Tools & Frameworks

Topics

AI engineering AI agents junior engineers Devin Cognition Labs async workflows AI coding task automation engineering toil AI intern

Entities

people
Scott Wu Claire Vo
organizations
Cognition Labs Google Vanta Sentry
products
technologies
domain_specific

Sentiment

0.85 (Positive)

Content Type

interview

Difficulty

intermediate

Tone

educational entertaining technical inspirational professional