AI coding assistants do not boost productivity or prevent burnout, study finds

zohaibahd

In a nutshell: Developers were supposed to be among the biggest beneficiaries of the generative AI hype, with purpose-built tools promising to make churning out code faster and easier. But according to a recent study from Uplevel, a firm that analyzes coding metrics, the productivity gains aren't materializing – at least not yet.

The study tracked around 800 developers, comparing their output with and without GitHub's Copilot coding assistant over three-month periods. Surprisingly, when measuring key metrics like pull request cycle time and throughput, Uplevel found no meaningful improvements for those using Copilot.

Matt Hoffman, a data analyst at Uplevel, told the publication CIO that his team initially expected developers to write more code and the defect rate to drop, since they would be using AI tools to help review code before submitting it. The findings defied those expectations.

In fact, the study found that developers using Copilot introduced 41% more bugs into their code, according to CIO. Uplevel also saw no evidence that the AI assistant was helping prevent developer burnout.

The revelations counter claims from Copilot's makers at GitHub and other vocal AI coding tool proponents about massive productivity boosts. An earlier GitHub-sponsored study claimed developers wrote code 55% faster with Copilot's aid.

Developers could indeed be seeing positive results: a report from Copilot's early days showed nearly 30% of new code involved AI assistance, a number that has likely grown since. However, another possibility behind the increased usage is that coders are developing a dependency on the tools and getting lazy.

Out in the field, the experience with AI coding assistants has been mixed so far. At custom software firm Gehtsoft USA, CEO Ivan Gekht told CIO that they've found AI-generated code so challenging to understand and debug that it is sometimes more efficient to simply rewrite it from scratch.

A study from last year, in which ChatGPT got more than half of the programming questions it was asked wrong, seems to back his observations, though the chatbot has improved considerably since then across multiple updates.

Gekht added that software development is "90% brain function – understanding the requirements, designing the system, and considering limitations and restrictions," while converting all this into code is the simpler part of the job.

However, at cloud provider Innovative Solutions, CTO Travis Rehl reported stellar results, with developer productivity increasing up to three times thanks to tools like Claude Dev and Copilot.

The conflicting accounts highlight that we're probably still in the early days for AI coding assistants. But with the tools advancing rapidly, who knows where they are headed down the line?

 
Like any tool, it depends how you use it.

For me Copilot has massively increased productivity, the Chat window has helped a lot with problem solving and planning.

If you make good use of comments, particularly at the top of a script detailing its purpose, the autocomplete is then a lot more accurate.
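
For example, something like this (C# here; the file name, record shape, and CSV columns are just made up to show the idea). A descriptive header comment up front gives the autocomplete enough context that its suggestions tend to match the described columns and types rather than generic guesses:

// InvoiceCsvImporter.cs
// Purpose: read a CSV export of invoices (InvoiceId, CustomerName, Amount, DueDate),
// skip the header row, and return strongly typed Invoice records sorted by due date.

using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;

public record Invoice(int InvoiceId, string CustomerName, decimal Amount, DateTime DueDate);

public static class InvoiceCsvImporter
{
    // With the header comment above, suggested completions for this method tend to
    // follow the described column order and types instead of generic guesses.
    public static List<Invoice> Load(string path) =>
        File.ReadLines(path)
            .Skip(1) // skip the CSV header row
            .Select(line => line.Split(','))
            .Select(f => new Invoice(
                int.Parse(f[0]),
                f[1],
                decimal.Parse(f[2], CultureInfo.InvariantCulture),
                DateTime.Parse(f[3], CultureInfo.InvariantCulture)))
            .OrderBy(i => i.DueDate)
            .ToList();
}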
 
I'd give up Copilot a lot faster than I'd give up my IDE or most of its functions. To me it's mostly a party trick that adds a little enjoyment through the day, seeing what it will do. Every once in a while it auto-completes a tedious bit of routine processing, but more often it just takes extra time to delete or fix its suggestion.

I liked it best when I first tried it, before they had gotten better about protecting unique information. I used to regularly get suggested comments that, while never outright confidential, would hint at some inside baseball on how another company thought about a particular business process or policy. It's been a while since I've seen anything interesting now.
 
As a weekend warrior I used to spend a lot of time just figuring out syntax and looking for sample code, and here it helps tremendously. It's like I have a mentor now, constantly suggesting how to turn a function idea into code and answering my questions about very specific details that would be hard to google.

My commenting has improved a lot; it's key for Copilot to function well.

I can see how a very experienced coder might be annoyed with it always going in a slightly different direction, but for us less fluent people it’s literally like an assistant writing the code for us, we just plan the structure of the program.
 
So far, AI solutions are nothing more than a good way to find answers quickly. If you want to use AI for complex coding, you will still need to validate the entire output, because we all know it is never going to be 100% correct. But validating code written by someone else can be a challenge in itself. If you are spending time figuring out why the code is written a certain way, wouldn't it have been easier to just do it from scratch? So I think the usefulness of AI here depends on how you use it. If you are coding something yourself and using AI to help find a way to code it, it may be helpful. But in both cases manual effort will still be required, and I don't think it saves any time.
 
Copilot absolutely helps with my productivity, but it depends on the developer. It will help you if:

- you already understand good coding practices and patterns
- you realize what it is that AI is really doing (generating code based on what it has already seen)
- you appreciate that it will NOT be 100% correct

The more you follow patterns and "code like everybody else", the more accurate the AI will be since it can better understand your code.

If you don't follow good practices or even try to understand the code the AI has produced, it won't help you; it will just annoy your coworkers who have to review your garbage.
 
Like any tool, it depends how you use it.

For me Copilot has massively increased productivity, the Chat window has helped a lot with problem solving and planning.

If you make good use of comments, particularly at the top of a script detailing its purpose, the autocomplete is then a lot more accurate.
It has definitely been a useful tool for me - I think a lot of people just have the wrong expectations of it and how it should be used, and, as a result, don't properly benefit.
 
I can confirm it works both ways.

On one hand it can be very useful for tedious tasks. If you need to plow through several levels of nested things to find data (external libraries are always a joy, aren't they?), then instead of doing that by hand, just describe what you need and generate the code. Take care to inspect the result to confirm it does what you actually wanted it to do.
Got all your properties in JSON and need a C# ViewModel for it? That's a lot of tedious typing; just generate it.
Big time savers!
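
Something like this, roughly (the JSON shape and class names are made up for illustration); the classes are pure typing work that mirrors the JSON, which is exactly the part you generate:

// Given JSON like:
//   { "orderId": 1042,
//     "customer": { "name": "Acme", "email": "ops@acme.test" },
//     "items": [ { "sku": "A-1", "quantity": 3, "unitPrice": 9.99 } ] }

using System.Collections.Generic;
using System.Text.Json.Serialization;

public class OrderViewModel
{
    [JsonPropertyName("orderId")]
    public int OrderId { get; set; }

    [JsonPropertyName("customer")]
    public CustomerViewModel Customer { get; set; } = new();

    [JsonPropertyName("items")]
    public List<OrderItemViewModel> Items { get; set; } = new();
}

public class CustomerViewModel
{
    [JsonPropertyName("name")]
    public string Name { get; set; } = "";

    [JsonPropertyName("email")]
    public string Email { get; set; } = "";
}

public class OrderItemViewModel
{
    [JsonPropertyName("sku")]
    public string Sku { get; set; } = "";

    [JsonPropertyName("quantity")]
    public int Quantity { get; set; }

    [JsonPropertyName("unitPrice")]
    public decimal UnitPrice { get; set; }
}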

On the other hand, if you want it to write half the program for you because you're lazy, you end up with buggy code that you don't understand because you didn't write it. It can end up costing more time than it saves.

As an 'aid' when you're learning a new language it's a bit double-edged imo. It's tempting to lean heavily on it because you don't know what you're doing yet, but you end up slowing down your learning. Instead of reading the documentation and understanding things, you get a quick fix handed to you.

I do greatly appreciate that I can sometimes just do a "Hey, I wrote this block of code. I'm pretty sure some things could have been a bit more efficient, got any pointers?" and it will pick stuff apart and actually come up with solid pointers. (Getting my hands dirty in C# at the moment, and often my loops can be simplified into an easier to read/understand LINQ expression.)
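
For a flavor of the kind of simplification it suggests (the order data and method names here are invented for the example), a hand-rolled filter/sort loop next to its LINQ equivalent:

using System.Collections.Generic;
using System.Linq;

public static class LinqRefactorDemo
{
    // Original loop: collect the names of orders above a threshold, largest first.
    public static List<string> BigOrdersLoop(List<(string Name, decimal Amount)> orders, decimal threshold)
    {
        var big = new List<(string Name, decimal Amount)>();
        foreach (var order in orders)
        {
            if (order.Amount > threshold)
                big.Add(order);
        }
        big.Sort((a, b) => b.Amount.CompareTo(a.Amount));

        var names = new List<string>();
        foreach (var order in big)
            names.Add(order.Name);
        return names;
    }

    // Suggested LINQ version: the same filter-sort-project, easier to read at a glance.
    public static List<string> BigOrdersLinq(List<(string Name, decimal Amount)> orders, decimal threshold) =>
        orders.Where(o => o.Amount > threshold)
              .OrderByDescending(o => o.Amount)
              .Select(o => o.Name)
              .ToList();
}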
 
So, if you haven't tried Claude yet, it's a night-and-day difference from most of the others with regard to context. Context is often the most expensive part. When you have to keep telling the AI the same stuff over and over again because it doesn't remember your rules or what your code looks like, it gets really tiring.

That being said, o1 is the smartest so far, for me.

Generally when I start getting the go around from Claude, I'll usually pop the same question into o1, and it frequently gets the answer perfectly right the first try.

I tend to prefer Perplexity and Gemini by default when I need more up-to-date info, but even though Perplexity uses the other AIs behind the scenes, it can really be dumb sometimes.

I definitely find myself wanting to get into the learning side of these more, with stuff like TensorFlow and some of the other learning models, so I can integrate these AIs with my own knowledge and data and get the AI to understand my workflow and what's important to me.

I think until we get more of that, AI will be helpful, but not necessarily in a huge way.

Plus, right now, I have to be really careful what I put into AI. I can't be violating my NDA or exposing PII.

I think we will start seeing the real gains when
- we have larger context windows
- can train the AI not only on our patterns, but also on what we want our patterns to be
- the AI has full context to our apps
- the AI is better at understanding the context in our questions (e.g. understanding pronouns, lol).
- the AI thinks more carefully about the question
- the AI is able to prioritize some information higher based on self-reflection and discovery (e.g. able to write programs itself and confirm their correctness against actual results).

All of these things are in the works, and even when I find AI slowing me down, it's usually helping me understand my own problem better. It's almost like talking to the duck (or dog).

It frequently gets me unstuck when I have analysis paralysis.
 
As with all tools you get better results if you learn to use them properly and understand how to apply their results.

People who say they don't work are just not very good programmers.
 
Super useful to help you learn new languages. You can ask it why and how things are in the language. Also pretty useful to get you going and finding libraries that help you do what you need to do. It's effectively the mentor you wish you had.
 
Also pretty useful to get you going and finding libraries that help you do what you need to do.

That's one point where Copilot falls a bit short for me. It could be because I have a strong dislike for including massive libraries, though.

It seems a bit eager to suggest a library when 10 lines of code and already available functionality can do the trick. It also seems to like suggesting the most popular ones over one that does the trick and is smaller in scope.

It does make sense, as it will suggest whatever appeared most in the dataset it was trained on (the most popular) or assume that you might need the additional functionality. I just really dislike including a library that is several times bigger than the code of the app it is for, especially when it's a relatively simple app and you end up with multiple libraries.
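
As a rough sketch of the "10 lines instead of a library" idea (this helper is invented for illustration), a tiny retry wrapper built on nothing but the base class library can cover the one call site that would otherwise pull in a whole resilience package:

using System;
using System.Threading;

public static class Retry
{
    // Run an action up to 'attempts' times, sleeping a little longer after each failure.
    public static T Run<T>(Func<T> action, int attempts = 3, int delayMs = 200)
    {
        for (var i = 1; ; i++)
        {
            try { return action(); }
            catch (Exception) when (i < attempts)
            {
                Thread.Sleep(delayMs * i); // simple linear backoff; the final failure propagates
            }
        }
    }
}

Then it's just var result = Retry.Run(() => CallTheFlakyThing()); at the call site, with CallTheFlakyThing standing in for whatever call was failing.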

When you ask if another library will do, it will almost always agree with you, because they made it such a 'yes man' after the initial launch. People trying to pick fights with it for clout made it too non-confrontational imo
 
I can confirm it works both ways.

On one hand it can be very useful for tedious tasks. If you need to plow through several levels of nested things to find data (external libraries are always a joy, aren't they?), then instead of doing that by hand, just describe what you need and generate the code. Take care to inspect the result to confirm it does what you actually wanted it to do.
Got all your properties in JSON and need a C# ViewModel for it? That's a lot of tedious typing; just generate it.
Big time savers!

On the other hand, if you want it to write half the program for you because you're lazy, you end up with buggy code that you don't understand because you didn't write it. It can end up costing more time than it saves.

As an 'aid' when you're learning a new language it's a bit double-edged imo. It's tempting to lean heavily on it because you don't know what you're doing yet, but you end up slowing down your learning. Instead of reading the documentation and understanding things, you get a quick fix handed to you.

I do greatly appreciate that I can sometimes just do a "Hey, I wrote this block of code. I'm pretty sure some things could have been a bit more efficient, got any pointers?" and it will pick stuff apart and actually come up with solid pointers. (Getting my hands dirty in C# at the moment, and often my loops can be simplified into an easier to read/understand LINQ expression.)
Agreed. These people are using AI incorrectly if it hasn't improved their productivity. I use AI to simplify tasks and remove tedious processes in development. It saves me countless hours a week, allowing me to do more with my time... Do I work less as a result? No. Do I accomplish more in that time? Absolutely.
 
As a developer with more than a couple decades of experience, I tried using AI to generate a program for work (instead of writing it myself). What I got was a mess that wouldn't even pass syntax checks, let alone solve the issue. It made up fields that didn't even exist in the database. I was slightly shocked, but in a good way; my job is safe for the time being, at least for the language I am using. :laughing:
 
When COBOL was invented, people said it would be the end of the job description of ‘professional programmer’, because managers would be able to write the code the business needed. Reality rarely matches hype.
 
As a developer with more than a couple decades of experience, I tried using AI to generate a program for work (instead of writing it myself). What I got was a mess that wouldn't even pass syntax checks, let alone solve the issue. It made up fields that didn't even exist in the database. I was slightly shocked, but in a good way; my job is safe for the time being, at least for the language I am using. :laughing:
Definitely true. AI is more like the intern you give light/basic tasks to, or you'll get a cluster. :-D Our dev jobs are safe for now. :p
 