> We aren't seeing higher overall levels of productivity.
You can't measure productivity for shit, otherwise companies would look entirely different. Starting with me not having to do my own finances, event planning, or a hundred other things that are not in my job description, not my specialty, and which were done by dedicated staff just a few decades ago, before tech "improved office productivity".
> We aren't seeing the developers who start using copilot/gpt rush ahead of their peers.
That's because individual productivity is usually constrained by team productivity. Devs rushing ahead of their teammates makes the team dysfunctional.
> We aren't seeing any ability to cut back on developer spend.
Devs aren't stupid. They're not going to hand you an opportunity to cut developer spend if they can avoid it.
> We aren't seeing anything positive yet and many developers have been using copilot/gpt for >1 year.
My belief is that's because you aren't measuring the right things. But then, no one is. Measuring developer productivity is a famously unsolved problem.
Perhaps we have added more meetings because developers have more free time.
Or perhaps developers were never the bottleneck.
We can see large productivity improvements when we make simple changes like having product managers join the developers' daily standup meetings. We can even measure productivity improvements from Slack's/Zoom's auto-summary features. Yet gpt/copilot doesn't even register.
> We can even measure productivity improvements from Slack's/Zoom's auto-summary features.
While not code generation, this auto-summary is powered by the same tech. I think using it to sift through and surface relevant information, as opposed to generating new things, will have the biggest impact.
By far the greatest value I get out of LLMs is asking them to help me understand code written by others. I feel like this is an under-appreciated use. How long has this feature been in Copilot? Since February or so? Are people using it? I do not use Copilot.
I use ChatGPT, Copilot, etc. to reduce my cognitive load and get a lot of things done quicker, so I also have more time to fuck around. You're out of your goddamn mind if you think I'm going to increase my output for the mere chance that maybe I'll get an above-inflation raise in a year. "We gave our devs a magic 10% productivity boost machine, but their output hasn't increased? I guess the machine doesn't work..." It's amusing how out of touch you are.
There is an ethical question in here that I don’t have an answer for. As an employee, I find a way to do my job more efficiently. Do I hand those efficiencies to my employer so I can get a pat on the head, or do I keep them to myself to make my own life less stressful? If I give them to the boss, do they even have the ability to increase my pay? Using the extra time to slack off rather than enriching the employer might be the best choice.
Passing on personal productivity gains to management is always a HUGE L for the individual worker.
As a dev, you can use the saved time to slow down and not be stressed, spend more time chatting with colleagues, learn new skills, maybe improve the quality of the code, etc. Or you can pass it on to management, which will increase your workload back to the point where you're stressed again, and your slower colleagues will be let go, so now you get to feel bad about that and they won't be around to chat with.
I have never in my life seen workers actually get rewarded with pay raises for improved productivity, that is just a myth the foolish chase, like the pot of gold at the end of the rainbow.
I have also tried being the top performer on a team before (using automation tools to achieve it), and all I got was praise from management. That's nice, but I can't pay for my holidays with praise, so not worth it.
Writing code is just one part of the process. Other bottlenecks might prevent you from seeing overall productivity improvements.
For example:
- time between PRs being created and being picked up for review and merged (see the sketch after this list for one rough way to measure this)
- time spent on releasing at end of sprint cycles
- time spent waiting for QA to review and approve
- extreme scrum practices like "you can only work on things in the sprint, even if all work is done"
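For the first of those, here's a minimal sketch of one way to eyeball it, assuming the code lives on GitHub. The repo name and the GITHUB_TOKEN env var are placeholders, and pagination and rate limits are ignored:

```python
# Minimal sketch: how long merged PRs wait for a first review vs how long
# they take to merge overall. The repo name and GITHUB_TOKEN are placeholders;
# pagination and rate limits are ignored.
import os
from datetime import datetime

import requests

API = "https://api.github.com"
REPO = "your-org/your-repo"  # placeholder
HEADERS = {"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"}


def ts(value):
    # GitHub timestamps look like "2024-05-01T12:34:56Z"
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


prs = requests.get(
    f"{API}/repos/{REPO}/pulls",
    headers=HEADERS,
    params={"state": "closed", "per_page": 100},
).json()

hours_to_merge, hours_to_first_review = [], []
for pr in prs:
    if not pr.get("merged_at"):
        continue  # closed without merging
    created, merged = ts(pr["created_at"]), ts(pr["merged_at"])
    hours_to_merge.append((merged - created).total_seconds() / 3600)

    reviews = requests.get(
        f"{API}/repos/{REPO}/pulls/{pr['number']}/reviews", headers=HEADERS
    ).json()
    submitted = [ts(r["submitted_at"]) for r in reviews if r.get("submitted_at")]
    if submitted:
        hours_to_first_review.append((min(submitted) - created).total_seconds() / 3600)

if hours_to_merge:
    print(f"avg hours from PR open to merge:        {sum(hours_to_merge) / len(hours_to_merge):.1f}")
if hours_to_first_review:
    print(f"avg hours from PR open to first review: {sum(hours_to_first_review) / len(hours_to_first_review):.1f}")
```

If review pickup time dwarfs actual coding time, no amount of faster code generation is going to show up at the feature level.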
How are you measuring developer productivity? Were those that adopted copilot and chatgpt now enabled to finally keep up with their faster peers (as opposed to outstripping them)? Is developer satisfaction improved, and therefore retention?
Yes, other bottlenecks might be preventing us from seeing overall productivity improvements. We might require large organisational changes across the industry in order to take advantage of the improvements.
I guess we will see if smaller startups without many of our bottlenecks are suddenly able to be much more competitive.
> How are you measuring developer productivity?
We use a host of quantitative and qualitative measures. None of them show any positive improvements. These include the basics like roadmap reviews, demo sessions, feature cycle time, etc., as well as fairly comprehensive business metrics.
In some teams every developer is using copilot, and yet we can't see any correlation between it and improved business metrics.
At the same time we can measure the impact from changing the label on a button on our UI on these business metrics.
> Were those that adopted copilot and chatgpt now enabled to finally keep up with their faster peers
No.
> Is developer satisfaction improved, and therefore retention?
> We use a host of quantitative and qualitative measures. None of them show any positive improvements. These include the basics like roadmap reviews, demo sessions, feature cycle time, etc., as well as fairly comprehensive business metrics.
Those are very high-level. If there's no movement on those, I'd guess other things are bottlenecking the teams: developers can code as fast as possible and things still move at the same pace overall. That's worth knowing in itself.
If you want to really test the hypothesis that Copilot and ChatGPT have no impact on coding speed, look at more granular metrics to do with just coding: the average time from the moment a developer picks up a work item to the time it gets merged (assuming code reviews happen in a timely fashion). Hopefully you have historical pre-AI data on that metric to compare to.
Edit: and average number of defects discovered from that work after merge
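To make the comparison concrete, here is a minimal sketch of the before/after check I mean, assuming a hypothetical CSV export from your tracker; the file name, column names, and rollout date are all placeholders:

```python
# Minimal sketch: compare pickup-to-merge cycle time before and after the
# Copilot rollout. work_items.csv, the column names, and the rollout date
# are hypothetical placeholders for whatever your tracker can export.
import pandas as pd

ROLLOUT = pd.Timestamp("2023-06-01")  # placeholder rollout date

items = pd.read_csv("work_items.csv", parse_dates=["picked_up_at", "merged_at"])
items = items.dropna(subset=["picked_up_at", "merged_at"])
items["cycle_days"] = (
    items["merged_at"] - items["picked_up_at"]
).dt.total_seconds() / 86400

before = items.loc[items["picked_up_at"] < ROLLOUT, "cycle_days"]
after = items.loc[items["picked_up_at"] >= ROLLOUT, "cycle_days"]

print(f"pre-AI:  mean {before.mean():.1f} days, median {before.median():.1f} days (n={len(before)})")
print(f"post-AI: mean {after.mean():.1f} days, median {after.median():.1f} days (n={len(after)})")
```

Medians matter here because a handful of long-running items can swamp the mean.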
> look at more granular metrics to do with just coding: the average time from the moment a developer picks up a work item to the time it gets merged (assuming code reviews happen in a timely fashion)
We do collect this data.
I personally don't put a lot of stock in these kinds of metrics because they depend far too much on the way specific teams operate.
For example, perhaps Copilot helps developers understand the codebase better, so they don't need to break the tasks into such small units. Time to PR merge goes up, but total coding time could easily go down.
Or perhaps Copilot works well with very small problem sizes (IMO it does), so developers start breaking the work into tiny chunks that Copilot handles well. Time to PR merge goes way down, but total coding time for a feature stays the same.
For what it's worth, I don't believe there have been any significant changes in these code-level metrics at the org level either.
> We aren't seeing higher overall levels of productivity.
> We aren't seeing the developers who start using copilot/gpt rush ahead of their peers.
You think we are antsy worker bees, hastily rushing forwards to please the decision maker with his fancy car?
You are leadership. It's not hard. Cui bono, follow the money, etc. The incentives are clear.
If my peers and I were to receive a magic "do all my work for me" device, I can assure you exactly zero percent of that knowledge would reach your position. Why would it? The company will give me a pat on the back. I cannot pay my bills with pats on the back. Your Tesla cannot be financed with pats on the back. Surely you understand the nature of this issue.
If you write a spaghetti system where collecting the context for the AI is a big time sink, and there are so many service/language barriers that the AI gets confused, of course the AI is going to suck. Of course, if you give your programmers a gamepad and tell them to use it to program with a virtual keyboard, they're gonna suck ass too, so you should consider where the fault really lies.
Is it the superstars or the line holders that have been the first adopters? I could speculate, but I am actually curious what you are seeing in practice.
You say this, but from a management perspective at a large enterprise software company, I have not seen it.
Some of our developers use copilot and gpt and some don't, and it is incredibly difficult to see any performance difference between the groups.
We aren't seeing higher overall levels of productivity.
We aren't seeing the developers who start using copilot/gpt rush ahead of their peers.
We aren't seeing any ability to cut back on developer spend.
We aren't seeing anything positive yet and many developers have been using copilot/gpt for >1 year.
In my opinion we are just regaining some of the economic value we lost when Google Search started degrading 5-10 years ago.