Coding with LLMs (Claude Code, OpenAI Codex) is often presented as the ‘killer app’ for Generative AI. But looking at data, it seems the one piece of the puzzle missing is actual cost. …
I use it a lot, and if you are getting these kinds of results you are either trolling, or just flat out not providing the details and guardrails required with your prompts.
I’ve been in software for decades, and if used correctly, yes it can accelerate velocity of building code out. 10x? No… if you are lucky and careful perhaps 2-4x.
As ALWAYS the human should be in the loop and is on the hook for any code generated.
Lots of anti-AI people in this thread it seems. I get it - personally I HATE the fake image generation! But when it comes to coding, I have to agree with you that using LLMs correctly can offer huge benefits.
Modern harnesses are getting more and more sophisticated and your mileage varies depending on how well you use them (like any complex tool). At the end of the day it’s still up to the developer to take the code and make sure it’s correct - no different from before, when we used to copy code from Stack Overflow or other examples and modify it for our own use.
One thing I have to add - I honestly don’t understand why anyone would use Claude or ChatGPT at their ridiculous prices when DeepSeek exists…
The question comes down to cost. The actually good models are already expensive, yet still apparently subsidised. Once we have to pay the true cost they will only be worth using when you are truly stuck.
Lots of use for the simpler models for basic util creation and simpler cleanup refactor stuff though. Quite nice if it actually turns out like that.
I was using a set of template files designed for LLMs to review that project. It is absolutely the fault of Claude that it told me to do something one way, then told me to try another, and when I reverted it said the original was the optimal approach.
Where I find it helps is in getting initial starts and as a first pass at code review. But in both cases it is never operating on its own, and its feedback is filtered through me or another senior dev.