Comprehension debt: A ticking time bomb of LLM-generated code

codemanship.wordpress.com

516 points by todsacerdoti 2 days ago


mrkeen - 2 days ago

This was a pre-existing problem, even if reliance on LLMs is making it worse.

Naur (https://gwern.net/doc/cs/algorithm/1985-naur.pdf) called it "theory building":

> The death of a program happens when the programmer team possessing its theory is dissolved. A dead program may continue to be used for execution in a computer and to produce useful results. The actual state of death becomes visible when demands for modifications of the program cannot be intelligently answered. Revival of a program is the rebuilding of its theory by a new programmer team.

Lamport calls it "programming ≠ coding", where programming is "what you want to achieve and how" and coding is telling the computer how to do it.

I strongly agree with all of this. Even if your dev team skipped any kind of theory-building or modelling phase, they'd still passively absorb some of the model while typing the code into the computer. I think that it's this last resort of incidental model building that the LLM replaces.

I suspect that there is a strong correlation between programmers who don't think that there needs to be a model/theory, and those who are reporting that LLMs are speeding them up.

mixedbit - 2 days ago

My experience is that LLMs too often find solutions that work but are far more complex than necessary. It is easiest to recognize and remove such complexity when the code is originally created, because at that point the author has the best understanding of the problem being solved, but this requires extra time and effort. Once the overly complex code is committed, it is much harder to recognize that the complexity is not needed. Readers and maintainers of code usually assume that the existing code solves a real-world problem; they do not have enough context to recognize that a much simpler solution would work just as well.

trjordan - 2 days ago

LLMs absolutely produce reams of hard-to-debug code. It's a real problem.

But "Teams that care about quality will take the time to review and understand LLM-generated code" is already failing. Sounds nice to say, but you can't review code being generated faster than you can read it. You either become a bottleneck (defeats the point) or you rubber-stamp it (creates the debt). Pick your poison.

Everyone's trying to bolt review processes onto this. That's the wrong layer. That's how you'd coach a junior dev, who learns. AI doesn't learn. You'll be arguing about the same 7 issues forever.

These things are context-hungry but most people give them nothing. "Write a function that fixes my problem" doesn't work, surprise surprise.

We need different primitives. Not "read everything the LLM wrote very carefully," but ways to feed it the why, the motivation, the discussion, and the prior art. Otherwise, yeah, we're building a mountain of code nobody understands.

alexpotato - a day ago

Was listening to the Dwarkesh Patel podcast recently and the guest (Agustin Lebron) [0] mentioned the book "A Deepness In The Sky" by Vernor Vinge [1].

I started reading it, and a key plot point is a computer system that is thousands of years old. One of the main characters has been in "cold sleep" for so long that he's the only one who knows some of the hidden backdoors. That legacy knowledge is then used to great effect.

Highly recommend it for a great fictional use of institutional knowledge on a legacy codebase (and a great story overall).

0 - https://www.youtube.com/watch?v=3BBNG0TlVwM

1 - https://amzn.to/42Fki8n

low_tech_punk - 2 days ago

Most programmers don't understand the low level assembly or machine code. High level language becomes the layer where human comprehension and collaboration happens.

LLMs are pushing that layer towards natural language and spec-driven development. The only *big* difference is that high-level programming languages are still deterministic but natural language is not.

I'm guessing we've reached an irreducible point where the amount of information needed to specify the behavior of a program is nearly optimally represented in programming languages after decades of evolution. More abstraction into the natural-language realm would make it lossy, and less abstraction down to low-level code would make it verbose.

wkirby - 2 days ago

I see this as the next great wave of work for me and my team. We sustained our business for a good 5–8 years on rescuing legacy code from offshore teams as small-to-medium sized companies re-shored their contract devs. We're currently in a demand lull as these same companies have started relying heavily on LLMs to "write" "code" --- but as long as we survive the next 18 months, I see a large opportunity as these businesses start to feel the weight of the tech debt they accrued by trusting Claude when it says "your code is now production ready."

donatj - 2 days ago

A friend was recently telling me about an LLM-generated PR he was reviewing, submitted by a largely non-technical manager. From the outside the feature appeared to work entirely, but on actually investigating the thousands of lines of generated code, it was instead hacking their response-cache system to appear to work without actually updating anything on the backend.

It took a ton of effort on his part to convince his manager that this wasn't ready to be merged.

I wonder how much vibe coded software is out there in the wild that just appears to work?

meander_water - 2 days ago

I've done my share of vibe coding, and I completely agree with OP.

You just don't build up the necessary mental model of what the code does when vibing, and so although you saved time generating the code, you lose all that anyway when you hit a tricky bug and have to spend time building up the mental model to figure out what's wrong.

And saying "oh just do all the planning up front" just doesn't work in the real world where requirements change every minute.

And if you ever see anyone using "accepted lines" as a metric for developer productivity/hours saved, take it with a grain of salt.

myflash13 - 2 days ago

This is not just for LLM code. This is for any code that is written by anyone except yourself. A new engineer at Google, for example, cannot hit the ground running and make significant changes to the Google algorithm without months of "comprehension debt" to pay off.

However, code that is well-designed by humans tends to be easier to understand than LLM spaghetti.

VikingCoder - 2 days ago

Kernighan's Law - Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.

Modern Addendum: And if you have an LLM generate your code, you'll need one twice as smart to debug it.

BoxFour - 2 days ago

I'm not of the "LLMs will replace all software developers within a year" mindset, but this critique feels a bit overstated.

The challenge of navigating rapidly changing or poorly documented code isn’t new: It’s been a constant at every company I’ve worked with. At larger organizations the sheer volume of code, often written by adjacent teams, will outpace your ability to fully understand it. Smaller companies tend to iterate so quickly (and experience so much turnover) that code written two weeks ago might already be unrecognizable, if the original author is even still around after those two weeks!

The old adage still applies: the ability to read code is more crucial than the ability to write it. LLMs just amplify that dynamic. The only real difference is that you should assume the author is gone the moment the code lands. The author is ephemeral, or they went on PTO/quit immediately afterward: Whatever makes you more comfortable.

IgorPartola - 2 days ago

So far I have found two decent uses for LLM generated code.

First, refactoring code. Specifically, recently I used it on a library that had solid automated testing coverage. I needed to change the calling conventions of a bunch of methods and classes in the library, but didn’t want to rewrite the 100+ unit tests by hand. Claude did this quickly and without fuss.

Second is one-time-use code. Basically, let's say you need to convert a bunch of random CSV files to a single YAML file, or convert a bunch of video files in different formats to a single standard format, or find any photos in your library that are out of focus. This works reasonably well.
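As a rough illustration of that kind of throwaway script, here is a minimal sketch (the filenames, the output path, and the use of PyYAML are my assumptions, not anything from the original comment):

    # Throwaway one-off: merge every CSV in the current directory into one YAML file.
    # Filenames and the output path are illustrative only.
    import csv
    import glob

    import yaml  # pip install pyyaml

    rows = []
    for path in sorted(glob.glob("*.csv")):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                record = dict(row)
                record["_source"] = path  # remember which file each row came from
                rows.append(record)

    with open("combined.yaml", "w") as f:
        yaml.safe_dump(rows, f, sort_keys=False)

The point is exactly that this is disposable: nobody needs to build a lasting mental model of it, so the comprehension-debt concern barely applies.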

Bonus one is just generating sample code for well known libraries.

I have been curious what would happen if I handed something like Claude a whole server and told it to manage it however it wants with relatively little instruction.

romaniv - a day ago

No, this is not a pre-existing problem.

In the past the problem was about transferring a mental model from one developer to the other. This applied even when people copy-pasted poorly understood chunks of example code from StackOverflow. There was specific intent and some sort of idea of why this particular chunk of code should work.

With LLM-generated software there can be no underlying mental model of the code at all. None. There is nothing to transfer or infer.

gwbas1c - 2 days ago

One of the things I find AI is best for is coding operations that don't need to understand context.

I.e., if I need a method (or small set of methods) with clearly defined inputs and outputs, probably because they follow a well-known algorithm, AI is very useful. But in this case, wider comprehension isn't needed, because all the LLM is doing is copying and adjusting.

mcoliver - a day ago

The counterpoint to this is that LLMs can not only write code, they can comprehend it! They are incredibly useful for getting up to speed on a new code base and transferring comprehension from machine to human. This of course spans all job functions and is still immature in its accuracy, but it is rapidly approaching a point where people with an aptitude for learning and asking the right questions can actually have a decent shot at completing tasks outside of their domain expertise.

dbuxton - 2 days ago

I think this is a relatively succinct summary of the downside case for LLM code generation. I hear a lot of this, and as someone who enjoys a well-structured codebase, I have a lot of instinctive sympathy.

However, I think we should be thinking harder about how coding will change as LLMs change the economics of writing code:

- If the cost of delivering a feature is ~0, what's the point in spending weeks prioritizing it? Maybe Product becomes more like an iterative QA function?

- What are the risks that we currently manage through good software engineering practices, and what's the actual impact of those risks materializing? For instance, if we expose customer data that's probably pretty existential, but most companies can tolerate a little unplanned downtime (even if they don't enjoy it!). As the economics change, how sustainable is the current cost/benefit equilibrium of high-quality code?

We might not like it, but my guess is that in ≤ 5 years actual code will be more akin to assembler: sure, we might jump in and optimize, but we are really just monitoring the test suites, coverage, and risks, rather than tuning whether the same library function is being evolved in a way that gives leverage across the code base.

simonsarris - 2 days ago

A softer version of this has existed since word processing and Xerox machines (copiers) took off, in law and regulations. Tax code, zoning code etc exploded in complexity once words became immensely easy to create and copy.

dexterlagan - a day ago

We've been through many technological revolutions in computing alone over the past 50 years. The rate of progress of LLMs and AI in general over the past 2 years alone makes me think that this may be unwarranted worry, akin to premature optimization. It also seems to be rooted in a slightly out-of-date, human understanding of the tech/complexity-debt problem. I don't really buy it.

Yes, complexity will increase as a result of LLM use. Yes, eventually code will be hard to understand. That's a given, but there's no turning back. Let that sink in: AI will never be as limited as it is today. It can only get better. We will never go back to a pre-LLM world, unless we obliterate all technology in some catastrophe. Today we can already grok nearly any codebase of any complexity, get models to write fantastic documentation, and explain the finer points to nearly anybody. Next year we might not even need to generate any docs; the model built into the codebase will answer any question about it, and will semi-autonomously conduct feature upgrades or more.

Staying realistic, we can say with some confidence that within the next 6-12 months alone, there are good reasons to believe that local, open-source models will match their bigger cloud cousins in coding ability, or get very close. Within the next year or two, we will quite probably see GPT-6 and Sonnet 5.0 come out, dwarfing all the models that came before. With this, there is a high probability that any comprehension or technical debt accumulated over the past year or more will be rendered completely irrelevant.

The benefits of any development made until then, even sloppy development, should more than make up for the downside caused by tech debt or any kind of overly-high-complexity problem. Even if I'm dead wrong and we hit a ceiling in LLMs' ability to grok huge or complex codebases, it is unlikely to appear within the next few months. Additionally, behind closed doors the progress being made is nothing short of astounding. Recent research at Stanford might quite simply change all of these naysayers' minds.

__mharrison__ - 2 days ago

My experience has been that LLMs help me make sense of new code much faster than before.

When I really need to understand what's happening with code, I generally will write out each step myself.

LLMs make it much easier for me to do this step and more. I've used LLMs to quickly file PRs for new (to me) code bases.

kristianc - 2 days ago

Soon a capable LLM will have enough training material to spit out "LLMs are atrophying coding skills" / "LLM code is unmaintainable" / "LLM code is closing down opportunities for juniors" / "LLMs do the fun bits of coding" pieces on demand.

A lot of these criticisms are valid, and I recognise there's a need for people to put their own personal stake in the ground as one of the "true craftsmen," but we're now at the point where a lot of these articles are not covering any real new ground.

At least some individual war stories about examples where people have tried to apply LLMs would be nice, as well as not pretending that the problem of sloppy code didn't exist before LLMs.

injidup - 2 days ago

Shouldn't you be getting the LLM to also generate test cases to drive the code, and enforce coding standards on the LLM so that it generates small, easily comprehensible software modules with high-quality inline documentation?

Is this something people are doing?

estimator7292 - 2 days ago

At my first programmer job, a large majority of the code was written by a revolving door of interns allowed to push to main with no oversight. Much of the codebase was unknown and irreplaceable, which meant it slowly degraded and crumbled over the years. Even way back then, everyone knew the entire project was beyond salvage and needed to be rewritten from scratch.

Well, they kept limping along with that mess for another ten years while the industry sprinted ahead. They finally released a new product recently, but I don't think anyone cares, because everyone else did it better five years ago.

blindriver - 2 days ago

No, I 100% don't think it will happen.

LLMs have made content worth precisely zero. Any content can be duplicated with a prompt. That means code is also worth precisely zero. It doesn't matter if humans can understand the code; what matters is whether the LLM can understand the code and make modifications.

As long as the LLM can read the code and adjust it based on the prompt, what happens on the inside doesn't matter. Anything can be fixed with simply a new prompt.

osigurdson - 2 days ago

I wonder how long it will take for the world to kind of catch up to reality with today's (and likely tomorrow's) AI? Right now, most companies are in a complete holding pattern - sort of doing nothing other than small scale layoffs here and there - waiting for AI to get better. It is like a self-induced global recession where everyone just decides to slow down and do less.

sorcercode - a day ago

There's truth to a lot of what's said in this post, and I see many people complain, but these opinions feel short-sighted (not meant derogatorily - just that these are shorter-term problems).

> Teams that care about quality will take the time to review and understand (and more often than not, rework) LLM-generated code before it makes it into the repo. This slows things down, to the extent that any time saved using the LLM coding assistant is often canceled out by the downstream effort.

I recently tried a mini experiment for myself to (dis)prove similar notions. I feel more convinced we'll figure out a way to use LLMs and keep maintainable repositories.

I intentionally tried to use a language I'm not as proficient in (but obviously I have a lot of background in programming) to see if I could keep steering the LLM effectively.

https://kau.sh/blog/container-traffic-control/

and I saved a *lot* of time.

sixhobbits - a day ago

One side of the equation is definitely that we'll get more 'bad' code.

But nearly every engineer I've ever spoken to has over-indexed on 'tech debt bad'. Tech debt is a lot like normal debt - you can have a lot of it and still be a healthy business.

The other side of the equation is that it's easier to understand and make changes to code with LLMs. I've been able to create "Business Value" (tm) in other people's legacy code bases in languages I don't know by making CRUD apps do things differently from how they currently do things.

Before, I'd have needed to hire a developer who specialises in that language and pay them to get up to speed on the code base.

So I agree with the article that the concerns are valid, but overall I'm optimistic that it's going to balance out in the long run - we'll have more code, throw away more code, and edit code faster, and a lot of that will cancel.

efitz - a day ago

I think I disagree with the premise.

If the assertion is, I want to use non-LLM methods to maintain LLM-generated code, then I agree, there is a looming problem.

The solution to making LLM-generated code maintainable involves:

1) Using good design practices before generating the code, e.g. have a design and write it down. This is a good practice regardless of maintainability concerns, because it is part of how you get good results when having LLMs generate code.

2) Keeping a record of the prompts that you used to generate the code, as part of the code. Do NOT exclude CLAUDE.md from your git repo, for instance, and extract and save your prompts.

3) Maintain the code with LLMs, if you generated it with LLMs.

Mandatory car analogy:

Of course there was a looming maintenance problem when the automobile was introduced, because livery stables were unprepared to deal with messy, unpredictable automobiles.

strangescript - 2 days ago

So many of these concepts only make sense under the assumption that AI will not get better and humans will continue to pore over code by hand.

They won't. In a year or two, these will be the articles that get linked back to, much like the "Is the internet just a fad?" articles of the late 90s.

malkosta - a day ago

I fight against this by using it mostly on trivial tasks that require no comprehension at all, as well as fixing docs and extending tests. It helps me focus on what I love and leaves the boring stuff automated.

For complex tasks, I use it just to help me plan or build a draft (and hacky) pull request to explore options. Then I rewrite it myself, again keeping the best part for myself.

LLMs made writing code even more fun than it was before, for me. I guess the outcome only depends on the user. At this point, it's clear that all my peers who can't have fun with it are using it the way they use ChatGPT: just throwing a prompt at it, hoping for the best, and then getting frustrated.

wiradikusuma - a day ago

From my experience, you should either treat LLM-generated code as you would ordinary (pre-LLM) code that you need to review every time it changes, or you should not review it at all and treat it as a black box with clearly defined boundaries. You test it by putting on your QA hat, not your Developer hat.

You can't change your stance later; it will just give you a headache.

When the former breaks, you fix it like conventional bug hunting. When the latter breaks, you fix it by either asking LLM to fix it or scrap it and ask LLM to regenerate it.

randomtoast - 2 days ago

I think the only way to escape this trap is by developing better LLMs in the future. The rapid rate at which new AI-generated code is produced means that humans will no longer be able to review it all.

book_mike - 2 days ago

LLMs are powerful tools, but they are not going to save the world. I have seen this before. The experienced crowd gets chuffed because it is a new pattern that radically changes their current workflow. The new crowd haven't optimised yet, so they overuse the new way of doing things until they moderate it. The only difference I can detect is that the rate of change has increased to an almost incomprehensible pace.

The wave’s still breaking, so I’m going to ride it out until it smooths into calm water. Maybe it never will. I don't know.

cadamsdotcom - a day ago

Easy way to understand the code: have your AI write tests for it. Especially the gnarliest parts.

Tests prevent regressions and act as documentation. You can use them to prove any refactor is still going to have the same outcome. And you can change the production code on purpose to break the tests and thus prove that they do what they say they do.
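As a rough sketch of what such tests could look like (pytest style; the apply_discount function and its behaviour are hypothetical stand-ins, not from the comment):

    # Characterization tests pin down the current behaviour of generated code.
    # apply_discount is a hypothetical stand-in; in practice it would be
    # imported from the (LLM-generated) module under test.

    def apply_discount(total: float, discount: float) -> float:
        return max(total - discount, 0.0)

    def test_discount_never_pushes_total_below_zero():
        assert apply_discount(100.0, 150.0) == 0.0

    def test_zero_discount_leaves_total_unchanged():
        assert apply_discount(100.0, 0.0) == 100.0

Deliberately breaking apply_discount (say, dropping the max) should make these fail, which is the "change the production code on purpose to break the tests" step described above.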

And your AI can use them to work on the codebase too.

purpleredrose - 2 days ago

Code will be write-only soon enough. If it doesn't work, regenerate it until it passes your tests, which you have vetted, but which were probably also generated.

titaniumrain - 2 days ago

If this problem has existed before, why start worrying now? And if scale might make it problematic, can we quantify the impact instead of simply worrying?

softwaredoug - 2 days ago

You learn more when you take notes. In the same way, I understand the structure of the code better when my hands are on keyboard.

I like writing code because eventually I have to fix code. The writing will help me have a sense for what's going on. Even if it will only be 1% of the time I need to fix a bug, having that context is extremely valuable.

Then I reserve AI coding for when there's true boilerplate or a near copy-paste of a pattern.

pshirshov - a day ago

It can be partially addressed with a proper set of agent instructions, e.g. follow SOLID, use constructor injection, avoid mutability, write dual tests, use explicit typings (when applicable), etc. Though the models are remarkably bad at design, so that provides only minor relief. Everything has to be thoroughly reviewed and (preferably) rewritten by a human.

giancarlostoro - 2 days ago

If the code output doesn't look like code I can maintain in a meaningful way (barring some specialized algorithm or the like), I don't check it in. I treat it as if it were code from Stack Overflow: sometimes it's awful code, so I rewrite it if applicable (things change, understandings change); other times it works and makes sense.

segmondy - 2 days ago

Well, IMO, the issue is that we are trying to merge with the AI/LLM. Why must both of us understand the code base? Before, it was just us who understood it; why not have the AI understand it all? Why do you need to understand it? To do what, exactly? Document it? Improve it? Fix it? Well, let the LLM do all of that too.

drnick1 - a day ago

Unrelated to the content of the article, but please stop including Gravatar in your blogs. It is disrespectful to your readers since you allow them to be tracked by a company that has a notoriously poor security and privacy record. In fact, everyone should blackhole that domain.

jebarker - 2 days ago

The phenomenon is not just true in coding. I think over time we’ll see that outsourcing thinking isn’t always a good idea if you wish to develop long term knowledge and critical thinking skills. Much like social media has destroyed the ability for many to distinguish truth and fiction.

mlhpdx - a day ago

Building roads, power, sewer and schools without budgeting for maintenance, upgrades and ultimately replacement. Having a capital burn that can’t plausibly be repaid. Focusing on having more code rather than the right code. Artfully similar behaviors to me.

jv22222 - a day ago

I've found LLMs to be pretty good at explaining how legacy codebases work. Couldn't you just use that to create documentation and a cheat sheet to help you understand how it all works?

holtkam2 - a day ago

I’m convinced this is the root cause of the strange phenomenon I call “LLM coding assistants don’t increase team velocity in the long run in real world settings”

vjvjvjvjghv - 2 days ago

You have a similar problem with projects where a large number of offshore developers is used. Every day you get a huge pile of code to review which is basically impossible within the available time. So you end up with a system that nobody really understands.

daveaiello - 2 days ago

I started using LLMs to refactor and maintain utility scripts that feed data into one of my database driven websites. I don't see a downside to this sort of use of something like Claude Code or Cursor.

This is not full blown vibe coding of a web application to be sure.

energy123 - a day ago

Been trying to figure out a way to use LLMs to better understand code that comes from LLMs at a level of abstraction somewhere between the code itself and the prompt, but haven't succeeded yet.

dweinus - a day ago

Ok, not peeking at the comments yet, but I am going to predict the "put more AI on it" people will recommend solving it by putting more AI on it. Please don't disappoint!

hnthrow09382743 - 2 days ago

Frankly, if you believe this idea is true, I'm on the side of no tech/comprehension debt ever being paid down.

The analogy of debt breaks down when you can discard the program and start anew, probably at great cost to the company. But since that cost falls on the company rather than the developers, no developer is actually paying the debt; greenfield development is almost always more invigorating than maintaining legacy code. It's a bailout (really debt forgiveness) of technical debt by the company, which also happens to be paying the developers a good wage on the very nebulous promise that this won't happen again (spoiler: it will).

What developers need in order to get a bailout is enough reputation and soft skills to convince someone that a rewrite is feasible and the best option, plus leadership who are not completely convinced that you should never rewrite programs from scratch.

Joel Spolsky's beliefs here are worth a revisit in the face of hastened code generation by LLMs too, as it was based completely on human-created code: https://www.joelonsoftware.com/2000/04/06/things-you-should-...

Some programs still should not be rewritten: Excel, Word, many of the more popular and large programs. However, many smaller and medium applications that are being maintained by developers using LLMs this way will more easily end up with a larger fraction of LLM-generated code that is harder to understand (again, if you believe the article). Whereas before you might have rewritten a small program, you might now rewrite a medium one.

rafaelbeirigo - 2 days ago

I haven't used them in big codebases, but they were also able to help me understand the code they generated. Isn't this feasible (yet) on big codebases?

vanillax - a day ago

Offshore coding practices in the 2010s were the same thing as LLMs. I'd take an LLM over offshore 10/hr devs any day of the week...

codazoda - 2 days ago

I find LLMs most useful for exactly this kind of understanding of legacy code.

I can ask questions like, “how is this code organized” and, “where does [thing] happen?”

axpy906 - a day ago

I get what the author is saying but isn’t that why we have problem solving, test coverage and documentation?

JCM9 - 2 days ago

It’s not just code, but across the board we’re not seeing AI help people do better things faster, we’re seeing them meh mediocre things faster under the guise of being “good.”

The market will eventually self correct once folks get more burned by that.

jermberj - a day ago

> An effect that’s being more and more widely reported is the increase in time it’s taking developers to modify or fix code that was generated by Large Language Models.

And this is where I stop reading. You cannot make such a descriptive statement without some sort of corroborating evidence other than your intuition/anecdotes.

purpleredrose - 2 days ago

Code is going to be write-only soon enough. There will be no debt, just regenerated code.

laweijfmvo - a day ago

Is no one using LLMs to help them read and understand code? Reading code is definitely a skill that needs to be acquired, but LLMs can definitely help. We should be pushing that instead of "vibe coding".

vdupras - a day ago

Our collective future has "mediocrity" written all over it.

And when you think about it, LLMs are pretty much, by design, machines that look for truth in mediocrity in its etymological sense.

intrasight - 2 days ago

> ... taking developers to modify or fix code

Fix your tests, not your resulting code.

danans - a day ago

The guiding principle for most of the tech industry is to produce the cheapest thing you can get away with. There is little intrinsic motivation toward quality left in the culture.

When velocity and quantity are massively incentivized over understanding, strategy, and quality, this is the result. Enshittification of not only the product, but our own professional minds.

Havoc - 2 days ago

At least LLMs are pretty good at explaining code.

cmrdporcupine - 13 hours ago

After a few months of going down this hole with agentic coding (mostly Claude Code) I personally think the problem comes down to a few factors:

1. Initial euphoria, both from having the tool and from seeing how much can be done quickly, without having a good sense of its limits or reach. Mining too deep and disturbing the Balrog. Basically: doing too much.

2. Not sufficiently reviewing the work it produces.

3. The tools themselves being badly designed from a UX POV to encourage #1 and #2.

From my perspective, there's a fundamental mis-marketing of the agentic tools, and a failure on the part of the designers of these products -- what they could be producing is a tool that works with developers in a Socratic dialogue, in an interactive manner, with the engineer going through more of a mandatory review and discussion process that ensures a guided authoring process.

When guided and fenced with a good foundational architecture, Claude can produce good work. But the long term health of the project depends on the engineer doing the prompting to be 100% involved. And this can actually be an insanely exhausting process.

In the last 6 months, I have gone from highly skeptical and cynical about LLMs as coding agents, to euphoric and delighted, back to a more cautious approach. I use Claude Code daily and constantly. But I try to use it in a very supervised fashion.

What I'd like to see is agentic tools that are less agentic and more interactive. Claude will prompt you Yes/No diff by diff but this is the wrong level of granularity. What we need is something more akin to a pair programming process and instead of Yes/No prompts there needs to be a combination of an educational aspect (tool tells you what it's discovered, and you tell it what you've discovered) with review.

The makers of these tools need to have them slow down and stop pretending to automate us out of work, and instead take their place as tools used by skilled engineers. If they don't, we're in for a world of mess.

ccvannorman - 2 days ago

I joined a company with 20k lines of Next/React generated in 1 month. I spent over a week rewriting many parts of the application (mostly the data model and duplicated/conflicting functionality).

At first I was frustrated but my boss said it was actually a perfect sequence, since that "crappy code" did generate a working demo that our future customers loved, which gave us the validation to re-write. And I agree!

LLMs are just another tool in the chest: a curious, lightning-fast junior developer with an IQ of 85 who can't learn and needs a memory wipe whenever they make a design mistake.

When I use it knowing its constraints, it's a great tool! But yeah, if used wrong, you are going to make a mess, just like with any powerful tool.

pnathan - 2 days ago

I am running a lightweight experiment: I have a Java repo I am essentially vibecoding from scratch. I am effectively acting as a fairly pointy-haired PM on it.

The goal is to see how far I can push the LLM. How good is it... really?

rhelz - 2 days ago

When was the golden age, when everybody understood how everything worked?

throw_m239339 - a day ago

That's fantastic, IMHO; it guarantees competent engineers decades of work to eventually fix all the bad code being deployed. Let's not even get started on performance optimization jobs.

m3kw9 - a day ago

This is what happens when you let the AI run for 30 minutes. Ain't no way you will read the code with much critique if it's a 1-hour+ read. You have to generate compartmentalized code so you don't need to check much.

wilg - a day ago

Luckily LLMs can also comprehend code (and are getting better at doing so), this problem will probably solve itself with more LLMs. (Don't shoot the messenger!)

josefrichter - 2 days ago

I guess this is just another definition of vibe coding. You're deliberately creating code that you don't fully understand. This has always existed, but LLMs greatly amplify it.

danans - a day ago

When speed and quantity are incentivized over understanding and quality, this is what we get. Enshittification of not only the product, but our own professional minds.

justinhj - a day ago

Technical leaders need to educate their teams not to create this kind of technical debt. We have a new tool for designing and implementing code, but ultimately the agent is the software engineer and the same practices we have always followed still have value; more value perhaps.

bongodongobob - a day ago

This is just tech debt. It's all around us. This isn't a new concept, and it's not something new with LLMs/AI. This isn't ANY different from onboarding any other tech into your stack.

vonneumannstan - 2 days ago

This is surely an issue, and more and more serious people are admitting that 50% or more of their code is now AI-generated. However, it looks like AI is improving fast enough that it will take on the cognitive load of understanding large code bases, and humans will be relegated to system architecture and design.

scotty79 - 2 days ago

I'm sure future LLMs will be able to comprehend more. So the debt, similarly to real world debt, is fine, as long as the line goes up.

bparsons - a day ago

This is a problem that is not unique to software engineering, and predates LLMs.

Large organizations are increasingly made up of technical specialists who are very good at their little corner of the operation. In the past, you had employees present at firms for 20+ years who not only understood the systems in a holistic way, but could recall why certain design or engineering decisions were made.

There is also a demographic driver. The boomer generation, with all the institutional memory, has left. Gen X was a smaller cohort and was not able to fully absorb that knowledge transfer. What is left are a lot of organizations run by people under the age of 45, working on systems where they may not fully understand the plumbing or the context.

righthand - a day ago

I think it’s a lot worse. My coworkers don’t even read the code base for easily answered questions anymore. They just ping me on Slack. I want to believe there are no dumb questions, but now it’s become “be ignorant and ask the expert for non-expert related tasks”.

What happened? I don't really use LLMs, so I'm not sure how people have completely lost their ability to problem-solve. Surely they must remember 6 months ago, when they were debugging just fine?

claytongulick - a day ago

I'm so glad someone has finally described the phenomenon so well.

"Comprehension debt" is a perfect description for the thing I've been the most concerned about with AI coding.

Once I got past the Dunning-Kruger phase and started really looking at what was being generated, I ran into this comprehension issue.

With a human, even a very junior one, you can sort of "get in the developer's head". You can tell which team member wrote which code and what they were thinking at the time. This leads to a narrative, or story of execution which is mostly comprehensible.

With the AI stuff, it's just stochastic-parrot output. It may work just fine, but there will be things like random functions that are never called, hundreds or thousands of lines of extra code to do very simple things, and references to things that don't exist and never have.

I know this stuff can exist in human code bases too - but generally I can reason about why. "Oh, this was taken out for this issue and the dev forgot to delete it".

I can track it, even if it's poor quality.

With the AI stuff, it's just randomly there. No idea why it's there, whether it's used or was ever used, whether it makes sense, whether it's extra fluff or brilliant.

It takes a lot of work to figure out.
