LLM Use In 2025

The AI hype is real, and so are the AI benefits. I started using AI coding tools during the early days of GitHub Copilot and haven’t looked back. They’ve only gotten better in all respects. In 2024, there was a big surge in interest in Cursor, and I, too, used it up until recently. The functionality truly is amazing—almost creepily so. Only a year ago, you’d have to explicitly ask the AI what you wanted. Now, I can open a brand-new file, and Cursor will often provide the right bit of code based simply on what I’ve been working on recently. For things I’m not super familiar with, such as Tailwind, the productivity increase is probably 50x. I can give it an image of something and simply say, “Make this using HTML and Tailwind,” and it gets reasonably close.
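To give a flavor of what that looks like, here's a rough sketch of the kind of markup such a prompt tends to produce. The component and its content are invented for illustration; only the Tailwind utility classes are real.

```html
<!-- Illustrative only: the sort of card an AI tends to produce from
     "make this using HTML and Tailwind." The content is made up;
     the utility classes are standard Tailwind. -->
<div class="max-w-sm rounded-lg bg-white p-6 shadow-md">
  <img class="h-40 w-full rounded-md object-cover" src="cover.jpg" alt="Cover image">
  <h2 class="mt-4 text-xl font-semibold text-gray-900">Card title</h2>
  <p class="mt-2 text-sm text-gray-600">A short description inferred from the screenshot.</p>
  <button class="mt-4 rounded-md bg-blue-600 px-4 py-2 text-white hover:bg-blue-700">
    Action
  </button>
</div>
```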

But here’s the question: at what cost? The media frequently covers things like energy costs and the potential for job loss attributable to AI. But lately, I’ve been thinking about something much more personal: the decline in my own skills. The AI proponents I read always add a big asterisk to their statements, saying that, yes, there will be hallucinations, and, of course, you have to verify the output of the AI. Oh, and you also have to have good tests. So the AI will deliver a lot for you, but you have to stay on top of it. Let’s unpack how you do that.

One immediate problem that comes to mind is that writing basic tests has become, for better or worse, a go-to example of tedium AI can quickly take off a programmer’s plate. But if AI is writing both the tests and the code, then, at best, it is verifying its own view of the world. Of course, the humans will be reviewing all these tests and code, right? Let’s set aside the fact that we’ve now described a future where AIs do the fun stuff and humans are left doing tedious code reviews, and ask: will humans be able to keep doing effective code review?

Today we can, because we’ve spent years writing code and are intimately familiar with the domain and the subtleties of programming. But what about when we’ve offloaded much of the generative aspect of programming to AI? Therein lies my concern. After a couple of years of using AI assistants, I can already feel my brain rewiring to optimize for AI use. I’m losing some of the precision I had, as well as a clear mental model of how everything fits together. “Tab-complete all the things!” my brain says. And since most of the time it works, I’m getting lazier and lazier with my on-the-spot reviews.

AI-assisted coding is like a drug. When faced with a completely empty terminal with no code completion, I feel withdrawal-like symptoms and wonder if I can even write code that builds. If you haven’t tried this, I’d encourage you to!

In some respects, this is just part of technological advancement. I mean, thanks to technology my penmanship is worse, I can only do simple math in my head or even on paper, and I’d probably struggle to navigate a city using just paper maps. But those things aren’t causing me any stress, so is this any different? I think so. In those previous examples, the result is usually verifiable with little effort. The algorithms themselves are fine-tuned to the task and have been verified over decades, so we have high confidence in them. When your calculator gives you a result, you don’t verify it by hand.

Not so with AI/LLMs. We are continuously told that there may be errors in generation and that we need to check the results. Imagine if that were the case with calculators, and 10% of the time the answers were a bit off. It would be quite nerve-racking if you weren’t confident in your own skills to do the same job as the calculator, at least well enough to verify the output.

Yet that is where I fear we’re headed with AI-generated code. Consider the case where a code suggestion brings in a new library, and everything builds and seems to work. Will developers visit the library’s documentation and understand everything it does, whether it’s the best solution, whether its functions are being used correctly, and whether the library is maintained or has security issues? It will be tempting to skip this type of due diligence. Maybe we were skipping it before, too, but it was certainly more common to have to seek out a library based on a need and read about it just to get it working.

This type of situation recently had me second-guessing how I’m using AI code. I needed some HTMX code the other day and asked both Claude and ChatGPT for help. The solutions they offered looked functional but felt verbose, manually doing something I thought HTMX could handle directly. I probed a bit and didn’t get much improvement. I then checked the HTMX docs, found the function I was thinking of, and specifically asked why it didn’t use it. I finally got the all-too-familiar enthusiastic/apologetic response with a solution incorporating that function. I happened to know about the function because I’d skimmed the docs in the past, and had a sense that there was a better way. If I’d been new to HTMX, I wouldn’t have realized how bad the AI solutions were.
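To show the shape of the gap, here is an illustration I’ve reconstructed, not the actual exchange, the specific HTMX feature involved, or a real endpoint: the AI answers leaned toward hand-rolled fetch-and-swap JavaScript, while HTMX’s declarative attributes can often collapse that to a line or two.

```html
<!-- Illustration only, not the actual code from that exchange. -->

<!-- The verbose shape the AI solutions took: manual fetch plus DOM swap. -->
<button onclick="loadResults()">Search</button>
<div id="results"></div>
<script>
  async function loadResults() {
    const resp = await fetch("/search?q=htmx");
    document.getElementById("results").innerHTML = await resp.text();
  }
</script>

<!-- The same fragment leaning on HTMX's own attributes instead. -->
<button hx-get="/search?q=htmx" hx-target="#results">Search</button>
<div id="results"></div>
```

If you don’t know the declarative option exists, the first version looks perfectly reasonable, which is exactly the trap.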

So what is the developer to do? What AI can do is amazing, and ignoring the enhancements is probably career-limiting. Accepting no assistance at all will be too slow relative to what others are doing, but blindly shipping AI-generated code, or not having the skill to properly review it, will be worse. So where is the balance point? How much human-generated work does it take to stay sharp enough to work with the machines responsibly? I don’t think we really know yet, and I suspect a lot of developers and educators will be wrestling with that question over the coming years. For me, 2024 was the year I felt I’d swung too far away from putting in the work myself. I need to step back from maximal automation and exercise my own generative muscles. So far, I’ve decided on:

  • No longer leaning on long-form code suggestions of the kind Cursor delivers, unless I’m specifically asking for them in circumstances where I can tolerate not knowing all the details (e.g., “convert this to Tailwind”).
  • Continuing to use old-school auto-completion, which is totally fine. It’s often better, in fact, and it avoids the tiresome routine of an LLM serving up functions that don’t exist when the LSP knows exactly what does.
  • Getting back to a simpler environment. This means stepping back from VSCode and heavier IDEs. For 15 years, I lived in Vim, so it’s time to get Neovim going and rediscover terminal editing. The recent GA release of Ghostty, a project I’ve been following for a while, is a good motivator here.

2025 will be a year of course correction. I don’t know what the balance point is or how much AI I should use, but hopefully I’ll find the sweet spot.