New Notes on Artificial Intelligence
July 16th, 2025
The last time I talked about AI here was all the way back in March of 2023, just a few months after ChatGPT was released to the public and had already been adopted by over a hundred million users for things like cheating on homework, writing unoriginal resumes for job applications, and getting kicked out of college for plagiarism. But my focus then was not on how AI tools were already being used, but on how they might be used in the future. I was overall optimistic at that point that artificial intelligence could be integrated into the ways we already use the internet — summarizing and writing emails for us, posting things on social media for brands, filtering our experience further to help eliminate distractions — in ways that could be helpful and time-saving, but I also mentioned a general sense of wariness about the increasing pace of technological development.
Today, more than two and a half years after the release of ChatGPT and the introduction of large language model systems into many aspects of our daily lives, I want to talk about the ways in which artificial intelligence technology has so far consistently failed to live up to the lofty expectations big tech has set for it, the many problems with current versions of it, but also the ways in which it can and will continue to be transformative.
I think that any new technological development since the beginning of the first Industrial Revolution in the late 18th century can be viewed and measured almost like a balancing update to a video game. And, like any good balancing update, there are going to be positive and negative changes. With the invention of the telegraph in 1837, it suddenly became possible to transmit messages across great distances much more cheaply and much more quickly. This can be imagined as the +% efficiency stat. As the technology spread, it also became much easier for the average person to send and receive messages across distances, rather than such a thing being reserved for the wealthy and powerful. This is the +% accessibility stat. But there were also costs associated with this advancement. If, for example, you were a diplomat in the 19th century, it might not be a good idea to send sensitive information to a representative of another nation via telegraph, since a telegram passes through the hands of operators and can be intercepted or garbled along the way, whereas a trusted personal messenger could help avoid miscommunication. This is the -% quality stat. With any technological advancement, the goal ought to be to add as much value as possible in terms of efficiency and accessibility, while minimizing any kind of reduction in quality or other negative side effects.
Today, tech giants like Google, Microsoft, Nvidia, and OpenAI are desperate to have you believe that the positive impact of artificial intelligence tools like ChatGPT, in terms of accessibility and efficiency, is going to be on the same level of consequence as the internet, the radio, the telegraph, etc., and that any negative consequences will be negligible and easily managed. I'd like to take a minute to unpack each of these claims separately.
Let's start with the exciting promises about the positive impacts of artificial intelligence tools. I remember when ChatGPT was first released, back in 2022, it felt very much like some kind of superintelligence might be mere months away given the rate of advancement in the capabilities of large language model generative AI.
However, today we can see that the difference in capability between GPT-3.5, released in late 2022, and GPT-4o, released in May 2024, is not that impressive, and while it remains to be seen what the capabilities of a GPT-5 might be when released, we're currently experiencing what seems like a performance S-curve for large language model capabilities. There's a great video by NeetCode which explains these performance issues in more detail, and I highly recommend you check it out. None of this is to say that AI tools aren't improving, they are, but the slow rate at which that improvement is taking place (which, by the way, seems disconnected from the trillions of dollars of market value being created by hype around AI) indicates to me that tech companies, despite massive investments, are struggling to shrink the -% quality stat enough for large language models to be a truly transformative technology, in their usefulness and in their profitability. So much of the money being made by AI tools right now — such as the 16.8 million dollar deal OpenAI recently signed with the California State University system to give students access to ChatGPT — is founded on the expectation that the technology will continue to become more advanced, more powerful, and more integrated into the everyday lives of students and professionals. Taken one way, this kind of investment by educational institutions seems like an intelligent way to get ahead of oncoming advancements. Taken another way, it seems like a self-serving ploy by big tech to buy themselves more time and money before the house of cards of stalling advancement falls apart.
Another company that has little to show for their continued investment in AI is Apple. During their 2024 Worldwide Developers Conference, where they annually show off the new capabilities of their software, artificial intelligence, which they have creatively dubbed "Apple Intelligence," was the focal point of the entire event. They promised that new versions of iOS and macOS would include advanced large language model upgrades to Siri, and integrations of ChatGPT into their photo-recognition and messaging tools. A year later, almost none of those promises had been kept, and at WWDC 2025, so-called "Apple Intelligence" was barely mentioned. In a surreal interview with the Wall Street Journal's Joanna Stern, Apple executives basically admitted that they hadn't been able to get the technology to work as well as they thought they'd be able to in the time-frame they'd set for themselves.
To me, these kinds of failures also undercut the potential artificial intelligence tools have to be a major advancement in accessibility. In theory, a large language model tool like ChatGPT can be used as a powerful research assistant, a personal tutor, or even a therapist (and indeed, many people are already using it as such), but if it isn't able to perform those tasks as competently as we've been promised, it's not making those services more accessible; it's making them less accessible by posing a risk to the jobs of the real humans who provide them. As it stands right now, large language model tools offer very weak +% efficiency and +% accessibility stats in exchange for an extremely worrisome -% in quality.
It might seem like I've already been talking about "ways in which AI doesn't work," but really all we've focused on so far is the failure of tech companies to live up to their promises. Tools like ChatGPT do, in fact, work. But I'd now like to turn my attention to the many ways in which they aren't working very well.
The biggest bombshell that I can drop here, which for me was dropped by Angela Collier in her amazing video, is that AI doesn't exist. Yes, that's right, artificial intelligence, which we've been talking about for over 1200 words here and for going-on three years out yonder, has yet to be invented. There's no clever caveat to that statement, it's just true. When tech companies create, sell, and market "AI" products, like ChatGPT, what they're selling is not an artificial intelligence like the ones in science fiction movies, even if talking to them makes it feel like they are. ChatGPT is a large language model, which is a kind of machine learning tool. Machine learning is not intelligence; it's a kind of computer program which, given enough data, can predict and generate a desired result. So when you ask ChatGPT to answer a question you have, it doesn't know the answer, and it doesn't know whether or not what it's telling you is true. It doesn't know what truth is. It's just guessing, based on all of the trillions of data points it was trained on, what an appropriate and (maybe) accurate response might be given the input you've provided it with.
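To make that "guessing" a little more concrete, here's a toy sketch of my own (nothing like how ChatGPT is actually built, which uses neural networks, tokenizers, and trillions of training tokens): count which word tends to follow which in a tiny scrap of training text, then generate a sentence by always emitting the most likely continuation. The point is that the program never checks whether its output is true; it only checks what is statistically likely.

```python
# A toy next-word predictor: count word pairs in a tiny "training corpus,"
# then generate text by always picking the most frequent follower.
# Illustrative only -- real LLMs are neural networks trained on vastly more data.
from collections import defaultdict, Counter

corpus = (
    "the bell tolls for thee . "
    "ask not for whom the bell tolls . "
    "the bell rings in the tower ."
).split()

# Count how often each word follows each other word.
followers = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    followers[current_word][next_word] += 1

def generate(prompt_word: str, length: int = 6) -> str:
    """Generate text by repeatedly choosing the most likely next word."""
    words = [prompt_word]
    for _ in range(length):
        options = followers.get(words[-1])
        if not options:
            break  # nothing ever followed this word in the training text
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

print(generate("the"))  # fluent-looking output with no notion of truth behind it
```

The output sounds plausible because it mimics the statistics of the training text, and that is the whole trick; scale the same idea up enormously and you get something that can write a cover letter but still doesn't know anything.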
The "guessing game" nature of machine learning large language models is what accounts for the frequently hilarious and strange ways that ChatGPT can be wrong, such as in the viral videos of people asking it how many 'r's are in the word "strawberry", and it consistently insisting there are two, or one, or none. I'm not an avid user of any LLM (being, as I mentioned, primarily a precocious student of the humanities) but even I have discovered some similar "hallucinations" in the information these tools have provided. I'll give you a few examples.
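Before getting to those, a quick aside on the strawberry question: counting letters is a deterministic operation, which is why a two-line program gets it right every time while a text predictor can confidently get it wrong.

```python
# Counting characters is deterministic -- there is no guessing involved.
word = "strawberry"
print(word.count("r"))  # prints 3
```

My own run-ins with hallucinations have been less silly than that, and a little more worrying.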
Earlier this year I was reading a passage by John Donne for a literature class I was taking, and I stumbled across his usage of the phrase, "for whom the bell tolls". It caught my eye, and I wondered if perhaps this passage was where Hemingway got the title for his famous novel. I looked it up on Google, which has in the last few months fully integrated its own large language model tool, Gemini, into search features to give a helpful "summary" of results, and my suspicion was confirmed. According to the AI, the name of Hemingway's novel was taken from a poem by John Donne. Except wait! I looked down at the textbook in front of me. I wasn't reading a John Donne poem, I was reading a John Donne sermon (specifically Meditation XVII; during his lifetime, Donne was probably more famous for his work as a clergyman than he was for his poetry, though that side of his work is now largely overlooked outside of literary circles). Here was a mistake that a human doing research would be hard-pressed to make, since to us the difference between a sermon and a poem is usually pretty obvious. But without having the passage in front of me, I totally would have trusted what I was told, and perhaps even misattributed it in an essay I was writing! To Google's credit, after I went through the relatively painless process of reporting the issue, it was fixed within a few days.
Another issue I experienced just a few days ago also came from Google's AI search results. I had been driving around town the day before and had to park at a parking meter. After I got home I became curious how much money Santa Monica, California makes from parking meters. Google's AI told me the figure was approximately 1.7 million dollars per year. But when I scrolled down to the actual, non-AI-generated search results, I saw a 2014 article from the Santa Monica Daily Press, a local news source that I generally trust, which told me that revenue from parking meters was over one million dollars per month! Scrolling back up to the AI results, I saw that I had completely misread what the AI told me. Its answer, verbatim, to my query "how much money does Santa Monica make from parking meters" was "Santa Monica expects to generate an additional $1.7 million annually from its new smart parking meter system, [according to a 2012 article from the New York Times]". Because it had so helpfully highlighted a number, I scrolled past it assuming it had answered my question, when in fact it hadn't at all. Instead it had sourced an article that was over ten years old and that answered a different question entirely: "how much more money will Santa Monica make from parking meters in 2012 versus 2011?"
Of course, particularly in the second case, where the AI wasn't explicitly wrong so much as simply unhelpful and misleading, the blame rests on my shoulders for not being more attentive and for instinctively trusting what I thought I saw. I remember when Google's AI search results launched last year (and they have since become by far the most common way I engage with large language models of any kind), I thought "no way is this trustworthy, I'll just scroll to the actual results," but even that tiny amount of friction between myself and real, human-generated results has made it so that frequently I don't. Each of these examples may seem tiny and insignificant, and that's because they are, but these are just the ones that have happened to me, and just the ones that I've noticed. Imagine what happens to a world where the number one source of "trusted" information for the vast majority of people starts giving totally incorrect or misleading answers 10% of the time, or 1% of the time, or even 0.1% of the time. We are living in one of those worlds right now, and the long-term effect could be that it continues to get more and more difficult, rather than easier and more convenient, to find trustworthy, verifiable information with credible citations.
One of the guidelines that Angela Collier talks about in her video for potential uses of AI is that, no matter what, LLM AI should never be trusted with any kind of decision-making tasks. However, there is an argument to be made that any and every question you might ask an AI involves a whole bunch of different kinds of decision-making. If I were to ask Google's search AI for popular new fantasy books, for instance, it is going to decide to include some and exclude others from its recommendations based on metrics that I have no view of or control over, and which could in theory be determined by who is paying Google the most money for that kind of recommendation. We've seen over and over again the ways that the personalities and belief systems of LLM tools can be influenced or changed based on the whims of their creators, such as how recently it has seemed that Elon Musk, famed Neo-Nazi, has tried to manipulate his chatbot, Grok, to be more antagonistic to viewpoints that don't align with his own.
A lot of people are worried about artificial intelligence tools like ChatGPT evolving into some kind of advanced superintelligence that could destroy the human race. And indeed, if real artificial intelligence, not just machine learning, were to replace our current large language model tools, I think that could be a real cause for concern. Right now though, my chief concern about AI isn't that it's going to blow everything up in one fell swoop, but rather that by the flawed nature of its technological foundations it will instead make the world slightly worse in millions of almost unnoticeable ways, all while causing immense ecological damage due to the huge amount of natural resources and energy it requires to operate at scale. Curtains.
But now let's talk about the ways in which large language models still have enormous potential. While artificial general intelligence would be a truly transformative technology on the scale of the telegraph, the radio, and the internet, I think that advanced machine learning technologies like ChatGPT will be transformative more along the lines of the spinning jenny, alternating current, or some other instance where an existing technology was refined into something more efficient and widely usable.
There's no single metric to determine how accurate or inaccurate a tool like ChatGPT or Google's Gemini really is in terms of a simple percentage. Accuracy depends hugely on what kind of question or task is being asked, which model is being used, and even when you're asking (as these models are continuously evolving, they sometimes seem to go through phases of being more or less accurate depending on what changes their developers have tried to implement), but answers to this question range anywhere from as low as 40-something percent accurate to upwards of 80-something percent. Maybe within the next few months or years there will be another performance breakthrough, and those numbers will go up. But another way to make these tools more useful is to introduce more guardrails, such as making sure an AI-generated answer is able to cite where it's retrieving information from (though, as we saw in the parking meter example, that doesn't always guarantee a straightforward, accurate answer), and giving artificial intelligence a more reserved, cautious personality that is less eager to please with an answer, and more concerned about getting the right answer and avoiding wrong ones.
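As a rough illustration of what that "more cautious personality" could look like in practice, here's a minimal sketch using the OpenAI Python SDK; the model name and the prompt wording are my own placeholders, not anything OpenAI ships, and a system prompt like this only nudges behavior rather than guaranteeing accuracy.

```python
# A minimal sketch of the "guardrails" idea: ask the model to cite its sources
# and to prefer "I don't know" over a confident guess.
# Assumes the OpenAI Python SDK is installed and OPENAI_API_KEY is set;
# the model name and prompt wording below are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

cautious_system_prompt = (
    "Answer only if you are confident. For every factual claim, name the "
    "source you are drawing on and quote the relevant figure exactly as it "
    "appears there. If you cannot identify a source, say 'I don't know' "
    "instead of guessing."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any chat-capable model would do
    messages=[
        {"role": "system", "content": cautious_system_prompt},
        {
            "role": "user",
            "content": "How much money does Santa Monica make from parking meters each year?",
        },
    ],
)

print(response.choices[0].message.content)
```

Even with instructions like these, the model can still cite a thirteen-year-old article or invent a source outright, which is why this kind of guardrail is a mitigation, not a fix.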
There's also a lot of potential in the field of robotics with integrated large language model AI. Thinking about this always makes me think of Star Wars, a fictional universe where robots like C-3PO are integrated into all aspects of society and are able to perform all kinds of tasks, but are also the first to admit when they need a human to step in or when they aren't sure how something works. At this point though, imagining how AI and robotics might advance together is still more speculation than anything else.
Finally, I'd like to acknowledge the untapped potential of machine learning tools like large language models in the open source technology space. Right now, if you have a decently powerful computer, you can download and run large language models locally. These smaller models are underpowered compared with huge LLMs like ChatGPT, but I could totally imagine a future where hobbyists are able to use their own hardware and software to create AI tools that help them with work or household tasks, integrate them into video games as chatbots, conduct scientific research, and do much, much more that I might simply be too inexperienced to imagine myself.
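To give a sense of how low the barrier already is, here's a minimal sketch using the Hugging Face transformers library to run a small model entirely on your own machine; the model name below is just a placeholder for whichever small, openly licensed model your hardware can handle, and you'd need the library and PyTorch installed first.

```python
# A minimal sketch of running a small language model locally with the
# Hugging Face `transformers` library (pip install transformers torch).
# The model name is a placeholder -- swap in any small, openly licensed
# model that fits on your machine; weights are downloaded on first run.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",  # placeholder small model
)

result = generator(
    "In two sentences, explain why running a language model locally is appealing.",
    max_new_tokens=80,
)

print(result[0]["generated_text"])
```

Nothing in that snippet touches the cloud after the initial download, which is exactly what makes the hobbyist and privacy angles interesting.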
The potential of locally run LLM technology is also promising because it would help to alleviate the efficiency and energy consumption issues which currently plague massive models like ChatGPT. It might also help to avoid the enormous data security and privacy concerns that have been raised about companies like Google and OpenAI having total control over the infrastructure of cloud-based LLM platforms.
Although it is impossible to predict what will happen in the future, historically it seems to me a pretty safe bet to think that the world won't be nearly as different, technologically speaking, in two years as people worry it might be. That's certainly been the case since ChatGPT was released two and a half years ago. But it's also a safe bet to think that the world will be radically different thirty years from now in ways that nobody could possibly predict, regardless of how large their marketing budget is. The only way to know for sure is to wait and see! Maybe I'll be back here with more thoughts on this in another few years. I remain more excited than scared of what future me might have to say.
- ALGC
