It’s something else.
Recently, there’s been a lot of talk about AI, or artificial intelligence. With the rise of online services that generate images or content in response to a simple textual prompt, the news has been full of both breathless reports and samples that run the gamut from impressive to cringe-worthy.
The problem? What most people are referring to as AI isn’t really AI at all.
Yes, I’m being somewhat pedantic, but I think it’s important to understand what’s really happening so as to set expectations — and perhaps blame — appropriately.
What artificial intelligence isn’t
Most of what’s being labeled as AI, or artificial intelligence, isn’t AI at all, but rather ML, for machine learning. ML is nothing more than software that analyzes huge collections of data for characteristics that can then be used by other software to perform specific tasks. While those tasks can look intelligent, they’re really nothing more than the result of a lot of data processing.
AI is ML
Most of what we’re seeing discussed as artificial intelligence is really something called machine learning, or ML.
Machine learning is nothing more than collecting lots of data, analyzing the heck out of it for patterns, and then using the results of that analysis to perform other tasks.
For example, some so-called “assistive driving” technologies use data collected from thousands and thousands of cars on the road to “learn” what roads are, what objects are, what signs and lights are, etc. Having learned (and continually refining that knowledge), ML uses that information to steer an automobile safely down the road.
Of course, it’s nowhere near that simple. Particularly when it comes to something as critical as driving a car, there are boundaries on what can be “learned” and what requires innate knowledge, along with governors and other safety measures.
Let’s choose a simpler example. Dogs.
How ML works
With apologies to the pedants and ML researchers, this is a gross oversimplification to keep it somewhat understandable.
It all starts with people writing software. They design the software to analyze data fed to it for commonalities and differences and then record those characteristics as data of some sort.
They then feed the software data: lots and lots of data.
For our example, let’s give the software 5,000,000 images. Half of those images contain dogs and half do not. There is information called metadata (data about the data) with each image that says “This is a dog” or “This is not a dog.” The software creates a large database of characteristics of what it means to be a dog. Or not.
People write more software. This software uses the database to analyze more photos. This time, the goal is not to update the database, but rather to determine whether the photo being analyzed is or is not a photo of a dog. The goal of this software, then, is to respond “Yes, this is a dog” or “No, this is not a dog.”
Since nothing is absolute, most such analysis software includes a measure of certainty. A response might be “I’m 99% certain this is a dog.”
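The two pieces of software described above — one that builds a database of characteristics from labeled examples, and one that uses that database to answer “dog or not, and how certain?” — can be sketched in a few lines of toy code. This is purely illustrative: real systems extract thousands of characteristics from each image, and the feature numbers and the nearest-average approach here are my own invented stand-ins, not how any actual product works.

```python
# Toy sketch of the two phases: "training" averages the characteristics
# of everything with the same label into a database, and "classification"
# finds the closest stored profile and reports a certainty.
import math

def train(examples):
    """Build a 'database': the average feature vector for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        counts[label] = counts.get(label, 0) + 1
        prev = sums.get(label, [0.0] * len(features))
        sums[label] = [p + f for p, f in zip(prev, features)]
    return {label: [s / counts[label] for s in vec]
            for label, vec in sums.items()}

def classify(database, features):
    """Return (label, certainty) for the closest stored profile."""
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    scored = {label: distance(vec, features) for label, vec in database.items()}
    best = min(scored, key=scored.get)
    total = sum(scored.values())
    certainty = 1 - scored[best] / total if total else 1.0
    return best, certainty

# A tiny made-up "training set": (features, label) pairs.
data = [
    ([0.9, 0.8], "dog"), ([1.0, 0.7], "dog"),
    ([0.1, 0.2], "not a dog"), ([0.0, 0.3], "not a dog"),
]
db = train(data)
label, certainty = classify(db, [0.95, 0.75])
print(f"I'm {certainty:.0%} certain this is a {label}.")
```

Note that neither function “understands” dogs; the answer falls out of arithmetic over whatever labeled data was fed in — which is the article’s whole point.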
People write still more software. This software now uses the information in the database to create images of dogs.
The ability to accurately analyze images and the ability to create realistic images are highly dependent on the quality of data from the original analysis. That data’s quality depends in turn on how much was collected — the more the better — and how accurate or error-free the initial metadata was.
And, of course, how well all the software was written.
So far, though, no artificial intelligence was involved; just a lot of data collection, analysis, and processing.
ML and writing
Another area that’s gotten a lot of attention of late is the ability of “AI” to write text. In reality, it’s the same process:
- Software analyzes lots and lots (and lots and lots) of text to learn what the written word means.
- Software uses the result to generate written words based on a prompt.
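Those two steps can be sketched with a deliberately tiny example: record which word tends to follow which, then chain those learned pairs together starting from a prompt. Real systems learn from billions of words and vastly richer patterns, and they sample with randomness rather than always picking the most common next word; the corpus below is invented solely for illustration.

```python
# Toy sketch: step 1 "learns" which words follow which in some text;
# step 2 "generates" new text from that learned table, given a prompt.
from collections import defaultdict

corpus = (
    "machine learning is software that analyzes data "
    "machine learning is not magic it is software"
)

# Step 1: analyze -- record every (word, next word) pair seen.
follows = defaultdict(list)
words = corpus.split()
for current, nxt in zip(words, words[1:]):
    follows[current].append(nxt)

# Step 2: generate -- starting from a prompt word, repeatedly pick the
# most common next word. (Real generators add randomness here.)
def generate(prompt, length=5):
    out = [prompt]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(max(set(candidates), key=candidates.count))
    return " ".join(out)

print(generate("machine"))
```

The output reads like plausible text only because the statistics of the input text say so — there’s no comprehension anywhere in the loop.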
As an example, I asked the popular ChatGPT to “Describe the difference between artificial intelligence and machine learning.” Here’s what it came up with.
Artificial intelligence (AI) and machine learning (ML) are often used interchangeably, but they are not the same thing. AI refers to the ability of a computer or machine to mimic the cognitive functions of the human mind, such as learning and problem-solving. In other words, AI is the broader concept of machines being able to carry out tasks in a way that we would consider “smart.”
Machine learning, on the other hand, is a specific type of AI that involves the use of algorithms and statistical models to enable a system to improve its performance on a specific task over time. In other words, machine learning involves the use of algorithms to enable a machine to learn from data and improve its performance on a specific task without being explicitly programmed to do so.
In summary, AI is the broader concept of machines being able to carry out tasks in a way that is intelligent, while machine learning is a specific type of AI that involves the use of algorithms to enable a machine to learn from data and improve its performance on a specific task.
It’s not wrong.
On the other hand, when I asked it to describe the process to download and install Windows 11, it started by telling me that Windows 11 isn’t available yet.
It appears that my job is safe. For now.
Poisoning the well: the risks of ML
One of the things I hope you noticed above is that each step of the machine-learning journey started with people writing software.
There’s nothing magical about ML; it’s just very complex software written by humans. Imperfect humans.
Much of the so-called AI you see today — including my examples from DALL-E and ChatGPT above — is considered to be in “beta”. The software is being tested and presumably improved over time. But, as my Windows 11 example showed, it’s not perfect by any means.
Even with perfect software, machine learning suffers from the “garbage in – garbage out” phenomenon. It’s only as good as the data you feed it. If you feed it poor data, you can expect poor results.
Perhaps more concerning is that if you feed it intentionally misleading data, you can poison the entire dataset. Consider telling the analysis software above that several thousand pictures of gorillas are really dogs. Results could be… disturbing.
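To see why poisoning works, remember that the “learned” profile of a dog is just a summary over whatever the metadata said was a dog. Mislabeled examples drag that summary toward something else entirely. The sketch below is a made-up, bare-bones illustration — the feature numbers are invented, and real poisoning attacks and defenses are far more subtle.

```python
# Toy illustration of data poisoning: the learned "dog" profile is an
# average over everything labeled "dog", so mislabeled gorilla pictures
# pull that average toward gorilla territory.

def dog_profile(examples):
    """Average the feature vectors of everything labeled 'dog'."""
    dogs = [features for features, label in examples if label == "dog"]
    return [sum(col) / len(dogs) for col in zip(*dogs)]

clean = [([0.9, 0.1], "dog"), ([0.8, 0.2], "dog"), ([0.1, 0.9], "gorilla")]
poisoned = clean + [([0.1, 0.9], "dog")] * 2  # gorillas mislabeled as dogs

print(dog_profile(clean))     # profile built from honest labels
print(dog_profile(poisoned))  # profile dragged toward gorilla features
```

The software has no innate idea of what a dog is, so it has no way to notice that the poisoned labels are lies.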
Now consider something more serious, like poisoning the data used for automobile assistance, and you can see that not only are there pragmatic issues (“That dog just ain’t right!”) but also safety and security issues (“Why is my car suddenly veering to the left?”).
But once again, there’s nothing “intelligent” about this. It all comes down to computer software processing data. Lots and lots of data.
Honestly, these are exciting times. I drive one of those cars, and it’s truly impressive what it can do and how it responds to the surrounding environment. Do I trust it completely? Of course not! But it’s a harbinger of great things to come, of that I’m certain.
As with any technology, it’s worth becoming familiar with at least the basic terms and concepts. Understanding that it’s just people and software and not (yet) HAL 9000 should make things a little more concrete and a little less scary.