AI: What’s Wrong and How to Fix It

Want to know how generative AI works?

Imagine a newborn child. Now, just for fun, imagine that this child – we’ll call him Karl – is born with the ability to read. I know, no way, but suspend your disbelief for just a second.

So Karl can read. And by the way, he can read really, really fast.

Now, just for fun, let’s give poor Karl the entire contents of the internet to read. All of it.

Task done, everything Karl knows is from the internet.

Most infants learn basic, foundational things as they grow up. “Hey look, I’ve got hands! Oh wow, feet too! The dog’s got four legs… and a tail…and it barks!”

But Karl never learned these things. Karl only knows what he read on the internet. So if we ask Karl to write an RFP (Request for Proposal, a common business document) that’s like others our company has written, he’ll probably do a fantastic job. Why? Because he’s read zillions of them, knows what they look like, and can replicate the pattern.

However, Karl can't grasp common-sense relationships, as Gary Marcus elegantly pointed out in this blog post. As Marcus notes, Karl may know that Joe's mother is Mary, yet be unable to deduce from that fact that Mary's son is Joe.

Nor can Karl do math: ask him to calculate 105 divided by 7 and unless he finds that exact example somewhere in the vast corpus of the internet, he’ll get it wrong.

Worse, he’ll very authoritatively return that wrong answer to you.

That’s a loose analogy of how Large Language Models (LLMs) work. LLMs scrape huge quantities of data from the internet and apply statistics to analyze queries and return answers. It’s a ton of math…but it’s just math.

In generating an answer, LLMs like ChatGPT will typically create multiple possible responses and score them "adversarially" using mathematical and statistical algorithms: does this look right? How about that? Which one's better? These answers, however, are tested against patterns the model finds – where else? – on the internet.
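To make that concrete, here is a toy sketch in Python of the "generate several candidates, keep the best-scoring one" idea. Everything in it is invented for illustration; a real LLM produces and scores its candidates with learned statistical models, not a random-number generator and string templates.

import random

def generate_candidates(prompt: str, n: int = 4) -> list[str]:
    # Stand-in for sampling several continuations from a language model.
    # A real LLM would build these token by token from learned probabilities.
    return [f"{prompt} ... candidate #{i}" for i in range(n)]

def score(candidate: str) -> float:
    # Stand-in for the model's statistical scoring of a candidate
    # (roughly, "how much does this look like text I've seen on the internet?").
    return random.random()

def respond(prompt: str) -> str:
    candidates = generate_candidates(prompt)
    # Keep whichever candidate scores best against the learned patterns.
    return max(candidates, key=score)

print(respond("Write an RFP for our company"))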

What's missing, in this writer's humble opinion, is an underlying, core set of common-sense relationships – ontologies, to use the technical term. "A mammal is a lifeform that gives live birth and has hair. A dog is a kind of mammal. A horse is a kind of mammal. Dogs and horses have four legs and tails." And so on.

LLMs need what is called a "ground truth" – a set of indisputable facts and relationships against which they can validate their responses, so that they can – the word "instinctively" comes to mind – know that if Joe's mother is Mary, then Mary's son is Joe.
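Here is a minimal sketch of what such a ground truth might look like: a tiny fact store with inverse relations, against which a generated claim could be checked. The facts and relation names are made up for illustration; real knowledge graphs are vastly richer.

# A tiny "ground truth": stated facts plus inverse relations.
FACTS = {("Mary", "mother_of", "Joe")}
INVERSES = {"mother_of": "son_of"}   # simplification: we assume Joe is a son

def known(subject: str, relation: str, obj: str) -> bool:
    # True if the claim is stated directly or follows from an inverse relation.
    if (subject, relation, obj) in FACTS:
        return True
    for stated_rel, inverse_rel in INVERSES.items():
        # "Joe son_of Mary" follows from "Mary mother_of Joe".
        if relation == inverse_rel and (obj, stated_rel, subject) in FACTS:
            return True
    return False

print(known("Joe", "son_of", "Mary"))   # True: deduced, not memorized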

Microsoft claims that Bing Chat leverages Bing's internal "knowledge graph" – a set of facts such as biographies of famous people and facts about cities and countries – and this is a start, for sure. More interestingly, Cycorp, which has been around for decades, has built enormous knowledge bases of exactly this kind. And there are undoubtedly others.

What I'm advocating is that such knowledge bases – facts, properties, relationships, maybe even other things (like Asimov's Three Laws) – underlie LLMs. In the adversarial process of generating answers, such knowledge bases could, in theory, make LLMs not only more accurate and reliable but also – dare I say it – ethical.

(This post was inspired, in part, by this marvelous paper by the late Doug Lenat and Gary Marcus.)

AI, Skynet, and the Destruction of the Human Race (Really)

At 2:14 a.m. Eastern time on August 29th, 1997, Skynet became self-aware.
— Terminator 2: Judgment Day (1991)

When ChatGPT first arrived on the scene last November, any number of scare-mongering articles appeared almost instantly, all sensationally proclaiming that a crazed, self-aware AI unleashed on the unsuspecting world could, at a stroke, wipe out humanity.

Then, a month or so later, another set of pundits (and tech executives) leaped up to say, no, RELAX, that’s simply not possible, ChatGPT is not self-aware.

Folks, Skynet is possible. Today. Right now.

Remember, AI is Really Just About Math

While there are lots of ways to “do AI,” the most common approach relies on the neural network, a software technology that emulates how neurons work in the human brain. Through a process called training, the program learns how to do tasks.

Here's a simple example. Pay attention, this is important. Let's say you have 10 hand-scrawled numerals, zero through nine, each on a 28×28 grid of pixels. (AI professionals and those in the know will recognize the industry-standard MNIST dataset from the National Institute of Standards and Technology.)

A neural network that recognizes these numerals is built from "neurons," each of which uses a set of mathematical equations to determine which number a particular set of pixels represents.

The details are more complicated, but think of it like this: the training results in a huge mathematical equation of the form y = w1x1 + w2x2 + w3x3 + … + b.

Remember high school algebra? The result y (the number the network thinks the pixel pattern represents) is calculated by multiplying the inputs (that is, the pixel values: x1, x2, x3, and so on) by a set of weights (w1, w2, w3, …) and then correcting with a bias ("b" in the equation, just a number that nudges the answer so it comes out right).
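In code, that equation is just a few lines. The numbers below are made up; in a trained network the weights and bias come out of the training process.

def neuron(inputs, weights, bias):
    # y = w1*x1 + w2*x2 + w3*x3 + ... + b
    return sum(w * x for w, x in zip(weights, inputs)) + bias

pixels  = [0.0, 0.8, 0.9, 0.1]     # a few pixel brightness values
weights = [0.2, -0.5, 1.3, 0.7]    # learned during training
bias    = 0.1
print(neuron(pixels, weights, bias))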

Weights and biases. Hang on to that thought.

Now, a very simple neural network that recognizes just these ten scribbled numerals requires roughly 13,000 weights and biases – far more than any human could ever work out by hand, which is why we have automated training to calculate them. (At its core, the concept is trivial: pick some numbers for the weights, see if they work, and if they don't, adjust them. How you figure out the adjustment is the subject of an immense amount of math.)
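Here is that "pick, check, adjust" loop boiled down to a single weight, purely for illustration. Real training uses calculus (gradient descent) over thousands or billions of weights at once, but the spirit is the same.

target = 10.0        # the answer we want for an input of 2.0
x, w = 2.0, 0.0      # the input and an initial guess for the weight

for step in range(50):
    y = w * x                  # see what the current weight produces
    error = y - target         # how wrong is it?
    w -= 0.1 * error * x       # nudge the weight to shrink the error

print(round(w, 3))             # close to 5.0, since 5.0 * 2.0 == 10.0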

ChatGPT’s neural networks use something like 175 billion weights and biases.

But it’s still just math.

But If It’s Just Math, How Does Skynet “Achieve Self-Awareness?”

Who needs self-awareness?

Well, we do, obviously, and a rather depressingly large number of us lack it.

But modern AI technology does not require self-awareness to destroy the human race.

The war in Ukraine, fought in large part with drones, provides a hint of what could go wrong. Let's assume that the drones have enough intelligence (that is, trained AI programs) to follow GPS coordinates to a location, to stay on course despite weather, to avoid obstacles, and perhaps even to cope with electronic interference. In other words, each drone is an independent AI machine.

Now this is different from the Skynet scenario, in which the bad robots are centrally controlled by an uber-intelligent supercomputer. Instead, drones and cruise missiles and other sorts of autonomous weapons are completely independent once launched. This Skynet is decentralized.

Let’s say that somebody somewhere programs a drone to attack soldiers wearing a particular sort of uniform – pretty easy to do. You train the AI with ten or twenty thousand images of people in different sorts of clothing until it accurately recognizes those uniforms. Then you install that trained AI (now just a small piece of software) into your drones.

But…

What if There’s a Bug?  Or Worse…

We know this can happen. There are numerous examples of facial recognition and other sorts of AIs that fail because they were not trained on different ethnicities, or women, or (in speech recognition) accents.

So it’s easy to imagine the poorly trained drones attacking anybody in any uniform, or anybody wearing a green shirt. Or anybody.

Or…

What if someone futzes with those weights and biases we talked about before? That is, hacks the neural network at the heart of the AI? Predicting the results of such a hack would be pretty hard…but it would almost certainly lead to the wrong things being attacked; maybe instead of "soldiers," simply "people." Or anything moving. Or – you get the idea.

Just Make This One a Zero and See What Happens

Of course, it's pretty hard to imagine someone knowing exactly which of the 175 billion parameters need tweaking. (On the other hand, it's not impossible, and maybe not hard at all for software.) More likely still are simply random changes that produce unexpected, unwanted, and possibly catastrophic results.
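To see why even small changes matter, here is a toy illustration: tamper with one weight in a made-up three-input classifier and its decision flips. The model and numbers are invented for this post; real networks, and real attacks on them, are enormously more complicated.

def classify(features, weights, bias):
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return "flagged" if score > 0 else "ignored"

features = [0.9, 0.2, 0.4]
weights  = [-1.0, 0.5, 0.3]               # the "trained" weights
print(classify(features, weights, 0.0))   # ignored

weights[0] = 1.0                          # one tampered value...
print(classify(features, weights, 0.0))   # flagged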

Whether it be via poor training or malicious hacking, it’s clear that “bad” AIs unleashed upon the world could have some pretty scary consequences.

Enjoy your Judgment Day.

I Learned About Programming from That: First in a Series

Barry Briggs

I recently experienced the completely horrifying realization that I’ve been writing code for half a century. It occurred to me that over the years I’ve actually learned some stuff possibly worth passing on, so this article is first in an occasional series.

I’ve had the privilege (and it has been a true privilege!) to meet and work with many of the amazing individuals who shaped this remarkable industry, and to work on some of the software applications that have transformed our world.

It’s been quite a journey, and over the years I’ve come to realize that I started it in a way that can only be described as unique; indeed, I doubt that anyone began a career in the software industry in the same way I did.

So the first edition of "I Learned About Programming from That" is about how I actually first learned about computers.

Early Days

I wrote my first program in BASIC in high school on a teletype (an ASR-33, for old-timers) connected via a phone modem to a time-sharing CDC mainframe somewhere in Denver. As I recall it printed my name out on the attached line printer in big block letters.

I had no idea what I was doing.

In college, I majored in English (really) but maintained a strong interest in science, with several years of physics, astronomy, math and chemistry, even a course in the geology of the moon. Yes, it was a very strange Bachelor of Arts I received.

In short, I graduated with virtually no marketable skills.

Luckily, I found a job with the federal government just outside Washington, D.C. where I was almost immediately Bored Out Of My Mind.

However, I soon discovered a benefit of working for Uncle Sam: a ton of free educational opportunities. A catalog listed page after page of in-person and self-paced courses, and thank God, because I desperately needed something to stimulate my mind.

Digital Design

Having no idea what the subject actually was, other than that it sounded Really Cool, I chose a self-paced course called "Design of Digital Systems," by PC Pittman (original publication date: 1974).

This utterly random decision changed the course of my life forever.

These six paperback booklets (I still have a few of them) began with different number bases (binary, octal, decimal) and then introduced the concept of Boolean logic – AND and OR gates, and so on. It was hard!

After that, it covered half-adders and more complicated gates (XOR), and eventually got to memory and – gasp! – the idea of registers.

Finally, it described the notion of CPUs and instructions: how you could actually have a number stored in memory that, when loaded by the CPU, would cause another number to be fetched from memory and added to the contents of a register, leaving the sum in the register – the essence of a program! (To this very day!)
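Just for fun, here is that fetch-and-add idea as a toy machine in a few lines of Python. The instruction format is invented purely for illustration; no real CPU looks like this.

memory = {0: ("LOAD", 10),   # instruction: load the number at address 10
          1: ("ADD", 11),    # instruction: add the number at address 11
          10: 7,             # data
          11: 35}            # data

register = 0
for address in (0, 1):                 # the CPU steps through its instructions
    op, operand = memory[address]
    if op == "LOAD":
        register = memory[operand]     # fetch a number from memory
    elif op == "ADD":
        register += memory[operand]    # add another number to the register

print(register)                        # 42: the sum is left in the register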

O the light bulbs that flashed on for me!

I suddenly got it: how, at a foundational level, computers work, and how they could be made to do useful things. For me, at that point, everything else – assemblers, high-level languages (I took a FORTRAN programming class next), operating systems – followed from this very profound gestalt of computing.

And I realized I loved this stuff.

They Teach Programming All Wrong Today

These days, high school students can take "computer science" classes which, at least in my son's case, turned out to be nothing more than a Java programming class. Imagine being thrown into the morass of Java – classes, namespaces, libraries, variables, debugging, IDEs – with no understanding of what the heck is actually going on in the guts of the computer itself!

Completely the wrong approach! Guaranteed to confuse the hell out of a young mind!

As accidental as it was, I highly recommend my approach to learning the art of programming: teach kids the logic underpinning all of computing first. A deep understanding of what's actually happening at the digital-logic level gives you an intuition that makes it so much easier to write code, and to figure out what's going wrong when it breaks.

When I talk about how computers work, I start with a faucet (a transistor) and tell people that if they understand how it works, they understand how computers work.

String a couple of faucets together and you get an AND gate. You get the idea. And so on and so on.
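If you like, the analogy fits in a few lines of Python: water flows only when both valves are open, which is exactly an AND gate.

def valve(water_in: bool, is_open: bool) -> bool:
    # A faucet (transistor): passes the flow only when it's open.
    return water_in and is_open

def and_gate(a: bool, b: bool) -> bool:
    # Two faucets in series: the first valve's output feeds the second.
    return valve(valve(True, a), b)

for a in (False, True):
    for b in (False, True):
        print(a, b, "->", and_gate(a, b))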

Anyway …

That’s how I got started. Shortly after my introduction to Digital Design I wound up taking (more formal) courses at George Washington University and Johns Hopkins (again courtesy of the US government) and not long thereafter this English major (!) found himself programming mainframe OS code for NASA (believe it or not).

Where I learned some very important lessons about coding. That’s for next time. Stay tuned!