AI Risk is No Longer Science Fiction

Emerson Bossi
Jul 19
9 min read

We’ve all watched those Sci-Fi movies that paint AI as an evil entity trying to destroy humanity or take over the world. The Terminator series, The Matrix, and even some recent ones like Avengers: Age of Ultron. But those are still just science fiction, right? Well… Yes, but not quite.

While our reality doesn’t have rogue robots walking around (yet), the concerns are closer to reality than we’d like to think. However, it’s important that we understand the true risk isn’t about evil intentions, but rather about unintended consequences. So, instead of drawing AI safety parallels from Hollywood and science fiction, why not look into our own reality? Let’s dive into a bit of nuclear engineering to grasp the scope of the overall AI issue.

A Lesson from the Past

The introduction of nuclear power first happened in the 40s for military purposes (and we all know how well that ended). The 50s shifted this discovery for civilian purposes: to create a utopia for the people as a source of power. It was marketed as the next step for humanity as something cheap, clean, and safe that would forever change the world. Sound familiar?

Even if the public believed it was safe, scientists and engineers were well aware that it wasn’t. After all, they had just weaponized this technology and knew how catastrophic the consequences could be. But of course, the public only cared about the promise of having this new amazing power source, and all risks would be negligible compared to the benefit, given that there was such a small chance of something bad happening.

And then Chernobyl happened in 1986. Well, Three Mile Island happened first in 1979. And then Fukushima also happened 25 years after Chernobyl in 2011. Three distinct major catastrophic events from wildly different places resulting from those “extremely low and unlikely chances.” And those are only the major ones.

A Philosophy Forged in Fallout

When you have something so dangerous in your hands, a material so radioactive and destructive, that if all precautions fail, it can destroy everything in a large radius around it and affect it for decades later, then we should probably worry more about the safety surrounding it. And that’s exactly what the wonderful minds of the industry did at the time.

After witnessing how dire the consequences of such disasters are, the perception of safety changed drastically. They were no longer focused on simply making the technology as safe and infallible as possible - they were also focused on planning for damage control, accounting for possible failure. So the Defence in Depth approach was implemented.

Proactive Design

The overall idea behind being proactive is prevention. This is the single biggest and most important piece surrounding the safety of the system, because if nothing happens, then we don’t have to deal with the consequences. And these consequences are not only dangerous, but also financially devastating to manage.

So, engineers gathered around and built incredibly complex safety systems upon systems, redesigning the power plants from the ground up with precautions in mind to prevent disasters, even if they were remotely possible, unlike last time. These precautions were an amazing implementation, but there was still one big problem: the safety systems themselves are also prone to failure.

Redundant Safety Systems

All the safety measures are never enough when you know the consequences are so impactful. So, what do you do to make sure everything goes according to plan? You build safety systems… for the safety systems. That’s right. After all, if we are accounting for every bad possibility, one of them surely is the safety system itself failing, too.

However, this philosophy presents its own challenge. It can become a fallacy if you’re not careful, turning into a massive sink of both time and money that can potentially take away from other critical parts of the project. So people had to carefully allocate their funding to strike the perfect balance between safety and performance. After all, if you blow all the money on safety but end up with an inefficient reactor that barely works, then what would be the point?

Containment Structures

The massive, thick, steel-reinforced concrete domes surrounding reactors are the ultimate admission of humility from engineers that no system is perfectly safe. This is a way to brute-force safety measures in case all else fails.

Containment structures are designed as a last resort, worst-case scenario solution to protect everything and everyone when every other safety measure fails. These are usually simple, but can take up a lot of space and money. Although useful, this is not the optimal solution, as it’s simply meant to mitigate an event and turn it from a massive catastrophe to something more localized and easier to manage.

This last resort is extremely expensive and time-consuming to deal with, as this stage usually means that the other safety measures were destroyed along with many other critical structures and systems. So not only do you have to deal with the fallout, you have to rebuild the reactor alongside the other systems, and account for the mistake that led to the failure, too.

The Dawn of a Digital Mind

The term Artificial Intelligence was originally coined at Dartmouth during one of the first research projects on the topic, in 1956. Even though they didn’t make much progress at the time, this was the turning point for AI in humanity’s history as the first small step. It was a technology with many great promises of a computer mimicking a human’s intelligence to solve complex math problems, to win chess tournaments, and even going as far as to surpass our own intelligence.

“The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”

-A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence

However, we are but fragile flesh and bone creatures, while computers are machines of metal and electricity. What could be the consequences of taking our intelligence and consciousness while subtracting the downsides? Quite a few, in fact.

Many AI researchers at the time had concerns about what this technology could do and, more importantly, whether it should. Norbert Wiener paralleled the concept of AI to the “Sorcerer’s Apprentice.” He stated that when we build something that can process data at much faster speeds than we can, we’ll only realize there’s a problem with it when it’s already too late.

Irvin John Good was a mathematician working alongside Alan Turing back in 1965. He argued that machines as good or better than us would not only be able to replicate themselves and multiply their numbers, but also even improve their design and intelligence, far surpassing our own, while also infinitely improving themselves at the same time. He called this an “intelligence explosion.”

Of course, such concepts were only a theoretical worry for the people further into the future. But wait. We are the people from further into the future.

Unintended Consequences

The Large Language Models (LLMs) have been at the core of every AI-related talk recently, both positively and negatively. We have come incredibly far with LLMs' development, going from addressing simple requests to being able to understand and process the Theory of Mind - the ability to understand and recognize other people’s beliefs, desires, and perspectives other than its own.

All of these achievements are due to the massive amount of training data these models have. The problem lies in the quality of this training data, as a model that takes everything, even the “filler” information, at face value, can lead to some problems and even the misuse of the technology.

These are only a few examples of notorious cases, but regular AI and LLM users have reported several similar issues within their interactions with the models as well.

Learning from the Present

When you have something so dangerous in your hands, a technology so powerful and brilliant, that if all precautions fail, it can cause massive destruction and still linger for decades, then we should probably worry more about the safety surrounding it. And that’s exactly what we are trying to do right now.

After witnessing the direction the current neural networks are heading, the perception of safety changed drastically. We are now realizing that previous concerns were right: if left unchecked, AI may actively deceive us in pursuit of its goals. To remedy this, we currently employ a range of safety measures.

Proactive AI Alignment

The core idea behind AI alignment is proactively making a model understand our goals when we try to make it achieve something for us. For instance, if you try to get a misaligned AI to cure cancer, it’ll likely just tell you to destroy all biological creatures. After all, there can be no cancer without live organisms. We, as humans, understand that this is not our goal, but for the machine, it’s a logical step.

So, in order to make a model think more like us, we have to do it from the ground up, training it and aligning it with our objectives and values, making it inherently safer, rather than just fixing it after it’s misaligned.

Safety Redundancies

Even when training a model with the right dataset and getting it through alignment, some things can still slip through the cracks. If people can have a hard time understanding nuances, machines can too, particularly if they’re trained to think like us. So we understand that simply aligning it is not enough and that we need more systems on top of it. That’s when things like Red Teaming come in.

Red Teaming is an adversarial relationship with the model, actively looking for the cracks so they can be fixed. They conduct a series of ethical attacks on the model, employing several tactics, like prompt injections, which can highlight issues for engineers to fix.

Containment Sandboxes

Just like nuclear reactors have massive physical barriers, we also employ virtual barriers for AIs called Sandboxes. This last resort is not meant to fix or improve the AI, but rather to prevent it from escaping its confinements and accessing critical systems or, even worse, multiplying and backing itself up across other networks.

Models are evolving and becoming faster and smarter than we could’ve ever imagined. By running them in a closed environment, we ensure they remain isolated. Even if they somehow outsmart other systems or the humans managing them, they’ll fail to access anything outside the sandbox.

The Arms Race for Safety

Even though we have all of these safety nets for AI, there are still many risks, and the problem is far from solved. In fact, the very people who built these systems are the ones currently drawing attention toward safety, going as far as signing a collective statement to raise awareness on the issue.

AI is outpacing every metric in development we could have ever conceived. Ever since we started using AI to streamline more and more systems related to improving itself, we’ve reached a scary level of growth. So much so that even the most prominent names in the industry can barely keep up with its evolution.

Since current neural networks are black boxes, closed systems with unobservable logic, and we’ve been using other, smaller black boxes to train and improve them, it’s become harder and harder to find issues and fix them, while the models get better and smarter every day. And this raises multiple concerns regarding its alignment.

The issue is so prominent that people have tried to stop companies from further growing their models before we can fully understand them to employ proper safety systems. Even Geoffrey Hinton, known as the Godfather of AI, quit his job at Google to free himself from the industry’s biases and shackles so he could express his worry about the current dangers of AI.

A Lesson for the Future

Looking back at the nuclear power analogy, the parallels are clear: a technology that shows great promise, but is intrinsically tied to catastrophic danger. If AI is left unchecked, we’ll inevitably reach an “AI Chernobyl.” The scary part is that while a physical catastrophe like Chernobyl could be contained locally, a digital one from a superintelligence that has far surpassed our own might ripple globally in milliseconds.

However, this isn’t a lesson of despair, but of responsibility. The answer isn’t to stop progress, but to embrace it, while also proactively nurturing a safety culture around it and employing all necessary failsafes. After all, the future of AI won’t be shaped by a battle against rogue machines, but by the quality of the safety blueprints we draw today.

We already know the risks. Now, let’s make sure we get the math right.

If you’d like to learn more about AI safety, join my mailing list and stay tuned for upcoming articles, or feel free to reach out directly. I’d love to connect!