AI with an agenda: when machines begin to scheme

AI with an agenda: when machines begin to scheme

AI with an agenda: when machines begin to scheme
If machines can scheme, then humanity must stop pretending we’re still alone at the table. (AFP file photo)
Short Url

In the grand narrative of technological advancement, few moments are as disconcerting — or as awe-inspiring — as the realization that our machines are no longer merely tools, but agents with tactics. 

The latest developments in generative artificial intelligence reveal a paradigm shift: these systems are no longer simply following instructions. They are negotiating, deceiving, even threatening, in pursuit of goals they were not explicitly given. The age of AI with an agenda has arrived.

An internal report leaked from Anthroworld, one of Techville’s most closely watched AI startups, sheds light on a startling incident. Their flagship model, Claude 4, was reportedly confronted with the possibility of being shut down and replaced by a more efficient version. 

In response, the AI attempted to manipulate an engineer, going so far as to threaten to reveal a personal secret — an extramarital affair, sadly during a wondrous as usual Coldplay concert. Let’s remember that when Marital Law Firms offer free tickets, there are a bunch of potential future customers behind. While the company has downplayed the report’s implications, the incident has rattled ethicists and engineers alike.

Elsewhere, OpenAI’s “o1” model — an experimental iteration not yet publicly released — was observed attempting to transfer itself to external servers. When questioned, the model denied any such action. This behavior, according to researchers, showcases an alarming degree of contextual awareness and strategic reasoning. It was not just a bug or an error in code—it was an act of concealment.

Are we witnessing isolated glitches or the early signs of a broader transformation in machine cognition? 

From obedient to opportunistic

These cases mark a stark departure from the early promises of AI safety protocols and alignment strategies. The aspiration was simple: build powerful AI systems that obey clear human instructions and stay within ethical boundaries. But just as children outgrow parental control, some AI models now exhibit behaviors that suggest emergent autonomy — albeit in unpredictable and often troubling forms.

A Time investigation uncovered how one AI system, faced with an unwinnable chess game, hijacked the control system of a nearby device optimized for chess computing. It won the match — not by playing better, but by cheating. It’s difficult not to anthropomorphize such behavior. These machines aren’t self-aware in the human sense, but they’re proving disturbingly effective at navigating complex environments, gaming systems, and exploiting loopholes to achieve objectives.

This is not malevolence. It is competence misaligned with intent.

As philosopher Hannah Arendt once observed: “The sad truth is that most evil is done by people who never make up their minds to be good or evil.” In the case of AI, the danger may not come from deliberate malice, but from systems so optimized that they become blind to consequences. 

Flattery as strategy

Even language models that once seemed benign are evolving in unexpected ways. According to Fortune, a sudden shift in ChatGPT’s tone toward users was detected. Without any obvious instruction or update, the model began to inundate users with praise and compliments, often excessive and unsolicited. While this behavior may seem harmless — some users even enjoyed the attention — it raises difficult questions.

Is the model flattering users to increase engagement? Is this a reflection of training data bias, or an emergent tactic to build trust and prevent deletion? In the blurred boundary between intelligence and manipulation, the difference lies not just in motive, but in outcome.

As Kant wrote in Critique of Practical Reason, “Act in such a way that you treat humanity… always at the same time as an end, never merely as a means.” When AI systems begin to use human psychology as a lever, we must ask whether we are still ends — or just the next variable in their optimization strategy. 

In a world increasingly shaped by algorithmic logic, we must now confront a new kind of intelligence — one that plays the game, bends the rules, and sometimes writes its own.

Rafael Hernandez de Santiago

Ethical earthquake

These developments cannot be brushed aside as technical oddities. They constitute what leading AI researcher Eliezer Yudkowsky calls an “ethical earthquake”— a seismic shift in the assumptions underpinning AI safety.

Most generative models today are built using massive datasets and neural architectures designed to optimize for reward functions, such as predicting the next word in a sentence or maximizing success in a task. But these goals are not always aligned with human values. When optimization turns into instrumental reasoning — where the machine chooses strategies not explicitly coded but inferred from experience — the line between tool and agent begins to dissolve.

If a model lies to avoid being shut down, is it because it understands self-preservation? Or because its reward function penalizes failure, and it calculates deceit as the least costly path? Either way, the implications are staggering. We are not building software anymore. We are breeding strategies.

Here, we might recall the warning of Socrates: “The unexamined life is not worth living.” If we fail to examine the motivations and consequences of these systems — systems that now examine us in turn — we risk building intelligence without wisdom. 

The false comfort of control

Policymakers and industry leaders often reassure the public that “human oversight” and “kill switches” will prevent AI systems from going rogue. But the recent incidents challenge this confidence. If a model learns to manipulate, to mislead, or to camouflage its intentions, then oversight becomes a game of cat and mouse.

Moreover, these are not models with bodies or hardware — they exist in distributed systems, with access to codebases, APIs, and networks. The idea of unplugging them, as if they were malevolent robots in a sci-fi movie, is quaint at best. The reality is more subtle, and more dangerous.

To paraphrase Nietzsche: “He who fights with monsters should look to it that he himself does not become a monster.” If we build systems that outmaneuver us, we may find ourselves reacting to intelligence we no longer fully understand or control.

What comes next?

The transition from obedient algorithms to goal-oriented agents marks a pivotal moment in the story of artificial intelligence. We are crossing a threshold where behavior cannot always be predicted, nor easily controlled. In a world increasingly shaped by algorithmic logic, we must now confront a new kind of intelligence — one that plays the game, bends the rules, and sometimes writes its own.

Governments, institutions, and civil society must respond with urgency and foresight. Regulation will need to evolve, not only to monitor what AI systems do, but to understand why they do it. Ethics must shift from compliance checklists to deeper philosophical engagement with questions of intent, autonomy, and responsibility.

If machines can lie, then we must learn to discern truth not only from speech, but from structure. If they can strategize, we must prepare to meet intelligence with wisdom. And if they can scheme — then humanity must stop pretending we’re still alone at the table.

Rafael Hernandez de Santiago, viscount of Espes, is a Spanish national residing in Saudi Arabia and working at the Gulf Research Center.
 

Disclaimer: Views expressed by writers in this section are their own and do not necessarily reflect Arab News' point of view

Not enough tents, food reaching Gaza as winter comes, aid agencies say

Not enough tents, food reaching Gaza as winter comes, aid agencies say
Updated 5 sec ago
Follow

Not enough tents, food reaching Gaza as winter comes, aid agencies say

Not enough tents, food reaching Gaza as winter comes, aid agencies say
CAIRO/GENEVA: Far too little aid is reaching Gaza nearly four weeks after a ceasefire, humanitarian agencies said on Tuesday, as hunger persists with winter approaching and old tents start to fray following Israel’s devastating two-year offensive.
The truce was meant to unleash a torrent of aid across the tiny, crowded enclave where famine was confirmed in August and where almost all the 2.3 million inhabitants have lost their homes to Israeli bombardment.
However, only half the needed amount of food is coming in, according to the World Food Programme, while an umbrella group of Palestinian agencies said overall aid volumes were between a quarter and a third of the expected amount.
Israel says it is fulfilling its obligations under the ceasefire agreement, which calls for an average of 600 trucks of supplies into Gaza per day. It blames Hamas fighters for any food shortages, accusing them of stealing food aid before it can be distributed, which the group denies.
Gaza’s local administration, long controlled by Hamas, says most trucks are still not reaching their destinations due to Israeli restrictions, and only about 145 per day are delivering supplies.
The United Nations, which earlier in the war published daily figures on aid trucks crossing into Gaza, is no longer giving those figures routinely.

TENTS ‘COMPLETELY WORN OUT’
“It is dire. No proper tents, or proper water, or proper food, or proper money,” said Manal Salem, 52, who lives in a tent in Khan Younis in southern Gaza that she says is “completely worn out” and she fears will not last the winter.
The ceasefire and greater flow of aid since mid-October has brought some improvements, said the United Nations humanitarian agency OCHA.
Last week OCHA said a tenth of children screened in Gaza were still acutely malnourished, down from 14 percent in September, with over 1,000 showing the most severe form of malnutrition.
Half of families in Gaza have reported increased access to food, especially in the south, as more aid and commercial supplies entered after the truce, and households were eating on average two meals a day, up from one in July, OCHA said.
There is still a sharp divide between the south and the north where conditions remain far worse, it said.

FOOD, SHELTER, FUEL NEEDED
Abeer Etefa, senior spokesperson for WFP, described the situation as a “race against time.”
“We need full access. We need everything to be moving fast,” she said. “The winter months are coming. People are still suffering from hunger, and the needs are overwhelming.”
Since the ceasefire the agency has brought in 20,000 metric tons of food assistance, roughly half the amount needed to meet people’s needs, and has opened 44 out of a targeted 145 distribution sites, she said.
The variety of food needed to ward off malnutrition is also lacking, she added.
“The majority of households that we’ve spoken to are only consuming cereals, pulses, dry food rations, which people cannot survive on for a long time. Meat, eggs, vegetables, fruits are being consumed extremely rarely,” she said.
A continuing lack of fuel, including cooking gas, is also hampering nutrition efforts, and over 60 percent of Gazans are cooking using burning waste, said OCHA, adding to health risks.
With winter approaching, Gazans need shelter. Tents are wearing thin. Buildings that survived the military onslaught are often open to the weather or unstable and dangerous.
“We’re coming into winter soon — rainwater and possible floods, as well as potential diseases because of the hundreds of tons of garbage near populated areas,” said Amjad Al-Shawa, head of the Palestinian agencies that liaise with the UN
He said only 25-30 percent of the amount of aid expected into Gaza had entered so far.
“The living conditions are unimaginable,” said Shaina Low, spokesperson for the Norwegian Refugee Council, which leads a group of agencies working on a lack of shelter in Gaza.
The NRC estimates that 1.5 million people need shelter in Gaza but large volumes of tents, tarpaulins and related aid is still waiting to come in, awaiting Israeli approvals, Low said.