The Rule That Forgot Time

What a 240-year-old moral law misses about AI, and how one small adjustment changes everything


Here is a rule you have followed all your life without ever needing to state it: do not make a promise you do not intend to keep.

Now comes the question almost nobody asks: when, exactly, is that rule true? This morning? Forever? Was it true before you were born?

You probably want to answer: always. That instinct is the problem this post is about. The most famous moral rule ever written has the same instinct. And that instinct is what makes it quietly unfit for the machines we are now building.

Let me show you the crack. Then let me show you the fix. There will be one formula. It is friendlier than it looks. It is a punchline, not homework.


The most ambitious sentence in philosophy

In 1785, a famously punctual German named Immanuel Kant wrote down a single test he believed could separate right action from wrong action. It is called the categorical imperative, and stripped of its formal clothing it says this:

Act only on a rule that you could will everyone to follow.

That is the whole machine. Before you act, imagine everyone acting on the same rule. If the world still makes sense, the rule passes. If the world collapses into contradiction, it fails.

Lying fails at once. If everyone lied whenever it was convenient, the very idea of telling someone something would lose its force. There would be no trust left to exploit. Lying would destroy the condition that makes lying possible. The rule eats itself. Test failed.

It is a beautiful machine. You feed in a proposed rule, turn the crank, and out comes a verdict. Philosophers have been turning that crank for two and a half centuries.

But notice what the machine never asks.


The missing question

The test takes a snapshot. It freezes the world, imagines everyone acting at once, and checks that single frozen frame for contradiction.

A snapshot has no before and no after. For some rules, that is fine. “Do not lie” looks the same on Tuesday as it does in the next century.

But now point that same machine at a modern problem.

A hospital installs an AI system that reads scans and flags dangerous cases for a doctor. Someone has to write the rule for when the AI may decide alone and when it must call a human. Suppose the rule is this: the AI decides routine cases; the doctor decides the hard ones.

Run it through Kant’s machine today. Could everyone follow this rule? Yes. It is sensible. It is fair. It passes. Snapshot approved.

Then time does what time does.

In year one, the doctors review the AI’s hard cases sharply, because they still see many of them. By year three, the AI has quietly improved. The “hard” pile has shrunk. The doctors now see fewer difficult cases, and they begin to lose practice on exactly the cases the rule still assigns to them. The skill the rule depends on has thinned out, because the rule slowly stopped feeding it.

Nobody broke the rule. The rule broke itself by running.

That is the uncomfortable part. The snapshot was never wrong. On day one, the rule genuinely passed. The failure did not live in any single frame. It lived in the motion between frames. And a snapshot, by definition, cannot see motion.

This is the crack. Kant built a law for things that move, but gave it a test that only sees things standing still. A rule about acting, willing, and striving is judged by a photograph.


The fix is smaller than you think

The tempting move is to throw Kant out and reach for something that loves time: “Just maximize good outcomes over the long run.” Resist that move. It trades one problem for another, and it throws away the one thing the categorical imperative gets exactly right: it binds you regardless of what you happen to want.

The fix is smaller and stranger. We do not change the law. We change what we feed into it.

Instead of testing a rule, we test a rule that carries instructions about its own future.

Picture a maxim upgraded from a flat sentence into a small three-part packet:

what I do now   →   what may follow from it   →   what it grew out of
    (content)            (continuation)              (provenance)

That is the whole trick. A rule stops being a frozen statement. It becomes a link in a chain, holding hands with the rule before it and the rule after it.

And why exactly three parts? Not because three is tidy. Because a moment in time has exactly three neighbors and no more: a present, a future, and a past. There is no fourth direction for time to go. The structure is not a design choice. Time hands it to you.


The rule that will not sit still

Once a rule has to account for its own future, the famous test grows two new clauses. Kant’s original survives untouched as the first one:

One. Your rule passes the old snapshot test, right now. (Could everyone do this today?)

Two. Any rule that could grow out of yours also passes, and passes that same demand on to whatever grows out of it. (Does following this leave the next person able to follow it too?)

Three. Your rule can show that it is a legitimate descendant of whatever came before it. (Did I get here honestly?)

Fold all three together and you get the upgraded imperative:

Act so that your rule could be a link in a chain that carries its own fairness forward through time.

Now re-run the hospital. Clause one still passes on day one, exactly as before. But clause two trips an alarm the original never could. A rule that quietly erodes the doctors’ skill is a rule that makes its own future harder to keep. It saws off the branch it is sitting on. The upgraded test catches in advance the slow rot that the snapshot, frame by frame, could never see.
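
For readers who think in code, here is a minimal sketch of the three clauses applied to the hospital rule. Everything in it is an illustrative assumption, not part of Kant or of any library: the `Maxim` class, the `passes_snapshot` flag (a stand-in for the verdict of the old test, which the code simply takes as given), the reading of clause three as “my predecessor passes and actually lists me as a continuation,” and the `depth` limit, which only looks a few links ahead.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare maxims by identity, since the links are cyclic
class Maxim:
    """A rule as a three-part packet (hypothetical structure for illustration)."""
    content: str                      # what I do now
    passes_snapshot: bool             # assumed verdict of the classic test
    continuations: List["Maxim"] = field(default_factory=list)  # what may follow
    provenance: Optional["Maxim"] = None                        # what it grew out of


def link(parent: Maxim, child: Maxim) -> None:
    """Record that `child` may grow out of `parent`."""
    parent.continuations.append(child)
    child.provenance = parent


def clause_one(m: Maxim) -> bool:
    """The old snapshot test, right now."""
    return m.passes_snapshot


def clause_two(m: Maxim, depth: int) -> bool:
    """Every continuation passes, and passes the demand on (checked to `depth`)."""
    if depth == 0:
        return True
    return all(clause_one(c) and clause_two(c, depth - 1) for c in m.continuations)


def clause_three(m: Maxim) -> bool:
    """Assumed reading: the predecessor passes and really lists `m` as a continuation."""
    parent = m.provenance
    if parent is None:
        return True
    return clause_one(parent) and any(c is m for c in parent.continuations)


def upgraded_test(m: Maxim, depth: int = 5) -> bool:
    return clause_one(m) and clause_two(m, depth) and clause_three(m)


# The hospital rule on day one, and what it drifts into by year three.
year_one = Maxim("AI decides routine cases; practiced doctors decide hard ones", True)
year_three = Maxim("AI decides routine cases; out-of-practice doctors decide hard ones", False)
link(year_one, year_three)

print(clause_one(year_one))     # True: the snapshot approves
print(upgraded_test(year_one))  # False: clause two sees where the rule leads
```

The bounded `depth` is a deliberate limitation of the sketch. It checks a few links of the chain, not the whole chain.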

Same spirit as Kant. One new power: it can see in time.

And here is the elegant part. The old categorical imperative is not wrong now. It is simply zoomed in. It is the new rule viewed through a single frozen frame, clause one with the other two cropped out. The way classical physics turns out to be Einstein’s physics for things moving slowly, Kant’s snapshot turns out to be the moving picture paused on one frame.


One formula, and it is a good one

Logicians have a clean way to write “the rules that keep their own promise forever.” Here it is. Do not flinch:

\[\mathcal{M}_{\text{good}} = \nu X.\ \Psi(X)\]

Read it as a sentence. \(\Psi\) is a sorting machine: hand it a pile of rules, and it keeps only the ones whose every possible future move is also already in the pile. The symbol \(\nu\), “the greatest fixed point,” just names the result: the largest collection of rules that survives being fed through that machine and comes out unchanged. A self-sustaining club, where membership requires that all of your descendants are members too.
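
If you want to watch the sorting machine run, here is a small sketch on a toy universe of five rules. The rule names, the `successors` map, and the `passes_snapshot` verdicts are all made up for illustration. The sketch also assumes that \(\Psi\) throws out any rule that fails the snapshot test, in addition to any rule with a future move outside the pile. Because the toy universe is finite, repeatedly applying \(\Psi\) to the full pile must settle, and what is left is the greatest fixed point.

```python
from typing import Dict, Set


def psi(pile: Set[str],
        passes_snapshot: Dict[str, bool],
        successors: Dict[str, Set[str]]) -> Set[str]:
    """Keep the rules that pass today and whose every future move is still in the pile."""
    return {r for r in pile if passes_snapshot[r] and successors[r] <= pile}


def greatest_fixed_point(passes_snapshot: Dict[str, bool],
                         successors: Dict[str, Set[str]]) -> Set[str]:
    """Start from every rule and shrink until psi changes nothing (finite case only)."""
    pile = set(successors)
    while True:
        smaller = psi(pile, passes_snapshot, successors)
        if smaller == pile:
            return pile
        pile = smaller


# Hypothetical toy universe: which rules each rule may grow into.
successors = {
    "A": {"B"},
    "B": {"A"},
    "C": {"D"},
    "D": {"D"},
    "E": {"A", "C"},
}
# Hypothetical snapshot verdicts: only D fails today.
passes_snapshot = {"A": True, "B": True, "C": True, "D": False, "E": True}

good_club = greatest_fixed_point(passes_snapshot, successors)
print(sorted(good_club))  # ['A', 'B']
```

D fails today, so it goes first. C can only grow into D, so it goes next. E could grow into C, so it follows. A and B only ever lead to each other, so they stay. The loop is guaranteed to stop only because this pile is finite.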

Now the genuinely interesting part.

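You might want a tidy checklist that decides, in a finite number of steps, whether any given rule belongs to the good club. Once rules are expressive enough to describe open-ended futures, mathematicians can prove that no such general checklist can exist: membership in a club defined this way is, in general, undecidable. The membership question has no general shortcut. Ever.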

If that sounds like a bug, look again. A finish line you could actually cross would mean moral effort is a chore you complete and then set down, like laundry. The fact that the good club has no membership checklist is the mathematics quietly insisting on something humans have always sensed: being good is not a box you tick. It is a direction you keep walking. The “impossible” formula is just honesty about a road that has no end.


The one thing the formula cannot do, and that is the whole point

We have taught a 240-year-old rule to see in time. We have written it in a single line of mathematics. So let us push our luck and ask the formula for one last job: tell us who is responsible when the chain rolls forward.

It cannot. And the reason it cannot is the most important sentence in this post.

For the imperative to bind a will, and not merely drive a machine, it has to leave room for a real choice. If the rule mechanically forced exactly one next move, there would be no choosing left to do, and you cannot hold a falling rock responsible for falling. So the test does something careful. It narrows your options down to the ones that are allowed, and then it stops. Among those allowed moves, which one actually happens is a question the mathematics politely refuses to answer.

That refusal is not a gap waiting for a cleverer equation. It is a reserved seat.

Try to fill it and you fail in an instructive way every time. Force the rule to pick for you? Then nobody chose, and responsibility evaporates along with the choice. Add a higher rule to pick among the picks? Now you need a rule for that rule, and you are falling down an endless staircase with an open choice still waiting at the bottom. Have the system log a name beside every decision, a signature on the dotted line? A signature records that someone chose. It says nothing about why, and why is the only thing responsibility is made of. The logbook proves a button was pushed. It cannot answer for the pushing.

So the empty seat stays empty, no matter how good the mathematics becomes. And the empty seat has a precise shape. It is the spot where the question “why this and not that?” comes to rest. A machine can carry out the move, record the move, even recite the causes of the move. But ask it why, and it can only pass the question onward, because it has no place where the passing-on stops.

In a person, the passing-on stops. That is not a fact about how clever machines happen to be this year. It is what the word responsibility has meant all along: the place where “why?” finally meets an answer instead of a forwarding address.

We began by upgrading a rule so that it could finally see time. We end somewhere stranger and better. The most rigorous thing the mathematics can tell us is the exact location of its own edge: the seat it carved but cannot fill. And sitting in that seat, holding the one question no formula will ever take off your hands, is you.


This post is part of the series “Thinking Machines, Thinking Humans.” The idea of an AI that knows when to stop and ask a human is explored in “When AI Should Say ‘I Don’t Know.’” The deeper machinery, including the five dimensions that measure when a decision needs human judgment, is developed in the full work on human-AI collaboration.
