Why LegalTech cannot move forward without formalizing justice

Legal technology has always existed in a space between ambition and limitation. Every few years a new wave of tools promises to transform the profession, and each time the results prove more modest than the rhetoric. Automated research, contract analytics, compliance monitoring, and discovery platforms have all reshaped parts of legal practice, yet none have solved the central challenge: how to make machines capable of true legal reasoning.

That question became sharper with the arrival of large language models. The launch of GPT-5, with its scale and fluency, convinced some observers that law was finally entering the age of automation. The model can summarize thousands of pages in minutes, produce draft contracts that appear convincing, and even simulate cross-examinations. Yet when it comes to the core functions of law, issuing binding judgments and creating new norms, GPT-5 is not merely unprepared. It is fundamentally unable to take on the task.

The obstacle is not a shortage of data or computing power. It is structural. These models remain black boxes. They generate plausible sequences of words based on statistical associations but do not construct normative reasoning. They cannot explain why a decision is justified; they can only reproduce how justifications have appeared in past texts. At best, they serve as support tools, powerful second opinions, or drafting assistants. They cannot replace judges or lawmakers.

Researchers in artificial intelligence increasingly describe this structural barrier as a ceiling. Language models deliver diminishing returns when asked to handle tasks that require explicit reasoning, accountability, and value judgments. In medicine, the ceiling is reached when models propose treatments without causal explanation. In politics, it appears when they generate speeches without responsibility. In law, the ceiling is impossible to ignore, because justification is at the heart of every decision. Without transparent reasoning, there is no legitimacy.

Why Law Resists Computation

If AI cannot break through the ceiling, perhaps the problem lies not only in the models but in law itself. Legal systems were never designed to be computational. They are written for human judges who are expected to interpret, to weigh, and to balance consequences.

Law is filled with open-textured concepts such as fairness, reasonableness, and proportionality. These ideas make sense to people but defy precise coding. Their vagueness is not a flaw. It is a feature that allows law to adapt to new situations, absorb moral and political debates, and evolve with society. For machines, however, this ambiguity is fatal. A program cannot sense fairness. It can only follow instructions.

Attempts to formalize law into code have repeatedly exposed the gap. Machine-readable contracts perform well in narrow areas such as payments or delivery schedules where obligations are precise. They collapse when interpretation is required. Was performance satisfactory? Did a party act in good faith? These are judgments, not data points. The same problem undermines theories of micro-directives that aim to translate policies into exhaustive sets of rules. They assume translation can occur without interpretation. That assumption is false. Every legal rule is bound up with context, values, and history.

This is why the vision of automated justice remains out of reach. The difficulty does not lie in the sophistication of AI but in the ontology of law itself, the way it organizes knowledge, norms, and conflicts. That ontology is still incompatible with machine reasoning.

Where Law Goes Next

If law in its current form resists computation, the easy response is to wait it out. Most governments do exactly that. They keep the human-centered design of legal reasoning and avoid the risk of redesigning concepts like fairness or proportionality in machine-readable terms. The caution is understandable. Vagueness has long served as a buffer that absorbs disagreement and change. Yet the world that law must govern has accelerated. Transactions are instantaneous, platforms operate at global scale, and public expectations for transparency and speed are rising. In this environment, it is increasingly hard to imagine adjudication and legislation remaining entirely human-led in their methods while still matching the demands of scale.

The alternative is not to reduce justice to a single number. It is to make law computable to a sufficient degree. Economics offers a useful analogy. For centuries it was a field of essays about prices and trade. It became an engine for policy when it adopted formal models, national accounts, and scenario tools. The debates did not disappear. They became more explicit because assumptions were written down, inputs were measurable, and predictions could be checked. Law needs a similar move. Not total mathematization, but an operational layer that lets us compare alternatives, surface assumptions, and audit the path from norm and fact to conclusion.

A computable legal ontology begins with clarity about actors, rights, duties, actions, events, and consequences. It adds a basic grammar of what is permitted, required, and forbidden, together with the conditions that change those statuses. It represents disputes as scenarios rather than slogans, so that consequences can be compared across time and across affected groups. It specifies thresholds where compromise is not allowed, and it marks the zones where human judgment must decide. Above all, it preserves a provenance trail that shows which sources were used, what assumptions were made, and how conclusions shift under reasonable changes of input.
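
As an illustration only, the sketch below shows how such an ontology might be expressed as a handful of typed records. The class names and fields are assumptions made for this article, not a reference to any existing standard or product.

```python
from dataclasses import dataclass, field
from enum import Enum

class DeonticStatus(Enum):
    PERMITTED = "permitted"
    REQUIRED = "required"
    FORBIDDEN = "forbidden"

@dataclass
class Actor:
    name: str
    roles: list[str]                 # e.g. "buyer", "regulator"

@dataclass
class Norm:
    source: str                      # enacting text and version actually relied on
    status: DeonticStatus            # what the norm permits, requires, or forbids
    action: str                      # the regulated action
    conditions: list[str]            # facts that must hold for the status to apply
    hard_threshold: bool = False     # True where no compromise is allowed

@dataclass
class Event:
    description: str
    actors: list[Actor]
    occurred_at: str                 # ISO date; events change which conditions hold

@dataclass
class Provenance:
    sources_used: list[str]          # norms and versions applied
    assumptions: list[str]           # interpretive choices made along the way
    rejected_alternatives: list[str] # rules considered and set aside

@dataclass
class Scenario:
    events: list[Event]
    applicable_norms: list[Norm]
    affected_groups: dict[str, str]  # group -> expected consequence
    time_horizon_years: int
    requires_human_judgment: bool    # marks the zone where a person must decide
    provenance: Provenance = field(default_factory=lambda: Provenance([], [], []))
```

Even this toy structure forces the essential questions into the open: which version of which norm was used, what counts as a condition, and where the machine must stop and hand the question to a person.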

This is a design for legitimacy, not a shortcut. Without such an ontology, the pressure for speed will still push decisions toward automation, only by stealth. Platforms and vendors will encode hidden choices in defaults and parameters. Courts and agencies will lean on opaque tools without a shared language to control them. We would get automation without accountability. A public ontology is a way to keep the institution in charge. It defines what can be automated, under what constraints, and with what evidence of correctness.

None of this removes people from the loop. It locates their authority where it matters. Open-textured concepts retain a human anchor, but within boundaries that are visible and contestable. The model assists with search, patterning, and scenario testing. The person owns the step that turns facts and norms into a decision and explains why that decision is justified.

The path forward therefore follows from the diagnosis. If law resists computation because its ontology is incompatible with machines, the answer is to reform the ontology, carefully and in public. Most states will hesitate, but the costs of not moving are already visible in delay, inconsistency, and a shrinking space for audit. A computable layer built around human judgment is the tractable middle course. It is the only way to bring AI to serious legal work without losing the reasons that make law worthy of obedience.

Verification and Legitimacy: How to Test AI in Law

If we commit to computability to a sufficient degree and adopt a public ontology, the next step is to specify how a machine assistant is to be verified. The goal is not elegant algorithms but procedures that make outputs reproducible and open to audit. Legitimacy in law rests on a transparent path from norm and fact to decision, not on technological spectacle.

The starting point is traceability. Every conclusion should come with a provenance map that shows which sources and versions of norms were applied, to which established facts they were linked, why alternative rules were rejected, which assumptions were introduced, and where the limits of those assumptions lie. This map is not a list of links. It is a coherent trajectory of reasoning that another lawyer can examine and reconstruct.
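
A minimal sketch of what such a trajectory could look like as data follows; the field names are illustrative assumptions rather than an established schema.

```python
from dataclasses import dataclass

@dataclass
class ReasoningStep:
    norm: str                          # norm citation, including the version applied
    facts: list[str]                   # established facts the norm was linked to
    rejected_alternatives: list[str]   # rules considered and why they were set aside
    assumptions: list[str]             # interpretive assumptions introduced here
    limits: str                        # stated limits of those assumptions

def render_provenance(conclusion: str, steps: list[ReasoningStep]) -> str:
    """Lay out the trajectory so another lawyer can walk it step by step."""
    lines = [f"Conclusion: {conclusion}"]
    for i, step in enumerate(steps, 1):
        lines.append(f"  Step {i}: applied {step.norm}")
        lines.append(f"    facts: {', '.join(step.facts) or 'none recorded'}")
        if step.rejected_alternatives:
            lines.append(f"    rejected: {', '.join(step.rejected_alternatives)}")
        if step.assumptions:
            lines.append(f"    assumed: {', '.join(step.assumptions)} "
                         f"(limits: {step.limits})")
    return "\n".join(lines)
```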

The second element is scenario testing. Where the previous section spoke about comparing alternatives, here that idea becomes practice. For each contested point the system constructs several realistic scenarios, indicates who is affected and how, specifies the time horizon, and reports scientifically supported probabilities. Scenarios make the causal chain visible and allow consequences to be compared before a decision is taken rather than after.
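
In code, a scenario comparison of this kind can be no more than a structured side-by-side view; the labels and probabilities below are invented purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class Outcome:
    label: str                   # e.g. "contract upheld"
    probability: float           # estimate that must be backed by cited evidence
    effects: dict[str, str]      # affected group -> expected consequence
    horizon_years: int

def compare_outcomes(outcomes: list[Outcome]) -> None:
    """Show consequences side by side so they can be weighed before deciding."""
    for o in sorted(outcomes, key=lambda x: x.probability, reverse=True):
        print(f"{o.label} (p~{o.probability:.2f}, horizon {o.horizon_years}y)")
        for group, effect in o.effects.items():
            print(f"  {group}: {effect}")

compare_outcomes([
    Outcome("contract upheld", 0.6,
            {"supplier": "paid in full", "buyer": "bears the delay costs"}, 2),
    Outcome("contract rescinded", 0.4,
            {"supplier": "loses expected profit", "buyer": "refunded"}, 1),
])
```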

The third principle is an explicit treatment of uncertainty. The model should show where data are insufficient, which parameters drive the outcome, and how the result shifts under reasonable variation of inputs. Simple sensitivity checks often reveal hidden fragility. If a small adjustment flips the conclusion, that is a signal to move the question into the zone of human discretion and to require additional justification.
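
A toy version of such a check makes the point. The decide function below is a deliberately crude stand-in for a real model of a dispute, and the monetary threshold is invented; what matters is the pattern of varying inputs and watching for a flipped conclusion.

```python
def decide(damages: float, mitigation_share: float) -> str:
    """Crude stand-in for a full model of the dispute (threshold is invented)."""
    net = damages * (1 - mitigation_share)
    return "liable" if net > 10_000 else "not liable"

def sensitivity_check(base: dict, variations: dict[str, list[float]]) -> list[str]:
    """Flag any input whose reasonable variation flips the conclusion."""
    baseline = decide(**base)
    fragile = []
    for param, values in variations.items():
        for value in values:
            outcome = decide(**{**base, param: value})
            if outcome != baseline:
                fragile.append(f"{param}={value} flips {baseline} -> {outcome}")
    return fragile

flags = sensitivity_check(
    base={"damages": 12_000, "mitigation_share": 0.10},
    variations={"mitigation_share": [0.05, 0.20], "damages": [9_000, 15_000]},
)
print(flags or "conclusion stable under the tested variations")
```

If the list of flags is not empty, the question belongs in the zone of human discretion and needs additional justification.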

Robustness to wording and to neighboring domains is just as important. Legal language is variable by design, so the system should produce comparable results for equivalent queries and when transferred to related areas. Where robustness is not achieved, the limits of applicability must be stated in order to preserve predictability and to reduce the risk of arbitrariness.
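
A robustness check of this kind can be sketched in a few lines. Here classify is only a placeholder for the full pipeline from query to outcome, and the keyword rule inside it is not a serious method; the point is the discipline of comparing equivalent wordings.

```python
def classify(query: str) -> str:
    """Placeholder for the full pipeline from query to outcome category."""
    return "breach" if "deliver" in query.lower() else "no breach"

def robustness_check(paraphrases: list[str]) -> bool:
    """Equivalent wordings of one question should land on the same outcome."""
    results = {classify(q) for q in paraphrases}
    if len(results) > 1:
        print(f"Not robust: equivalent queries produced {results}; "
              "state the limits of applicability.")
        return False
    return True

robustness_check([
    "Did the supplier fail to deliver the goods on time?",
    "Was the supplier's delivery obligation breached?",
])
```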

A shared typology of errors is useful. Errors of fact, errors in applying a norm, procedural defects, and mistakes in assessing consequences are different classes, each with its own methods of detection and correction. This structure makes appeals more focused because the dispute targets a specific defect rather than the entire outcome.
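
Such a typology could be as small as an enumeration that appeals, audits, and incident logs all reference; the four classes below simply mirror the ones named above.

```python
from enum import Enum, auto

class ErrorClass(Enum):
    FACT = auto()              # the record of what happened is wrong
    NORM_APPLICATION = auto()  # right facts, wrong rule or wrong reading of it
    PROCEDURE = auto()         # a defect in how the decision was reached
    CONSEQUENCE = auto()       # the assessment of likely effects was mistaken

# An appeal then targets a specific defect rather than the outcome as a whole.
appeal = {
    "claim": ErrorClass.NORM_APPLICATION,
    "detail": "limitation period computed from the wrong trigger date",
}
```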

The boundaries of automation must be defined in advance. A public ontology should set which steps can be delegated to a system and which remain with a human decision maker. Open-textured concepts and conflicting priorities are marked as zones of residual interpretation where a person must decide and explain the choice, relying on the scenarios and data that the system presents. This is not a formality. This is where the normative core of law is preserved.
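
One way to make those boundaries explicit is a published allocation of steps. The allocation below is illustrative and deliberately conservative: anything not listed defaults to a human decision maker.

```python
from enum import Enum

class Decider(Enum):
    SYSTEM = "system"
    HUMAN = "human"

# Illustrative allocation; the real boundaries belong in the public ontology.
STEP_OWNERSHIP = {
    "retrieve applicable norms and their versions": Decider.SYSTEM,
    "link norms to established facts": Decider.SYSTEM,
    "build and compare scenarios": Decider.SYSTEM,
    "interpret open-textured concepts (good faith, proportionality)": Decider.HUMAN,
    "resolve conflicts between competing priorities": Decider.HUMAN,
    "issue and explain the final decision": Decider.HUMAN,
}

def requires_human(step: str) -> bool:
    # Unknown steps default to human ownership rather than silent automation.
    return STEP_OWNERSHIP.get(step, Decider.HUMAN) is Decider.HUMAN
```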

An external layer of quality control completes the picture. Methods should undergo regular independent review. Incidents should be logged. Data should be reevaluated on a schedule. Verification procedures should be updated as practice evolves. Just as economics relies on replicable models and back-tests, law should develop open benchmark case sets and standards for auditing provenance trails.
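
An open benchmark set could then be re-run on a schedule, with every divergence logged as an incident. The sketch below assumes a toy decision function and invented case identifiers; it shows the shape of the audit, not its content.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class BenchmarkCase:
    case_id: str
    facts: dict
    expected_outcome: str        # outcome agreed on by independent reviewers

def run_audit(cases: list[BenchmarkCase],
              decide: Callable[[dict], str]) -> dict:
    """Re-run the open benchmark set and report every divergence."""
    incidents = [c.case_id for c in cases if decide(c.facts) != c.expected_outcome]
    return {"cases_run": len(cases), "incidents": incidents}

report = run_audit(
    [BenchmarkCase("2024-001", {"net_damages": 12_000}, "liable"),
     BenchmarkCase("2024-002", {"net_damages": 4_000}, "not liable")],
    decide=lambda facts: "liable" if facts["net_damages"] > 10_000 else "not liable",
)
print(report)   # a clean run yields an empty incident list
```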

This protocol does not make AI a judge. It turns the model into an instrument with clear limits, demonstrable utility, and transparent accountability. That is how a computable layer is connected to legitimacy. Decisions remain human in meaning while becoming testable in procedure.

Conclusion

Law is caught in a paradox: it must accelerate to remain relevant, but its fundamental structure resists automation. The solution to this paradox is not a more powerful AI but a reengineering of legal reasoning itself.

The proposed technical architecture, a computable legal ontology with traceability maps, scenario testing, and explicit handling of uncertainty, is not just a set of tools. It is a new epistemology of law. Instead of relying on a judge’s intuition about “reasonableness,” the system requires clarity: which factors are considered, how they are weighted, and what happens when inputs change.

Decision provenance maps turn legal justification from a rhetorical exercise into a verifiable chain of reasoning. Scenario testing replaces abstract debates about “justice” with concrete comparisons of consequences for different groups over time. Boundaries between automated and human decision zones reveal what is now hidden, showing precisely where value judgments enter the process.

This architecture addresses the problem of AI legitimacy in law in a radical way: it makes machine reasoning not less but more transparent than human reasoning. A judge may err or show bias, but their logic cannot be reproduced. A system with the right ontology can be tested, corrected, and improved.

The result is not the robotization of justice but its scientific revolution. Just as economics evolved from a collection of essays into a policy instrument through model formalization, law can become more precise, predictable, and accountable through computable structures. The human remains at the center as the architect of the system and guarantor of its ethical application, not as the sole source of legal judgment.

The question is not whether this transformation will occur. The pressure of speed and complexity will make it inevitable. The question is whether it will be guided and transparent, or whether legal decisions will become hostages of opaque algorithms. The choice is being made now.

Yuri Kozlov is a lawyer and CEO of JudgeAI, a system that models judicial reasoning and automates the resolution of legal disputes through legal algorithms.
