Artificial Intelligence vs. Human Lawyers: Artificial Intelligence -- 94% Accurate, Humans -- 85% Accurate — Managing Legal

LawGeex — an Israel-based tech company whose artificial intelligence (AI)-based document review software automates the process of identifying risky provisions in contracts — has announced the results of a peer-reviewed study that compared artificial intelligence to human lawyers in the review of standard business contracts called non-disclosure agreements (NDAs).

The LawGeex AI platform achieved 94% accuracy compared to an average of 85% among 20 human lawyers.

And timing? The human lawyers took an average of 92 minutes to review all 5 of the NDAs involved — and the AI platform took 26 seconds.
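To put those timing figures in proportion, here is a minimal back-of-the-envelope sketch. It uses only the numbers reported above. The per-NDA split assumes time was spread evenly across the five documents, which the study summary does not state, and the function and variable names are purely illustrative.

```python
# Figures as reported in the article.
HUMAN_MINUTES_TOTAL = 92   # average human time to review all 5 NDAs
AI_SECONDS_TOTAL = 26      # AI platform time to review all 5 NDAs
NDA_COUNT = 5


def speedup(human_minutes: float, ai_seconds: float) -> float:
    """Return how many times faster the AI run was than the human average."""
    return (human_minutes * 60) / ai_seconds


# Assumption: review time is spread evenly across the documents.
human_seconds_per_nda = HUMAN_MINUTES_TOTAL * 60 / NDA_COUNT
ai_seconds_per_nda = AI_SECONDS_TOTAL / NDA_COUNT
ratio = speedup(HUMAN_MINUTES_TOTAL, AI_SECONDS_TOTAL)

print(f"Human: {human_seconds_per_nda / 60:.1f} min per NDA")
print(f"AI: {ai_seconds_per_nda:.1f} s per NDA")
print(f"Speedup: about {ratio:.0f}x")
```

On these figures, the human lawyers averaged roughly 18 minutes per NDA against about 5 seconds for the platform, a difference of a little over 200 times.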

…

Why is this important? Because the review and approval of low-value, high-volume, day-to-day business contracts is a core business task that historically has required manual review by qualified (human) lawyers. The typical Fortune 1000 corporation maintains 20,000 to 40,000 contracts at any given time. And 83% report dissatisfaction with their contract management processes.

As to the kind of contracts reviewed in the study — NDAs — they typically take one week or longer to get approved (an experience I often endured as an executive at GE and Whirlpool).

…

What exactly did the test consist of? What was the job on which the AI platform and the human lawyers were put to work?

NDAs are agreements to keep confidential information confidential, with various carve-outs, penalties for violation, agreed dispute resolution provisions, etc.

Lawyers tend to exercise their individual whims and quirks in the wording of NDAs — both in drafting them to send out and in marking them up in response. The reasoning ends up being idiosyncratic to the attorneys involved. No reason these can’t be standardized. But they almost never are.

The stakes? You don’t want your company to be bound by a provision that you can’t live with.

So the test NDAs were chosen to present 30 separate risk issues (arbitration clauses, choice of venue and payment of attorney fees, exclusions for information received from third parties, etc.). These risk issues were distributed over 153 paragraphs and more than 3,000 clauses.

…

You might ask whether this study is reliable, given that it was sponsored by an AI provider.

That (I believe) is where the peer review and controls for the study are significant. The independent auditors come from Stanford Law, USC Law, Duke Law, and the Bar-Ilan University Computer Science Department. Many of them are authorities who are “household names” in the legal industry, people I’ve heard at conferences or encountered in law practice (e.g., Prof. Gillian Hadfield of USC Law, Bruce Mann of the Morrison & Foerster law firm, Dr. Roland Vogl of Stanford).

…

Finally, a test of AI versus human lawyers raises a test design concern specific to law. Ron Friedmann, an attorney who consults on legal technology, credits the LawGeex study with addressing the fact that there’s greater subjectivity in legal analysis than, say, in medical applications of AI:

“Lawyers often overlook the fact that expert lawyers often disagree about how to interpret legal materials. That’s a big problem when studying outcomes.

“Contrast that with medicine, where most clinical trials use objective measures to determine treatment efficacy. Those measures are typically reproducible and consistent. And a clinical trial typically compares a new treatment to the existing ‘gold standard’ (widely accepted though not always evidence-based best practice). With direct comparison of two approaches and objective measures, successful clinical trials usually provide meaningful and actionable results.

“Law lacks this approach. One reason is that the legal market has little ethos of taking an evidence-based approach to how it practices. Another reason is that the legal gold standard – humans doing the work – is highly variable and far from objective. At minimum, I’d say the LawGeex study worked hard to overcome these two limitations.”

…

What’s the business significance of these findings?

These findings show that tasks that were assigned to inexperienced young lawyers in my early days, and that law firms and legal departments continue to assign to such fledglings, can now be automated.

With better accuracy in the bargain.