Are Base Models Beating Legal AI Tools in Testing?

By Gravity Stack Staff

Among the many conversations at this year’s Legal Innovators California, one insight from Gravity Stack’s Bryon Bratcher cut through the noise. As legal departments ramp up AI adoption and more startups raise mega rounds, some of the most effective results are still coming from where it all started: general-purpose foundation models.

Bryon described findings from the firm’s AI Lab, where teams have been benchmarking large language models like GPT-4 against a range of legal-specific copilots and applications.

“Sometimes the base models are just way better than what happens after it goes through another layer, whether that’s Copilot, Harvey, or others,” he said.

The remark points to a growing question in the legal tech market: does specialization always deliver better outcomes?

The Interface Matters 

One reason base models are proving useful to lawyers in their day-to-day workflow is their simplicity. ChatGPT and similar tools offer a clean, familiar interface. Lawyers already know how to use them, and many already do in their personal lives.

“The chat interface is the adoption piece,” Bratcher said. “That’s what people know and love, and it’s already embedded in their workflow.”

Tools that replicate this ease of use tend to see faster engagement. Tools that impose a new learning curve risk getting passed over, no matter how tailored their features may be.

What This Means for Legal Teams

Legal teams should benchmark tools based on real outcomes. Start by asking what the base model can do. Then measure how much, if anything, the additional layer improves that output.

The smartest teams are already testing side by side, bringing in tools their teams already use, and looking for areas where performance and usability come together.
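As a rough illustration of what that side-by-side testing can look like, here is a minimal benchmarking sketch. It is hypothetical, not Gravity Stack's actual harness: the get_base_model_answer and get_layered_tool_answer helpers, the sample prompts, and the 1-5 reviewer scoring are all assumptions made for the example.

```python
# Minimal side-by-side benchmarking sketch (hypothetical, not Gravity Stack's tooling).
# The same prompts go to a base model and to a layered legal tool, a reviewer scores
# each answer 1-5, and the delta shows how much the extra layer adds, if anything.

from statistics import mean

def get_base_model_answer(prompt: str) -> str:
    """Placeholder for a call to a general-purpose model via its API."""
    return f"[base model answer to: {prompt}]"

def get_layered_tool_answer(prompt: str) -> str:
    """Placeholder for a call to a legal-specific copilot built on top of it."""
    return f"[layered tool answer to: {prompt}]"

def run_benchmark(prompts, score_fn):
    """Collect both answers per prompt and record a reviewer score for each."""
    results = []
    for prompt in prompts:
        base = get_base_model_answer(prompt)
        layered = get_layered_tool_answer(prompt)
        results.append({
            "prompt": prompt,
            "base_score": score_fn(prompt, base),
            "layered_score": score_fn(prompt, layered),
        })
    return results

if __name__ == "__main__":
    prompts = [
        "Summarize the indemnification clause in this agreement.",
        "List the deadlines imposed by this scheduling order.",
    ]
    # In practice a reviewer reads each answer; this stub just assigns a neutral score.
    results = run_benchmark(prompts, score_fn=lambda prompt, answer: 3)
    base_avg = mean(r["base_score"] for r in results)
    layered_avg = mean(r["layered_score"] for r in results)
    print(f"Base model avg: {base_avg:.1f}  Layered tool avg: {layered_avg:.1f}")
    print(f"Lift from the extra layer: {layered_avg - base_avg:+.1f}")
```

The point of the exercise is the comparison, not the tooling: if the layered product cannot beat the base model on your own prompts, the extra spend is hard to justify.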

Discovery Is Moving Toward First-Pass Automation

Bratcher also shared how Gravity Stack is applying this thinking to discovery workflows. In the past year, the team has run more than a dozen discovery projects using generative AI.

“Within 24 months, first-pass review will be done by AI,” he predicted.

That future is already being built. AI is being used to summarize complaints, organize documents, and bring legal teams closer to quality control (QC) with less manual work.
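For readers curious what first-pass review by AI can look like mechanically, the sketch below shows one deliberately simplified approach: a prompt template that asks a model to tag each document as responsive, not responsive, or unsure, so human reviewers can focus QC on the uncertain calls. The classify_document helper and the response format are illustrative assumptions, not a description of Gravity Stack's discovery workflow.

```python
# Illustrative first-pass review classification (hypothetical example only).
# A model tags each document as RESPONSIVE / NOT RESPONSIVE / UNSURE so that
# human reviewers can concentrate their QC effort on the UNSURE pile.

FIRST_PASS_PROMPT = """You are assisting with first-pass document review.
Request for production: {request}

Document text:
{document}

Answer with exactly one of: RESPONSIVE, NOT RESPONSIVE, UNSURE,
followed by a one-sentence reason."""

def classify_document(request: str, document: str, ask_model) -> str:
    """Build the prompt and route it to whatever model client the team uses."""
    prompt = FIRST_PASS_PROMPT.format(request=request, document=document)
    return ask_model(prompt)

if __name__ == "__main__":
    # Stubbed model call so the sketch runs without any API key.
    demo_model = lambda prompt: "UNSURE - mentions the project but not the contract."
    print(classify_document(
        request="All documents concerning the 2023 supply contract.",
        document="Email thread discussing Project Falcon logistics...",
        ask_model=demo_model,
    ))
```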

The key is implementation. Strong tools matter. But how they are used matters more.

Follow Gravity Stack on LinkedIn for updates from our AI Lab, client stories, and legal innovation insights.
