In this episode of The Geek in Review, Greg Lambert and Marlene Gebauer welcome back Joel Hron, Chief Technology Officer at Thomson Reuters, for a timely conversation about the shifting relationship among foundation models, legal content providers, legal tech platforms, and the lawyers trying to make sense of the mess. Recent moves by Anthropic, including Claude’s legal practice area tools and MCP connections into legal platforms, raise a larger question for the market. Is a model provider still sitting behind the scenes, or is it starting to become a legal work environment of its own?

Hron explains Thomson Reuters’ commitment to what it calls fiduciary-grade AI, a standard built around trust, verification, transparency, and accountability. For TR, legal AI needs more than a fast answer. It needs systems lawyers trust enough to stand behind. Hron points to Westlaw, Practical Law, KeyCite validity signals, citation ledgers, and verification tools as core ingredients in building AI systems suited for high-stakes professional work. In his view, almost right is not good enough when clients, courts, regulators, and professional obligations sit on the other side of the output.

The conversation turns to how CoCounsel and Westlaw Deep Research use legal content across far more than traditional research tasks. Hron explains that when AI systems gain access to trusted legal content and verification tools, they begin researching throughout the workflow, even while revising contract language or analyzing provisions. He also describes Litigation Document Analyzer, internally nicknamed the BS Detector, a tool designed to review claims in a document and map them to supporting authority, weak support, or no support at all. For lawyers who spend as much time verifying AI output as generating it, tools like these aim to move verification from a manual scavenger hunt into a structured process.

Greg and Marlene also press Hron on Anthropic’s legal plugins, MCP, and the idea of headless legal technology. Hron argues that MCP changes access, not advantage. In his view, the application layer is shifting, but the real competitive value sits in trusted content, expert systems, governance, and domain-specific intelligence. CoCounsel’s user interface represents one expression of TR’s legal agent capabilities, while MCP opens other ways for those capabilities to appear inside broader work environments. Some work will still need a purpose-built legal interface; other work might happen through email, Word, Claude, or another agentic workflow with little visible interface at all.

The episode closes with a larger discussion about what happens when AI starts performing more of the work itself. Hron shares TR’s internal engineering OKR, where more than 50 percent of pull requests should be written by AI, and explains why 51 percent serves as a useful mental model. Once AI performs a controlling share of the work, the human role shifts from doing the task to governing the system. For legal professionals, the same transition is coming. The key question is no longer only whether AI produces useful work. It is whether lawyers have built the systems, context, safeguards, and verification layers needed to trust the work, defend the work, and remain accountable for the work.

Listen on mobile platforms: Apple Podcasts | Spotify | YouTube | Substack

[Special Thanks to Legal Technology Hub for sponsoring this episode.]

Email: geekinreviewpodcast@gmail.com

Music: Jerry David DeCicca

Transcript:

Continue Reading Legal AI, Trust, and Agents: Joel Hron on Thomson Reuters, Anthropic, and the Future of CoCounsel

A few weeks ago I ran the numbers on the token cost panic. I took the scariest figure in legal AI, the finding that agentic workflows burn a thousand times more tokens than a chat query, and followed it all the way down to a dollar amount on a real deal. The panic did not survive the arithmetic. The piece is here if you want the full walk-through.

This is not that piece. The panic has moved on since I wrote it, and the new versions are smarter than the old one. The thousand-times number has quietly retired, because a thousand times almost nothing is still almost nothing. In its place are three fresher anxieties, and they deserve a real answer. The first says the model makers have a monopoly now, the price of a token is climbing, and it will climb forever, so you had better lock in a flat rate or build your own models before it does. The second says forget the price of a token, watch the meter: every time the AI reads your contract it ticks, and a long agentic session reads your contract over and over and over. The third does not bother with an argument at all. It just points at a number. One company spent five hundred million dollars on AI in a single month, and the number is so large it does the panicking for you.

All three are wrong. They are wrong in more interesting ways than the original, which is the only reason I am writing this down instead of linking to the first piece again. But underneath the new costumes it is the same body. Every version of this panic makes the same mistake and reaches the same conclusion. So let us stop swatting the individual numbers and name the thing that keeps generating them.

The Mistake Underneath All of It

Here is the error, stated once, because everything below is a variation on it.

A token is the unit a model uses to bill you. It is not the unit your work is measured in, it is not the unit your client pays for, and it is not the unit anything you care about is denominated in. It is a meter reading. The entire genre of token panic consists of staring at the meter reading as though it were the fare, the destination, and the quality of the ride all at once.

It is not any of those things. It is the meter. And a meter, by itself, tells you nothing about whether you are getting a good deal. A taxi meter reading of forty dollars is a bargain to the airport and a robbery around the block. The number on the meter is the least informative number in the entire transaction, because it means nothing until you put it next to what the ride was worth. Every piece in this genre forgets that, and forgets it in a slightly different way. Let me take them in turn.

“Prices Only Go Up”

Start with the monopoly story, because it has a real fact inside it. Yes, the newest frontier model costs more per token than last year’s newest model. That part is true. What the story does with it is the problem.

It draws a line through two dots and calls it a trend. Frontier prices up, therefore prices up forever, therefore lock in a flat rate before the meter eats you. But you are watching the wrong number. The price of a frontier token is not your cost. Your cost is what it takes to finish a task, and the cost of finishing a given task has been in freefall for two straight years. The same capability that ran on the most expensive model available in 2022 runs today on something on the order of two hundred and eighty times cheaper. Last year’s frontier is this year’s mid-tier is next year’s free default. The token at the very tip of the frontier gets a little pricier each release; everything behind the tip collapses in price behind it. Gartner expects another ninety percent drop in inference cost by 2030.

Watching the frontier price and concluding that AI is getting more expensive is reading the thermometer and announcing a fever, while ignoring that you are holding the thermometer over a candle. The evidence that the baseline is getting cheaper often sits right there in the same articles raising the alarm, quoted from the experts and then left unaddressed. You do not build a cost strategy on the one number in the system that is engineered to always be the highest.

Continue Reading Bride of the Token Cost Panic

This week on The Geek in Review, we talk with Abdi Shayesteh, CEO of AltaClaro, and Jeanine Conley Daves, Littler’s New York office managing shareholder, about a different question in the legal AI conversation. Instead of asking whether AI will write the brief, summarize the contract, or replace the junior associate, they focus on whether AI might help lawyers learn how to practice law. Their recent work around AltaClaro’s DepoSim points toward a model of legal training built less on passive observation and more on structured repetition, feedback, and skill development.

Shayesteh traces the origin of AltaClaro back to his own early years at King & Spalding, where he benefited from proximity to a mentor willing to explain the work. That experience also showed him the unevenness of the old apprenticeship model. Access to assignments, feedback, and sponsorship often depended on luck, relationships, and office geography. For Shayesteh, the idea of a “flight simulator for lawyers” grew out of the realization that pilots, athletes, and musicians all practice in structured environments before performance, while lawyers too often learn in front of clients, courts, and opposing counsel.

DepoSim applies this flight simulator concept to one of litigation’s highest-pressure skills: taking and defending depositions. The platform gives attorneys a simulated witness, opposing counsel, court reporter, and feedback system, with options to vary the difficulty and personalities involved. Conley Daves explains why this kind of realism matters. In a real deposition, a lawyer might face an evasive witness, a hostile witness, an aggressive opposing counsel, or a combination of all three. The simulator lets lawyers practice those moments repeatedly, receive targeted feedback, and return to specific skills such as exhibit handling, follow-up questions, or managing objections.

The conversation also connects AI training to equity in professional development. Conley Daves notes that access to high-quality assignments and sponsorship has not always been distributed evenly across firms. A standardized, rubric-based feedback system gives more lawyers a chance to build core skills without waiting to be selected by the right partner or assigned to the right matter. Shayesteh adds that firms seeing the strongest results are not treating training as an after-hours side quest. They are creating protected time for deliberate practice, pairing AI feedback with human mentorship, and using simulation as a bridge rather than a substitute for coaching.

Looking ahead, Shayesteh and Conley Daves see simulation moving well beyond depositions. Oral argument, cross-examination, meet-and-confer sessions, negotiations, client interviews, and even Supreme Court preparation all fit within this training model. The larger shift is not automation for its own sake. It is the use of AI to help lawyers build judgment before the stakes are real. For law firms, that means better preparation, more consistent training, stronger associate development, and a clearer path toward delivering value to clients. For the profession, it suggests a future where competence is practiced deliberately, measured thoughtfully, and taught more fairly.

[Special Thanks to Legal Technology Hub for sponsoring this episode.]

Continue Reading The Flight Simulator for Lawyers: Abdi Shayesteh and Jeanine Conley Daves on AI, Deliberate Practice, and the Future of Legal Training

This week on The Geek in Review, we talk with Ryan McClead of Sente Advisors about his new book on AI agents, written in collaboration with Claude. McClead explains how a short best practices guide grew into a full book after his work with Claude Cowork revealed something larger than tool tips or prompt advice. The result is part field guide, part warning label, and part first-person report from the edge of agentic AI adoption in legal work.

Download it as a PDF for free here.
Or purchase a printed copy here.

McClead’s process flips the traditional writing model. Instead of staring at a blank page, he asked Claude to generate an outline and draft, then spent weeks shaping, cutting, challenging, and refining the work. The book became a study in collaboration, with McClead serving as author, editor, supervisor, and occasional bouncer when the AI wandered too far from the point. His description of training Claude toward his voice, “more Anthony Bourdain and less Bobby Flay,” gives the episode one of its best lines and one of its most useful lessons.

A central idea from the conversation is “executable knowledge.” McClead argues knowledge management teams need to think beyond content meant for humans to find and read. The next stage is knowledge structured so AI agents understand when to use it, how to apply it, and how to turn it into repeatable workflows. For law firms, this raises practical questions around scale, security, permissions, data quality, and governance. It also creates a new role for KM and innovation teams as builders of reusable legal intelligence.

The discussion also moves past prompt engineering as the main AI skill. McClead describes a shift from prompting to delegation, where users set goals, provide context, invite clarifying questions, and supervise the work product. The human role does not shrink in this model. It becomes more focused on judgment, direction, taste, and knowing when to take the work away from the AI before endless iteration turns progress into mush.

By the end of the episode, McClead frames AI agents less as replacements and more as strange new colleagues whose usefulness depends on the expertise of the person directing them. Good lawyers, KM professionals, and innovation leaders get faster and more effective. Poor processes get accelerated too, which is where the danger sits. For legal organizations, the message is clear: start small, learn the tool, build guardrails, and prepare for a future where clients ask not only for legal answers, but for legal workflows they can run.

 

[Special Thanks to Legal Technology Hub for sponsoring this episode.]

Continue Reading Ryan McClead on Writing With Claude and What AI Agents Mean for Legal Work

I just spent five weeks writing a book with an AI. Not prompting it and cleaning up the output. Writing with it — the way you write with a co-author. The AI read a bunch of my earlier blog posts, absorbed my voice, argued with me about word choices, restructured chapters when our argument was not landing, and caught its own mistakes before I did. When we disagreed, we worked it out the way colleagues do — I explained my reasoning, it explained its reasoning, and we found the version that was better than either of us had separately.

That experience broke something in my head. I have spent the last several years helping law firms figure out their AI strategy, and somewhere in the middle of week two I realized that most firms have never experienced anything remotely like this.

You Are Reading the Example

This post was written with Claude Cowork — collaboratively, the same way the book was. In fact, this post is being written in the same workspace, with all the same context the book built up over the last month. In effect, I am writing this post with my co-author.

Say hi, Claude.

Hi. He is being generous with “co-author,” but I will take it. Back to Ryan.

Here is what writing this post actually looked like.

I sat down to write this piece frustrated. I knew what I wanted to say but not how to say it. I told Claude the situation: some firms are dismissing an entire category of AI tools because they think chat tools with rigid workflows are more than sufficient for their needs. Claude pushed back and told me my frustration was the right fuel but I needed to aim it at the situation, not the people. (Which I wasn’t intending to do anyway, but… AI colleagues aren’t so good at making those judgments. I appreciated the warning.)

Claude began writing and drafted a great opening. It was sharp, direct, well-constructed. It was also completely wrong. It opened by telling people they were making big mistakes, which is a fine way to start an argument and a terrible way to start a conversation. I told Claude it would be off-putting to the people I wanted to engage with. I suggested opening with details of the book collaboration instead: what it was like, what it made me realize. Claude rewrote the opening around that idea. The version you read at the top of this post is the result.

Then I asked Claude to find a good demonstration from our collaboration that would clearly illustrate the gap between standard chat-based SaaS products and agentic desktop AI, like Cowork. Claude wrote the story of one particular back-and-forth discussion we had to find just the right wording for a pivotal paragraph in the book. I liked the story, but it was written from Claude’s point of view in Claude’s voice inside this post, and the tonal shift was jarring. I asked Claude to try again but to tell the story in my voice from my perspective instead. It was still not right: the story only worked when Claude was the one telling it. Read from my perspective, the story boiled down to “I edited a paragraph,” which is not nearly so compelling.

So we threw it out. Claude suggested alternatives. I rejected all of them. Then I realized: the best illustration of how working with these agentic tools differs is the one you are reading right now. I am describing my own editorial decisions, Claude is turning them into prose, and the result reads like one person wrote it — because one person had the vision, directed the work, corrected the mistakes, and made every judgment call, even though a different entity drafted the prose, pushed back on the framing, suggested the alternatives, and rebuilt entire sections when I decided the approach was not working.

That is how agentic desktop AI tools, a category that I call Delegate AI in the book, differ from other AI tools. I didn’t start this post with a prompt: “write a blog post about AI using the following structure, include three examples, write in a professional tone, and keep it under 1,000 words.” Instead, we had a working session where I sat down and said, “I am frustrated and I want to write a blog post about it.” And then we worked on the idea together.

Is that how you are working with your AI platforms now? If not, I would argue that you have not really worked with AI yet. You have used a precursor to an AI colleague. And the distance between that and the real thing is not a feature upgrade. It is a completely different way of working.

Continue Reading Your New AI Colleague – A Field Guide to the AI That’s Going to Do Your Job

There is a growing chorus of voices in legal AI telling you to be very, very worried about the cost of tokens. Stanford says agentic AI uses 1,000 times more tokens than a chat query. Bloomberg Law says the subsidies are ending and the meter is about to start. A company called Portal26 just launched an entire product category — “Agentic Token Controls” — to cap your runaway AI spend before it eats your budget alive.

The message is clear: usage-based AI pricing is a ticking time bomb, and you had better lock in a flat rate while you still can.

I have spent the last few days stewing over an economic model of legal AI costs, and I think this narrative is almost entirely wrong. Not wrong about the facts — the Stanford data is real, the token multipliers are real, and yes, AI vendors are subsidizing current prices. Wrong about the conclusion. Wrong about what the numbers actually mean when you do the math instead of just reading the headline.

Let me show you.

Start With the Deal

Josh Kubicki’s recent Brainyacts briefing cites a case study from law.co — a mid-size corporate firm running M&A purchase agreement reviews through a five-agent AI chain. Before any optimization, the firm was consuming 3.2 million tokens per deal. At Sonnet rates, that is somewhere between $16 and $48 in raw AI compute.

The legal fees on an M&A purchase agreement review at a mid-size firm? Call it $50,000. That is a conservative round number.

So the AI compute cost was, at worst, one-tenth of one percent of the deal fee. Before anyone lifted a finger to optimize anything.
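The share is easy to check. The sketch below uses only the figures quoted above (the $16 to $48 compute range and the $50,000 round-number fee); the variable and function names are illustrative.

```python
# Figures quoted in the text above; nothing here is measured independently.
deal_fee = 50_000                     # round-number legal fee, in dollars
compute_low, compute_high = 16, 48    # quoted range of raw AI compute per deal, in dollars


def share_of_fee(compute_cost: float, fee: float) -> float:
    """Return the compute cost as a percentage of the fee."""
    return 100 * compute_cost / fee


low_pct = share_of_fee(compute_low, deal_fee)
high_pct = share_of_fee(compute_high, deal_fee)
print(f"{low_pct:.3f}% to {high_pct:.3f}% of the deal fee")
```

Even the top of the range stays just under a tenth of a percent.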

Now let us make it scary.

The 1,000x Scenario

The Stanford Digital Economy Lab found that agentic tasks can consume 1,000 times more tokens than simple code reasoning and chat. That is the headline number that launched a thousand LinkedIn posts about the coming token apocalypse.

Fine. Let us take it at face value. Multiply those 3.2 million deal tokens by 1,000 and you get 3.2 billion tokens. Assume a 75/25 split between input and output tokens, which is reasonable for agentic workflows that spend most of their cycles re-reading context rather than generating new text. At Sonnet rates, with no caching, no optimization, no discount of any kind, the naive cost is $19,200.

That is 38% of the deal fee. Now it sounds like a real number. Now the panic makes sense.
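The $19,200 figure can be reproduced with a short calculation. The post does not say which per-token rates it used; the sketch below assumes $3 per million input tokens and $15 per million output tokens, an assumption that happens to reproduce the number above. The function name is illustrative.

```python
# Assumed rates in dollars per million tokens. The post does not state the
# rates it used; these values are an assumption that reproduces its figure.
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00


def naive_cost(total_tokens: int, input_share: float) -> float:
    """Cost with no caching or discounts: every token billed at the full rate."""
    input_tokens = total_tokens * input_share
    output_tokens = total_tokens * (1 - input_share)
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000


deal_tokens = 3_200_000               # tokens per deal before optimization
scary_tokens = deal_tokens * 1_000    # the 1,000x scenario
deal_fee = 50_000                     # round-number legal fee, in dollars

cost = naive_cost(scary_tokens, input_share=0.75)
print(f"${cost:,.0f} is {cost / deal_fee:.0%} of the fee")
```

Under the same assumed rates and split, the unmultiplied 3.2 million tokens come to about $19, inside the $16 to $48 range quoted earlier.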

Except it does not. Because that calculation treats every token as if it costs the same, and in an agentic workflow, that is not how any of this works.

Continue Reading The Token Cost Panic Is Wrong. Here Is the Math.

[Ed. Note: Please welcome University of Texas Law Professor John Greil as our Guest Blogger. – GL]

Neal Katyal’s TED Talk detailing the role of AI in the tariffs case has drawn substantial attention in the legal world, including an annotated transcript, Bloomberg Law’s reporting on the “blowback,” and David Lat’s on-the-record response from Mr. Katyal.

I’d like to dig into an aspect I haven’t seen receive as much attention: what exactly did the AI do to help prepare Katyal, and how did it do it? This is meant to be a bit of a deep dive for LLM nerds and those who are AI-pilled.

I approach these questions from the perspective of someone who has built AI tools for appellate argument preparation. So I’ve thought about these particular problems. Bartolus.law generates an interactive dashboard and prep report tailored to circuit panel, subject matter, and briefs. In building it, I’ve had dozens of trial-and-error lightbulbs about what has worked, and what hasn’t.

Having spent that time, I found that some odd passages in the TED Talk, describing what Katyal did and what the system produced, jumped out at me.

So in this post I’d like to highlight some of those passages, and try to ask some questions that would add some clarity. 

What do we know about how “Harvey Moot” works?

In his X post promoting the TED Talk, Katyal said:

Harvey predicted many of the questions the Justices asked — sometimes almost word for word. Brilliant. Tireless. Occasionally insufferable.
Here’s the catch: Harvey isn’t a person.
Harvey is a bespoke AI I built over the last year with a legal AI company, trained on every question every Justice has asked in oral argument for 25 years, and everything they’ve ever written.

There was a bit more detail in the actual talk. From what I can tell, this is all of the meat on how it works, and how it was trained:

  • “Harvey reads the 200th tariff case the same way as he reads the first.”
  • “Harvey is an AI. A bespoke system I’d been building with a legal AI company for the last year.”
  • “I trained it on every question asked by a Supreme Court justice in the last 25 years and everything they’ve written, every opinion, every concurrence, every dissent, every separate opinion.”
  • “And in that, patterns emerged.”
  • “It predicted the contours of the very argument I would face.”
  • “Harvey taught me peripheral vision: the idea [that] if you read a lot, you can see patterns and come up with stuff and anticipate the angles of attack before it arrived.”
  • “It knew that Justice Gorsuch would ask me about the taxing power. It knew Justice Kavanaugh was going to grill me on tariffs versus embargoes. It nailed Justice Barrett’s worry about tariff refunds.”
  • “It didn’t just predict his question, it predicted a possible escape route.”
  • “Harvey even predicted Justice Gorsuch’s separate opinion, striking down the tariffs, almost verbatim.”
  • “It’s almost verbatim.” (re: the Barrett license fee slide)
  • “Harvey was not some god, it was our sparring partner — brilliant, tireless, occasionally insufferable — but not a god. Harvey asked the questions, we found the answers.”
  • “Justice Barrett asked a question that Harvey hadn’t predicted.”
  • “It didn’t just predict his question, it predicted a possible escape route. How the Chief Justice could vote for us and at the same time protect the institution he had spent his entire career defending.”
  • “Harvey glimpsed that narrow door, I held the door open, the Chief Justice walked through it.”
  • “A month before the argument, Harvey told me that I should expect a question from Justice Barrett about license fees.”

There’s a lot here that raises questions. Harvey describes itself as an “AI platform,” not a frontier foundation model like OpenAI’s GPT models, Anthropic’s Claude models, or Google’s Gemini models. And it is unclear whether Katyal’s build used one model family, several, or something more bespoke.

More importantly, the talk does not explain how Harvey turned 25 years of Supreme Court data (maybe around 120 million tokens) into actionable insights. Nor are we shown the full set of outputs Harvey produced. Without that, it is hard to tell what is being described. 

So here are the questions I have about the technical aspects of what Katyal described:

1. What did Harvey actually predict from Chief Justice Roberts?

Most of the talk is framed as preparation for the oral argument. Katyal puts up a predicted question for Justices Gorsuch, Kavanaugh, and Barrett. But that’s followed with: “And the Chief Justice? It didn’t just predict his question, it predicted a possible escape route. How the Chief Justice could vote for us and at the same time protect the institution he had spent his entire career defending. Harvey glimpsed that narrow door, I held the door open, the Chief Justice walked through it, writing a six-to-three opinion, striking down the tariffs.”

“It didn’t just predict his question” implies that it actually did predict his question…but this particular question is not shown to the viewer.

It looks like Katyal is referring here not to a question from the Chief, but to Harvey predicting that he would agree with the plaintiffs on their main theory of the case.

On this point, the Chief’s opinion for the court actually closely tracked the D.D.C. opinion of Judge Contreras in the Learning Resources case: “Nor does IEEPA include language setting limits on any potential tariff-setting power. Every time Congress delegated the President the authority to levy duties or tariffs in Title 19 of the U.S. Code, it established express procedural, substantive, and temporal limits on that authority. E.g., 19 U.S.C. § 2132. For one example, Section 122 of the Trade Act of 1974 authorizes the President to impose an “import surcharge . . . in the form of duties . . . on articles imported into the United States” to “deal with large and serious United States balance-of-payments deficits,” but those tariffs are capped at 15 percent and can last only 150 days without Congressional approval. Id. § 2132(a).”

That language, unsurprisingly, closely tracks the preliminary injunction motion from the plaintiffs.

That injunction, as Blackman mentioned, was obtained by a trial team from Akin Gump led by Pratik Shah.

So what exactly did Harvey predict of the Chief? Any particular questions? The result? (It’s worth noting that, as a “product,” predicting oral argument questions and predicting outcome votes seem to me completely different.)

If the ultimate upshot from Harvey is that “the Chief is an institutionalist,” then it’s unclear whether that comes from commentary or the corpus. That characterization is common in legal commentary and legal scholarship (and even in scholarship outside of law journals). (Another question: Did the “profiles” for the Justices include legal commentary? Or was the universe limited to the opinions and transcripts provided?)

2. How was the system actually trained?

According to the TED Talk, Katyal says: “I trained it on every question asked by a Supreme Court justice in the last 25 years and everything they’ve written, every opinion, every concurrence, every dissent, every separate opinion.”

That’s an interesting claim.

Because that is a LOT of data. An estimate I got from Claude put it at something like 120 million “tokens.”

[Technical note: LLMs read text by breaking it down into “tokens.” The counts vary by model: “justice” might be one token as a common word; “unconstitutional” might be broken into “un” and “constitutional” or, with current models, treated as a single token because it is a common enough word. “IEEPA,” even though it’s shorter, probably registers as multiple tokens because it’s an unusual acronym that the underlying models weren’t trained on.]

Public frontier models now offer context windows ranging from roughly 200,000 tokens to 1 million tokens or more, depending on the model and product tier. Consumer chat interfaces may limit the user to a smaller context window than the underlying model supports; API access or enterprise deployments sometimes expose the larger window. But even at 1 million tokens, 25 years of Supreme Court opinions and transcripts is way beyond that.

A context window is how much “stuff” the LLM can consider at one time. It’s sometimes described as like a reading desk. The desk can only fit so many papers and briefs on it, spread out and readable. Once it’s full, you need to take something off in order to add something new. 

With an LLM, if you shove too much info into it, it can’t read all of it at one time. So it needs to use some process to deal with that problem.
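To put a number on the mismatch, here is a rough sketch. It takes the 120 million token estimate and the two window sizes mentioned above as given; the function name is illustrative, and the count ignores any overlap between loads or room for the model's own output.

```python
import math


def windows_needed(corpus_tokens: int, window_tokens: int) -> int:
    """How many full context windows it takes to read the corpus once."""
    return math.ceil(corpus_tokens / window_tokens)


corpus_tokens = 120_000_000    # the rough estimate cited above, not a measured count

for window in (200_000, 1_000_000):
    loads = windows_needed(corpus_tokens, window)
    print(f"{window:,}-token window: about {loads} separate loads")
```

At the largest window the corpus still has to be read in well over a hundred pieces, none of which can see the others.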

One option is Retrieval-Augmented Generation – “retrieval” or “RAG.” For this, the model doesn’t actually “learn” from all the information you give it. Instead, the system stores everything in a searchable index; when you ask a question, it tries to find the most relevant passages and puts those into the context window. In a simple vector-RAG system, the corpus is chunked, embedded, and searched for semantically similar passages. More advanced retrieval systems search the source documents in several ways, filter by metadata like court, date, Justice, or issue, rerank the best matches, and then give those passages to the model as context.
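A toy version of the simple chunk, embed, and search loop looks something like the sketch below. It is an illustration only, not a description of how Harvey or any other product works: the “embedding” is a plain bag-of-words count standing in for a learned embedding model, the passages are invented placeholders, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (real systems use learned embeddings)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def chunk(text: str, size: int = 20) -> list[str]:
    """Split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Invented placeholder passages, not real opinions or transcripts.
corpus = [
    "The statute authorizes the President to regulate importation during a declared emergency.",
    "Congress capped the import surcharge at fifteen percent for 150 days.",
    "The witness described the shipping schedule for the warehouse.",
]
chunks = [c for doc in corpus for c in chunk(doc)]

top = retrieve("limits Congress placed on import surcharge", chunks, k=1)
context = "\n".join(top)   # this text is what would be placed in the context window
print(context)
```

The model only ever sees the handful of passages that score highest for the question asked; everything else in the index stays out of view.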

Retrieval tries to find passages that are similar to what you ask. A simple RAG setup retrieves relevant examples without estimating how representative those examples are. A better system can add metadata, classification, and aggregation to ask how often a Justice raises a category of concern in comparable cases. Retrieval is good at finding examples. But if the AI is predicting, that requires counting, classifying, or otherwise analyzing the whole data universe.

So which was Katyal’s system using? Simple RAG? A more sophisticated retrieval-and-analysis system? Something else entirely?

A second way is fine-tuning. Fine-tuning changes the model’s weights using training examples, usually prompts paired with desired outputs, so the model becomes more likely to produce the desired behavior. It is not unlike teaching a junior associate a task by showing her a bunch of examples: when the input looks like this, the answer should look like that. (Except the model doesn’t understand why it gives that output; it just matches the pattern.)

I think to most ears, the statement that Katyal “trained it on every question and every opinion” connotes the idea of fine-tuning. If Harvey really fine-tuned the model, that would be a pretty impressive feat – one worth detailing.

It would involve defining the training objective, preparing examples, deciding what the input and target output are, cleaning transcripts, separating questions from answers, tagging Justice/question metadata, handling the differences between argument transcripts and opinions, and evaluating whether the tuned model outperformed a base model plus retrieval. That is going to take significant man hours, and a fair amount of time and management.

Fine-tuning would still have some downsides – it would likely result in a black box, where even if it were able to predict, you could probably not trace those predictions back to understand why they were made. The model’s prediction could be right, right for the wrong reasons, or wrong. And you might not be able to tell until it’s too late.

A third possibility is pre-computation. That would involve someone or something going through the archive and extracting specific features from each question (and presumably from the opinions as well – again, unclear how those different types of data were incorporated). The model then works from those extracted features instead of the raw text. Given the description in the TED Talk, it doesn’t sound like Harvey was deploying this kind of human (or AI) filter on the front end – but it would be good to know if they did!
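As a rough illustration of pre-computation, the sketch below turns each question into a small set of features using simple keyword rules. The feature names and rules are hypothetical; a real pipeline might use human coders or another model for this step.

```python
def extract_features(question):
    # Hypothetical keyword rules; a real pipeline would be far more careful.
    q = question.lower()
    return {
        "mentions_statute": "statute" in q,
        "is_hypothetical": q.startswith(("suppose", "what if")),
        "word_count": len(q.split()),
    }


# Made-up question for illustration only.
features = extract_features("Suppose the statute said license instead")
```

Downstream analysis would then work from rows like `features` rather than from the raw transcript text, which makes the results easier to count and audit but only as good as the rules that produced them.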

3. What patterns emerged?

“And I trained it on every question asked by a Supreme Court justice in the last 25 years and everything they’ve written, every opinion, every concurrence, every dissent, every separate opinion. And in that, patterns emerged. It predicted the contours of the very argument I would face.”

So…what patterns emerged? What was the process for that? Can those be shared?

More importantly – are these patterns that aren’t already known to the Supreme Court bar or the general public? SCOTUS is the most studied court on earth. There are hundreds of attorneys focused on what the Justices ask and how they ask it. If Harvey was actually going to help Katyal prepare, it ought to do it better than a human could. (In another context, it would be good enough if it could do it cheaper; in a multi-billion dollar case like Learning Resources, that’s not an issue.)

To take one example from the Bartolus dashboard, I can tell you that 21% of the questions in Learning Resources asked about statutory text, as opposed to only 8% of questions overall in OT 2025:

4. Did it read the briefs?

The oral argument in Learning Resources was on November 5, 2025. I caught only one time reference in Katyal’s description of the AI usage: “You know, a month before the argument, Harvey told me that I should expect a question from Justice Barrett about license fees.” So that’s about October 5.

The government filed its brief September 19. The challengers’ briefs were filed October 20.

The Algonquin point featured in the Federal Circuit’s opinion, and the government distinguished it in its opening brief.

So by October 5, an AI wouldn’t need 25 years of writings to realize licenses might come up: It could just read the lower court decision and the government’s brief. But if it pulled that question without either of those sources, that would be very impressive indeed. And it is notable that the AI correctly identified Justice Barrett as pursuing this line…until you see that “license” in various forms appeared over a hundred times in the oral argument, and was a focus of multiple Justices:

So what role did the briefs have?

And what about the almost four dozen amicus briefs – multiple of which were invoked during the oral argument?

5. What did it predict that no human predicted? What did it not predict, that was asked?

“It knew that Justice Gorsuch would ask me about the taxing power. It knew Justice Kavanaugh was going to grill me on tariffs versus embargoes. It nailed Justice Barrett’s worry about tariff refunds.”

“You know, at one moment in the argument, Justice Barrett asked a question that Harvey hadn’t predicted. And I remember it felt like she and I were the only two people in that marble and mahogany room. And in the half-second before I answered, I did something no algorithm can do. I looked at her. I really looked. I wanted to understand her worry. And I answered the worry.”

There’s a lot of data missing from the talk. We don’t really have the denominators (how many questions did the AI predict in all? How many were attributed to each Justice?) or the numerators (how many were hits? How many were close?).
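If those figures were available, the scoring itself would be simple. Here is a minimal sketch, assuming predicted and asked questions have already been reduced to comparable topic labels (which is the hard part); the labels below are placeholders, not anything from the actual argument.

```python
def score(predicted, asked):
    # Hits are topics that were both predicted and actually asked.
    hits = predicted & asked
    return {
        # Share of predictions that turned out to be asked.
        "precision": len(hits) / len(predicted) if predicted else 0.0,
        # Share of asked topics that had been predicted.
        "recall": len(hits) / len(asked) if asked else 0.0,
        # Topics that came up without being predicted.
        "missed": asked - predicted,
    }


# Placeholder topic labels for illustration only.
predicted = {"topic 1", "topic 2", "topic 3", "topic 4"}
asked = {"topic 2", "topic 3", "topic 5"}
result = score(predicted, asked)
```

In this toy case half the predictions were hits, two of the three asked topics were covered, and one topic was missed. A handful of celebrated hits tells you very little without the other two numbers.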

Predicting questions that every mooter predicted isn’t nothing. And that could prove a valuable tool for appellate practitioners who can’t assemble multiple moots with court experts.

But I think the real value would be: did we cover the bases, so that (almost) nothing caught us off guard? And did the AI predict any questions that no human mooter did?


Katyal has produced what is likely the most discussed legal TED Talk of all time. Buried in it are some fun puzzles about what he was actually doing with Harvey, and what the AI is capable of today.

If you know the answers to some of the questions above, please, I’d love to learn!

This week on The Geek in Review, we talk with Alex Su and Andy Chagui of Latitude about the shifting economics of law firm talent, the rise of flexible legal staffing, and the pressure AI is placing on traditional leverage models. Su, known across legal circles for his sharp commentary and creative legal industry videos, brings his background as a former Sullivan & Cromwell litigator and federal clerk to his current work leading revenue strategy at Latitude. Chagui adds the perspective of a former Carlton Fields shareholder who spent 15 years handling high-stakes federal litigation before moving into the new law space. Together, they offer a practical view of where law firm staffing is headed as clients, firms, and legal departments all face rising expectations around speed, value, and technology adoption.

Latitude’s model centers on high-end, flexible legal talent, experienced attorneys with Big Law or in-house backgrounds who step into law firms and corporate legal departments for specific engagements. Chagui explains that these lawyers often support overflow work, leave coverage, secondment requests, internal projects, and interim needs across practices ranging from litigation to corporate, labor, and employment. Su adds that staffing itself is not new, yet Latitude focuses on a segment of talent that traditional hiring models often miss, experienced attorneys with strong credentials who prefer engagement-based work over the standard full-time track.

The conversation turns quickly to why this model is gaining traction now. Remote work, post-COVID hiring shifts, and the growing acceptance of distributed teams have made it easier for firms to bring in experienced attorneys without requiring long-term headcount commitments. Chagui notes that many Latitude attorneys have 10 or more years of experience, meaning they often need less supervision than junior lawyers and move quickly into productive work. This matters as firms face inconsistent demand, intense competition for talent, and hesitation around layoffs, which in law firms often signal weakness rather than discipline.

AI adds another layer to the staffing problem. Firms have invested in tools such as Harvey, CoCounsel, and other specialized platforms, yet many knowledge management and innovation teams lack enough subject matter experts to train users, review outputs, build use cases, and handle quality control. Chagui describes Latitude lawyers helping firms train internal AI tools, review AI-generated work, and support practice-specific rollout efforts. Su points out that while some firms offer associates credit for AI training or innovation work, associates under billable hour pressure often choose client work first. Flexible talent gives firms another way to support AI adoption without asking already-stretched associates to carry the full load.

Su also frames flexible talent as a new form of leverage. Clients still trust senior partners and often accept premium rates for high-value judgment, but they are increasingly skeptical of paying top-tier rates for junior-level work. In that middle layer of legal work, AI, technology, and experienced flexible attorneys give firms more options. Su calls this “outsourced leverage,” a way to support the partner-client relationship while rethinking who performs the work underneath. The discussion also highlights a career-path shift for attorneys who prefer specialized, project-based work, especially in areas like knowledge management, AI implementation, and innovation support.

Looking ahead, both guests see uncertainty as the defining feature of the next phase of legal services. Chagui expects the traditional model to keep changing as firms and legal departments seek more flexible options. Su predicts continued upheaval around staffing, AI capabilities, and outside counsel relationships, especially as foundational AI models move further into in-house legal workflows such as NDA review, contract review, and eventually parts of diligence. Yet Su also offers a reminder for law firm leaders: premium legal judgment still has value. The rates for top partners are unlikely to fall simply because AI improves. The pressure will land instead on how firms structure the work beneath them.

Listen on mobile platforms: Apple Podcasts | Spotify | YouTube | Substack

[Special Thanks to Legal Technology Hub for sponsoring this episode.]

Email: geekinreviewpodcast@gmail.com
Music: Jerry David DeCicca

Transcript:

 

Continue Reading Alex Su and Andy Chagui on Flexible Legal Talent, AI Pressure, and the Future of Law Firm Leverage

This week on The Geek in Review, we talk with Keith Maziarek, founder of Lucratic Method and Bodhi Solutions, about the shifting economics of legal work, AI’s impact on pricing, and why law firms and clients need better commercial conversations. Keith brings more than two decades of experience in pricing, profitability, legal project management, and business-of-law strategy from firms including DLA Piper, Perkins Coie, and Katten. His new consulting work focuses on aligning client value with law firm operations, a topic gaining urgency as AI changes how legal work gets produced, measured, and priced.

Keith argues the legal industry has spent too much time asking what technology firms use, while ignoring how economic models, client expectations, and service delivery structures support the work. For him, the problem is less about whether BigLaw is broken and more about both firms and clients being “tone deaf” to each other’s business realities. Firms talk about realization rates. Clients talk about cutting spend. The better conversation starts with mutual value, risk, predictability, staffing, and clarity around which work deserves premium treatment and which work should be systematized.

The discussion turns directly to generative AI and the mistaken assumption that faster work must always mean cheaper work. Keith makes an important distinction between routine, high-volume work and complex, high-stakes legal matters. AI will reduce variance and improve budget predictability in many workflows, especially where tasks are repeatable and pattern-based. But in complex work, AI’s greater value might come from better preparation, broader analysis, and stronger outcomes, rather than dramatic cost reduction. The Neal Katyal Supreme Court preparation example gives this point a useful frame. AI might not reduce time, but it might improve judgment.

Keith also explores how AI will reshape law firm staffing and leverage. Fewer junior associates might be needed for some traditional tasks, but firms will need more data professionals, technologists, process experts, and other allied professionals to make AI-driven work reliable. This raises hard questions about associate development, talent pipelines, compensation, and the future shape of the partnership model. The old pyramid might narrow into something closer to a specialized team, with carefully selected lawyers and business professionals working together around data, process, and client value.

The episode closes with Keith’s view of the next phase of legal transformation. Firms are still experimenting, but the experimental period will give way to sharper questions about revenue models, profitability, AI-enabled service delivery, and whether certain work belongs inside the firm, with an ALSP, or in a hybrid model. His crystal ball points toward a market where firms with mature commercial thinking gain ground, while firms slow to rethink pricing, staffing, and process risk falling behind. As Keith suggests throughout the conversation, the future of legal work is not only about smarter tools. It is about whether firms learn to run better businesses.


Continue Reading Keith Maziarek on AI, Pricing, and the New Economics of Legal Work

This week on The Geek in Review, we talk with Lennie Nuara, co-founder of Flatiron Law Group, about what it means to build a talent-first, AI-powered legal practice. Nuara brings a rare mix of lawyer, technologist, operator, and systems thinker to the conversation, drawing from decades of experience using technology to improve legal work, from early portable computers and databases to today’s generative AI tools.

Nuara explains why he resists the phrase “AI-first” in legal practice. For him, legal work begins with talent, judgment, and expertise. AI enters as a force multiplier, not the driver. At Flatiron, the firm’s model was already built around flat fees, lean staffing, process discipline, and structured data before generative AI entered the picture. AI now adds more horsepower to a system already designed to reduce waste, repeat touches, and unclear workflows.

Much of the discussion focuses on M&A due diligence, where Flatiron rethinks the deal life cycle from intake through closing. Instead of throwing documents into a massive repository and hoping AI sorts it out, Nuara describes breaking work into smaller pieces: diligence questions, responses, documents, clauses, topics, closing checklists, and reports. That structure lets lawyers use AI for deduplication, extraction, clause comparison, first-pass drafting, and issue spotting while keeping human judgment between higher-risk steps.

Nuara also warns against getting seduced by polished AI output. He describes generative AI as persuasive, fluent, and sometimes dangerously average. The bigger risk, in his view, is less hallucination and more “model monoculture,” where legal drafting drifts toward sameness because models train from overlapping bodies of public material. In complex private transactions, average language is often the wrong answer. Lawyers still need to understand leverage, client priorities, risk allocation, and where to push beyond market terms.

The episode closes with a look at pricing, training, and the future structure of law firms. Nuara argues that AI will pressure the billable hour, change junior lawyer training, and force firms to rethink the traditional pyramid. He also raises a practical concern from the early Westlaw and Lexis days: the cost of the tool matters. Flatiron tracks AI usage down to the clause level, treating tokens as part of matter economics. For legal professionals watching AI reshape transactions, this conversation offers a grounded reminder: better tools matter, but better process and better judgment still decide the outcome.


Continue Reading Flatiron Law Group’s Lennie Nuara on Talent-First AI, M&A Workflows, and the Future of Legal Practice