Artificial Intelligence Compendium: AI25004 Tandem LLM Patient Capture App V01 250925

Blown away by a single word

Regular readers may have figured out that I spent a bit of time in hospital in 2024. While I keep declaring it was “no big thing” and sparing the tender sensibilities of friends by avoiding any detailed descriptions, the truth is that it’s taken a year for my injury to heal. From the start I was left in no doubt that I had dodged a bullet, and that the healing process would be protracted and occasionally unpleasant.

Since then, my situation has come to the attention of a dozen clinicians spread across five specialities in three different but neighbouring health authorities. Incredibly, even though I can walk from one health authority territory to another in a matter of minutes, each of these huge organisations has exercised its right to opt out of central NHS software package selection and operation, and as a result communication between the specialists has often come down to me or my carer being handed little scissor-cut rectangles of paper about a quarter the size of a banknote. These contain vital email addresses that act as information clearing-houses to lead the way out of this information non-distribution labyrinth. We were told, more than once, that we should under no circumstances divulge these addresses.

None of these clinicians could access cloud apps, either. Downloading anything like a client app to a medical workstation is grounds for dismissal. In the end I have carried my MRIs, X-rays and test results around on a USB key – which they’re not supposed to plug in, either.

So my interest was already piqued when a press release turned up from the software house Tandem (tandemhealth.ai). The company is related neither to the mini-computer company of the 1980s nor to the PC manufacturer Tandon, and its release claimed the usual 200 impossible things before breakfast for a relatively humble LLM-based patient data capture app aimed at GPs and admitting clinicians.

Now, these are people who are already very bored with spending their time typing like demons into forms using a data-input process that’s the direct opposite of how a typical Q&A-style diagnostic process normally operates. Most annoyingly, the physician has to make initial assumptions to fill in these classic composite documents, plumping for the right category in a drop-down list right up front long before they come up with the single, simple question which turns the whole process of disease identification upside down. I’ve always taken a dim view of input forms on web pages or apps that purport to be processing what you put into them, when actually they’re just an inflexible series of assertions that oblige you to do the processing, simply to get round the pesky input validations.

This app, said Tandem, is different. Would I like a test run? I was even saved the bother of having to throw together some fictionalised patient with their sufficiently complicated mythical ailment, because I could just use my own history from the past couple of years.

I kept the worst of my details secret; all I will say is that you know you’ve had it bad when the MD-turned-developer running the demo starts to look a bit pale and queasy. The idea was to chat away, as you might when consulting a new clinician, something I’d had plenty of practice with. When discussing the whole arc of my experience, I like throwing in a few wry observations here and there, such as the time my old dentist fell off his garage roof while messing with his smart lighting installation. I also like to drop in some basic stats from my most active cycling period, just to cut short any conversations about the usual “smoke, drink, what about hobbies?” lifestyle questions. This time I just said that I’d been a hardcore cyclist, something I hoped to get back to as soon as the wounds all healed up.

The LLM ghost in the machine took away our highly informal conversation and mulled it over for a few minutes. What came back was frankly incredible. I’m an evil system tester, thinking up software-breaking situations, and this occasion was no different. Using my history over the past year made for some authentic interview divagations: we rambled over the whole subject, and the fact is that at the end the summary had completely ditched the dentist on the roof bit, and retained “hardcore cyclist”, putting it in the “social, sports and fitness” category.

I was impressed, before moving rapidly into flabbergasted territory when the demo doctor said: “This is the surgical summary. Doctors love their jargon and their expertise-driven rules, and this can produce a nightmarish workload if the GP has to send reports off to several different specialities. The AI in this case can generate an expertise-specific summary at the click of a button, doing different template transformations on the same single central transcript.” As he spoke he flipped the report through half a dozen different formats, for surgical guys, foot guys, diabetes guys, as quick as he could move the mouse.

This, he said, accounts for 30% of the time the app claims to save per week for GPs and hospital clinicians. I actually came across the concept of template-driven interaction a while back, in the context of preparing training materials for employees or customers; in this case the template is a mixture of layout and rules of precedence for various parts of a clinical picture, because the heart guys want the heart stuff on page one while the foot guys don’t mind if it’s on page three.
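To make the idea concrete, here is a minimal sketch of "one record, many templates" in Python. It is my own illustration, not Tandem's code: the record contents, the section names other than “social, sports and fitness”, and the `Template` and `render` names are all invented, and it assumes the transcript has already been boiled down to labelled sections.

```python
from dataclasses import dataclass

# Hypothetical structured record, standing in for whatever the app extracts
# from the transcript. Section names and findings are invented for illustration.
RECORD = {
    "cardiac": "No known heart conditions.",
    "podiatry": "Wound on left foot, healing slowly.",
    "social, sports and fitness": "Hardcore cyclist.",
}


@dataclass(frozen=True)
class Template:
    """A report template: a title plus the order in which sections appear."""
    title: str
    precedence: tuple  # section names, most important first


def render(record: dict, template: Template) -> str:
    """Lay out one record according to one template's rules of precedence."""
    # Sections named by the template come first, in the template's order;
    # anything the template does not mention follows in alphabetical order.
    listed = [s for s in template.precedence if s in record]
    rest = sorted(s for s in record if s not in template.precedence)
    lines = [template.title]
    for section in listed + rest:
        lines.append(f"{section.upper()}: {record[section]}")
    return "\n".join(lines)


# Assumed templates: the heart people want cardiac first, the foot people don't.
CARDIOLOGY = Template("Cardiology summary", ("cardiac",))
PODIATRY = Template("Podiatry summary", ("podiatry", "cardiac"))
```

Calling `render(RECORD, CARDIOLOGY)` and `render(RECORD, PODIATRY)` gives two reports with identical facts in a different running order, which is the whole trick: the record is built once and only the layout rules change.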

This is an aspect of AI that most of the punditry and coverage has forgotten. Give the machines something consisting of verified factual reporting and straightforward observation, and their latitude for emitting crap is taken away. This clinical summariser is constrained by some basic rules and requirements, some expressed in the code, others in the initial templates. With all that random freedom taken away, the AI decisions become less random, less a matter of tossing a coin on each re-run: indeed, the ability to re-run for different specialities becomes vital to the clinical outcome. Something I had certainly experienced this year as the list of involved parties grew to four health authorities (five if you include Boots) prescribing nine medications with at least four different sets of rules which frequently contradicted one another.

I’m somewhat miffed by the thought that we have been wasting a lot of intellectual time on mythological interpretations of what AI is and can do. To my mind, this type of guided-output application is a whole different category from your regular MLs, LLMs, etc. This isn’t some Doctor Who-type global master brain: it’s a servant AI, working silently and rapidly inside the Azure cloud platform. Its best day comes when the reports generated by a patient’s affliction are delivered to their audience and consumed without any extra comment about being on any kind of leading edge whatsoever. It looks like business as usual.

I don’t know about you, but to me this is an area that most business IT managers or proprietors are entirely missing out on. The IQ of the summariser alone is enough to get businesspeople thinking: how about turning those rambling calls that follow a fender-bender accident into a crisp couple of sheets of A4, with the underlying conversation kept for a while to allow reports suitable for lawyers, loss adjusters and third-party risk types to be called up at will? This represents a whole new way to do business, with messy human chit-chat as the entry point and crisp, business-ready summaries as the output.

Hubris in Tandem

I’m writing this in Germany, on hotel Wi-Fi. Tandem’s website is so smart, it detects where I am and changes language accordingly. It is more than diligent: I literally cannot read its English language site while not in an English-speaking location. The little daemon of the site flips me back to local, every time. Full marks, guys, and HOW DO I TURN IT OFF?

No AI has been left to make any decision in the domain of expertise. The usual criticisms of “black box” stage magician-style decision-making, and of different results with each re-run of the request, don’t apply here, because the whole point of the system is that it has to be able to run the same input with a different output template and provably not suffer any crisis of belief on the part of the subject, the professional, or the recipient. No small task.
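For the sceptical system testers among us, the re-run claim is at least cheap to check from the outside. The sketch below is mine, in Python, and makes no claim about how Tandem tests anything: the trivial `render` function is a stand-in for the real report generator, and the facts and template orders are placeholders. It fingerprints each report with SHA-256 so that re-runs can be compared byte for byte, and separately checks that two templates carry the same facts.

```python
import hashlib


def render(facts: dict, order: list) -> str:
    """Stand-in generator: the same facts and order always give the same text."""
    return "\n".join(f"{key}: {facts[key]}" for key in order)


def fingerprint(text: str) -> str:
    """SHA-256 digest of a report, used to compare re-runs byte for byte."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def is_repeatable(facts: dict, order: list, runs: int = 5) -> bool:
    """True if every re-run of the same input and template gives one digest."""
    digests = {fingerprint(render(facts, order)) for _ in range(runs)}
    return len(digests) == 1


def same_facts(report_a: str, report_b: str) -> bool:
    """True if two reports contain the same lines, whatever their order."""
    return sorted(report_a.splitlines()) == sorted(report_b.splitlines())


# Placeholder facts and template orders, invented for this example.
FACTS = {"history": "Injury in 2024.", "lifestyle": "Hardcore cyclist."}
SURGICAL = ["history", "lifestyle"]
GENERAL = ["lifestyle", "history"]
```

With a real generator swapped in for `render`, `is_repeatable(FACTS, SURGICAL)` is the test that matters: anything that tosses a coin on each run will fail it, while `same_facts` confirms that switching template has changed the layout and nothing else.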

So yes, I like the demo. However, I’m reminded of the smart training template people I met in Salt Lake City a few years back. They didn’t hook themselves so firmly to AI but they did freely acknowledge how hard the process of building a template of this type can be. It’s another, much more subtle case: building a set of templates to express an arbitrarily difficult, possibly not fully understood, field of expertise is much tougher, and more booby-trapped with hidden pitfalls, than even the experts themselves may anticipate.

I’m sure there are a lot of board members out there who think that supporting AI and having faith in the solution of difficult problems by “the computer” is the right strategy to follow. The problem for us is this: how can we show the faith-driven decision-maker the basis for deciding what’s easy, and what’s difficult? And that’s what really got me excited about Tandem’s approach to single transcript, multiple report templates, and re-run consistency in AI-driven output.

cassidy@well.com

Thursday, September 25, 2025
