Bridgewater Associates, the world’s largest hedge fund, says it has trained an in-house AI model that judges financial documents better than any frontier model it tested, and at a fraction of the cost. In research published June 30 with Mira Murati’s Thinking Machines Lab, the firm’s AIA Labs team reported that its fine-tuned model reached 84.7% average accuracy across six document-filtering tasks drawn from investors’ daily work. The best off-the-shelf model, OpenAI’s GPT 5.5, managed 78.2%.
Measured by error rate, the trained model makes 29.8% fewer mistakes than the best frontier model, and it crosses a line the big commercial models could not: 80% accuracy, which the authors call “the threshold investors expect from a system they could trust in their daily workflow.” The custom model also runs 13.8x cheaper per task because it is far smaller than the frontier systems it beat.
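The 29.8% figure follows from the two accuracy numbers above. Here is a minimal sketch of that arithmetic, assuming the error rate is simply 100 minus the reported average accuracy. The function name is illustrative and does not come from the paper.

```python
def relative_error_reduction(acc_new: float, acc_base: float) -> float:
    """Percent reduction in error rate when moving from acc_base to acc_new.

    Both arguments are accuracies in percent. Assumption: error rate is
    100 minus accuracy.
    """
    err_new = 100.0 - acc_new    # trained model: 15.3% errors
    err_base = 100.0 - acc_base  # best frontier model: 21.8% errors
    return (err_base - err_new) / err_base * 100.0


# Accuracies reported in the article: 84.7% (trained) vs. 78.2% (best frontier).
reduction = relative_error_reduction(84.7, 78.2)
print(f"{reduction:.1f}% fewer errors")  # prints "29.8% fewer errors"
```

A gap of 6.5 accuracy points looks modest, but against a baseline error rate of 21.8% it removes nearly a third of the mistakes.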
Simple tasks, surprisingly hard for frontier AI
The research team, five authors from Bridgewater AIA Labs writing in collaboration with Thinking Machines, focused on information triage. Investors drown in news articles, research reports, central bank documents, and email. Reading is the easy part. The work that eats the day is deciding what deserves attention, and the six benchmark tasks capture exactly that. They include classifying whether an article matters to a C-suite investment professional, judging whether a central bank document signals the direction of future rate changes, and finding the spot in a document or email where boilerplate begins.
Humans find these calls trivial but struggle to explain them. The paper offers a pair of Financial Times headlines as an example.

