How Quick Research Helped Me Win My Company's March Madness Bracket

By: Jacob Russell
A technical look at why a purpose-built research tool outperformed general-purpose AI for bracket analysis.
Full disclosure: I know absolutely nothing about basketball. I do not watch games, I could not name more than a handful of players, and before this year's tournament, I had no idea what "AdjO" or "AdjD" meant. But when our company announced the annual March Madness bracket challenge, I saw an opportunity to test something interesting.
While my colleagues were using ChatGPT, Claude, and other general-purpose LLMs to fill out their brackets, I decided to leverage Amazon Quick Research. What started as a simple bracket challenge turned into an informal benchmark test between AI models. And somehow, my bracket came out on top.
The Results: Quick Research Dominates
When the final buzzer sounded, the results were striking. My bracket finished first out of 15 total entrants, with 1,370 points and 98.5% accuracy. But what made this particularly interesting was seeing how the other AI-powered brackets performed:
Final AI Bracket Rankings:
- 1st place (Quick Research) - 1,370 points, 98.5% accuracy
- 4th place (Claude) - 1,210 points, 92.0% accuracy
- 10th place (ChatGPT) - 760 points, 52.1% accuracy
This was not a controlled benchmark or a formal evaluation of model quality. It was one company bracket challenge and one tournament. Still, the results were notable because they highlighted how a purpose-built research workflow can produce a very different level of depth than a general-purpose chatbot in a decision-making task.
Quick Research did not just win. It beat the ChatGPT bracket by 610 points with 46.4% higher accuracy. Even the fourth-place Claude bracket trailed by 160 points and 6.5% in accuracy. The performance gap was substantial and consistent across every stage of the tournament.
Interestingly, ChatGPT had won our company's bracket challenge the previous year, so there was some precedent for AI-powered brackets performing well. But this year's results suggested that the type of AI tool matters significantly.
Within this one informal test, though, it was not a close race. It was a demonstration of what can happen when you use a purpose-built research tool instead of a general-purpose chatbot for a complex analytical task.
The Technical Approach
Quick Research didn't just give me predictions. It generated a comprehensive 12-page analytical report that synthesized data from 36 different sources, including KenPom efficiency metrics, Bart Torvik's predictive models, ESPN analytics, CBS Sports data, and real-time injury reports.
The analysis employed several sophisticated methodologies:
KenPom Efficiency Analysis
The research evaluated all 68 teams using adjusted offensive efficiency (AdjO) and adjusted defensive efficiency (AdjD) metrics, measured in points per 100 possessions. Only three teams met the historical championship profile of AdjO ≥ 124.1 and AdjD ≤ 91.3: Duke, Arizona, and Michigan.
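The threshold screen described above is simple to express in code. A minimal sketch: Michigan's 89.0 AdjD comes from the report, but the other efficiency numbers here are illustrative placeholders, not actual KenPom figures.

```python
ADJ_O_MIN = 124.1  # historical championship floor: points scored per 100 possessions
ADJ_D_MAX = 91.3   # historical championship ceiling: points allowed per 100 possessions

def championship_profile(teams):
    """Return the teams whose (AdjO, AdjD) pair meets the historical profile."""
    return [name for name, (adj_o, adj_d) in teams.items()
            if adj_o >= ADJ_O_MIN and adj_d <= ADJ_D_MAX]

# Illustrative efficiency numbers (only Michigan's 89.0 AdjD is from the report).
teams = {
    "Duke":     (128.0, 90.5),
    "Arizona":  (125.5, 91.0),
    "Michigan": (124.5, 89.0),
    "Gonzaga":  (122.0, 93.5),
}
print(championship_profile(teams))  # ['Duke', 'Arizona', 'Michigan']
```

The point of a screen like this is ruthlessness: it cuts 68 teams down to a handful before any qualitative analysis begins.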
Bart Torvik Probability Modeling
Championship probabilities were calculated using Torvik's methodology, which assigned Duke a 20.2% title probability, Michigan 17.9%, and Arizona 14.8%. These were not arbitrary percentages but rather outputs from predictive models incorporating strength of schedule, efficiency metrics, and historical tournament performance.
Seeding Inefficiency Detection
The analysis identified significant discrepancies between committee seedings and predictive rankings. Louisville was underseeded by 9 spots (committee rank 23rd vs. predictive rank 14th), while North Carolina was overseeded by 7 spots (committee rank 22nd vs. predictive rank 29th). This gave me edges on which "favorites" were actually vulnerable.
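The detection itself reduces to a rank comparison. A sketch using the two discrepancies the report called out, where a positive gap means underseeded (predictive models rank the team higher than the committee did) and a negative gap means overseeded:

```python
def seeding_gaps(committee_rank, predictive_rank):
    """Committee rank minus predictive rank: positive = underseeded, negative = overseeded."""
    return {team: committee_rank[team] - predictive_rank[team]
            for team in committee_rank}

committee  = {"Louisville": 23, "North Carolina": 22}
predictive = {"Louisville": 14, "North Carolina": 29}
print(seeding_gaps(committee, predictive))
# {'Louisville': 9, 'North Carolina': -7}
```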
Injury Impact Quantification
Rather than just noting injuries, the research quantified their impact. Duke's starting point guard Caleb Foster and center Patrick Ngongba were both injured, creating measurable vulnerability despite their #1 overall seed. BYU lost their 18 PPG scorer Richie Saunders to a torn ACL, and the analysis showed that 86% of their pick-and-roll possessions in the last 10 games were run by just two players.
The Data That Made the Difference
What separated Quick Research from the other AI tools was the depth and specificity of the analysis. For upset predictions, it didn't just say "this could happen." It provided quantified probabilities with supporting evidence:
VCU over North Carolina carried a 39% upset probability because UNC's key player Caleb Wilson was out with a broken thumb, eliminating their interior scoring advantage.
Texas over BYU had a 37% upset probability due to BYU's injury situation and a 7-foot Texas center who created matchup problems.
These predictions were backed by specific statistical indicators like offensive rebounding rates, three-point attempt volumes, and defensive efficiency rankings.
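This probabilistic framing maps naturally onto expected value. A minimal sketch, where the 40-point round value is a hypothetical, not the pool's actual scoring:

```python
def expected_points(p_win, round_points):
    """Expected scoring value of a single pick, given its win probability."""
    return p_win * round_points

# Hypothetical round worth 40 points: a 39% upset pick vs. the 61% favorite.
upset_ev    = expected_points(0.39, 40)
favorite_ev = expected_points(0.61, 40)
print(f"upset EV: {upset_ev:.1f}, favorite EV: {favorite_ev:.1f}")
```

On raw expected value the favorite is still the better pick, but a quantified 39% tells you exactly how much risk an upset pick carries and how much differentiation it buys in a pool, which a vague "this could happen" does not.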
The research also identified momentum indicators that other models missed. High Point entered the tournament on a 14-game winning streak, averaging 90.0 PPG (3rd nationally) and forcing 16.4 turnovers per game (3rd nationally). Akron was on a 10-game winning streak with all five top scorers being seniors, ranking 8th in effective field goal percentage.
The Championship Prediction: Michigan Over Duke
Quick Research predicted Michigan would win the national championship, which seemed counterintuitive at first. Duke had Cameron Boozer, who was leading the nation in Player Efficiency Rating (34.7), Win Shares (9.6), and Box Plus/Minus (19.9).
But the analysis made a compelling technical case. Michigan had the nation's best defense with an AdjD of 89.0, meaning they allowed 89.0 points per 100 possessions after adjusting for opponent strength. Their veteran, portal-built roster featured Big Ten Player of the Year Yaxel Lendeborg and 7-foot-3 center Aday Mara, creating length advantages that could disrupt Duke's offensive schemes.
The critical insight was roster construction. Michigan's veteran-laden lineup had a 17-3 record in Quad 1 games (games against top-tier opponents), demonstrating proven ability to win high-stakes matchups. Duke's freshman-heavy rotation, combined with injuries to key players, made them more susceptible to championship game pressure despite their superior offensive rating.
In short, Michigan's veteran, portal-built roster had the composure and depth to weather Duke's runs, while Duke's freshman-heavy rotation (without Foster) was more susceptible to the pressure and variance of a single championship game.
This prediction proved decisive. Getting the championship game correct earned maximum points in the tournament scoring system, and it was the primary differentiator between my bracket and the competition.
Why Quick Research Outperformed
This bracket challenge became a useful comparison of AI approaches. The difference was not that one tool sounded more convincing than another. It was that Quick Research approached the task more like an analyst than a chatbot.
Multi-source data synthesis: Rather than relying on a single data source or general knowledge, it pulled from 36 specialized sources and integrated them into a coherent analytical framework.
Quantified uncertainty: Instead of binary predictions, it provided probabilistic assessments. A 39% upset probability communicates both that an upset is possible and quantifies the risk, allowing for more informed decision-making.
Contextual integration: Injuries weren't just mentioned in passing. The analysis showed how Duke's injuries would force freshman Cayden Boozer into a starting point guard role and thin their depth in a brutal East Region. It explained why BYU's offense became predictable after losing Saunders, with 86% of their pick-and-roll possessions concentrated on just two players.
Historical pattern recognition: The research incorporated historical data like 9-seeds beating 8-seeds 51.3% of the time, and that at least one 12-over-5 upset has occurred in five of the last six tournaments. These base rates provided statistical grounding for upset predictions.
Advanced metrics interpretation: Rather than just listing statistics, the analysis explained what they meant. A RAPM (Regularized Adjusted Plus-Minus) of 10.3 for Cameron Boozer meant Duke outscored opponents by 10.3 more points per 100 possessions with him on the court. This level of technical detail enabled more sophisticated bracket decisions.
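Those base rates can also be turned around with a little arithmetic. A back-of-envelope sketch, assuming the four 12-vs-5 games in a tournament are independent with a common per-game upset probability, which roughly recovers what "five of the last six tournaments" implies:

```python
# If at least one 12-over-5 upset happens in ~5 of 6 tournaments, then with
# four independent 12-5 games per tournament: 1 - (1 - p)**4 ≈ 5/6.
p_at_least_one = 5 / 6
p_per_game = 1 - (1 - p_at_least_one) ** 0.25
print(round(p_per_game, 2))  # roughly 0.36
```

In other words, a base rate that sounds like trivia implies each individual 12-seed has on the order of a one-in-three chance, which is exactly the kind of grounding a bracket pick needs.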
Where the Performance Gap Came From
The 610-point margin between Quick Research and the ChatGPT bracket wasn't random. It reflected systematic differences in analytical depth:
Championship game accuracy: Quick Research correctly predicted Michigan's victory, earning 320 points. The ChatGPT bracket missed this entirely, scoring 0 points in the National Championship Game category.
Upset identification: Quick Research's quantified upset probabilities led to correct predictions on high-value upsets. The ChatGPT bracket appeared to favor chalk picks without accounting for injury context or seeding inefficiencies.
Consistency across rounds: Quick Research maintained 98.5% accuracy across all tournament stages (Sweet Sixteen, Elite Eight, Final Four, Championship). The ChatGPT bracket's 52.1% accuracy suggests it struggled particularly in later rounds where contextual factors become more critical.
The fourth-place Claude bracket performed better but still trailed Quick Research by 160 points. This suggests that while some AI tools can provide reasonable predictions, the depth of analysis matters significantly when the goal is optimal performance rather than just plausible results.
Technical Takeaways
For anyone using AI for decision support, the experience highlighted a few broader takeaways:
Data source diversity matters: Synthesizing information from 36 specialized sources provided more robust predictions than relying on general training data.
Quantification enables better decisions: Probabilistic assessments (39% upset probability) are more actionable than binary predictions.
Context integration is critical: Understanding how injuries, momentum, and matchup dynamics interact requires more than just statistical analysis.
Methodology transparency builds trust: Knowing that predictions were based on KenPom efficiency metrics, Torvik probability models, and historical patterns made the analysis more credible and actionable.
Architecture determines capability: The difference between Quick Research and general-purpose LLMs was not just output quality. It was the underlying ability to synthesize specialized sources, incorporate real-time context, and translate that into structured, probability-based reasoning.
That same research approach can be applied far beyond a tournament bracket (for example, market analysis, competitive intelligence, and strategic planning) where the quality of the outcome depends on how well the system can gather, weigh, and connect evidence.
In this case, it also happened to win the company March Madness bracket.
For a detailed look at the complete analysis, download the full Quick Research report here.



