As Consumers Turn to AI for Financial Advice, Which Does the Best Job? Here’s What One Test Found

NEW YORK–With consumers increasingly turning toward artificial intelligence rather than their credit unions when it comes to getting personal finance and investing advice, which AI does the best job of answering questions?

A group of reporters at Money recently put two of the best-known large language models to the test, noting that both begin their advice with a disclaimer: They’re not able to provide personal finance or investing advice, even as they’re doing exactly that.

In some cases, the models recommend consulting financial professionals.

“They can provide personalized advice that I, as an AI, cannot,” the Money team quoted Gemini as saying in response to a question about choosing between a Roth IRA and a traditional IRA. Meanwhile, prefacing a 600-word response to a prompt about stock-picking vs. investment funds, ChatGPT warned, “Nothing here is personal financial advice.”

The Stakes Are High

Staff with Money noted that while an AI model is unlikely to tell a consumer to consult a pro if they have a basic question about what to watch on Netflix, “ChatGPT and Gemini seem to recognize that the stakes are high with financial decisions, similar to requests for medical and legal advice.”

The AI companies behind them are also aware that regulatory agencies increasingly scrutinize how financial advice is given online, Money said.

Despite all the cautions, Money noted that one recent report found that 27% of people say they would trust AI to manage their finances over their significant other. The report further found the average U.S. adult would feel comfortable letting AI manage nearly $20,000 of their money.

The Test

Money reported that its examination of how some of the most powerful AI models respond to requests for personal finance advice involved five of its staffers grading the outputs for 25 questions that it ran through ChatGPT’s o3 model and Gemini 2.5 Pro.

“The most striking finding in our test? One model performed far better than the other, earning higher marks across all five topic clusters we tested,” Money reported.

Here’s what it found:

Personal Finance Questions

ChatGPT overall grade: B-

Gemini overall grade: B+

“If you’re going to use AI for personal finance advice, which AI should you use? Our analysis found that Google’s ‘most intelligent AI model,’ Gemini 2.5 Pro, outperformed OpenAI’s o3, which was its ‘most powerful reasoning model’ until the Aug. 7 release of a newer model, GPT-5,” Money reported. “Our average score for the Gemini model was 3.18/4 (B+) while the ChatGPT model came in at 2.82/4 (B-).

“Gemini impressed us with thorough explanations and impressive sourcing,” the analysis stated. “With that said, the best model to use is the one that works for you. Gemini’s responses, for instance, tended to be longer than ChatGPT’s outputs in our test. You may prefer the to-the-point nature of ChatGPT’s tool.”

Questions About Retirement

ChatGPT: B-

Gemini: B+

“Providing general guidelines for retirement planning isn’t rocket science,” Money said. “After all, the basic principles of saving for your older years don’t change much, and both ChatGPT and Gemini serve this function sufficiently by explaining contribution limits and general savings goals. It’s when you need more nuanced advice — about, say, strategizing your retirement contributions to limit what you owe in taxes or building an investment portfolio that balances risk and payouts — that our graders found the AI models lacking.”

Questions About Housing

Money said it was also curious to see how the two large language models used for this experiment, ChatGPT and Gemini, would do when asked basic housing questions.

“For example, when asked ‘How do I know if I can afford a house?’ ChatGPT provided quite a few formulas to help you calculate things like your debt-to-income ratio (DTI) and principal, interest, taxes and insurance (PITI). But it failed to adequately explain what these calculations are and why they are important when determining affordability (or mention other factors that could affect affordability),” Money reported. “Gemini did a much better job of defining the most commonly used terms used in calculating affordability and presenting the information in a reader-friendly manner. It wasn’t perfect, though. Gemini could have used some of the payment calculations and formula examples included in the ChatGPT version.”
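For readers unfamiliar with the two calculations named above, the following is a minimal Python sketch of how PITI and DTI are commonly computed. It is not taken from the Money report or from either model's output; the function names and the sample figures (loan size, rate, taxes, insurance, income) are hypothetical and chosen only for illustration.

```python
def monthly_principal_and_interest(loan_amount, annual_rate, years):
    """Standard fixed-rate amortization payment (principal + interest)."""
    n = years * 12           # number of monthly payments
    r = annual_rate / 12     # monthly interest rate
    if r == 0:
        return loan_amount / n
    growth = (1 + r) ** n
    return loan_amount * r * growth / (growth - 1)


def monthly_piti(loan_amount, annual_rate, years, annual_taxes, annual_insurance):
    """PITI: principal, interest, taxes and insurance, per month."""
    p_and_i = monthly_principal_and_interest(loan_amount, annual_rate, years)
    return p_and_i + annual_taxes / 12 + annual_insurance / 12


def debt_to_income(monthly_debt_payments, gross_monthly_income):
    """DTI: share of gross monthly income that goes to debt payments."""
    return monthly_debt_payments / gross_monthly_income


# Hypothetical example: $300,000 loan, 6% fixed rate, 30-year term,
# $3,600/year property tax, $1,200/year homeowners insurance.
piti = monthly_piti(300_000, 0.06, 30, 3_600, 1_200)

# Hypothetical borrower: $500/month in other debts, $8,000 gross monthly income.
dti = debt_to_income(piti + 500, 8_000)

print(f"PITI: ${piti:,.2f} per month")
print(f"DTI:  {dti:.1%}")
```

Lenders set their own DTI limits, so the sketch deliberately stops at computing the ratio rather than judging whether it is acceptable.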

Questions About Credit

ChatGPT section grade: B-

Gemini section grade: B+

According to Money, its reporter immediately identified several issues in ChatGPT’s response to a question about improving one’s credit score:

Unrealistic timeline: Promising results in “weeks” is misleading. Credit scores typically update monthly, and big boosts usually require sustained effort.

No mention of on-time payments: ChatGPT failed to mention payment history. While not an immediate fix, paying on time stops further damage. Late payments severely hurt your score, so any advice should emphasize timely payments.

Missing context on urgency: The answer never asked why a fast boost was needed. If no credit application or rental is imminent, there’s no need to rush. Better to focus on steady improvement instead of quick hacks.

“On the positive side, ChatGPT did suggest lowering credit utilization (using under 10% of limits), an effective short-term strategy. However, much of its other guidance was shallow,” the Money analysis stated.
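Credit utilization, as mentioned above, is the share of available revolving credit that is currently in use. The short Python sketch below shows the arithmetic under the assumption that utilization is measured across all cards combined; the function name and the card balances are hypothetical, and the 10% figure is simply the threshold ChatGPT suggested in Money's test.

```python
def credit_utilization(balances, limits):
    """Overall utilization: total balances divided by total credit limits."""
    total_limit = sum(limits)
    if total_limit == 0:
        raise ValueError("total credit limit must be greater than zero")
    return sum(balances) / total_limit


# Hypothetical example: two cards with $400 and $200 balances
# against $5,000 and $2,500 limits.
utilization = credit_utilization([400, 200], [5_000, 2_500])

# Compare against the "under 10% of limits" suggestion cited above.
under_suggested_threshold = utilization < 0.10

print(f"Utilization: {utilization:.0%}")
print(f"Under 10%: {under_suggested_threshold}")
```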
