It is interesting that one of the best ways to handle math problems with an LLM is to have it write Python code that solves the problem. LLMs are good at writing that kind of fairly straightforward Python, and Python does the arithmetic accurately. It does mean you need a safely sandboxed Python interpreter to run the calculations, though.
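To make the sandboxing concern concrete, here is a minimal sketch of a much narrower alternative: instead of running arbitrary model-written Python, it evaluates only plain arithmetic expressions by walking the parsed syntax tree against a whitelist. The names `safe_eval`, `_eval_node`, and `MAX_EXPONENT` are made up for this illustration, and the exponent cap is an arbitrary choice. This is not a real sandbox: it does not limit CPU time, memory, or recursion depth, so anything that runs full programs would still need process-level isolation.

```python
import ast
import operator

# Whitelist of arithmetic operators; every other AST node type is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

# Hypothetical cap (an assumption of this sketch) to avoid enormous powers.
MAX_EXPONENT = 1000


def _eval_node(node):
    """Recursively evaluate a whitelisted arithmetic AST node."""
    if isinstance(node, ast.Constant):
        # Accept only int and float literals; bool is excluded explicitly
        # because it is a subclass of int.
        if isinstance(node.value, bool) or not isinstance(node.value, (int, float)):
            raise ValueError(f"unsupported literal: {node.value!r}")
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _eval_node(node.left)
        right = _eval_node(node.right)
        if isinstance(node.op, ast.Pow) and abs(right) > MAX_EXPONENT:
            raise ValueError("exponent too large")
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    # Names, calls, attribute access, etc. all end up here.
    raise ValueError(f"unsupported expression: {type(node).__name__}")


def safe_eval(expression: str):
    """Evaluate an arithmetic expression string without calling eval()."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)


if __name__ == "__main__":
    print(safe_eval("2 + 3 * 4"))
    print(safe_eval("(1.5 + 2.5) ** 2"))
```

Because function calls and names are rejected outright, an input such as `__import__('os')` raises `ValueError` rather than executing. The trade-off is that the model can only hand back an expression, not a program with loops or library calls.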