Queues are operational complexity. Given the (worst-case-ish) choice between "architecture without a queue that sometimes has HTTP-level timeouts" and "architecture with a queue that reliably renders a spinner and sometimes has human-task-level timeouts," I'd probably favor the former unless management etc. really want the spinner and I'm confident we have tooling to figure out why requests are getting stuck in the queue. Without that tooling, debugging the single-threaded architecture is much easier.
Sure! But that is trading off user experience for technical simplicity (which you often do have to do at some point). However, the argument was that this system was better for user experience than a design that could accept requests in parallel, and that is what I'm resisting/not yet understanding. In reality, I'm sure the system was fine for the use cases they had, which is what I meant to admit with "I'm not saying you needed all that". I will say that the single-threaded, no-queue design already carries a big risk of request A blocking request B.
My argument that this helps user experience is that, when a failure does happen, a simpler system makes it a lot easier to figure out why, tell the affected user what happened and get them unblocked, and fix it for future users. The intended case is that failures should not happen. So if you expect your mainframe to process requests well within the TCP/HTTP timeouts, and you can do something client-side to make the user expect more than a couple hundred ms of latency (e.g., use JS to pop up a "Please wait," or better yet, drive the API call from an XHR instead of a top-level navigation and then do an entirely client-side spinner), you may as well not introduce more places where things could fail.
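To make the client-side spinner idea concrete, here is a rough sketch. It uses a `fetch`-style call rather than a raw XHR, and every name in it (`submitWithSpinner`, `Ui`, `FetchLike`, the 30-second default) is made up for illustration, not taken from the system being discussed. The point is only that the spinner and the timeout both live in the browser, so the server side stays a plain synchronous request.

```typescript
// Hypothetical sketch: all names and the 30 s budget are illustrative assumptions.
// FetchLike is the subset of the fetch API this sketch relies on, so a fake can be
// swapped in for testing.
type FetchLike = (
  url: string,
  init: { method: string; body: string; signal: AbortSignal }
) => Promise<{ ok: boolean; status: number; text(): Promise<string> }>;

interface Ui {
  showSpinner(): void;
  hideSpinner(): void;
}

type Outcome =
  | { kind: "ok"; body: string }
  | { kind: "http_error"; status: number }
  | { kind: "timeout" }
  | { kind: "network_error"; message: string };

async function submitWithSpinner(
  fetchFn: FetchLike,
  ui: Ui,
  url: string,
  payload: string,
  timeoutMs: number = 30_000
): Promise<Outcome> {
  // Abort the request ourselves once the client-side budget is used up.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  ui.showSpinner();
  try {
    const res = await fetchFn(url, {
      method: "POST",
      body: payload,
      signal: controller.signal,
    });
    if (!res.ok) {
      return { kind: "http_error", status: res.status };
    }
    return { kind: "ok", body: await res.text() };
  } catch (err) {
    // If our own timer fired, report a timeout; anything else is a network failure.
    if (controller.signal.aborted) {
      return { kind: "timeout" };
    }
    const message = err instanceof Error ? err.message : String(err);
    return { kind: "network_error", message };
  } finally {
    // Always stop the timer and the spinner, whatever the outcome.
    clearTimeout(timer);
    ui.hideSpinner();
  }
}
```

In a browser you would presumably pass the global `fetch` as `fetchFn` and wire `Ui` to whatever element shows the "Please wait." Each failure comes back as a distinct `Outcome`, which is the property I care about: when something breaks, there is one hop to look at.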
If you do expect the time to process requests to be multiple minutes in some cases, then you absolutely need a queue and some API for polling a request object to see if it's done yet. If you think that a request time over 30 seconds (including waiting for previous requests in flight) is a sign that something is broken, IMO user experience is improved more by spending engineering effort on making those things more resilient than by building more distributed components that could themselves fail, or at least make it harder to figure out where things broke.
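For the multi-minute case, the "queue plus pollable request object" shape looks roughly like the following. This is an in-memory sketch with hypothetical names (`JobQueue`, `submit`, `poll`, `runNext`); a real deployment would need durable storage and an HTTP layer on top, which is exactly the extra machinery I'm arguing you should only take on when you need it.

```typescript
// Hypothetical sketch: JobQueue and its methods are illustrative, not a real library's API.
type Stored<R> =
  | { status: "queued" }
  | { status: "running" }
  | { status: "done"; result: R }
  | { status: "failed"; error: string };

type JobState<R> =
  | { status: "queued"; position: number }
  | { status: "running" }
  | { status: "done"; result: R }
  | { status: "failed"; error: string };

class JobQueue<T, R> {
  private nextId = 1;
  private pending: { id: string; input: T }[] = [];
  private states = new Map<string, Stored<R>>();
  private work: (input: T) => R;

  constructor(work: (input: T) => R) {
    this.work = work;
  }

  // Accept the request immediately and return an id the client can poll.
  submit(input: T): string {
    const id = `job-${this.nextId++}`;
    this.pending.push({ id, input });
    this.states.set(id, { status: "queued" });
    return id;
  }

  // What a "GET /jobs/:id" style handler would return; undefined means unknown id.
  poll(id: string): JobState<R> | undefined {
    const state = this.states.get(id);
    if (state === undefined) return undefined;
    if (state.status === "queued") {
      // 1-based position in line, so the UI can say how many requests are ahead.
      const index = this.pending.findIndex((job) => job.id === id);
      return { status: "queued", position: index + 1 };
    }
    return state;
  }

  // One step of the single worker: take the oldest job and run it to completion.
  // Returns false when there was nothing to do.
  runNext(): boolean {
    const job = this.pending.shift();
    if (job === undefined) return false;
    this.states.set(job.id, { status: "running" });
    try {
      this.states.set(job.id, { status: "done", result: this.work(job.input) });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      this.states.set(job.id, { status: "failed", error });
    }
    return true;
  }
}
```

Even in this toy version you can see where the new failure modes come from: the job table can disagree with the worker, ids can be lost, and "stuck in queued" becomes a state someone has to be able to diagnose.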