> This has been a standard optimization for half a century. The original C compiler for the PDP-11 did these transforms even when you turned off optimizations
Consider this: a common, easily applied optimization that compilers have been doing for half a century MAY have made its way into modern CPUs.
Transistors aren't nearly as power-hungry as you paint them, and CPUs aren't nearly as bad at optimization. There is no reason to swap a multiply or divide for a shift. The ONLY reason to make that switch is if you are dealing with the simplest of processors (such as a microwave's processor). If you are using anything developed in the last 10 years that consumes more than 1 W of power, chances are really high that you aren't saving any power by using shifts instead of multiplies. It is the sort of micro-optimization that fundamentally misunderstands how modern CPUs actually work and overestimates how much power or space transistors actually need.
If the ALU contained an early-out or fast path for simpler multiplies, the latency would read 1-3. You can verify this by looking at div, which does have an early out and has a latency of 35-88.
Any compiler that doesn't swap a multiply for a shift when it can is negligent.
Valid points, but in this case (and many others where you encounter power-of-2 mult/div), I'd consider that a shift might actually semantically be the more natural operation in the first place, instead of an "optimization" of the mult/div operation. (With their equivalence being obvious to any reader, it might not matter.)
I was not arguing that people writing code should perform strength reductions manually; I was explaining what they were and then stating that even ancient compilers do them automatically. While I did not explicitly state it, the logical follow-on is that programmers should almost never explicitly strength-reduce in their code; they should write the semantically clear version and let the compiler handle it for them.
You are correct that on modern CPUs there are often specifically recognized idioms where the processor can implicitly perform an instruction transform, such as a strength reduction from a multiply to a shift.
Having said that, it still makes sense for a compiler to perform strength reductions rather than depending on the CPU frontend, at least if your compiler has a reasonably good scheduling model for the CPU. I don't know of any modern production-quality compiler that would omit a simple strength reduction like this and leave it to the CPU.