You're probably sick of this conversation by now :). But at least in radio applications, I think acf[0] is typically normalized so that it's 1. And again, the ACF is calculated at several lags, and the sum is used to build the final graph / array.
But you obviously know more about this than me, I'm just putting out what I know. Your paragraph above, I actually copied so I can study it a few times. So thanks.
It sounds like this is what I'm doing: taking the ACF at equally spaced offsets. Not sure what the sum of ACFs would achieve, but this might turn out to be a good idea.
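
For what it's worth, here is a minimal sketch of the normalization being discussed: an ACF evaluated at equally spaced lags and scaled so that acf[0] comes out as 1. The function name `normalized_acf`, the mean subtraction, and the test sine wave are assumptions made for illustration, not anything taken from a particular radio toolchain.

```python
import math

def normalized_acf(x, max_lag):
    """Autocorrelation at lags 0..max_lag, scaled so the lag-0 value is 1.

    Assumption: the mean is removed first and the biased estimator is used
    (every lag is divided by the lag-0 energy, not by the overlap length).
    """
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    energy = sum(v * v for v in centered)  # unnormalized lag-0 value
    if energy == 0.0:
        raise ValueError("constant signal: ACF cannot be normalized")
    return [
        sum(centered[i] * centered[i + k] for i in range(n - k)) / energy
        for k in range(max_lag + 1)
    ]

# Hypothetical test signal: a sine wave with a period of 20 samples.
signal = [math.sin(2 * math.pi * i / 20) for i in range(200)]
acf = normalized_acf(signal, 40)
```

With this kind of signal the ACF peaks again near one full period (lag 20) and dips near half a period (lag 10), which is the sort of shape you'd look for in the final graph.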