Hacker News

Yet whenever I've seen this attempted in practice it's failed miserably. It sounded good when I first heard it, but now I treat it as cargo cult nonsense. Maybe it works but nobody does it right?

A point is a different size for everyone, but a day is a universal (ok technically global) unit.

We deal in time all the time, we know how long an hour is, a day, a week. We can remember "I did something like this before, it took me 2 weeks", not "it took me 13 points".

No manager can predictably translate story points into days, which they need in order to pitch to customers and manage their budgets.

The first thing I ask when I have to estimate in SP is "how many hours is 1SP?" and after a few minutes of the usual back-and-forth, whoever has to actually use these damn estimates always says something like "I treat 1SP as half a day". Bingo, now I can give you a number you can use.



> We deal in time all the time, we know how long an hour is, a day, a week. We can remember "I did something like this before, it took me 2 weeks", not "it took me 13 points".

I really don't get this. Can you guys genuinely work 2 weeks on a feature without doing anything else? No extra meetings, no dependencies, no incidents, no coworkers to help?

Something simple can take me 2 weeks or 2 days - it really depends on all the other stuff going on when I'm working. That's why imo it's so silly to estimate time unless you're tracking every minute you work on that specific thing and manage to seal yourself in a box away from distractions.

The thing I remember when estimating is: huh, last time I touched this feature it went really smoothly. Doesn't look like much work, I'm adding a validator here and I've got an example right there. Factor in some test work and I guess it's like 3 points?

> No manager can predictably translate story points into days, which they need in order to pitch to customers and manage their budgets.

Either you're a godlike estimator or you disappoint your manager A LOT. Who the hell are these people that can reliably say feature X will take me 8 weeks? Is that also not just an estimate? I'd rather wallow in the vagueness of points than get my ass reprimanded because I said something would take X weeks. If you're looking for absolute predictability you need factory line work.


If it's the business that wants the estimate, then I take into account everything: unrelated meetings, bugfixing of important stuff that gets broken, potential sickness/holidays, etc. What's the point of telling the business that you can do the task in 3 days if you won't actually have 3 complete days to do the job? If you have other work that will interfere before you deliver the item, you have to take that into account.

> Either you're a godlike estimator or you disappoint your manager A LOT. Who the hell are these people that can reliably say feature X will take me 8 weeks? Is that also not just an estimate? I'd rather wallow in the vagueness of points than get my ass reprimanded because I said something would take X weeks. If you're looking for absolute predictability you need factory line work.

You give estimations in time units because that's the only unit business wants. You can sure say "that will take X points"... nobody will listen to you and they will demand a different answer. It doesn't really matter, actually.


I don't think I'm godlike, I've just been contracting for most of 20-something years and I've got good at it. I always give a range. It's never "X weeks", it's "between X and Y weeks" (showing all my workings, too). "X weeks" is always wrong and "X points" is not only meaningless, it also gives no indication of any uncertainty. By giving a range I give the managers enough information to understand the risks - they can choose to go low with their own estimates if they need to win the business; there's no need to pressure the team into giving an artificially low estimate.

I agree that if you start giving a single number of days/weeks, that's bad. Then there's a strong incentive for everyone to start padding their single figure estimates to cover their asses, and the managers end up just halving them in their gantt charts and pushing the team to work quicker. That's an adversarial environment, where nobody trusts each other.


It's more like "the last time I did a case like this it took around 3 weeks to close the case, so 3 weeks this time. Maybe a little less because there's less experimenting."

Basing it on past experience like that automatically takes into account (average amount of) meetings, waiting on others for code reviews, etc.


>Can you guys genuinely work 2 weeks on a feature without doing anything else?

Sometimes, but you need a progressive team: one that, e.g., doesn't do stand-ups every day when almost all the work is multi-day efforts.

You also need people who are technically skilled to not bug a developer about something they can find out themselves. More skill = more autonomy = less bother. That holds true for their work too.

A regular occurrence is people around developers leaning on them to figure out technical details because they aren't technical enough themselves - that should be part of their own job, so the gap contributes to a higher number of meetings.

>Who the hell are these people that can reliably say feature X will take me 8 weeks?

They're probably overestimating, then timing the delivery to land around 8 weeks (they are more skilled than their estimate would lead you to believe). A feature like that should get broken up; then you estimate the individual pieces. It doesn't matter that the fragments aren't testable/deployable on their own, it helps with the complexity. Decomposing the feature also helps you estimate better.


>The first thing I ask when I have to estimate in SP is "how many hours is 1SP?" and after a few minutes of the usual back-and-forth, whoever has to actually use these damn estimates always says something like "I treat 1SP as half a day". Bingo, now I can give you a number you can use.

That's because they aren't doing it right.

SPs for a team should be measured based on recent past performance. How many hours is 1SP? Let me look at how many hours the team has worked over the last 4 sprints and how many story points they have completed. You only need to ground it when you have a new team.

The problem I see is that people never want to stick to story points and never want to run the calculations. They want SPs to be the same across teams, which isn't possible with this method. What project managers should do is look at the feature SP size and the SP per sprint to see how many sprints the work will take, which gives you a metric that is comparable between teams. You don't say team A is delivering 30 SPs over the next quarter working on a 20 SP feature while team B is delivering 45 SPs over the next quarter while working on a 60 SP feature. You instead look at the features and say that team A is working on a feature that looks like it'll take them 4 sprints to do and team B is working on a feature that will take them 8 sprints to do.
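The grounding and the sprint projection described above fit in a few lines. This is a minimal sketch, and every sprint number in it is invented purely for illustration:

```python
# Hypothetical sprint history for one team: (hours worked, story points done).
# All values below are made up for illustration.
history = [(320, 28), (300, 25), (340, 30), (310, 27)]

total_hours = sum(h for h, _ in history)
total_points = sum(p for _, p in history)
hours_per_point = total_hours / total_points  # grounded in this team's own data

# Planning happens in the team's own points-per-sprint rate; raw point counts
# are never compared across teams.
points_per_sprint = total_points / len(history)
feature_points = 120  # assumed size of a feature, in this team's points
sprints_needed = feature_points / points_per_sprint

print(f"1 SP is about {hours_per_point:.1f} h; "
      f"the feature needs about {sprints_needed:.1f} sprints")
```

Each team gets its own `hours_per_point` and `points_per_sprint`, which is why the cross-team comparison only works at the level of "sprints to finish the feature".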

>We can remember "I did something like this before, it took me 2 weeks"

That works if enough of your new work is similar to previous work. I find that is rarely the case.


It's absolutely cargo cult nonsense and judging by the surrounding comments I'm SO glad we've all finally decided to stop pretending like it isn't.


This fails when you keep thinking in hours during the estimation phase. Stop all such translations and instead map points to relative complexity and architectural impact.

Add button in UI = 1sp

i18n text on button = 2sp

New field in database incl backup and migrations = 10sp

And so on.

Once you’ve done this enough times you can correlate it to hours. Never talk about this factor to the ones doing the estimation and always keep the point system fixed to avoid fluctuations in velocity.
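That correlation step can be as simple as a least-squares fit of hours against points, kept by the planner and never shown to the estimators. A minimal sketch, with an entirely hypothetical task log:

```python
# Hypothetical log of completed tasks: (points estimated, hours actually spent).
# The numbers are invented; only the fitting procedure matters.
log = [(1, 3.5), (2, 7.0), (10, 42.0), (3, 11.0), (1, 4.0)]

# Least-squares fit of hours = factor * points (a line through the origin),
# i.e. factor = sum(p*h) / sum(p*p).
factor = sum(p * h for p, h in log) / sum(p * p for p, _ in log)
print(f"1 point is roughly {factor:.2f} hours")
```

Keeping the point system fixed is what makes this factor stable enough to be useful.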


Sure, but your estimates are way off. Adding a button should be a 2 and i18n should be a 1. An i18n text is just replacing a hard-coded string with a call to i18n with a key - in fact that should be a 0, so to speak. Adding a button means actual functionality, i.e. a button always has to actually do something, and that something is probably way more points than a simple new database field, which is a simple copy-paste of a script that does an alter table. I can add that to our automated db update scripts in my sleep, so if i18n is a 1 then the field is a 1 as well.


> Maybe it works but nobody does it right?

> A point is a different size for everyone, but a day is a universal (ok technically global) unit.

I can definitely see that. Pretty much what you said is what I always hear, but the actual theory behind it (and what I'd guess they're testing in those studies) is that people can agree a given case is the same size ("big" or "small"), but the actual time taken to complete it is going to depend on the individual's experience (both in general and with that specific codebase).

It's just that the time mapping in scrum isn't supposed to happen at the individual level; it's handled by the scrum master / manager / whoever interacts with the rest of the business, using an average. This way, time estimates from a team with different levels of experience get smoothed out into something hopefully more accurate at the sprint level.


It's not rocket science, but I can't speak for how your teams have been doing it. I know there's a tremendous number of well meaning people who get certified in a rote method and only understand it as dogma. Doing this right only requires understanding the underlying principles and figuring out a method that your team likes.

Principles are:

- humans are much better at estimating relative effort than average time. So estimation sessions are only in terms of effort. The answer to "how many hours is one point?" is "as long as it takes." With developers like you this can be a hard line to keep but it is absolutely full stop required.

- consistency in relative effort unit sizes is a requirement for the math to work. Group estimation does this automatically after a few sessions, and can help expose miscommunications and better architectures along the way... but it's not the only way to do it.

- a consistent yardstick for "done" is required for consistent unit sizes. (Logically)

- project managers track average number of points completed per sprint. Even though this average will become extremely consistent, it is an AVERAGE ONLY and can not predict any individual sprint.

- There is no pressure to burn points "faster". Remember, point sizes are arbitrary and consistency is required. When engineers feel pressure to complete more in the same time, consistency is dropped and the math breaks. If you are using sprints, this is an easy trap to fall into. More points per sprint != better. Making tasks easier is OK though, i.e. with automation or technical improvements. Note that this would impact the estimated point size of your tasks, not the number of points put through in a sprint!

That's it. Do it how you want, but have consistent relative point sizes, don't let engineers talk or think about time, don't treat an average like a single sprint prediction, and don't pressure engineers to increase velocity.

A casino can't predict a single hand, but they can predict with great accuracy their profit margin after 100 hands. Using the same math, you can't predict a single sprint, but you can predict with great accuracy 5, 10, or 20 sprints... as long as you help the developers stay unpressured and therefore consistent.
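The casino effect can be checked with a quick simulation. Assume (purely for illustration) that a team's per-sprint throughput is normally distributed around some true mean; the spread of a 20-sprint average shrinks by roughly the square root of 20 compared to a single sprint:

```python
import random

random.seed(0)

# Assumed, invented distribution: mean 30 points per sprint, stdev 8.
def sprint_points():
    return random.gauss(30, 8)

def spread(xs):
    """Population standard deviation of a list of samples."""
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

# Any single sprint varies a lot...
singles = [sprint_points() for _ in range(1000)]
# ...but averages over 20-sprint stretches cluster near the true mean.
averages = [sum(sprint_points() for _ in range(20)) / 20 for _ in range(1000)]

print(spread(singles), spread(averages))
```

The single-sprint spread stays near 8, while the 20-sprint-average spread lands near 8/√20, which is the whole basis for predicting 5, 10, or 20 sprints but never one.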

ALL THAT SAID, when you say "I did this before, it took me 2 weeks," that is also a very effective way to estimate. If your tasks are highly consistent, definitely track time for implementation of similar work (because human memory is very fallible) and estimate this way. Just don't let yourself notice that you defined a consistent unit of effort and used past average time to predict future average time, and you can feel like you've found a great life hack.


Well no, and rocket science works, which is one difference.

I've been through all this with multiple teams/companies and I wasn't always even opposed to the idea! In the end all point-based estimates seem to do is add more ways to be wrong.

Humans are actually pretty good at estimating absolute time if they've had some practice. After the first few times you estimate a day and it takes a week, you realise you're an optimist and you should compensate for that. It seems the wrong conclusion to draw would be "oh well it's impossible, I should use made-up units instead", just get better at it. There was a time I was mostly doing fixed-price work, nobody would have accepted points in a quote and nobody would pay me per point.

Consistency in unit sizes is basically impossible, from what I've seen - unless there's an unspoken but commonly understood translation to units of time.

Project managers who try to size sprints in points tend to eventually give up, after finding that the actual effort taken varies so wildly between sprints. (I also tend to think sprints suck, and Kanban is the way to go for agile - but that's another topic.)


> There was a time I was mostly doing fixed-price work, nobody would have accepted points in a quote and nobody would pay me per point.

I've done that too and fixed-price work is like handling dynamite. I've always felt it was good for new grads to spend a couple years in small, eat-what-you-kill consulting shops to understand the business of software development better. After that, then go to megacorp/faang or whatever, but the lessons learned in a small consultancy will help you see the forest for the trees.


My team does point based estimation and I like it. The points are useful for getting an idea of how much stuff to work on in a sprint and a measure for how much room we need to leave in for the inevitable and consistent problems that we have to pick up mid sprint.

The real benefit imo is that it is a good shorthand during estimation that reveals misunderstandings or flaws in a ticket. Everyone pointing a ticket at a 2 is whatever, literally any time spent discussing if it is a 2 vs a 3 drains my vitality, but every time there is a big outlier or bimodal distribution we get a lot of value from talking it out.

I think what points gets you here over hours is that it focuses on the task at hand. I think a time centric approach requires your estimates to be more like 'X hours for John, Y hours for Shirley'.

I'm not touching your points about how there are other contexts where it doesn't make sense to do this, or the absurdity inherent in pretending that they don't still boil down to time in a meaningful sense because I agree with them.


I've never been anywhere that used story points effectively, but the notion sincerely appeals to me, and I look forward to being somewhere that has it working.

But there is one practical question that has always confused me: how do you "measure velocity" in Scrum, when each sprint's scope, and the story points within, are all preset at the start of the sprint?

Do we assume (realistically) that dev is not completing all stories within the sprint? Or that they are adding more stories in the middle? Or are we assuming that each sprint is finished exactly as planned, on time, and changes in average "velocity" reflect differing levels of confidence or ambition of the dev team as they decide how many stories to include in each sprint?

In a rough Kanban process, I can see measurement of "velocity" as being more straightforward. Am I missing anything?


Sounds like maybe your experience is with places that do this backwards? Velocity is a backwards looking measure. It doesn't commit your team to grind until that's done.

We get as much done as we get done. We take a guess about how much work that will be at sprint start, based on previous average, and make sure that much work is well understood and prioritized. But tasks turn out to be easier or harder than expected, people get sick, shit happens and you end up accomplishing more or less than what you expected. At sprint end you measure what you actually accomplished, to help predict the long run timeline of the project and maybe to get a better guess at how much work to prep for next time.

There should be no time or "amount done" pressure on the engineers. We measure the actual throughput and use that to inform management how long the work as defined will probably take. If management doesn't like that assessment, they can manage the situation by adding engineers or reducing complexity/scope. If you really like playing with fire you could let them ask to lower quality standards, too.


Got it. That makes sense. And I hear you on how you could accomplish less story points, or tickets, or whatever during a sprint. Happens all the time for the reasons you say.

But how do you accomplish more? Are you in organizations that have enough trust that items can be added to a sprint while it is in process?


> The answer to "how many hours is one point?" is "as long as it takes."

In principle the “how many hours to one point” question is answered by tracking and analysis of past sprints, where over time you develop an empirical measure of velocity, which gives you an idea both of the average time it takes to do a story of a particular size and of the size of the error bars when taking that average as an estimate.

Which can be useful for sanity checking things like sprint sizing, though the existence of the stat risks it becoming an optimization target in some orgs, either because management sees it and wants “line goes up” or for some other reasons (teams can do it to themselves, though management is more likely to.)
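A minimal sketch of that empirical approach, with invented per-sprint ratios, showing both the average and the error bars it carries into an estimate:

```python
import statistics

# Hypothetical per-sprint hours-per-point ratios from past tracking.
hours_per_point = [11.2, 12.8, 10.5, 13.1, 11.9, 12.2]

avg = statistics.mean(hours_per_point)
err = statistics.stdev(hours_per_point)  # sample stdev: the error bars

# Sanity-check a 40-point chunk of work with a rough one-sigma band.
points = 40
print(f"about {points * avg:.0f} h, give or take {points * err:.0f} h")
```

The band is exactly what gets optimized away when velocity becomes a "line goes up" target, since gamed estimates shrink the apparent error without shrinking the real one.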


The goal of estimating is to know how long it will take to complete a given task. A successful estimating methodology ought to provide this. I am not convinced the method you outline does provide this.

First, I would be interested to see any data you have supporting the idea that “humans are much better at estimating relative effort than average time”. My own experience is humans have biases toward optimism or pessimism and are poor estimators of the magnitude of risk, but I’ll ignore that for the moment and accept the premise as true.

The other underlying premise of your method is that velocity is a random variable around a stable mean with a known distribution, such that via the law of large numbers, over a large number of sprints, the average velocity can be known with high accuracy. I see a few issues with this.

One issue is that velocity is not static. Velocity changes with team morale and how interesting or motivating the task is. It changes with how well engineers are suited to a task. It changes with organizational changes, the addition of new team members, the loss of old ones, the work environment, etc. I don’t buy that these are small factors that average out over the long run, either, since my experience leads me to believe they don’t. A poor work environment or coworkers leaving the team or low morale usually are harbingers of bad things to come, and, opposite, a growing team usually has exciting challenges, morale is usually high, etc. My experience hasn’t shown a mixed bag of equal good and bad, but instead two divergent sets.

Even if velocity were static, the other issue is the distribution is unknown. Sometimes teams have bad weeks, and sometimes they have good ones. Sometimes something catastrophic happens in the organization and a dozen engineers are twiddling their thumbs or are thrown into some task they did not foresee. Sometimes things go unexpectedly well. A priori, there’s no way of knowing what a particular team tends to, or if that trend will continue. One could try to measure it, but my experience has been that something that was supposed to be static will change long before a sufficient amount of data could be collected to be meaningful.

Even if you knew the exact distribution you’d encounter, the final problem that remains is that projects aren’t infinitely long. There aren’t a million sprints to smooth things out. There will be variation in the outcome.

I guess what I’m saying is, estimates are estimates. Why systematize their creation when they will ultimately be wrong, and creating them is so costly? I could understand having a system if it were low-overhead, but my experience with a system similar to what you’re describing was anything but that.



