Hot Take: Don't provide incident resolution estimates

(firehydrant.com)

13 points | by vinnyglennon 7 hours ago

19 comments

  • crazygringo 6 hours ago

    > an unscripted voice comes over the speakers, informing us that the next available train will be here in five minutes... 10 minutes later... Our train slithers its way into Bedford station

    The author seems to be missing an (unwritten?) rule of train time estimates.

    They are never about when a train is actually expected to arrive. They are always about the minimum possible time until the next train.

    They are basically just taking the distance to the next train, and calculating how long it will take to arrive if it goes at normal speed.

    The point is not to determine whether you'll get somewhere on time. Trains go slow for all sorts of reasons. Trains provide no guarantees.

    The point is to look at your phone and think, if it takes me 4 minutes to get to the station and the train is 6 minutes away, am I guaranteed to make the train? In which case, the answer is yes, because the train will never come earlier than advertised. You won't miss it. And if the next train isn't for 26 minutes (minimum), then how can I efficiently use that time and not waste time getting to the station early?

    This concept is rarely clearly communicated, but it's good to be aware of it.
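
    A minimal sketch of that arithmetic, assuming the countdown is simply distance divided by normal speed as described above. The function names and the sample distance and speed are made up for illustration; this is not how any real transit system's clocks are known to work.

```python
def minimum_eta_minutes(distance_km: float, normal_speed_kmh: float) -> float:
    """Lower bound on arrival time: the train covers the distance at normal speed."""
    if normal_speed_kmh <= 0:
        raise ValueError("normal_speed_kmh must be positive")
    return distance_km * 60 / normal_speed_kmh


def will_make_train(walk_minutes: float, countdown_minutes: float) -> bool:
    """True if you reach the platform no later than the earliest possible arrival."""
    return walk_minutes <= countdown_minutes


# Hypothetical numbers: a train 3 km away running at 30 km/h.
eta = minimum_eta_minutes(distance_km=3.0, normal_speed_kmh=30.0)
print(eta)                                                     # 6.0
print(will_make_train(walk_minutes=4, countdown_minutes=eta))  # True
```

    Because the countdown is a lower bound, a "yes" here is safe, while a "no" only means you might miss it.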

    • diggan 5 hours ago

      > They are never about when a train is actually expected to arrive

      Maybe I'm way beyond my expertise, but don't train schedules give the time the train will leave, rather than when it arrives? I've arrived early at train stations many times with my train already being there, idling, and then it leaves at the scheduled time.

      Except I guess during delays in the UK when they announce when the train arrives?

      > the train will never come earlier than advertised

      This part kind of makes your comment sound like it's not about delays but about the schedules, which I feel like are the departure time, not the arrival time of the train.

      • crazygringo 5 hours ago

        The article is referring to the NYC subway. In theory it runs on a schedule, but the schedule is effectively irrelevant except for when a train leaves its first station.

        It won't speed up to compensate for delays, and it's never going to wait at a station because it's running ahead.

        So the countdown clocks don't report scheduled times, they are derived from the train's actual current distance.

        Obviously medium- and long-distance rail is different.

        • diggan 5 hours ago

          > It won't speed up to compensate for delays, and it's never going to wait at a station because it's running ahead.

          The first part absolutely makes sense, but the second one I'm not so sure. They don't have any "Hold at station" or whatever it's called to get some even spacing if they're running ahead of schedule? Seems unnecessarily chaotic.

          Doesn't that mean sometimes you end up with 2 or 3 trains coming right after each other?

          • xp84 4 hours ago

            > Doesn't that mean sometimes you end up with 2 or 3 trains coming right after each other?

            Kind of -- you definitely can have that happen, but probably never because Train #2 is way ahead of schedule, as it's nearly impossible for any American mass transit system's best case scenario to be any better than "on time". When a parade of trains all at once happens, it's more typical that, if trains are theoretically supposed to stop here at 11:10, 11:20, 11:30, and so on, Train #1 would be arriving 29 minutes late at 11:39, followed by Train #2 22 minutes late at 11:42, and Train #3 running 14 minutes late at 11:44. That can happen, for instance, due to "police activity" or, worse, when someone dies by suicide on the tracks (pretty sure the delays in that case are much longer than in the above example).
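
            To make the bunching in that example concrete, here is a small sketch that adds each delay to its scheduled time and prints the gaps between consecutive trains. The times and delays come from the comment above; the function names are just illustrative.

```python
from datetime import datetime, timedelta

# Scheduled stops and delays (minutes) taken from the example above.
scheduled = ["11:10", "11:20", "11:30"]
delays_min = [29, 22, 14]


def actual_arrivals(scheduled, delays_min):
    """Add each train's delay to its scheduled time."""
    arrivals = []
    for hhmm, delay in zip(scheduled, delays_min):
        t = datetime.strptime(hhmm, "%H:%M") + timedelta(minutes=delay)
        arrivals.append(t.strftime("%H:%M"))
    return arrivals


def gaps_minutes(times):
    """Minutes between each consecutive pair of arrivals."""
    parsed = [datetime.strptime(t, "%H:%M") for t in times]
    return [int((b - a).total_seconds() // 60) for a, b in zip(parsed, parsed[1:])]


arrivals = actual_arrivals(scheduled, delays_min)
print(arrivals)                # ['11:39', '11:42', '11:44']
print(gaps_minutes(arrivals))  # [3, 2]
```

            Trains scheduled 10 minutes apart end up 3 and 2 minutes apart, all of them late, none of them early.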

          • jermaustin1 4 hours ago

            Only slightly sarcastic: In the history of the subway, has it ever been early?

    • karmakaze 5 hours ago

      Except in Japan. Googled "japan train schedule accuracy"

      > The average delay for a Shinkansen train is around 20 seconds. For other trains operated by other railway companies, the average delay is around 50 seconds. In both cases, the average delay is less than a minute. But these average figures need to be tempered with the occasional incident.

      And quite the opposite for charter vacation flights which can leave hours earlier than the stated departure time.

    • bryanlarsen 5 hours ago

      Good bus/train statistics acknowledge that early trains are far worse than late ones. If a bus on a 30 minute schedule arrives early, it's effectively 30 minutes late because anybody arriving at the stop on time will miss the bus.
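
      A rough sketch of that asymmetry, from the point of view of a rider who shows up exactly at the scheduled time. It assumes the following bus runs on time, and the function name and sign convention (negative means early) are made up for this example.

```python
def effective_delay_minutes(offset_minutes: float, headway_minutes: float) -> float:
    """Wait experienced by a rider who arrives exactly at the scheduled time.

    offset_minutes < 0 means the bus came early, > 0 means it came late.
    Assumption: the next bus arrives exactly one headway after the schedule.
    """
    if offset_minutes >= 0:
        # Late (or on-time) bus: the rider simply waits out the delay.
        return offset_minutes
    # Early bus: the rider has missed it and waits a full headway for the next one.
    return headway_minutes


print(effective_delay_minutes(-2, 30))  # 30
print(effective_delay_minutes(5, 30))   # 5
```

      A bus that is 2 minutes early costs that rider 30 minutes, while a bus that is 5 minutes late costs only 5.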

  • egypturnash 6 hours ago

    On the other hand:

    One night, there’s a big storm. The power flickers and goes out. Pretty soon, I get a text from the power company telling me that they know it’s gone and they expect to have it fixed by noon tomorrow. About what I was hoping for.

    Then they get it fixed around the time I’m going to bed. I am very happy, because not only was it fixed in a reasonable timeframe, it was fixed sooner than I was told to plan for.

    Pad your estimates with plenty of time for there to be complications. If it turns out to be a five-minute fix, then great!

  • com 7 hours ago

    In fact, if you can get away with it, don’t agree to SLAs and offer on-demand contract termination and immediate data migration to your clients.

    You get to get rid of service level managers, one of the biggest threats to service availability in my industry experience, and you begin to hate downtime, not tolerate it, and the same for performance degradation, response jitter etc.

  • kelseyfrog 5 hours ago

    My stealth start up is using generative AI to deliver incident resolution estimates and incident response communication. It reduces responder stress and can deliver minute-by-minute updates on resolution.

    Once we deliver in this space, we intend to pivot into using gen-AI to estimate JIRA ticket time and give the same realtime information to tasks by auto-generating update comments based on developer activity.

  • neilv 6 hours ago

    > I would have preferred a message explaining that the doors were broken on the incoming train, that they had to fix it, and that they didn't know how long it would take.

    I would've preferred an order of magnitude estimate and confidence level.

    Both trust and respect are important, and can be hot issues in public transit.

    The NYC subway might be a great class equalizer for many, but much other public transit is firmly for lower classes.

    One thing even more infuriating and humiliating than waiting for a bus that's 30 minutes late -- like your time and life aren't important, because the only people who ride this bus are those who have to -- is when the bus finally comes, 30 minutes late, and is overfull and not taking more passengers. Then, to the people a driver isn't letting onboard, drivers in some locales will say, "next one is right behind me", whether or not that's true.

    Consistent with trust and respect, give the best information and confidence level you have, so people can act on that.

    And respect them as equals, to also behave with dignity -- don't try to manipulate their sentiment, by misrepresenting or withholding information.

    • tofof 6 hours ago

      I came back to find that what I had to say had already been put eloquently here, particularly those last two lines.

      Withholding information is disrespectful, and is just as damaging to your customers' ability to make alternative choices that would benefit them as providing bad estimates.

      The author's chosen example is quite cherrypicked. I can not imagine any scenario for public transit breakdown in which the actual delay would be "5 minutes." In fact, this is so short that there's no reason to even bother with that update if that were true. It's true that promising this completely unrealistic window was detrimental - but that's because it was such a poor estimate.

      It's not hard to give expected-case (not best-case!) and worst-case estimates that are in the right order of magnitude, and to communicate in a way that these aren't seen as promises. "The next train will be here in 5 minutes" is the worst possible phrasing, presenting an absolute truth, and it's perfectly reasonable that customers and the response team treat it as such. Again, the solution is ~better~ communication, not ~withholding~ communication.
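
      One possible way to produce expected-case and worst-case numbers, sketched under the assumption that you have durations of past, similar incidents to draw on. Using the median and the 90th percentile is an illustrative choice, not something the article or this thread prescribes, and the sample durations below are invented.

```python
import statistics


def estimate_range(past_durations_min):
    """Return (expected, worst_case) as the median and 90th percentile."""
    expected = statistics.median(past_durations_min)
    # quantiles(n=10) returns 9 cut points; the last one is the 90th percentile.
    worst_case = statistics.quantiles(past_durations_min, n=10)[-1]
    return expected, worst_case


def format_update(expected, worst_case):
    """Phrase the numbers as an estimate, not a promise."""
    return (
        "We don't have a firm time yet. Similar incidents have typically taken "
        f"about {expected:.0f} minutes, and occasionally as long as {worst_case:.0f}."
    )


# Invented sample data, in minutes, purely for illustration.
history = [10, 15, 20, 25, 30, 45, 60, 90, 120, 240]
expected, worst_case = estimate_range(history)
print(format_update(expected, worst_case))
```

      The wording matters as much as the numbers: "typically" and "occasionally as long as" signal a range rather than an absolute truth.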

    • andsomehacker 6 hours ago

      Absolutely this. Feels like the real problem in the author's story is that no update was provided when the situation changed. Providing no information at all is a worse experience.

  • syngrog66 5 hours ago

    special case of more general rule: don't give time estimates to others. asking for pain

  • jasonlotito 6 hours ago

    > Hot Take: Cater to the lowest common denominator of customer

    I disagree.

    You can either cater to the lowest common denominator, or you can cater to professionals that understand what the word estimate means.

    If you can't provide an estimate, I'm going to reasonably read that as you not having a good understanding of the issue and not being confident in whatever you are doing. Basically, you are incompetent.

    Rather than simply not providing estimates, be honest.

    1. Provide estimates when you can make them.
    2. Provide updates at regular intervals even if nothing has changed.
    3. Communicate clearly so people waiting can take appropriate action.

    An estimate of 2-3 hours is different from an estimate of tomorrow. In the many times the power has gone out, knowing this made a big difference. Even when the estimate wasn't accurate, it helped make things more comfortable. And I appreciated having that estimate every time.

    • xp84 4 hours ago

      > 1. Provide estimates when you can make them.

      When one is super confident, for instance, if someone screwed up DNS and you know everything will be resolved exactly 300 seconds from the time it was corrected, because that was the TTL on the bad record... That's when I'd be happy saying "Service will be restored to normal at 3:15PM."
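
      The arithmetic behind that kind of confident estimate is just the correction time plus the TTL. A small sketch, assuming resolvers honor the record's TTL; the function name and the 3:10 PM correction time are hypothetical.

```python
from datetime import datetime, timedelta


def restored_by(corrected_at: datetime, ttl_seconds: int) -> datetime:
    """Latest moment a cache that fetched the bad record just before the fix
    would still serve it, assuming the TTL is respected."""
    return corrected_at + timedelta(seconds=ttl_seconds)


# Hypothetical: the bad record (TTL 300 s) was corrected at 15:10.
fixed_at = datetime(2024, 1, 1, 15, 10, 0)
print(restored_by(fixed_at, 300).strftime("%H:%M"))  # 15:15
```

      The estimate is safe to publish only because every input is known; nothing in it depends on a fix that might not work.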

      When you're not sure if a fix you're trying will succeed or not, that's not the time to say "It'll be resolved in 15 minutes." That assumes that the current plan will fix it, and now you're just setting up 'future you' to look foolish when they have to post again to retract that promise (and worse, possibly make up another estimate that may also be wrong).

      As a general rule I'd say not pretending to have knowledge of the future is humility.

      > I'm going to reasonably read that as you not having a good understanding of the issue and not being confident in whatever you are doing. Basically, you are incompetent.

      I envy the person who has always understood exactly what course of action they should take -- and exactly how long that fix will take -- before they've had even a few minutes to investigate the problem! What a perfect tech stack and perfect knowledge that person must have. They probably don't have any downtime in the first place though, since they know every bit of their application's state at all times anyway.

    • sjsdaiuasgdia 5 hours ago

      This is the way to real trust between customer and provider. Enabling your customers with real info that can drive their decision making tells them you care more about how they're feeling the impact than how your reputation might be hurt.