This one is close to my heart and probably the most important thing I would say to anyone starting out in IT.
Over the years, I've learned that most systems are already telling you what went wrong. Not in summaries or dashboards, but in the event logs, quietly and often long before anyone notices there's a problem.
When incidents escalate, it is common to see teams restart services, reapply configurations or focus on the last visible failure. The pressure to "do something" is understandable, but it often skips over the place where the full story already exists.
Event logs are rarely neat. They're noisy, repetitive and sometimes frustrating to interpret, but taken together they describe sequence, timing, dependency and cause in a way no single alert ever can.
With experience, patterns start to stand out. The same warnings appear before different failures, the same authentication errors precede broader outages, and the same timing gaps point back to an earlier dependency failure rather than the symptom everyone is reacting to.
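To make that kind of pattern concrete, here is a minimal sketch of how you might count which warnings keep turning up shortly before errors. Everything in it is an assumption for illustration: the event list, the messages, the two-minute window and the function name `warnings_before_failures` are all invented, and real logs would need their own parsing first.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical events as (timestamp, level, message). All values are invented.
EVENTS = [
    ("2024-03-01T10:00:00", "WARN", "certificate expires soon"),
    ("2024-03-01T10:00:20", "WARN", "auth token refresh failed"),
    ("2024-03-01T10:00:40", "ERROR", "api gateway returned 503"),
    ("2024-03-02T14:30:00", "WARN", "auth token refresh failed"),
    ("2024-03-02T14:30:30", "ERROR", "batch job aborted"),
    ("2024-03-03T08:15:00", "WARN", "disk usage above 80%"),
    ("2024-03-03T09:00:00", "ERROR", "report export failed"),
]


def warnings_before_failures(events, window=timedelta(minutes=2)):
    """Count, per warning message, how many errors it preceded within `window`."""
    parsed = [(datetime.fromisoformat(ts), level, msg) for ts, level, msg in events]
    counts = Counter()
    for when, level, _ in parsed:
        if level != "ERROR":
            continue
        # Distinct warnings logged in the window leading up to this error.
        preceding = {
            msg
            for ts, lvl, msg in parsed
            if lvl == "WARN" and timedelta(0) <= when - ts <= window
        }
        counts.update(preceding)
    return counts


for message, hits in warnings_before_failures(EVENTS).most_common():
    print(f"{hits} failure(s) preceded by: {message}")
```

In this made-up data the same authentication warning sits in front of two unrelated failures, which is the sort of repetition that is easy to miss when each incident is looked at on its own.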
Reading logs well isn't about memorising event IDs or knowing every subsystem in advance. It's about learning how systems narrate their own behaviour, and trusting that narrative even when it contradicts first impressions.
I've lost count of the number of times an incident felt complex and multi-dimensional, only for the logs to show a very ordinary sequence of events once they were read in order.
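Reading in order is simple enough to sketch. The example below merges lines from three sources into one timeline sorted by timestamp and shows the gap between consecutive events. The source names, the messages and the line format are all hypothetical, and it assumes every source logs in UTC with the same timestamp layout, which real systems often don't.

```python
from datetime import datetime

# Hypothetical log lines from three sources. Names, messages and format are invented.
RAW = {
    "app": [
        "2024-03-01T10:00:07Z ERROR database connection pool exhausted",
        "2024-03-01T10:00:09Z ERROR request failed with HTTP 500",
    ],
    "auth": [
        "2024-03-01T10:00:03Z WARN token validation failed: clock skew",
        "2024-03-01T10:00:05Z ERROR service account login rejected",
    ],
    "time": [
        "2024-03-01T09:59:58Z WARN NTP source unreachable",
    ],
}


def parse(source, line):
    """Split one line into (timestamp, source, level, message)."""
    stamp, level, message = line.split(" ", 2)
    when = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return when, source, level, message


def timeline(raw):
    """Merge every source into a single list ordered by timestamp."""
    events = [parse(source, line) for source, lines in raw.items() for line in lines]
    return sorted(events, key=lambda event: event[0])


previous = None
for when, source, level, message in timeline(RAW):
    # Seconds since the previous event, to make timing gaps visible.
    gap = 0 if previous is None else int((when - previous).total_seconds())
    print(f"{when:%H:%M:%S} +{gap:>2}s [{source:<4}] {level:<5} {message}")
    previous = when
```

Viewed per source, the application errors look like the problem. Viewed as one timeline, this invented incident starts with the time service, moves through authentication and only then reaches the application.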
The fundamentals show up here again. Time matters, identity matters and dependencies matter. The logs reflect all of it, if you're willing to slow down and listen.
For me, this is where troubleshooting stopped feeling reactive and started feeling deliberate, not because problems disappeared, but because the story was already there, waiting to be read.