- Ankit Patel, Technology Lead, DNB
Waking Up to Downtime and Error Messages: My Journey to a Seamless System
From firefighting to future-proofing DNB's core banking integration layer
Waking up to system alerts and downtime incidents used to be a routine reality. The anxiety of unresolved issues and customer-facing failures was more than just operational overhead; it directly impacted trust in our digital services. That experience led us to transform how we engineer, deploy, and operate mission-critical platforms, which are integral to DNB’s core systems.
We knew this wasn't just about fixing bugs. It was about building a more resilient and scalable foundation.
Diagnosing the Real Issue: Technical Debt and Monolith Fragility
Our platforms serve as gateways between DNB’s core systems and the broader digital ecosystem, powering everything from online banking to internal employee applications. When these systems falter, the impact is immediate and visible. In 2021 and early 2022, we were grappling with a steady stream of P1 and P2 incidents. These were not just performance hiccups; they were full-blown operational bottlenecks that sometimes made national news headlines.
It became clear that our architectural posture—an aging mix of tightly coupled services and legacy dependencies—was not fit for modern operational expectations.
The Pivot: Simplify, Decouple, Migrate
We made a conscious decision: if we were going to migrate to the cloud, we weren’t going to “lift and shift” complexity.
Instead, we adopted a decomposition-first approach. Before touching the cloud, we conducted a full inventory of our systems—auditing interdependencies, retiring unused components, and aggressively simplifying workflows. This step was non-negotiable. It gave us clarity, reduced our blast radius, and prepared us for smoother cloud onboarding.
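To make the inventory step concrete, here is a minimal sketch of how such an audit could be expressed in code. It is an illustration under stated assumptions, not the tooling we used: the component names, the `DEPENDS_ON` map, and the `ENTRY_POINTS` set are all hypothetical. It flags components that nothing depends on as retirement candidates and estimates the blast radius of a component as everything that depends on it, directly or transitively.

```python
from collections import deque

# Hypothetical inventory: each component maps to the components it calls.
DEPENDS_ON = {
    "online-banking": ["api-gateway"],
    "employee-portal": ["api-gateway"],
    "api-gateway": ["core-adapter"],
    "core-adapter": [],
    "legacy-batch-export": ["core-adapter"],
}

# Hypothetical entry points: nothing is expected to call these.
ENTRY_POINTS = {"online-banking", "employee-portal"}


def retirement_candidates(depends_on, entry_points):
    """Components that nobody depends on and that are not known entry points."""
    called = {dep for deps in depends_on.values() for dep in deps}
    return sorted(set(depends_on) - called - set(entry_points))


def blast_radius(depends_on, component):
    """All components that directly or transitively depend on `component`."""
    # Invert the map: for each component, who calls it?
    dependents = {}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(name)

    # Breadth-first walk upward through the callers.
    seen, queue = set(), deque([component])
    while queue:
        for parent in dependents.get(queue.popleft(), ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    print("Retirement candidates:", retirement_candidates(DEPENDS_ON, ENTRY_POINTS))
    print("Blast radius of core-adapter:", sorted(blast_radius(DEPENDS_ON, "core-adapter")))
```

In a real audit the dependency map would come from traffic logs, service registries, or configuration rather than a hand-written dictionary, and a flagged component would still need a human check before being retired.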
By 2021-22, the bank's decentralized systems were fully transitioned to cloud-native environments. The refactor reduced technical debt and laid the foundation for elastic scalability and zero-downtime deployments.
From Downtime to Uptime: Metrics that Matter
Since the transition, we've seen a quantifiable impact:
- Zero major incidents
- Improved deployment velocity, enabling frequent safe releases
- Higher developer confidence via observability-first engineering
Our move wasn’t just about stability; it was about enabling innovation without fear of regression. That’s only possible when your foundation is solid.
Cloud Strategy Backed by ITOM Principles
One of the key enablers in this journey was DNB’s adoption of the ITOM (Information Technology and Operations Model). It gave our teams the autonomy to make platform decisions quickly, while remaining aligned with broader governance and security frameworks.
The real value of ITOM, in my view, lies in balancing ownership with alignment. We were able to move fast—rebuilding our pipelines, implementing progressive delivery, and tightening feedback loops between developers and operations—without compromising on regulatory or auditability standards.
Beyond Ops: Engineering for Customer Trust
Even though we work in a Group function, far from customer-facing apps, our mindset had to shift: every error message, every millisecond of delay, eventually reaches the customer.
That realization was a catalyst. It pushed us to treat our work not as infrastructure upkeep but as a product—one that serves internal teams, developers, and ultimately, every customer interacting with DNB platforms.
What’s Next: Proactive Engineering as the New Default
We’re not stopping here: we are modernizing many other systems. The goal isn’t just uptime; it’s continuous improvement at scale.
Closing Thoughts
Our achievements didn't go unnoticed. Arne Damvin, our Executive Vice President of Banking & Payments, was impressed and encouraged other teams to draw inspiration from our journey. He emphasized the importance of proactive management and of tidying up before modernizing. Our team's success showed that proactive management is key to increasing productivity and customer satisfaction.
Cleaning up before modernizing isn’t just a best practice; it’s a necessity. Whether you're moving to the cloud or evolving legacy platforms, a proactive, engineering-first approach allows you to increase reliability, reduce complexity, and create real value.
The journey from chaos to control doesn’t start with a migration script – it starts with intent and clarity.