Been thinking a lot lately about what reliability actually means in modern systems. Not just uptime or shipping features on time – that's table stakes. A product leader I've been following, Shankar Raj, put it perfectly: reliability today is about how systems behave under pressure, recover from failure, and keep earning trust when things aren't perfect. After 20+ years building enterprise platforms across Fidelity, Deloitte, LTI Mindtree, and other massive operations, he's seen this evolution firsthand.

What struck me most was his shift from viewing enterprise systems as projects to treating them as living products. Most organizations manage platforms like they're shipping software – hit the milestone, ship the feature, move on. But Raj's approach is different. He asks: how does this perform after deployment? How fast do we recover? Do people trust it under stress? That mindset shift alone changed outcomes dramatically. One initiative saw incident recovery times drop 30%, and AI automation cut customer resolution time from 15 minutes down to under 3 minutes.

The AI angle is where it gets really interesting. As AI embeds deeper into enterprise systems, a new breed of problems emerges – login friction, interrupted sessions, fragmented identities. Most teams treat these as noise. Raj treats them as behavioral signals. He designed systems that stay coherent even when signals are incomplete or journeys get interrupted. One practical example: he built an AI-driven authentication system for a regulated platform that could adapt to contextual risk instead of enforcing rigid rules. The result was fewer login failures (around a 15% reduction, thousands of prevented failures) without sacrificing security. That work earned a CLARO Award.
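To make the idea concrete, here's a minimal sketch of risk-adaptive authentication in Python. This is not Raj's actual system – the signals, weights, and thresholds are all illustrative assumptions – but it shows the core move: mapping contextual risk to a level of friction rather than applying one rigid rule to every login.

```python
from dataclasses import dataclass

@dataclass
class LoginContext:
    known_device: bool       # has this device been seen before?
    usual_location: bool     # is the login from a familiar location?
    failed_attempts: int     # recent failed attempts on this account

def risk_score(ctx: LoginContext) -> float:
    """Combine contextual signals into a rough 0..1 risk estimate.
    Weights here are illustrative, not calibrated."""
    score = 0.0
    if not ctx.known_device:
        score += 0.4
    if not ctx.usual_location:
        score += 0.3
    score += min(ctx.failed_attempts * 0.1, 0.3)
    return min(score, 1.0)

def auth_requirement(ctx: LoginContext) -> str:
    """Map risk to friction: low risk gets a smooth path,
    high risk gets a human in the loop."""
    r = risk_score(ctx)
    if r < 0.3:
        return "password"          # low risk: minimal friction
    if r < 0.7:
        return "password+otp"      # medium risk: step-up auth
    return "manual_review"         # high risk: escalate to a person
```

The point of the design is that a trusted, familiar context never pays the cost of the worst-case rule – which is exactly how rigid policies generate unnecessary login failures.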

What I found most compelling was his thinking on customer journey reconstruction. Traditional CRM systems force premature identity certainty, which often creates more errors. Raj flipped it – treat it as a reconstruction problem using probabilistic coherence. Link fragmented identities through behavioral patterns and temporal context. At doTERRA, this unified phone, chat, email, and web into one coherent omnichannel view. Agents could see meaning even when interactions were incomplete. Average handling time dropped 30% across 2,000+ agents.
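The reconstruction idea can be sketched in a few lines. The following is a toy version under my own assumptions – the field weights, the temporal window, and the threshold are invented for illustration – but it captures the shape of probabilistic identity linking: combine weak signals (shared contact details, name similarity, temporal proximity) into a coherence score, and only merge records when that score clears a confidence threshold, instead of forcing premature certainty.

```python
from difflib import SequenceMatcher

def identity_match_score(a: dict, b: dict) -> float:
    """Probabilistic coherence: combine weak signals across channels
    instead of demanding one exact identifier up front.
    Weights are illustrative, not calibrated."""
    score = 0.0
    if a.get("email") and a.get("email") == b.get("email"):
        score += 0.5
    if a.get("phone") and a.get("phone") == b.get("phone"):
        score += 0.3
    name_sim = SequenceMatcher(None, a.get("name", ""), b.get("name", "")).ratio()
    score += 0.2 * name_sim
    # Temporal context: interactions close in time (here, within an hour)
    # are more likely to belong to the same journey.
    if abs(a.get("ts", 0) - b.get("ts", 0)) < 3600:
        score += 0.1
    return min(score, 1.0)

def should_link(a: dict, b: dict, threshold: float = 0.6) -> bool:
    """Merge fragmented records only when coherence clears the threshold."""
    return identity_match_score(a, b) >= threshold
```

A real system would add many more signals and learn the weights, but the principle is the same: identity emerges from accumulated evidence rather than being asserted at first contact.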

He's also deliberately cautious about automation. When systems get too opaque, organizations lose the ability to intervene when things go sideways. His platforms are designed with intentional transparency – automated decisions have confidence thresholds, humans stay meaningfully in the loop, and there's room for operators to step in when ambiguity hits. Some friction isn't a bug, it's a feature.
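That policy of intentional transparency can be expressed as a simple routing rule. This sketch is my own illustration (the thresholds and labels are assumptions, not from the source): decisions above a high confidence bar are automated, mid-confidence decisions become suggestions an operator confirms, and everything ambiguous is escalated outright.

```python
def route_decision(action: str, confidence: float,
                   auto_threshold: float = 0.9,
                   review_threshold: float = 0.6) -> str:
    """Keep humans meaningfully in the loop: automate only above a
    confidence threshold, and surface everything else for review."""
    if confidence >= auto_threshold:
        return f"auto:{action}"        # system acts; decision is logged
    if confidence >= review_threshold:
        return f"suggest:{action}"     # system proposes; operator confirms
    return "escalate"                  # ambiguity: hand off entirely
```

The deliberate friction in the middle band is the point: the system stays legible, and operators retain a real ability to intervene.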

The broader philosophy here is interesting: reliability isn't just a technical metric, it's a human outcome. The future isn't built by faster systems or faster innovators – it's built by people creating trustworthy platforms that learn, recover, and respect the humans depending on them. As more enterprises accelerate AI adoption in regulated industries, this reliability-first, human-centered approach is becoming the baseline expectation.