Technology

Operations Lead

London, 3 days per week in the office
Work Type: Full Time
Responsible for ensuring the stability, availability, and performance of the organization’s technology
platforms. This position focuses on production operations, system reliability, incident management,
and continuous operational improvement across backend services, web applications, and mobile
applications. The role works closely with engineering, infrastructure, and security teams to maintain
resilient, observable, and well-governed production environments.


Responsibilities:

Production Operations & Reliability
    Own the operational health of production systems, including backend services, web platforms, and mobile applications
    Monitor system availability, performance, and capacity to ensure adherence to reliability objectives
    Lead and participate in incident response, troubleshooting, root cause analysis, and post-incident remediation
    Define, measure, and continuously improve operational metrics, SLAs, and SLOs

On-Call & Application Support
    Participate in a scheduled on-call rotation supporting 24/7 production environments
    Provide operational support for Node.js backend services, web applications, and React Native mobile applications
    Serve as an escalation point for high-severity production incidents
    Respond to alerts within defined response time objectives and ensure timely resolution
    Ensure accurate incident documentation and follow-up actions

Monitoring, Observability & Reporting
    Implement, maintain, and optimize monitoring and observability solutions
    Build and maintain monitoring dashboards to provide real-time and historical visibility into system health
    Utilize tools such as New Relic to monitor application performance, errors, and user experience
    Continuously refine alerting thresholds to improve signal quality and reduce operational noise


Technical Operations
    Maintain hands-on involvement with cloud infrastructure, operating systems, and production configurations
    Support release, deployment, and change management processes with an emphasis on system stability
    Develop, maintain, and review operational runbooks, playbooks, and escalation procedures

Automation & Continuous Improvement
    Design and implement automation to reduce manual operational effort and risk
    Improve system resilience through redundancy, failover mechanisms, and disaster recovery planning
    Identify recurring operational issues and drive permanent technical resolutions

Risk, Security & Compliance
    Collaborate with security teams to support vulnerability management and incident response
    Participate in disaster recovery testing, business continuity planning, and compliance activities
    Ensure production systems comply with operational, security, and regulatory requirements

Cross-Functional Collaboration 
    Work closely with engineering teams to improve production readiness and operational reliability 
    Provide operational input into system architecture and design decisions 
    Act as a senior technical escalation resource without direct people management responsibilities


Skills & Qualifications:


    Production support of backend services (Node.js preferred)
    Web application support across frontend and backend components
    Mobile application support and operational troubleshooting (React Native preferred)
    Application performance monitoring using New Relic
    Design and maintenance of monitoring dashboards and alerting systems
    Cloud platforms (AWS, Azure, or GCP)
    Linux/Unix operating systems and networking fundamentals
    Infrastructure-as-code and automation tools
    Incident, change, and problem management practices
    Experience with third-party libraries and APIs
    Wide knowledge of the general mobile landscape, architectures, trends, and emerging technologies
    Excellent written, verbal and social skills
    Experience operating high-availability or distributed systems
    Familiarity with CI/CD pipelines for web and mobile applications
    Experience supporting 24/7 on-call rotations
    Ability to work independently and with minimal supervision in a fast-paced, multi-project environment
    Required Technologies: React Native, JavaScript, JS Frameworks, Redux, GraphQL, AEM, Storybook, Chromatic, TypeScript, UI Frameworks
    Relevant Technologies: Next.js, Angular, Cache, Adobe Experience Manager (AEM)

Behavioural Fit:

    Work effectively in a matrix organization and lead through influence
    Get-things-done attitude
    Must be self-motivated and results-oriented
    Ability to work in cross-functional, multicultural, and remote teams in a collaborative environment
    Ability to multitask and to plan, organize, and prioritize multiple projects
    Works well under pressure with constantly changing priorities and deadlines
    Must have a hands-on mentality
