We are seeking a highly skilled and passionate Senior Site Reliability Engineer to join our
Engineering Enablement team. This is a critical role within a large, complex, and high-impact
initiative focused on deconstructing our monolithic architecture, revitalising our technology
stack, and embedding quality and resilience into every stage of our development lifecycle.
You will play a pivotal role in shaping our future-state platform, driving operational
excellence, and fostering a culture of continuous improvement.
What You'll Do:
As a Senior SRE in our Engineering Enablement team, you will:
• Architect and Implement Reliability: Design, build, and maintain highly scalable,
resilient, and performant systems on Azure, focusing on our Java, Kafka, and
Couchbase stack.
• Drive Modernisation: Work hands-on as part of the team spearheading the adoption
of Micronaut, standardising application templates, and transitioning to managed cloud
services.
• Enhance Operational Excellence: Develop and implement strategies for improving
system observability (standardised logging, metrics, tracing), alerting, and on-call
practices.
• Automate Everything: Champion automation across the software development
lifecycle (SDLC), from CI/CD pipelines to infrastructure provisioning, focusing on
accelerating delivery and de-risking deployments.
• Incident Management & Learning: Contribute to our mature, blameless post-
incident review process, identifying root causes and implementing preventative
measures to reduce incident hours.
• Tooling & Standards: Develop, maintain, and drive the adoption of shared,
standardised SRE tooling and best practices across engineering teams, including
containerisation (e.g., Docker, Kubernetes on Azure), infrastructure as code (e.g.,
Terraform), and configuration management.
• Mentorship & Collaboration: Provide technical leadership and mentorship to junior
engineers, fostering a culture of SRE principles and operational excellence across the
wider engineering organisation.
• Strategic Input: Contribute to the overall technical strategy and roadmap for our
SRE and platform initiatives, ensuring alignment with business objectives.
What You'll Bring:
• Deep SRE Expertise: Proven experience as a Senior Site Reliability Engineer or in a
similar role, with a strong understanding of SRE principles (error budgets,
SLOs/SLIs, toil reduction).
• Azure Cloud Proficiency: Extensive hands-on experience designing, deploying, and
operating highly available and scalable applications on Microsoft Azure.
• Azure Kubernetes Service (AKS) Expertise: Extensive hands-on experience with AKS
for container orchestration is mandatory, including deployment, scaling,
monitoring, and troubleshooting.
• Java Ecosystem Mastery: Expert-level proficiency with Java, including experience
with modern frameworks (ideally Micronaut, Spring Boot, or similar) and JVM
performance tuning.
• Distributed Systems Knowledge: Solid understanding and practical experience with
distributed systems, microservices architecture, and associated challenges (e.g.,
consistency, fault tolerance).
• Messaging & Database Expertise: Hands-on experience with an event streaming
platform (ideally Kafka) and NoSQL data storage (ideally Couchbase), including
operational best practices.
• Automation-First Mindset: Strong scripting skills (e.g., Python, Bash) and
experience with Infrastructure as Code tools (e.g., Terraform, ARM templates) and
CI/CD pipelines (e.g., Azure DevOps, Jenkins).
• Observability Tools: Experience with monitoring, logging, and alerting tools (e.g.,
Azure Monitor, Prometheus, Grafana, ELK Stack, Splunk).
• Problem-Solving Acumen: Exceptional analytical and troubleshooting skills, with a
methodical approach to diagnosing and resolving complex production issues.
• Communication & Collaboration: Excellent communication skills, with the ability
to articulate complex technical concepts to diverse audiences and collaborate
effectively with cross-functional teams.
• Continuous Improvement: A proactive and innovative mindset, always seeking
ways to improve systems, processes, and team efficiency.
Some of the benefits you’ll enjoy working with us:
• The chance to join an organisation with triple-digit growth that is changing the paradigm of how software products are built.
• The opportunity to form part of an amazing, multicultural community of tech experts.
• A highly competitive compensation package.
• Medical insurance.
• English lessons.
Come and join our #ParserCommunity.