We are looking for Senior Reliability Engineers for our Container-as-a-Service product team. Our team is mandated to provide central platforms, services and expertise on Kubernetes deployments for Science and Enterprise activities around the world at Roche. As with all reliability engineers, experience with Infrastructure-as-Code is essential, and the specific skills needed in this team are Kubernetes/Containers and Public Cloud IaaS/PaaS deployments. Senior Reliability Engineers work with a positive attitude in a self-managed team, helping teammates learn and develop. They engage as ambassadors to our internal consumers, understanding business needs and translating them into infrastructure solutions.
A Senior Reliability Engineer is an infrastructure engineer who knows how to apply engineering principles to operations. They are well versed in a large number of technologies and welcome new tools and techniques. They work in conjunction with fellow engineering and operations members to come to the best possible solution. They are always looking for patterns and ways to increase efficiency, eliminate downtime, optimize costs, and maintain performance at scale. They will also advise our consumers on RE value proposition, adoption, industry best practices, and implementation strategy.
REs are responsible for the big picture of how the systems relate to each other, using a breadth of tools and approaches to solve a broad spectrum of problems. Practices such as limiting time spent on operational work, blameless postmortems and proactive identification of potential outages factor into iterative improvement that is key to both product quality and interesting and dynamic day-to-day work.
RE's culture of diversity, intellectual curiosity, problem solving and openness is key to its success. It brings together people with a wide variety of backgrounds, experiences and perspectives. They are encouraged to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects, while we also strive to create an environment that provides the support and mentorship needed to learn and grow. REs provide on-call support to keep systems up and running, ensuring the consumers have the best and fastest experience possible.
- Responsible for availability, tuning, performance, efficiency, change management, monitoring, emergency response, and capacity planning.
- Engage in and improve the whole lifecycle of services—from inception and design through deployment, operation and refinement.
- Create a bridge between engineering and operations by applying a software engineering mindset to system administration topics.
- Monitor and resolve incidents and problems with platform operations, setting priorities and ensuring all areas collaborate in the resolution when required.
- Support services before they go live through activities such as infrastructure design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
- Collaborate with Managed Services suppliers and external consultancy, ensuring the collaboration is as effective as possible.
- Scale systems sustainably through mechanisms like automation, and evolve systems by promoting changes that improve reliability and velocity.
- Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
- Look for continuous improvement opportunities across technical, teamwork, collaboration and process areas, and propose and lead the resulting activities.
- Provide direction and guidance, acting as an analyst who transforms customer needs into specific requirements to be implemented in components managed by the team or by other teams.
- Remain proactive and aware of operational challenges and opportunities and work with support team staff to resolve incidents and major incidents.
- Ensure implemented solutions and components comply with Quality/Regulatory standards, as applicable.
Job Requirements / Qualifications:
- Skills in Kubernetes/Containers, Public Cloud and IaaS/PaaS deployments.
- Proven scripting and automation skills, with expertise in delivering and managing infrastructure as code.
- Ability to work effectively with team members and virtual teams across different locations, cultural backgrounds and time zones.
- Ability to function independently with very little supervision and navigate ambiguity.
- Excellent problem-solving and decision-making skills.
- Demonstrated customer and delivery focus and strong interpersonal skills.
- Strong oral and written communication skills in English. German, Spanish or Chinese (Mandarin) are significant pluses.
- 5+ years of relevant IT experience (CaaS, IaC, IaaS, DevOps, Cloud) as a senior systems or software engineer, including experience working in one or more multinational environments. Healthcare industry experience is a plus.
- Strong hands-on technical skills in automation, infrastructure as code, logging, monitoring and observability, infrastructure configuration, scripting languages and applications.
- Experience working with Infrastructure Systems internals, their administration and networking.
- Experience applying design thinking, lean, prioritization and agile methodologies to evolve services offered to partners.
- Experience defining technical computing infrastructure that is entirely under the control of software, with no operator or human intervention.
- Experience defining Service Level Objectives and Service Level Indicators.
- Experience with DevOps mindset, processes and tools.
- Cross-Functional Technical Knowledge, tools/scripting/methodologies for: Configuration management, Infrastructure as Code, Automation Design, Infrastructure Development Life Cycle and hybrid Clouds.
- Experience with algorithms, data structures, complexity analysis and software design.
At Roche, we believe in diversity and inclusion as essential values for our success. We have a special interest in integrating people with disabilities into our teams. If you have a disability, for us it is a plus, and we have special benefits for you: Go ahead and present your candidacy!
Roche is an equal opportunity employer.
For more than 40 years, Roche Diabetes Care has been pioneering innovative diabetes technologies and services. As a global leader in integrated diabetes management, more than 5,000 employees in over 100 markets worldwide aim every day to help people with diabetes and those at risk to experience true relief from daily therapy routines. We are dedicated to advancing how care is provided. To achieve this, we collaborate with caregivers, healthcare providers and payers worldwide to drive optimal management of this complex condition and contribute to building sustainable care structures.
Personalized diabetes management
At Roche Diabetes Care, we believe that a collaborative, integrated and personalized approach is needed to determine the optimal therapy for each person with diabetes or at risk of developing the disease. It is equally important for us to spark lifestyle changes, encourage motivation and identify opportunities that will enable patients to reach their individual health goals by spending more time in range.
Under the brand Accu-Chek and in collaboration with partners, Roche Diabetes Care creates value by providing integrated solutions to monitor glucose levels, deliver insulin, and track and contextualize relevant data points to contribute to a successful therapy. By establishing an open ecosystem that connects devices and digital solutions, Roche Diabetes Care will help to enable personalized diabetes management and thereby improve therapy outcomes.
By driving digital health in an open ecosystem and offering integrated diabetes management solutions and services, we are aiming to shape the way diabetes care is being provided now and in the future.