Changing the world through digital experiences is what Adobe’s all about. We give everyone—from emerging artists to global brands—everything they need to design and deliver exceptional digital experiences! We’re passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen.
We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity. We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!
Adobe’s Magento Customer Engineering team is looking for a Senior Site Reliability Engineer to join our Cloud Operations team.
As a high-growth e-commerce company, we are looking for an experienced SRE with a heavy emphasis on backend operations support. We define, support, and operate Magento’s e-commerce platform, which includes client experience, build and deploy analysis, and production applications. At Magento, we have a solid history of delivering successful open-source software projects.
What you will do
- Solve complicated issues with live cloud server environments and provide software configuration and tuning recommendations for optimal performance.
- Drive automation initiatives for issue resolution to mitigate recurring issues.
- Write software layers, scripts, deployment frameworks, tracers, monitors, and self-healing/auto-remediation tools, and automate processes.
- Maintain business continuity by identifying and driving opportunities to make systems highly resilient and free of manual intervention.
- When complex issues arise despite the self-healing and automation you have built, get involved in troubleshooting and root-cause analysis.
- Participate in a shared on-call schedule (follow-the-sun model) managed across SRE & Engineering.
- Improve the observability of software by implementing the right monitoring, tracing, and logging.
- Formulate replies to issues, communicate progress and resolution efficiently.
- Take ownership, assess, solve, and coordinate the resolution of technical issues.
- Demonstrate excellent social skills in your interactions with internal and external developers, the community, and merchants.
What You Need to Succeed
- 5+ years’ experience working on PaaS/SaaS platforms, handling high-transaction Linux environments that require critical 24x7 uptime.
- Experience with the LAMP stack and related technologies, including MySQL, PHP, Linux, Docker, Redis, Elasticsearch, and Jenkins. Proven experience with highly scalable container-based environments (Kubernetes, Docker Swarm, …) is also required.
- Proven experience with high-level languages: Bash, Python, Ruby, Perl, or similar.
- Experience working with hyperscale cloud providers (AWS, Azure, GCP, etc.).
- Extensive experience with Monitoring / Logging / Alerting systems.
- Intellectual curiosity to pursue the unknown and to continuously learn.
- Great communication, interpersonal, and teamwork skills to work with both internal teams and our customers.
- Good social skills and a desire to work in a dynamic and fast-paced environment.
- Experience using or implementing CI/CD frameworks such as Jenkins, Travis, and GitLab CI. Infrastructure as Code (IaC) is also central to our stack, so hands-on experience with Chef, Puppet, Ansible, Terraform, etc. is welcome!
- Willingness to participate in an on-call pager rotation
- Bachelor’s degree in Computer Science or similar
- AWS/Azure Certification
- Linux Certification
- Kubernetes Certification
- Other database experience or certifications (SQL/NoSQL)
- Experience in Lean, Six Sigma, or Kaizen
- ITIL certification