At Elastic, we have a simple goal: to solve the world's data
problems with products that delight and inspire. As the company behind
the popular open source projects — Elasticsearch, Kibana, Logstash, and
Beats — we help people around the world do great things with their data.
From stock quotes to real-time Twitter streams, Apache logs to
WordPress blogs, our products are extending what's possible with data,
delivering on the promise that good things come from connecting the
dots. The Elastic family unites employees across 30+ countries into one
coherent team, while the broader community spans more than 100
countries.
Thanks to our ongoing expansion, we have the opportunity to grow our Site
Reliability team. We're part of the Elastic Cloud engineering team,
focused on solving Cloud operations problems and keeping the SaaS
online, and we aren’t afraid to get our hands dirty. We are the first line
of consumers for Elastic's products and our experience helps influence
the direction of the stack. While most organizations may have a single
Elastic Stack deployment or a handful of them, here you’ll be responsible
for identifying, troubleshooting and reporting platform problems to
product engineers (or fixing the code yourself) in order to ensure that
the thousands of Elasticsearch clusters we manage are providing a stable
and reliable service. We’re looking for people who are just as
passionate about troubleshooting issues with distributed systems as they
are about automating, coding and collaborating to solve problems.
Responsibilities
- You will report and solve problems within the Elastic Cloud infrastructure services and collaborate on issues with product engineers
- You will participate in SRE software engineering, writing code to continually reduce human intervention in operational tasks and to automate processes
- You will monitor the Elastic Cloud platform and Cloud infrastructure, responding to incidents, correcting and improving systems to prevent incidents, and planning capacity
- You will manage Cloud provider infrastructure, system deployments and product releases
- You will be involved in resolving Elastic Cloud customer support issues
- You will demonstrate and promote best practices for teams using Cloud platforms
- You will participate in 24x365 on-call schedules
Experience
- You are either an experienced sysadmin with
professional skills in Linux, preferably on distributed systems at
scale, and a demonstrable interest in using software engineering to
solve operational problems; or a software engineer with real interest,
and ideally some experience, in Linux systems, networking, monitoring
and automation.
- You have experience with Java and the JVM.
- You have at least three years of experience using a public Cloud: AWS, GCP, Azure, SoftLayer or OpenStack.
- You are comfortable writing software to automate API-driven tasks at scale. SREs use Python and Go regularly but are also encouraged to contribute to the product codebase in Java, Scala, and Python.
- You have used Ansible, Puppet, Chef or another config management suite, know where it's broken, and are open to trying new alternatives.
Key Skills
- Healthy knowledge of Linux (have compiled your own
kernel at some point, know how to trace syscalls, understand TCP, care
about the difference between sysvinit/runit/systemd, etc.)
- Relentless desire to automate and build software tools
- Desire to represent work in git, driven by a GitHub workflow through issues and pull requests
- Love open source development, and have contributed to some project somewhere (doesn't have to be ours), whether through mailing lists, patches, documentation, etc.
- Enjoy working remotely and the communication it requires
- Love a diverse environment, working with men and women all over the world
Additional Information
We're looking to hire team members invested in realising
the goal of making real-time data exploration easy and available to
anyone. As a distributed company, we believe that diversity drives our
vibe! Whether you're looking to launch a new career or grow an existing
one, Elastic is the type of company where you can balance great work
with great life.
- Competitive pay based on the work you do here and not your previous salary
- Equity
- Global minimum of 16 weeks of fully paid parental leave (moms & dads)
- Generous vacation time and one week of volunteer time off
- Your age is only a number. It doesn't matter if you're just out of college or your children are; we need you for what you can do.
Elastic is an Equal Opportunity employer committed to the principles
of equal employment opportunity and affirmative action for all
applicants and employees. Qualified applicants will receive
consideration for employment without regard to race, color, religion,
sex, sexual orientation, gender perception or identity, national origin,
age, marital status, protected veteran status, or disability status or
any other basis protected by federal, state or local law, ordinance or
regulation. Elastic also makes reasonable accommodations for disabled
employees consistent with applicable law.