Site Reliability Engineer in London

Palantir

Category

DevOps

Industry

Cyber Security Industry

Workplace

Onsite

Hours

Full-Time

Internship

Skills

Spark, Ruby, Python, PostgreSQL, Nagios, MySQL, Linux, JavaScript, Hadoop

Job Description

A World-Changing Company

At Palantir, we’re passionate about building software that solves problems. We partner with the most important institutions in the world to transform how they use data and technology. Our software has been used to stop terrorist attacks, discover new medicines, gain an edge in global financial markets, and more. If these types of projects excite you, we'd love for you to join us.

The Role

Palantir software is deployed at the world’s most critical institutions to help them solve their greatest challenges. Users at customer sites from Washington, DC to London to Tokyo rely on Palantir’s high availability to pursue their missions. Site Reliability Engineers (SREs) make sure our expanding number of customer deployments runs smoothly 24 hours a day.

As an SRE, you will monitor and maintain Palantir systems to catch problems before they threaten our customers’ workflows. You will combine engineering experience and a commitment to improving existing systems and processes with the creativity to develop novel solutions to evolving challenges.

Our team strives to automate processes wherever possible, using whatever tools are best for the job. Our responsibilities range from architecting systems for new implementations of Palantir and administering co-located servers (including hardware troubleshooting) to maintaining database platforms.

We work with a variety of teams to understand threats to our software and improve our products over time. We work side by side with Palantir’s implementation teams and our customers’ IT departments to understand each business’s unique problems and to develop innovative solutions. We document our successes and communicate them back to Palantir’s product teams to improve the way our hardware, software, and network solutions are deployed, minimizing failure rates and increasing overall system reliability.

What We Value

Experience with Linux system administration (RHEL or CentOS preferred).
Good scripting ability in Bash, and preferably also Python, Ruby, Perl or JavaScript.
Experience monitoring systems with tools like Nagios and writing health checks.
Practical experience managing databases or search engines, such as PostgreSQL, MySQL, Oracle, Cassandra or Elasticsearch.
Ability to debug performance problems in complex distributed systems.
Interest in learning and using newer technologies like Spark, Hadoop, Cassandra, and Elasticsearch.
Ability to work independently with minimal supervision.
Ability to participate in a 24/7 on-call rotation.
Unwavering commitment to operational security and best practices.

Preferred

BS/MS in Computer Science.
Experience with system management tools like Puppet, Chef or Ansible.
Knowledge of server hardware and/or experience working with Amazon Web Services (AWS).
Familiarity with information security standards and best practices.

If you need assistance or an accommodation due to a disability, you may contact us at accommodations@palantir.com.
