Site Reliability Engineer

Datadog

(New York, New York)

Full Time

Job Posting Details

About Datadog

Datadog is the leading service for cloud-scale monitoring. It is used by IT, operations, and development teams who build and operate applications that run on dynamic or high-scale infrastructure. Because Datadog collects metrics and events from 100+ different technologies and services out of the box, including other monitoring tools, you can monitor your entire stack in one place, without any gaps.

Responsibilities

* Keep our service reliable, available and fast as a member of the operations team.
* Respond to, investigate and fix service issues, whether they be deep in the OS kernel or in the application code.
* Design, build and maintain the infrastructure we need to support orders of magnitude more customers.

Ideal Candidate

**Who you must be**

* You have a BS/MS/PhD in a scientific field
* You have a track record as an engineer in the operations of a large site
* You value correctness and efficiency; you leave no stone unturned when diagnosing production issues
* You handle infrastructure with code because automation lets you focus on the more difficult and rewarding problems
* You have production experience with distributed compute/storage tools, e.g. ZooKeeper, Cassandra, Postgres, Kafka, Elasticsearch, Redis

**Bonus Points**

* You have submitted bug fixes to the aforementioned projects
* You are fully fluent in Python, Ruby and Go

