San Francisco, CA
Description:
Role Details:
CLIENT is seeking a Site Reliability Engineer for our online television and media-focused web
properties. In this role, you will build systems that support the lifecycle and visibility of sites with
global reach and scale. We aim to monitor everything and automate everything!
Your Day-to-Day:
• Write infrastructure-as-code using tools such as Docker, Kubernetes, Helm, and Terraform.
• Design and build monitoring platforms that provide effortless, real-time visibility for service owners.
• Work with our Cloud Engineering team to contribute improvements to our cloud platform.
• Collaborate with service-engineering teams to orchestrate and visualize their applications.
• Design and maintain automation workflows for CI/CD systems, such as Jenkins.
• Engage in active troubleshooting, root-cause analysis, performance tuning, and post-mortem exercises.
Qualifications:
What you bring to the team:
• Strong knowledge of Linux internals and shell scripting.
• Experience with at least one of the following programming languages: Java, Python, or Go.
• Experience with Docker or other container runtimes.
• Familiarity with container orchestration tools such as Kubernetes or Docker Swarm.
• Familiarity with cloud platforms such as GCP, AWS, or Azure.
• Bachelor’s degree or equivalent industry experience.