Posted 4 Jun 2024, 1:00 am

Site Reliability Engineer II at RudderStack

*Our roles are remote-first and can be based anywhere in India (#LI-Remote).

 

Responsibilities

  • Monitor and continually improve the capacity of our production environment
  • Design and implement scalable, reliable, and efficient infrastructure using Kubernetes, Terraform, and AWS resources
  • Partner with development teams to improve services through rigorous testing and release procedures with CI pipelines (GitHub Actions, Dockerfiles)
  • Gain a deeper understanding of RudderStack infrastructure and help debug incidents
  • Proactively build software to help operations and support teams
  • Identify opportunities for process improvements, automation, and cost savings

Requirements

  • A Bachelor's or Master's degree in Computer Science, or equivalent experience, is required
  • 5+ years of experience as a Site Reliability Engineer, Internal Platform Developer, or in a similar role
  • Strong understanding of cloud computing, containers, and DevOps practices
  • Demonstrated Linux experience
  • Excellent debugging skills
  • Experience with scripting and infrastructure automation
  • Familiarity with distributed systems design patterns using tools such as Kubernetes
  • Familiarity with AWS, Azure, or Google Cloud
  • Excellent verbal and written communication skills
  • Familiarity with networking concepts like VPCs, proxies, and CDNs

Here are examples of things we've worked on:

  • Build and maintain a Kubernetes platform to deploy all our applications with high availability
  • Build a Kubernetes operator to automate hundreds of deployments
  • Manage hundreds of PostgreSQL instances with HA for our deployments
  • Provision and manage air-gapped on-premises deployments in diverse environments
  • Manage a multi-region, multi-cluster environment with hundreds of customer deployments in single-tenant and multi-tenant models
  • Complete infrastructure as code, enforced using a GitOps model
  • Automated migrations of complex, highly available services
  • Work on compliance (e.g., SOC 2 Type 2, HIPAA), security, scalability, and more to deliver top-class, secure software
  • We follow FinOps and continuously optimize our cloud costs.

How we achieve results:

  • Empathy for the problems encountered by our customers.
  • Collaboration with engineering teams to achieve results.
  • Deep care for the quality of your own and the team's code.
  • Curiosity and understanding for investigating causes and finding effective solutions.
  • An output-driven mindset that provides value to our customers in a significant, measurable, and positive way.
  • Focus on writing testable, performant, bug-free code to provide the right solutions to the problems.


Please mention the word **SHARPEST** and tag RMjA5LjIyMi4yMS42Mg== when applying to show you read the job post completely (#RMjA5LjIyMi4yMS42Mg==). This is a beta feature to avoid spam applicants. Companies can search these words to find applicants that read this and see they're human.

The offering company is responsible for the content on this page / the job offer.
Source: Remote Ok