Netlify

Staff Infrastructure Site Reliability Engineer

Reposted 10 Days Ago
Remote
96K-130K
Senior level
As a Staff Site Reliability Engineer at Netlify, you will drive strategies for the reliability and scalability of the company's infrastructure, lead cross-organizational initiatives, mentor engineers, and develop frameworks for operational excellence. You'll be the technical authority during major incidents and work closely with stakeholders to integrate reliability considerations across the organization.
The summary above was generated by AI

About the Team:

Netlify’s SRE team is scaling to meet the demands of our rapidly growing platform and user base. Our SRE team is responsible for ensuring the reliability, scalability, and efficiency of Netlify’s infrastructure while maintaining a focus on innovation and operational excellence. As a Staff Site Reliability Engineer, you will be at the forefront of driving organizational-level reliability strategies, shaping the direction of Netlify’s systems, and tackling complex, systemic challenges. You will collaborate across teams to build a culture of operational excellence and deliver impactful solutions that support our mission to empower the next generation of web developers.

We are a remote-first, globally distributed group that values asynchronous communication, documentation, and a culture of transparency, empowerment, and collective ownership. Diversity and inclusion are at the heart of what we do, and we welcome team members from all backgrounds to bring their unique perspectives to our mission. Whether you’re launching a new phase of your career or growing an established one, Netlify offers a supportive environment where you can thrive while maintaining a healthy work-life balance.

What You’ll Do:

  • Lead high-impact reliability and infrastructure initiatives across the platform.
  • Drive the adoption of Infrastructure-as-Code and champion reliability-focused tooling and frameworks.
  • Manage all cloud infrastructure components, including instances, networking, DNS, Terraform automation, and Kubernetes.
  • Define and uphold architectural standards, best practices, and technical strategy for reliability at scale.
  • Provide mentorship to senior engineers and tech leads, fostering systems thinking and operational excellence.
  • Partner with Engineering, Product, and Executive teams to embed reliability into company-wide strategy.
  • Lead architecture reviews and provide oversight for critical infrastructure projects.
  • Develop and advocate for reliability metrics and SLO frameworks that align with business goals.
  • Participate in an on-call rotation and occasionally act as Incident Commander, providing technical leadership and system-level decision-making.

What You’ll Bring:

  • Deep expertise in cloud architecture, with hands-on experience designing and deploying global-scale solutions on AWS, Azure, or GCP.
  • Strong proficiency with Kafka or similar messaging systems, including deployment, scaling, and maintenance in multi-cloud environments.
  • Solid experience in database design, performance tuning, and maintenance for both relational and NoSQL systems in high-throughput environments.
  • Skilled in programming and scripting languages such as Go or Python, with a focus on automation and infrastructure tooling.
  • A proven track record of leading large-scale, cross-team technical initiatives and delivering impactful infrastructure outcomes.
  • Proficiency in configuration management tools like Ansible, Chef, or Puppet.
  • Experience in managing CI/CD pipelines using tools such as Jenkins, GitLab CI, CircleCI, or similar.
  • We welcome candidates based in Spain, Canada, or the UK for this position.
  • Excellent communication skills, with the ability to articulate complex technical strategies to executives and build consensus across diverse teams.
  • Demonstrated success in setting and scaling technical standards and best practices across large engineering organizations.

This role is a great fit if: 

  • You think in systems. You’re curious about how infrastructure, networking, observability, and security connect—and enjoy breaking down complex challenges into clear, actionable strategies.
  • You’re comfortable writing code (especially in Go) and enjoy automating infrastructure workflows, building tools to reduce manual effort, and supporting reliable operations at scale.
  • You’ve collaborated on cross-functional initiatives—like operational readiness reviews, cloud migrations, or introducing monitoring standards—and know how to communicate clearly with both technical and non-technical teammates.
  • You take a thoughtful, methodical approach to troubleshooting. You seek context before jumping to solutions, validate assumptions, and can clearly explain how you navigate production issues or potential incidents.
  • You work well in a distributed environment and value clear, respectful communication. Whether async or live, you prioritize inclusivity, documentation, and creating space for others to contribute.
  • You’re energized by helping others grow—whether that’s through mentoring, sharing knowledge, or building systems that support better outcomes across the team.
  • You approach reliability as a proactive practice, not just a reactive one. You care about preventing issues before they become incidents and building systems that help everyone sleep better at night.
  • You’re drawn to big, interesting challenges. The idea of helping shape a global CDN, support edge computing innovation, and rethink infrastructure for modern developers is what motivates you.

Applying:

Not sure you meet 100% of our qualifications? Please apply anyway! We value diverse experiences and perspectives.

When applying, please include:

  • A resume or short listing of your job history & skills (a LinkedIn profile link is fine).
  • (Optional) A cover letter explaining why you would enjoy this role at Netlify.

Our mission to build a better web relies on a diversity of skill sets, backgrounds, and thoughts. Netlify is an Equal Opportunity Employer, and we are committed to building a team that reflects our values of inclusivity and equity. If accommodations are needed for the interview process, please email [email protected].

About Netlify:

At Netlify, we’re on a mission to build a better web by making it easier than ever to build, deploy, and scale web applications. By unifying an entire ecosystem of web development tools, content sources, services, and APIs into one simplified workflow, Netlify empowers top brands to ship campaigns faster, reduce risk, and boost productivity and revenue. Netlify is at the forefront of the composable web movement, with over 4 million web developers and businesses using the platform. With Netlify, you can connect everything and build anything.

We are a Series D company that has raised over $200M from investors such as Andreessen Horowitz, Kleiner Perkins, EQT, Bessemer, BOND, and Menlo Ventures. As a fully distributed company, we aim to create a company culture where the best idea can come from anywhere and strive to be thoughtful, compassionate, and collaborative in our work. If this sounds like something you’d like to be part of, we’re excited to connect with you!

At Netlify, we are committed to a compensation philosophy that prioritizes fairness and equity, positions our employee compensation competitively in the market, recognizes and rewards performance, and takes a comprehensive approach to our rewards package. We anchor our compensation philosophy on a market-based approach; therefore, salary ranges may differ depending on the labor cost in a particular location. The salary provided is in addition to robust benefits and participation in Netlify’s equity plan. Our base compensation for this role is targeted at €84,000 - €113,000 for most Spain-based locations and CAD $163,000 - CAD $221,000 for most Canada-based locations. Candidates outside these locations, or in premium markets, should consult with their Talent Acquisition partner regarding location-based ranges, as they may be higher or lower than the average ranges listed. The starting pay will be determined based on multiple factors, including expertise and skills, market demands, experience, internal equity, and applicable geographic location. These compensation packages and ranges are subject to change and may be modified in the future.
