We are currently partnering with LearnUpon to help scale their engineering and product teams. This is a great chance to join a leading Irish e-learning platform. We are looking for a Staff SRE. This is a permanent role, with hybrid or part-remote working available (monthly trips to Dublin).
Responsibilities
Identify opportunities to improve and scale the infrastructure for performance, observability, maintainability, and cost, creating innovative solutions with a strong emphasis on infrastructure as code.
Lead and plan, from an SRE perspective, our transformative initiative of moving LearnUpon production infrastructure to a containerised environment, orchestrated by Kubernetes.
Lead the efforts to build an observability function that incorporates application metrics, application transaction tracking, and event log management.
Work with other Engineering teams to provide infrastructure solutions that meet their ongoing requirements while also planning for future capacity needs.
Build tools focused on measuring, monitoring and alerting, with an eye towards self-service, to promote Engineers’ ownership of observability.
Requirements
7+ years of experience working with SaaS products at scale within an SRE/DevOps role.
Experience deploying microservice environments, using containerisation technologies such as Docker and Kubernetes.
Experience designing and implementing observability tech stacks using tools such as Grafana, Prometheus, Datadog, and New Relic.
Experience building and supporting large-scale distributed systems that back a consumer app or website with associated requirements of performance, security and disaster recovery.