Senior Director, Software Engineering, Site Reliability Engineering (all genders)
Job Description
We are on the lookout for a Senior Director, Site Reliability Engineering (all genders) to
lead our global Site Reliability Engineering (SRE) department. The department is part of
the Tech Foundations tribe, with the mission to increase the leverage of Delivery
Hero’s engineering organisations by reducing complexity through streamlined,
opinionated, and supported ways to build and run software, ensuring security, reliability,
cost-efficiency, and productivity across the whole group.
SRE aims to reduce the impact of incidents, both commercially and on customer
satisfaction, by reducing their occurrence and duration across all production systems at
Delivery Hero. To achieve this, the department will work on operational tooling,
mitigation, and postmortem processes for incident handling, as well as on preemptive
measures like game days and stress tests to find problems and resolve them before
reliability issues occur. Advocating best practices to improve system reliability as well as
providing pre-instrumented, hardened application blueprints will help engineers across
the group to build a new generation of resilient applications.
The Senior Director of SRE will be expected to be an engaged thought leader in this field,
a coach to their teams, and a partner for their peers and stakeholders.
Reporting to the leader of Developer Platform, based in Berlin (Germany), you will be responsible for innovating and building central services and products in the area of site reliability (SRE consulting, performance & resilience engineering, observability, incident management).
Leading the global SRE organisation of 35 engineers spread across Berlin and Dubai.
Working closely with the Developer Productivity and Developer Infrastructure departments to integrate your tools and products into the overall developer experience.
Supporting over 4,000 software engineers distributed across our tech hubs and regional entities around the globe, representing market-leading consumer brands.
Your stakeholders are senior engineering leaders (CTOs, VPs, SVPs).
Your North Star KPI is order retention, with availability as a key driver, and you are responsible for setting up group-wide standards to measure these and other reliability-related metrics.
Your success is measured by how much you improve those KPIs across the whole group.
You drive both quarterly and long-term objectives and key results with full responsibility for executing on OKRs and your roadmap.
Qualifications
Master's degree in computer science or a similar technology-related field required, along with a strong engineering background
Extensive experience running web-scale cloud-native services and acting as an incident commander for large-scale operational issues
10+ years of experience in the Site Reliability Engineering or DevOps fields
5+ years of leadership experience, leading multiple teams of teams, ideally within platform, infrastructure, or foundation departments and serving internal customers
Excellent written and verbal communication skills
Strong influencing and stakeholder management skills, as you will need to convince senior engineering leaders to adopt your services
Strong customer focus and a passion for high-performance software delivery
You know how to lead, support, and mentor engineering managers to be more effective leaders