Engineering Manager - Observability Metrics & Alerting

Posted 5 Days Ago
Hiring Remotely in United Kingdom
Remote
7+ Years Experience
Information Technology • Security • Cybersecurity
Helping Build a Better Internet
The Role
Lead a team of engineers building a large metrics pipeline for Cloudflare's internal Metrics and Alerting platform. Responsible for shaping strategy, optimizing configurations, and delivering scalable distributed systems. Must have strong technical and management skills.
Summary Generated by Built In

Available Locations: Amsterdam or Remote Netherlands; Lisbon or Remote Portugal; London or Remote UK; Munich or Remote Germany
About the Department
Production Engineering is responsible for the world's most reliable, observable, performant, and safe network ecosystem. Our customers rely on our products and systems to safely modify, troubleshoot, and release products without external impact.
Our external customers rely on us to provide seamless and predictable incident, traffic, and policy management, resulting in the fastest and safest network services in the world.
We are accountable for the overall performance of internal- and external-facing services, guiding our product teams to optimal configurations and maximum efficiency. From the moment that a packet enters the Cloudflare ecosystem, we know exactly what its expected purpose and behavior are, and we are capable of determining and exposing anomalous behavior.
The Cloudflare network makes it possible to solve challenges at massive scale and efficiency which would be impossible for almost any other organization.
About the role
We are looking for an Engineering Manager to join Cloudflare, specifically our Observability team, in charge of our internal Metrics and Alerting platform. You will lead a team of passionate, talented engineers who are building one of the largest metrics pipelines in the world, processing over 2 billion time series across hundreds of different locations. You will play an active role in shaping our strategy and working with our customers to build the best developer experience. You will change the way people build applications.
You bring a passion for meeting business needs by building technical, innovative solutions. You excel at understanding how big-picture goals inform technical details. You thrive in a fast-paced iterative engineering environment and have experience in delivering scalable distributed systems. Most importantly, you have a track record of having past teams respect you as both a technical leader and manager.
Examples of desirable skills, knowledge and experience
Experience leading a team and working across multiple teams to deliver results
Comfortable managing backend focused teams
Solid foundation in computer science and software engineering with strong competencies in software design and building distributed systems
Excel at planning, creating teams and overseeing execution to meet commitments and deliver with predictability
Demonstrate a track record of managing a team including hiring, on-boarding, and professional development. You inspire your team to reach higher. You're as good at explaining "why" as you are "how"
Experience implementing tools, processes, internal instrumentation, and methodologies, and resolving blockages
Comfortable managing teams/projects with tight deadlines and short release cycles
Operating knowledge of Prometheus, Thanos, Alertmanager and related infrastructure
Bonus Points
Understanding of server hardware, performance expectations and limitations, and failure domains
Deep Linux/UNIX systems knowledge
Managing contributions to large open-source projects

Top Skills

Alertmanager
Linux
Prometheus
Thanos
Unix
The Company
HQ: San Francisco, CA
3,300 Employees
Hybrid Workplace
Year Founded: 2010

What We Do

Cloudflare, Inc. is on a mission to help build a better Internet. Cloudflare’s suite of products protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare have all web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was awarded by Reuters Events for Global Responsible Business in 2020, named to Fast Company's Most Innovative Companies in 2021, and ranked among Newsweek's Top 100 Most Loved Workplaces in 2022.

Why Work With Us

Cloudflare employees come from all walks of life. Our team is energized by a collaborative, creative environment that celebrates our differences and fosters new ways to grow together.

Cloudflare Offices

Hybrid Workspace

Employees engage in a combination of remote and on-site work.

We are committed to developing a global team that is distributed with a flexible working approach. Doing this equitably and inclusively is essential to our success. Visit our careers site for more on 'How & Where We Work.'

Typical time on-site: Flexible
London, GB
