About the Department
Cloudflare's Resiliency Engineering Team builds and runs the systems and software behind our solutions, which handle trillions of requests per month. Resiliency Engineering ensures that all new and existing features and functionality Cloudflare offers can be managed at scale and meet the needs of our massively growing customer base. The Infrastructure Tooling Team within the Resiliency Engineering organization is responsible for defining, building and supporting the tools that the rest of the Infrastructure Engineering team can use to manage our infrastructure at scale.
What you'll do
An engineering role at Cloudflare provides an opportunity to address some big challenges, at scale. We believe that with our talented team, we can solve some of the biggest security, reliability and performance problems facing the Internet. Just how big?
- We have in excess of 340 Terabits of network transit capacity
- We operate 330+ points of presence around the world
- We serve more traffic than Twitter, Amazon, Apple, Instagram, Bing, & Wikipedia combined
- Any time we push code, it immediately affects over 200 million Internet users
- Every day, up to 20,000 new customers sign up for Cloudflare service
- Every week, the average Internet user touches us more than 500 times
We are looking for talented Software Engineers to build and develop the platform that earns our customers' trust. Our Software Engineers come from a variety of technical backgrounds and have built up their knowledge working in different environments. The common factors across all of our reliability-focused engineers are a passion for automation, scalability, and operational excellence. Our Infrastructure Software Systems and Automation team focuses on the automation needed to scale our infrastructure.
Our team is well-funded and focused on building an extraordinary company. This is a superb opportunity to join a high-performing team and scale our high-growth network as Cloudflare's business grows. You will build tools to constantly improve our scale and speed of deployment. You will nurture a passion for an "automate everything" approach that makes systems failure-resistant and ready-to-scale.
Cloudflare Software Engineers focus on automating our infrastructure installations and decommissions at scale. We enable our Data Center Engineering teams to install new data centers and to replace servers and networking equipment in existing data centers as quickly and efficiently as possible, without impacting existing infrastructure or customer services. While our focus is on automation and accurate asset tracking, there is an element of ongoing operational support for Data Center Engineers and other teams. We also review upcoming hardware changes and update our automation and configuration management to accommodate them.
Many of our Software Engineers have had the opportunity to work at multiple offices on interim and long-term project assignments. The ideal Software Engineering candidate has strong knowledge of Rust; experience with Python, Golang, and TypeScript is an advantage. As we are automating server and networking installations, knowledge of Linux, hardware and networking is ideal. We prefer to hire experienced candidates; however, raw skill trumps experience, and we welcome strong junior applicants.
Requisite Skills
- Confidence to work in multiple programming languages - bonus points for Rust as well as Python, Golang and/or TypeScript experience
- 5 years of relevant development experience
- Strong skills in network services, including REST APIs and HTTP
Examples of desirable skills, knowledge and experience
- Strong systems-level programming skills
- Experience with (and love for) debugging to ensure the system works in all cases
- Experience with a continuous integration workflow and using source control (we use git)
- Linux systems administration experience
- Experience with Kubernetes and Docker
- Tooling and automation development experience
- Network fundamentals: DHCP, ARP, subnetting, routing, firewalls, IPv6
- Configuration management systems such as SaltStack, Chef, Puppet or Ansible
- SQL databases (Postgres or MySQL)
- Time series databases (OpenTSDB, Graphite, Prometheus)
- The ability to understand service and device metrics and visualize them using Grafana
- Great oral and written communication skills
- Desire to learn and improve
Bonus Points
- Experience with continuous / rapid release engineering
- Experience developing systems that are highly available and redundant across regions
- Performance analysis and debugging with tools like perf, sar, strace, dtrace
- Experience with the Linux kernel and Linux software packaging
- Internetworking and BGP experience
- Key/value stores (Redis, Kyoto Tycoon, Cassandra, LevelDB)
- Load balancing and reverse proxies such as Nginx, Varnish, HAProxy, Apache
Some tools that we use
- NetBox
- Apache Airflow
- Temporal
- Salt
- Docker, Kubernetes
- Nginx
- Golang
- Django
- PostgreSQL
- Redis
- Prometheus
Compensation
Compensation may be adjusted depending on work location.
- For New York City, Washington, Washington, D.C. and California (excluding Bay Area) based hires: Estimated annual salary of $154,000 - $188,000
Equity
This role is eligible to participate in Cloudflare's equity plan.
Benefits
Cloudflare offers a complete package of benefits and programs to support you and your family. Our benefits programs can help you pay health care expenses, support caregiving, build capital for the future and make life a little easier and more fun! Below is a description of our benefits for employees in the United States; benefits may vary for employees based outside the U.S.
Health & Welfare Benefits
- Medical/Rx Insurance
- Dental Insurance
- Vision Insurance
- Flexible Spending Accounts
- Commuter Spending Accounts
- Fertility & Family Forming Benefits
- On-demand mental health support and Employee Assistance Program
- Global Travel Medical Insurance
Financial Benefits
- Short and Long Term Disability Insurance
- Life & Accident Insurance
- 401(k) Retirement Savings Plan
- Employee Stock Participation Plan
Time Off
- Flexible paid time off covering vacation and sick leave
- Leave programs, including parental, pregnancy health, medical, and bereavement leave