Arista Channels

Senior Kubernetes Admin / Systems Engineer, EngProd

Posted 5 Days Ago
Remote
Hiring Remotely in Nashua, NH
Senior level
The role involves managing and enhancing a Kubernetes cluster for engineering productivity, ensuring system reliability and scalability through monitoring, alert management, and infrastructure improvements.
The summary above was generated by AI
Company Description

Arista Networks is an industry leader in data-driven, client-to-cloud networking for large data center, campus and routing environments. What sets us apart is our relentless pursuit of innovation. We leverage the latest advancements in cloud computing, artificial intelligence, and software-defined networking to provide our clients with a competitive edge in an increasingly interconnected world. Our solutions are designed to not only meet the current demands of the digital landscape but to also anticipate and adapt to future challenges.

At Arista, we value the diversity of thought and perspectives that each employee brings to the table. We believe that fostering an inclusive environment, where individuals from various backgrounds and experiences feel welcome, is essential for driving creativity and innovation.

Our commitment to excellence has earned us several prestigious awards, such as Best Engineering Team, Best Company for Diversity, Compensation, and Work-Life Balance. At Arista, we take pride in our track record of success and strive to maintain the highest standards of quality and performance in everything we do.

Job Description

Who You’ll Work With

Arista Networks is looking for world-class Kubernetes-aware engineers passionate about driving systems reliability and scalability to provide the best possible development experience for our 1400+ person engineering team. You will be part of a fast-paced, high-caliber team building the internal systems and infrastructure used to build the routing and switching products driving the industry's largest data center networks.

Arista’s Software Engineering team runs at a scale rarely found: TBs of source control, 60GB work trees with 1000s of developer branches in flight at any given time, over 400K daily build/test jobs, and over 150 homegrown and cloud-native services running on a 100-node on-prem bare-metal Kubernetes cluster. Operating these systems takes vigilance, responsiveness to alerts, and a steady stream of updates and bug fixes to keep things running smoothly and efficiently, as well as to increase our ability to monitor, understand, and visualize them. The role will cover all aspects of our Kubernetes infrastructure, and may include monitoring, responding to, and enhancing alerts; working to unify and standardize our alerts; fine-tuning code for scalability and performance; debugging problems; and simplifying and securing the developer experience with k8s. You will own your projects from definition to deployment, including developer and vendor interactions, and you will be responsible for the quality of everything you deliver.

What You’ll Do

Working in the Engineering Productivity (EngProd) group, you will collaborate with other engineers to design, build, scale, and operate the systems that the rest of Arista’s development teams use. The EngProd team uses industry-standard systems like Ansible, Jenkins, Kubernetes, Grafana, Spinnaker, MySQL, Elasticsearch, Google Cloud, and Varnish, as well as internal systems that we’ve built from the ground up to automate CI/CD, testing, analysis, and visualization.

Responsibilities

  • Work with the existing k8s admin team to own different aspects of managing a production k8s cluster (e.g., upgrades, monitoring, capacity planning, security, and developer experience)
  • Proactively monitor, respond to, and enhance alerts, and set up automated alert handling where applicable
  • Create and maintain incident response runbooks, working with the service dev teams
  • Debug and resolve issues impacting developer user experience and infrastructure stability around the k8s platform
  • Adopt current best practices in k8s cluster management, and evaluate and adopt OSS projects that simplify k8s cluster management
  • Set up guidelines and paved paths for service dev teams, improving developer experience around the k8s platform
  • Work with Arista’s software engineers to identify bottlenecks and limitations in our workflows, tooling, and infrastructure around k8s, and provide fixes for those problems
  • Engage with third-party vendor support as part of triage

Qualifications

  • At least a BSc in Computer Science or Engineering + 8 years’ experience, an MS in Computer Science or Engineering + 6 years’ experience, or a Ph.D. in Computer Science or equivalent work experience.
  • Knowledge of one or more of Go, Python, and JavaScript. Experience with shell scripting sufficient to implement medium-complexity automation workflows.
  • Knowledge of Linux (or UNIX).
  • Experience in operating software systems at scale.
  • Strong understanding of the fundamentals of storage and networking.
  • Comfortable with Ansible and GitOps.
  • Strong expertise in managing on-prem/bare-metal Kubernetes clusters.
  • Applied understanding of software engineering principles.
  • Strong problem solving and software troubleshooting skills.
  • Ability to design a solution and implement features independently. Ability to work in small teams.
  • Comfortable with security principles and able to study the source code of OSS projects and conduct experiments as necessary to debug issues.
  • Proven expertise in debugging complex issues that span the technology stack.
  • Experience dealing with network proxies and containerized storage.

Additional Information

Arista Networks is an equal opportunity employer. Arista makes all hiring and employment-related decisions in a non-discriminatory manner without regard to race, color, religion, sex, sexual orientation, gender identity, national origin or any other factor determined to be unlawful under applicable federal, state, or local law. All your information will be kept confidential according to EEO guidelines.

Top Skills

Ansible
Elasticsearch
Go
GCP
Grafana
JavaScript
Jenkins
Kubernetes
MySQL
Python
Shell Scripting
Spinnaker
Varnish
