Iron Mountain

Monitoring and Observability Platform Engineer with Datadog and Solarwinds

Posted 17 Days Ago
Remote
5 Locations
Senior level
As a Senior DevOps Engineer at Iron Mountain, you will maintain, integrate, and implement software applications. Responsibilities include managing software application testing, quality assurance, configuration, installation, and overseeing the implementation of new software updates. You will liaise with various teams, enhance monitoring practices, and manage risks in project execution.
The summary above was generated by AI

At Iron Mountain we know that work, when done well, makes a positive impact for our customers, our employees, and our planet. That’s why we need smart, committed people to join us. Whether you’re looking to start your career or make a change, talk to us and see how you can elevate the power of your work at Iron Mountain.

We provide expert, sustainable solutions in records and information management, digital transformation services, data centers, asset lifecycle management, and fine art storage, handling, and logistics. We proudly partner every day with our 225,000 customers around the world to preserve their invaluable artifacts, extract more from their inventory, and protect their data privacy in innovative and socially responsible ways. 

Are you curious about being part of our growth story while evolving your skills in a culture that will welcome your unique contributions? If so, let's start the conversation.

THE OPPORTUNITY

Title: Monitoring and Observability Platform Engineer with Datadog and SolarWinds

Location: UK, 100% remote

Full time, permanent role

SC requirements: Must hold a UK passport, have been UK based for more than five consecutive years, and be able to obtain SC security clearance

Global Technology and Innovation:

Driving performance and growth through people, innovation, security, and new ways of working, Global Technology and Innovation provides secure and stable infrastructure, competitively differentiated solutions, innovative technology platforms, and business operations for Iron Mountain.

Job summary:

This role is responsible for the comprehensive administration, configuration, and optimization of Datadog and SolarWinds monitoring platforms to ensure the health, performance, and availability of diverse applications and infrastructure.

The Monitoring and Observability Platform Engineer will leverage deep technical expertise in instrumenting applications, configuring infrastructure, network, and application monitoring, and establishing centralized logging solutions. This position requires a strong understanding of monitoring protocols, event correlation, and data trend analysis to provide end-to-end observability. The engineer will collaborate with cross-functional teams to integrate monitoring data with other critical platforms, support critical production issues, and contribute to the continuous improvement of monitoring strategies and tools. This role also includes the installation, maintenance, and upgrade of monitoring systems, as well as the creation of insightful dashboards and visualizations to drive proactive problem resolution and informed decision-making.

Your role in our mission:

    Process/Operational Experience

    • Motivated self-starter with the ability to work on individual and team tasks
    • Engineer must be able to work effectively with the Enterprise Architects, OS engineers, and operation support teams to provide training, develop guidelines, and serve as a subject matter expert
    • Ability to share knowledge of monitoring best practices with system owners and system administrators to enhance overall monitoring and alerting posture
    • Ability to plan and execute system and software installations, upgrades, and changes across the organization
    • Identify risks and roadblocks and mitigate them throughout all projects and tasks while ensuring major design flaws are addressed
    • Ability to balance competing priorities and maintain a backlog
    • Experience with gathering and organizing large amounts of data to use for instrumentation into an Enterprise monitoring solution

    People/Leadership:

    • On-call and flexible working schedule
    • Strong communication skills to relate technical details to non-technical leaders and stakeholders
    • Promote a positive working environment for the team and stakeholders
    • Enthusiastic about working with cross-functional teams and takes ownership of the success of each project
    • Working expertise in a collaborative environment and promoting a teamwork mentality
    • Excellent time management and organizational skills and experience establishing guidelines in these areas for others
    • Situationally aware - Must be the first to notice differences and issues as they arise and escalate them to management
    • Conflict resolution - Must be able to facilitate discussion and propose alternatives or different approaches


    Required Skills and Experience:

    • This role requires the candidate to be resident in the UK. UK Government SC clearance is required.
    • British national who has lived in the UK for more than 5 consecutive years and is able to pass a Home Office Security Clearance (SC) check
    • Must have:
      • Datadog, SolarWinds
      • Python, Ansible, or PowerShell scripting

    Broader/General:

    • Application performance monitoring, network monitoring, or log monitoring
    • Browser tests, synthetic monitoring, or real user monitoring
    • Log configuration, log aggregation, or log formatting
    • Event correlation
    • End-to-end observability
    • Nice to have:
      • SIEM tools: SolarWinds SEM or Chronicle
      • Nagios
      • Coding expertise in Ansible, Python, or PowerShell
      • Ability to create and execute complex SQL queries for reporting, alerting, correlation, etc.

    Minimum Skills & Qualifications:

    Minimum of four years of hands-on experience in the following: 

    • Demonstrated expertise in administering Datadog and SolarWinds platforms by instrumenting diverse applications/solutions
    • Proficient in configuring Infrastructure Monitoring, Network Monitoring, Centralized Logging, and App monitoring (browser tests, API tests, APM, and synthetics) in Datadog and SolarWinds
    • Knowledge of the monitoring configuration protocols (SNMP v2/v3, SSH, WinRM, WMI, JMX) and event correlation
    • Working expertise in performance monitoring tool alerts, dashboards, and data trend analysis in a monitoring tool
    • Hands-on experience in monitoring a variety of end devices: routers, switches, firewalls, F5 load balancers, Infoblox, storage, virtual, Windows servers, Linux servers, and UNIX servers
    • Working expertise in implementing end-to-end observability by enriching the monitoring data with other platform data such as CMDB/ServiceNow ticketing platform, and other vendor platforms
    • Responsibilities encompass script development, installation, management, and maintenance of monitoring tools, along with seamless integrations with other systems and collaboration across teams/platforms
    • Configuration of centralized logging, aggregating logs from diverse sources such as WebSphere, Tomcat, and IIS web servers, as well as security/infrastructure logs, into Datadog/SolarWinds, with expertise in handling various log formats, including JSON payloads
    • Proficient in instrumenting diverse applications within Datadog and SolarWinds, setting up health rules, and optimizing monitoring settings
    • Implementation of End User Monitoring and Real User Monitoring using Datadog and SolarWinds, including the injection of required scripts
    • Support for critical production issues, including data gathering, performance analysis, solution recommendations, and issuing comprehensive issue reports
    • Installation of SolarWinds upgrades and patches
    • Creation of data visualization dashboards in Datadog and SolarWinds
    • Collaboration with Systems and Application Architecture teams to include systems monitoring requirements in the migration/implementation process
    • Coordination with project teams to ensure the availability of monitoring for applications before their release into production
    • Contribution to the review and analysis of business and system requirements, specifically focusing on systems monitoring tool protocols and future tool utilization
    • Ability to implement and support a highly available continuous monitoring platform to be utilized by 24x7 operations and cross-functional teams
    • Knowledgeable in SSL setup and proficient in the installation and management of monitoring infrastructure certificates
    • Working expertise in automating infrastructure as code/operations using appropriate automation tools, preferably Ansible and Python, to establish event correlation
    • Expertise in recommending baseline monitoring thresholds, performance monitoring KPIs and SLAs, and monitoring tool infrastructure
    • Working expertise in a ticketing/CMDB platform, preferably ServiceNow (SNOW); other tools such as Remedy or Assyst are acceptable
    • Diploma or Bachelor's degree in computer science, information technology or a related field

    Discover what awaits you:

    • Discover Limitless Possibilities: Embark on an exciting journey with Iron Mountain, a global organization that embraces transformation and innovation.
    • Empowering Inclusion: Join a supportive environment where everyone's voice is heard, opinions are valued, and feedback is encouraged, fostering an atmosphere of inclusion and belonging.
    • Global Connectivity: Connect with 26,000+ talented individuals from 59 countries, opening doors to diverse cultures and fostering global learning opportunities.
    • Championing Individuality: Be part of a winning team that celebrates diversity and encourages individual differences to drive greatness.
    • Competitive Total Rewards: supporting your career at Iron Mountain, family, personal wellness, and wellbeing. (Local benefits may vary based on country-specific policies.)
    • Embrace Flexibility: Experience the freedom of remote/hybrid work, enabling a harmonious work-life balance (dependent on role).
    • Unleash Your Potential: Access abundant opportunities for personal and professional growth, preparing you for a digitalized future.
    • Valuing Every Contribution: Join a workplace that actively encourages and supports all talents, recognizing the unique impact of each individual.
    • Pioneering Sustainability: Contribute to our vision of fostering a sustainable and thriving workforce, leaving an enduring legacy for generations to come.

    Category: Information Technology
