The DevOps Engineer will design observability architectures, implement data collection, manage tooling, conduct performance analysis, automate workflows, and mentor teams on observability best practices.
At Uni Systems, we are working towards turning digital visions into reality. We are continuously growing, and we are looking for a DevOps Engineer to join our UniQue team!
What will you be bringing to the team?
- Observability Platform Design & Architecture:
  - You will design and implement scalable and robust observability architectures for complex distributed systems, including microservices, cloud-native environments (Kubernetes, serverless), and traditional infrastructure.
  - You will define and enforce standards and best practices for the collection, processing, storage, and visualization of telemetry data (metrics, logs, traces).
  - You will also evaluate, select, and integrate new observability tools and technologies into the existing ecosystem.
- Instrumentation & Data Collection:
  - You will work directly with development and Back Office teams to implement pervasive instrumentation within applications and infrastructure, using frameworks like OpenTelemetry.
  - You will develop custom exporters, agents, or integrations to collect specific telemetry data from various sources.
  - You will also configure and optimize data pipelines for efficient ingestion and routing of metrics, logs, and traces.
- Tooling Implementation & Management:
  - You will be involved in the hands-on deployment, configuration, and administration of observability platforms.
  - You will support the automation of the deployment and management of observability infrastructure using Infrastructure as Code (IaC) tools (e.g. Terraform).
  - You will develop custom dashboards, alerts, and reports to provide actionable insights into system health and performance.
- Performance Analysis & Troubleshooting:
  - You will support and coach teams in identifying performance bottlenecks, anomalies, and potential issues by analysing observability data.
  - You will support and coach teams in leveraging observability tools to quickly pinpoint root causes and facilitate rapid resolution.
  - You will support and coach teams in conducting in-depth performance analysis, capacity planning, and resource optimization based on collected telemetry.
  - You will support and coach teams in implementing anomaly detection and predictive analytics to anticipate and prevent issues.
- Automation & Scripting:
  - You will develop scripts and automation tools to streamline observability workflows, integrate systems, and enhance operational efficiency.
- Mentoring & Knowledge Sharing:
  - You will act as a subject matter expert, providing technical guidance and mentorship to teams on observability best practices, tools, and troubleshooting techniques.
  - You will create detailed technical documentation, runbooks, and playbooks.
  - You will conduct training sessions and workshops to upskill development and operations teams on observability concepts and tools.
- Collaboration & Cross-functional Support:
  - You will collaborate closely with development and operations teams to embed observability throughout the software development lifecycle (SDLC).
  - You will work with stakeholders to define Service Level Indicators (SLIs) and Service Level Objectives (SLOs) and build dashboards to track adherence.
What do you need to succeed in this position?
- Bachelor's degree in Computer Science, Software Engineering, DevOps, or a related technical discipline.
- 5-8+ years of progressive experience in a hands-on technical role, with a significant focus on observability, monitoring, or DevOps.
- 3-5 years of dedicated experience in designing, implementing, and managing observability solutions in production environments.
- Proven track record of architecting and delivering scalable and resilient observability platforms.
- Extensive experience with incident response and post-mortem analysis.
- Expert-level understanding of Observability Principles: Deep knowledge of the "three pillars" (metrics, logs, traces), distributed tracing, event correlation, and their application in complex systems.
- Deep Hands-on Expertise with Observability Tools: Proven proficiency in deploying, configuring, and optimizing multiple leading observability platforms (e.g., Prometheus/Grafana, ELK Stack, Jaeger/OpenTelemetry).
- Cloud-Native & Distributed Systems Expertise: In-depth understanding and hands-on experience with cloud platforms (Azure), containerization (Docker, Kubernetes), service mesh, and microservices architectures.
- Infrastructure as Code (IaC): Proficient in using tools like Terraform for automating infrastructure provisioning and configuration related to observability.
- Linux System Administration & Networking: Strong grasp of Linux operating systems, networking protocols, and system-level troubleshooting.
- Database Knowledge: Familiarity with time-series databases (e.g. Prometheus, InfluxDB) and other relevant data stores for observability data.
- Troubleshooting & Root Cause Analysis: Exceptional analytical and problem-solving skills, with a systematic approach to diagnosing complex technical issues.
- Relevant industry certifications in cloud platforms, Kubernetes, or specific observability tools are highly valued.
- A strong command of the English language, both spoken and written, is mandatory.
At Uni Systems, we provide equal employment opportunities and ban any form of discrimination on grounds of gender, religion, race, color, nationality, disability, social class, political beliefs, age, marital status, sexual orientation or any other characteristics.
Top Skills
Azure
Docker
ELK Stack
Grafana
Jaeger
Kubernetes
Linux
OpenTelemetry
Prometheus
Terraform