Build and maintain reliable, observable, and scalable Kubernetes infrastructure on GCP. Manage observability (metrics, logs, traces) with Prometheus/Grafana/Alertmanager, own GitOps workflows (FluxCD), write and improve CI/CD pipelines, and collaborate with developers to ensure performance, scalability, and reliability across the iGaming platform.
We're building a multi-brand iGaming platform that helps operators run their business in regulated markets worldwide. It brings together three main products — Player Account Management (PAM), Sportsbook, and Casino — all built on a modern microservices architecture using Scala, with a React/TypeScript back office.
We're looking for a mid-level SRE to join our team and help us build and maintain reliable, observable, and scalable infrastructure. You'll work closely with developers, own reliability practices, and contribute to the team's DevOps culture.
Requirements
Must have:
- Linux — confident working in the terminal, troubleshooting, system basics
- Docker — understanding of containers, images, and basic Docker operations
- GCP (Google Cloud Platform) — hands-on experience required; our entire infrastructure runs on GCP
- Kubernetes — solid hands-on experience; understanding of workloads, networking, and cluster operations
- GitOps / FluxCD — experience with FluxCD or similar GitOps tooling is a strong advantage
- Observability — Prometheus, Alertmanager, Grafana; understanding the difference between metrics, logs, and traces
- Git / GitHub — comfortable with branching strategies, PRs, and Git-based workflows
- Knows what SLI, SLO, and SLA mean
- Basic understanding of AI-related concepts: agents, skills, MCP (Model Context Protocol)
Nice to have:
- CI/CD Pipelines — experience writing or maintaining pipelines (GitHub Actions, GitLab CI, or similar)
- Databases — familiarity with PostgreSQL, Couchbase, or other databases; understanding of basic operations and monitoring
- Networking fundamentals — knows where load balancers, gateways, and DNS fit in a cloud architecture
- Infrastructure as Code — Terraform or similar IaC tooling
Responsibilities
- Managing and improving Kubernetes-based infrastructure
- Maintaining and evolving observability stack (metrics, logs, traces)
- Writing and improving CI/CD pipelines
- Supporting and improving GitOps workflows
- Collaborating with developers on reliability, performance, and scalability topics
Benefits


