
COGNNA

Senior Data Engineer

Posted 16 Days Ago
Remote
2 Locations
Senior level

As a Senior Data Engineer, you will be the architect of our security data ecosystem. Your primary mission is to design and build high-performance data lake architectures and real-time streaming pipelines that serve as the foundation for COGNNA's Agentic AI initiatives. You will ensure that our AI models have access to fresh, high-quality security telemetry through sophisticated ingestion patterns.

Key Responsibilities

1. Data Lake & Storage Architecture

  • Architectural Design: Design and implement multi-tier Data Lakehouse architectures to support both structured security logs and unstructured AI training data.
  • Storage Optimization: Define lifecycle management, partitioning, and clustering strategies to ensure high-performance querying while optimizing for cloud storage costs.
  • Schema Evolution: Manage complex schema evolution for security telemetry, ensuring compatibility with downstream AI/ML feature engineering.
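
A Hive-style date/source partition layout is one common way to implement the partition-pruning strategy described above. The sketch below is illustrative only; the bucket name and `partition_path` helper are hypothetical, not part of COGNNA's actual stack.

```python
from datetime import datetime, timezone

def partition_path(source: str, event_time: datetime, bucket: str = "security-lake") -> str:
    """Build a Hive-style partition path (source + UTC date) so queries
    can prune irrelevant partitions instead of scanning the whole lake."""
    d = event_time.astimezone(timezone.utc)
    return (
        f"gs://{bucket}/raw/source={source}/"
        f"year={d:%Y}/month={d:%m}/day={d:%d}/"
    )

print(partition_path("edr", datetime(2024, 5, 17, 9, 30, tzinfo=timezone.utc)))
# gs://security-lake/raw/source=edr/year=2024/month=05/day=17/
```

Partitioning by ingest date and source keeps frequently queried recent telemetry cheap to scan while older partitions can move to colder storage tiers.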

2. Real-Time & Streaming Processing

  • Streaming Ingestion: Build and manage low-latency, high-throughput ingestion pipelines capable of processing millions of security events per second.
  • Unified Processing: Design unified batch and stream processing architectures to ensure consistency across historical analysis and real-time threat detection.
  • Event-Driven Workflows: Implement event-driven patterns to trigger AI agent reasoning based on incoming live data streams.
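
The core of a streaming aggregation like the one above is windowed keying. As a minimal sketch (plain Python standing in for a real Beam/Flink job, with made-up event types), tumbling-window event counts look like this:

```python
from collections import Counter

def tumbling_window_counts(events, window_secs=60):
    """Assign each (timestamp, event_type) pair to a fixed-size tumbling
    window and count events per (window_start, event_type) key -- the
    aggregation a streaming engine would run continuously on live data."""
    counts = Counter()
    for ts, event_type in events:
        window_start = ts - (ts % window_secs)  # floor to window boundary
        counts[(window_start, event_type)] += 1
    return dict(counts)

events = [(5, "login_failed"), (42, "login_failed"), (61, "login_failed"), (70, "port_scan")]
print(tumbling_window_counts(events))
# {(0, 'login_failed'): 2, (60, 'login_failed'): 1, (60, 'port_scan'): 1}
```

In production the same keying runs incrementally with watermarks and triggers rather than over a finished list, but the window-assignment logic is identical.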

3. AI/ML Enablement & Feature Engineering

  • Vector Data Foundations: Architect the data infrastructure required to support semantic search and Retrieval-Augmented Generation (RAG) architectures for our generative AI models.
  • Feature Management: Design and maintain a centralized repository for ML features, ensuring consistent data is used for both model training and real-time inference.
  • AI Pipeline Orchestration: Build automated workflows to handle data preparation, model evaluation, and deployment within our cloud AI ecosystem.
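
The retrieval step behind the semantic-search and RAG work above reduces to nearest-neighbour ranking over embeddings. A toy sketch (hand-rolled cosine similarity over tiny vectors; a real system would use a vector index, and the document IDs here are invented):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def top_k(query_vec, corpus, k=1):
    """Rank stored (doc_id, vector) pairs by similarity to the query --
    the retrieval half of a RAG pipeline, minus the real vector index."""
    scored = sorted(corpus, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

corpus = [("alert-1", [1.0, 0.0]), ("alert-2", [0.0, 1.0]), ("alert-3", [0.9, 0.1])]
print(top_k([1.0, 0.1], corpus, k=2))
# ['alert-3', 'alert-1']
```

The data-engineering work is everything around this: keeping embeddings fresh as new telemetry lands, and serving the same features to training and inference.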

4. DataOps & Systems Design

  • Infrastructure as Code: Utilize declarative tools (e.g., Terraform) to manage the entire lifecycle of our cloud data resources and AI endpoints.
  • Quality & Observability: Implement automated data quality frameworks and real-time monitoring to detect "data drift" or pipeline failures before they impact AI model performance.
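
An automated quality gate of the kind described above can be as simple as thresholded column checks. A minimal sketch, assuming hypothetical field names and a null-rate check only (real frameworks also cover freshness, volume, and distribution drift):

```python
def quality_report(rows, required_fields, max_null_rate=0.05):
    """Flag fields whose null rate exceeds a threshold -- the kind of
    automated check a DataOps framework runs before data reaches the
    AI models downstream."""
    total = len(rows)
    failing = {}
    for field in required_fields:
        nulls = sum(1 for r in rows if r.get(field) is None)
        rate = nulls / total if total else 1.0  # empty batch fails everything
        if rate > max_null_rate:
            failing[field] = round(rate, 2)
    return failing

rows = [{"src_ip": "10.0.0.1", "user": "a"}, {"src_ip": None, "user": "b"}]
print(quality_report(rows, ["src_ip", "user"]))
# {'src_ip': 0.5}
```

Wiring a check like this into the pipeline and alerting on its output is what turns "observability" from dashboards into a gate that stops bad data early.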

Requirements
  • Experience & Education: 5+ years in Data Engineering or Backend Engineering, focused on large-scale distributed systems. B.S. or M.S. in Computer Science or a related technical field.
  • Cloud Architecture: Deep architectural mastery of the Google Cloud Platform ecosystem, specifically regarding managed analytical warehouses, serverless compute, and identity/access management. Proven track record of deploying enterprise-scale Data Lakehouses from scratch.
  • Real-Time Mastery: Expertise in building production-grade distributed messaging and stream processing engines (e.g., managed Apache Beam/Flink environments) capable of handling high-velocity telemetry.
  • AI Enablement: Strong understanding of how data architecture impacts AI performance. Experience building embedding pipelines, feature stores, and automated workflows for model training and evaluation.
  • Software Fundamentals: Expert-level Python and advanced SQL. Proficiency in high-performance languages like Go or Scala is highly desirable.
  • Operational Excellence: Advanced knowledge of CI/CD, containerization on Kubernetes, and managing cloud infrastructure through code to ensure reproducible environments.
Preferred Qualifications
  • Experience with dbt for modern analytics engineering.
  • Understanding of cybersecurity data standards (OCSF/ECS).
  • Previous experience in an AI-first startup or a high-growth security tech company.

Benefits

💰 Competitive Package – Salary + equity options + performance incentives
🧘 Flexible & Remote – Work from anywhere with an outcomes-first culture
🤝 Team of Experts – Work with designers, engineers, and security pros solving real-world problems
🚀 Growth-Focused – Your ideas ship, your voice counts, your growth matters
🌍 Global Impact – Build products that protect critical systems and data

Top Skills

Apache Beam
Apache Flink
dbt
Go
Google Cloud Platform
Kubernetes
Python
Scala
SQL
Terraform
