Req number: R6280
Employment type: Full time
Worksite flexibility: Remote

Who we are
CAI is a global technology services firm with over 8,500 associates worldwide and a yearly revenue of $1 billion+. We have over 40 years of excellence in uniting talent and technology to power the possible for our clients, colleagues, and communities. As a privately held company, we have the freedom and focus to do what is right—whatever it takes. Our tailor-made solutions create lasting results across the public and commercial sectors, and we are trailblazers in bringing neurodiversity to the enterprise.
Job Summary
As a Spark Engineer, you will design, build, and optimize large-scale data processing systems using Apache Spark. You will collaborate with data scientists, analysts, and engineers to ensure scalable, reliable, and efficient data solutions.

Job Description
We are looking for a Spark Engineer with deep expertise in distributed data processing, ETL pipelines, and performance tuning for high-volume data environments. This position will be full-time and remote.
What You'll Do:
Design, develop, and maintain big data solutions using Apache Spark (Batch and Streaming).
Build data pipelines for processing structured, semi-structured, and unstructured data from multiple sources.
Optimize Spark jobs for performance and scalability across large datasets.
Integrate Spark with various data storage systems (HDFS, S3, Hive, Cassandra, etc.).
Collaborate with data scientists and analysts to deliver robust data solutions for analytics and machine learning.
Implement data quality checks, monitoring, and alerting for Spark-based workflows.
Ensure security and compliance of data processing systems.
Troubleshoot and resolve data pipeline and Spark job issues in production environments.
What You'll Need:
Required:
Bachelor’s degree in Computer Science, Engineering, or a related field (Master’s preferred).
3+ years of hands-on experience with Apache Spark (Core, SQL, Streaming).
Strong programming skills in Scala, Java, or Python (PySpark).
Solid understanding of distributed computing concepts and big data ecosystems (Hadoop, YARN, HDFS).
Experience with data serialization formats (Parquet, ORC, Avro).
Familiarity with data lake and cloud environments (AWS EMR, Databricks, GCP DataProc, or Azure Synapse).
Knowledge of SQL; experience with data warehouses (Snowflake, Redshift, BigQuery) is a plus.
Strong background in performance tuning and Spark job optimization.
Experience with CI/CD pipelines and version control (Git).
Familiarity with containerization (Docker, Kubernetes) is an advantage.
Preferred:
Experience with stream processing frameworks (Kafka, Flink).
Exposure to machine learning workflows with Spark MLlib.
Knowledge of workflow orchestration tools (Airflow, Luigi).
Physical Demands
Ability to safely and successfully perform the essential job functions.
Sedentary work that involves sitting or remaining stationary most of the time, with occasional need to move around the office to attend meetings, etc.
Ability to conduct repetitive tasks on a computer, utilizing a mouse, keyboard, and monitor.
Reasonable Accommodation Statement
If you require a reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employment selection process, please direct your inquiries to [email protected] or (888) 824-8111.