Staff ML Systems Engineer - Media Intelligence

Tether.io

Reposted 5 Days Ago

In-Office or Remote

2 Locations

Senior level

The Staff ML Systems Engineer will architect backend systems for a media intelligence platform, integrating AI/ML services, optimizing workflows, and overseeing large media processing pipelines.

The summary above was generated by AI

Join Tether and Shape the Future of Digital Finance

At Tether, we’re not just building products, we’re pioneering a global financial revolution. Our cutting-edge solutions empower businesses—from exchanges and wallets to payment processors and ATMs—to seamlessly integrate reserve-backed tokens across blockchains. By harnessing the power of blockchain technology, Tether enables you to store, send, and receive digital tokens instantly, securely, and globally, all at a fraction of the cost. Transparency is the bedrock of everything we do, ensuring trust in every transaction.

Innovate with Tether

Tether Finance: Our innovative product suite features the world’s most trusted stablecoin, USDT, relied upon by hundreds of millions worldwide, alongside pioneering digital asset tokenization services.

But that’s just the beginning:

Tether Power: Driving sustainable growth, our energy solutions optimize excess power for Bitcoin mining using eco-friendly practices in state-of-the-art, geo-diverse facilities.

Tether Data: Fueling breakthroughs in AI and peer-to-peer technology, we reduce infrastructure costs and enhance global communications with cutting-edge solutions like KEET, our flagship app that redefines secure and private data sharing.

Tether Education: Democratizing access to top-tier digital learning, we empower individuals to thrive in the digital and gig economies, driving global growth and opportunity.

Tether Evolution: At the intersection of technology and human potential, we are pushing the boundaries of what is possible, crafting a future where innovation and human capabilities merge in powerful, unprecedented ways.

Why Join Us?

Our team is a global talent powerhouse, working remotely from every corner of the world. If you’re passionate about making a mark in the fintech space, this is your opportunity to collaborate with some of the brightest minds, pushing boundaries and setting new standards. We’ve grown fast, stayed lean, and secured our place as a leader in the industry.

If you have excellent English communication skills and are ready to contribute to the most innovative platform on the planet, Tether is the place for you.

Are you ready to be part of the future?

About the job

We are building a highly scalable media intelligence platform that processes, analyzes, and flags potentially problematic content across large volumes of video, audio, image, and text. The platform powers content safety workflows for Tether Data products, including Keet, and must operate reliably at scale across diverse languages, formats, and content types.

As a Staff ML Systems Engineer, you will own the core of this platform - from ingestion and async processing pipelines through AI/ML model integration, inference optimization, vector search, and structured report generation. This is not a research role and it is not a prompt-engineering role. It is a production engineering role where the models are one component of a larger system that you are responsible for making fast, reliable, cost-efficient, and maintainable.

You will be the senior technical owner of the media intelligence backend. That means you define the architecture, make the hard tradeoff calls, mentor other engineers, and carry responsibility for the system in production. You will work closely with engineering leadership and collaborate with ML researchers, data engineers, and product teams to deliver a platform that provides actionable, timestamped findings to human reviewers at scale.

Responsibilities

Backend Architecture & System Ownership

Design and operate scalable backend services for media ingestion, processing, and report generation - clean, well-tested, and built for horizontal scaling from day one
Own API contracts, data models, and storage patterns for media assets, processing jobs, model outputs, embeddings, and audit trails
Build high-throughput async processing pipelines for video, audio, image, and text using queues and event-driven patterns (SQS, Kafka, Pub/Sub, or equivalent)
Implement reliable asynchronous processing with retries, idempotency, dead-letter queues, backpressure handling, and graceful degradation

AI/ML Integration & Model Workflows

Integrate and optimize AI/ML inference workflows within the backend - embedding pipelines, multimodal models, OCR, speech-to-text, scene analysis, and visual classifiers
Own model-serving infrastructure: batching strategies, concurrency tuning, warmup behavior, timeout handling, autoscaling, and GPU utilization
Apply practical model optimization techniques - quantization, distillation, batching, caching, routing to smaller models where appropriate - to hit latency, throughput, and cost targets on constrained hardware
Benchmark and evaluate candidate models using domain-relevant metrics, not just standard leaderboards. Set operating thresholds using data-driven calibration methods and document the rationale

Model Serving & Performance Optimization

Optimize AI/ML inference workflows for latency, throughput, reliability, and cost across both real-time and batch-processing paths.
Work with model-serving systems such as vLLM, Triton, TGI, SageMaker, Vertex AI, or custom inference services to improve batching, concurrency, warmup behavior, timeout handling, autoscaling, and GPU utilization.
Evaluate and apply practical model optimization techniques such as quantization, model distillation, batching, caching, prompt optimization, and routing to smaller or cheaper models where appropriate.
Monitor model and system performance in production, including API latency, queue depth, processing time, model error rates, GPU utilization, confidence distributions, drift signals, and cost per processed item.

Search, Indexing & Data Retrieval

Design and maintain vector search and indexing systems using technologies such as Pinecone, Weaviate, Qdrant, Elastic Vectors, FAISS, pgvector, or similar tools.
Build retrieval workflows that support semantic search, similarity matching, duplicate detection, media discovery, and structured metadata search.

Infrastructure, Reliability & Observability

Deploy and operate systems on AWS, GCP, Azure, or equivalent cloud platforms, including compute, storage, networking, queues, model-serving infrastructure, and monitoring systems.
Ensure system reliability through logging, metrics, tracing, alerting, dashboards, operational runbooks, and incident-response best practices.

Collaboration & Engineering Leadership

Mentor junior and mid-level engineers through code reviews, design discussions, and hands-on pairing
Drive architectural decisions and raise engineering quality across backend, infrastructure, and ML integration work
Translate ambiguous product requirements into clear technical deliverables with defined success criteria

These three capabilities are non-negotiable. Please do not apply if you cannot speak to all three with specific examples and real numbers.

Model optimization for constrained hardware. You have deployed an ML model (any modality) on a memory-constrained inference server and improved its performance in production. You have hands-on experience with at least two of: quantization (GPTQ, AWQ, GGUF, or bitsandbytes), batching strategy tuning, KV-cache optimization, knowledge distillation, or serving framework configuration (vLLM, Triton, TGI, or equivalent). You can cite specific before/after latency, throughput, or memory numbers from that work.
Production async pipeline ownership. You have owned an async media-processing or ML inference pipeline end to end - including queue design, worker failure handling, idempotency, retry logic, dead-letter queues, and an observability layer. You understand what happens under backpressure and what breaks first.
Evaluation and calibration rigor. You have set model operating thresholds using data-driven methods - precision/recall curves, cost-weighted metrics, or domain-specific benchmarks - not intuition or trial and error. You can explain the false-positive and false-negative tradeoffs of a classifier in plain language to a non-technical stakeholder.

Additional requirements:

8+ years of backend engineering experience building scalable distributed systems, data pipelines, or media processing services
4+ years of hands-on ML integration experience in production (model APIs, embedding pipelines, OCR, speech-to-text, video/image analysis, or multimodal inference)
Strong Python proficiency; deep understanding of RESTful API design and async processing patterns
Experience with SQL and NoSQL databases, schema design, and data modeling
Experience deploying and operating systems on AWS, GCP, or Azure

Preferred:

Experience with video/image processing pipelines
Familiarity with HuggingFace transformers, PyTorch, and the open-source model ecosystem
Prior work on content moderation, trust and safety, or media intelligence systems
Experience with multilingual models or multilingual content pipelines
Contributions to open-source ML or infrastructure projects

System Design & Architecture

Preferred understanding of distributed systems, scaling patterns, and performance engineering
Ability to design modular, maintainable, and efficient architectures
Experience with API versioning, modularization, and designing long-running workflows
Understanding of performance bottlenecks and low-latency backend patterns

Important information for candidates
Recruitment scams have become increasingly common. To protect yourself, please keep the following in mind when applying for roles:

Apply only through our official channels. We do not use third-party platforms or agencies for recruitment unless clearly stated. All open roles are listed on our official careers page: https://tether.recruitee.com/
Verify the recruiter’s identity. All our recruiters have verified LinkedIn profiles. If you’re unsure, you can confirm their identity by checking their profile or contacting us through our website.
Be cautious of unusual communication methods. We do not conduct interviews over WhatsApp, Telegram, or SMS. All communication is done through official company emails and platforms.
Double-check email addresses. All communication from us will come from email addresses ending in @tether.to or @tether.io.
We will never request payment or financial details. If someone asks for personal financial information or payment at any point during the hiring process, it is a scam. Please report it immediately.

When in doubt, feel free to reach out through our official website.
