Req number: R7168
Employment type: Full time
Worksite flexibility: Remote
Who we are
CAI is a global services firm with over 9,000 associates worldwide and a yearly revenue of $1.3 billion+. We have over 40 years of excellence in uniting talent and technology to power the possible for our clients, colleagues, and communities. As a privately held company, we have the freedom and focus to do what is right, whatever it takes. Our tailor-made solutions create lasting results across the public and commercial sectors, and we are trailblazers in bringing neurodiversity to the enterprise.
Job Summary
Technical resource responsible for monitoring, optimizing, and improving operational reliability within our Microsoft Dataverse / Power Platform environment. This role will lead monitoring, observability, DevOps enablement, CI/CD governance, and automation initiatives to improve system reliability, integration stability, and backend IT support efficiencies. The ideal candidate combines Power Platform expertise with experience in Azure DevOps, GitHub-based source control, monitoring frameworks, and cloud cost management. If you are looking for your next career move, apply now.
Job Description
We are looking for a technical resource responsible for monitoring, optimizing, and improving operational reliability within our Microsoft Dataverse / Power Platform environment. This role will lead monitoring, observability, DevOps enablement, CI/CD governance, and automation initiatives to improve system reliability, integration stability, and backend IT support efficiencies. This position will be full-time and remote.
What You'll Do
Scheduled Job & Workflow Monitoring
Monitor scheduled jobs and background processes in Dataverse
Track:
Failed Power Automate flows
Long-running flows
Workflow execution anomalies
Build proactive alerting mechanisms
Perform root cause analysis and corrective action planning
Improve retry logic and resiliency patterns
Dashboard & Observability Development
Design and implement centralized monitoring dashboards displaying:
Failed workflows
Slow-running automations
Integration health and latency
System account lockouts
Job schedules and failure trends
Long-running background jobs
Email error logs
Tools may include:
Power BI
Azure Monitor
Application Insights
Log Analytics
Azure Cost Management
Dataverse analytics APIs
Azure DevOps & GitHub (CI/CD Governance)
Implement and manage CI/CD pipelines for Power Platform solutions
Manage solution versioning and environment promotion strategies (Dev → Test → Prod)
Configure Azure DevOps pipelines for:
Solution export/import automation
Automated deployments
Validation testing
Maintain GitHub repositories for:
Source control of Power Platform solutions
Infrastructure-as-Code (IaC) scripts
Automation scripts
Enforce branching strategies and pull request governance
Integrate automated quality checks into deployment pipelines
Enable automated environment provisioning where applicable
Source Control & Environment Governance
Support Git-based source control best practices
Support:
Branching models (GitFlow or trunk-based)
Automated solution packaging
Maintain deployment documentation and release runbooks
Ensure secure credential and secret management within pipelines
Automation & Operational Efficiency
Identify repetitive backend support tasks suitable for automation
Implement self-healing automation where feasible
Reduce alert fatigue through improved monitoring configuration
Develop operational runbooks and knowledge base documentation
What You'll Need
Required:
Monitoring & Observability
Azure Monitor
Log Analytics
Application Insights
Power BI dashboard development
Structured logging and alerting frameworks
Power Platform & Dataverse
Advanced Power Automate
Dataverse administration
Solution management
Power Platform ALM best practices
DevOps & CI/CD
Azure DevOps Pipelines
GitHub (branching strategies, PR governance)
YAML-based pipeline configuration
Power Platform Build Tools
Environment promotion strategies
Preferred:
Experience implementing DevOps for enterprise Power Platform environments
Familiarity with Infrastructure as Code (ARM, Bicep, Terraform)
Understanding of SRE principles
ITIL or service management experience
Experience reducing production incidents through automation
Role Impact / Outcomes
This role will:
Increase platform reliability
Reduce production incidents
Improve deployment consistency
Strengthen DevOps maturity
Improve backend IT efficiency
Establish governance and observability standards across the Power Platform ecosystem
Physical Demands
Ability to safely and successfully perform the essential job functions consistent with federal, state and local standards
Sedentary work that involves sitting or remaining stationary most of the time with occasional need to move around the office to attend meetings, etc.
Ability to conduct repetitive tasks on a computer, utilizing a mouse, keyboard and monitor
Reasonable accommodation statement
If you require a reasonable accommodation in completing this application, interviewing, completing any pre-employment testing, or otherwise participating in the employment selection process, please direct your inquiries to [email protected] or (888) 824-8111.


