Technology Operations & AIOps Consultant

SWATX

  • Riyadh
  • Contract
  • Full-time
  • Posted 1 month ago
Key Responsibilities
  • Toolchain Evaluation & Modernization:
  • Evaluate legacy monitoring and alerting tools (e.g., BMC MainView, SolarWinds).
  • Recommend and integrate a unified observability stack using Splunk, Dynatrace, Grafana, and Elastic Stack.
  • Ensure end-to-end visibility across infrastructure, apps, and user experience.
  • AIOps Enablement:
  • Deploy AIOps capabilities (event correlation, noise reduction, predictive analytics) using Dynatrace and Splunk.
  • Enable intelligent alerting and root cause analysis using ML-based models.
  • Integrate ServiceNow ITOM for automated incident creation and enrichment.
  • Automation & Self-Healing:
  • Develop automation playbooks and runbooks (Python, PowerShell, Ansible) for common incident types.
  • Enable auto-remediation pipelines linked to AIOps events.
  • Support auto-scaling, service restarts, and configuration drift correction.
  • Observability Architecture & Implementation:
  • Deploy log, metric, and trace collection using Elastic Stack and Dynatrace.
  • Define and implement Service Level Objectives (SLOs), error budgets, and MTTR/MTTD benchmarks.
  • Build dashboards in Grafana, Dynatrace, and ServiceNow Performance Analytics.
  • Operational Process Reengineering:
  • Redesign and automate event, incident, change, and problem management processes.
  • Align monitoring workflows with ServiceNow CMDB and CI health status.
  • Shift operations from reactive to proactive, leveraging predictive insights.
Qualifications
  • Education:
  • Bachelor's degree in Information Technology, Engineering, or Computer Science
  • Master's degree (preferred)
  • Experience:
  • 8–12 years of experience in IT operations, observability, or monitoring architecture
  • 3–5 years of hands-on experience in AIOps and automation
  • Strong background in Dynatrace, Splunk, SolarWinds, ServiceNow, Elastic, and BMC tools
  • Core Competencies:
  • Observability architecture and integration
  • AIOps platforms and automation frameworks
  • ITOM/ITSM best practices (especially ServiceNow ITOM modules)
  • Scripting and tooling orchestration
  • Metrics design (MTTR, uptime, alert fatigue index)
  • Certifications (Preferred):
  • ITIL 4 Managing Professional
  • Dynatrace Associate/Professional
  • Splunk Core Certified Admin
  • DevOps / SRE Foundation
