Senior AI Platform Engineer
Guardian Life
**Job Description:**
AI Platform Engineer (5–8 years)
Role Summary
We are looking for an AI Platform Engineer to build and scale the core platform that supports traditional ML models as well as modern LLM and generative AI workloads. This role focuses on production-grade MLOps, model lifecycle management, platform reliability, security, and self-service enablement for data scientists and engineering teams.
Key Responsibilities
1. Platform Engineering
· Design and operate a scalable AI/ML platform using Kubernetes, containers, and infra-as-code.
· Build reusable frameworks for model training, fine-tuning, batch and real-time inference, and RAG pipelines.
· Implement multi-tenant isolation, quotas, and cost-tracking.
2. MLOps & Model Lifecycle Automation
· Develop CI/CD/CT pipelines for models, prompts, and data.
· Manage model registry, feature store, lineage, and experiment tracking.
· Ensure reliable production rollout using blue-green, canary, and shadow deployments.
3. Data & Pipelines
· Build scalable data and model pipelines using orchestrators like Airflow, Prefect, Dagster, or Argo.
· Implement data validation and schema enforcement.
· Optimize storage, caching, indexes, embeddings, and vector search workflows.
4. Observability & Reliability
· Set up monitoring for data drift, model drift, prompt performance, latency, accuracy, and cost.
· Define SLOs, SLIs, and incident response patterns.
· Implement logging, tracing, and metrics using Prometheus, Grafana, OpenTelemetry, or similar tools.
5. Security & Governance
· Enforce secrets management, IAM controls, network security, and auditability.
· Implement model governance, model cards, prompt controls, and risk guardrails.
· Work with the security team to ensure PII protection and compliance adherence.
6. Performance & Cost Optimization
· Optimize compute, autoscaling, GPU usage, caching, and batching.
· Track cost per model, per workload, and per team for transparency.
· Implement model optimization (quantization, distillation, caching).
7. Enablement & Developer Experience
· Create templates, SDKs, CLI tools, documentation, and best practices.
· Help data scientists and developers move models to production quickly.
· Partner with architecture, cybersecurity, and product teams.
Must-Have Skills
· 5–8 years of experience, with 3+ years in ML platform/MLOps.
· Strong Python development skills.
· Experience with Kubernetes, Docker, Helm.
· Infra-as-code: Terraform, Pulumi, CloudFormation, or similar.
· CI/CD systems like GitHub Actions, GitLab CI, Azure DevOps, Jenkins.
· Experience with one or more ML platforms: MLflow, Kubeflow, Azure ML, Vertex AI, SageMaker, Ray, BentoML, W&B.
· Strong understanding of model lifecycle, deployment patterns, and monitoring.
· Experience with vector databases, feature stores, artifact registries.
· Familiarity with observability stacks (Prometheus, Grafana, Loki, OpenTelemetry).
· Strong understanding of security for data and ML workloads.
Good-to-Have Skills
· Experience with LLM serving frameworks and hosted APIs: vLLM, Triton, Ray Serve, OpenAI/Anthropic APIs.
· Experience building RAG systems with vector DBs: FAISS, Milvus, Pinecone.
· Understanding of data engineering tools like Spark, Flink, Kafka.
· GPU optimization (CUDA, TensorRT, ONNX).
· Background in cost governance (FinOps for AI).
· Experience building internal SDKs, CLIs, or developer tools.
· Knowledge of privacy frameworks and governance models.
Education
B.E / B.Tech / M.E / M.Tech in Computer Science, IT, or equivalent hands-on experience.
**Location:**
This position can be based in any of the following locations:
Chennai
**Current Guardian Colleagues: Please apply through the internal Jobs Hub in Workday**
Every day, Guardian helps our 29 million customers realize their dreams through a range of insurance and financial products and services. Our Purpose, to inspire well-being, guides our dedication to the colleagues, consumers, and communities we serve. We know that people count, and we go above and beyond to prepare them for the life they want to live, focusing on their overall well-being: mind, body, and wallet. As one of the largest mutual insurance companies, we put our customers first. Behind every bright future is a Guardian™. Learn more about Guardian at guardianlife.com.
Visa Sponsorship:
Guardian Life is not currently sponsoring employment visas and does not expect to do so in the foreseeable future. In order to be a successful applicant, you must be legally authorized to work in the United States, without the need for employer sponsorship.