Delivery — AI Cloud & GPU

Principal Architect — AI Cloud & GPU Infrastructure

Architect the compute layer that powers the AI era.

Remote / US or India | Remote | Full-Time | Req CDC-016

About the role

The Principal Architect for AI Cloud & GPU Infrastructure designs and deploys GPU clusters, private AI clouds, MLOps platforms, and inference infrastructure for CloudData.Center clients. You are the technical authority on AI compute architecture — from NVIDIA DGX clusters to Kubernetes and inference serving.

What you will do

Lead architecture design for GPU compute clusters (H100, B200, GB200-class systems)
Design high-performance networking for AI: InfiniBand, 400G/800G Ethernet fabrics, and RDMA
Architect private AI cloud platforms: bare-metal GPU provisioning, Kubernetes, and GPU scheduling
Design storage architectures for AI workloads: high-throughput NVMe and parallel file systems
Lead MLOps platform design: training pipelines, model registries, and inference serving infrastructure
Advise clients on GPU-as-a-Service models, multi-tenancy, and workload isolation
Support hyperscaler on-ramp design: AWS, Azure, and GCP hybrid connectivity
Develop reference architectures and technical proposal content

What we need

10+ years in cloud architecture, HPC, or AI/ML infrastructure roles
Deep expertise in GPU compute: NVIDIA DGX, H100/A100/B200 architecture, NVLink, NVSwitch
Strong understanding of high-performance networking: InfiniBand HDR/NDR, RoCE, RDMA
Experience with Kubernetes, Slurm, or Ray for GPU cluster orchestration
Familiarity with MLOps platforms: MLflow, Kubeflow, NVIDIA NIM, or equivalent

Nice to have

NVIDIA Certified Networking Professional or equivalent
Experience with PyTorch/TensorFlow distributed training infrastructure
Private AI cloud design experience for financial services or government verticals

Apply

Apply for Principal Architect — AI Cloud & GPU Infrastructure

Tell us about yourself and attach your resume. We review every application personally.