Delivery — AI Cloud & GPU
Principal Architect — AI Cloud & GPU Infrastructure
Architect the compute layer that powers the AI era.
Remote / US or India | Remote | Full-Time | Req CDC-016
About the role
The Principal Architect for AI Cloud & GPU Infrastructure designs and deploys GPU clusters, private AI clouds, MLOps platforms, and inference infrastructure for CloudData.Center clients. You are the technical authority on AI compute architecture — from NVIDIA DGX clusters to Kubernetes and inference serving.
What you will do
- Lead architecture design for GPU compute clusters (H100, B200, GB200-class systems)
- Design high-performance networking for AI: InfiniBand, 400G/800G Ethernet fabrics, and RDMA
- Architect private AI cloud platforms: bare-metal GPU provisioning, Kubernetes, and GPU scheduling
- Design storage architectures for AI workloads: high-throughput NVMe, parallel file systems
- Lead MLOps platform design: training pipelines, model registries, inference serving infrastructure
- Advise clients on GPU-as-a-Service models, multi-tenancy, and workload isolation
- Support hyperscaler on-ramp design: AWS, Azure, and GCP hybrid connectivity
- Develop reference architectures and technical proposal content
What we need
- 10+ years in cloud architecture, HPC, or AI/ML infrastructure roles
- Deep expertise in GPU compute: NVIDIA DGX, H100/A100/B200 architecture, NVLink, NVSwitch
- Strong understanding of high-performance networking: InfiniBand HDR/NDR, RoCE, RDMA
- Experience with Kubernetes, Slurm, or Ray for GPU cluster orchestration
- Familiarity with MLOps platforms: MLflow, Kubeflow, NVIDIA NIM, or equivalent
Nice to have
- NVIDIA Certified Networking Professional or equivalent
- Experience with PyTorch/TensorFlow distributed training infrastructure
- Private AI cloud design experience for financial services or government verticals
Apply
Tell us about yourself and attach your resume. We review every application personally.