Senior Platform AI Engineers

About Us:

CLOUDSUFI, a Google Cloud Premier Partner, is a leading global provider of data-driven digital transformation for cloud-based enterprises. With a global presence and a focus on Software & Platforms, Life Sciences and Healthcare, Retail, CPG, Financial Services, and Supply Chain, CLOUDSUFI is positioned to meet customers where they are in their data monetization journey.

Our Values

We are a passionate and empathetic team that prioritizes human values. Our purpose is to elevate the quality of life for our family, customers, partners and the community.

Equal Opportunity Statement

CLOUDSUFI is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. All qualified candidates receive consideration for employment without regard to race, colour, religion, gender, gender identity or expression, sexual orientation, or national origin. We provide equal opportunities in employment, advancement, and all other areas of our workplace. Learn more at https://www.cloudsufi.com/

Role Requirement:

The candidate must have experience building a unified AI platform that serves as a key enabler for engineering teams.

● Internal AI Developer Platform (IDP) Architecture: Proven experience designing and building multi-tenant AI platforms. Ability to abstract complex GCP AI services (Vertex AI, Agent Engine, Model Garden) into self-service, "paved-path" APIs, SDKs, or Terraform modules that product engineering teams can easily consume without needing deep AI infrastructure expertise.

● Centralized Model Gateway/Garden: Expertise in architecting enterprise Model Gateways. Must have experience building unified routing layers that manage rate-limiting, load balancing, failovers, and unified telemetry, allowing the platform team to swap underlying models seamlessly without breaking downstream product applications.

● Secure 3P Integration Middleware (Tool Registries): Deep experience in enterprise integration and API management (e.g., Apigee, Cloud Run middleware). Ability to build a centralized, secure "Tool Registry" that standardizes how downstream agents authenticate and interact with external systems (Salesforce, ServiceNow), enforcing strict identity and access management (IAM) and OAuth boundaries.

● Standardized LLMOps & Agent CI/CD Pipelines: Strong background in designing "Code-to-Production" workflows for AI. Experience building standardized release pipelines that enforce mandatory pre-flight checks, such as automated evaluations (LLM-as-a-judge), latency testing, and prompt regression testing before any product team's agent is promoted to production.

● Platform-Level Trust, Safety & FinOps: Experience implementing embedded, non-bypassable governance. This includes integrating dynamic PII masking (GCP Sensitive Data Protection), prompt injection firewalls (Model Armor), and token-cost attribution (chargeback models via BigQuery) at the platform tier, ensuring product teams are secure and cost-aware by default.

● Self-Service RAG & Knowledge Infrastructure: Experience provisioning and managing scalable, multi-tenant Vector Search infrastructure (e.g., Vertex AI Vector Search, AlloyDB pgvector). Ability to build standardized data ingestion and chunking pipelines that allow diverse engineering teams to easily ground their specific agents in sanctioned enterprise data.

● AI Observability: Proficiency in implementing OpenTelemetry (OTel) and distributed tracing for LLM reasoning paths. Ability to build global dashboards that provide the platform team with a bird's-eye view of system-wide token usage, latency bottlenecks, and tool-call failures across dozens of disparate product teams.

Behavioural competencies required:

Must have worked with US/Europe-based clients in an onsite/offshore delivery model
Should have very good verbal and written communication, technical articulation, listening and presentation skills
Should have proven analytical and problem-solving skills
Should have demonstrated effective task prioritization, time management and internal/external stakeholder management skills
Should be a quick learner and team player
Should have experience working under stringent deadlines in a matrix organization structure
Should have demonstrated appreciable Organizational Citizenship Behavior (OCB) in past organizations

Required Skills

AI Engineering, Platform Engineering
