The Portfolio Group
Join an award-winning B2B consultancy at the forefront of enterprise AI, building and owning the cloud-native platform infrastructure that powers production-grade conversational and generative AI products at scale.
The role
This is a platform and infrastructure engineering role – not a data science or ML engineering position. You'll own the runtime, infrastructure, and operational layers that RAG pipelines, LLM orchestration, vector search, and evaluation workflows run on, across AWS and Databricks. The focus is on building scalable, observable, secure, and cost-efficient platform infrastructure that enables AI engineering teams to ship and operate AI products reliably in production.
What you'll do
- Design, build, and operate cloud-native AI platform infrastructure across AWS (Lambda, API Gateway, DynamoDB, S3, CloudWatch) and Databricks
- Deploy and operate containerised services on Kubernetes using Terraform for infrastructure-as-code
- Own and scale vector search infrastructure (OpenSearch, Algolia, Amazon Bedrock Knowledge Bases) and embedding pipelines
- Build and maintain CI/CD pipelines for inference services, retrievers, ingestion workflows, and RAG components
- Implement observability across AI workloads using CloudWatch, MLflow, and OpenTelemetry – covering latency, throughput, cost, and system health
- Apply secure-by-design principles including IAM, encryption, network controls, and audit logging
- Work closely with AI engineers to translate prototypes and proof-of-concepts into production-ready, well-architected platform components
What we're looking for
- Proven experience in platform, infrastructure, or software engineering roles delivering production-grade systems on AWS
- Strong hands-on container orchestration experience in production environments, specifically Kubernetes on EKS (Elastic Kubernetes Service) and ECS (Elastic Container Service)
- Strong Terraform experience for infrastructure-as-code, provisioning and managing cloud infrastructure at scale
- Experience operating containerised services, managing CI/CD pipelines, and owning observability and reliability
- Familiarity with vector databases or search infrastructure (OpenSearch, Algolia) is a strong advantage
- Python proficiency for scripting, automation, and deploying production services
- Solid grasp of distributed systems, cloud-native architecture, microservices, and API design
- Ownership mindset – comfortable operating autonomously across reliability, performance, cost, and security
Why join?
You'll own the foundational platform infrastructure behind a growing suite of generative AI products, working directly with senior AI and engineering leaders. This is a deep technical ownership role with long-term architectural impact, within an organisation investing heavily in AI at scale.
The Portfolio Group are acting on behalf of our client in recruiting for this position.
To apply for this job please visit www.reed.co.uk.
