Senior AI Platform Engineer (Multi-tenant SaaS & MLOps) (m / f / d)
In Germany - Berlin | Bonn | Cologne | Frankfurt / Main | Hamburg | Munich
We are seeking an experienced AI Platform Engineer to contribute to designing and building scalable SaaS products within our AI Lab. In this role, you’ll combine deep technical expertise with strategic vision to build AI-powered products that will help transform our clients’ business models and enable their growth.
Simon-Kucher is at the forefront of innovation in driving commercial excellence, revamping business models, and developing solutions and methodologies for unlocking better growth for our clients. Within AI Lab, we are developing cutting-edge, large-scale AI products to deliver sustained top-line impact for our clients.
Are you interested in working in a team of AI evangelists with a can-do attitude? Want to experience the dynamics of agile processes in open-minded teams? How about getting creative in a startup atmosphere with a steep development curve and flat hierarchies? And most importantly, do you want to make a difference? Then you've come to the right place.
What makes us special :
- Advance your career with exciting professional opportunities in our thriving company with a startup feel
- Innovate by transforming ideas into cutting-edge AI products, championing AI and Generative AI through creative experimentation to push boundaries and deliver transformative solutions.
- Voice your unique ideas in a culture defined by our entrepreneurial spirit, openness, and integrity
- Feel at home working with our helpful, enthusiastic colleagues who have great team spirit
- Broaden your perspective with our extensive training curriculum and learning programs (e.g. LinkedIn Learning)
- Speak your mind in our holistic feedback and development processes (e.g. 360-degree feedback)
- Satisfy your need for adventure with our opportunities to live and work abroad in one of our many international offices
- Enjoy our benefits, such as hybrid working, daycare allowance, corporate discounts, and wellbeing support (e.g. Headspace)
- Unwind in our break areas where you can help yourself to the healthy snacks and beverages provided
- See another side of your coworkers at our frequent employee events, World Meetings and Holiday Parties
How you will create an impact :
- Design and evolve a multi-tenant SaaS architecture, including tenant isolation for data, compute, and observability.
- Build automated tenant provisioning / onboarding, configuration, and safe rollouts (canary / feature flags) across tenants.
- Implement noisy-neighbor protection (per-tenant quotas, rate limits, priority scheduling) and per-tenant SLO monitoring.
- Partner with security / compliance to deliver enterprise controls (audit logs, tenant-aware access control, retention).
- Develop and maintain data architecture: create and manage robust data architectures that support high-volume, high-throughput SaaS applications, focusing on reliability and scalability.
- Drive faster and more reliable ML delivery by building robust MLOps foundations, including automated training pipelines, experiment tracking, and scalable model deployment.
- Accelerate AI product development by operationalizing LLMs end-to-end, from fine-tuning and evaluation to high-performance serving, monitoring, and embeddings workflows.
- Increase engineering velocity and system reliability by developing and maintaining unified CI / CD pipelines that ship ML and application code seamlessly.
- Enable scalable and cost-efficient AI workloads through well-architected cloud infrastructure on AWS.
- Improve performance and resilience of AI systems by managing Kubernetes clusters, optimizing autoscaling, and orchestrating GPU-heavy workloads.
- Enhance inference speed and portability by delivering highly optimized, secure Docker-based containers tailored for ML and LLM workloads.
- Strengthen data quality and model performance through well-designed ETL / ELT pipelines, streaming systems, feature store integration, and workflow orchestration.
- Ensure reliable and trustworthy AI operations by implementing comprehensive observability: logs, metrics, traces, and model / data drift detection.
- Reduce operational risk by embedding security and compliance best practices (IAM, RBAC, VPC design, secrets management, and encryption) into every layer of the stack.
- Increase automation, reduce manual toil, and support rapid experimentation by leveraging Python, Bash, and Terraform to script, codify, and automate infrastructure and ML workflows.
About you :
- You have shipped and operated customer-facing SaaS products used by real users at scale and bring hands-on experience operating multi-tenant SaaS with tenant isolation, per-tenant controls, and enterprise security expectations.
- You have previously owned end-to-end ML / AI infrastructure, from data ingestion and feature pipelines to training, deployment, and production monitoring.
- You enable engineers and data scientists to move faster by building self-service platforms, stable environments, and automated workflows that eliminate friction.
- You have a track record of designing systems that scale globally across regions, workloads, and traffic patterns.
- You’re comfortable participating in incident response and on-call rotations, and you know how to stabilize and improve critical production systems.
- You think with a product mindset, focusing on customer value, reliability, and speed-to-market rather than technology for its own sake.
- You have a strong bias for automation: you eliminate manual operational toil by designing robust tooling and pipelines.
- You have very strong communication and collaboration skills: supporting other engineers, async collaboration, explaining technical decisions to non-technical audiences, writing documentation, and showing initiative.
Technical skills required :
- Proven patterns for tenant isolation (DB-per-tenant vs schema-per-tenant vs row-level security), plus tenant-aware caching and noisy-neighbor protection (rate limits, quotas, scheduling).
- Experience with OIDC / OAuth2, tenant-aware RBAC / ABAC, SCIM provisioning, and audit logging requirements for B2B SaaS.
- Deep Kubernetes experience: cluster ops, HPA / VPA, node pools, GPU scheduling, cluster autoscaler / Karpenter, PDBs, network policies, and multi-AZ design.
- Service mesh (Istio / Linkerd) and ingress patterns (ALB / Nginx), plus secure egress and mTLS (where applicable).
- Infrastructure as Code beyond Terraform basics (a strong requirement): Terraform modules, Terragrunt, policy-as-code (OPA / Conftest), and secrets automation.
- GitOps (ArgoCD / Flux) and progressive delivery (Argo Rollouts / Flagger), feature flags, canaries, and blue / green deployments.
- Model lifecycle tooling: MLflow / W&B, model registry, experiment tracking, reproducible training, and dataset versioning (DVC / lakeFS).
- Pipeline orchestration (Airflow / Prefect / Dagster) and artifact stores.
- Model serving patterns: online serving (KServe / Seldon / BentoML / Ray Serve), async / batch inference, autoscaling, and rollback strategies.
- Experience with prompt / version management, offline and online evaluation harnesses, RAG evaluation (retrieval metrics, groundedness), guardrails, and red-teaming basics.
- Handling streaming inference (SSE / WebSockets), caching, routing, and fallback models.
- Vector DB experience (pgvector / Pinecone / Weaviate / Milvus) and embedding lifecycle (backfills, re-embedding, indexing strategies).
- Observability (an explicit requirement): OpenTelemetry, tracing, and SLOs. Tools: Prometheus / Grafana, Loki / ELK, Datadog / New Relic, or whatever you standardize on.
- Incident management: postmortems, runbooks, error budgets.
- Requirements aligned to enterprise buyers: GDPR, encryption at rest / in transit, secrets management (AWS Secrets Manager / Vault), KMS, key rotation.
- SOC 2 / ISO 27001 familiarity, vulnerability scanning (Trivy / Grype), SBOMs, SAST / DAST, dependency management.
Have we sparked your interest? Simply click the 'Apply now' button to submit your application. Please note that, for data protection reasons, we cannot accept applications via email.
Would you like to learn more about us and our company culture? Click here to watch our recruitment video.
About Simon-Kucher
Simon-Kucher is a global consultancy with more than 2,000 employees in 30+ countries.
Our sole focus is on unlocking better growth that drives measurable revenue and profit for our clients. We achieve this by optimizing every lever of their commercial strategy – product, price, innovation, marketing, and sales – based on deep insights into what customers want and value. With 40 years of experience in monetization topics of all kinds, we are regarded as the world’s leading pricing and growth specialist. simon-kucher.com
We believe in building a culture that embraces diversity, equity, and inclusion, creating an environment in which our people feel valued, are able to be themselves and feel their contribution matters. If we get that right, remarkable things will happen; people will grow faster, innovate, feel valued, and create better outcomes for everyone – our people, our clients and, of course, our business.
Your personal contact :
Maria Weininger
recruitment.germany(at)simon-kucher.com
Please submit your application exclusively via the “Apply now” button!
Better growth starts here. With you.