SRE - Data Platform

Veepee · FR · 101d ago

Hybrid · Java · Python

About the role

Being an SRE at VeepeeTech means being part of a cross-functional SRE community while being embedded in a product-oriented Data Platform team.

You will contribute to the reliability, scalability, and operability of critical data services by applying SRE and DevOps practices, while sharing knowledge across teams.

The Data Platform is currently evolving toward a modern lakehouse architecture deployed on VeepeeCloud (our on-prem platform), based on technologies such as Trino, Iceberg, and object storage, with strong ambitions around performance, cost efficiency, and platform ownership.

You will work in a distributed environment (France & Spain), within a team of 40–50 data professionals across engineering, analytics, data science, and governance.

You will play a key role in ensuring the reliability and scalability of this next-generation data platform, while supporting the transition from public cloud to hybrid/on-prem architectures.

🎯 TASKS

Platform Reliability & Operations

Ensure reliability and performance of our data platform services (Trino, Iceberg, S3, Kafka, Flink)

Define and implement SRE best practices: SLIs/SLOs, error budgets, observability

Build and maintain monitoring, alerting, and incident response frameworks (Prometheus, Grafana, etc.)

Cloud Migration & Architecture

Contribute to the migration from a public cloud data warehouse to the VeepeeCloud lakehouse stack

Support the coexistence of cloud and on-prem systems, ensuring consistency and reliability across both

Help design resilient architectures for ingestion, transformation, and serving layers

Kubernetes & Infrastructure

Operate and improve services running on Kubernetes (GKE/EKS & on-prem clusters)

Automate infrastructure provisioning using Terraform, Atlantis, and/or Crossplane

Improve GitOps workflows for platform deployment and configuration

FinOps & Performance Optimization

Collaborate with teams to optimize compute/storage usage (Trino queries, BigQuery slots, etc.)

Build tools and dashboards to track cost, usage, and efficiency

Support the transition toward cost-efficient on-prem workloads

Developer Enablement

Improve self-service capabilities for data teams (e.g., provisioning Trino/Iceberg resources)

Help teams adopt best practices in reliability, observability, and deployment

Write clear technical documentation and runbooks

Resilience & DRP

Contribute to Disaster Recovery Plan (DRP) definition and implementation

Ensure multi-DC resilience (FR1 / NL1) and sound data replication strategies

Participate in incident management and postmortems

👉 MUST HAVE skills

Strong experience with Kubernetes in production environments

Experience with distributed data systems (or strong willingness to learn)

Solid understanding of SRE principles (monitoring, alerting, SLAs/SLOs)

Experience with Infrastructure as Code (Terraform or similar)

Familiarity with GitOps workflows

Experience with observability tools (Prometheus, Grafana, logging systems)

Comfortable working in cloud environments

Strong collaboration mindset and ability to work across teams

Fluent in English

👉 NICE TO HAVE skills

Experience with Trino, Iceberg, or data lakehouse architectures

Experience with Ceph S3 or object storage systems

Knowledge of Kafka / Flink / Airflow

Experience with FinOps practices and cost optimization

Experience with Crossplane or platform self-service models

Programming skills (Python, Java, or Go)

Experience with multi-region / multi-DC architectures

✅ BENEFITS

Variable bonus

A dynamic and creative environment within international teams

A variety of self-study courses on our e-learning platform

Participation in local and international meetups and conferences

Flexible office with up to 2 days at home

⚙️ RECRUITMENT PROCESS

1️⃣ 30-minute HR Screen with a Veepeeᵀᵉᶜʰ Recruiter

2️⃣ General technical discussion

3️⃣ Technical discussion with the manager

4️⃣ Team interview

