Site Reliability Engineer Intern
Dataiku
Dataiku is The Universal AI Platform™, giving organizations control over their AI talent, processes, and technologies to unleash the creation of analytics, models, and agents. Providing no-, low-, and full-code capabilities, Dataiku meets teams where they are today, allowing them to begin building with AI using their existing skills and knowledge.
Internship goal
Enhance the infrastructure and continuous integration pipelines of our Cloud Platform, and help automate processes within our Site Reliability Engineering team.
Detailed description
Dataiku Cloud is the SaaS offering of Dataiku, a company specializing in AI and machine learning. The SRE team is responsible for the reliability and performance of our infrastructure and for the releases of the platform.
The goal of the internship is for you to join one of the SRE teams and work across multiple domains of a modern SRE stack: CI/CD pipelines, Infrastructure as Code, security, monitoring, and more. The internship begins with an onboarding phase in which the SREs will introduce you to major tools used in the cloud industry, including Kubernetes, AWS, Grafana, GitHub Actions, and Docker.
With the supervision and support of your tutor, you will collaborate directly with other SRE team members and developers. You will work from our Paris or Nantes offices and have the opportunity to meet all other Dataiku teams. You’ll be an integral part of our team.
Responsibilities
- Collaborate with the SRE team to improve and optimize the CI/CD pipeline
- Leverage Kubernetes to build and test images in-cluster
- Automate our infrastructure processes
- Work on reliability and scalability processes
- Define metrics and build monitoring dashboards
Stack
- Kubernetes
- AWS / Azure
- Grafana
- Argo CD
- GitHub Actions
- Terraform
- AI / machine learning