MLOps: Industrialising Machine Learning Models for Production
This 5-day course provides an end-to-end introduction to MLOps for deploying, operating, and improving machine learning systems in production. Designed for data scientists, machine learning engineers, and DevOps/SRE professionals, it combines business and technical perspectives, including project scoping, data definition, error analysis, drift detection, reproducibility, deployment, monitoring, CI/CD, and governance.
The course is structured around a hands-on capstone project based on a credit card fraud detection system. Participants progressively build a production-ready ML workflow by containerising services with Docker, exposing models through FastAPI, tracking experiments and model versions with MLflow, deploying workloads with Kubernetes, and implementing monitoring, validation, rollback, and governance practices. The training takes place in cloud environments that reflect realistic production conditions and emphasises practical decision-making across the full machine learning lifecycle.
Content
- ML lifecycle in production: scoping, data definition, modelling strategy, deployment, monitoring, and MLOps maturity
- Production challenges in machine learning: data drift, concept drift, training-serving skew, feedback loops, and performance baselining
- Development environment and engineering practices for ML: cloud setup, productivity tools, advanced Git, repository structure, and reproducibility
- Containerisation of ML systems with Docker: images, containers, registries, Dockerfiles, image optimisation, Docker Compose, and GPU support
- Model serving with FastAPI: API design, serving strategies, data validation, error handling, logging, documentation, and testing
- Experiment tracking and model management with MLflow: metrics, parameters, artefacts, model registry, run comparison, and lifecycle management
- Error analysis and model evaluation: per-segment metrics, false positive and false negative analysis, fairness, and model selection
- Deployment and orchestration with Kubernetes: pods, deployments, services, ingress, scaling, rollout strategies, and rollback
- Monitoring and observability in production: infrastructure metrics, model performance monitoring, drift detection, and traceability
- CI/CD for machine learning: automated testing, data validation, regression testing, image build and deployment pipelines, governance, and responsible AI
- Capstone project: end-to-end industrialisation of a credit card fraud detection system
Learning Outcomes
At the end of the training, learners will be able to:
- Design an end-to-end MLOps workflow for a machine learning application in production
- Containerise and serve a machine learning model using Docker and FastAPI
- Track experiments and manage model versions with MLflow
- Deploy and scale a machine learning service with Kubernetes
- Implement monitoring, validation, and CI/CD practices for production machine learning systems
Training Method
This course combines lectures, real-world scenarios and examples, practical exercises, group work, and project-based learning. Participants progressively develop a capstone project based on a credit card fraud detection system in cloud environments that reproduce realistic production conditions, with each topic contributing directly to the final end-to-end MLOps pipeline.
Certification
Certificate of Participation
Prerequisites
Participants should be comfortable with intermediate Python programming, including object-oriented programming, data manipulation with pandas and NumPy, and writing modules and unit tests. They should also be comfortable working in a Linux or shell environment, including navigation, environment variables, pipes, redirections, and basic scripting. Prior experience with supervised machine learning is required, including training and evaluating models, interpreting common evaluation metrics, and working with overfitting, cross-validation, and train/test splits. Participants should also be familiar with Git workflows, including branching, merging, pull requests, and simple conflict resolution. Prior experience in data science or ML engineering is strongly recommended.
Planning and location
Day 1: 09:00 - 18:00
Day 2: 09:00 - 18:00
Day 3: 09:00 - 18:00
Day 4: 09:00 - 18:00
Day 5: 09:00 - 18:00