Course Outline
Introduction
- Overview of deep learning scaling challenges
- Overview of DeepSpeed and its features
- DeepSpeed vs. other distributed deep learning libraries
Getting Started
- Setting up the development environment
- Installing PyTorch and DeepSpeed
- Configuring DeepSpeed for distributed training
DeepSpeed Optimization Features
- DeepSpeed training pipeline
- ZeRO (memory optimization)
- Activation checkpointing (also known as gradient checkpointing)
- Pipeline parallelism
Scaling Models with DeepSpeed
- Basic scaling using DeepSpeed
- Advanced scaling techniques
- Performance considerations and best practices
- Debugging and troubleshooting techniques
Advanced DeepSpeed Topics
- Advanced optimization techniques
- Using DeepSpeed with mixed precision training
- DeepSpeed on different hardware (e.g. NVIDIA and AMD GPUs, CPUs)
- DeepSpeed with multiple training nodes
Integrating DeepSpeed with PyTorch
- Integrating DeepSpeed with PyTorch workflows
- Using DeepSpeed with PyTorch Lightning
Troubleshooting
- Debugging common DeepSpeed issues
- Monitoring and logging
Summary and Next Steps
- Recap of key concepts and features
- Best practices for using DeepSpeed in production
- Further resources for learning more about DeepSpeed
Requirements
- Intermediate knowledge of deep learning principles
- Experience with PyTorch or similar deep learning frameworks
- Familiarity with Python programming
Audience
- Data scientists
- Machine learning engineers
- Developers
Duration
- 21 hours