Course Outline
Foundations of Audio Classification
- Categories of sound events: environmental, mechanical, and human-generated
- Overview of use cases: surveillance, monitoring, and automation
- Distinguishing between audio classification, detection, and segmentation
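One way to picture the difference between the three tasks is by the shape of their outputs. The sketch below is illustrative only: the `Event` type and the helper names are assumptions made for this example, not part of any library. Detection is represented as a list of labelled events with onset and offset times, classification as the set of labels present in a clip, and segmentation as one label per fixed-length frame.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Event:
    """Detection output: a labelled event with start and end times (hypothetical type)."""
    label: str
    onset: float   # seconds
    offset: float  # seconds


def clip_tags(events: List[Event]) -> Set[str]:
    """Classification view: which classes occur anywhere in the clip."""
    return {e.label for e in events}


def frame_labels(events: List[Event], clip_duration: float, hop: float) -> List[Optional[str]]:
    """Segmentation view: one label (or None) per frame of length `hop` seconds.

    A frame takes the label of the first event covering its midpoint;
    overlapping events are not handled in this simplified sketch.
    """
    n_frames = int(clip_duration / hop)
    labels: List[Optional[str]] = []
    for i in range(n_frames):
        midpoint = (i + 0.5) * hop
        active = [e.label for e in events if e.onset <= midpoint < e.offset]
        labels.append(active[0] if active else None)
    return labels


events = [Event("dog_bark", 1.0, 2.0)]
print(clip_tags(events))               # clip-level tags
print(frame_labels(events, 3.0, 0.5))  # frame-level labels
```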
Audio Data and Feature Extraction
- Various types of audio files and formats
- Key considerations: sampling rate, windowing, and frame size
- Extracting features such as MFCCs, chroma features, and mel-spectrograms
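As a rough illustration of how sampling rate, frame size, and windowing interact, the sketch below splits a signal into overlapping Hann-windowed frames using only the standard library. The 25 ms frame and 10 ms hop at 16 kHz are assumed values chosen for the example, and `hz_to_mel` uses one common variant of the mel formula. In practice a library such as librosa or torchaudio would typically handle the spectrogram and MFCC computation.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels (one common formula variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def hann_window(n):
    """Symmetric Hann window of length n."""
    if n == 1:
        return [1.0]
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples, frame_size, hop_size):
    """Split samples into overlapping frames and apply a Hann window to each.

    Trailing samples that do not fill a whole frame are dropped.
    """
    if len(samples) < frame_size:
        return []
    window = hann_window(frame_size)
    n_frames = 1 + (len(samples) - frame_size) // hop_size
    frames = []
    for k in range(n_frames):
        start = k * hop_size
        segment = samples[start:start + frame_size]
        frames.append([s * w for s, w in zip(segment, window)])
    return frames


sample_rate = 16000                     # assumed sampling rate in Hz
frame_size = round(0.025 * sample_rate) # 25 ms -> 400 samples
hop_size = round(0.010 * sample_rate)   # 10 ms -> 160 samples

one_second = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate) for t in range(sample_rate)]
frames = frame_signal(one_second, frame_size, hop_size)
print(len(frames), "frames of", frame_size, "samples each")
```

Each windowed frame would then be passed to an FFT and a mel filter bank to obtain a mel-spectrogram column.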
Data Preparation and Annotation
- Use of standard datasets like UrbanSound8K and ESC-50, alongside custom datasets
- Labeling sound events and defining temporal boundaries
- Techniques for balancing datasets and audio augmentation
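The balancing and augmentation ideas above can be sketched with the standard library alone. The function names below are hypothetical, and the specific choices (inverse-frequency class weights, Gaussian noise mixed at a target signal-to-noise ratio, circular time shift) are just a few common options, not a prescribed recipe.

```python
import math
import random
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * count) for c, count in counts.items()}


def add_noise(samples, snr_db, rng):
    """Mix in Gaussian noise scaled so the result has the requested SNR in dB."""
    signal_power = sum(s * s for s in samples) / len(samples)
    noise = [rng.gauss(0.0, 1.0) for _ in samples]
    noise_power = sum(x * x for x in noise) / len(noise)
    target_noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / noise_power)
    return [s + scale * x for s, x in zip(samples, noise)]


def time_shift(samples, shift):
    """Circularly shift the clip to the right by `shift` samples."""
    shift %= len(samples)
    if shift == 0:
        return list(samples)
    return samples[-shift:] + samples[:-shift]


labels = ["siren", "siren", "siren", "dog_bark"]
print(balanced_class_weights(labels))

clean = [math.sin(0.1 * i) for i in range(1000)]
noisy = add_noise(clean, snr_db=10.0, rng=random.Random(0))
shifted = time_shift(clean, 100)
```

Rare classes receive larger weights, which can be passed to a weighted loss, while the augmented clips increase the variety seen during training.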
Building Audio Classification Models
- Application of Convolutional Neural Networks (CNNs) for audio tasks
- Input types: raw waveforms versus extracted features
- Choosing loss functions and evaluation metrics, and controlling overfitting
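On imbalanced sound datasets, accuracy alone can hide poor performance on rare classes, so a macro-averaged F1 score is often reported alongside it. The following is a minimal standard-library sketch of that metric; the function names are illustrative, and a library routine such as scikit-learn's `f1_score` would normally be used instead.

```python
def per_class_f1(y_true, y_pred):
    """F1 per class, computed as 2*TP / (2*TP + FP + FN)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = {}
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of clips whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


y_true = [0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```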
Event Detection and Temporal Localization
- Strategies for frame-based and segment-based detection
- Refining detections through thresholds and smoothing techniques
- Visualizing predictions on audio timelines
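The thresholding and smoothing step can be sketched as follows, assuming a model that emits one probability per frame for a single class. The median filter width, the 0.5 threshold, and the 0.5 s hop are arbitrary example values, and the function names are hypothetical.

```python
from statistics import median


def median_smooth(probs, width=3):
    """Median-filter frame probabilities; edges are padded by repeating the end values."""
    half = width // 2
    padded = [probs[0]] * half + list(probs) + [probs[-1]] * half
    return [median(padded[i:i + width]) for i in range(len(probs))]


def frames_to_events(probs, threshold, hop_seconds):
    """Turn frame probabilities into (onset, offset) pairs in seconds."""
    events = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i                      # event begins at this frame
        elif p < threshold and start is not None:
            events.append((start * hop_seconds, i * hop_seconds))
            start = None                   # event ended before this frame
    if start is not None:                  # event still active at clip end
        events.append((start * hop_seconds, len(probs) * hop_seconds))
    return events


probs = [0.1, 0.9, 0.2, 0.8, 0.9, 0.85, 0.1, 0.1]
raw_events = frames_to_events(probs, threshold=0.5, hop_seconds=0.5)
smooth_events = frames_to_events(median_smooth(probs), threshold=0.5, hop_seconds=0.5)
print(raw_events)
print(smooth_events)
```

In this example, smoothing removes the isolated one-frame spike and extends the remaining event slightly, which shows that the filter width trades robustness against temporal precision.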
Advanced Topics and Real-Time Processing
- Utilizing transfer learning for scenarios with limited data
- Deploying models using TensorFlow Lite or ONNX
- Considerations for streaming audio processing and latency
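For streaming input, audio usually arrives in chunks whose size does not match the model's analysis window. The sketch below, with a hypothetical `StreamingWindower` class, buffers incoming chunks and emits overlapping fixed-size windows; the helper shows the minimum latency contributed by buffering alone, before any inference time is added.

```python
class StreamingWindower:
    """Buffer incoming chunks and emit overlapping fixed-size windows."""

    def __init__(self, window_size, hop_size):
        self.window_size = window_size
        self.hop_size = hop_size
        self._buffer = []

    def push(self, chunk):
        """Add a chunk of samples; return every complete window now available."""
        self._buffer.extend(chunk)
        windows = []
        while len(self._buffer) >= self.window_size:
            windows.append(self._buffer[:self.window_size])
            del self._buffer[:self.hop_size]   # advance by one hop, keep the overlap
        return windows


def buffering_latency_ms(window_size, sample_rate):
    """Time needed to collect one full window, in milliseconds."""
    return 1000.0 * window_size / sample_rate


windower = StreamingWindower(window_size=4, hop_size=2)
for chunk in ([1, 2, 3], [4, 5], [6, 7, 8]):
    for window in windower.push(chunk):
        print(window)   # each window would be passed to the model here

print(buffering_latency_ms(window_size=16000, sample_rate=16000), "ms")
```

A shorter window lowers latency but gives the model less context per prediction.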
Project Development and Application Scenarios
- Designing an end-to-end pipeline from ingestion to classification
- Developing proof-of-concept solutions for surveillance, quality control, or monitoring
- Implementing logging, alerting, and integration with dashboards or APIs
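The logging and alerting stage of such a pipeline can be sketched with Python's standard `logging` module. The `AlertPolicy` class, the logger name, the confidence threshold, and the per-label cooldown are all assumptions made for this example; a real deployment would forward alerts to whatever dashboard or API is in use.

```python
import logging

logger = logging.getLogger("audio_events")  # hypothetical logger name


class AlertPolicy:
    """Raise an alert for confident detections, at most once per cooldown per label."""

    def __init__(self, threshold, cooldown_seconds):
        self.threshold = threshold
        self.cooldown_seconds = cooldown_seconds
        self._last_alert = {}  # label -> timestamp of the most recent alert

    def should_alert(self, label, confidence, timestamp):
        # Every detection is logged, whether or not it triggers an alert.
        logger.info("event label=%s confidence=%.2f t=%.1f", label, confidence, timestamp)
        if confidence < self.threshold:
            return False
        last = self._last_alert.get(label)
        if last is not None and timestamp - last < self.cooldown_seconds:
            return False  # still inside the cooldown window for this label
        self._last_alert[label] = timestamp
        logger.warning("ALERT label=%s confidence=%.2f t=%.1f", label, confidence, timestamp)
        return True


logging.basicConfig(level=logging.INFO)
policy = AlertPolicy(threshold=0.8, cooldown_seconds=10.0)
policy.should_alert("glass_break", 0.90, timestamp=0.0)
policy.should_alert("glass_break", 0.95, timestamp=5.0)
```

The cooldown keeps a single long event from flooding downstream systems with repeated alerts.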
Summary and Next Steps
Requirements
- A solid understanding of machine learning concepts and model training processes
- Proficiency in Python programming and data preprocessing techniques
- Familiarity with the fundamentals of digital audio
Intended Audience
- Data scientists
- Machine learning engineers
- Researchers and developers specializing in audio signal processing
Duration
- 21 hours