MLOps for Wizart
Implementing MLOps solutions to streamline AI and data management for Wizart, optimizing operational efficiency and scalability in retail and décor software.
- Service: AI Development Services
- Industry: Retail & Distribution
Wizart faced manual data management and inefficient machine learning processes that hindered scalability and operational effectiveness. Expozit provided tailored MLOps services to address these issues.
Customer goal
Machine learning (ML) promises innovation and efficiency, but the path from developing advanced ML models to deploying them effectively presented Wizart with three challenges:
1. Data Management
The Data Management team had to select and preprocess data for machine learning by hand, using custom scripts that took hours or days to run. This bottleneck slowed data delivery to the ML team and held up model training.
2. ML Processes
Urgent experiments and development work were regularly delayed by the team’s heavy data preparation workload and by time-consuming kernel recompilation. This slowed ML processes and lengthened time-to-market for product updates. For example, retraining and updating a model demanded significant time from ML engineers: selecting a server, setting it up on vast.ai, migrating data, and monitoring the run.
3. Monitoring and Maintenance
Inefficient performance testing and monitoring led to service disruptions and customer dissatisfaction. This showed up in poor mean time between failures (MTBF) and mean time to repair (MTTR) figures, indicating a need for better operational reliability.
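To make the two reliability metrics concrete, here is a minimal sketch of how MTBF and MTTR can be derived from an incident log. It uses the common definitions (MTBF = total uptime / number of failures, MTTR = total downtime / number of failures). The function name, the incident timestamps, and the 30-day window are illustrative assumptions, not Wizart's actual data or tooling.

```python
from datetime import datetime, timedelta


def reliability_metrics(incidents, window_start, window_end):
    """Return (mtbf, mttr, availability) for one observation window.

    incidents: list of (down_at, restored_at) datetime pairs, assumed to be
    non-overlapping and fully inside the window.
    """
    if not incidents:
        return None, None, 1.0  # no failures observed in this window
    window = window_end - window_start
    downtime = sum((restored - down for down, restored in incidents), timedelta())
    uptime = window - downtime
    failures = len(incidents)
    mtbf = uptime / failures       # average running time between failures
    mttr = downtime / failures     # average time to restore service
    availability = uptime / window # fraction of the window the service was up
    return mtbf, mttr, availability


# Hypothetical incident log for a 30-day window (illustrative values only).
start = datetime(2024, 1, 1)
end = start + timedelta(days=30)
incidents = [
    (datetime(2024, 1, 3, 10), datetime(2024, 1, 3, 12)),    # 2 h outage
    (datetime(2024, 1, 14, 8), datetime(2024, 1, 14, 9)),    # 1 h outage
    (datetime(2024, 1, 25, 20), datetime(2024, 1, 25, 23)),  # 3 h outage
]
mtbf, mttr, availability = reliability_metrics(incidents, start, end)
print(f"MTBF: {mtbf}, MTTR: {mttr}, availability: {availability:.4f}")
```

A longer MTBF and a shorter MTTR both raise availability, which is why the two are usually tracked together.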
Solution
Expozit’s MLOps: Basic Ops engagement with Wizart began with a thorough audit of the existing processes. We engaged key stakeholders to understand the infrastructure’s nuances and limitations and each team’s specific needs. The analysis covered the technologies, tools, platforms, and team competencies in the ML domain.
The audit revealed inefficiencies, such as underutilized EC2 instances, as well as opportunities to adopt serverless computing.
Working closely with Wizart’s ML team, we devised a comprehensive solution:
• Automated Data Management: Implemented additional Airflow pipelines and used AWS Lambda and AWS Glue ETL to automate data collection and preprocessing, improving efficiency and reducing manual effort.
• Optimized ML Processes: Introduced Amazon SageMaker for streamlined model training and deployment, minimizing reliance on external services like vast.ai and optimizing infrastructure management.
• Enhanced Monitoring and Maintenance: Configured Amazon CloudWatch for real-time performance monitoring and integrated Prometheus with NVIDIA Triton for detailed model metric tracking, enabling proactive issue resolution and improved operational reliability.
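As an illustration of the automated data management step, the sketch below shows the general shape of an AWS Lambda handler that reacts to S3 upload notifications and works out which new objects need preprocessing. It is an assumption-laden example, not Wizart's implementation: the `raw/` and `processed/` prefixes and the returned task format are hypothetical, and the actual preprocessing (for example, starting a Glue job) is deliberately left out.

```python
from urllib.parse import unquote_plus

RAW_PREFIX = "raw/"              # hypothetical prefix for newly uploaded data
PROCESSED_PREFIX = "processed/"  # hypothetical prefix for preprocessed output


def handler(event, context):
    """Lambda entry point: map S3 upload records to preprocessing tasks."""
    tasks = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if not key.startswith(RAW_PREFIX):
            continue  # ignore objects outside the raw-data prefix
        tasks.append({
            "bucket": bucket,
            "source_key": key,
            "target_key": PROCESSED_PREFIX + key[len(RAW_PREFIX):],
        })
    # A real pipeline would hand these tasks to an ETL job at this point.
    return {"queued": len(tasks), "tasks": tasks}
```

Because the handler only depends on the event payload, it can be exercised locally with a hand-written event before being deployed.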
The client’s investment consisted of ML engineers’ time and cloud resources for project execution. In a retrospective review, the client expressed satisfaction with the ROI and operational improvements achieved and highlighted better collaboration between business and technical teams. Our team recommended advancing to MLOps Advanced for further automation and optimization.
Data specialists significantly reduced their weekly operational hours and no longer rely on local scripts.
ML engineers streamlined their workflow and now spend less time on experiments and model updates.
Operational improvements made the system more reliable, reducing downtime and increasing service availability.