Spotlight on Tech

Data at scale: How object storage drives AI/ML success

by Brooke Frischemeier
Head of Product Management, Unified Cloud, Rakuten Symphony
November 25, 2025
3 minute read

Artificial intelligence and machine learning (AI/ML) are here, rapidly transforming industries, driving innovation and redefining how we interact with technology. With capabilities like predictive analytics, natural language processing, computer vision and autonomous systems, AI/ML models grow increasingly sophisticated.

But behind every groundbreaking AI model lies a critical, often overlooked component: the data. When it comes to managing the sheer volume, velocity and variety of data required to train, validate and deploy these models, object storage stands out as the natural choice.

Object storage organizes and stores data as objects, each of which bundles the data itself, its metadata, and a unique identifier. This approach yields highly scalable and efficient storage that can manage large volumes of unstructured data, such as images, videos, and backup archives. Objects live in a flat namespace rather than a directory hierarchy, and each one is addressed by its unique ID, making data easily accessible via APIs.
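To make the object model concrete, here is a minimal in-memory sketch in Python. It is an illustration only, not how any particular product is implemented: the names `StoredObject` and `FlatObjectStore`, the example keys, and the metadata fields are all hypothetical.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredObject:
    """One object: payload, descriptive metadata, and a unique identifier."""
    object_id: str
    data: bytes
    metadata: dict


class FlatObjectStore:
    """Toy store (hypothetical): a single flat mapping from key to object."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data, metadata=None):
        # The store assigns the unique ID; the caller supplies data and metadata.
        obj = StoredObject(uuid.uuid4().hex, bytes(data), dict(metadata or {}))
        self._objects[key] = obj
        return obj.object_id

    def get(self, key):
        return self._objects[key]

    def list_keys(self, prefix=""):
        # There are no directories: "images/" is only a shared key prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))


store = FlatObjectStore()
store.put("images/cat-001.jpg", b"<jpeg bytes>", {"label": "cat", "split": "train"})
store.put("images/dog-001.jpg", b"<jpeg bytes>", {"label": "dog", "split": "train"})
store.put("logs/2025-11-25.txt", b"started", {"source": "trainer"})

print(store.list_keys("images/"))
print(store.get("images/cat-001.jpg").metadata)
```

The slashes in the keys look like folders, but the store never treats them as such. Listing by prefix is a plain string match over one flat set of keys, which is part of why this model avoids the metadata overhead of deep directory trees.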

Traditional storage solutions often struggle to meet the demands of modern AI/ML workloads. File systems can become bottlenecks when handling petabytes of data, struggling with metadata overhead and scalability issues. Block storage, while fast, is typically expensive and less flexible for the massive, unstructured datasets that AI/ML thrives on.

Why object storage shines

Here are a few of the reasons why object storage is the best choice for AI/ML applications:

  • Massive scalability for massive data: AI/ML models are data-hungry. Whether it's vast image datasets for computer vision, extensive text datasets for language processing or sensor data streams for IoT analytics, object storage can scale almost without limit to accommodate these ever-growing datasets. You can store billions of objects, from kilobytes to terabytes in size, without complex capacity planning.
  • Handle unstructured and semi-structured data with ease: The majority of data used in AI/ML is unstructured – images, videos, audio files, text documents and log files. Object storage is purpose-built for this, treating each piece of data as an independent object with its own metadata, making it easy to store, retrieve and manage diverse data types.
  • Cost-effectiveness at scale: Storing vast amounts of data can be prohibitively expensive with traditional methods. Object storage offers a highly cost-effective solution, often leveraging commodity hardware and efficient storage techniques such as erasure coding, making it economical to retain large datasets for extended periods, which is crucial for iterative model training and historical analysis.
  • Accessibility and API-driven workflows: Object storage typically offers simple, RESTful APIs (such as the S3 API), making it easy for AI/ML applications, data scientists and developers to programmatically access and manipulate data. This API-driven approach integrates seamlessly with cloud-native tools, orchestration platforms like Kubernetes, and popular AI/ML frameworks.
  • Data versioning and immutability: Training AI models often requires experimenting with different datasets or model versions. Object storage's versioning capabilities allow data scientists to track changes, revert to previous states, and ensure data immutability for reproducible research and compliance.
  • Global distribution and collaboration: With distributed teams and global data sources, object storage can facilitate data sharing and collaboration by making datasets accessible from anywhere, often with built-in replication and geo-distribution capabilities.
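The versioning point above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a real object storage API: `VersionedBucket` is a hypothetical name, and the integer version IDs are a simplification (real services typically return opaque version identifiers).

```python
class VersionedBucket:
    """Toy versioned store (hypothetical): every put appends a new version."""

    def __init__(self):
        # key -> list of (version_id, data); earlier entries are never modified
        self._history = {}

    def put(self, key, data):
        versions = self._history.setdefault(key, [])
        version_id = len(versions) + 1  # assumption: simple counter per key
        versions.append((version_id, bytes(data)))
        return version_id

    def get(self, key, version_id=None):
        versions = self._history[key]
        if version_id is None:
            return versions[-1][1]  # latest version by default
        for vid, data in versions:
            if vid == version_id:
                return data
        raise KeyError(f"{key} has no version {version_id}")


bucket = VersionedBucket()
v1 = bucket.put("datasets/train.csv", b"id,label\n1,cat\n")
v2 = bucket.put("datasets/train.csv", b"id,label\n1,cat\n2,dog\n")

# A training run that recorded v1 can re-read exactly the bytes it trained on,
# even though the key has since been overwritten with newer data.
original = bucket.get("datasets/train.csv", version_id=v1)
latest = bucket.get("datasets/train.csv")
print(v1, v2, original != latest)
```

Recording the version ID alongside each trained model is what makes an experiment reproducible: the dataset key can keep evolving while older runs still point at the exact data they used.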

Key use cases powered by object storage

The flexibility and scalability of object storage extend far beyond AI/ML, making it a critical component across numerous enterprise and telecommunications workloads:

  • Data lake for network analytics and AI/ML: Object storage serves as the foundational data lake for storing massive, diverse network data for operational intelligence, predictive maintenance, fraud detection and AI/ML model training and inferencing.
  • IoT data ingestion and management: Object storage efficiently ingests, stores, and manages vast streams of time-series data and events from geographically dispersed IoT devices and sensors, providing the raw material for real-time analytics and long-term trend analysis.
  • Content delivery network (CDN) and media storage: Object storage acts as the scalable and high-performance backend for CDN nodes, storing and delivering high-quality video, audio, and rich media content, ensuring low-latency access for subscribers.
  • User data storage: Object storage securely stores and manages user-generated content, profiles, preferences, and other data for various digital services, ensuring high availability, scalability and compliance.
  • Backup and archiving for critical applications and data: Object storage provides secure, compliant and cost-effective long-term data archival and disaster recovery for critical telco applications, subscriber data, billing records and operational databases.
  • Regulatory compliance data storage: Object storage meets stringent regulatory requirements for data retention, immutability and access control for sensitive customer, operational, and financial data.
  • Cloud-native application storage: Object storage serves as the primary, API-driven storage for unstructured data generated by cloud-native applications, including network function virtualization (NFV) components, orchestration platforms, business support systems (BSS), and operational support systems (OSS).
  • Log management and analysis: Object storage offers centralized, highly scalable storage for application, network, and system logs, enabling efficient troubleshooting, security auditing, performance monitoring and compliance reporting.
  • Software repository: Object storage hosts software images, updates, configuration files, container images, and other binaries for network functions, edge devices and cloud-native applications, facilitating efficient deployment and management.
  • Edge computing data aggregation: Object storage efficiently aggregates and stores vast amounts of data collected from numerous edge computing sites and remote network elements, preparing it for local processing or reliable transfer to core data centers and cloud environments.

Rakuten cloud-native object storage

For service providers and enterprises navigating the complex landscape of on-demand analytics, edge computing, 5G and IoT, a robust data infrastructure is paramount. They require a storage solution that not only meets the current needs of AI/ML but can also evolve with future innovations.

This is precisely where Rakuten Cloud-Native Storage object store steps in. Built from the ground up for Kubernetes environments and featuring a centralized metadata store, it delivers the superior scalability, data consistency and operational efficiency service providers and enterprises need to power their AI/ML initiatives, accelerate service innovation and unlock the full potential of their vast, unstructured data assets.

Key takeaways

The growth in data generated and consumed by AI/ML, as well as by other data-intensive digital systems, makes it crucial to select the right storage system. Object storage suits these applications better than traditional file and block storage because of its scalability, cost-effectiveness and simple, API-based data access. With object storage built into Rakuten Cloud-Native Storage, customers gain a solution built for Kubernetes that offers scalability, data consistency, and operational efficiency.
