Distributed storage is a data storage technology that uses the network to combine the disk space of individual machines across an enterprise into a single virtual storage device, so that data is spread throughout the infrastructure rather than held in one place. A traditional network storage system keeps all data on a centralized storage server. That server becomes the performance bottleneck and the single point on which reliability and security depend, so it cannot meet the needs of large-scale storage applications. A distributed network storage system instead adopts a scalable architecture: multiple storage servers share the storage load, and a location server tracks where each piece of data is stored. This improves the reliability, availability, and access efficiency of the system, and also makes it easy to expand.
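The division of labor between storage servers and a location server can be illustrated with a minimal Python sketch. It assumes hash-based placement, which is only one possible strategy and is not stated in the text; the class names LocationServer and StorageCluster are hypothetical, and in-memory dictionaries stand in for real storage servers.

```python
import hashlib


class LocationServer:
    """Hypothetical location server: decides which storage server holds a key."""

    def __init__(self, servers):
        self.servers = list(servers)

    def locate(self, key):
        # Hash the key so that objects spread roughly evenly across servers.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.servers)
        return self.servers[index]


class StorageCluster:
    """Several storage servers (plain dicts here) sharing the storage load."""

    def __init__(self, server_names):
        self.locator = LocationServer(server_names)
        self.stores = {name: {} for name in server_names}

    def put(self, key, value):
        # Ask the location server where the object belongs, then store it there.
        server = self.locator.locate(key)
        self.stores[server][key] = value
        return server

    def get(self, key):
        # The same lookup finds the object again without scanning every server.
        server = self.locator.locate(key)
        return self.stores[server][key]


if __name__ == "__main__":
    cluster = StorageCluster(["server-a", "server-b", "server-c"])
    for i in range(9):
        cluster.put(f"object-{i}", f"payload-{i}")
    for name, store in cluster.stores.items():
        print(name, sorted(store))
```

A simple modulo placement like this moves most objects when a server is added, so systems that expand often tend to use schemes such as consistent hashing or an explicit placement table instead.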
As shown in the figure, distributed storage runs on ordinary servers rather than dedicated storage devices. It takes over the storage resources on those servers through its own non-standard protocols, pools and virtualizes them, and finally presents the result to users as storage space (block or file storage). Because distributed storage uses its own protocol rather than a standard one, its client software must be installed on the application server; that client is all that is needed to present the virtual storage space and process requests.
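Pooling and virtualization can be sketched as an address translation: the client sees one contiguous block volume, and each logical block address is mapped to a block on some server. The Python sketch below assumes the simplest layout, where server capacities are concatenated end to end; StoragePool and translate are hypothetical names, and real systems typically stripe or hash blocks across servers rather than concatenating them.

```python
class StoragePool:
    """Pools the blocks contributed by each server into one virtual volume."""

    def __init__(self, capacities):
        # capacities: mapping of server name -> number of blocks it contributes.
        self.extents = []
        start = 0
        for server, blocks in capacities.items():
            # Each server owns the half-open range [start, start + blocks).
            self.extents.append((start, start + blocks, server))
            start += blocks
        self.total_blocks = start

    def translate(self, lba):
        """Map a logical block address to (server, local block number)."""
        if not 0 <= lba < self.total_blocks:
            raise IndexError(f"block {lba} is outside the virtual volume")
        for start, end, server in self.extents:
            if start <= lba < end:
                return server, lba - start


if __name__ == "__main__":
    pool = StoragePool({"server-a": 100, "server-b": 50})
    print(pool.total_blocks)    # 150
    print(pool.translate(20))   # ('server-a', 20)
    print(pool.translate(120))  # ('server-b', 20)
```

The client software described above would perform this kind of translation before sending each request to the server that owns the block.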
Because a distributed storage topology is complex and involves many components, the probability that some component fails is much higher. Distributed storage therefore needs network-based data redundancy, data protection, and data fault tolerance to keep the storage system available and reliable under abnormal conditions (such as the failure of disks, network cards, switches, or servers).
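One common form of network-based redundancy is replication: every object is written to more than one server, so a read can still succeed after a server fails. The following Python sketch assumes two replicas placed on neighboring servers chosen by a CRC32 hash; ReplicatedStore and its methods are hypothetical names, and other schemes such as erasure coding are equally possible.

```python
import zlib


class ReplicatedStore:
    """Keeps each object on `replicas` different servers to survive failures."""

    def __init__(self, servers, replicas=2):
        if replicas > len(servers):
            raise ValueError("need at least as many servers as replicas")
        self.servers = list(servers)
        self.replicas = replicas
        self.data = {name: {} for name in self.servers}
        self.failed = set()

    def targets(self, key):
        # Pick a starting server by hash, then take the next servers in order.
        start = zlib.crc32(key.encode("utf-8")) % len(self.servers)
        return [self.servers[(start + i) % len(self.servers)]
                for i in range(self.replicas)]

    def fail(self, server):
        # Simulate a server (or its disk, network card, or switch) going down.
        self.failed.add(server)

    def put(self, key, value):
        written = 0
        for server in self.targets(key):
            if server not in self.failed:
                self.data[server][key] = value
                written += 1
        if written == 0:
            raise IOError(f"no healthy replica available for {key!r}")
        return written

    def get(self, key):
        # Try each replica in turn and return the first healthy copy.
        for server in self.targets(key):
            if server not in self.failed and key in self.data[server]:
                return self.data[server][key]
        raise KeyError(key)


if __name__ == "__main__":
    store = ReplicatedStore(["node-1", "node-2", "node-3"], replicas=2)
    store.put("volume-7/block-42", b"data")
    store.fail(store.targets("volume-7/block-42")[0])
    print(store.get("volume-7/block-42"))  # still readable from the second copy
```

The sketch does not re-replicate data after a failure; a production system would also rebuild the lost copies on a healthy server to restore the original redundancy level.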
Application scenarios
Key business databases (OLTP / OLAP)
The system architecture is designed for flash memory and supports a distributed active-active configuration. It delivers high performance and low latency, keeping key business databases (OLTP) and data warehouses (OLAP), such as Oracle RAC, running efficiently and stably.
Virtualization / cloud resource pool
It builds a unified storage resource pool with large single-disk capacity (100 TB level) for virtualization and cloud environments such as VMware and OpenStack, supports smooth expansion without affecting business operations (cluster size up to 4096 nodes), and helps enterprise IT evolve toward a cloud architecture.
High availability architecture
It gives physical machines read and write performance close to that of a local SSD, lets multiple physical machines mount the same data volume, and separates computing from storage, so changes to compute nodes do not affect the reliability or security of the data.
Container and AI applications
Through Kubernetes CSI and other container storage solutions, it provides stable, persistent storage for stateful services such as MySQL. In deep learning scenarios, it provides large-capacity shared storage for front-end computing clusters, and its high IOPS and high bandwidth keep applications running smoothly.