Apart from IPFS, how much do you know about Blockchain Storage?


The InterPlanetary File System (IPFS) is a protocol, hypermedia, and file-sharing peer-to-peer network for storing and sharing data in a distributed file system. IPFS uses content-addressing to uniquely identify each file in a global namespace connecting IPFS hosts.
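As a rough illustration of content-addressing, the sketch below derives a file's identifier from its bytes rather than from its location. It is a simplification under stated assumptions: a plain SHA-256 hex digest stands in for a real IPFS content identifier, and the names `content_address` and `ContentStore` are hypothetical, not part of any IPFS API.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a stand-in for an IPFS identifier."""
    return hashlib.sha256(data).hexdigest()


class ContentStore:
    """Toy in-memory store keyed by content hash (hypothetical, not the IPFS API)."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        address = content_address(data)
        self._blocks[address] = data  # identical content always maps to the same key
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # The address doubles as an integrity check on retrieval.
        if content_address(data) != address:
            raise ValueError("stored block does not match its address")
        return data


store = ContentStore()
first = store.put(b"hello, storage")
second = store.put(b"hello, storage")
print(first == second)  # the same bytes yield the same address
```

Because the address depends only on the content, any host holding the same bytes can serve the request, and the requester can verify what it receives.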

A blockchain is a type of distributed ledger technology (DLT) that consists of a growing list of records, called blocks, that are securely linked together using cryptography.

Tempted by a data storage market said to be worth trillions, many entrepreneurs have set their sights on this business opportunity.

Many blockchain projects offering decentralized storage are reportedly on the market, such as Storj, Sia, Factom, MaidSafe, and Genero. The IPFS protocol is the best known, and although its design has not been fully implemented, it has already attracted many fans (Read more: Introduction to Technical Framework and Applications of IPFS).

Wang Donglin has worked in the storage industry for nearly 10 years. He said that anyone confused by terminology such as storage certification mechanisms and network protocols should simply look past the thick clouds hanging over decentralized storage projects. “As far as storage itself is concerned, decentralized storage is still a continuation of classic technology.”

In other words, anyone who reviews the classic data storage practices of the past can naturally understand how decentralized storage solutions improve on them, and why the industry believes the era of “blockchain storage surpassing cloud storage” has already arrived.

Revolution and Reform of Cloud Storage

The history of storage devices can be traced back to the birth of the first computer: there is no computer without storage. The era of storage as a service for the public can be traced back to 2006, when e-commerce giant Amazon released its S3 storage service.

Because it was easy to use, the cloud infrastructure originally built to serve Amazon itself was adopted by many enterprises with heavy data storage needs, and the cloud storage industry gradually took shape.

After more than a decade of development, cloud storage has grown from a tiny market into a huge one. According to data released by IDC China, China's cloud management service market reached US$307 million in 2018, up 131.4% year on year. The forecast puts the compound growth rate of the whole market at 70.8% from 2018 to 2023, with the market size jumping to US$4.66 billion by 2023.

Worldwide, the cloud storage market will reach 10 billion, led by technology and Internet giants such as Amazon, Microsoft, Google, Aliyun, and Tencent. If traditional enterprise storage is included, the entire market is about US$70 billion, with traditional IT giants such as Dell/EMC, NetApp, IBM, HPE, HDS, and Huawei.

In fact, cloud storage first disrupted traditional storage. Wang Donglin said that a friend at a big storage equipment company told him they would refuse “business with a gross margin of less than 85%, that is, where only $15 of every $100 in sales income is spent on purchasing hardware.” Compared with traditional storage, the price of cloud storage services is far more attractive.

However, although cloud storage service providers can draw on strong capital and resources to serve enterprises by building multiple data centers around the world, a number of problems have still emerged.

The first one is, of course, the technical problem.

Here we need to distinguish two concepts: redundancy and fault domain isolation. A centralized storage solution can improve data reliability by making a single system more reliable, for instance by adding more hard drives or more data centers.

“Improving the reliability of a single system may run into bottlenecks. In that case, redundancy and fault domain isolation are needed to improve reliability,” Wang Donglin explained. “Redundancy ensures that the data can still be read in full even if some of it is lost. Fault domain isolation limits the scope of a fault to a small range.”

For example, the annual failure rate of hard disks is around 1% (the nominal figure is lower, but the actual figure is slightly higher). “That figure has probably reached its limit, and it cannot go any lower.”

The solution for traditional enterprise storage is to “distribute data across multiple hard drives and allow one (RAID5) or two (RAID6) hard drives to fail without any data loss.”
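The single-drive tolerance of RAID5 rests on XOR parity, which can be sketched in a few lines. This is a minimal illustration, not a RAID implementation: the function names are hypothetical, real RAID5 rotates the parity block across drives, and RAID6 adds a second, differently computed parity block that is not shown here.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks plus one parity block (single parity, RAID5-style)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recover the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


stripe = make_stripe([b"abcd", b"efgh", b"ijkl"])  # three data "drives", one parity
print(rebuild(stripe, 1) == b"efgh")  # the lost drive is rebuilt from the other three
```

Any one block, data or parity, can be rebuilt this way, because XOR-ing all blocks of a stripe together always yields zero.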

“Continuing along this route leads to distributed storage systems, in which multiple servers at the same location provide redundancy and fault domain isolation at the level of the storage server. Cloud storage service providers even spread data across different cabinets to achieve redundancy and fault domain isolation at the cabinet level,” Wang Donglin said. “However, the reliability of a single data center has now hit a bottleneck as well. The solution is to continue along the same technical route and achieve redundancy and fault domain isolation across different locations.”
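One way to picture fault domain isolation is as a placement rule: no two copies of the same data may share a fault domain, whether that domain is a server, a cabinet, or a site. The sketch below is an assumption-laden toy; the function `place_replicas` and the node and rack names are made up for illustration.

```python
def place_replicas(nodes, copies):
    """Pick `copies` nodes so that no two share a fault domain.

    nodes: list of (node_id, fault_domain) pairs, in order of preference.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    chosen, used_domains = [], set()
    for node_id, domain in nodes:
        if domain in used_domains:
            continue  # a second copy in the same domain would fail together with the first
        chosen.append(node_id)
        used_domains.add(domain)
        if len(chosen) == copies:
            return chosen
    raise ValueError("not enough distinct fault domains")


cluster = [("n1", "rack-a"), ("n2", "rack-a"), ("n3", "rack-b"), ("n4", "rack-c")]
print(place_replicas(cluster, 3))  # one node per rack
```

Replacing the rack labels with site labels gives the multi-location case described in the quote.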

At that point, it has in effect already become “decentralized storage”. If blockchain incentives are added, it evolves into “blockchain storage”.

“The concept of distributed storage has already appeared in the storage industry,” Wang Donglin said. With the spread of decentralization and sharing concepts, new business models like Airbnb and Uber have become popular. The storage industry can certainly adopt the same approach to make full use of the storage resources of ordinary users.

The blockchain incentive system can also play a very useful role. It can encourage miners to join and quickly build a huge storage pool covering the whole world; punish storage nodes that fail to provide the promised services, so as to guarantee the quality of the storage service; and attract more users, which greatly reduces costs.

Data redundancy, fault domain isolation, and supervision

“Practitioners in the storage industry should hold to one core value: data itself is alive, and they must be responsible for the security and reliability of users’ data,” Wang Donglin said, adding that decentralized storage solutions should follow this principle as well.

Wang Donglin also said that however much storage solutions are improved and innovated, key points such as reliability, security, redundancy, cost, availability, data deduplication, and DDoS protection must still be taken into consideration.

The reliability, security, and cost of data are the most important of all.

“For example, compare data to a bank deposit. If the user urgently needs money one day and the bank says the machine is out of service due to a malfunction, that is an availability problem; if the deposit amount is disclosed, that is a security problem; if the money is gone, that is a reliability problem.”

As basic storage practice dictates, redundancy is required to improve data reliability. “To ensure data reliability, existing cloud storage service providers usually store three copies of the data, that is, the data redundancy rate is 300%.”
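The trade-off behind three copies can be put into rough numbers using the approximately 1% annual hard disk failure rate cited earlier in the article. The sketch assumes that copies fail independently and that nothing is repaired during the year; both are simplifications, and the function names are hypothetical.

```python
def loss_probability(annual_failure_rate: float, copies: int) -> float:
    """Chance that every copy is lost within one year, assuming independent
    failures and no repair in between (both simplifying assumptions)."""
    return annual_failure_rate ** copies


def storage_overhead_percent(copies: int) -> int:
    """Raw capacity consumed per unit of user data, as a percentage."""
    return copies * 100


for copies in (1, 2, 3):
    print(copies, storage_overhead_percent(copies), loss_probability(0.01, copies))
```

Under these assumptions, three copies bring the yearly chance of losing a piece of data to roughly one in a million, at the price of storing everything three times. Prompt repair improves on this estimate, while correlated failures, such as copies sharing a cabinet or a site, make it worse.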

In addition to redundancy, fault domain isolation is also necessary.

“The original practice of fault isolation is to ensure that other data is not affected even if some of it has been damaged. After all, it is unrealistic to keep reducing the failure rate of hard disks indefinitely,” Wang Donglin said.

In addition, the data needs to be monitored so that it can be reconstructed as soon as problems occur.

“Data redundancy, fault domain isolation, heartbeat supervision, and data reconstruction are all things that professional storage service providers take into account. Even when storage is integrated with the blockchain, all of these aspects remain critically important,” said Wang Donglin.
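Heartbeat supervision and data reconstruction can be sketched together: nodes report in periodically, nodes that stay silent for too long are treated as failed, and any data that has dropped below its target copy count is scheduled for rebuilding. The class and function names below are hypothetical, and timestamps are passed in explicitly to keep the example deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last time each storage node reported in."""

    def __init__(self, timeout_seconds: float):
        self.timeout = timeout_seconds
        self.last_seen = {}

    def beat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> set:
        """Nodes whose last heartbeat is older than the timeout."""
        return {n for n, t in self.last_seen.items() if now - t > self.timeout}


def plan_reconstruction(placement, failed, target_copies):
    """placement maps a block id to the set of nodes holding a copy.

    Returns block id -> number of copies that must be re-created.
    """
    plan = {}
    for block, nodes in placement.items():
        alive = len(nodes - failed)
        if alive < target_copies:
            plan[block] = target_copies - alive
    return plan


monitor = HeartbeatMonitor(timeout_seconds=30)
for node in ("a", "b", "c"):
    monitor.beat(node, now=0)
monitor.beat("a", now=50)
monitor.beat("b", now=55)
failed = monitor.failed_nodes(now=60)  # "c" has been silent for 60 seconds
print(plan_reconstruction({"x": {"a", "b", "c"}}, failed, target_copies=3))
```

A real system would also decide where the new copies go, which is where the redundancy and fault domain rules come back in.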

Cost Performance of Blockchain Storage

“Company data used to be stored on hard disks, so people always worried that a disk might fail. Once cloud storage had developed, they started storing data on servers, but then they worried about a failure of the entire server system. Accidents such as natural disasters can happen, so some enterprises began extending their data backups from a single location to multiple locations, for fear that all the data might be corrupted at once.”

Decentralized storage ensures data security to a certain extent, but the question is who should pay for all those data centers. “Even the world’s largest cloud service provider, Amazon, has only dozens of nodes worldwide. Centralized organizations still face heavy financial pressure when building data centers.”

With the emergence of blockchain incentive systems, this situation may change.

Wang Donglin said that applying blockchain to cloud storage can effectively lower the barrier to entry for data storage. “With the incentive mechanism, more nodes can participate, which makes the entire decentralized system work more effectively.”

Beyond its incentive system, the blockchain is an innovation that can bring a new decentralized model to the whole data storage industry. “The distributed concept in the storage industry used to be reflected in only a few locations. With the arrival of blockchain, every user is encouraged to contribute their own storage devices to the entire decentralized cloud storage ecosystem.”

The concept of distributed storage in the original solutions is also gradually being replaced by decentralized cloud storage. Wang Donglin believes that “no matter what stage the storage industry has reached, it’s always a natural continuation of the classic solutions.”

Implementation Challenge

Wang Donglin said that the relationship between the project side and the miner manufacturers is somewhat similar to that between Google search and websites: to make their content easier for Google’s engine to find, some people optimize their websites for search (SEO).

“In general, the emergence of professional miners will indeed have a positive effect on the ecosystem of an entire decentralized storage project.” However, as to whether the miners’ performance will meet the claimed level, we can only “wait and see”.

Compared with the original solutions, the advantages of a decentralized storage solution are obvious, but for a blockchain project dedicated to B2B, enterprise-level services, implementation will be a big challenge.

What enterprises care about most is whether storage services will become cheaper. They are not yet fully aware of what blockchain storage means.

However, the need for blockchain storage is already visible in news reports such as “Tencent cloud may suffer from data loss of startup companies due to Aliyun failure.”

But no matter how attractive the blockchain becomes, implementation remains the key factor.

Many documents hyping decentralized storage projects emphasize the original sin of centralized systems and declare that decentralized storage is safer than centralized storage. Wang Donglin believes this is factually wrong. “Without data encryption, a decentralized solution that exposes users’ data is actually less secure.”

Moreover, the industry faces another problem, the “dilemma of encryption and deduplication”. According to Wang Donglin, one project has made a breakthrough in this regard: “TruPrivacy technology can achieve zero-knowledge encryption and cross-user deduplication.”

“The whole industry hasn’t reached the level of fierce competition. I hope that more similar projects can form alliances so that the whole industry can develop together and eventually compete with the centralized organizations such as Amazon.” Wang Donglin said with full confidence.
