Rise of the Digital Revolution
For quite some time now, the IT industry has been promoting the move from analog to digital life, for both personal and business use. This digital revolution affects all of us, whether we like it or not, and it promises to make everyday life easier, more productive and more efficient. As consumers, we will certainly enjoy the benefits of a digital existence. Promising as this sounds, the IT industry faces significant challenges in making the dream come true.
Data Deluge
According to the 2014 IDC report, the digital universe grew exponentially in 2013 while opening up a huge range of digital opportunities. It held 4.4 zettabytes, or 4.4 billion terabytes, of data in 2013 and is expected to grow tenfold to 44 zettabytes by 2020. To put things in perspective, suppose all of the 2013 data were stored on 0.29”-thick tablets with 128GB of storage each, and the tablets were stacked on top of one another: the stack would cover roughly 2/3 of the distance from the earth to the moon. By 2020, there would be 6.6 such stacks.
Source: EmergingTechBlog
To put it another way, 44 zettabytes is equivalent to listening to music for 88 billion years or watching HD video for 1.4 billion years continuously.
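The tablet-stack comparison can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope calculation: it assumes decimal (SI) units and an average earth-to-moon distance of 384,400 km, neither of which is stated in the report summary above, and the function names are made up for illustration.

```python
# Back-of-the-envelope check of the tablet-stack comparison.
# Assumptions (not from the article): decimal SI units and an average
# earth-to-moon distance of 384,400 km.

ZB = 10**21  # bytes in one zettabyte
TB = 10**12  # bytes in one terabyte
GB = 10**9   # bytes in one gigabyte

TABLET_CAPACITY_BYTES = 128 * GB
TABLET_THICKNESS_M = 0.29 * 0.0254  # 0.29 inch converted to metres
EARTH_MOON_M = 384_400_000          # assumed average distance in metres


def terabytes(zettabytes: float) -> float:
    """Convert zettabytes to terabytes."""
    return zettabytes * ZB / TB


def moon_stacks(zettabytes: float) -> float:
    """Earth-to-moon stacks of 128GB tablets needed to hold the data."""
    tablets = zettabytes * ZB / TABLET_CAPACITY_BYTES
    return tablets * TABLET_THICKNESS_M / EARTH_MOON_M


if __name__ == "__main__":
    for year, size_zb in ((2013, 4.4), (2020, 44)):
        print(f"{year}: {terabytes(size_zb):.2e} TB, "
              f"{moon_stacks(size_zb):.2f} stacks")
```

Under these assumptions, the 2013 figure comes out at about two-thirds of one stack and the 2020 figure at roughly 6.6 stacks, in line with the report.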
The Growth of Internet of Things
Given these figures, one may wonder how it is possible to generate so much digital data. The explanation is simple. Two major contributors are responsible for this data: consumers and enterprises. There are 20 billion devices in the so-called “Internet of Things”, ranging from laptops, smartphones, wearable devices and TVs to appliances like dishwashers and refrigerators, as well as smart cars, traffic sensors, infotainment systems, gas station pumps, robotic machines, smart buildings and so on. This will grow to 32 billion devices in 2020, or about 4 devices per person, assuming the world population reaches 7.7 billion in 2020. All of these devices generate and consume a lot of data. According to recent studies, in every minute of 2014, 50 thousand hours of video were streamed on Netflix, 4 million searches were received by Google and 277 thousand tweets were posted.
With this explosive growth of data from so many digital sources, the digital universe is becoming too big to handle, which poses a major challenge for the IT industry as a whole. According to the same IDC report, of the 4.4 zettabytes of data created in 2013, only 15% was managed by consumers, while enterprises and IT were responsible for the remaining 85%.
Source: DellEMC
Big Data Storage Challenges
To understand the challenges that the IT industry and enterprises face, we first need to understand and classify the demands of “Big Data”. Gartner defines Big Data in terms of 3 “V”s: Volume, Velocity and Variety. In 2014, bestselling author Bernard Marr expanded the 3 “V”s to 5 “V”s by adding Veracity and Value.
Volume
Volume refers to the vast amounts of data generated every second. Just think of all the emails, Twitter messages, photos, video clips, sensor data and other content that we create and share every second. We are not talking about terabytes but zettabytes. On Facebook alone, we send 10 billion messages per day, click the “like” button 4.5 billion times and upload 350 million new pictures each and every day. If we take all the data generated in the world between the beginning of time and 2008, the same amount of data will soon be generated every minute! This ever-increasing volume makes data sets too large to store and analyze with traditional database technology. To handle it, we need highly scalable storage systems in which parts of the data are stored in different locations and brought together by software.
Velocity
Velocity is the speed at which new data is generated and also the speed at which data moves around. Just think of social media messages going viral in seconds, the speed at which credit card transactions are checked for fraudulent activities, or the milliseconds it takes trading systems to analyze social media networks to pick up signals that trigger decisions whether to buy or sell shares. Big data technology now requires us to retrieve and analyze data while it is being generated, with the lowest possible latency, measured in milliseconds or even microseconds.
Variety
Variety refers to the different types of data we can now use. In the past, we simply focused on structured data that neatly fits into tables or relational databases, such as financial data (e.g. sales by product or region). In fact, 80% of the world’s data is now unstructured and therefore cannot easily be put into tables (think of photos, video sequences or social media updates). Big Data requires us to harness different types of data, both structured and unstructured, including messages, social media conversations, photos, sensor data, and video or voice recordings, and to bring them together with more traditional, structured data.
Veracity
With so many forms of big data, quality and accuracy are not easy to control (the best examples are Twitter posts with hashtags, abbreviations, typos and colloquial speech, as well as the questionable reliability and accuracy of the content itself). On top of that, all these bits and bytes are read from and written to storage systems that are prone to bit errors, not to mention the possibility of hardware failures. Data that moves this fast must still be stored and delivered accurately.
Value
Finally, there is the last V to take into account when looking at Big Data: Value! It is all well and good having access to big data, but unless we can turn it into value, it is useless. So you can safely argue that “value” is the most important V of Big Data. It is important that businesses make a business case for any attempt to collect and leverage big data. It is all too easy to fall into the buzz trap and embark on big data initiatives without a clear understanding of the costs and benefits.
With all these challenges, the IT industry and enterprises should rethink and reinvent their future storage technology, both hardware and software, so that the two work together cohesively and seamlessly.
Storage Hardware
We’ve certainly come a long way since the computer revolution in the 20th century. In the 1960s, a single computer filled an entire room, yet it could perform only basic calculations. Come to think of it, that is just a small fraction of what our current smartphones can do. Thanks to Moore’s Law, semiconductors keep getting better year after year and decade after decade.
However, when it comes to storage technology specifically, we are still dealing with tape drives and mechanical rotating drives, technologies that are more than half a century old.
Flash to the Rescue
The chart below shows how the gap has widened: DRAM and CPUs keep getting faster, but tape and rotating hard drives cannot keep up with Moore’s Law. Fortunately, flash technology helps close this gap. In terms of performance, a current SATA SSD can deliver roughly 100,000 IOPS, while a 15K RPM enterprise-level hard drive is only capable of about 200 IOPS. That is roughly a 500-fold increase in performance over a hard drive. The hard drive is still the winner in terms of capacity: at the time of writing, 4TB, 6TB or even 8TB hard drives can be found easily, while the highest-capacity SSDs are around 4TB. When it comes to storage density, however, flash SSDs definitely take the crown. Moreover, in the next 12 to 24 months, all major SSD players are expected to offer 8TB or even 12TB SSDs.
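To make the IOPS comparison concrete, the short sketch below computes the ratio between the two quoted figures and the time each drive would need for a batch of random reads. It is a simplified illustration: it assumes each drive sustains its quoted IOPS for the whole batch, and the one-million-request workload is an arbitrary example rather than a benchmark.

```python
# IOPS figures as quoted in the text; real drives vary by model and workload.
SSD_IOPS = 100_000  # SATA SSD
HDD_IOPS = 200      # 15K RPM enterprise hard drive


def speedup(fast_iops: float, slow_iops: float) -> float:
    """How many times more operations per second the faster device handles."""
    return fast_iops / slow_iops


def seconds_for(requests: int, iops: float) -> float:
    """Time to serve a number of random I/O requests at a sustained IOPS rate."""
    return requests / iops


if __name__ == "__main__":
    requests = 1_000_000  # arbitrary example workload
    print(f"SSD vs HDD: {speedup(SSD_IOPS, HDD_IOPS):.0f}x")
    print(f"SSD: {seconds_for(requests, SSD_IOPS):.0f} s")
    print(f"HDD: {seconds_for(requests, HDD_IOPS):.0f} s")
```

With these numbers, one million random reads take about 10 seconds on the SSD and well over an hour on the hard drive.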
So, does this mean that tape and HDDs will become extinct? Probably not, at least not in the next few years. Hard drives and tapes currently cost a fraction of what flash SSDs cost. But SSDs may catch up soon, according to Gartner’s chart below. The year 2017 is probably when enterprise HDDs and entry-level SSDs will reach price parity. We should also keep in mind that by 2017, low-cost data center HDDs will probably still cost only a fraction (around 20%) of what SSDs cost. Rotating drives may therefore remain viable in the market for some time.
At Premio, we embrace both hard drive and flash drive technology, the former for its cost and the latter for its performance. The key is to use each technology in the domain where it excels and to combine them into hybrid storage. Flash drives are perfect for hot data, such as Tier 1 or “most used” cached data, while rotating drives are more cost-effective for cold or long-term storage, where performance is not critical and cost per GB matters more.
This is part of the reason why our product family includes FlacheStreams, which is dedicated to flash disk arrays, and ScaleStreams for large backup storage. What’s more, some of our products, like DuraStreams and OmniStreams, combine flash and hard drives in the same box for a hybrid storage solution.
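The hot/cold split behind hybrid storage can be illustrated with a small sketch. This is not how any Premio product decides placement; it is a minimal example that assumes access frequency is the only signal, and the names `DataObject`, `choose_tier` and `plan_placement`, as well as the threshold value, are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical threshold: objects read at least this often per day are "hot".
HOT_READS_PER_DAY = 100


@dataclass
class DataObject:
    name: str
    reads_per_day: int


def choose_tier(obj: DataObject, hot_threshold: int = HOT_READS_PER_DAY) -> str:
    """Return "flash" for hot data and "hdd" for cold data."""
    return "flash" if obj.reads_per_day >= hot_threshold else "hdd"


def plan_placement(objects):
    """Group object names by the tier they would be placed on."""
    placement = {"flash": [], "hdd": []}
    for obj in objects:
        placement[choose_tier(obj)].append(obj.name)
    return placement


if __name__ == "__main__":
    # Example catalog with made-up names and access counts.
    catalog = [
        DataObject("orders-db-index", 50_000),
        DataObject("2012-backup.tar", 0),
        DataObject("product-images", 800),
    ]
    print(plan_placement(catalog))
```

A real tiering policy would likely also weigh object size, recency of access and cost per GB, but the principle is the same: frequently used data goes to flash and rarely used data goes to rotating drives.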
To learn more about our server products and what they can offer for your business, contact us today!