Monster Data is Headed Our Way
Blog: Jim Sinur
If you thought Big Data was a challenge for us all, wait until the new wave of Monster Data hits us. We will have to manage it, make decisions, and build processes and applications leveraging this monster data. Just what is monster data, and how will it affect us?
Monster Data represents data that is overwhelmingly large, unduly complex, and can’t be trusted for accuracy. Typically it is composed of multiple kinds of data including structured, unstructured text, voice, image, or video. Some monster data may be unknown or emergent, making it scary to deal with for most individuals, technologies, or organizations.
Even Larger Volumes
We have long been concerned about “the IoT Awakening,” exposing large amounts of critical data that would likely need immediate attention often at the edge. While managing all the moving parts of Industry 4.0 is a challenge, we see new value chains that employ GPS, tracing, and original digital identities adding data to the mix. As organizations want to leverage data for more refined business outcomes, more data will be needed.
Organizations leverage more powerful AI and computer-analysis techniques to gain insight into human behavior using personality, social, and organizational psychology data. This need will yield data sets that are much larger than what we have today and certainly too large for traditional processes and applications. Data will likely include recorded conversations that could be processed into usable information.
More and more data is piling up from digital footprints left in social media, cell phones, business transactions in various contexts, shopping, surfing, and other devices that record our every moment, freely given or not. Sometimes this new data is just taken from sites as people pass through, leaving crumbs behind.
Even More Complex
In order to utilize technology to empower us, the data will also become much more complex and diverse. Because large data collections can be computationally analyzed to reveal new signals, patterns, and trends, the complexity of that data will have to be managed well. Organizations want to deliver insights from human behavior and interactions collected everywhere, every second of the day.
The data will come from various contexts that imply context-sensitive meaning. Hopefully, this new and emergent data will be available in the cloud, but cost and security issues will likely push it toward hybrid cloud storage. The data will likely be a hybrid of structured and unstructured data and will require new means of data management that can cope with ownership and dynamism challenges.
This complex and dynamic set of data sources will become more challenging to manage, but it is on its way to becoming a precious asset that can be leveraged by machine and deep learning. While dynamic and emergent, its use will become more stunning over time.
Even More Inaccurate
Because of speed and size alone, the accuracy of monster data will be a constant challenge. When data is combined in new ways, understanding its source, context, and ultimate meaning at all levels of granularity becomes a more critical problem for data management professionals as well as end users.
There will be ownership issues and questions about who will be held accountable for the accuracy of any data leveraged. All of this will have to be sorted and managed under the gun with the pressure of speedy results. Of course, internal data sets will have a better-understood pedigree than those data sources from outside an organization and in contexts not well understood.
As we grow to zettabytes, the amount and variety of data being accessed, collected, stored in the cloud, stored on-premises, and analyzed will keep increasing in an exponential fashion. This seems like a near-impossible task until the promise of better analysis and prediction to correct problems takes over our desires. Business outcomes will likely drive this growth in today's extremely competitive and dynamic environment.