The Unexpected Consequences of Big Data

Blog: Jim Sinur

Big Data is the unexpected resource bonanza of the current century. Moore’s Law-driven advances in computing power, the rise of cheap storage, and advances in algorithm design have enabled the capture, storage, and processing of many types of data that were previously unavailable to computing systems. Documents, email, text messages, audio files, and images can now be transformed into a usable digital format for analysis systems, especially artificial intelligence. AI systems can scan massive amounts of data and find both patterns and anomalies that were previously unthinkable, and do so in a timeframe that was once unimaginable. While most uses of Big Data have been coupled with AI/machine learning algorithms so that companies can understand their customers’ choices and improve their overall experience (think of recommendation engines, chatbots, navigation apps, and digital assistants, among others), there are also uses that are truly industry-transforming.

In healthcare, big data and analytics are helping the industry move from a pay-for-service model, which reimburses hospitals, physicians, and other caregivers after a service is performed, to a new approach that reimburses them based on the outcome of the service, specifically the post-service health of the patient. This approach is only possible if there is enough data to understand how the patient compares to the vast population of other patients who have had the same procedure or service, and what outcome to expect. While a variety of other factors are involved, such as the patient’s cooperation with the treatment plan, those factors can be tracked and analyzed as well, providing a clear, evidence-based path to best practices and expected results. When this is combined with diagnostic improvements made possible by using AI to find patterns in blood and tissue samples, or to scan radiology images and detect anomalies, the physician’s ability to determine the exact issue and suggest the best treatment pathway for a given situation is unparalleled. The expected result for society in this example is a dramatic increase in efficiency and, with it, a lower cost of service. However, the same technologies that deliver these unparalleled benefits are also capable of providing the platform for a previously unimaginable set of fraudulent uses.

Examples of Issues

An interesting case of the unexpected occurred in the UK, where a group of criminals with very sophisticated knowledge of AI and big data was able to scam a number of organizations into transferring large sums of money to fraudulent accounts. According to the BBC, the criminals captured a number of voice recordings of CEOs making investor calls. They analyzed the recordings with an AI pattern-matching program to re-create words and parts of speech. They then created a new recording in the CEO’s voice directing the CFO to wire funds to a specific account on an emergency basis. They sent the recording to the CFO via voice mail and even spoofed the CEO’s number. Think of this as an extremely sophisticated fraudulent “robocall” attack that uses AI to replicate the voice of a known and trusted person giving explicit instructions that require urgent compliance. Normally this would not work because of organizational processes and security protections, but given the right set of circumstances, it can succeed. The level of knowledge, time, and money it takes to prepare and launch this type of attack also limits how easily it can be replicated. However, as more voice data becomes available and the AI algorithms and techniques become easier to use, we can expect these types of data and technology misuse to become more prevalent. One can imagine a case where the voice of a loved one in distress is sent to a parent or grandparent, asking for some amount of money to be sent immediately to a card or account. Here the same techniques applied over a large population could have devastating results.

Similarly, facial recognition technology has the potential to identify and authenticate people using the sophisticated camera technology found in mobile phones and the other camera and video recording devices that have become pervasive in our world. However, few people really understand the limitations of these devices when it comes to accurately identifying people under different environmental conditions. For the best commercially available technology, the accuracy rate under sufficient lighting and in a “penned” or confined space is over 90%. This drops to around 65% if the lighting conditions change or the person is in a place like a mall or an outdoor arena. Now add to that the significant error rate that occurs for people with skin tones that are closer in color to their accessories, as well as the technology’s inability to accurately recognize a person wearing a hat, scarf, or sunglasses, or a person with facial hair, and it is easy to see why communities such as San Francisco have banned its use in law enforcement activities.
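To make those accuracy rates concrete, here is a rough back-of-the-envelope sketch in Python. It assumes that "accuracy" means the fraction of people identified correctly, rounds the figures above to 90% and 65%, and uses a hypothetical crowd of 10,000 people; the function name is ours, not part of any facial recognition product.

```python
def expected_misidentifications(people_scanned: int, accuracy: float) -> int:
    """Estimate how many people are identified incorrectly.

    Assumes accuracy is the fraction of correct identifications (0.0 to 1.0)
    and that errors are spread evenly across the people scanned.
    """
    if people_scanned < 0:
        raise ValueError("people_scanned must not be negative")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0.0 and 1.0")
    # The error rate is simply whatever is left over after correct matches.
    return round(people_scanned * (1.0 - accuracy))


# Hypothetical crowd size, chosen only for illustration.
crowd = 10_000
confined_errors = expected_misidentifications(crowd, 0.90)  # good lighting, confined space
open_errors = expected_misidentifications(crowd, 0.65)      # mall or outdoor arena
print(confined_errors, open_errors)
```

Under these assumptions, roughly 1,000 of the 10,000 people would be misidentified in the confined setting and roughly 3,500 in the open setting, which is a large number of wrong matches for any decision that carries legal consequences.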

Efforts to Consider

So, the question is: what can we do to get the benefits of AI and big data yet protect ourselves from the downside risks these technologies bring? First, realize that, as the old adage goes, the genie cannot be put back into the bottle. We will need to live with, and be prepared to manage, the risks each of these technologies brings. In our practice, we work with clients to identify the critical data types, decision types, and actions/outcomes that require an elevated level of protection. This is a comprehensive effort that results in a digital asset threat matrix with corresponding required actions. However, every individual or organization, no matter the size, can start by:

Understanding the types of data both you and your organization have in your possession (images/pictures, text, spreadsheets) and deciding what data you are willing to share and under what circumstances. This is particularly important for individual biometric data. Stay engaged with papers and events emerging on the topic of “The Data of You.”
Developing specific rules for when you will take actions such as transferring money, who (perhaps multiple people) is able to authorize the transaction, and under what circumstances.
Asking your analytics vendor or analytics team to show you the tested current and historical accuracy rates of any software that is used to make critical decisions. Why would you allow something with a marginal accuracy rate to aid in the decision-making process, especially when dealing with something as important as law enforcement? This also applies to other analytical software, such as that used by blood and urine testing services.
Safeguarding your data in the context of use through tracking, mining, and random audits. There are usually trends and tells in the usage of your data, both internally and externally.
Staying abreast of activities and outcomes from “Deep-Fake” events and publications. The use of AI and algorithms to fool institutions and individuals is on the rise, leading to alternative realities.
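
The rule about who may authorize a money transfer can be written down precisely enough to check automatically. The following is a minimal Python sketch of one such rule, not a description of any particular organization's process. Every name and number in it is a hypothetical assumption: the `TransferRequest` fields, the 10,000 threshold for requiring two approvers, and the list of channels treated as unverified.

```python
from dataclasses import dataclass, field
from typing import Set

# Hypothetical policy values, chosen only for illustration.
DUAL_APPROVAL_THRESHOLD = 10_000
UNVERIFIED_CHANNELS = {"voicemail", "email", "text"}


@dataclass
class TransferRequest:
    amount: float
    requested_via: str                  # e.g. "voicemail", "email", "in_person"
    approvers: Set[str] = field(default_factory=set)
    verified_by_callback: bool = False  # confirmed by calling a known-good number


def may_execute(request: TransferRequest) -> bool:
    """Return True only if the request satisfies the sketch policy."""
    # A request arriving by voice mail, email, or text is never enough on
    # its own; it must be confirmed through a separate, trusted channel.
    if request.requested_via in UNVERIFIED_CHANNELS and not request.verified_by_callback:
        return False
    # Large transfers need two distinct approvers, small ones need one.
    required = 2 if request.amount >= DUAL_APPROVAL_THRESHOLD else 1
    return len(request.approvers) >= required


urgent_voicemail = TransferRequest(250_000, "voicemail", {"cfo"})
print(may_execute(urgent_voicemail))
```

Under this sketch, an urgent voice mail in the CEO's voice with a single approver is rejected. The same request goes through only after a callback to a known number and a second approver, which is exactly the kind of process that stops the voice-cloning attack described earlier.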

Net; Net:

Lastly, on an individual level, remember it is your data. Do not agree to share it with just any app or information request, especially online lotteries or emails that say “You are a winner, just give us your contact information!” These may be scams, and you do not want to end up a victim of the unintended consequences of big data and AI!

For more information see:

www.datadiscoverysciences.com/blog

www.sinurblogspot.com

This post is a collaboration with Dr. Edward Peters

Edward M.L. Peters, Ph.D. is an award-winning technology entrepreneur and executive. He is the founder and CEO of Data Discovery Sciences, an intelligent automation services firm located in Dallas, TX. As an author and media commentator, Dr. Peters is a frequent contributor on Fox Business Radio and has published articles in The Financial Times, Forbes, IDB, and The Hill.