Moving Beyond Text-Rich Data in eDiscovery

As technology becomes more streamlined, so do everyday communications. On platforms like Slack, WhatsApp, and WeChat, users regularly communicate through multimedia. Multimedia is fun and efficient, but its meaning is largely open to interpretation. As they say, a picture is worth a thousand words. However, because a “picture” is subjective, drawing reliable insight from it takes additional configuration and effort.

Difficulty with Multimedia

As of October 2019, there are 3,178 emojis in the Unicode Standard, the industry standard for encoding and representing text. As essentially mini “pictures”, emojis are a popular way of visually conveying thoughts and emotions. On the flip side, the meaning behind a particular emoji can vary by language, situation, device, platform (think Apple vs. Android, Twitter vs. Facebook), and user. One example of this discrepancy is the difference in emoji etiquette between China and the US. The seemingly innocent “raising hands” emoji, 🙌, is actually viewed negatively and used in a dismissive context in China. These nuances must be accounted for in the discovery process.

Taking these nuances into account, the thousands of emojis in use across multiple platforms can carry a seemingly endless number of meanings. When extrapolating to include the myriad other types of non-text-rich chat data, deciphering subjective conversational meanings becomes extremely daunting. GIFs, audio and video, location maps, likes, dislikes, and other reactions, embedded payments, and more all have the potential to carry business and investigative context. Assigning meaning to any of these ambiguous messages can be an onerous task without proper methods and expertise.

Discoverable Non-Text-Rich Data and Sources

Data sources in commercial technology, consumer technology, and social media each have their own discovery challenges. With so many devices and platforms collecting data in each of these categories, the sheer volume of non-text data has grown. Sifting through it to find what is relevant is a formidable task.

Each source, and each source type, presents its own challenges. For example, what is the significance of geolocation data collected by a smart device when compared to locations shared through iMessage? If it does hold significance, is this data easily discoverable? If not, what is the process for making it so?

Additional Configuration

Many organizations are unsure where to begin with search terms that will turn up relevant non-text data. Then, when this data is discovered, it is typically not presented in an easy-to-comprehend way. It may be jumbled, ambiguous, or in a different language completely. In many cases, experts must be called in for additional configuration in an attempt to “crack the code”.
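One possible starting point for search terms is to index each emoji under its official Unicode character name, so that an ordinary keyword search can reach it. The following is a minimal Python sketch of that idea and does not describe how any particular review platform works. The function name emoji_search_terms is hypothetical. Treating the Unicode category "So" (Symbol, other) as "emoji" is a rough assumption: it misses skin-tone modifiers and joined sequences, and it says nothing about what the sender actually meant.

```python
import unicodedata


def emoji_search_terms(message: str) -> list[str]:
    """Return lowercase Unicode names for the symbol characters in a message.

    Hypothetical helper for illustration only. Characters in category "So"
    (Symbol, other) are treated as emoji, which is a rough heuristic.
    """
    terms = []
    for ch in message:
        if unicodedata.category(ch) == "So":
            # Fall back to an empty string for characters with no name.
            name = unicodedata.name(ch, "")
            if name:
                terms.append(name.lower())
    return terms


if __name__ == "__main__":
    # U+1F64C is the "raising hands" emoji discussed earlier.
    for term in emoji_search_terms("Deal is done \U0001F64C"):
        print(term)
```

A reviewer could then search the extracted names with plain text terms such as "raising both hands". Interpreting the result in its cultural and platform context still requires human expertise.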

Without the proper expertise, this process can be long, intimidating, and costly. To stay current, organizations must adapt to the constant evolution of how people communicate via chat.