What’s the scoop on voice surveillance?

Even in 2022, monitoring voice remains a tall order for financial services. It is also one of the key themes of the upcoming XLoD Global conference, and for good reason: regulatory authorities are becoming increasingly interested in what financial firms are doing about trading-related voice transactions.

When it comes to communication surveillance, well-built tech solutions today can readily handle archiving, analyzing, and monitoring emails, texts, and phone calls. The capabilities of the pack (all the RegTech vendors plus the in-house legacy solutions at financial firms) start to diverge when it comes to interpreting “emoji speak” and inferring nuance from the choice of emojis or other images. Voice surveillance is the “next level” of system functionality. Why? Because it’s a really tough nut to crack. Often, voice is either not monitored systematically (compliance merely listens to random samples) or surveilled using a system separate from eComm surveillance.

Yet, together, we must solve this. Phone calls are a veritable “treasure trove” of potential evidence when it comes to market abuse. We are all invited to debate whether we should be focused on outcomes or on accuracy. Conventional wisdom suggests that this problem is so large that the 80-20 rule may be the best approach at this time. Banks are currently in quite the quandary regarding voice, with their problems compounded because voice recordings are now strewn across an ever-growing number of channels and silos (mobile, trader voice, WhatsApp).

Banks are under a lot of pressure. Surveillance costs have skyrocketed, with no end to the increases in sight. Communication channels are fragmented, and new ones are popping up regularly. And, perhaps most importantly, regulatory authorities are stepping up their expectations, and they are anything but shy when it comes to enforcement and levying hefty fines, as we have seen recently with some of the CFTC and SEC fines for record keeping of mobile messaging apps.

Several articles and conference presentations have highlighted the escalation of regulatory pressure and how “identifying intent” has become the new yardstick. According to most published regulatory citations, poor surveillance efforts, governance, and ownership are generally the issues that put banks on the wrong side of regulatory bodies. Voice surveillance poses a significant hurdle in this regard, as proving intent with conventional communication modes such as email and text has still not been optimized despite efforts to do so for more than a decade. The net-net is that most of today’s technology simply cannot accurately decipher voices and accents, keep up when speakers switch between languages or dialects, or understand ‘trader lingo’.

But there is hope. Collectively, we can all draw upon the experience and insights gained from previous efforts with conventional communication channels. Admittedly, the complexity of voice is significantly greater, but there are trillions of emails and SMS texts that can be used as a solid basis for machine learning to inform new algorithms designed to interpret voice.

With the surge in active users of Teams, Microsoft has made deep investments in FINRA compliance, openly acknowledging the importance of the financial services sector to its customer base. It has developed advanced technology and software solutions that scan and store all documents, looking for keywords, sensitive data types, and retention labels that align with select policies to ensure governance in line with the applicable communications regulations. As good as those tools are, they don’t provide for voice surveillance.

RingCentral, another tech giant, has recently dialed up (no pun intended…) its efforts in banking, enabling adherence to regulatory requirements by recording and archiving calls. The company has also tapped into the fear across the industry as $200 million fines keep rolling in for one bank after another over failed efforts to capture and archive digital communications. Truly, both companies showcase great advances in technology and compliance enablement, but neither cracks the nut of voice surveillance and compliance.

Adding to the already high levels of complexity is the real-world problem of mixed media, which compounds the nigh-impossible problems associated with monitoring voice-related trades. Surveillance software developers need to manage one, two, and sometimes five or more types of media within the same conversation between two people. More specifically, two people may be exchanging SMS text messages, URLs, video clips, gifs, emojis, jpeg images, and voice notes. That conversation may break away from texting for a phone or video call on Teams, Zoom, WhatsApp, or one of the other popular choices. Plus, there’s no guarantee that the conversation will happen exclusively on one e-Comms channel. In fact, bad actors deliberately try to obfuscate their intentions by jumping between e-Comms channels, making it even more difficult for compliance officers to stitch together the whole story.
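To make the “stitching” problem concrete, here is a minimal Python sketch of one possible approach: group events by the pair of participants, regardless of channel, and start a new thread whenever the gap between consecutive events exceeds a threshold. Everything in it is an assumption for illustration: the `Event` fields, the `stitch` function, the 30-minute gap, and the sample participants are hypothetical, and it presumes that identities have already been resolved to a single ID across channels, which is itself a hard problem in practice.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """One communication event (hypothetical schema for illustration)."""
    channel: str        # e.g. "sms", "whatsapp", "teams"
    kind: str           # e.g. "text", "voice_note", "call", "image"
    sender: str         # assumed to be an already-resolved identity
    recipient: str
    timestamp: datetime


def stitch(events, max_gap=timedelta(minutes=30)):
    """Group events into cross-channel threads per participant pair.

    A new thread starts when more than `max_gap` passes between two
    consecutive events of the same pair. The threshold is an assumption.
    """
    by_pair = defaultdict(list)
    for event in events:
        # frozenset ignores direction: alice->bob and bob->alice match.
        by_pair[frozenset((event.sender, event.recipient))].append(event)

    threads = []
    for pair_events in by_pair.values():
        pair_events.sort(key=lambda e: e.timestamp)
        current = [pair_events[0]]
        for event in pair_events[1:]:
            if event.timestamp - current[-1].timestamp <= max_gap:
                current.append(event)
            else:
                threads.append(current)
                current = [event]
        threads.append(current)
    return threads


# Made-up sample data: one conversation hops across three channels.
events = [
    Event("sms", "text", "alice", "bob", datetime(2022, 1, 10, 9, 0)),
    Event("whatsapp", "voice_note", "bob", "alice", datetime(2022, 1, 10, 9, 5)),
    Event("teams", "call", "alice", "bob", datetime(2022, 1, 10, 9, 20)),
    Event("sms", "text", "alice", "bob", datetime(2022, 1, 10, 15, 0)),
    Event("sms", "text", "alice", "carol", datetime(2022, 1, 10, 9, 10)),
]

threads = stitch(events)
for thread in threads:
    print([f"{e.channel}:{e.kind}" for e in thread])
```

In this sample, the morning SMS, WhatsApp voice note, and Teams call between the same two people land in a single thread even though they sit in three different silos. A real system would also need transcription of the voice content and far more robust identity matching than this sketch assumes.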

Regulators can ask for voice recordings at any time, and it is a risk if they are the first people to understand the content. Banks have a choice: be proactive or reactive. With a whole track of XLoD dedicated to the topic of voice surveillance, making the right choice may not be as hard as it sounds.

Originally posted on LinkedIn. Follow Oliver’s insights.

