Audio Search Accuracy: Getting Out of DET

I thought I’d return to the accuracy question and go into a bit more detail on how we determine the overall accuracy of a phonetic search model based on the optimum trade-off between precision and recall. It all boils down to understanding the DET chart, or Detection Error Tradeoff chart. Here’s what one of these looks like:

[Image: DET chart]

In most charts, “up and to the right” is the way you want to go. A DET is somewhat flipped from this paradigm, where “down and to the left” would represent a perfect world. But as we’ve discussed before, there’s no perfect world in search…it’s all about trade-offs. So let’s dive into the details on this chart so you can understand it better. First, what does this chart really show?

This chart shows the practical search results for five different search expressions in a typical Nexidia search. Each search expression is made up of a certain number of phonemes. The shortest expression (fewest phonemes) is shown at the top in the orange line, while the longest expression (most phonemes) is shown in pink at the bottom. The Y-axis tracks recall for the search (the lower a line sits, the fewer true hits are missed), while the X-axis tracks precision, expressed as false alarms per hour of content. (For a refresher on precision vs. recall, view my earlier post here.)

So what is this chart showing us? It is a vivid, real-world illustration that for any given search expression, you can maximize recall (the most potential true positives), but only at the expense of precision (more false hits). The yellow line represents a search term with 8 phonemes, a typical two-to-three syllable word. Following this line all the way down to the right, you see that you can achieve almost 90 percent recall if you are willing to live with about 10 false alarms per hour of content. That’s not a bad trade-off in a compliance situation, especially when the review tool lets you quickly and easily listen to and disposition results.

As with almost any type of search engine you use, the more relevant content you give it to search, the better your results. So in this case, the bottom pink line represents a search of 20 phonemes (a typical three- or four-word phrase) and shows that you can get over 95 percent recall with just one false alarm per hour, and almost 99 percent recall with only 10 false alarms per hour.
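To make the trade-off concrete, here is a minimal sketch of how the points along one line of a chart like this could be computed from scored search hits. The function name det_points, the hit scores, and the hours figure are all made up for illustration; this is not Nexidia's scoring method or API, just the general idea of sweeping a score threshold from strict to loose.

```python
def det_points(hits, total_true, hours):
    """Sweep the score threshold from strict to loose.

    hits: list of (score, is_true_hit) pairs for one search expression
          (scores are assumed to be distinct).
    total_true: number of true occurrences of the expression in the audio.
    hours: hours of audio searched.
    Returns a list of (threshold, recall, false_alarms_per_hour) tuples.
    """
    points = []
    true_found = 0
    false_found = 0
    # Highest-scoring hits first: each step loosens the threshold by one hit.
    for score, is_true in sorted(hits, key=lambda h: h[0], reverse=True):
        if is_true:
            true_found += 1
        else:
            false_found += 1
        points.append((score, true_found / total_true, false_found / hours))
    return points


# Hypothetical data: 6 hits over 2 hours of audio with 4 true occurrences.
hits = [(0.95, True), (0.90, True), (0.80, False),
        (0.70, True), (0.60, False), (0.50, True)]

for threshold, recall, fa_per_hour in det_points(hits, total_true=4, hours=2.0):
    print(f"score >= {threshold:.2f}: recall {recall:.0%}, "
          f"{fa_per_hour:.1f} false alarms/hour")
```

As the threshold loosens, recall can only rise and the false alarm rate can only rise with it, which is the same trade-off each line on the chart traces out.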

There are two key points that I will make again. First, because the underlying phonetic index has captured ALL the true spoken content in each recording, it offers the most accurate representation possible of what people have actually said in the file. Second, because of the many variables that make human speech differ (accents, background noise, etc.), reviewers can use this knowledge about precision vs. recall to craft a search strategy that gives them the level of search results that satisfies their goals.
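One way to picture such a search strategy is as a false alarm budget: decide how many false alarms per hour you can tolerate, then take the loosest score threshold that stays within it. The sketch below assumes a hypothetical pick_threshold helper and made-up hit scores; it illustrates the reasoning only and does not reflect how any particular product exposes this setting.

```python
def pick_threshold(hits, total_true, hours, max_fa_per_hour):
    """Return (threshold, recall) for the loosest threshold within budget.

    hits: list of (score, is_true_hit) pairs (scores assumed distinct).
    total_true: number of true occurrences in the audio.
    hours: hours of audio searched.
    max_fa_per_hour: the reviewer's tolerated false alarms per hour.
    Returns None if even the best-scoring hit alone breaks the budget.
    """
    best = None
    true_found = 0
    false_found = 0
    # Accept hits from highest score down until the budget is exceeded.
    for score, is_true in sorted(hits, key=lambda h: h[0], reverse=True):
        if is_true:
            true_found += 1
        else:
            false_found += 1
        if false_found / hours > max_fa_per_hour:
            break
        best = (score, true_found / total_true)
    return best


# Hypothetical data: 6 hits over 2 hours of audio with 4 true occurrences.
sample_hits = [(0.95, True), (0.90, True), (0.80, False),
               (0.70, True), (0.60, False), (0.50, True)]

# A strict budget favors precision; a generous one favors recall.
strict = pick_threshold(sample_hits, total_true=4, hours=2.0, max_fa_per_hour=0.0)
generous = pick_threshold(sample_hits, total_true=4, hours=2.0, max_fa_per_hour=1.0)
print("strict budget:", strict)
print("generous budget:", generous)
```

A compliance reviewer who can afford to listen to extra hits would lean toward the generous budget, while someone skimming for a quick answer would lean toward the strict one.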
