To decide whether a test speech segment was spoken by the target speaker,
an a priori decision threshold has to be set. The threshold
chosen here is derived from the Furui threshold setting method
[1, 2].
Here, tsp denotes the target speaker and ntsp a non-target speaker.
An extended threshold determination is used here.
In this case, the following transformation is applied:
so that the threshold becomes speaker independent and can be
adjusted to improve the cost function
(see
). The data used as non-target speaker data (for
threshold setting) came from the training set of the 1996 NIST evaluation
data. In order to determine
and
the non-target
speaker data were "passed through" each target speaker model to obtain
,
and the three constants A, B, C.
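
The procedure described above can be illustrated with a minimal sketch. It assumes, as in Furui-style threshold setting, that the non-target (ntsp) scores obtained from each target speaker model are summarized by their mean and standard deviation, and that the transformation is a mean and variance normalization of the test score. The exact transformation and the roles of the constants A, B, C are not recoverable from the text here, so they are not modeled. The function names and the example scores are hypothetical.

```python
from statistics import mean, stdev


def impostor_stats(ntsp_scores):
    """Mean and sample standard deviation of the scores obtained by
    passing non-target speaker data through one target speaker model."""
    return mean(ntsp_scores), stdev(ntsp_scores)


def normalize_score(score, mu_ntsp, sigma_ntsp):
    """Assumed transformation: express a raw score in units of the
    non-target score spread for this target model."""
    return (score - mu_ntsp) / sigma_ntsp


def accept(score, ntsp_scores, threshold):
    """Accept the claimed identity if the normalized score reaches a
    single threshold shared by all target speakers."""
    mu, sigma = impostor_stats(ntsp_scores)
    return normalize_score(score, mu, sigma) >= threshold


# Two hypothetical target models whose raw scores live on different scales.
ntsp_a = [0.0, 1.0, 2.0, 3.0, 4.0]
ntsp_b = [0.0, 10.0, 20.0, 30.0, 40.0]

z_a = normalize_score(6.0, *impostor_stats(ntsp_a))
z_b = normalize_score(60.0, *impostor_stats(ntsp_b))
```

Under this assumption, the two test scores map to the same normalized value although the raw score ranges differ by a factor of ten, which is why one threshold can serve every target speaker and can then be tuned against the cost function.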