Francisco Valverde: "Two tools for the performance analysis of multiclass classifiers"

Speaker: Francisco Valverde (NLP&IR-UNED)

Date: Tuesday, 30 April 2013

Time: 12:00

Venue: Sala 1.03, ETSI Informática, UNED

Abstract

The most widespread measure of performance, accuracy, suffers from a paradox: predictive models with a given level of accuracy may have greater predictive power than models with higher accuracy. We argue that one reason for this may be that, despite optimizing the classification error rate, high-accuracy models fail to capture crucial information transfer in the classification task.

We therefore set out to solve the problem of assessing classification when the main goal of the classification process is to maximize the statistical information captured by the model, e.g. in exploratory analysis.

For this purpose we concentrate on a different quantity for a classifier, the perplexity, and show how it relates to classification accuracy. Using perplexity we are then able to obtain the normalized information transfer (NIT) factor, a measure of how efficiently information is transmitted from the input set of classes to the output set of classes.

We claim that the NIT factor is a more natural measure of classification performance than accuracy when the assessment criterion is the transfer of information through the classifier rather than the classification error count. It also makes it harder for classifiers to 'cheat' using techniques like specialization. We show how to use it in classification assessment and how it rejects rankings based on accuracy.
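The accuracy paradox described above can be illustrated with a small sketch. The abstract does not give the formula for the NIT factor, so the code below assumes it is the perplexity of the mutual information divided by the number of classes, 2^MI / k; the function names, the row-as-true-class convention and the two confusion matrices are likewise hypothetical and chosen only for illustration.

```python
import math


def accuracy(cm):
    """Fraction of instances on the diagonal of a confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def mutual_information(cm):
    """Mutual information in bits between true class (rows, by assumption)
    and predicted class (columns), estimated from the counts."""
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(col) for col in zip(*cm)]
    mi = 0.0
    for i, row in enumerate(cm):
        for j, n in enumerate(row):
            if n > 0:  # empty cells contribute nothing
                mi += (n / total) * math.log2(n * total / (row_sums[i] * col_sums[j]))
    return mi


def nit_factor(cm):
    """ASSUMED definition, not taken from the abstract: 2**MI / k,
    where k is the number of classes."""
    return 2 ** mutual_information(cm) / len(cm)


# Hypothetical classifier that always predicts the majority class.
majority = [[90, 0],
            [10, 0]]

# Hypothetical classifier that makes more errors but separates the classes.
informative = [[72, 18],
               [0, 10]]

for name, cm in [("majority", majority), ("informative", informative)]:
    print(name, accuracy(cm), mutual_information(cm), nit_factor(cm))
```

Under these assumptions the majority-class model has the higher accuracy, yet it transfers no information about the true class, so an information-based score ranks it below the second model.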

 

Bio

Francisco J. Valverde holds a degree in Telecommunications Engineering (specializing in Telematics) from the Universidad Politécnica de Madrid and a PhD in Telecommunications Engineering from the Universidad Carlos III. His interests span lexical semantics (doctoral thesis and a research stay with C. Fillmore in the FrameNet group, ICSI, Berkeley) and information retrieval (research stay with F. Crestani at the University of Strathclyde, Glasgow, UK). Since 2004 his work has focused on the use of semiring techniques in machine learning for these two applications, where he has co-developed a non-binary extension of formal concept analysis and several techniques for evaluating multiclass classifiers.

 

Venue

Sala 1.03
ETSI Informática, UNED
c/ Juan del Rosal, 16
Ciudad Universitaria
28040 Madrid


Materials

Presentation: Two tools for the performance analysis of multiclass classifiers.


 