Ground Truth Analysis Tool for Retrieve and Rank

IBM Watson: Retrieve & Rank Ground Truth Analysis Tool


•RANKER: To return the most relevant documents at the top of your results, the Retrieve and Rank service uses a machine learning component called a ranker. You send queries to the trained ranker.
•GROUND TRUTH: The ranker learns from examples before it can re-rank results for queries that it hasn't seen before. Collectively, these examples are referred to as “ground truth.”
HYPOTHESIS: The SHAPE and QUALITY of the Ground_Truth.CSV file used by the Retrieve and Rank service have attributes that can be measured, and those attributes impact the Precision and Recall KPIs.

BENEFIT: A tool that analyzes the Ground_Truth.CSV file used for Retrieve and Rank would give us a better understanding of performance. It would also give us a yardstick for relating GT attributes to the KPIs: GT attributes <-> Precision and Recall.
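
For reference, the two KPIs named above can be computed per query roughly as follows. This is a minimal Python sketch, not part of the original R tool; the function name `precision_recall_at_k`, the cutoff `k`, and the document IDs are hypothetical and chosen only for illustration.

```python
def precision_recall_at_k(ranked_ids, relevant_ids, k):
    """Return (precision, recall) over the top-k ranked results for one query."""
    top_k = list(ranked_ids)[:k]
    relevant = set(relevant_ids)
    # A "hit" is a returned document that the ground truth marks as relevant.
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: five ranked results, three documents judged relevant.
ranked = ["d7", "d2", "d9", "d4", "d1"]
relevant = {"d2", "d4", "d8"}
precision, recall = precision_recall_at_k(ranked, relevant, k=5)
print(f"precision@5={precision:.2f} recall@5={recall:.2f}")
```

Here two of the five returned documents are relevant, and two of the three relevant documents were returned.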






•A simple exploratory tool, written in “R”, can provide a 60-second analysis of a GT file
–What’s ‘normal’ relative to other GT files
–How GT “SHAPE” impacts Precision and Recall
–KPIs: measurable GT attributes that help assess the overall performance of the system (compare and contrast)
•Ground truth ‘fingerprint’
•Get a better feel for the shape and quality of the GT used in Retrieve and Rank
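
The kind of ‘fingerprint’ listed above could be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the original R tool. It assumes each CSV row holds a question followed by alternating document ID and relevance label pairs; that layout, the function name `gt_fingerprint`, the chosen attributes, and the sample rows are all hypothetical.

```python
import csv
import io
from collections import Counter
from statistics import mean, median


def gt_fingerprint(csv_text):
    """Summarize the shape of a ground truth CSV given as a string.

    Assumed row layout: question, doc_id, relevance, doc_id, relevance, ...
    """
    questions = []
    labels_per_question = []
    relevance_counts = Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        cells = [cell.strip() for cell in row if cell.strip()]
        if not cells:
            continue  # skip blank lines
        question, pairs = cells[0], cells[1:]
        relevances = pairs[1::2]  # every second cell after the question
        questions.append(question)
        labels_per_question.append(len(relevances))
        relevance_counts.update(relevances)
    unique_questions = {q.lower() for q in questions}
    return {
        "questions": len(questions),
        "duplicate_questions": len(questions) - len(unique_questions),
        "total_labels": sum(labels_per_question),
        "mean_labels_per_question": mean(labels_per_question) if questions else 0,
        "median_labels_per_question": median(labels_per_question) if questions else 0,
        "relevance_distribution": dict(relevance_counts),
    }


# Hypothetical sample data, for illustration only.
sample_gt = """What is the ranker?,doc1,3,doc2,1
How do I train it?,doc3,2
what is the ranker?,doc1,3,doc4,0,doc5,1
"""

for name, value in gt_fingerprint(sample_gt).items():
    print(f"{name}: {value}")
```

Running the same summary over several GT files gives a rough baseline for what is ‘normal’, which can then be compared against each system's Precision and Recall.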

Ryan Anderson

Created: September 13, 2015


