com.hrstc.trec


Package com.hrstc.trec

TREC Module

Class Summary
ColumnExtractor	Extracts specified columns from a file. Created on November 3, 2004, 8:41 AM. Regular args: queries_even.txt.result 4 5 6 7 8 9 10; Lucene args: queries.txt.result 4 5 6 7 8 9 10; desc run args:
Defs	Definitions
FileConverter	Takes a result file from Lucene and a qrel file and combines them into a file that can be used for training input...output. Created on October 17, 2004, 2:39 PM.
MatlabToTREC	Merges output from MATLAB with Lucene's output and creates a file that can be evaluated by trec_eval. TREC format: qid iter docno rank sim run_id. Created on November 3, 2004, 11:16 AM.
QueryRelevance	Stores query relevance judgments. Created on October 17, 2004, 2:22 PM.
ResultMerger	Merges results from two files, one by one. The files must have an equal number of docs per query. Created on January 9, 2005, 11:34 AM.
RobustEvalFix	Adjusts the robust_eval script so that the queries it evaluates match those present in the run. Created on November 15, 2004, 10:21 AM.
RobustEvalFixTest	JUnit-based test for RobustEvalFix. Created on November 15, 2004, 8:44 PM.
RStat	Represents topicid avep p10 for use by robust_eval.
StatsParser	Takes the output of trec_eval -q as input and produces output of the form: topicid avep p10, for use by robust_eval. Created on October 15, 2004, 2:34 PM.

Package com.hrstc.trec Description

Provides various utilities for running Lucene on the test data from the TREC Robust Retrieval Track and evaluating its performance.

Procedure

Index Files
I have modified the org.apache.lucenesandbox.xmlindexingdemo package so that it works with the TREC Robust 2004 data files. To index the files, execute org.apache.lucenesandbox.xmlindexingdemo.IndexFiles.
Perform Search
I have modified SearchFiles so that it outputs data in TREC format. For usage details, see the Javadoc. A sample program can be launched by executing 'demo'.
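
As a minimal sketch of what a line in TREC format looks like, the helper below builds one line with the fields listed in the MatlabToTREC summary (qid iter docno rank sim run_id). The class name TrecRunLine, the "Q0" placeholder for iter, the four-decimal score, and the sample document number are assumptions made for illustration; this is not the code of the modified SearchFiles.

```java
import java.util.Locale;

/** Hypothetical helper, not part of com.hrstc.trec: formats one line of a TREC run file. */
final class TrecRunLine {

    /**
     * Builds a line of the form "qid iter docno rank sim run_id".
     * "Q0" is used here as a conventional placeholder for the iter field (an assumption).
     */
    static String format(String qid, String docno, int rank, double sim, String runId) {
        // Locale.ROOT keeps the decimal separator a dot regardless of the system locale.
        return String.format(Locale.ROOT, "%s Q0 %s %d %.4f %s", qid, docno, rank, sim, runId);
    }

    public static void main(String[] args) {
        // "DOC-0001" is a made-up document number used only for this demonstration.
        System.out.println(format("301", "DOC-0001", 1, 12.5, "lucene"));
    }
}
```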
Evaluation - TREC
Evaluate the search results with TREC's standard program, trec_eval.
ex. /mnt/hgfs/thesis/trec_eval/trec_eval.7.0beta_linux/trec_eval -q qrels.robust2004.txt $1.result > $1.trec_eval.out
Stats Parsing
P10 values are needed to run the robust_eval script, but trec_eval does not produce them in the required form. The program com.hrstc.trec.StatsParser parses trec_eval's result file and writes the necessary stats in a format usable by robust_eval.
ex. /usr/java/jre1.5.0_01/bin/java -Xmx256m -cp /mnt/hgfs/thesis/lucene/aditional_src/bin com.hrstc.trec.StatsParser $1.trec_eval.out $1.robust_eval.in
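
The conversion described in this step can be sketched roughly as follows. This is an illustration only, not the source of StatsParser. It assumes that each per-topic line of the trec_eval -q output is a whitespace-separated "measure topic value" triple, that the all-topics summary uses the topic label "all", and that the output fields are separated by single spaces. The measure labels are passed in as parameters because they may differ between trec_eval versions; "map" and "P10" in the demo are assumed labels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; this is not the source of StatsParser. */
final class StatsParserSketch {

    /**
     * Converts per-topic trec_eval lines into "topicid avep p10" lines.
     * avepName and p10Name are the measure labels to look for (assumed, version dependent).
     */
    static List<String> convert(List<String> lines, String avepName, String p10Name) {
        // topic id -> {avep, p10}; LinkedHashMap keeps topics in first-seen order.
        Map<String, String[]> byTopic = new LinkedHashMap<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            if (f.length != 3 || f[1].equals("all")) {
                continue; // skip malformed lines and the all-topics summary
            }
            int slot = f[0].equals(avepName) ? 0 : f[0].equals(p10Name) ? 1 : -1;
            if (slot < 0) {
                continue; // a measure that is not needed here
            }
            byTopic.computeIfAbsent(f[1], k -> new String[2])[slot] = f[2];
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String[]> e : byTopic.entrySet()) {
            String[] v = e.getValue();
            if (v[0] != null && v[1] != null) { // emit only topics that have both values
                out.add(e.getKey() + " " + v[0] + " " + v[1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the assumed input layout.
        List<String> sample = Arrays.asList(
                "num_ret 301 1000",
                "map 301 0.2500",
                "P10 301 0.4000",
                "map all 0.2500");
        for (String row : convert(sample, "map", "P10")) {
            System.out.println(row);
        }
    }
}
```

Values are copied through as strings, so the sketch does not change the precision that trec_eval reported.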
Evaluation - TREC Robust
Some runs do not contain all of the topics. The program com.hrstc.trec.RobustEvalFix adjusts the 'robust2004_eval.pl' script so that only the topics present are evaluated.
ex. /usr/java/jre1.5.0_01/bin/java -Xmx256m -cp /mnt/hgfs/thesis/lucene/aditional_src/bin com.hrstc.trec.RobustEvalFix /mnt/hgfs/tdata/original/robust2004_eval.pl $1.result
Finally, to perform the robust evaluation:
./robust2004_eval.pl $1.robust_eval.in > $1.robust_eval.out
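
How RobustEvalFix rewrites the Perl script is not described here, so the sketch below covers only the first half of that step: finding which topics are present in a run. It assumes the result file is in TREC format with the topic id (qid) as the first whitespace-separated field of each line. The class name PresentTopics and the sample lines are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper, not part of com.hrstc.trec: collects topic ids from a run file. */
final class PresentTopics {

    /** Returns the distinct topic ids found in the first column, in string order. */
    static Set<String> topicsIn(List<String> runLines) {
        Set<String> topics = new TreeSet<>(); // TreeSet removes duplicates and sorts
        for (String line : runLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            topics.add(trimmed.split("\\s+")[0]);
        }
        return topics;
    }

    public static void main(String[] args) {
        // Made-up run lines in the assumed "qid iter docno rank sim run_id" layout.
        List<String> run = Arrays.asList(
                "302 Q0 DOC-0003 1 1.0000 lucene",
                "301 Q0 DOC-0001 1 2.0000 lucene",
                "302 Q0 DOC-0004 2 0.5000 lucene");
        System.out.println(topicsIn(run));
    }
}
```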

Reports

com.hrstc.trec.report contains utilities that can be used to create reports from various runs.

Author:

Neil O. Rouben