Class AxiomaticF3EXP
java.lang.Object
org.apache.lucene.search.similarities.Similarity
org.apache.lucene.search.similarities.SimilarityBase
org.apache.lucene.search.similarities.Axiomatic
org.apache.lucene.search.similarities.AxiomaticF3EXP
F3EXP is defined as Sum(tf(term_doc_freq) * IDF(term) - gamma(docLen, queryLen)), where:
IDF(t) = pow((N + 1) / df(t), k), with N = total number of docs and df = doc freq
gamma(docLen, queryLen) = (docLen - queryLen) * queryLen * s / avdl, with avdl = average document length
NOTE: the gamma function of this similarity can produce negative scores.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
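The formula above can be worked through numerically. The following is a minimal sketch in plain Java, not the Lucene implementation: the class name F3ExpSketch and its method and parameter names are hypothetical, and the tf component is passed in as a precomputed value because this page does not define it.

```java
// Hypothetical helper that mirrors the F3EXP formula as stated on this page.
// It does not call Lucene and is not the library's own code.
public final class F3ExpSketch {

    // IDF(t) = pow((N + 1) / df(t), k)
    public static double idf(long numDocs, long docFreq, double k) {
        return Math.pow((numDocs + 1.0) / docFreq, k);
    }

    // gamma(docLen, queryLen) = (docLen - queryLen) * queryLen * s / avdl
    public static double gamma(double docLen, int queryLen, double s, double avgDocLen) {
        return (docLen - queryLen) * queryLen * s / avgDocLen;
    }

    // Contribution of one term: tf * IDF - gamma.
    // tfComponent is an assumption: the caller supplies the tf value.
    public static double termScore(double tfComponent, double idf, double gamma) {
        return tfComponent * idf - gamma;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: N = 99, df = 4, k = 0.5
        double idf = idf(99, 4, 0.5);
        // docLen = 120, queryLen = 2, s = 0.25, avdl = 100
        double g = gamma(120, 2, 0.25, 100);
        System.out.println("idf   = " + idf);
        System.out.println("gamma = " + g);
        // A small tf value lets gamma dominate, so the result is negative.
        System.out.println("score = " + termScore(0.1, idf, g));
    }
}
```

With these illustrative inputs the gamma term outweighs tf * IDF, which is the situation the NOTE above describes. Gamma grows with document length, so long documents with weak term matches are the ones most likely to score below zero.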
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.search.similarities.Similarity
Similarity.SimScorer -
Field Summary
-
Constructor Summary
Constructors
Constructor: Description
AxiomaticF3EXP(float s, int queryLen): Constructor setting s and queryLen, leaving k at its default
AxiomaticF3EXP(float s, int queryLen, float k): Constructor setting all Axiomatic hyperparameters
-
Method Summary
Modifier and Type, Method: Description
protected double gamma(BasicStats stats, double freq, double docLen): compute the gamma component
protected double idf(BasicStats stats, double freq, double docLen): compute the inverse document frequency component
protected Explanation idfExplain(BasicStats stats, double freq, double docLen): Explain the score of the inverse document frequency component for a single document
protected double ln(BasicStats stats, double freq, double docLen): compute the document length component
protected Explanation lnExplain(BasicStats stats, double freq, double docLen): Explain the score of the document length component for a single document
protected double tf(BasicStats stats, double freq, double docLen): compute the term frequency component
protected Explanation tfExplain(BasicStats stats, double freq, double docLen): Explain the score of the term frequency component for a single document
protected double tfln(BasicStats stats, double freq, double docLen): compute the mixed term frequency and document length component
protected Explanation tflnExplain(BasicStats stats, double freq, double docLen): Explain the score of the mixed term frequency and document length component for a single document
String toString(): Name of the axiomatic method.
Methods inherited from class org.apache.lucene.search.similarities.Axiomatic
explain, explain, score
Methods inherited from class org.apache.lucene.search.similarities.SimilarityBase
fillBasicStats, log2, newStats, scorer
Methods inherited from class org.apache.lucene.search.similarities.Similarity
computeNorm, getDiscountOverlaps
-
Constructor Details
-
AxiomaticF3EXP
public AxiomaticF3EXP(float s, int queryLen, float k)
Constructor setting all Axiomatic hyperparameters
Parameters:
s - hyperparam for the growth function
queryLen - the query length
k - hyperparam for the primitive weighting function
-
AxiomaticF3EXP
public AxiomaticF3EXP(float s, int queryLen)
Constructor setting s and queryLen, leaving k at its default
Parameters:
s - hyperparam for the growth function
queryLen - the query length
-
-
Method Details
-
toString
Description copied from class: Axiomatic
Name of the axiomatic method.
-
tf
compute the term frequency component -
ln
compute the document length component -
tfln
compute the mixed term frequency and document length component -
idf
compute the inverse document frequency component -
gamma
compute the gamma component -
tfExplain
Description copied from class: Axiomatic
Explain the score of the term frequency component for a single document
-
lnExplain
Description copied from class: Axiomatic
Explain the score of the document length component for a single document
-
tflnExplain
Description copied from class: Axiomatic
Explain the score of the mixed term frequency and document length component for a single document
Specified by:
tflnExplain in class Axiomatic
Parameters:
stats - the corpus level statistics
freq - number of occurrences of term in the document
docLen - the document length
Returns:
Explanation of how the tfln component was computed
-
idfExplain
Description copied from class: Axiomatic
Explain the score of the inverse document frequency component for a single document
Specified by:
idfExplain in class Axiomatic
Parameters:
stats - the corpus level statistics
freq - number of occurrences of term in the document
docLen - the document length
Returns:
Explanation of how the idf component was computed
-