Index Details

This page describes the DBpedia indices used to generate the baseline runs of the DBpedia-Entity v2 collection (i.e., Table 2 of [1]).

DBpedia files

The following files from the DBpedia 2015-10 dump are indexed:

Index settings

Index fields

The following fields are used for constructing the index, following the general approach outlined in [2]. All fields listed below contain unique phrases.

| Field | Description | Predicates | Notes |
| --- | --- | --- | --- |
| Names | Names of the entity | <foaf:name>, <dbp:name>, <foaf:givenName>, <foaf:surname>, <dbp:officialName>, <dbp:fullname>, <dbp:nativeName>, <dbp:birthName>, <dbo:birthName>, <dbp:nickname>, <dbp:showName>, <dbp:shipName>, <dbp:clubname>, <dbp:unitName>, <dbp:otherName>, <dbo:formerName>, <dbp:birthname>, <dbp:alternativeNames>, <dbp:otherNames>, <dbp:names>, <rdfs:label> | |
| Categories | Entity types | <dcterms:subject> | |
| Similar entity names | Entity name variants | !<dbo:wikiPageRedirects>, !<dbo:wikiPageDisambiguates>, <dbo:wikiPageWikiLinkText> | ! denotes reverse direction (i.e. <o, p, s>) |
| Attributes | Literal attributes of the entity | All <s, p, o>, where “o” is a literal and “p” is not in Names, Categories, Similar entity names, or the blacklist predicates. For each <s, p, o> triple, if p matches <dbp:.*>, both p and o are stored (i.e. “p o” is indexed). | |
| Related entity names | URI relations of the entity | Same as the Attributes field, but “o” must be a URI. | |
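
The field rules in the table above can be read as a small routing function. The following Python sketch is only an illustration under stated assumptions, not the code used to build either index: the function name `assign_field`, the field identifiers, and the use of prefixed predicate names (e.g. `dbp:name`) are hypothetical. The blacklist is left empty because its contents are not listed here, and resolving URI objects to entity names is omitted.

```python
import re

# Predicates of the Names field, copied from the table above.
NAMES = {
    "foaf:name", "dbp:name", "foaf:givenName", "foaf:surname",
    "dbp:officialName", "dbp:fullname", "dbp:nativeName", "dbp:birthName",
    "dbo:birthName", "dbp:nickname", "dbp:showName", "dbp:shipName",
    "dbp:clubname", "dbp:unitName", "dbp:otherName", "dbo:formerName",
    "dbp:birthname", "dbp:alternativeNames", "dbp:otherNames", "dbp:names",
    "rdfs:label",
}
CATEGORIES = {"dcterms:subject"}
# Predicates marked with "!" are read in reverse direction, i.e. as <o, p, s>.
SIMILAR_REVERSE = {"dbo:wikiPageRedirects", "dbo:wikiPageDisambiguates"}
SIMILAR_FORWARD = {"dbo:wikiPageWikiLinkText"}
# Assumption: placeholder only; the actual blacklist predicates are not given here.
BLACKLIST = set()

DBP_PREDICATE = re.compile(r"dbp:.*")


def assign_field(s, p, o, o_is_literal):
    """Return (entity, field, text) for one triple, or None if it is skipped."""
    if p in BLACKLIST:
        return None
    if p in NAMES:
        return (s, "names", o)
    if p in CATEGORIES:
        return (s, "categories", o)
    if p in SIMILAR_REVERSE:
        # Reverse direction: the subject is stored under the object entity.
        return (o, "similar_entity_names", s)
    if p in SIMILAR_FORWARD:
        return (s, "similar_entity_names", o)
    # Remaining triples: literals go to Attributes, URIs to Related entity names.
    field = "attributes" if o_is_literal else "related_entity_names"
    # For <dbp:.*> predicates, the predicate is stored together with the object.
    text = f"{p} {o}" if DBP_PREDICATE.fullmatch(p) else o
    return (s, field, text)
```

For example, under these assumptions `assign_field("dbr:Oslo", "dbp:populationTotal", "658390", True)` routes the triple to the Attributes field with the text `dbp:populationTotal 658390`, whereas a `dbo:wikiPageRedirects` triple is stored under its object entity.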

Group-specific settings

The baseline runs of the DBpedia-Entity v2 collection [1] are generated using two indices, denoted Index A and Index B. These indices were built by two different research groups; they share the settings above but differ in the following aspects.

Index A:

Index B:

[1] Faegheh Hasibi, Fedor Nikolaev, Chenyan Xiong, Krisztian Balog, Svein Erik Bratsberg, Alexander Kotov, and Jamie Callan. 2017. “DBpedia-Entity v2: A Test Collection for Entity Search”. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’17). 1265–1268.

[2] Nikita Zhiltsov, Alexander Kotov, and Fedor Nikolaev. 2015. “Fielded Sequential Dependence Model for Ad-Hoc Entity Retrieval in the Web of Data”. In Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’15). 253–262.