== Datasets ==

There are two datasets: WN9-IMG and FB-IMG.

- Each dataset contains a data directory containing the triples used for training, testing and validation.
- The WN9-IMG dataset was introduced by (Xie et al., 2017): https://www.ijcai.org/proceedings/2017/0438.pdf
- The FB-IMG dataset is based on FB15K, introduced by (Bordes et al., 2013): http://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf

== Embeddings ==

Each dataset contains an embeddings directory with different types of embeddings, all saved as pickles:

- Structural: embedding vectors for entities AND relations, trained using the TransE algorithm introduced by (Bordes et al., 2013): http://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf
- Visual: image embeddings corresponding to the entities. There are different types of these embeddings depending on the VGG model used: for example, VGG19 yields 4096-dimensional vectors and VGG128 yields 128-dimensional vectors. See the pre-trained VGG19 neural network for image classification (Simonyan and Zisserman, 2014): https://arxiv.org/pdf/1409.1556.pdf
- Linguistic: word embedding vectors learned for the entities. For the FB dataset they are 1000-dimensional vectors and for WN they are 300-dimensional vectors. See GloVe embeddings (Pennington et al., 2014), http://aclweb.org/anthology/D14-1162 , and the AutoExtend framework (Rothe and Schütze, 2015), http://aclweb.org/anthology/P15-1173
- Multimodal: embeddings obtained by concatenating the visual and the linguistic embedding vectors.

Format of the embeddings: dictionaries saved as pickles. Using Python, you can read in the data via:

    import pickle

    with open(path_in_file, 'rb') as f:
        embeddings = pickle.load(f)

    embeddings['your_test_headOrRelationOrTail']

== License ==

Feel free to distribute these word embeddings under the CC-BY License (http://creativecommons.org/licenses/by/4.0/).
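As a sketch of how the multimodal embeddings relate to the other two types, the snippet below concatenates a visual vector and a linguistic vector for the same entity. The entity key and the toy vectors here are placeholders for illustration; in practice, both dictionaries would be loaded from the pickle files described above.

```python
import numpy as np

# Placeholder embeddings for illustration; real ones come from the
# visual and linguistic pickle files in the embeddings directory.
visual = {'example_entity': np.ones(128)}       # e.g. a VGG128 visual vector
linguistic = {'example_entity': np.zeros(300)}  # e.g. a 300-dim WN linguistic vector

# A multimodal vector is the concatenation of the visual and linguistic
# vectors, built for every entity present in both dictionaries.
multimodal = {
    entity: np.concatenate([visual[entity], linguistic[entity]])
    for entity in visual.keys() & linguistic.keys()
}

print(multimodal['example_entity'].shape)  # (428,)
```

For the WN9-IMG setting sketched here, the resulting vectors are 128 + 300 = 428-dimensional.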
If you use these word embeddings in your research, please cite:

Hatem Mousselly Sergieh, Teresa Botschen, Iryna Gurevych and Stefan Roth. 2018. A Multimodal Translation-Based Approach for Knowledge Graph Representation Learning.

@inproceedings{moussellySergieh2018multimodal,
  title     = {{A Multimodal Translation-Based Approach for Knowledge Graph Representation Learning}},
  author    = {Mousselly Sergieh, Hatem and Botschen, Teresa and Gurevych, Iryna and Roth, Stefan},
  booktitle = {Proceedings of the 7th Joint Conference on Lexical and Computational Semantics (*SEM 2018)},
  publisher = {Association for Computational Linguistics},
  pages     = {to appear},
  month     = jun,
  year      = {2018},
  location  = {New Orleans, USA}
}