Front Psychol. 2014 May 12;5:385. doi: 10.3389/fpsyg.2014.00385. eCollection 2014.

Using a high-dimensional graph of semantic space to model relationships among words.

Frontiers in psychology

Alice F Jackson, Donald J Bolger

Affiliations

  1. Laboratory for the Neurodevelopment of Reading and Language, Department of Human Development and Quantitative Methodology, University of Maryland College Park, MD, USA; Program for Neuroscience and Cognitive Science, University of Maryland College Park, MD, USA.

PMID: 24860525 PMCID: PMC4026710 DOI: 10.3389/fpsyg.2014.00385

Abstract

The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph-structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions of the GOLD model that differed in terms of the co-occurrence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggests that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD).
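
The distinction drawn in the abstract between direct co-occurrence and shared first-order connections can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the four-context corpus, and the use of Jaccard overlap as the intersection measure are all assumptions made for illustration, and the abstract does not specify how GOLD weights or normalizes the intersection.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(contexts):
    """Link every pair of distinct words that appear in the same context.

    A "context" stands in for the co-occurrence window (hypothetical here;
    the abstract mentions paragraph-level and adjacent-word windows).
    """
    graph = defaultdict(set)
    for context in contexts:
        for a, b in combinations(set(context), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def directly_linked(graph, a, b):
    """Associative relatedness proxy: do the two words share an edge?"""
    return b in graph.get(a, set())


def neighbor_overlap(graph, a, b):
    """Semantic similarity proxy: Jaccard overlap of first-order neighbors.

    Jaccard is an assumed choice; the words themselves are excluded so that
    a direct edge between them does not count toward the overlap.
    """
    na = graph.get(a, set()) - {b}
    nb = graph.get(b, set()) - {a}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


# Invented toy corpus: "doctor" and "nurse" never share a context,
# but they co-occur with the same set of words.
contexts = [
    ["doctor", "treats", "patient"],
    ["nurse", "treats", "patient"],
    ["doctor", "hospital"],
    ["nurse", "hospital"],
]
graph = build_cooccurrence_graph(contexts)

print(directly_linked(graph, "doctor", "patient"))  # direct edge
print(directly_linked(graph, "doctor", "nurse"))    # no direct edge
print(neighbor_overlap(graph, "doctor", "nurse"))   # identical neighbor sets
```

In this toy corpus, "doctor" and "patient" are linked directly, whereas "doctor" and "nurse" share no edge yet have identical neighbor sets, which is the pattern the abstract hypothesizes for semantically similar words.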

Keywords: co-occurrence; computational model of language; distribution model; graph; similarity


Publication Types