
Word count method

Author: Dr Simon Moss


In many studies, researchers have attempted to quantify the language and words that participants use. They might, for example, want to examine whether the language and words that patients use change across the course of treatment. They might want to assess whether the words that individuals use depend on their mood.

Some researchers apply techniques that consider the context. That is, they examine broad linguistic units, such as sentences or paragraphs, to derive information (e.g., Gottschalk & Gleser, 1969). Other researchers examine the frequency of specific words (e.g., Pennebaker, Francis, & Booth, 2001). Although a focus on specific words disregards the broader context of language, extensive transcripts can be examined efficiently using computer programs.

This focus on words is called the word count method. To examine a transcript, researchers first define specific word categories, such as words that reflect positive emotions, negative emotions, achievement, autonomy, self references, and so forth. Next, researchers count the proportion of words in a transcript that correspond to that category. The usage of specific word categories is assumed to reflect the importance of that class of terms to the individuals (Stone, Dunphy, Smith, & Ogilvie, 1966) and has been found to coincide with many traits and states, such as wellbeing and personality.
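As a minimal sketch of the counting step described above, the following Python fragment computes the proportion of words in a transcript that fall into each category. The category names and word lists are hypothetical illustrations, not the dictionaries of any published program, and the simple tokenizer is an assumption made for brevity.

```python
import re
from collections import Counter

# Hypothetical categories for illustration only; real dictionaries are far larger.
CATEGORIES = {
    "positive_emotion": {"happy", "glad", "love"},
    "negative_emotion": {"sad", "angry", "upset"},
    "self_reference": {"i", "me", "my"},
}

def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def category_proportions(text, categories=CATEGORIES):
    """Return, for each category, the share of all words that belong to it."""
    words = tokenize(text)
    total = len(words)
    counts = Counter()
    for word in words:
        for name, vocabulary in categories.items():
            if word in vocabulary:
                counts[name] += 1
    # Guard against empty transcripts to avoid division by zero.
    return {name: (counts[name] / total if total else 0.0) for name in categories}

transcript = "I was sad, but my friends made me happy."
proportions = category_proportions(transcript)
```

In this example the transcript contains nine words, three of which are self references, so the self reference proportion is one third. Because each word is matched in isolation, the surrounding context is ignored.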

Programs to count words

Linguistic Inquiry and Word Count

Pennebaker, Francis, and Booth (2001) developed the Linguistic Inquiry and Word Count program to count the frequency of words that pertain to specific categories (see also Pennebaker, Mehl, & Niederhoffer, 2003). The program searches over 2300 words and categorizes these terms into 70 or more categories.

Some of the categories are intended to represent psychological processes, and include positive emotion words, negative emotion words, causal words, insight words, and self-discrepancies (e.g., "should"). Some of the categories are linguistic classes, such as first person pronouns and articles. In addition, some of the categories include words that relate two or more objects, such as terms that are associated with time, space, and motion. Finally, some of the categories relate to domains of content, such as achievement and religion (for a summary, see Tausczik & Pennebaker, 2010).

One of the obvious limitations of this technique is that many words, when isolated from their context, are ambiguous. To overcome this problem, some researchers now examine the frequency of specific combinations of words, comprising two, three, or more terms (e.g., Oberlander & Gill, 2006).
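Counting combinations of adjacent words can be sketched as follows. This is only an illustration of the general idea, not the procedure of any cited study; the function names are hypothetical, and the tokenizer ignores punctuation, so combinations may span sentence boundaries.

```python
import re
from collections import Counter

def ngrams(text, n):
    """Return every run of n adjacent words in the text, in order."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def ngram_counts(text, n):
    """Count how often each n-word combination occurs."""
    return Counter(ngrams(text, n))

sample = "I am not happy. I am not sure why."
bigram_counts = ngram_counts(sample, 2)
trigrams = ngrams(sample, 3)
```

Here the two-word combination "not happy" is counted as a unit, which retains some of the context that is lost when "happy" is counted alone.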

DICTION

Hart (2001) developed a program called DICTION, primarily designed to count words in political messages. The program determines the frequency with which individuals use five broad categories of words, reflecting activity, optimism, certainty, realism, and commonality respectively. Each category is divided into approximately seven more specific classes. Optimism, for example, includes categories that represent praise, satisfaction, inspiration, blame, hardship, and denial. The program can search for approximately 10,000 different words, each assigned to a single category. Another feature is the capacity to determine which category corresponds to various homographs, using statistical properties in the text.

The Weintraub approach

Weintraub (1981, 1989) developed an approach in which participants speak into a microphone for 10 minutes, usually on any topic they choose. Rather than count words using a computer program, naive judges, with limited knowledge about linguistics, rate the texts on several dimensions. These dimensions include the frequency of adverbs that intensify a word, such as "really" or "so", the frequency of qualifiers, such as "sort of", and the frequency of negatives, such as "never" or "no".

INFOMAP

Another software program that has been developed to assess the words that people use is called INFOMAP, available from www.infomap-nlp.sourceforge.net. This program can be used to undertake latent semantic analysis. This technique subjects text to an analysis that is akin to factor analysis or multidimensional scaling, uncovering dimensions, such as the degree to which individuals use personal versus collective pronouns, such as I versus we, the degree to which individuals use first person versus third person pronouns, such as I versus he, and the degree to which the words are regarded as positive or negative.

Senden, Lindholm, and Sikstrom (2014), for example, subjected news articles from the media to INFOMAP. Only articles in which journalists did not specify their personal identities were included in this analysis--articles that are supposed to be unbiased. They found that first person pronouns were associated with more positive words than third person pronouns, indicating a bias. That is, words such as I or we tended to coincide with more positive terms than words such as you and he. Furthermore, when journalists utilized third person pronouns, collective words like them coincided with more negative terms than personal words like you--reflecting out-group derogation; the opposite pattern was observed when journalists used first person pronouns. These analyses show the inherent biases of journalists.

Correlations with personality

Word use has been shown to correlate with personality. Nevertheless, effect sizes tend to be small (Pennebaker, Mehl, & Niederhoffer, 2003). For example, as Mehl, Gosling, and Pennebaker (2006) showed, when individuals are extraverted, they are not as inclined to write large words. However, they write many words.

Self references

Ickes, Reidhead, and Patterson (1986) showed that individuals who report elevated levels of self monitoring--the tendency to adapt their demeanor and values to accommodate the preferences and expectations of another person--seldom use self references, such as "I", "me", or "my". Instead, they often use other references, including "you" and "we".

Pennebaker and King (1999) showed that self references are inversely related to openness to experience (see Five factor model). That is, individuals who often use the terms "I", "me", "my", as well as words in the present tense, are less likely than peers to be creative, curious, and intelligent but more likely to be rigid and inflexible.

Apart from openness to experience, the other personality variables were also reflected by word use (Pennebaker & King, 1999). In particular, the use of positive emotion words was related to emotional stability, extraversion, and agreeableness. The use of negative emotion words was inversely related to emotional stability (see also Weintraub, 1981, 1989) as well as agreeableness. Finally, the use of self references was also inversely related to emotional stability (Pennebaker & King, 1999).

Exclusion and negation

Some individuals often use words that reflect exclusion, such as "but", "without", or "except", rather than inclusion, such as "with" (Pennebaker & King, 1999). References to exclusion tend to correspond to introversion, rather than extraversion, as well as limited levels of conscientious behavior.

Terms that reflect negation, such as "no", "not", and "never", also reflect this personality (Pennebaker & King, 1999). In addition, negation seems to represent an anxious disposition (Weintraub, 1981, 1989).

Certainty words

Some individuals use many words that reflect certainty, such as precisely, exactly, absolutely, always, sure, definite, clearly, and guarantee (Fast & Funder, 2008). These individuals tend to be more confident, assertive, and aspirational as well as intelligent, thoughtful, and introspective. In addition, they are likeable--often skilled in social settings--and expressive as well as interesting rather than unemotional or bland. These findings were derived from self reports, evaluations from acquaintances, as well as observations of behaviors (Fast & Funder, 2008).

Hence, according to Fast and Funder (2008), these words might reflect judicious behavior and wisdom. These findings contradict previous assertions that certainty in language might reflect paranoia and rigidity (Hayakawa, 1940; Korzybski, 1933).

Explainers

Some individuals often use words that justify their behavior, such as "because", "since", and "in order to". These words, called explainers, seem to be more frequent in individuals who report an anxious disposition (Weintraub, 1981, 1989).

Sexuality words

Some individuals use many words that relate to sexuality, such as breast, butt, erection, horny, love, nude, and orgasm (Fast & Funder, 2008). These words seem to coincide with an extraverted, talkative, bold, dramatic, unconventional, and rebellious personality. Nevertheless, ratings from acquaintances and behavioral observations also indicate that individuals who use these words are often moody, anxious, and egocentric--seldom interested in the feelings of anyone else. Conceivably, according to Fast and Funder (2008), these words reflect a need for attention.

Correlations with states

Anger

Weintraub (1981, 1989) showed that anger tends to manifest as a particular profile of words. Specifically, angry individuals seldom use qualifiers, such as "kind of" or "what you might call", but often use negation and rhetorical questions.


Pennebaker and King (1999) examined whether or not the needs for achievement, power, and affiliation were related to word usage. Both an implicit measure--the Thematic Apperception Test--and an explicit measure--the Personality Research Form--were used to assess these needs.

When individuals exhibited an implicit need for achievement, the words they used were unrelated to their immediate context (Pennebaker & King, 1999). That is, they did not use many self references but instead utilized longer words and discrepancy words, such as "should" or "could". In addition, they often referred to the social past. That is, they spoke in the past, not present, tense and utilized social terms. Finally, they seldom used insight or causation words, indicating a limited inclination to rationalize their behavior. Examples of insight words are think, consider, and know. Examples of causal words are because, therefore, and effect. Nevertheless, explicit need for achievement was unrelated to word usage.

When individuals exhibited an implicit need for affiliation, they seldom referred to the social past, seldom using social terms or past tense. They tended to use emotion words and the present tense (Pennebaker & King, 1999). Need for power was unrelated to word usage, however (Pennebaker & King, 1999).

Dimensions of meaning

An analysis of words can also be undertaken to uncover the dimensions along which individuals reflect upon themselves. That is, this analysis can extract issues that are meaningful to people.

Specifically, Chung and Pennebaker (2008) derived seven factors that differentiate individuals on their use of adjectives. In this study, college students were asked to write narratives about themselves. The use of adjectives was subjected to a factor analysis. Seven factors emerged, each representing a dimension of meaning to individuals. For example, one of the dimensions was called sociability, representing the degree to which individuals utilize words like shy or outgoing. This dimension, thus, comprised words that are opposite in meaning.

The second dimension was called evaluation, representing the degree to which people used terms that evaluate their appearance or competence, such as cute, ugly, and stupid. The third dimension was negativity, representing the extent to which individuals utilize negative words. The fourth dimension was self acceptance, primarily concerned with how individuals feel about themselves, with reference to words like lonely, blessed, independent, free, loving, and lost. The fifth dimension was fitting in and included words like crazy, cool, weird, and normal. The sixth dimension was psychological stability, represented by words like positive, negative, emotional, and fake. The final dimension was maturity. This factor included words like mature, successful, capable, and caring.

The psychometric properties of these dimensions, however, were modest. Correlations of the same dimension, derived from different essays, were usually about .1. Nevertheless, these dimensions were significantly related to measures of personality, as represented by the five factor model. References to sociable words, for example, were inversely associated with extraversion, whereas references to self acceptance were positively associated with extraversion. Allusions to evaluation or negative words were positively associated with neuroticism. Finally, allusions to maturity words were positively associated with conscientiousness.

In addition, use of negative words was positively associated with depression, as measured by the Beck Depression Inventory. Use of psychologically stable words was positively related to performance at school.

Correlations with clinical disorders

Depression

A variety of studies have shown that individuals who experience elevated levels of depression are more inclined to use first person singular pronouns (e.g., Bucci & Freedman, 1981; Weintraub, 1981)--a pattern that has also been found in patients with mania (Lorenz & Cobb, 1952). Pennebaker, Mehl, and Niederhoffer (2003) reported a study that showed, however, that depressed individuals are especially likely to use the word "I" in essays; they are not necessarily more likely than are other individuals to use the words "me", "my", or "mine" excessively.

Suicide

As shown by Stirman and Pennebaker (2001), suicide might be associated with infrequent use of first person plural pronouns, such as "we", and more frequent use of first person singular pronouns, such as "I"--possibly reflecting social disengagement. This pattern of word usage was indeed uncovered in suicidal poets (Stirman & Pennebaker, 2001).

Correlations with demographics

Age also influences the use of words. Older individuals use fewer self references. That is, conceivably, they can more readily detach themselves from their problems and thus use the words "I", "me", or "my" less often than do younger individuals (Pennebaker & Stone, 2003).

Furthermore, older individuals use fewer words that relate to negative emotions. Perhaps, these individuals have often developed the ability to alleviate unpleasant emotions and thus use the terms "upset", "sad", "angry", and "anxious" less often than do their younger counterparts (Pennebaker & Stone, 2003).


One of the complications of word count methods is that words might be classified incorrectly. For example, as Fast and Funder (2008) acknowledge, the word "happy" in the sentence "I am not happy" is classified as a positive emotion, even though the person is depicting a negative mood.
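The classification problem described above can be illustrated with a short sketch. The word lists and the one-word look-back for negation are hypothetical simplifications chosen for this example; they are not features of any particular word count program.

```python
import re

# Hypothetical word lists for illustration only.
POSITIVE = {"happy", "glad"}
NEGATIONS = {"not", "no", "never"}

def positive_hits(text):
    """Return (word, negated) pairs for each positive emotion word found.

    A context-free counter would tally every hit as positive; the negated
    flag marks hits that are immediately preceded by a negation term.
    """
    words = re.findall(r"[a-z']+", text.lower())
    hits = []
    for index, word in enumerate(words):
        if word in POSITIVE:
            negated = index > 0 and words[index - 1] in NEGATIONS
            hits.append((word, negated))
    return hits

hits = positive_hits("I am not happy")
```

A simple word counter would record one positive emotion word in this sentence. The flag shows that the word is negated, which the counter alone cannot detect.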

Nevertheless, the sentence "I am not happy" is not equivalent to the sentence "I am sad" (Berry, Pennebaker, Mueller, & Hiller, 1997; Fast & Funder, 2008). In particular, the sentence "I am not happy" indicates that individuals are evaluating their level of happiness and not their level of sadness. Hence, assigning the terms "happy" and "sad" in these sentences to separate categories might be appropriate. Indeed, Berry, Pennebaker, Mueller, and Hiller (1997) showed that classifications do not differ substantially when humans, rather than computers, categorize the words.

References

Allport, F., Walker, L., & Lathers, E. (1934). Written composition and characteristics of personality. Archives of Psychology, 173, 1-82.

Berry, D. S., Pennebaker, J. W., Mueller, J. S., & Hiller, W. (1997). Linguistic bases of social perception. Personality and Social Psychology Bulletin, 23, 526-538.

Bucci, W., & Freedman, N. (1981). The language of depression. Bulletin of the Menninger Clinic, 45, 334-358.

Burke, P. A., & Dollinger, S. J. (2005). "A picture's worth a thousand words": Language use in the autophotographic essay. Personality and Social Psychology Bulletin, 31, 536-548.

Carroll, D. W. (1999). Psychology of language (3rd ed.). New York: Brooks/Cole.

Chung, C. K., & Pennebaker, J. W. (2008). Revealing dimensions of thinking in open-ended self-descriptions: An automated meaning extraction method for natural language. Journal of Research in Personality, 42, 96-132.

Chung, C., & Pennebaker, J. (2007). The psychological functions of function words. In K. Fiedler (Ed.), Frontiers in social psychology (pp. 343-359). New York: Psychology Press.

Fast, L. A., & Funder, D. C. (2008). Personality as manifest in word use: correlations with self-report, acquaintance report, and behavior. Journal of Personality and Social Psychology, 94, 334-346.

Fernandez-Dols, J., Sanchez, F., Carrera, P., & Ruiz-Belda, M. (1997). Are spontaneous expressions and emotions linked? An experimental test of coherence. Journal of Nonverbal Behavior, 21, 163-177.

Furnham, A. (1990). Language and personality. In H. Giles & W. P. Robinson (Eds.), Handbook of language and social psychology (pp. 73-95). New York: Wiley.

Gottschalk, L., & Gleser, G. (1969). The measurement of psychological states through the content analysis of verbal behavior. Berkeley: University of California Press.

Groom, C., & Pennebaker, J. (2002). Brief report: Words. Journal of Research in Personality, 36, 615-621.

Gross, J. J., & John, O. P. (1997). Revealing feelings: Facets of emotional expressivity in self-reports, peer ratings, and behavior. Journal of Personality and Social Psychology, 72, 435-448.

Gross, J. J., John, O. P., & Richards, J. M. (2000). The dissociation of emotion expression from emotion experience: A personality perspective. Personality and Social Psychology Bulletin, 26, 712-726.

Hayakawa, S. I. (1940). Language in action. Chicago: Semantics Institute.

Hart, R. P. (2001). Redeveloping DICTION: theoretical considerations. In M. D. West (Ed.), Theory, method, and practice in computer content analysis (pp. 43-60). New York: Ablex.

Ickes, W., Reidhead, S., & Patterson, M. (1986). Machiavellianism and self-monitoring: As different as "me" and "you". Social Cognition, 4, 58-74.

King, L. A., & Emmons, R. A. (1991). Psychological, physical, and interpersonal correlates of emotional expressiveness, conflict, and control. European Journal of Personality, 5, 131-150.

Lepore, S. J., & Smyth, J. M. (Eds.). (2002). The writing cure: How expressive writing promotes health and emotional well-being. Washington, DC: American Psychological Association.

Letzring, T., Block, J., & Funder, D. (2005). Ego-control and ego-resiliency: Generalization of self-report scales based on personality descriptions from acquaintances, clinicians, and the self. Journal of Research in Personality, 39, 395-422.

Letzring, T., Wells, S., & Funder, D. (2006). Information quantity and quality affect the realistic accuracy of personality judgment. Journal of Personality and Social Psychology, 91, 111-123.

Lorenz, M., & Cobb, S. (1952). Language behavior in manic patients. Archives of Neurological Psychiatry, 67, 763-770.

Mehl, M., Gosling, S., & Pennebaker, J. (2006). Personality in its natural habitat: Manifestations and implicit folk theories of personality in daily life. Journal of Personality and Social Psychology, 90, 862-877.

Mehl, M., & Pennebaker, J. (2003). The sounds of social life: A psychometric analysis of students' daily social environments and natural conversations. Journal of Personality and Social Psychology, 84, 857-870.

Mehl, M., Pennebaker, J., Crow, M., Dabbs, J., & Price, J. (2001). The electronically activated recorder (EAR): A device for sampling naturalistic daily activities and conversations. Behavior Research Methods, Instruments, & Computers, 33, 517-523.

Nagy, W., & Anderson, R. (1984). How many words are there in printed school English? Reading Research Quarterly, 19, 304-330.

Oberlander, J., & Gill, A. J. (2006). Language with character: A stratified corpus comparison of individual differences in e-mail communication. Discourse Processes, 42, 239-270.

Pennebaker, J. W., & Francis, M. E. (1996). Cognitive, emotional, and language processes in disclosure. Cognition & Emotion, 10, 601-626.

Pennebaker, J. W., Francis, M. E., & Booth, R. J. (2001). Linguistic Inquiry and Word Count (LIWC): LIWC2001. Mahwah, NJ: Erlbaum.

Pennebaker, J. W., & King, L. A. (1999). Linguistic styles: Language use as an individual difference. Journal of Personality and Social Psychology, 77, 1296-1312.

Pennebaker, J. W., & Lay, T. C. (2002). Language use and personality during crises: Analysis of Mayor Rudolph Giuliani's press conferences. Journal of Research in Personality, 36, 271-282.

Pennebaker, J. W., Mayne, T. J., & Francis, M. E. (1997). Linguistic predictors of adaptive bereavement. Journal of Personality and Social Psychology, 72, 863-871.

Pennebaker, J., Mehl, M., & Niederhoffer, K. (2003). Psychological aspects of natural language use: Our words, our selves. Annual Review of Psychology, 54, 547-577.

Pennebaker, J., & Stone, L. (2003). Words of wisdom: Language use over the life span. Journal of Personality and Social Psychology, 85, 291-301.

Raskin, R., & Shaw, R. (1988). Narcissism and the use of personal pronouns. Journal of Personality, 56, 393-404.

Sanford, F. (1942). Speech and personality. Psychological Bulletin, 39, 811-845.

Senden, M. G., Lindholm, T., & Sikstrom, S. (2014). Biases in news media as reflected by personal pronouns in evaluative contexts. Social Psychology, 45, 103-111. doi: 10.1027/1864-9335/a000165

Stirman, S. W., & Pennebaker, J. W. (2001). Word use in the poetry of suicidal and nonsuicidal poets. Psychosomatic Medicine, 63, 517-522.

Stone, P. J., Dunphy, D. C., Smith, M. S., & Ogilvie, D. M. (1966). The general inquirer: A computer approach to content analysis. Cambridge, MA: MIT Press.

Tausczik, Y. R., & Pennebaker, J. W. (2010). The psychological meaning of words: LIWC and computerized text analysis methods. Journal of Language and Social Psychology, 29, 24-54.

Weintraub, W. (1981). Verbal behavior: Adaptation and psychopathology. New York: Springer.

Weintraub, W. (1989). Verbal behavior in everyday life. New York: Springer.


Last Update: 5/28/2016