Analyzing Differences in Perception among Those Diagnosed with Borderline Personality Disorder Using Natural Language Processing



Clinical evidence suggests that individuals diagnosed with Borderline Personality Disorder (BPD) tend to feel emotions more intensely. This study used Topic Modelling and Sentiment Analysis, Natural Language Processing techniques from Machine Learning that analyze texts and transcripts of speech, to investigate differences in emotional language across eight emotional categories: anger, anticipation, disgust, fear, joy, sadness, surprise, and trust. These categories are taken from the NRC EmoLex resource by Mohammad and Turney (2013). The exploratory hypothesis was that differences in perception between those diagnosed and not diagnosed with BPD would appear as differences in emotional language. However, the results showed little difference in emotional language between the two groups. For example, the difference-of-means test for anger yielded a t-score of 0.26, which is not statistically significant, and the other emotional categories yielded similar results. These results suggest that, in the texts examined, emotional language is roughly similar between those with and without BPD.

Keywords: Behavioral and Social Sciences; Cognitive Psychology; Borderline Personality Disorder; Emotional Expression; Natural Language Processing


The National Institute of Mental Health defines Borderline Personality Disorder (BPD) as “a mental illness that severely impacts a person’s ability to manage their emotions”1. This lack of emotional control tends to increase impulsivity and affect how a person feels about themselves and others1. Individuals diagnosed with BPD often experience intense mood swings and respond to positive, negative, and neutral stimuli much more dramatically than a neurotypical person would2. As a result, people with BPD perceive emotions, the environment, and others’ actions much differently than one typically would2. Many studies have attempted to quantify this difference in perception. One such study by Conrad and Morrow examined 109 male college students with varying levels of BPD symptoms and their responses to neutral news reports, reports depicting violence, and reports with abandonment themes3. The results showed that men with few or no symptoms reacted much differently than men with extreme BPD symptoms3. Men with high levels of BPD symptoms were much more likely to respond violently to things that displeased them3, and they had much more dramatic emotional responses to the news reports than men with very mild or no BPD symptoms3. These findings raise the question of how BPD relates to violence. It is estimated that up to 45% of prison inmates meet the criteria for a BPD diagnosis, compared to an estimated 1-2% of the general population4. In fact, a study titled “Features of borderline personality and violence” by Adrian Raine tested the hypothesis that borderline personality characterizes extreme violence and found that borderline scores increased linearly with the degree of violence of the crimes committed by non-violent adult offenders, violent adult offenders, and murderers5.
Another study titled “Neuroimaging and genetics of borderline personality disorder: a review” by Eric Lis and others examined the disorder’s biological etiology6. The use of neuroimaging is fairly new in the study of BPD, but the disorder has been linked to the amygdala and limbic system of the brain, the centers that control emotions, particularly anger, fear, and impulsive reactions6. Studies have shown that the amygdala and hippocampus of people with BPD may be as much as 8% and 16% smaller, respectively, than those of a person not diagnosed with BPD7. Overall, the data suggest that the areas of the brain that regulate and control emotions are hypermetabolic in people with BPD6. Additionally, the limbic areas are activated in excess6. This likely causes the difficulty in reasoning in the face of strong emotions that is a defining characteristic of BPD6.
Existing research on BPD outlines its causes, its effects, and the negative actions common in the disorder. However, there is limited research concerning language. This study aims to use Natural Language Processing to analyze the difference in written language between people who are diagnosed with BPD and people who are not. Current studies and data indicate that people with BPD react much more dramatically to emotions, stimuli, and their environment than those without it2. This may contribute to more violent behavior, which is consistent with the much higher estimated prevalence of BPD in prison populations than in community populations4. This study explores whether this tendency towards violence and stronger emotional reactions is expressed in ways other than actions, by analyzing differences between writing by people with BPD and writing by people without BPD.
Each text is analyzed for emotional language separated into eight categories: anger, anticipation, disgust, fear, joy, sadness, surprise, and trust8. The study further explores whether the dramatic emotions and reactions of people with BPD extend to their use of language. For many, writing is an expression of one’s perception of the world. This study therefore intends to identify any recurring differences in perception between people with and without BPD, and to take a closer look at what those differences are.


Dataset Description

The dataset in this study included a variety of written or transcribed works and narratives by people with BPD and people without. This included six poems (three by a person with BPD and three by a person without), seven personal essays or reflections (five by people with BPD and two by people without), two essays on rethinking BPD (one by a person with BPD and one by a clinician without), two transcripts of stand-up comedy (one by a person with BPD and one by a person without), and two short stories (one by a person with BPD and one by a person without). In total, the dataset consists of eleven sources from people with BPD and eight sources from people without BPD. The data covered a wide range of topics to provide insight into what BPD looks like in different contexts. The data was retrieved through a specific and thorough search of online sources such as the Poetry Foundation, Scraps From the Loft, and the New York Times. For each type of writing or transcript from someone with BPD, there was at least one by someone without BPD on a similar topic. Therefore, any differences in emotional language could more plausibly be attributed to the disorder than to the topic. All of the data was public information previously published online.

NLP Model Description

This study uses natural language processing to analyze texts. Because many studies have shown that people with BPD react much more emotionally than people without BPD, the program analyzed the emotional language in all of the sources. The program utilized Topic Modelling and Sentiment Analysis techniques. Words were assigned to eight emotional categories (anger, anticipation, disgust, fear, joy, sadness, surprise, and trust) based on the NRC Word-Emotion Association Lexicon (EmoLex) resource8. Texts by, and transcribed interviews of, people with and without BPD were then analyzed for emotional language: the program returned a count of how many words each text contained from each emotion list. No external libraries were employed, and no preprocessing steps were applied.
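
The counting step described above can be illustrated with a minimal sketch in Python. This is not the program used in the study. The lexicon below is a tiny hypothetical stand-in rather than actual EmoLex entries, the function name `count_emotion_words` is illustrative, and splitting on whitespace is an assumption chosen to mirror the absence of preprocessing.

```python
# The eight emotion categories named in the text.
EMOTIONS = ["anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"]

# Hypothetical toy lexicon (word -> emotions). These entries are made up for
# illustration and are NOT taken from the real NRC EmoLex resource.
TOY_LEXICON = {
    "furious": {"anger"},
    "dread": {"fear", "anticipation"},
    "delight": {"joy"},
    "grief": {"sadness"},
    "rely": {"trust"},
}


def count_emotion_words(text, lexicon):
    """Return, for each emotion, how many tokens in `text` are on its list."""
    counts = {emotion: 0 for emotion in EMOTIONS}
    # Assumption: tokens are split on whitespace only, with no lowercasing or
    # punctuation stripping, so "grief," would not match "grief".
    for token in text.split():
        for emotion in lexicon.get(token, ()):
            counts[emotion] += 1
    return counts


sample = "grief and dread gave way to delight"
counts = count_emotion_words(sample, TOY_LEXICON)
print(counts)
```

A word that belongs to more than one emotion list, such as "dread" in the toy lexicon, adds to each of its categories, so the per-emotion counts can sum to more than the number of matched words.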

Results and Discussion

The results differ from what was expected based on previous studies: they show no significant difference between the emotional language of people with BPD and people without. The average score for each emotion in the BPD and non-BPD groups is shown below.

Table 1: This table consists of the raw counts of emotional words by emotion and transcript type.

A difference-of-means test was conducted for each emotion at the 5% significance level, producing the t-scores and p-values shown in Table 2. Each t-score is well below the significance threshold, so none of the differences in means is statistically significant. This finding does not agree with the previous studies discussed earlier. Pooling the data for all emotions yields similar results. The average for the BPD group was 118.71 with a standard deviation of 133.422, while the average for the non-BPD group was 135.05 with a standard deviation of 129.63. A significance test of the difference in means gives a Z-score of .75, well below the critical value of 1.96 for a two-sided test at the 5% significance level. The results show that the difference in emotionality between those with and without BPD, in the context of writing, is not statistically significant.

Table 2: This table consists of the calculated t-scores and p-values for each emotion.
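
As a rough check on the pooled comparison of means reported in this section, the Z-score can be recomputed from the summary statistics alone. The sketch below is in Python and uses only the standard library. The means and standard deviations are the ones reported in the text. The group sizes of 88 and 64 (11 and 8 sources, each contributing 8 emotion counts) are an assumption, because the text does not state them, and the function name `two_sample_z` is illustrative.

```python
import math


def two_sample_z(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sample z statistic for a difference of means, with two-sided p-value."""
    # Standard error of the difference between two independent sample means.
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_b - mean_a) / standard_error
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Means and standard deviations as reported in the text.
# ASSUMPTION: n = 88 (BPD) and n = 64 (non-BPD); these sizes are not stated.
z, p = two_sample_z(118.71, 133.422, 88, 135.05, 129.63, 64)
print(f"z = {z:.2f}, two-sided p = {p:.2f}")
```

Under these assumed group sizes the computed Z is roughly 0.76, in line with the reported .75, and the two-sided p-value is far above 0.05. Different group sizes would change the standard error and therefore the Z-score.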

Figure 1 shows the mean emotional-language scores for people with and without BPD (see Table 1 for more specific numbers). The chart agrees with the conclusion discussed above. In the paired bar chart, the distributions of scores for the BPD and non-BPD groups appear similar. If anything, the non-BPD group actually scored higher than its counterpart. In addition, both groups have roughly the same number of peaks. This visual comparison is consistent with the statistical finding that the difference in emotional language between those with and without BPD is not statistically significant. The lack of statistical significance in the t-scores indicates that, based on these texts, there is little difference in emotional language, and by extension in expressed perception, between those with and without BPD. This is interesting in the context of the research question, as previous studies have indicated that people with BPD do respond more emotionally. Furthermore, because this study examined written sources, editing and revision may have minimized the differences between people with and without BPD, which could explain why the results differ from what previous research led us to expect.

Figure 1: This figure consists of a paired bar chart depicting the averages for BPD vs. non-BPD.

In summary, this study found no statistically significant association between a BPD diagnosis and differences in emotional language.


While the study analyzed a broad range of texts and transcripts, the applicability of its results on the relation between emotion and BPD could be improved. For example, a larger dataset would allow better substantiated claims and a stronger conclusion about perception differences, BPD diagnosis, and natural language. A more diverse set of records could also reveal associations that the current data does not. This study’s dataset was small because very few written resources are available from people with known, diagnosed Borderline Personality Disorder. The dataset could therefore be expanded to include recordings or books. This would yield a larger dataset and a more comprehensive understanding of the differences in emotional responses.

In addition, this study relied on existing transcripts and writings obtained through third parties; direct communication with the authors never occurred. In other words, this study did not have direct access to people with BPD, and the findings are limited accordingly. With access to individuals with BPD, the study could include primary sources and yield more robust results grounded in daily life.

Lastly, the only method of exploratory data analysis was natural language processing, and a rudimentary application of it. Implementing other methods of analysis, such as facial emotion recognition, could provide stronger results. This would provide emotional analysis of not just spoken or written language but also body language, giving a more thorough and specific understanding of the different emotional responses. Additionally, exclusively examining BPD limits this study’s reach and potentially misses some context. The study could be extended to include mental illnesses associated with BPD or other mental illnesses entirely.


This study analyzed various written texts and transcripts by people with and without BPD for emotional language separated into eight categories: anger, anticipation, disgust, fear, joy, sadness, surprise, and trust. The results do not indicate any differences in emotional language between people with and without BPD, which contradicts the studies discussed in the Introduction. While the analysis covered all eight emotional categories across several genres, it was limited by its restricted dataset and its lack of direct access to participants with and without BPD. Despite these limitations, this study could still inform mental illness research. It could be extended to other mental illnesses such as depression, anxiety, or bipolar disorder, and it could be extended within the context of BPD given access to more data. By exclusively examining written and transcribed texts, the study is narrowed to a somewhat edited form of language. Therefore, while the results are still valuable, they are less applicable to daily life. If the study were to have access to real subjects with and without BPD, it could more accurately examine the raw effects of Borderline Personality Disorder on emotions and perception.


