Predicting Myocardial Infarctions Based on Patients’ Memory of Heart-Related Symptoms

Abstract

Anamnesis in a medical environment is a patient’s report of their personal medical history and self-reported symptoms, including memory of any assessments, such as ones using an ECG. However, the effectiveness of anamnesis in diagnosing diseases still requires investigation. This study focuses on identifying the efficacy of anamnesis in predicting two different myocardial complications: a myocardial rupture and a relapse of a myocardial infarction. A quantitative data analysis was conducted on a dataset cleaned through Python programming. The cleaned dataset was fed into a Random Forest classification model to assess how effective anamnesis was in predicting the myocardial complications. The model predicting a myocardial rupture achieved a precision of 40.00%, a recall of 16.67%, and an F1 score of 23.53%. The model predicting a relapse of a myocardial infarction achieved a precision of 40.00%, a recall of 20.69%, and an F1 score of 27.27%. These metrics show that the two models were not accurate and cannot answer the research question of whether anamnesis is effective in predicting whether patients will have myocardial complications. In the future, researchers should seek a dataset whose anamnesis-related questions are more precise and specific.

Keywords: anamnesis, machine learning, machine learning model, myocardial infarction

Introduction

Myocardial infarction is one of the most lethal diseases across the globe and has grown to be one of the most serious conditions in medicine. Annually in the United States, around 1.5 million people experience a myocardial infarction, resulting in about 30 deaths per hour1. In the event of a myocardial infarction, several complications often follow. To help patients with myocardial infarctions or their complications, doctors rely on anamnesis as an essential diagnostic aid.

Anamnesis is one of the simplest and most common tools used in the medical field, requiring little more than a conversation with the patient, perhaps supplemented by basic instruments such as a stethoscope2. In a medical environment, a doctor uses anamnesis to recognize a patient’s symptoms, identify risk factors, and reach accurate diagnoses that save countless lives3. Through this simple diagnostic tool, information about a patient’s family history, lifestyle, and past medical history can be gathered3. To gain better insight into a patient’s condition, anamnesis is frequently combined with another kind of examination, such as a physical examination. Doctors can understand what diseases a patient could be diagnosed with through anamnesis4. In the cardiac field, anamnesis can improve the accuracy of diagnosing a myocardial rupture or a relapse of a myocardial infarction. In modern medicine, the simplicity and efficacy of anamnesis justify its standing as one of the most dependable tools.

Anamnesis is used widely across medical fields, from primary care to cardiology. For patients at risk of heart complications, doctors can use anamnesis to identify whether a patient has a history of smoking or is at risk due to family history, which helps build an understanding of the patient’s risk factors3. Based on the anamnesis, a doctor would customarily direct patients to another form of testing, such as an electrocardiogram5. For instance, in pediatric cases of cardiac complications, a child’s family history is often acquired through anamnesis so that any possible ailment can be treated as soon as possible3.

There are many symptoms flagged and acknowledged by cardiologists to help better diagnose patients. Symptoms such as nausea, shortness of breath, angina pectoris, and a recent heart attack are all signs that prove helpful to doctors when diagnosing patients. Doctors can interpret these symptoms as indicators of different myocardial complications, such as a myocardial rupture, a myocardial infarction, or chronic heart failure6. Anamnesis is thus a vital diagnostic tool for diagnosing complications in the cardiovascular and cardiac fields.

Despite anamnesis being described as useful in a medical environment, several questions remain concerning its efficacy in diagnosing myocardial complications. Several studies have examined the importance and effectiveness of anamnesis in predicting myocardial complications. For instance, in one study, anamnesis immensely aided the diagnosis of simulated valvulopathies4. In a separate investigation, a group of researchers explored the clinical effect of different simulated situations on EKG interpretation. An EKG, or electrocardiogram, is an examination that observes the electrical signals of the heart in order to diagnose heart conditions. In this experiment, the data showed that anamnesis increased diagnostic accuracy from 4% to 12% compared with a condition in which no such information was provided4.

These experiments suggest how an accurate machine learning model should depict the efficacy of anamnesis in predicting myocardial complications. The machine learning model built and the data analyzed should eventually supply a clear explanation regarding the effectiveness of anamnesis when used to predict myocardial complications. Anamnesis is an important tool used to aid in the diagnosis of patients with health complications.

Literature Review

Several studies have examined anamnesis and its use in various medical fields and environments. In a study by Teixeira et al., researchers evaluated the effectiveness of anamnesis, when combined with a physical examination and lab tests, in clinical decision making in common clinical situations. The test was conducted with 80 patients who had cardiac diseases. They found that in 80% of patients, anamnesis positively affected diagnosis, but it was clinically decisive in only a small proportion of patients; the original diagnosis based on anamnesis was enhanced when a physical examination was added5. These findings add context as to how the present study’s results might have gone if the anamnesis-related questions in the dataset had been better suited to predicting complications.

In a study conducted by Dezman et al., it was examined whether physical examination and medical history would be enough to predict the likelihood of acute coronary syndrome (ACS) for a 50-year-old man. They found that both physical examination and history are useful in predicting the syndrome, but they cannot replace tests such as an ECG7. In a study carried out by Swap and Nagurney, researchers reviewed several articles on identifying acute coronary syndrome in patients with chest pain; they found that chest pain history alone is not enough to safely diagnose a patient with acute coronary syndrome, though certain elements of history are helpful in estimating a patient’s likelihood of the disease8.

In an investigation done by van Dijk et al., 503 adult patients with transient loss of consciousness (TLOC) were given an initial evaluation to gather information on medical history, a physical examination, and ECG history. The initial evaluation was later used to make a final diagnosis. The accuracy of the initial evaluation in making a final diagnosis was 93%, suggesting that medical history and physical examination are both very useful when diagnosing patients with TLOC. The researchers suggest that additional testing beyond medical history, a physical examination, and ECG history is unnecessary9. In a study conducted by Eriksson et al., examiners conducted an observational study on 1,167 patients with chest pain to observe which elements of the medical history and physical examination predicted major adverse cardiac events (MACE) in a 30-day period. Certain features in the medical history could predict a 30-day MACE, but the study also reinforces the idea that, to safely rule out any event or disease, medical history should be paired with some other form of testing10.

In a study by Brandberg et al., researchers collected medical history from patients with acute chest pain through a computerized history taking (CHT) program on a tablet. The results showed that a majority of patients provided enough information for doctors to assign them a risk category. This form of anamnesis suggests that a survey administered this way can be accurate if done correctly11. In a separate investigation, Brandberg et al. used data from a CHT program to identify patients with chest pain at low risk of major adverse cardiac events. The CHT program gathered information on a patient’s history, age, and risk factors; this information was sufficient to correctly identify patients at low risk of such events12. Reinforcing the role of CHT systems as a form of anamnesis, Zhakina et al. reviewed these systems and found that anamnesis information helps provide additional support for clinical decisions13.

On the topic of anamnesis used in a machine learning model, Fukuzawa et al. conducted experiments using different AI models to predict several medical cases. For example, ChatGPT accurately diagnosed 76.6% of cases based solely on medical history, providing evidence that medical history remains a principal part of AI-assisted medical diagnoses14. In a study conducted by Min et al., researchers surveyed over 14,000 patients and collected information, such as cardiac imaging, for all of them. The researchers then built a predictive model based on the patients’ medical history that could accurately predict several patient outcomes, such as death or myocardial infarction15.

Based on this, it is possible that anamnesis would have done a good job of predicting patients’ health problems. In a research paper by Joksimović & Bastać, it is stated that about 50%-70% of a diagnosis is made purely from the anamnesis16. In a different study, Wang et al. examined diagnostic accuracy for patients in the field of ophthalmology. In this study, there were 115 patients from the Doheny Eye Center at the University of California Los Angeles (UCLA). These 115 patients were distributed into different groups based on a variety of classifications17. In a group called “Afferent”, there were 43 patients. Of those 43 cases, 25 were diagnosed correctly based on the chief complaint, which is the main problem a patient is having. Adding the diagnostic tool of patient history raised the diagnostic accuracy by 11 cases, and the diagnostic accuracy reached 100% when physical examination was added17.

Methodology

In this study, a quantitative data analysis was employed to assess a dataset, and the analysis showed the impact of different features on complications that could result from myocardial infarctions. The dataset and the introductory paper related to it were both obtained from the UC Irvine Machine Learning Repository, under the title “Myocardial infarction complications”18. The underlying database was collected from the Krasnoyarsk Interdistrict Clinical Hospital in Russia between 1992 and 1995 and was only publicly distributed shortly before the accompanying research paper was published in 2020.

The original dataset consisted of 1700 patients and 110 features that observed clinical phenotypes in the patients. In addition, the dataset included 12 features labelled as targets that represented possible complications of a myocardial infarction. Of the 12 target features, 11 are binary variables whose only possible values are 0 or 1, indicating whether the patient does (“yes”) or does not (“no”) have the complication. The remaining target feature takes values from 1 to 7, representing a patient’s cause of death.

Furthermore, the database can be used to predict myocardial complications based on information collected at different points in time: either at the time of admission or three days after admission. For the data analysis in this study, data was chosen that related to anamnesis and the self-reported symptoms of the patients.

In the data analysis, the 110 features in the dataset were reduced to 19. The other features were not relevant to the study of anamnesis in relation to myocardial infarction complications; thus, the 19 selected features were kept, while the others were removed during cleaning. Of the 19 features, two represented the age and sex of the patient, while the other 17 were anamnesis-related. The 12 features labelled as targets were reduced to just two: “RAZRIV”, which stood for a myocardial rupture, and “REC_IM”, which stood for a relapse of the myocardial infarction. These were the two features chosen to be predicted through the machine learning model. Accordingly, two datasets were cleaned and used, one per target19.

To clean the dataset, the Python programming language was used: version 3.12.7 of Python along with version 2.2.2 of the Pandas package. All programming and analysis were done with Pandas in Jupyter notebooks via the Anaconda distribution. Several steps were taken to clean the data. The first steps were analyzing the dataset and removing any features that were not relevant to the data being studied. Any unneeded features were removed before loading the dataset into a Jupyter notebook, which made the cleaning of the data more efficient.
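The cleaning steps described above can be sketched in Pandas as follows. This is a minimal illustration, not the study’s actual code: the values and column subset below are a hypothetical toy stand-in for the real dataset.

```python
import pandas as pd

# Toy stand-in for the real dataset (hypothetical values; the actual data is the
# "Myocardial infarction complications" dataset from the UC Irvine repository).
df = pd.DataFrame({
    "AGE":    [63, 63, 70, None, 55],
    "SEX":    [1, 1, 0, 1, 0],
    "nr_07":  [0, 0, 1, 0, None],   # ventricular fibrillation in the anamnesis
    "RAZRIV": [0, 0, 1, 0, 1],      # target: myocardial rupture
})

# The same steps applied to the full dataset: remove duplicate rows,
# then remove rows with missing values.
cleaned = df.drop_duplicates().dropna()
print(len(cleaned))   # 2 rows survive cleaning in this toy example
```

In the real analysis the same two calls reduced the 1700 patients to the 1660 retained for modelling.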

For each feature, it had to be decided whether it needed standardizing, scaling, or encoding. Standardization was used if one column’s values had significantly different ranges compared to other columns, and scaling would be done on numerical features that were not normally distributed. Encoding would only be needed if the data had to be transformed from one format to another; for example, features that contain non-numerical values, such as color or size, can be encoded as numbers. Additionally, duplicate rows, missing values, and outliers had to be found and fixed in the datasets.

None of the features in the dataset needed standardization or scaling. The only feature that needed encoding was the “SEX” feature. Its numeric 0/1 codes were first treated as the categories they represent, and one-hot encoding then turned each category into its own binary indicator column. One-hot encoding was used because it is easily compatible with most machine learning models, including the one used in this study, a Random Forest model.
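As a minimal sketch of this encoding step (the category labels follow the dataset’s 0: female, 1: male coding; the sample values are illustrative), one-hot encoding with Pandas expands the column into one binary indicator column per category:

```python
import pandas as pd

# The real "SEX" column is coded 0/1; labelling it first makes the
# one-hot step explicit.
df = pd.DataFrame({"SEX": [0, 1, 1, 0]})
df["SEX"] = df["SEX"].map({0: "female", 1: "male"})

# One-hot encoding: each category becomes its own 0/1 indicator column.
encoded = pd.get_dummies(df, columns=["SEX"])
print(list(encoded.columns))   # ['SEX_female', 'SEX_male']
```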

After the dataset had been cleaned, the dataframe was loaded from a CSV file with the Python Pandas library. The data was checked for missing values, duplicate rows, and outliers; any duplicate rows were deleted and any missing values were fixed programmatically. Any column that needed to be converted from numeric to categorical was converted.

After the data cleaning had been completed, code was run to extract information from the dataset. The code used the Scikit-learn library to make a Random Forest model. The ratio of train, validation, and test data was 80/0/20: 80% of the data was in the training subset, no data was set aside for validation, and 20% of the data was in the test subset. The 80/20 train/test split was suited to the small size of the dataset used in this study, leaving enough training data to build a better model.
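A minimal sketch of this modelling step with scikit-learn, using synthetic data in place of the cleaned dataset (the feature values and random seeds here are illustrative, not the study’s):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned data: 200 patients, 19 binary features.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 19))
y = rng.integers(0, 2, size=200)

# 80/20 train/test split, with no validation subset, as described above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)

model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
print(predictions.shape)   # (40,) — one prediction per test patient
```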

A Random Forest uses decision trees, which branch in two directions based on whether the statement at each node is true or false. A Random Forest model consists of multiple trees to improve accuracy when producing a result. In this study, a Random Forest was used to build a machine learning model representing how well the anamnesis-related features predicted the two myocardial complications. The code extracted information such as the confusion matrix, sensitivity, and specificity, which provided much useful input for the quantitative data analysis.

In a Random Forest model, there are no loss functions, gradient-based optimization methods, or activation functions in the usual sense. Instead of a loss function, Random Forests for classification use Gini impurity, an impurity measure, for the nodes in decision trees. An impurity measure evaluates the quality of a split in a decision tree by measuring how mixed the classes are at a particular node: low impurity means most or all points belong to one class, while high impurity means there is a large mix of data points between classes. Random Forests use bootstrap sampling and feature bagging as their optimization process. Bootstrap sampling is the random sampling of rows in a dataset, creating multiple subsets of the same dataset, and feature bagging means that only a random subset of features is considered at each branching point in a decision tree. Random Forest models do not use activation functions.
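For a single node, the Gini impurity is 1 minus the sum of the squared class proportions. A short illustrative function (not from the study’s code) makes the pure-versus-mixed behaviour concrete:

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())

# A pure node (all one class) has impurity 0.0;
# an evenly mixed two-class node has impurity 0.5.
print(gini_impurity([1, 1, 1, 1]))   # 0.0
print(gini_impurity([0, 0, 1, 1]))   # 0.5
```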

To gather more information on the dataset, the code was adjusted to assess the model based on precision, recall, and F1 score. The data gathered this way provides a different insight from the original code, which evaluated the model through specificity and sensitivity. One of the final steps was fitting the dataframe and presenting the data in two Random Forest models, one for each target.
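The adjusted evaluation can be sketched with scikit-learn’s metric functions; the true and predicted labels below are illustrative, not the study’s data:

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score, f1_score

# Illustrative true and predicted labels (not the study's data).
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 1, 0, 1, 1, 0, 0]

print(confusion_matrix(y_true, y_pred))   # rows: actual class, columns: predicted class
print(precision_score(y_true, y_pred))    # TP / (TP + FP)
print(recall_score(y_true, y_pred))       # TP / (TP + FN)
print(f1_score(y_true, y_pred))           # harmonic mean of precision and recall
```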

These models showed how well anamnesis did in predicting the different complications of a myocardial infarction. The machine learning models helped answer the salient question of whether a patient’s self-reported symptoms are beneficial in predicting future myocardial complications.

There is much information about the various uses and effects of anamnesis in general and in specific fields, and useful information was accumulated from various research papers and websites. The introductory paper on the UC Irvine Machine Learning Repository, titled “Trajectories, bifurcations, and pseudotime in large clinical datasets: applications to myocardial infarction and diabetes data”, was a primary source used in gathering information for this research paper18. This introductory paper is directly tied to the dataset that was used, providing a clearer understanding of the data and the different variables in the study. Along with it, many other sources were notable in compiling data, such as papers covering pediatric cardiac anamnesis, coronary heart disease, and cardiovascular physical examination and anamnesis. Considerable background information went into this study of anamnesis and its effectiveness in predicting myocardial complications.

Results

Results of the machine learning model were presented through the precision, recall, and F1 score. Precision and recall are both classification metrics, meaning they are used to assess the performance of a machine learning model. The F1 score is the harmonic mean of precision and recall, meaning it is dominated by the smaller value and will consequently be low if either precision or recall is low. It is commonly used for measuring classification performance20. These metrics are computed from confusion matrices, which show what a machine learning algorithm got right versus what it got wrong. In a confusion matrix, there are four outcomes: false positive, false negative, true positive, and true negative. These four outcomes are presented in a confusion matrix and help demonstrate how precisely a machine learning model can predict a specific topic20.

In context to the study presented, the four outcomes in a confusion matrix show important information on the accuracy of the machine learning model. A true negative happens when the model correctly predicts that the patient didn’t have the complication, while a false negative occurs when the model predicts that the patient didn’t have the complication, when in reality, the patient actually did have it. Similarly, a true positive occurs when the model correctly predicts that the patient had the complication, and a false positive occurs when the model predicts that the patient had a complication, even though the patient didn’t actually have it.


                    Predicted Negative      Predicted Positive
Actual Negative     True Negative (TN)      False Positive (FP)
Actual Positive     False Negative (FN)     True Positive (TP)
Table 1 | Possible outcomes in a confusion matrix

Accuracy can be defined as (True Positives + True Negatives) / (True Positives + True Negatives + False Positives + False Negatives). In simpler terms, this can be interpreted as (total correct guesses) / (total number of guesses). Precision is (True Positives) / (True Positives + False Positives), meaning the metric is essentially (correct positive guesses) / (all positive guesses made). Recall is (True Positives) / (True Positives + False Negatives), the proportion of actual positives that the model identified21. The F1 score balances precision and recall and can be calculated as (2 × Precision × Recall) / (Precision + Recall)20. All four of these metrics are computed directly from a confusion matrix.

Oftentimes, researchers will use either specificity, accuracy, and sensitivity, or precision, recall, and F1 score to see how well the model predicts data. In the machine learning model relevant to this data, F1 score, precision, and recall were used for several reasons. Some models have class imbalances, which means that the total number of subjects in one class greatly exceeds the other. For example, in a model predicting how many people have a certain rare disease, the number of people who do not have the disease will be far greater than the number of patients diagnosed with it. In such scenarios, the accuracy of a model may seem high because the model is good at predicting the majority class while being inadequate at predicting the minority class21.

Furthermore, specificity measures performance in predicting the majority (negative) class, so it may be high by default without accounting for the minority class. Sensitivity accounts for how many actual positives were detected, but not how many of the positive predictions were correct20. In comparison, precision shows how many of the positive predictions were actually correct, and the F1 score balances precision and recall, making it immensely useful for models where a class imbalance is present.

In this study, a total of 1660 patients were retained for the cleaned database. Out of the original 1700 patients, 40 were removed for several reasons, including having missing values or being duplicate rows. Among the retained features, several could have proved more important than others in the context of the research question. For instance, one feature in the dataset represented obesity present in the anamnesis19. Obesity could point to several heart complications and is one of the features that can be observed through a simple anamnesis. The results of the study relied on the data for each feature and how well the features predicted both targets.

All of these metrics were calculated through Python code run in a Jupyter notebook. These statistics show how the machine learning model did in predicting the two targets, relapse of a myocardial infarction and myocardial rupture. For the target “RAZRIV”, representing a myocardial rupture, the model achieved a precision of 40.00%, a recall of 16.67%, and an F1 score of 23.53%. The confusion matrix disclosed that the machine learning model predicted 2 true positives, 3 false positives, 10 false negatives, and 317 true negatives. For the target “REC_IM”, representing a relapse of a myocardial infarction, the model attained a precision of 40.00%, a recall of 20.69%, and an F1 score of 27.27%. The confusion matrix for this target showed 294 true negatives, 9 false positives, 6 true positives, and 23 false negatives.
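These figures can be verified directly from the confusion-matrix counts; for the “RAZRIV” target:

```python
# Confusion-matrix counts reported for "RAZRIV".
tp, fp, fn, tn = 2, 3, 10, 317

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(round(precision * 100, 2))   # 40.0
print(round(recall * 100, 2))      # 16.67
print(round(f1 * 100, 2))          # 23.53
```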

                    Predicted Negative      Predicted Positive
Actual Negative     317                     3
Actual Positive     10                      2
Table 2 | Confusion matrix for target “RAZRIV”

                    Predicted Negative      Predicted Positive
Actual Negative     294                     9
Actual Positive     23                      6
Table 3 | Confusion matrix for target “REC_IM”

Discussion

The results of the study do not definitively support or reject the original hypothesis that anamnesis can improve correct diagnosis of a myocardial rupture or a relapse of a myocardial infarction. While the hypothesis assumed that anamnesis would do well in predicting the two myocardial complications, the results were inconclusive. For the target “RAZRIV”, the recall, precision, and F1 score were all quite low. Additionally, as stated before, the accuracy could be misleadingly high because of the class imbalance. Likewise, for the target “REC_IM”, the recall, precision, and F1 score all had low percentages. This suggests that for both targets, the machine learning model’s performance was limited in predicting the myocardial complications based on the anamnesis-related questions in the dataset.

One reason the metrics observed were lower than expected is that the features used to predict the complications must themselves be put into question. Excepting the features “AGE” and “SEX”, the other 17 features were relevant to the topic of anamnesis. After examining the feature descriptions, many details of the features could potentially explain the poor results of this study.

Anamnesis is subjective to each patient, and each patient has their own pain tolerance. For example, one patient could confuse heartburn with angina or a myocardial infarction, while to another, heartburn could seem like nothing. Similarly, a patient having a panic or anxiety attack could misinterpret it as atrial fibrillation.

Many features observe cardiac problems in the anamnesis and could be subject to problems when being diagnosed through anamnesis alone. For example, the feature “nr_07” represents ventricular fibrillation in the anamnesis. In a simple anamnesis, finding and diagnosing ventricular fibrillation must carry some bias based on what the patient says; in other cases, it would be very hard or impossible to diagnose the cardiac problem at all. The same could be said for other features; for example, feature “np_08” represents a complete LBBB in the anamnesis, and feature “np_01” stands for a first-degree AV block in the anamnesis. An LBBB is a Left Bundle Branch Block, a heart problem in which electrical signals to the left ventricle of the heart are delayed or completely blocked, and an AV block is a heart issue in which the rhythm of the electrical signals is delayed.

These conditions are close to impossible to diagnose based on a simple anamnesis, so doctors most likely used some tools to diagnose them; alternatively, doctors could have recorded these cardiac problems based on what the patients told them, but as stated before, this would carry some subjectivity regarding what the patient thinks happened to them. When analyzing the results of this study, it is important to find the problems that led to the results and explain why the machine learning model was so inaccurate.

In the future, doctors could take steps to determine whether patients can reliably understand their own health complications. Surveys used to gather information on patients should include questions about whether patients can interpret their symptoms. For example, patients could mistake heartburn-related chest discomfort or asthma for serious cardiac abnormalities. An example survey question could ask whether a patient knows the difference between chest pain due to heartburn and chest pain due to a heart condition. It would not be practical to simply ask patients if they have had problems such as heartburn, because heartburn does not always affect the chest; for instance, heartburn can be sensed in the neck and esophagus for some patients. Surveys should aim to establish whether or not patients understand myocardial symptoms and complications.

In application to the real world, this study on anamnesis and its effectiveness in predicting future myocardial complications can support its usefulness. In current medical environments, anamnesis is used to help doctors diagnose diseases and identify risk factors to various health complications. Doctors use anamnesis as an important diagnostic tool to gather information on a patient’s life.

Looking at these various research experiments and the data collected, the findings can be compared with the results presented in this paper. The data collected by both Joksimović & Bastać and Wang et al. point to the same conclusion16,17: anamnesis is a great tool when it comes to diagnosing health complications. Both groups of researchers found that more than half of patients were correctly diagnosed based purely on anamnesis and the chief complaint.

On another note, a couple of reasons were found for why the model was unable to reliably predict the complications in relation to the original dataset by Golovenkin et al.19. The model used data collected from the Krasnoyarsk Interdistrict Clinical Hospital in Russia between 1992 and 1995. This dataset was used because it provided public and comprehensive records relevant to the study of myocardial infarction complications. The patient selection criteria and the fact that the data are from the 1990s suggest that the survey questions could be outdated and not feasible for this study of using anamnesis to predict myocardial complications19. Another reason could be the anamnesis questions asked in the features: most features were questions about conditions that a doctor could not observe through a simple anamnesis.

In the original dataset, there were four possible times at which information could have been collected in the anamnesis: at the time of admission, or at 24, 48, or 72 hours after admission. According to Section 3 of the descriptive statistics for the myocardial infarction complications dataset, anamnesis data for the features relevant to this study was collected at the time of admission. This applied to all the features used in creating the Random Forest model. Therefore, the features designated as anamnesis represented the patients’ memory of their own medical history and symptoms, including any memory of a prior assessment that resulted in a diagnosis of a complication22.

The main goal of the original paper by Golovenkin et al. differed from the research objective of this study18. The numerous features and targets in the original dataset served a different purpose, and this study extracted a subset of those features to build a machine learning model predicting separate health complications. Taken out of their original context, these features proved impractical for building an accurate machine learning model for predicting myocardial complications. Because the anamnesis-related questions in the dataset could not support an accurate model, the model was unable to give a definite answer to the research question.

The findings of the present study are best understood alongside these separate studies: they suggest that anamnesis is a reliable diagnostic tool, giving a sense of what the results might have shown had the model been accurate. The various papers on the effectiveness of anamnesis in diagnosing future health complications provide valuable context for interpreting the outcomes of this research. In conclusion, based on the studies of other researchers, anamnesis can still be regarded as a very useful tool for diagnosing patients.

| Feature Name | Feature Description | Role | Type |
| --- | --- | --- | --- |
| ID | Record ID: unique identifier. Cannot be related to participants; can be used for reference only. | ID | Integer |
| AGE | Age of patient. | Feature | Integer |
| SEX | 0: female, 1: male | Feature | Binary |
| nr_11 | Observing of arrhythmia in the anamnesis | Feature | Binary |
| nr_01 | Premature atrial contractions in the anamnesis | Feature | Binary |
| nr_02 | Premature ventricular contractions in the anamnesis | Feature | Binary |
| nr_03 | Paroxysms of atrial fibrillation in the anamnesis | Feature | Binary |
| nr_04 | A persistent form of atrial fibrillation in the anamnesis | Feature | Binary |
| nr_07 | Ventricular fibrillation in the anamnesis | Feature | Binary |
| nr_08 | Ventricular paroxysmal tachycardia in the anamnesis | Feature | Binary |
| np_01 | First-degree AV block in the anamnesis | Feature | Binary |
| np_04 | Third-degree AV block in the anamnesis | Feature | Binary |
| np_05 | LBBB (anterior branch) in the anamnesis | Feature | Binary |
| np_07 | Incomplete LBBB in the anamnesis | Feature | Binary |
| np_08 | Complete LBBB in the anamnesis | Feature | Binary |
| np_09 | Incomplete RBBB in the anamnesis | Feature | Binary |
| np_10 | Complete RBBB in the anamnesis | Feature | Binary |
| endocr_01 | Diabetes mellitus in the anamnesis | Feature | Binary |
| endocr_02 | Obesity in the anamnesis | Feature | Binary |
| endocr_03 | Thyrotoxicosis in the anamnesis | Feature | Binary |
| RAZRIV | Myocardial rupture | Target | Binary |
| REC_IM | Relapse of myocardial infarction | Target | Binary |

Table 4 | Characterization of features from dataset
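As context for how the two models were trained and evaluated, the pipeline can be sketched as follows. This is a minimal, self-contained illustration: scikit-learn's `RandomForestClassifier` is assumed as the model family named in the study, but the synthetic binary features stand in for the real Table 4 columns, and the hyperparameters here are made up rather than taken from the original work:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the binary anamnesis features (19 features,
# like the feature columns of Table 4) with a rare positive target,
# mimicking the class imbalance of a myocardial complication.
X = rng.integers(0, 2, size=(400, 19))
y = (rng.random(400) < 0.15).astype(int)  # ~15% positives

X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_tr, y_tr)
pred = clf.predict(X_te)

# The three metrics reported in the study; zero_division=0 guards
# against the degenerate case where no positives are predicted.
precision = precision_score(y_te, pred, zero_division=0)
recall = recall_score(y_te, pred, zero_division=0)
f1 = f1_score(y_te, pred, zero_division=0)
print(f"precision={precision:.2%} recall={recall:.2%} f1={f1:.2%}")
```

With features that carry little signal about the target, as here, the metrics come out low, which mirrors the failure mode the study reports.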

Conclusion

Anamnesis is a widely used diagnostic tool that helps doctors across the globe treat patients by gathering information on a patient's past medical history, risk factors, and lifestyle. Given how much patient information it gathers, the question arises whether it is also effective at the stage of diagnosing a disease or administering a treatment. In this study, two machine learning models were built to predict myocardial complications based on anamnesis, one for each target: myocardial rupture and relapse of myocardial infarction. The models built on this subset of data were inaccurate. Their performance was characterized through recall, precision, and F1 score, all of which were low, indicating that the models performed poorly when diagnosing patients from the anamnesis-related questions they were presented.
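The reported F1 scores follow directly from each model's precision/recall pair, since F1 is the harmonic mean of the two. A quick check against the values given in the abstract:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Rupture model: precision 40.00%, recall 16.67%
print(round(100 * f1(0.4000, 0.1667), 2))  # -> 23.53

# Relapse model: precision 40.00%, recall 20.69%
print(round(100 * f1(0.4000, 0.2069), 2))  # -> 27.27
```

Both computed values match the F1 scores of 23.53% and 27.27% reported for the two models.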

Nevertheless, much can be learned from this study. Other work supports the view that anamnesis is helpful when diagnosing a patient, but whether it is effective specifically for diagnosing myocardial complications warrants further research. When revisiting the research question proposed in this study, analyzing other researchers' results across the literature on anamnesis can prove useful.

In this study, it was concluded that the machine learning models could not establish the effectiveness of anamnesis in predicting the two myocardial complications, largely because the anamnesis questions available in the dataset could not support an accurate model. Future quantitative data analysis on the efficacy of anamnesis in diagnosing myocardial infarctions should utilize a dataset composed of more definite features that could aid in creating an accurate model. In conclusion, because the model was inaccurate, the negative results obtained through this study cannot answer the question of whether anamnesis is an effective tool for diagnosing patients for myocardial complications.

References

  1. Ritchey, M. D., Wall, H. K., Gillespie, C., George, M. G., Jamal, A., & Division for Heart Disease and Stroke Prevention, CDC (2014). Million hearts: prevalence of leading cardiovascular disease risk factors – United States, 2005–2012. MMWR. Morbidity and Mortality Weekly Report, 63(21), 462–467.
  2. Zlatkova, K., & Zlatkov, Y. (n.d.). Anamnesis of Cardiovascular Disease. Knowledge International Journal, 68(4).
  3. Zucchermaglio, C., Alby, F., & Fatigante, M. (2016). What counts as illness? Anamnesis as a collaborative activity. TPM: Testing, Psychometrics, Methodology in Applied Psychology, 23(4), 471–487.
  4. Masic, I., Begic, Z., Naser, N., & Begic, E. (2018). Pediatric cardiac anamnesis: Prevention of Additional Diagnostic Tests. International Journal of Preventive Medicine, 9, Article 5.
  5. Teixeira, I. S., Borges, V. L., Viola, N., Moreira, H. T., Filho, A. P., Schmidt, A., Marin-Neto, J. A., & Romano, M. M. D. (n.d.). How Cardiovascular Physical Examination Impacts Clinical Decision-Making in Various Scenarios of Cardiac Valvular Diseases. Sociedade Brasileira de Cardiologia, 1–9. https://doi.org/10.36660/abc.20240272i
  6. Mahmuda, I. N. N., Nurkusumasari, N., Nofaldi, F., Astuti, P. P., Syafitri, F. D., & Dessy (2021). Coronary Heart Disease: Diagnosis and Therapy. Solo Journal of Anesthesi, Pain, and Critical Care, 1(2), 74–87. https://dx.doi.org/10.20961/soja.v1i2.54984
  7. Dezman, Z., Mattu, A., & Body, R. (2017). Utility of the History and Physical Examination in the Detection of Acute Coronary Syndromes in Emergency Department Patients. Western Journal of Emergency Medicine, 18(4), 752–760. https://doi.org/10.5811/westjem.2017.3.32666
  8. Swap, C. J., & Nagurney, J. T. (2005). Value and Limitations of Chest Pain History in the Evaluation of Patients With Suspected Acute Coronary Syndromes. JAMA, 294(20), 2623–2629. https://doi.org/10.1001/jama.294.20.2623
  9. Van Dijk, N., Boer, K. R., Colman, N., Bakker, A., Stam, J., Van Grieken, J. J. M., Wilde, A. A. M., Linzer, M., Reitsma, J. B., & Wieling, W. (2007). High Diagnostic Yield and Accuracy of History, Physical Examination, and ECG in Patients with Transient Loss of Consciousness in FAST: The Fainting Assessment Study. Journal of Cardiovascular Electrophysiology. https://doi.org/10.1111/j.1540-8167.2007.00984
  10. Eriksson, D., Khoshnood, A., Larsson, D., Lundager-Forberg, J., Mokhtari, A., & Ekelund, U. (2020). Diagnostic Accuracy of History and Physical Examination for Predicting Major Adverse Cardiac Events Within 30 Days in Patients With Acute Chest Pain. The Journal of Emergency Medicine, 58(1), 1–10. https://doi.org/10.1016/j.jemermed.2019.09.044
  11. Brandberg, H., Sundberg, C. J., Spaak, J., Koch, S., Zakim, D., & Kahan, T. (2021). Use of Self-Reported Computerized Medical History Taking for Acute Chest Pain in the Emergency Department – the Clinical Expert Operating System Chest Pain Danderyd Study (CLEOS-CPDS): Prospective Cohort Study. Journal of Medical Internet Research, 23(4), e25493. https://doi.org/10.2196/25493
  12. Brandberg, H., Sundberg, C. J., Spaak, J., Koch, S., Zakim, D., & Kahan, T. (2025). Performance of computerized self-reported medical history taking and HEAR score for safe early rule-out of cardiac events in acute chest pain patients: the CLEOS-CPDS prospective cohort study. European Heart Journal – Digital Health, 6(1), 104–114. https://doi.org/10.1093/ehjdh/ztae087
  13. Zhakhina, G., Tapinova, K., Kanabekova, P., & Kainazarov, T. (2023). Pre-consultation history taking systems and their impact on modern practices: Advantages and limitations. Journal of Clinical Medicine of Kazakhstan, 20(6), 26–35. https://doi.org/10.23950/jcmk/13947
  14. Fukuzawa, F., Yanagita, Y., Yokokawa, D., Uchida, S., Yamashita, S., Li, Y., Shikino, K., Tsukamoto, T., Noda, K., Uehara, T., & Ikusaka, M. (2024). Importance of Patient History in Artificial Intelligence–Assisted Medical Diagnosis: Comparison Study. JMIR Medical Education, 10(1), e52674. https://doi.org/10.2196/52674
  15. Min, J. K., Dunning, A., Gransar, H., Achenbach, S., Lin, F. Y., Al-Mallah, M., Budoff, M. J., Callister, T. Q., Chang, H.-J., Cademartiri, F., Maffei, E., Chinnaiyan, K., Chow, B. J. W., D’Agostino, R., DeLago, A., Friedman, J., Hadamitzky, M., Hausleiter, J., Hayes, S. W., & Kaufmann, P. (2015). Medical History for Prognostic Risk Assessment and Diagnosis of Stable Patients with Suspected Coronary Artery Disease. The American Journal of Medicine, 128(8), 871–878. https://doi.org/10.1016/j.amjmed.2014.10.031
  16. Joksimović, Z., & Bastać, D. (2022). Anamnesis – the skill and art of clinical medicine.
  17. Wang, M. Y., Asanad, S., Asanad, K., Karanjia, R., & Sadun, A. A. (2018). Value of medical history in ophthalmology: A study of diagnostic accuracy. Journal of Current Ophthalmology, 30(4), 359–364. https://doi.org/10.1016/
  18. Golovenkin, S., Bac, J., Chervov, A., Mirkes, E. M., Orlova, Y. V., Barillot, E., Gorban, A. N., & Zinovyev (2022, May 15). Trajectories, bifurcations and pseudotime in large clinical datasets: Applications to myocardial infarction and diabetes data.
  19. Golovenkin, S., Shulman, V., Rossiev, D., Shesternya, P., Nikulina, S., Orlova, Y., & Voino-Yasenetsky, V. (2020). Myocardial infarction complications [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C53P5M
  20. Baratloo, A., Hosseini, M., Negida, A., & El Ashal, G. (n.d.). Part 1: Simple definition and calculation of accuracy, sensitivity and specificity. https://pmc.ncbi.nlm.nih.gov/articles/PMC4614595/
  21. Mohit (2025, May 2). Classification Problem: Relation between Sensitivity, Specificity and Accuracy. Analytics Vidhya. https://www.analyticsvidhya.com/blog/2021/06/classification-problem-relation-between-sensitivity-specificity-and-accuracy/
  22. Golovenkin, S., Gorban, A., Mirkes, E., Shulman, V., Rossiev, D., Shesternya, P., Nikulina, S., Orlova, Y., & Dorrer, M. (2020). Myocardial infarction complications Database [Dataset]. University of Leicester. https://doi.org/10.25392/leicester.data.12045261
