Biarritz

Literature review based on a semi-automatic text mining process

Abstract

BACKGROUND: Patient healthcare trajectory is a recently emerging topic in the literature, encompassing broad concepts. However, the rationale for studying patients' trajectories, and how this trajectory concept is defined, remains a public health challenge. Our research focused on patients' trajectories based on disease management and care, while also considering medico-economic aspects of the associated management. We illustrated this concept with an example: a myocardial infarction (MI) occurring in a patient's hospital trajectory of care. The patient follow-up was traced via the prospective payment system (PPS). We applied a semi-automatic text mining process to conduct a comprehensive review of patient healthcare trajectory studies. This review investigated how the concept of trajectory is defined, how it is studied, and what it achieves.
METHODS: We performed a PubMed search to identify reports published in peer-reviewed journals between January 1, 2000 and October 31, 2015. Fourteen search questions were formulated to guide our review. A semi-automatic text mining process based on a semantic approach was performed to conduct a comprehensive review of patient healthcare trajectory studies. Text mining techniques were used to explore the corpus from a semantic perspective in order to answer non-a priori questions. Complementary review methods on a selected subset were used to answer a priori questions.
RESULTS: Among the 33,514 publications initially selected for analysis, only 70 relevant articles were semi-automatically extracted and thoroughly analysed. Oncology is particularly prevalent due to its already well-established processes of care. For the trajectory theme, 80% of articles were distributed across 11 clusters. These clusters contain distinct semantic information, for example health outcomes (29%), care process (26%) and administrative and financial aspects (16%).
CONCLUSION: This literature review highlights the recent interest in the trajectory concept. The approach is also gradually being used to monitor trajectories of care for chronic diseases such as diabetes, organ failure or coronary artery and MI trajectory of care, to improve care and reduce costs. Patient trajectory is undoubtedly an essential approach to be further explored in order to improve healthcare monitoring.
KEYWORDS: Healthcare trajectory; PPS; Semi-automated; Systematic reviews; Text mining; Word cloud.
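
The METHODS paragraph describes a PubMed search restricted to a publication-date window. A minimal sketch of how such a query could be built against the NCBI E-utilities ESearch endpoint is given here. The search term is a hypothetical placeholder, since the fourteen search questions are not listed on this page, and nothing here indicates that the authors used this interface. The code only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_search_url(term, start="2000/01/01", end="2015/10/31", retmax=100):
    """Return an ESearch URL limited to a publication-date window."""
    params = {
        "db": "pubmed",
        "term": term,        # hypothetical query; the real search strings are not given
        "datetype": "pdat",  # filter on publication date
        "mindate": start,    # window taken from the METHODS paragraph
        "maxdate": end,
        "retmax": retmax,    # assumed page size, chosen only for illustration
        "retmode": "json",
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_pubmed_search_url("patient trajectory")
print(url)
```

Fetching the URL would return a list of PubMed identifiers, which could then be used to download the abstracts that make up a corpus.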

Supplemental materials

Word Cloud

  • Trajectory corpus contains 11,331 articles;
  • PPS (Prospective Payment System) corpus contains 18,906 articles;
  • MI (Myocardial Infarction) corpus contains 3,782 articles.

nuage_1
1 - Corpus Trajectory
nuage_2
2 - Corpus PPS
nuage_3
3 - Corpus MI
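
A word cloud is essentially a picture of word frequencies: the more often a word occurs in a corpus, the larger it is drawn. The sketch below shows that counting step only, under simple assumptions (lower-casing, letters-only tokens, a tiny hypothetical stop-word list). The sample texts are invented and the software actually used for the figures is not named on this page.

```python
import re
from collections import Counter

# Hypothetical mini stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def word_frequencies(abstracts):
    """Count how often each non-stop word occurs across all abstracts."""
    counts = Counter()
    for text in abstracts:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


# Invented example texts, not taken from the three corpora.
sample = [
    "Patient trajectory of care after myocardial infarction",
    "Cost of care in the prospective payment system",
    "Trajectory of patient care and cost",
]
freqs = word_frequencies(sample)
print(freqs.most_common(3))
```

The resulting counts are what a drawing routine would map to font sizes.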

Similarity Analysis

graph_simi_1
1 - Corpus Trajectory
graph_simi_2
2 - Corpus PPS
graph_simi_3
3 - Corpus MI
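
The similarity index behind these graphs is not stated on this page. One common choice, assumed here purely for illustration, is word co-occurrence: two words are linked with a weight equal to the number of documents in which both appear. The sketch below computes those weights on invented, pre-tokenised documents.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """For each word pair, count the documents that contain both words."""
    pairs = Counter()
    for words in documents:
        # Sorting makes (a, b) and (b, a) map to the same key.
        for a, b in combinations(sorted(set(words)), 2):
            pairs[(a, b)] += 1
    return pairs


# Invented documents, not taken from the three corpora.
docs = [
    ["patient", "trajectory", "care"],
    ["care", "cost", "payment"],
    ["patient", "care", "cost"],
]
edges = cooccurrence_counts(docs)

# Keep only links seen in at least two documents (arbitrary threshold).
strongest = [pair for pair, n in edges.items() if n >= 2]
print(strongest)
```

A graph layout would then draw the words as nodes and the retained pairs as edges, with thicker edges for higher counts.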

Text Clustering

dendrogramme_1
1 - Corpus Trajectory (80.4% of articles)
dendrogramme_2
2 - Corpus PPS (86.4% of articles)
dendrogramme_3
3 - Corpus MI (98.4% of articles)

Similarity analysis on clusters 3 and 4 from the first clustering of the Trajectory corpus

Together, the two clusters contain 1,645 articles.

class_3
Cluster 3
class_4
Cluster 4

Text Clustering on SubCorpus

Here we re-clustered, corpus by corpus, the articles left unclustered in the first step. Number of unclustered articles:

  • 3,160 articles for Trajectory;
  • 2,433 for PPS;
  • 59 for MI.

dendrosubcorpTraj
1 - Corpus Trajectory (99.9% of articles)
DendroSubCorpPPS
2 - Corpus PPS (75.7% of articles)
DendroSubCorpMI
3 - Corpus MI (79.6% of articles)

Final exploration of the articles left unclustered by these two clusterings

  • 591 articles remain for the PPS corpus, so we explored them by similarity analysis;
  • 12 articles remain for the MI corpus, so we explored them by word cloud.
SASubSubCorpPPS
2 - Corpus PPS
WC-SubSubCorpMI
3 - Corpus MI
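
The 591 and 12 counts in this section are consistent with reading the sub-corpus clustering percentages as shares of each sub-corpus. The short check below reproduces them from the figures reported above; that reading is an assumption, although the arithmetic agrees with it.

```python
# Sub-corpus sizes and clustered shares as reported in the sections above.
subcorpus = {
    "PPS": (2433, 0.757),
    "MI": (59, 0.796),
}


def still_unclustered(n_articles, clustered_share):
    """Articles left over after a clustering that covers `clustered_share`."""
    return round(n_articles * (1 - clustered_share))


leftover = {name: still_unclustered(n, share) for name, (n, share) in subcorpus.items()}
print(leftover)  # {'PPS': 591, 'MI': 12}
```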