PhD proposal: Modeling the behavior of at-risk users on social networks

Main topic

Data mining (text, images, videos, sounds), social networks

Context

Suicide is the deliberate act of ending one's own life. It reveals serious personal problems but also often reflects a deterioration of the social context in which an individual lives. According to a recent and alarming WHO (World Health Organization) report (September 4, 2014), one person dies by suicide every 40 seconds worldwide. This amounts to about 800,000 deaths per year, more than all the yearly victims of wars and natural disasters. Most suicide attempts are handled by hospital emergency units. Suicide is a major public health issue with strong socio-economic consequences. For example, the economic cost of suicide in France was estimated at 5 billion euros in 2009. In the framework of the 2013-2020 Mental Health Action Plan, WHO member states have committed to a 10% reduction in suicide rates in each country by 2020.

Research hypothesis

It is possible to design semi-automatic tools that exploit massive data from social networks and support dynamic, interactive knowledge discovery in order to detect at-risk individuals.

Scientific and technological objectives

The main objective of the thesis is to design and develop new approaches for the early identification of at-risk individuals through their use of social media. The semi-automatic model for detecting suicidal profiles will be used by psychiatrists to follow, on social networks, patients who stayed in their departments after a first suicide attempt. The aim is to capture a possible deterioration in their mental state in order to offer assistance when needed. In this thesis, the PhD student will design and implement an approach integrating different data mining methods, which will be used in a multicenter randomized controlled trial aimed at preventing recurrences.

Topic

This subject is the result of a collaboration between LIRMM and the psychiatric emergency department of the University Hospital of Montpellier. The main objective is to design and implement new approaches for the early detection of at-risk individuals through their use of social media. The semi-automatic model for detecting suicidal profiles, developed as part of this thesis, will be used by psychiatrists to follow, on social networks, patients who stayed in their department after a first suicide attempt. The aim is to capture a possible deterioration in their mental state in order to offer assistance when needed. One of the first deliverables of this thesis is a prototype that integrates different text mining methods and will be used in a multicenter randomized controlled trial to test the usefulness of the model for recurrence prevention. The design and implementation of this first prototype will allow the acquisition of data on real social network users who have attempted suicide and been treated by the appropriate emergency services. To design this prototype, data are already available and in use in the Advanse team (tweets, letters...). An important element of this thesis is the design of interactive methods dedicated to health care professionals (psychiatrists, ...) so that, when an alarm is triggered, the information collected about the patient is presented to them as clearly as possible.

Methods

  • Multi-layer classification (bagging, boosting, stacking) to detect risk symptoms, which are then aggregated into a score, defined within the framework of the thesis, that is meaningful to health professionals;
  • Deep learning to create a specific indicator for images, videos and sounds by comparing new media to the millions of streaming media available on social networks and labelled with information such as "anorexia", "scarification"…;
  • Consideration of the temporal evolution of the previous indicators (topic drift, martingales, etc.);
  • Aggregation of all the previous indicators in the form of a dashboard, recommendations and alerts to support effective decision-making by healthcare professionals;
  • Active learning to take into account interactions with the health professionals who validate or invalidate indicators.
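
The third item above mentions martingales for tracking how indicators evolve over time. As a minimal sketch, assuming each message has already been reduced to a single numeric risk score (a hypothetical simplification), a randomized power martingale over conformal p-values can flag the point at which new scores stop looking exchangeable with earlier ones. The function names, the value epsilon = 0.92 and the alarm threshold of 20 are illustrative assumptions, not choices made in this proposal, and the data are synthetic.

```python
import math
import random


def conformal_p_value(history, score, rng):
    """Randomized conformal p-value of `score` relative to earlier scores.

    A larger score is treated as "stranger". Ties are broken at random so
    that p-values are uniform when the scores are exchangeable.
    """
    greater = sum(1 for s in history if s > score)
    equal = 1 + sum(1 for s in history if s == score)  # the new score ties with itself
    return (greater + rng.random() * equal) / (len(history) + 1)


def first_alarm(scores, epsilon=0.92, threshold=20.0, seed=0):
    """Return the index of the first alarm, or None if no alarm is raised.

    The power martingale multiplies factors epsilon * p**(epsilon - 1).
    It is tracked in log space to avoid overflow. `epsilon` and
    `threshold` are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    log_threshold = math.log(threshold)
    log_martingale = 0.0
    history = []
    for t, score in enumerate(scores):
        # Guard against log(0) if the random tie-breaker returns exactly 0.
        p = max(conformal_p_value(history, score, rng), 1e-12)
        log_martingale += math.log(epsilon) + (epsilon - 1.0) * math.log(p)
        history.append(score)
        if log_martingale > log_threshold:
            return t
    return None


def synthetic_scores(n_before=100, n_after=100, seed=1):
    """Synthetic risk scores: a low regime followed by a high regime."""
    rng = random.Random(seed)
    before = [rng.uniform(0.0, 0.3) for _ in range(n_before)]
    after = [rng.uniform(0.6, 1.0) for _ in range(n_after)]
    return before + after


if __name__ == "__main__":
    alarm = first_alarm(synthetic_scores())
    print("first alarm at message index:", alarm)
```

While the scores remain exchangeable, the martingale is unlikely to grow large (by Ville's inequality it exceeds 20 with probability at most 1/20), so the threshold directly bounds the false alarm rate. In a real setting the scores would come from the classifiers and indicators listed above, and the threshold would be chosen with the clinicians.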

Some publications

  1. E.K. Moscicki, Identification of suicide risk factors using epidemiologic studies, PCNA, 20(3), 499-517, 1997
  2. http://www.who.int/mental_health/suicide-prevention/world_report_2014/fr/
  3. M.-A. Vinet, A. Le Jeanic, T. Lefèvre, C. Quelen, and K. Chevreul, "Le fardeau économique du suicide et des tentatives de suicide en France", Revue d'Épidémiologie et de Santé Publique, vol. 62, p. S62-S63, Feb. 2014.
  4. http://www.who.int/mental_health/action_plan_2013/fr/
  5. M. Donald Tapi Nzali, S. Bringay, C. Lavergne, T. Opitz, J. Azé and C. Mollevi. Construction d’un vocabulaire patient/médecin dédié au cancer du sein à partir des médias sociaux. IC. 2015
  6. Iwan Syarif, Ed Zaluska, Adam Prugel-Bennett, and Gary Wills. 2012. Application of bagging, boosting and stacking to intrusion detection. In Proceedings of the 8th international conference on Machine Learning and Data Mining in Pattern Recognition (MLDM'12), Petra Perner (Ed.). Springer-Verlag, Berlin, Heidelberg, 593-602.
  7. Y. Bengio, I. J. Goodfellow and A. Courville, "Deep Learning", Book in preparation for MIT Press, 2015
  8. C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke and A. Rabinovich, "Going Deeper with Convolutions". In Proceedings of CVPR'2015, June 7-12 2015, Boston, USA.
  9. A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks". In Advances in Neural Information Processing Systems 25, NIPS’2012, F. Pereira, C.J.C. Burges, L. Bottou, and K.Q. Weinberger, Eds., pp. 1097–1105. Curran Associates, Inc., 2012.
  10. João Gama, Indrė Žliobaitė, Albert Bifet, Mykola Pechenizkiy, and Abdelhamid Bouchachia. 2014. A survey on concept drift adaptation. ACM Comput. Surv. 46, 4, Article 44 (March 2014), 37 pages.
  11. Saba Babakhani, Niloofar Mozaffari, and Ali Hamzeh. A martingale approach to detect peak of news in social network. 2014.
  12. Mane, K., Schmitt, C., Owen, P., Wilhelmsen, K., and Prasad, S. (2014): Visual Analytics to Enhance Personalized Healthcare Delivery: A data-driven approach to augment clinical decision making. RENCI, University of North Carolina at Chapel Hill.
  13. A. Abdaoui, J. Azé, S. Bringay, and P. Poncelet, FEEL: French Extended Emotional Lexicon. 2014.
  14. T. Nguyen, T. Tran, S. Gopakumar, D. Q. Phung, S. Venkatesh. An evaluation of randomized machine learning methods for redundant data: Predicting short and medium-term suicide risk from administrative records and risk assessments. CoRR abs/1605.01116 (2016).
  15. Y. LeCun, Y. Bengio, G. Hinton. Deep learning. Nature. 521, 436–444 (28 May 2015).
  16. A. Arnab, S. Jayasumana, S. Zheng, P.H.S. Torr. (2016) Higher Order Conditional Random Fields in Deep Neural Networks. In: Leibe B., Matas J., Sebe N., Welling M. (eds) Computer Vision – ECCV 2016. ECCV 2016. Lecture Notes in Computer Science, vol 9906. Springer.

To apply

The candidate must send the following documents:
  • Letter of motivation
  • C.V.
  • Two letters of recommendation
  • Academic transcripts (bachelor's and master's degrees)
Send the documents to Dr. Maximilien Servajean, Dr. Sandra Bringay and Dr. Jérôme Azé, at the following email addresses: