Data Science & Soft Computing Lab
Full Members
Dr Daniel Stamate, Lab lead, Data Scientist, Goldsmiths, and University of Manchester
Prof Fionn Murtagh, Data Scientist, Goldsmiths
Dr Ida Pu, Computer Scientist, Goldsmiths
Alexandra Stepanenko, Accelerated Knowledge Transfer - AKT Research Associate, Goldsmiths
Mihai Ermaliuc, PT PhD Candidate Data Science at Goldsmiths, working in Neural Networks, Generative Adversarial Networks, and Large Language Models
Henry Musto, PT PhD Candidate in Data Science, working in predicting Dementia with Survival and Classification Statistical and Machine Learning models on ADNI and ELSA cohorts
John Langham, PT PhD Candidate at Goldsmiths, working on predicting Risk of Dementia with Machine Learning on routine primary care records on CPRD cohorts
Mohamed Saber, PT PhD Candidate Data Science at Goldsmiths, working in Financial Fraud Detection
Jiri Marek, PT PhD Candidate Data Science at Goldsmiths, working in Behavioural Finance
Associated Members
Prof Daniel Stahl, Professor of Medical Statistics and Statistical Learning, Lead of the Precision Medicine and Statistical Learning Group, Institute of Psychiatry, Psychology and Neuroscience, King's College London
Prof Doina Logofatu, Computer Scientist and Mathematician, Frankfurt University of Applied Sciences
Dr Mihaela Breaban, Computer Scientist, University of Iasi
Dr Olesya Ajnakina, Senior Data Scientist and Statistician, King's College London
Mr Frederic Marechal, Data Scientist in industry
Dr Charlotte Wu, Strategic Physician Leader in health systems & technology innovation, Harness Health Partners
Data Science MSc Student Interns
Prad Sree Davuloori, Riya Haran
Former Members
Dr Raph Olaniyan, Dr Asei Akanuma, Dr Wajdi Alghamdi, Karolina Rutcowska, Andrea Katrinecz, Jeremy Ogg, Pedro Lopez, Gabriel Burcea, Rostislav Vorobev, Ruslan Tsygankov, Rubaida Easmin, Mazy Carneiro, Gozde Orhan, Esperanza Ballesteros, Markela Zeneli
Main Research Directions
1. Machine Learning Prediction Modelling in Mental Health
2. Soft Computing, Evolutionary Algorithms and Applications
3. Machine Learning and NLP Sentiment Analysis in Finance
4. Predicting Spectral Reflectance Curves and Applications in Coatings Industry
1. Machine Learning Prediction Modelling in Mental Health
(1.a) Predicting Risk of Dementia with Machine Learning using Routine Primary Care Records – CPRD.
Participants: Daniel Stamate, Fionn Murtagh, Mihai Ermaliuc, John Langham, Charlotte Wu, in collaboration with Prof David Reeves and team at the Centre for Primary Care in the Institute for Population Health, University of Manchester
Our Lab leads the Machine Learning aspects of the study, based on our project on predicting the risk of dementia using routine primary care records, which is being developed in collaboration with the University of Manchester and other academic partners. The project has received media coverage from the BBC. The research concerns the development of novel synergistic approaches to predicting dementia, based on Machine Learning (AI) and Statistical methods, and the development of a prediction tool. Almost 1 million people in the UK are currently living with dementia. There is no cure, and the condition has higher health and social care costs than cancer, stroke and chronic heart disease combined (dementia costs the UK £26 billion per year). Current thinking suggests that 35% of dementia cases could be prevented. Our research project aims to contribute to prevention, and to help improve diagnosis rates (currently at least one third of expected patients do not receive a dementia diagnosis), by predicting the risk of dementia with new machine learning and statistical approaches. The main source of data analysed in this project is the Clinical Practice Research Datalink (CPRD).
(1.b) Predicting Alzheimer's and Dementia with Machine Learning and Statistical Approaches on ADNI, EMIF-AD and ELSA cohorts.
Participants: Daniel Stamate, Daniel Stahl, David Reeves, Henry Musto, Rostislav Vorobev, Ruslan Tsygankov, Olesya Ajnakina, in collaboration with Institute of Psychiatry London - King's College London, UCL, Oxford University, EMIF-AD Consortium partners, and University of Manchester
This topic involves predicting Alzheimer's Disease (AD) and Dementia with innovative Machine Learning and Statistical Learning methodologies on:
i) Alzheimer's Disease Neuroimaging Initiative: ADNI, with methodologies based on Neural Networks and Deep Learning, Gradient Boosting, Gaussian Processes, SVM, and Survival Machine Learning
ii) European Medical Information Framework - Alzheimer's Disease: EMIF-AD (AD Biomarker Discovery), with methodologies based on Gradient Boosting Machines, Random Forests and Deep Learning
iii) English Longitudinal Study of Ageing – ELSA, with methodologies based on Survival Random Forests, Survival Elastic Net and Cox models, and Gradient Boosting classification.
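As a rough illustration of the Cox models mentioned above, the following minimal sketch computes the log partial likelihood of a Cox proportional hazards model with a single covariate and picks the coefficient that maximises it on a coarse grid. It is not the Lab's pipeline: the function name, the toy data and the grid search are all assumptions made for illustration, and tied event times are assumed not to occur.

```python
import math

def cox_partial_log_likelihood(beta, times, events, x):
    """Log partial likelihood of a one-covariate Cox model.

    Assumes no tied event times (illustrative simplification).
    """
    ll = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation at time t_i
        risk = sum(math.exp(beta * x[j])
                   for j, t_j in enumerate(times) if t_j >= t_i)
        ll += beta * x[i] - math.log(risk)
    return ll

# Hypothetical toy data (not from any cohort):
# follow-up time in years, event flag (1 = diagnosis, 0 = censored), one covariate
times = [2.0, 3.5, 5.0, 6.0]
events = [1, 0, 1, 1]
x = [1.0, 0.0, 1.0, 0.0]

# Crude grid search over beta in [-3, 3]; a real analysis would use a proper optimiser
grid = [b / 10 for b in range(-30, 31)]
beta_hat = max(grid, key=lambda b: cox_partial_log_likelihood(b, times, events, x))
```

In practice a dedicated survival analysis library would handle ties, multiple covariates, regularisation (as in the Survival Elastic Net) and inference; the sketch only shows what quantity a Cox model maximises.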
(1.c) Predicting Psychosis
Participants: Daniel Stamate, Daniel Stahl, Wajdi Alghamdi, Andrea Katrinecz, in collaboration with Institute of Psychiatry, Psychology & Neuroscience, King's College London, Department of Psychiatry and Neuropsychology Maastricht University Medical Centre, and Department of Psychiatry, Yale University School of Medicine
Prediction Modelling and Pattern Detection Approaches for the First-Episode Psychosis Associated to Cannabis Use
Recent studies show that cannabis is one of the most widely used drugs in the world, and many countries have started to legalise it. However, recent research demonstrates that cannabis consumption is a significant risk factor for various types of psychosis. As such, research efforts are currently being made to improve the estimation of the contribution of cannabis to the development of psychosis. In this ongoing research we apply data science methodologies based on scalable machine learning and statistical learning to devise novel approaches to predicting first-episode psychosis attributable to the use of high-potency cannabis, and to quantifying risk factors, based on phenotype data. Genotype data is to be added to the analysis in the next phase of the research. The work is performed in collaboration with the teams of Dr Marta Di Forti, Prof Sir Robin Murray, and Prof Daniel Stahl at the Institute of Psychiatry, Psychology & Neuroscience, King's College London.
Predicting Psychosis from Experience Sampling Data using Machine Learning
Modern psychiatric classification systems categorize psychiatric disorders (partly on evidence, largely pragmatically) based on different combinations of the required number of symptom domains that exceed an operational threshold of severity. This taxonomy endorses unique phenotypes with precise boundaries. A prevailing trend in psychiatry has been to reify these categorical diagnoses. However, efforts to discriminate these psychiatric disorders using modern genetic and neuroimaging data have thus far failed to deliver a promising outcome. Evidence indicates commonality rather than distinction. The Experience Sampling Method (ESM), a personal diary method to assess mental states in real time, provides a unique opportunity to observe subtle fluctuations of mental states. It has various advantages over the conventional method of cross-sectional assessment of psychopathology based on self-report questionnaires: high ecological validity, high reliability, no recall bias, high temporal resolution, and contextual information. However, this intense assessment strategy produces a massive amount of information at an individual level. As such, even modern statistical approaches sometimes fail to provide optimal solutions to deal with the complexity of data at this scale. Machine learning offers enhanced solutions for this kind of research challenge when synergistically combined with more traditional statistical methods. The aim of this study is to predict pattern formation and agnostic clustering of the general population using generic ESM data collected with mobile apps. The work is developed in collaboration with Andrea Katrinecz (Data Science MSc graduate), Dr Sinan Guloksuz of the Department of Psychiatry and Neuropsychology at Maastricht University Medical Centre and the Department of Psychiatry, Yale University School of Medicine, and Prof Daniel Stahl of the Department of Biostatistics and Health Informatics, King's College London.
2. Soft Computing, Evolutionary Algorithms and Applications
Participants: Doina Logofatu, Daniel Stamate, Ida Pu, Mihaela Breaban, in collaboration with Frankfurt University of Applied Sciences, and University of Iasi
Soft Computing involves various advances in AI Algorithmics which are specific to the nature of this computing paradigm. This theme addresses the need for efficiency in solving optimisation problems, and the need to offer tractable solutions to specific NP-hard problems, by employing Evolutionary Computing approaches, in particular Genetic Algorithms and Particle Swarm Optimisation algorithms.
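To make the Genetic Algorithm idea above concrete, here is a minimal sketch of a bit-string genetic algorithm with tournament selection, one-point crossover, bit-flip mutation and elitism. The function name, parameter defaults and the toy objective (counting ones in the string) are assumptions chosen for illustration; they stand in for a real optimisation problem and are not taken from the Lab's work.

```python
import random

def genetic_algorithm(fitness, n_bits, pop_size=30, generations=60,
                      crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Maximise `fitness` over bit strings of length n_bits (n_bits >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    history = [fitness(best)]  # best fitness per generation

    def select(population):
        # Binary tournament: pick two individuals, keep the fitter one
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            p1, p2 = select(pop), select(pop)
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation, applied independently to each gene
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history

# Toy objective: number of ones in the string (a stand-in for a real problem)
best, history = genetic_algorithm(sum, n_bits=20)
```

Because the best individual is always copied into the next generation, the recorded best fitness never decreases; for an NP-hard problem the bit string would instead encode a candidate solution and `fitness` would score its quality.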
Separately, devising efficient algorithms for integrating, querying and performing inferences with imperfect information benefits from Soft Computing approaches, such as those based on multi-valued logics, and this is another direction we follow in our research. We develop algorithms for computing the semantics of the integration, querying or inference rules that describe the results of these processes, and for deciding the query equivalence problem, which is useful in query optimisation.
Moreover, statistical simulations are a useful Soft Computing tool that we employ for assessing new algorithms we propose for improving the time-efficiency in blocking expanding ring search for mobile ad hoc networks, or for various concurrency problems.
3. Machine Learning and NLP Sentiment Analysis in Finance
Participants: Daniel Stamate, Rapheal Olaniyan, and Frederic Marechal
There has been an increasing interest recently in examining the possible relationships between emotions expressed online and stock markets. Most of the previous studies claiming that emotions have predictive influence on the stock market do so by developing various machine learning predictive models, but do not validate their claims rigorously by analysing the statistical significance of their findings. In turn, the few works that attempt to statistically validate such claims suffer from important limitations of their approaches.
A growing body of research analyses the relationship between sentiment-filled online information and the stock market, and shows a tendency for the former to predict the latter. But little is known about whether this information's predictive power resolves uncertainty. Rather, it is believed that it induces volatility because investors over-react or under-react to new information as a result of sentimental contagion.
In particular, stock market data exhibit erratic volatility, and this time-varying volatility makes any possible relationship between these variables non-linear. Our work investigates and proposes novel frameworks based on approaches that account for non-linearity and heteroscedasticity. We also study the asymmetric nature of the influences of positive and negative sentiments on stock market volatility.
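To illustrate what time-varying volatility (heteroscedasticity) means here, the sketch below implements the conditional variance recursion of a GARCH(1,1) model, a standard textbook way of modelling it. The source does not state which models the frameworks use, so GARCH(1,1), the function name and the parameter values are all assumptions made purely for illustration.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances under GARCH(1,1):

        sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]

    The first value is set to the unconditional variance omega / (1 - alpha - beta).
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0, alpha + beta < 1")
    sigma2 = []
    prev_var = omega / (1 - alpha - beta)  # long-run (unconditional) variance
    prev_ret = None
    for r in returns:
        if prev_ret is not None:
            # Yesterday's squared return and variance drive today's variance
            prev_var = omega + alpha * prev_ret ** 2 + beta * prev_var
        sigma2.append(prev_var)
        prev_ret = r
    return sigma2

# Hypothetical returns: a calm day, a large shock, then a calm day
variances = garch11_variance([0.0, 2.0, 0.0], omega=0.1, alpha=0.1, beta=0.8)
```

The variance on the day after the shock is higher than on the day of the shock itself, which is the volatility clustering that a constant-variance model cannot capture. Asymmetric effects of positive and negative sentiment would require an extended specification, which this sketch does not attempt.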
Current research is also being extended towards financial fraud detection with NLP and ML approaches.
4. Predicting Spectral Reflectance Curves and Applications in Coatings Industry
Participants: Daniel Stamate, Asei Akanuma, Alexandra Stepanenko, in collaboration with Sherwin-Williams
This research is developed in collaboration with Sherwin-Williams in Knowledge Transfer Partnership (KTP) and Accelerated Knowledge Transfer (AKT) projects co-funded by Innovate UK and by the business partner. The work concerns the development of innovative, state-of-the-art Artificial Neural Network / Deep Learning approaches to colour reflectance curve prediction for optimising the design of new coatings.