Invited Talks in Data Science,
Goldsmiths College, University of London

Series of talks with an intended audience including scholars with interests in Data Analytics,
Big Data, and Data-Intensive interdisciplinary research and applications

The series is also part of the Data Science MSc curriculum

Organised by Dr Daniel Stamate, Department of Computing


24th February 2017, 11am-12pm, venue: PSH 305, New Academic Building

Guest speaker: Mr John Morton
Talk: Big Data Analytics: Changing the way you do business

Great technology and strong analytical skills are what the marketplace demands. This discussion focuses on how to help businesses and organisations understand the value of data, use data effectively, or disrupt established industries. The presentation covers new business models and uses for Big Data, Open Data and Your Data within organisations.

John Morton has 30 years' experience in delivering information exploitation solutions in a range of industries (the last 5 years specifically in Financial Services); runs a consultancy company advising on disruptive technologies like Big Data; and mentors and advises a number of start-ups, five of which are exploiting open and Big Data. He has held Chief Technology Officer positions within Intel and SAS Institute (the major business analytics company).


See John at:
IT Leaders-Analytics Masterclass, 3 March 2016, Thames Valley, UK
BCS Entrepreneurs Big Data, 7 March 2016, London, UK
Faculty of Mathematics and Informatics, 21-25 March 2016, Vilnius, Lithuania
Insurance IoT Europe Summit, 7-8 June 2016, London, UK
Marketing BigData Analytics, 9 June 2015, Reading, UK
Retail Banking Analytics Europe, 20-21 June 2016, London, UK

3rd March 2017, 11am-12pm, venue: PSH 305, New Academic Building
Guest speaker: Sabrina Duggan, Goldsmiths Careers Service
Data Science employability seminar
(reserved to Data Science MSc students)

10th March 2017, 11am-12pm, venue: PSH 305, New Academic Building

Guest speaker: Dr Daniel Stahl, King's College London
Integrating Machine Learning Methods in Standard Medical Research Studies (slides)

Machine learning is typically used to analyse large, complex datasets in which the number of variables is often larger than the sample size (“p>>n”), such as in neuroimaging (comparing the activity of millions of brain voxels) or bioinformatics (comparing expression profiles of hundreds of thousands of genes across various experimental conditions or phenotypes). Statistical modelling is predominantly used in other medical research areas, such as the analysis of randomized clinical trials and experimental studies, or epidemiology (the study of the patterns, causes, and effects of health and disease conditions in defined populations).

The aim of this presentation is to compare machine learning and statistical modelling approaches and to highlight similarities and differences. I will then assess the usefulness of machine learning algorithms for applications in medical research as an alternative to classical statistical modelling methods.

As an example of the usefulness of integrating machine learning methods in medical research, I will present a re-analysis of an event-related brain potential (ERP) dataset from infants at high or low risk of developing autism. Event-related brain potential is a non-invasive method of measuring brain activity during cognitive processing with high temporal resolution. The standard analysis of averaged ERP measurements usually involves a large number of univariate mean group comparisons, resulting in a multiple testing problem. Machine learning methods combined with cross-validation make it possible to assess the predictive performance of a derived model, thereby avoiding multiple testing problems.
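The cross-validation idea mentioned above can be illustrated with a minimal sketch. It is not the speaker's analysis: the data are synthetic stand-ins (not ERP measurements), the nearest-centroid classifier is an arbitrary simple choice, and all function names are hypothetical. The point is only that a single out-of-sample accuracy estimate replaces many separate univariate tests.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Split range(n) into k shuffled, non-overlapping folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cross_validated_accuracy(X, y, k=5, seed=0):
    """Estimate out-of-sample accuracy of a nearest-centroid classifier."""
    folds = k_fold_indices(len(X), k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        # Fit on the training folds only: one centroid per class label.
        centroids = {
            label: centroid([X[i] for i in train_idx if y[i] == label])
            for label in {y[i] for i in train_idx}
        }
        # Score on the held-out fold, which the model has never seen.
        for i in test_idx:
            pred = min(centroids, key=lambda c: sq_dist(X[i], centroids[c]))
            correct += pred == y[i]
    return correct / len(X)

# Synthetic stand-in data (assumption): two groups of 30 subjects,
# 20 features each, with the second group's mean shifted by 1.
rng = random.Random(42)
X, y = [], []
for label, shift in ((0, 0.0), (1, 1.0)):
    for _ in range(30):
        X.append([rng.gauss(shift, 1.0) for _ in range(20)])
        y.append(label)

accuracy = cross_validated_accuracy(X, y, k=5)
print(f"5-fold cross-validated accuracy: {accuracy:.2f}")
```

Because every prediction is made on a fold that was excluded from fitting, the reported accuracy summarises the whole multivariate model in one number rather than one p-value per feature.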

Dr Daniel Stahl is a Reader in the Department of Biostatistics and Health Informatics, Institute of Psychiatry, Psychology & Neuroscience, King's College London, where he is the Head of the Statistical Learning Group. He is a leading scientist in the areas of Biostatistics, Statistical Learning and Machine Learning, whose research focuses on Stratified Medicine and related topics.


17th March 2017, 11am-12pm, venue: PSH 305, New Academic Building

Guest speaker: Mr James Ravenscroft, University of Warwick
Talk: Scientific progress and using Natural Language Processing for the greater good

James is a Machine Learning and AI software specialist. He has a background in Artificial Intelligence and Robotics, and is halfway through a PhD in Natural Language Processing, which he pursues part time alongside lecturing and speaking at conferences on behalf of the University of Warwick. By 2016 James had worked for 3 years as a Cognitive Solutions Architect for IBM’s Watson division, and he has recently started a role as CTO of his new AI company, Filament, developing Machine Learning solutions.


24th March 2017, 11am-12pm, venue: PSH 305, New Academic Building

Guest speaker: Dr Valeriia Haberland, Centre for Intelligent Data Analytics
Talk: Data Standardization for Business Addresses using Location-Based Services


Location-based services, such as Google Maps, are becoming increasingly important in our daily lives. Companies also rely on address information, not only to send correspondence to the right destination but also to match records and/or identify duplicates in their databases automatically. However, addresses are often misspelled, inconsistently structured or incomplete due to factors such as human error and imperfect database structure. Although available location-based services might have their own shortcomings, their data can be used to improve the quality of postal addresses for companies. In this talk, we will discuss how to retrieve the required address data from external APIs (e.g. Google Places API), interpret these data and integrate the resulting addresses with available business address information.
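The record-matching step described above can be sketched as follows. This is an illustration under stated assumptions, not the method presented in the talk: it calls no external API, the reference addresses are made up and stand in for data a location-based service might return, and the abbreviation table, function names and the 0.85 threshold are all hypothetical choices.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, deliberately tiny abbreviation table (assumption).
# A real system would need care here: "St" can also mean "Saint".
ABBREVIATIONS = {"rd": "road", "st": "street", "ave": "avenue", "ln": "lane"}

def normalise(address):
    """Lowercase, drop punctuation and expand common abbreviations."""
    tokens = re.sub(r"[^\w\s]", " ", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

def similarity(a, b):
    """Similarity in [0, 1] between two addresses after normalisation."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()

def best_match(raw, references, threshold=0.85):
    """Return the closest reference address, or None if none is close enough."""
    score, match = max((similarity(raw, ref), ref) for ref in references)
    return match if score >= threshold else None

# Made-up reference addresses standing in for externally retrieved data.
references = ["12 Example Road, London", "34 Sample Street, Reading"]

# A misspelled, abbreviated record as it might appear in a company database.
print(best_match("12 Exmaple Rd., London", references))
print(best_match("99 Nowhere Blvd", references))
```

Normalising before comparing means that differences in case, punctuation and abbreviation do not count against a match, so the threshold only has to absorb genuine misspellings.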

Dr Valeriia Haberland is a Research Associate at CIDA, the Centre for Intelligent Data Analytics, established at Goldsmiths with a focus on global e-invoicing, analytics and invoice financing. Valeriia's research currently focuses on automated data improvement for businesses, i.e. data enrichment, cleaning and integration from multiple sources. CIDA was launched in March 2015 as a multidisciplinary department applying a broad variety of methods to the problems of data analytics. The centre is fully commercially funded and maintains academic autonomy to pursue theoretical development of this field, introducing novel models for knowledge discovery and semantic reasoning. The centre has focused on the analysis of procurement and supply cycle data, which affords reasonable scope to make significant progress in these prominent areas of current research.


Other talks to be announced soon ...