PhD students engage in online training on Data Intensive Science

Welcome talk by Prof. Carsten Welsch and Prof. Philip James.

Earlier this year the Liverpool Centre for Doctoral Training on Data Intensive Science (LIV.DAT) was awarded £60k in funding to organise the 2020 STFC Data Intensive Summer School. For one week, the University of Liverpool’s campus was scheduled to be the centre of the STFC School on Data Intensive Science. However, due to travel restrictions, it was decided to hold the school as an online event instead. Whilst everyone at LIV.DAT had looked forward to welcoming students from across the country to Liverpool, the CDT was pleased to take this opportunity to offer an engaging programme and to ensure that all participants could get involved in every aspect of the online training.

LIV.DAT is a hub for training students in managing, analysing and interpreting large, complex datasets and high rates of data flow, and features a unique training approach addressing some of the biggest challenges in data intensive science. The Centre, which is hosted by the University of Liverpool and Liverpool John Moores University (LJMU) / Astrophysics Research Institute (ARI), offers comprehensive training in data intensive science through cutting-edge research projects and a targeted academic training programme, complemented by secondments to national and international partners.

The aim of this STFC-funded school, which took place from 12 to 16 October, was to provide PhD students who are active in data intensive science with additional skills to support their research, to help them make industry placements a success, and to offer advice on possible career pathways in industry.

Amongst the participants in this school were several PhD students based at the Cockcroft Institute, Daresbury, where they carry out research in accelerator science and technology using big data techniques. Throughout the week, the students were provided with a foundation in state-of-the-art computational tools and key techniques used in machine learning and large data management, all linked to LIV.DAT's cutting-edge research activities.

The school kicked off on Monday with an introductory talk by CI member and LIV.DAT Director Prof. Carsten Welsch (University of Liverpool) and Head of ARI Prof. Philip James (LJMU). This was followed by a talk on Pitfalls and Limitations in Machine Learning by Dr Kurt Rinnert from the University of Liverpool. The day continued with three parallel hands-on sessions: an introduction to Machine Learning (ML) by Meirin Oan Evans (University of Sussex); a tutorial on SKLearn and Keras by Prof. Adrian Bevan (Queen Mary University of London); and a workshop on HEP NN training by Prof. Gregor Kasieczka and Dr Lisa Benato (University of Hamburg). The final part of the day was devoted to interactive poster sessions, which made full use of Zoom's breakout room functionality. Each poster presenter hosted discussions in their own virtual room, allowing students to rotate between rooms in small groups.

The second day started with a talk on human-aware Artificial Intelligence by Prof. Paulo Lisboa from LJMU. The morning and afternoon continued with parallel Data Analysis sessions, where the students could get hands-on experience in three workshops: Python ecosystems for HEP by Dr Eduardo Rodrigues (University of Liverpool); preparation of large datasets for ML by Dr Isabell Melzer-Pellmann (DESY); and Demystifying “Big Data” by Prof. Andrew Newsam (LJMU). The afternoon finished with the final part of the interactive poster sessions. LIV.DAT students contributed to the poster sessions on both days, and in total over 20 posters on the research of the school participants were presented.

In the evening, Prof. Andrew Morris from Health Data Research UK gave a public lecture on Options and Opportunities for Health Data Science, offering participants and the wider public an insight into how computer science and “big data” are transforming health care delivery models.

On Wednesday, Dr Louise Butcher from the STFC Hartree Centre started the day with a talk on established data tools and techniques and how they have been used in practice to solve business problems. Students were then given the opportunity to get involved in workshops on programming: Git and its underlying mechanisms by Dr Mark Dawson (Swansea University); and large data sets derived from cosmological simulations and astrophysical observations by Prof. Ian McCarthy and Dr Andreea Font (LJMU). In the evening, the students were given an introduction to live astronomy by Dr Chris Copperwheat from LJMU during an interactive session with the fully robotic Liverpool Telescope 2 based on the Canary Island of La Palma.

Thursday was a focused industry day, with case studies of careers in data science and bespoke training in project management by Fistral that concentrated on the particular challenges found in the IT sector and in international collaborations. The afternoon concluded with an industry careers workshop, where a panel of external speakers provided careers advice and students had the opportunity to engage with all of the speakers. As this school was held as a fully virtual event, in the evening all students were encouraged to participate in an online Escape Room, a fun social event for connecting with fellow students in an informal setting.

The final day of the week saw a Kaggle Competition run by Dr Stephen Farry (University of Liverpool), in which students worked together in small groups to solve a programming challenge. The morning finished with an overview by LIV.DAT student Alexander Hill of his industry secondment to IBM Research. The school was concluded by Prof. Carsten Welsch, who summarised the students’ achievements over the whole week.

The entire school was held via the video conferencing tool Zoom and made full use of its functionality, such as spotlighting speakers to create a virtual panel for participants and using breakout rooms to stimulate interaction and discussion between students, session leaders and organisers. Throughout the week, a purpose-built Slack channel facilitated interaction between everyone involved in the school and served as a platform for online social interaction.

The exercises are available at https://indico.ph.liv.ac.uk/event/103/

This event is supported by the STFC under agreement No 4070265360.