Scientific Report on the workshop: “MARVEL/MaX/Psi-k Tutorial on high-throughput computations: general methods and applications using AiiDA”

EPFL, Lausanne, Switzerland, 22-24 June 2016

Group picture from the AiiDA tutorial, EPFL June 2016

High-throughput computing (HTC) is emerging as an effective methodology in computational materials science for the discovery of novel materials. Its adoption is spreading so rapidly that HTC is becoming an essential tool for computational materials scientists.

The aim of the tutorial was to introduce young researchers to HTC, with hands-on tutorials based on the open-source high-throughput platform AiiDA (http://www.aiida.net), complemented by three invited highlight talks to underscore the diverse application fields of HTC.

We report here a summary of the event.

Report

The tutorial was targeted at about 40 students, postdocs and researchers interested in applying high-throughput computations in their research, and in particular at those interested in learning how to use the AiiDA platform.

It started on the morning of Wednesday 22 June. After the registration formalities and a short introduction by the workshop organizers, Giovanni Pizzi (EPFL) gave an introductory talk on AiiDA and the design concepts behind it. These concepts are essential to understand the code and, more generally, to manage high-throughput computations efficiently: the ADES model, the concepts of provenance and reproducibility, and how these can be achieved using AiiDA.

Hands-on sessions

After the coffee break, we started the first of a series of hands-on sessions with AiiDA. The room was large enough to host all participants, each of them working on a different computer (either their own laptop or a machine provided by EPFL). Each session was introduced by the instructors, who briefly explained its aim. Detailed instructions on the tasks were then provided in a printed booklet (and as a PDF file), so that participants could learn at their own pace. Eight instructors (see list below) were available throughout all sessions to answer participants’ specific questions.

Sessions were interleaved with invited talks (see below) and with coffee breaks.

Participants were extremely interested in learning the code. In fact, instructors were quite surprised to see that most participants were so eager to learn that they would not leave the room during coffee breaks, and had to be “pushed out” of the room!

Invited highlight talks

The workshop, however, also aimed more generally at teaching techniques for high-throughput computations, independent of the code used. For this reason, we invited three experts in the domain of high-throughput computations.

The first talk, on Wednesday afternoon, was given by Kristian S. Thygesen (DTU, Denmark), with the title “Computational screening for new solar energy materials”. The second talk, by Gábor Csányi (University of Cambridge, UK), took place on Thursday morning and was titled “Fitting interatomic potentials to moderate and large amounts of DFT data”. Finally, Geoffroy Hautier (Université Catholique de Louvain, Belgium) gave a talk on Friday morning on “Accelerating materials discovery through high-throughput computing and data mining”.

All speakers discussed extremely interesting results from their research, showing in particular how they performed computational searches of materials inside known classes, how filtering of results could be performed to reduce the number of candidates and perform very expensive calculations only on a subset of them, and explaining methods that can be applied to extract information from simulations using machine-learning techniques.

Geoffroy Hautier giving a highlight talk at the AiiDA tutorial

Fostering interactions between participants

In order to encourage discussions and exchange between participants, a poster session was organised on the evening of the first day (22 June), together with a standing dinner. Seventeen posters were presented, and the participants seemed very interested in discussing each other’s work in detail.

Poster session with standing dinner on the evening of the first day

Moreover, each morning started with a half-hour discussion session, where participants were encouraged to ask questions, either technical ones on the AiiDA code or more general ones on their high-throughput research. It was interesting to see participants discussing among themselves different ways to solve common problems that arise when running large numbers of simulations.
A social dinner took place on Thursday evening at the restaurant “Le Debarcadère” in Saint Sulpice. We were lucky enough to have very pleasant weather, which allowed us to have dinner on the lakeside terrace, facing the Alps, with a beautiful natural landscape.

Finally, it’s worth mentioning that, thanks to the generous support of the funding entities (Psi-k, MARVEL and MaX), it was possible to provide financial support for accommodation in a hotel on campus to 23 participants (in addition to covering the organization expenses, all coffee breaks, the standing dinner during the poster session, and the social dinner on Thursday evening).
This made it possible for some participants to take part in the tutorial.

Tutorial details

The tutorial that the instructors prepared in the weeks preceding the event was organised to provide a smooth learning curve, and to make participants interested in learning more, rather than bored by technical details.

For this reason, participants started using the code directly from the very first session, without any initial session on how to install it. To achieve this, the tutorial ran on Amazon AWS machines, which provided a consistent and homogeneous environment to all participants while giving each of them a separate machine on which to test, learn and practice.

After the end of the tutorial, the virtual machines were also distributed as a downloadable VirtualBox appliance (on http://www.aiida.net/tutorials). In this way, it becomes extremely easy for participants to run the tutorial again in the future (e.g., the optional parts). Most importantly, people who could not attend the tutorial can also profit from the learning material, and start learning the code with almost zero time required for initial setup (one only needs to install VirtualBox, download the appliance and start it).
Both the instructor sessions and the invited talks have been recorded and will soon be available online.

Tutorial content

The first sessions focused on the basic commands to interact with the code; at the same time, participants got acquainted with the concept of directed graphs (the way AiiDA internally stores calculations, data and their relationships).
Later, they started to learn how to submit calculations (using Quantum ESPRESSO) with AiiDA. Since errors always occur in real life, we decided not to present a “perfect” tutorial that always works. Instead, the instructions explicitly asked the user to submit a ‘wrong’ calculation that (for various reasons) would crash, and then taught them how to understand where things went wrong and how to fix potential problems.
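The directed-graph picture mentioned above can be illustrated with a minimal sketch in plain Python. This is not AiiDA code: the class `ProvenanceGraph`, its methods and the node names are hypothetical and serve only to show how data and calculations can be linked so that the full history of a result can be traced back.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Toy directed graph: nodes are data or calculations, links go from input to output."""

    def __init__(self):
        self.kinds = {}                    # node name -> "data" or "calculation"
        self.parents = defaultdict(list)   # node name -> nodes it was produced from
        self.children = defaultdict(list)  # node name -> nodes it feeds into

    def add_node(self, name, kind):
        self.kinds[name] = kind

    def add_link(self, source, target):
        self.children[source].append(target)
        self.parents[target].append(source)

    def ancestors(self, name):
        """Return every node that `name` depends on, directly or indirectly."""
        seen = set()
        stack = list(self.parents[name])
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.parents[node])
        return seen


# Hypothetical example: two input data nodes feed one calculation,
# which in turn produces one output data node.
graph = ProvenanceGraph()
graph.add_node("structure", "data")
graph.add_node("parameters", "data")
graph.add_node("scf_run", "calculation")
graph.add_node("total_energy", "data")
graph.add_link("structure", "scf_run")
graph.add_link("parameters", "scf_run")
graph.add_link("scf_run", "total_energy")

provenance = graph.ancestors("total_energy")
print(sorted(provenance))
```

Walking the links backwards from a result, as `ancestors` does here, is what makes it possible to answer the question of which inputs and calculations a given number came from.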

The second day focused on more advanced topics. The first was how to efficiently query calculations in the database. A test database comprising about 300 calculations on a family of perovskites was provided, and participants could perform various queries to explore the data, with the final aim of producing a plot showing which perovskites were metallic and which were magnetic.
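As a rough illustration of the kind of classification such queries lead to, the sketch below sorts a handful of records into metallic/non-metallic and magnetic/non-magnetic classes. It is plain Python, not the AiiDA query interface, and everything in it is an assumption made for the example: the record fields, the sample labels, the numerical values and the two thresholds are hypothetical and are not the tutorial data.

```python
from collections import Counter

# Hypothetical records: labels, fields and values are illustrative only.
calculations = [
    {"label": "sample-A", "band_gap_ev": 0.0, "magnetization": 0.0},
    {"label": "sample-B", "band_gap_ev": 2.1, "magnetization": 0.0},
    {"label": "sample-C", "band_gap_ev": 0.0, "magnetization": 1.5},
    {"label": "sample-D", "band_gap_ev": 1.3, "magnetization": 3.0},
]

GAP_TOLERANCE_EV = 1e-3   # assumed: gaps below this count as metallic
MAGNETIZATION_MIN = 0.1   # assumed: magnetization above this counts as magnetic


def classify(record):
    """Return a (metallic, magnetic) pair of booleans for one record."""
    metallic = record["band_gap_ev"] < GAP_TOLERANCE_EV
    magnetic = abs(record["magnetization"]) > MAGNETIZATION_MIN
    return metallic, magnetic


def count_classes(records):
    """Count how many records fall into each (metallic, magnetic) class."""
    return Counter(classify(record) for record in records)


counts = count_classes(calculations)
for (metallic, magnetic), number in sorted(counts.items()):
    print(f"metallic={metallic!s:5} magnetic={magnetic!s:5} count={number}")
```

The resulting counts (or the per-record classes) are the kind of data one would then pass to a plotting library to obtain an overview plot.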

The second very important topic was “workflows”. We started by introducing the concept of provenance, why it is important to keep track of what has happened, and how to run simulations without “breaking” it. Examples were shown starting from very simple use cases (like an equation of state). Participants learned how, with a single line, one can ask AiiDA to store the representation of a Python function in the database for later querying (using ‘workfunctions’) and how to write full-fledged workflows to automatically obtain a result of interest that originates from a long sequence of calculations.
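The idea of recording a function call with a single extra line can be sketched with an ordinary Python decorator. This is a toy stand-in and not AiiDA’s ‘workfunction’ mechanism: the decorator `tracked`, the in-memory list `CALL_LOG` (used here in place of a database) and the function `toy_energy` (a made-up parabola standing in for a real calculation) are all hypothetical.

```python
import functools

CALL_LOG = []  # hypothetical in-memory stand-in for a provenance database


def tracked(func):
    """Record the name, inputs and output of every call to `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        CALL_LOG.append(
            {"function": func.__name__, "args": args, "kwargs": kwargs, "result": result}
        )
        return result
    return wrapper


@tracked
def rescale_volume(volume, factor):
    """Scale a reference volume by a given factor."""
    return volume * factor


@tracked
def toy_energy(volume):
    """Made-up parabola with its minimum at volume 40.0; not a real calculation."""
    return 0.01 * (volume - 40.0) ** 2 - 5.0


def equation_of_state(reference_volume, factors):
    """Compute (volume, energy) points for a list of scaling factors."""
    points = []
    for factor in factors:
        volume = rescale_volume(reference_volume, factor)
        points.append((volume, toy_energy(volume)))
    return points


points = equation_of_state(40.0, [0.9, 1.0, 1.1])
best_volume, best_energy = min(points, key=lambda point: point[1])
print(best_volume, best_energy)
print(len(CALL_LOG), "calls recorded")
```

Each step of the small equation-of-state loop leaves a record behind, so the sequence of calls that led to the final minimum can be inspected afterwards.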

The session on workflows extended into the morning of the third and final day, which ended with an explanation of how to install AiiDA and extend it with plugins.

We also had a presentation by Nicolas Mounet, who showed a real case study from his research: using AiiDA, he could screen over 200,000 materials from 3D databases (ICSD and COD) to select only those that are layered (~6000). Those were further refined with extensive DFT calculations of the binding energy, producing a very interesting database of those (~1800) that are indeed weakly (van der Waals) bonded and are potentially realisable in the lab using exfoliation techniques.
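The general shape of such a screening funnel, with a cheap filter applied first and an expensive evaluation reserved for the survivors, can be sketched as follows. This is only an illustration under stated assumptions and not the code used in the study: the candidate records, the function `toy_binding_energy` (which just looks up a stored number instead of running a DFT calculation) and the threshold `MAX_BINDING_ENERGY` are hypothetical.

```python
EXPENSIVE_CALLS = []       # records which candidates reached the costly step
MAX_BINDING_ENERGY = 100.0  # hypothetical threshold, arbitrary units

# Hypothetical candidates; identifiers and numbers are illustrative only.
candidates = [
    {"id": "m1", "layered": True, "binding_energy": 20.0},
    {"id": "m2", "layered": False, "binding_energy": 300.0},
    {"id": "m3", "layered": True, "binding_energy": 150.0},
    {"id": "m4", "layered": True, "binding_energy": 35.0},
]


def is_layered(candidate):
    """Cheap geometric filter (here simply a stored flag)."""
    return candidate["layered"]


def toy_binding_energy(candidate):
    """Stand-in for an expensive calculation; logs that it was called."""
    EXPENSIVE_CALLS.append(candidate["id"])
    return candidate["binding_energy"]


def is_weakly_bonded(candidate):
    """Expensive filter: keep candidates below the assumed energy threshold."""
    return toy_binding_energy(candidate) < MAX_BINDING_ENERGY


def screen(items, stages):
    """Apply (name, predicate) stages in order and count survivors per stage."""
    counts = [("initial", len(items))]
    survivors = list(items)
    for name, keep in stages:
        survivors = [item for item in survivors if keep(item)]
        counts.append((name, len(survivors)))
    return survivors, counts


survivors, counts = screen(
    candidates,
    [("layered", is_layered), ("weakly bonded", is_weakly_bonded)],
)
print(counts)
print([item["id"] for item in survivors])
```

Because the stages run in order, the costly function is never called for candidates already discarded by the cheap filter, which is the point of ordering the funnel this way.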

Results of the feedback form

The feedback received from the participants has been extremely positive. They strongly encouraged us to organize a new event in May 2017, for which registration is (to date) open: http://nccr-marvel.ch/en/events/aiida-tutorial-may-2017. We report below the main results of the feedback form.

The next two plots compare the participants’ self-assessed level of knowledge of AiiDA before and after the tutorial. Remarkably, participants with a “poor” level of knowledge, who represented the relative majority before the tutorial, were no longer present after the tutorial. The vast majority of the self-evaluations after the tutorial were “satisfactory” or better.

Change in the level of skills before and after the tutorial

The following series of plots reports on the evaluations that the instructors received from the participants. The distribution of the responses suggests that the instructors were prepared, available, and able to motivate the participants effectively.

Results of the feedback form on the instructors

Finally, 81.5% of the participants declared that they would strongly recommend that their colleagues participate in a similar tutorial on high-throughput computations using AiiDA.

Additional information

Workshop organizers

Giovanni Pizzi – EPFL, Switzerland
Andrea Ferretti – CNR, Istituto Nanoscienze, Italy
Boris Kozinsky – Robert Bosch RTC, Cambridge MA, USA

Tutorial instructors

Andrea Cepellotti – EPFL, Switzerland
Fernando Gargiulo – EPFL, Switzerland
Giovanni Pizzi – EPFL, Switzerland
Leonid Kahle – EPFL, Switzerland
Martin Uhrin – EPFL, Switzerland
Nicolas Mounet – EPFL, Switzerland
Snehal Waychal – EPFL, Switzerland
Spyros Zoupanos – EPFL, Switzerland

Invited highlight talks

Gábor Csányi (University of Cambridge, UK): “Fitting interatomic potentials to moderate and large amounts of DFT data”
Geoffroy Hautier (Université Catholique de Louvain, Belgium): “Accelerating materials discovery through high-throughput computing and data mining”
Kristian S. Thygesen (DTU, Denmark): “Computational screening for new solar energy materials”

Program

The program is available online at: http://nccr-marvel.ch/en/events/aiida-tutorial-june-2016

Pictures of the event

Additional pictures of the event can be found at the following links: day 1, day 2 and day 3.

List of participants

Seventeen of the participants also presented a poster on the evening of the first day.

Michael Atambo, University of Modena and Reggio Emilia, Italy
Gabriel Autes, EPFL
Luigi Bagolini, CNR-IOM, Italy
Juan Beltrán, IMDEA Materials, Spain
Lilia Boeri, TU Graz, Austria
Marco Borelli, SISSA, Italy
Ariadni Boziki, EPFL
Jens Bröder, Forschungszentrum Juelich GmbH, Germany
Claudia Cardoso, CNR–Istituto Nanoscienze, Italy
Andrea Cepellotti, EPFL
Francesca Costanzo, ICN2, Spain
Gábor Csányi, University of Cambridge, UK
Pietro Delugas, SISSA, Italy
Bonny Dongre, Ruhr-Universität Bochum, Germany
Mahdi Faghihnasiri, Shahrood University of Technology, Iran
Matteo Fasano, Politecnico di Torino, Italy
Andrea Ferretti, CNR–Istituto Nanoscienze, Italy
Fernando Gargiulo, EPFL
Jacek Golebiowski, Imperial College London, UK
Ionel Bogdan Guster, ICN2, Spain
Geoffroy Hautier, Université Catholique de Louvain, Belgium
Vinay Hedge, Northwestern University, USA
Sergio Illera, ICN2, Spain
Farzaneh Jahanbakhshi, EPFL
Conrad Johnston, Queen’s University Belfast, UK
Leonid Kahle, EPFL
Vamshi Mohan Katukuri, EPFL
Ryo Kobayashi Shiga, EPFL and Nagoya Institute of Technology, Japan
Boris Kozinsky, Bosch RTC, Cambridge, USA
Chan-Woo Lee, Korea Institute of Energy Research, South Korea
Ariel Lozano, Basque Center for Applied Mathematics, Spain
Changru Ma, EPFL
Ivan Marri, CNR–Istituto Nanoscienze, Italy
Henrique Miranda, Université du Luxembourg, Luxembourg
Hossein Mirhosseini, MPI Dresden, Germany
Stephan Mohr, Barcelona Supercomputing Center, Spain
Matthieu Mottet, IBM Research Zurich and EPFL
Nicolas Mounet, EPFL
Silviya Ninova, University of Bern
Diego Pasquier, EPFL
Xu Pengxiang, ETH Zürich
Riccardo Petraglia, EPFL
Simone Piccinin, CNR-IOM, Italy
Giovanni Pizzi, EPFL
Miguel Pruneda, ICN2, Spain
Clelia Righi, CNR–Istituto Nanoscienze, Italy
Ilia Sivkov, ETH Zürich
Olga Syzgantseva, Aalto University, Finland
Kristian S. Thygesen, DTU, Denmark
Iurii Timrov, EPFL
Martin Uhrin, EPFL
Joel Voiselle, EPFL
Snehal Waychal, EPFL
Brandon Wood, Lawrence Livermore National Laboratory, USA
Aliaksandr Yakutovich, Empa
Felipe Zapata Ruiz, VU Amsterdam, The Netherlands
Spyros Zoupanos, EPFL
