Writing reproducible workflows for computational materials science

EPFL (Lausanne, Switzerland),  May 21-24, 2019

Today, many open questions in computational science call for more than individual computations using a single code. As the demand for integration and throughput increases, the skill of writing robust and reproducible workflows is becoming ever more important. In this context, the move towards open science raises the level of scrutiny and demands that workflows be recorded in a way that can be inspected and reused by scientific peers.

This hands-on tutorial introduced young researchers to writing reproducible computational workflows using the open-source AiiDA framework for workflow management and provenance tracking (http://www.aiida.net). It was complemented by invited talks from experts in the field, who highlighted the power and the challenges of leveraging complex workflows in computational materials science.

Tutorial web page:  http://www.aiida.net/tutorial-reproducible-workflows/

Tutorial structure


Structure: The first two days consisted of hands-on tutorials, while participants could start working on their own projects during the remaining 1.5 days.

 

The first half of the tutorial consisted of hands-on sessions, interspersed with short introductory presentations by tutors, that allowed participants to work through the extensive introductory material at their own speed, according to their level of Python/AiiDA proficiency.

About one third of participants were at a basic level of Python proficiency (having written a Python script, but with no experience of object-oriented programming in Python). Two thirds of participants were new to AiiDA.

 

Open discussion sessions encouraged questions and participation, and highlight talks by invited experts from the fields of high-throughput computation, data curation and analytics provided insights into the current challenges of the field and approaches to tackle them.

The second half was organised in a workshop fashion: participants could choose among three groups and start working on their own projects with help from the tutors. 

A poster session with a standing dinner was held to give participants the opportunity to present their own research. The social dinner provided further opportunities for socialization and exchanges.

Most lectures of the tutorial were recorded and are available online. Links to slides of the lectures can be found in the video description.

Tutorial content


Introduction to the tutorial by Nicola Marzari, Giovanni Pizzi and Leopold Talirz.

The tutorial began on Tuesday, May 21st with an introduction by Nicola Marzari on the vision of an open-science platform for high-performance computing for materials research, followed by a talk by Giovanni Pizzi, who discussed how AiiDA implements the major pillars of this vision. The rest of the morning was devoted to the first hands-on session, which covered the basics: getting set up and using the verdi command-line interface and the verdi shell. The session was aimed at newcomers and previous AiiDA users alike, since the verdi interface has advanced significantly in v1.0 in terms of both features and robustness.


Based on feedback on the ab initio codes used by participants, the hands-on provided examples for Quantum ESPRESSO, while Espen Flage-Larsen provided an introduction to workflows in aiida-vasp.

 

After lunch, the hands-on sessions continued, focusing on managing calculations and the use of the QueryBuilder interface and the new interactive graph explorer to extract data from the AiiDA database.

On the morning of 22 May, Marco Govoni from Argonne National Laboratory (ANL) in the USA gave a highlight talk on “QRESP: A tool for the curation, discovery, and exploration of reproducible scientific papers”. QRESP is developed at ANL and lets authors complement scientific papers with electronic notebooks that describe how the datasets were manipulated, together with metadata on the provenance of all codes and experiments used.


Marco Govoni addressing questions after his presentation

 

Stefaan Cottenier from Ghent University in Belgium could not attend in person, but more than made up for it by giving a highly insightful pre-recorded lecture on how the practice of running DFT calculations has evolved, and why workflow managers have become relevant tools today when they were not 20 years ago.

Stefaan posed questions to the participants, who could in turn ask him questions electronically during the lecture. Stefaan remarked after the lecture: “[…] the responses by the audience on the two questions I asked where numerous and thoughtful. This is one thing I learn from this remote talk experiment: the interaction with the audience can be deeper and more explicit than in a regular talk.” In view of the financial and environmental costs associated with academic travel, perhaps this is a lesson to be taken to heart.

 


Stefaan Cottenier on the evolution of user friendliness of DFT codes.

 

The second part of the morning was devoted to a brief talk on AiiDA workflows by Sebastiaan Huber, followed by a hands-on session that continued in the afternoon. Participants could work through examples of workflows of increasing sophistication and usefulness, including one on the computation of electronic band structures.

The last talk, “Real-life workchains in aiida-quantumespresso and aiida-vasp”, was given by Espen Flage-Larsen (developer of the AiiDA plugin for VASP) and Sebastiaan Huber, and focused on error handling in real-life use cases.

The tutorial continued in the early evening with a poster session, during which a standing dinner was served. All poster abstracts can be found online.

 

On Thursday, May 23rd, the day kicked off with a question-and-answer session, where participants could ask general questions on workflow engines and workflow design unrelated to the specifics of the tutorial.
Fawzi Mohamed of the Fritz Haber Institute (Berlin, Germany) delivered a presentation on how the NOMAD CoE approaches workflows, pointing out the challenges associated with discovering workflows in existing data sets and good practices that software developers can adopt to make this easier.

The rest of the day focussed on introducing the two remaining pillars of the AiiDA ecosystem. The AiiDA lab (presented by Aliaksandr Yakutovich) leverages AiiDA and Jupyter(Hub) to share AiiDA-powered workflows with the wider community of researchers through a graphical user interface.
Finally, the AiiDA plugin system (presented by Leopold Talirz) integrates AiiDA with numerous simulation codes, job schedulers and more, which are made available via the AiiDA plugin registry.

 


Statistics collected ahead of the tutorial (above) were largely confirmed during the open-mic session, with most people joining either the groups on workflow (50%) or plugin (40%) development.

 

With a complete overview of the AiiDA ecosystem at hand, it was time for the participants to decide how they would like to spend the remaining two days: 

  1. Workflow development (with Sebastiaan Huber and Espen Flage-Larsen)
  2. Plugin development (with Leopold Talirz and Alberto Garcia)
  3. Writing apps for the AiiDA lab (with Aliaksandr Yakutovich)

Most participants decided to work on developing a new plugin for their code or developing a new workflow for their use case.

After lunch, participants split up into groups. Introductory lectures (writing workflows, getting started with writing AiiDA plugins) provided practical hints on how to get started. For interested participants, help was available to set up AiiDA on their own laptops or workstations.

After an intensive day of coding, the group walked to the shores of Lake Geneva for a relaxing aperitivo and social dinner at “Le Débarcadère”.


Tutors were available for help throughout the tutorial, and hands-on sessions were interspersed with short presentations on the inner workings of AiiDA.

The last day, May 24th, was kicked off by Guido Petretto of UC Louvain, Belgium, with his presentation “Automatize a DFT code: high-throughput workflows for Abinit”, providing insights into the technical details of the workflow design and challenges encountered.


When sharing details of calculations in a publication, only ~10-20% of participants currently go beyond the methods section or supporting information.

 

Giovanni Pizzi followed with an introduction to the Materials Cloud, illustrating how MARVEL approaches the many aspects of Open Science in the domain of computational materials science.

Leopold Talirz ended with an outlook on future developments and closing remarks on ways to stay in touch, and how to participate in the development of the AiiDA ecosystem going forward.


Leopold Talirz presenting the AiiDA and Materials Cloud teams.

 

Tutorial technology

The tutorial materials were prepared in the weeks leading up to the event by leveraging the Quantum Mobile virtual machine, which comes with installations of the AiiDA platform plus a selection of open-source materials simulation codes.
Each participant was given access to their own instance of the machine running on a cloud service. This provided a consistent and homogeneous environment to all participants, who were able to get started immediately and focus on learning how to use AiiDA, without the hassle of installing anything.

The tutorial virtual machine has been published online as a downloadable VirtualBox appliance, allowing people who could not attend the tutorial to profit as well: just install VirtualBox, download the appliance and start it.

The software side was complemented by detailed hands-on exercises (some in the form of Jupyter notebooks), all of which were made available online.

Statistics & feedback

The tutorial gathered over 50 participants from diverse backgrounds, both in terms of home country and current position, including PhD students, postdocs, MSc students, and researchers from institutes and companies.


Participants from all over Europe and beyond made their way to Lausanne.

Feedback during and after the tutorial was positive, with most participants reporting a marked improvement in their self-assessed knowledge of the tutorial topics.

The instructors’ preparation, availability, and motivational aptitudes were highly regarded.

Most participants said that they would recommend the tutorial to a colleague, and a good fraction expressed their interest in future tutorials on various related topics.

When asked what kind of future tutorials they would like to participate in, the most popular answers were tutorials on “advanced workflows” (70% of replies) & “plugin development” (50%), i.e. “more of the same”, confirming the need for further education in this direction. 

Other popular topics included advanced AiiDA features (40%) and deploying AiiDA in a research group (40%).

Additional Information

Pictures of the event

Additional pictures of the event are available on the AiiDA Facebook page.

Detailed Programme


Detailed programme of the event.

Support

The AiiDA tutorial on writing reproducible workflows for computational materials science was supported by Psi-k, MARVEL, MaX, swissuniversities and INTERSECT, and kindly hosted by EPFL. We sincerely thank all our sponsors for making this event possible.

   

Organizers

  • Leopold Talirz (EPFL, CH)
  • Sebastiaan Huber (EPFL, CH)
  • Andrea Ferretti (CNR, Istituto Nanoscienze, IT)
  • Espen Flage-Larsen (SINTEF, NO)
  • Alberto Garcia (ICMAB-CSIC, ES)

Instructors

  • Espen Flage-Larsen (SINTEF, NO)
  • Andrea Ferretti (CNR, Istituto Nanoscienze, IT)
  • Alberto Garcia (ICMAB-CSIC, ES)
  • Sebastiaan Huber (EPFL, CH)
  • Leopold Talirz (EPFL, CH)
  • Giovanni Pizzi (EPFL, CH)
  • Aliaksandr Yakutovich (EPFL, CH)
  • Casper Andersen (EPFL, CH)
  • Oscar Arbelaez (EPFL, CH)
  • Daniele Tomerini (EPFL, CH)

Invited Highlight Speakers

  • Stefaan Cottenier (UGent, BE)
  • Marco Govoni (ANL, US)
  • Fawzi Mohamed (FHI, DE)
  • Guido Petretto (UC Louvain, BE)

Complete list of Participants

First Name(s) Last Name(s) Affiliation
Arsalan Akhtar ICN2, Barcelona, Spain
Maximilian Amsler Bern University, Switzerland
Casper Andersen EPFL, Switzerland
Oscar Arbelaez EPFL, Switzerland
Jonathan Backman ETH Switzerland
Jana Boehm Fraunhofer IWM, Germany
Arrigo Calzolari CNR-NANO Istituto Nanoscienze, Italy
Kristians Cernevics EPFL, Switzerland
Anoop Chandran Forschungszentrum Jülich, Germany
Tommaso Chiarotti EPFL, Switzerland
Jonathan Chico Sandvik Coromant AB, Sweden
Crispin Cooper Johnson Matthey Technology Centre, UK
Riccardo De Gennaro EPFL, Switzerland
Augustin Degomme CEA, Grenoble, France
Nuno Miguel dos Santos Fortunato Technical University of Darmstadt, Germany
Florian Ellinger University of Vienna, Austria
Loris Ercole EPFL, Switzerland
Andrea Ferretti CNR, Istituto Nanoscienze, Italy
Sara Fiore ETH Switzerland
Espen Flage-Larsen Sintef, Norway
Marco Foscato University of Bergen, Norway
Alberto Garcia ICMAB-CSIC, Spain
Daniel Gosálbez Martínez EPFL, Switzerland
Marco Govoni ANL, Illinois, US
Davide Grassano Università degli Studi di Roma “Tor Vergata”
Corentin Grillet ICN2, Barcelona, Spain
Francois Gygi University of California Davis
Irina Heinz Justus Liebig University Giessen, Germany
Sebastiaan Huber EPFL, Switzerland
Manaswita Kar University of Southampton
Hyun-Jung Kim Korea Institute for Advanced Study
Cedric Klinkert ETH Switzerland
Lukas Koschmieder MICRESS Group, Aachen, Germany
Roman Kováčik Forschungszentrum Jülich, Germany
Martina Lattemann Sandvik Coromant R&D
Francesco Libbi EPFL, Switzerland
Mingxuan Lin Steel Institute of RWTH Aachen University, Germany
Ruchika Mahajan Indian Institute of Technology Mandi, India
Nils Marchal CTIPC
Daniel Marchand EPFL, Switzerland
Nicola Marzari EPFL, Switzerland
Linda-Sheila Medondjio ICN2, Barcelona, Physics, Spain
Guido Menichetti Istituto Italiano di Tecnologia, Genova, Italy
Fawzi Mohamed Fritz Haber Institute, Berlin, Germany
Yoav Nahshon Fraunhofer IWM, Germany
Takayuki Nishiyama Kyoto University, Japan
Daniele Ongari EPFL Sion
Manuel Perez Jigato Luxembourg Institute of Science and Technology
Guido Petretto Université Catholique de Louvain, Belgium
Simon Pintarelli ETH-CSCS
Giovanni Pizzi EPFL, Switzerland
Avijeet Ray KAUST, Saudi Arabia
Philipp Risius Justus Liebig University Giessen, Germany
Norma Rivano EPFL, Switzerland
Marcel Sadowski Technical University Darmstadt, Germany
Andrea Silva University of Southampton, UK
Harish Kumar Singh Technical University Darmstadt, Germany
Nicola Spallanzani CINECA Italy
Jens Renè Suckert Friedrich-Schiller-Universität Jena, Germany
Leopold Talirz EPFL, Switzerland
Atsushi Togo Materials science department, Kyoto university
Daniele Tomerini EPFL, Switzerland
Vasily Tseplyaev Forschungszentrum Jülich, Germany
Lorenzo Varrassi University of Bologna, Italy
Valerio Vitale University of Cambridge
Aliaksandr Yakutovich EPFL, Switzerland
Pezhman Zarabadi-Poor CEITEC − Masaryk University, Brno, Czechia

Alberto Garcia, Andrea Ferretti, and Leopold Talirz
on behalf of the organisers

 
