Title: Improving the accuracy of ab-initio predictions for materials
Location: CECAM-FR-MOSER
Webpage with list of participants, schedule and slides of presentations: http://www.cecam.org/workshop-0-1643.html
Dates: September 17, 2018 to September 20, 2018
Organizers: Dario Alfè, Michele Casula, David Ceperley, Carlo Pierleoni
State of the art
Improving the accuracy of ab-initio methods for materials means devising a global strategy that integrates several approaches to provide a robust, controlled and reasonably fast methodology for predicting the properties of materials from first principles. Kohn-Sham DFT is the present workhorse in the field, but its phenomenological character, induced by the approximations in the exchange-correlation functional, limits its transferability and reliability.
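For reference, the place where the approximation enters can be seen in the standard Kohn-Sham decomposition of the total energy (a textbook form, reproduced here for clarity):

  E[n] = T_s[n] + \int v_{\mathrm{ext}}(\mathbf{r})\, n(\mathbf{r})\, \mathrm{d}\mathbf{r} + E_{\mathrm{H}}[n] + E_{\mathrm{xc}}[n]

All many-body effects beyond the non-interacting kinetic energy T_s[n] and the Hartree term E_H[n] are collected in the exchange-correlation functional E_xc[n], which is the only term that must be approximated in practice (LDA, GGAs, hybrids, etc.).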
A change of paradigm is required to bring ab-initio methods to a predictive level. The accuracy of XC functionals in DFT should be assessed against more fundamental theories and not, as is often done today, against experiments, because the comparison with experiments is often indirect and can be misleading. The emerging, more fundamental method for materials is Quantum Monte Carlo (QMC), because of: 1) its favourable scaling with system size with respect to other Quantum Chemistry methods; 2) its variational character, which defines an accuracy scale and allows the results to be progressively improved. However, since QMC is much more demanding than DFT in terms of computer resources, and more intricate to use, a combined approach is still desirable, in which QMC is used to benchmark DFT approximations for specific systems before the production study is performed with DFT.
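The variational character mentioned in point 2) refers to the standard bound satisfied by any trial wave function \Psi_T (again a textbook statement, included for clarity):

  E_{\mathrm{VMC}} = \frac{\langle \Psi_T | \hat{H} | \Psi_T \rangle}{\langle \Psi_T | \Psi_T \rangle} \;\ge\; E_0

so that a lower variational energy is an unambiguous sign of a better wave function, which is what makes systematic improvement possible.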
A different aspect of accuracy is related to size effects: relevant phenomena often occur at length and time scales beyond those accessible to first-principles methods. In these cases effective force-field methods can be employed, and Machine Learning methods can be used to extract those force fields from training sets provided by ab-initio calculations, as sketched below. Presently, DFT-based training sets are used; improving their accuracy will improve the ultimate accuracy at all scales.
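As a minimal illustration of the training step, the sketch below fits an effective potential to a set of ab-initio energies with kernel ridge regression, a common baseline for such fits. The descriptors and energies are mock data; in a real application they would come from symmetry functions (or similar descriptors) and DFT or QMC calculations, and the model would typically be a neural network or Gaussian process.

# Minimal sketch: fit an effective potential to ab-initio training data.
# Kernel ridge regression stands in for the neural-network or
# Gaussian-process fits discussed at the workshop; all data here are mock.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for descriptor vectors of 500 atomic configurations
# (e.g. symmetry functions) and their ab-initio total energies.
X = rng.normal(size=(500, 32))
y = X @ rng.normal(size=32) + 0.05 * rng.normal(size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gaussian-kernel regression from descriptors to energies.
model = KernelRidge(kernel="rbf", alpha=1e-3, gamma=0.1)
model.fit(X_train, y_train)

rmse = np.sqrt(np.mean((model.predict(X_test) - y_test) ** 2))
print(f"test RMSE: {rmse:.4f} (in the energy units of the training set)")

Replacing the DFT energies in the training set with QMC ones leaves this workflow unchanged, which is precisely why improving the accuracy of the underlying ab-initio data propagates to all larger scales.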
This change of paradigm requires building a community of people with different areas of expertise working in an integrated fashion. This has been the main aim of the workshop.
Major outcomes
The following is a partial list of the topics discussed at the workshop, and of their importance for developing the field of computational materials science from first principles.
1) The importance of computational benchmarks to assess the accuracy of different methods and to feed machine-learning and neural-network schemes with reliable data;
2) The need for a common database, and the need to develop a common language across different codes and different computational approaches;
3) The promising capability of neural-network methods to generate new correlated wave functions;
4) The cross-fertilizing combination of computational schemes in a multi-scale environment: elemental interactions described at a very high level by expensive approaches are used to generate effective potentials, keeping the accuracy of the high-level methods at a much lower cost;
5) Recent progress in quantum Monte Carlo to further improve the accuracy of the calculations by taking alternative routes: transcorrelated Hamiltonians, multideterminant expansions, and Pfaffian wave functions (sketched schematically below).
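To make items 3) and 5) concrete, the alternative routes can be summarized schematically with standard forms from the QMC literature (reproduced here for orientation, not as results of the workshop itself):

  \Psi_T(R) = e^{J(R)} \sum_k c_k\, D_k^{\uparrow}(R)\, D_k^{\downarrow}(R)   (Jastrow-multideterminant)
  \Psi_T(R) = e^{J(R)}\, \mathrm{pf}\!\left[\phi(\mathbf{r}_i, \mathbf{r}_j)\right]   (Jastrow-Pfaffian, with pairing orbitals \phi)
  \tilde{H} = e^{-J}\, \hat{H}\, e^{J}   (transcorrelated Hamiltonian)

Neural-network approaches typically enter by parametrizing the Jastrow factor J(R), or the orbitals themselves, with flexible network architectures.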
Limitations:
The lack of a common environment in which to develop multi-scale approaches for the prediction of material properties. This workshop was one of the first attempts at discussing such needs and exploring possible solutions.
Open questions:
How to make the codes ready for the next high performance computing (HPC) generation? A fundamental limitation to the future expansion of HPC is the need to reduce the energy cost per unit of computation, which requires new technologies. Some of these technologies are based on accelerators, such as GPUs, which in many cases require a complete rewriting of legacy scientific codes. This is a serious problem for the scientific community, requiring open and frank discussion, including a rethink of how computer code development is recognized as a major scientific endeavour.
How to develop a meaningful materials science database (which gathers both experimental and theoretical results)?
How to develop a common platform to merge different methods in a multi-scale spirit?
Community needs
The computational materials science community has grown tremendously in size over the past few decades. The drive for this expansion has been the development of ever more user-friendly computer codes, mainly based on density functional theory (DFT); indeed, Web of Science now reports tens of thousands of papers per year based on DFT. By contrast, the quantum Monte Carlo (QMC) method, normally much more accurate than DFT, appears in only hundreds of papers per year, because of its much higher cost and because of the intricacies of the method that make it more difficult to use. The number of workshops on QMC, and the number of schools in which QMC is taught, are also only a fraction of those on DFT. We believe that the community is now at a turning point where the extra accuracy offered by QMC is not only desirable, but very much needed if serious progress is to be achieved in the computational design of new materials with bespoke properties. Only a few QMC codes (not even a handful) are currently supported through serious effort, and the community desperately needs more formal recognition for code development in order to attract the best people to this endeavour. We focus in particular on QMC because we believe it is the natural method for exploiting in full the forecast expansion in computer power over the next 10-20 years, but this is also a crucial point in time for this expansion, where new architectures require a complete rethinking of computational approaches. A series of CECAM workshops may help to draw attention to these points.
Funding
Typical funding channels for the activities discussed at the meeting could be the Psi-k community and national funding schemes. In addition, since the ultimate goal of these activities is to design new materials entirely from first principles, it should be possible to target and persuade specific industries involved in the synthesis of new materials, including for example energy materials such as new batteries and new hydrogen-storage materials. Industry funding could be sought by offering industry staff a limited number of places at the workshops and requesting a registration fee.
Will these developments bring societal benefits?
The potential benefits of developing and deploying computational tools able to predict material properties with a high level of reliability are numerous and of tremendous societal impact. During our workshop, a clear-cut example was given by Xenophon Krokidis, who presented the Scienomics software used to design and test new compounds in silico. Such tools allow a company to accelerate the R&D stage of its projects, cut the resources spent on checking the functionalities of a given material, and significantly shorten the "trial and error" time. According to Krokidis's experience, the budget and time savings afforded by reliable materials-science software in R&D amount to an order of magnitude, a huge gain.
Thus, the effort of bringing together the three different communities (ab initio quantum Monte Carlo, density functional theory, and machine learning) is definitely worthwhile, with a view to improving the accuracy of ab initio predictions for materials.