
Journal Club for June 2023: A perspective on the role of machine learning in constitutive modeling and computational mechanics

nbouklas's picture


The purpose of this blog post is to briefly outline a perspective on the emerging opportunities and challenges that arise as the field of Machine Learning (ML) is quickly driving the development of tools that are becoming mainstream in the study of solid mechanics and computational mechanics, with a focus on constitutive modeling. By no means is this a complete review of ML-enabled constitutive modeling.

Background on Scientific Machine Learning (SciML)

Even though one might argue that General Artificial Intelligence is still a distant goal, Large Language Models (LLMs) such as ChatGPT, which have received a lot of attention recently, are quickly proving that they can have a substantial impact in industry as well as in research. One can imagine that virtual self-trained analysts, able to perform the tasks that human analysts and mechanics experts perform today, do not belong to the too distant future. In the same vein, an LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention was proposed less than a week ago in [1]. The interesting point there was the success of the agent in performing unseen tasks.

There has been an increased interest in machine learning (ML) tools in the computational sciences over the last few years. This rise in popularity is due to multiple reasons: the ability of ML models to directly utilize experimental data in simulation environments, the generalization capabilities of ML tools, their potential speed-up in comparison to traditional numerical methods, and their automatic differentiation framework. The main opportunity in SciML is that the underlying data often connects to a known mathematical structure and also complies with physical laws (known or yet to be discovered). For these reasons, ML tools have recently been used as a solution scheme for forward and inverse problems involving partial differential equations (PDEs), or for the development of intrusive and non-intrusive reduced order modeling schemes for accelerated solutions of PDEs [2]. Surrogate models, or reduced order models, have allowed capturing solution operators using both experimental and synthetic data for linear and nonlinear problems [3–6]. Recent developments include DeepONet and Neural Operators, which aim to learn solution operators that are not restricted to a finite dimensional setting [7–9]. Some early work focused on intrusive linear subspace methods using Proper Orthogonal Decomposition [4–6]. In contrast, more recently, non-intrusive methods using Neural Networks (NN) and Autoencoders for linear and nonlinear methods have been developed [3,10,11]. Physics-guided approaches, such as Physics Informed Neural Networks (PINN), have also enabled the combination of training data and physical constraints towards solving forward and inverse problems [12,13].
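As a rough illustration of the linear subspace idea behind Proper Orthogonal Decomposition, the sketch below extracts a reduced basis from a snapshot matrix with a singular value decomposition. The snapshot family, the 0.999 energy threshold, and the function name `pod_basis` are assumptions made for this example only, and are not taken from the cited works.

```python
import numpy as np

def pod_basis(snapshots, energy=0.999):
    """Return the leading left singular vectors that capture the requested
    fraction of the snapshot energy (sum of squared singular values)."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cumulative, energy) + 1)
    return U[:, :r], s

# Hypothetical parametrized solution family: one column per parameter value.
x = np.linspace(0.0, 1.0, 200)
params = np.linspace(0.5, 2.0, 30)
snapshots = np.column_stack(
    [p * np.sin(np.pi * x) + 0.1 * p**2 * np.sin(3.0 * np.pi * x) for p in params]
)

basis, singular_values = pod_basis(snapshots)

# Project the snapshots onto the reduced basis and reconstruct them.
coeffs = basis.T @ snapshots
reconstruction = basis @ coeffs
rel_err = np.linalg.norm(snapshots - reconstruction) / np.linalg.norm(snapshots)
print(basis.shape[1], rel_err)
```

In an intrusive scheme the governing equations would then be projected onto `basis`, whereas a non-intrusive scheme would instead learn the map from parameters to `coeffs` directly from data.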

Phenomenological constitutive modeling

Focusing on approaches to constitutive modeling, it is important to note that phenomenological constitutive modeling is at its core a data-driven approach (though not from an algorithmic perspective), where the goal, similar to the majority of ML tools, is to fit a function to a set of observations. One has to point out that mechanics, and constitutive modeling in particular, has been a "limited-data" or "partial-data" discipline, where we note the following distinction:

• Low data: A small number of observations (labeled data pairs) that could potentially span the space of the input argument (e.g. a strain or strain rate tensor) in a uniform fashion. An example is data obtained from a computational RVE that can be probed in any strain state and strain history.

• Limited data: Observations (labeled data pairs) that only span specific regions of the input space due to constraints in data acquisition (e.g. experimental design). One can think of simple mechanical tests, such as uniaxial tension.

• Partial data: Observations that do not provide labeled data pairs. For example, DIC experiments provide strain fields, but only give access to global loading measures and not to point-wise stress data.

The goal in phenomenological constitutive modeling has always been to fit available data in a robust way, such that the predictions can be trusted in unseen situations. This is exactly one of the known bottlenecks of ML approaches: failure at so-called generalization. Due to limited data availability, experiments that could provide a homogeneous stress state (e.g. uniaxial tension and compression, biaxial tension, simple shear) are traditionally utilized as a data source to test the suggested phenomenological constitutive models. The success of phenomenological modeling came from developing frameworks that comply with thermodynamic principles, eventually allowing the models to robustly interpolate and provide physical results, even when developed from low- and limited-data sources.
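To make the limited-data setting concrete, the following minimal sketch calibrates a one-parameter incompressible neo-Hookean model on uniaxial tension data and then queries it in an unseen equibiaxial state. The choice of model, the noise-free synthetic "experiment", and the value of `mu_true` are assumptions for illustration and do not come from the post.

```python
import numpy as np

def uniaxial_nominal_stress(stretch, mu):
    """Incompressible neo-Hookean nominal stress in uniaxial loading."""
    return mu * (stretch - stretch**-2)

def equibiaxial_nominal_stress(stretch, mu):
    """Incompressible neo-Hookean nominal stress in equibiaxial loading."""
    return mu * (stretch - stretch**-5)

# Synthetic, noise-free uniaxial "experiment" (assumed shear modulus).
mu_true = 1.2
stretch_data = np.linspace(1.05, 2.0, 20)
stress_data = uniaxial_nominal_stress(stretch_data, mu_true)

# The model is linear in mu, so least squares has a closed-form solution.
feature = stretch_data - stretch_data**-2
mu_fit = float(feature @ stress_data / (feature @ feature))

# Prediction in a loading state that was never part of the calibration data.
biaxial_prediction = equibiaxial_nominal_stress(1.5, mu_fit)
print(mu_fit, biaxial_prediction)
```

Here the extrapolation to the biaxial state is only as trustworthy as the assumed model form, which is the role that thermodynamically consistent frameworks play when data are limited.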

Different data sources (simple mechanical loading experiments, advanced imaging techniques, computational RVEs) can provide a different level of data availability and access to information (see Figure 1).

Figure 1: Scientific data for closure models: (a) Dogbone specimens from steel rebars in uniaxial tension (adapted from [14]). (b) DIC images from biaxial loading experiments (adapted from [15]). (c) Schematic of in-situ HEXD experiment of dogbone specimen in uniaxial tension (adapted from [16]). (d) Computational RVE for CP simulations (adapted from [17]). A table is included to summarize data availability and access to different types of data from the above types of experiments.

ML-enabled constitutive modeling

The rising popularity of data-driven approaches has been crucial towards setting common practices for sampling, training, testing and validating. The purpose of these practices is to optimize the use of data, and to design (physical or synthetic) experiments that test the performance of the trained models. In the context of mechanics, phenomenological modeling has led mostly to parameter estimation problems, so the use of limited experiments is warranted for the most part. When one thinks of the model discovery part of phenomenological modeling, however, the requirements on experimental data quickly become unclear. As problems become more complex, understanding the data availability, optimizing the experiments, and designing the learning approach accordingly are crucial.


The development of an automated data-driven approach for constitutive modeling is a very practical need connected with material discovery, industrial engineering simulations and research. Its benefits can include more accurate predictions, reduced human involvement, a faster processing-performance-product development cycle, and the alleviation of high computational costs.

Ideally, data-driven constitutive models should:
1. be robust towards implementation in numerical solvers (both traditional like FEM, but also ML-enabled like PINN) for the solution of structural problems
2. not lead to unphysical results
3. be interpretable (human-understandable)
4. account for uncertainty
5. be data-availability aware
6. be able to account for observations from different data sources.

Even before reaching ML-enabled discovery of constitutive models, ML tools have often been utilized for parameter estimation of known constitutive models [18]. This task becomes more complex as the number of model parameters increases and experimental observations remain limited, due to the non-convex nature of the optimization problem at hand, and it is a task that ML approaches excel at.

Utilizing labeled data pairs, a first generation of data-driven ML-enabled constitutive models was pioneered by Ghaboussi and collaborators as well as Furukawa and collaborators [19–22]. These works often focused on 1D responses for different classes of materials, and did not try to tackle the main issues of ML approaches, namely overfitting, failure to generalize and lack of interpretability. A second wave of ML-enabled constitutive modeling focused on the development of approaches that can tackle problems from hyperelasticity to elastoplasticity more robustly, and can eventually be integrated in FE frameworks [23–25].

More recently, and aiming to tackle generalization and the limited-data nature of mechanics experiments, physical constraints have been utilized towards the construction of ML frameworks for constitutive modeling. From objectivity to material symmetries and thermodynamic constraints, there are approaches that enforce these conditions weakly (through the loss function) [26,27] or strictly in the construction of the framework [28–30]. Strict enforcement of polyconvexity requirements for the strain energy density in the context of hyperelasticity has also proven extremely useful towards generalization, discovery and robustness [31–33], where in some cases even interpretability can be achieved, as seen in [34], due to the non-parametric nature of the specific implementation. Especially as one moves to path-dependent problems, where data-driven discovery is particularly challenging [35,36], approaching elastoplasticity in a modular fashion [37–39] and utilizing some mechanistic intuition (towards constraining the learning process in balance with data availability) is crucial towards efficient utilization of the data and the development of robust approaches that can efficiently generalize.
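The difference between weak and strict enforcement can be illustrated with a toy strain energy that satisfies objectivity and convexity by construction rather than through a penalty. The sketch below is an assumption-laden example and not the architecture of any of the cited works: the energy depends on the deformation gradient F only through the invariant I1 = tr(FᵀF), and it uses squared weights with a softplus activation so that it is convex and non-decreasing in I1, hence convex in F. The names `strain_energy`, `w` and `a` are hypothetical.

```python
import numpy as np

def softplus(x):
    """Numerically stable log(1 + exp(x)); convex and non-decreasing."""
    return np.logaddexp(0.0, x)

def strain_energy(F, w, a):
    """Toy energy W(I1): squared weights keep every term convex and
    non-decreasing in I1, and the offset makes W vanish at F = I."""
    I1 = np.sum(F * F)  # tr(F^T F), unchanged under F -> Q F
    z = a**2 * (I1 - 3.0)
    return float(np.sum(w**2 * (softplus(z) - softplus(0.0))))

rng = np.random.default_rng(0)
w = rng.normal(size=4)  # stand-ins for trainable parameters
a = rng.normal(size=4)

F1 = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
F2 = np.eye(3) + 0.1 * rng.normal(size=(3, 3))

# Objectivity check: the energy is unchanged by an orthogonal matrix Q.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
objectivity_gap = abs(strain_energy(Q @ F1, w, a) - strain_energy(F1, w, a))

# Midpoint convexity check along the segment between two deformation gradients.
midpoint = strain_energy(0.5 * (F1 + F2), w, a)
average = 0.5 * (strain_energy(F1, w, a) + strain_energy(F2, w, a))
convexity_holds = midpoint <= average + 1e-12
print(objectivity_gap, convexity_holds)
```

Because these properties hold for any values of `w` and `a`, they cannot be violated during training, whereas a loss-based penalty only encourages them at the sampled data points. A practical model would also need additional invariants, volumetric terms and growth conditions, which are omitted here.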

Inspired by approaches using symbolic regression [40], previously utilized to discover laws from free-form data, and under the sparse regression umbrella, the EUCLID platform makes it possible to work with unlabeled data to discover interpretable constitutive laws for a wide array of material classes [41,42]. This approach utilizes a library of known models and full-field strain information (from synthetic or DIC experiments) as input, along with global loading information, to distill parsimonious data-driven constitutive laws. A very interesting extension was the development of an unsupervised Bayesian framework for discovering hyperelasticity models in EUCLID, accounting for uncertainty [43].
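The idea of distilling a parsimonious law from a library of candidate terms can be sketched with a simple sparse regression. This is not the EUCLID algorithm: for brevity the example assumes labeled, noise-free uniaxial stress-stretch pairs instead of unlabeled full-field data with global forces, and it uses a basic sequentially thresholded least squares scheme. The four-term library, the threshold, and all function names are assumptions for illustration.

```python
import numpy as np

def feature_library(stretch):
    """Uniaxial nominal stress contributed by each candidate energy term
    for an incompressible isotropic material, P = 2(l - l^-2)(W_1 + W_2 / l)."""
    I1 = stretch**2 + 2.0 / stretch
    common = 2.0 * (stretch - stretch**-2)
    return np.column_stack([
        common,                          # W = (I1 - 3)
        common / stretch,                # W = (I2 - 3)
        common * 2.0 * (I1 - 3.0),       # W = (I1 - 3)^2
        common * 3.0 * (I1 - 3.0) ** 2,  # W = (I1 - 3)^3
    ])

def thresholded_least_squares(A, b, threshold=1e-3, iterations=10):
    """Repeatedly drop small coefficients and refit the remaining terms."""
    coeffs = np.linalg.lstsq(A, b, rcond=None)[0]
    for _ in range(iterations):
        active = np.abs(coeffs) > threshold
        coeffs = np.zeros_like(coeffs)
        if not active.any():
            break
        coeffs[active] = np.linalg.lstsq(A[:, active], b, rcond=None)[0]
    return coeffs

# Synthetic data generated by a law with two active terms out of four.
stretch = np.linspace(0.6, 2.5, 60)
true_coeffs = np.array([0.5, 0.0, 0.1, 0.0])
A = feature_library(stretch)
stress = A @ true_coeffs

found = thresholded_least_squares(A, stress)
print(found)
```

With noisy or purely tensile data the candidate terms become harder to tell apart, which is one reason why richer data sources and stronger sparsity-promoting formulations are used in practice.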


At this stage, insufficient attention has still been given to data availability and uncertainty propagation, and to the design of appropriate verification and validation processes. Most ML-based models have been designed from a big-data perspective, whereas experiments and lower-scale computations can only provide limited data, often due to high computational cost. For the majority of structural engineering systems, modeling is conducted around deterministic frameworks and uncertainties are taken into account through safety factors. One major bottleneck is that the solution of structural problems with uncertain parameters is prohibitively expensive when using traditional Monte-Carlo type sampling approaches. Additionally, the deterministic models used to model structural systems are developed based on experimental information that carries uncertainty but does not account for it, and all the information about uncertainty is lost when the structural problem is solved. This leaves a two-way disconnect between engineering systems and their virtual twins, which is critical for assessing the risk and reliability of these systems.
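The cost argument for Monte-Carlo sampling can be seen even in a toy problem. The sketch below propagates an uncertain Young's modulus through the closed-form tip displacement of an axially loaded bar. The lognormal distribution, the load and geometry values, and the function name `tip_displacement` are assumptions for this example only.

```python
import numpy as np

def tip_displacement(youngs_modulus, force=1.0e3, length=1.0, area=1.0e-4):
    """Tip displacement u = F L / (E A) of a linear elastic bar."""
    return force * length / (youngs_modulus * area)

rng = np.random.default_rng(0)
n_samples = 100_000

# Assumed lognormal modulus with a median of 200 GPa and about 10% scatter.
log_mean, log_std = np.log(200.0e9), 0.1
modulus_samples = rng.lognormal(mean=log_mean, sigma=log_std, size=n_samples)

# One "structural solve" per sample; here the solve is a single formula.
displacements = tip_displacement(modulus_samples)
mc_mean = displacements.mean()
mc_std = displacements.std(ddof=1)

# Closed-form mean for comparison, using E[1/E] of a lognormal variable.
exact_mean = tip_displacement(1.0) * np.exp(-log_mean + 0.5 * log_std**2)
print(mc_mean, mc_std, exact_mean)
```

Each sample here costs one function evaluation, but in a realistic setting every sample would require a full nonlinear finite element solve, which is what makes plain sampling so expensive and motivates surrogate and uncertainty-aware constitutive models.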

To conclude, some challenges and opportunities for the next steps in ML-enabled constitutive modeling and computational mechanics are the following:

• Enable interpretability and generalization in a data-availability-aware setting.

• Develop multifidelity learning frameworks that account for uncertainty utilizing a variety of data sources.

• Develop the next generation of verification and validation processes for ML-enabled tools in computational mechanics.

• Develop benchmarks (similar to recent work in [44], and following the discussion from the Journal Club entry of May 2022) to test the robustness of new ML-enabled constitutive laws. Similar to how the Sandia Fracture Challenge was very useful for the mechanics community, datasets that provide several modalities of training, testing and validation data, potentially extending to multifidelity data, can be critical for the next steps in the community.

• Develop the next generation of open-source numerical solver platforms that allow flexibility for constitutive model implementation, direct utilization of experimental data, access to ML tools and solution of inverse problems.




[1] G. Wang, Y. Xie, Y. Jiang, A. Mandlekar, C. Xiao, Y. Zhu, L. Fan, A. Anandkumar, Voyager: An open-ended embodied agent with large language models, arXiv preprint arXiv:2305.16291 (2023).
[2] T. Kadeethum, D. O’Malley, J. N. Fuhg, Y. Choi, J. Lee, H. S. Viswanathan, N. Bouklas, A framework for data-driven solution and parameter estimation of pdes using conditional generative adversarial networks, Nature Computational Science 1 (12) (2021) 819–829.
[3] D. Xiao, C. Heaney, F. Fang, L. Mottet, R. Hu, D. Bistrian, E. Aristodemou, I. Navon, C. Pain, A domain decomposition non-intrusive reduced order model for turbulent flows, Computers & Fluids 182 (2019) 15–27.
[4] F. Ballarin, A. D’amario, S. Perotto, G. Rozza, A POD-selective inverse distance weighting method for fast parametrized shape morphing, International Journal for Numerical Methods in Engineering 117 (8) (2019) 860–884.
[5] L. Venturi, F. Ballarin, G. Rozza, A weighted POD method for elliptic PDEs with random inputs, Journal of Scientific Computing 81 (1) (2019) 136–153.
[6] J. Hesthaven, G. Rozza, B. Stamm, et al., Certified reduced basis methods for parametrized partial differential equations, Springer, 2016.
[7] L. Lu, P. Jin, G. Pang, Z. Zhang, G. E. Karniadakis, Learning nonlinear operators via deeponet based on the universal approximation theorem of operators, Nature machine intelligence 3 (3) (2021) 218–229.
[8] N. Kovachki, Z. Li, B. Liu, K. Azizzadenesheli, K. Bhattacharya, A. Stuart, A. Anandkumar, Neural operator: Learning maps between function spaces, arXiv preprint arXiv:2108.08481 (2021).
[9] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, A. Anandkumar, Fourier neural operator for parametric partial differential equations, arXiv preprint arXiv:2010.08895 (2020).
[10] J. Hesthaven, S. Ubbiali, Non-intrusive reduced order modeling of nonlinear problems using neural networks, Journal of Computational Physics 363 (2018) 55–78.
[11] Q. Wang, J. Hesthaven, D. Ray, Non-intrusive reduced order modeling of unsteady flows using artificial neural networks with application to a combustion problem, Journal of computational physics 384 (2019) 289–307.
[12] I. E. Lagaris, A. Likas, D. I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE transactions on neural networks 9 (5) (1998) 987–1000.
[13] M. Raissi, P. Perdikaris, G. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707.
[14] S. Raza, J. Michels, M. Shahverdi, Uniaxial behavior of pre-stressed iron-based shape memory alloy rebars under cyclic loading reversals, Construction and Building Materials 326 (2022) 126900.
[15] K. B. Putra, X. Tian, J. Plott, A. Shih, Biaxial test and hyperelastic material models of silicone elastomer fabricated by extrusion-based additive manufacturing for wearable biomedical devices, Journal of the Mechanical Behavior of Biomedical Materials 107 (2020) 103733.
[16] M. Miller, P. Dawson, Understanding local deformation in metallic polycrystals using high energy x-rays and finite elements, Current Opinion in Solid State and Materials Science (0) (2014)
[17] J. N. Fuhg, L. van Wees, M. Obstalecki, P. Shade, N. Bouklas, M. Kasemer, Machine-learning convex and texture-dependent macroscopic yield from crystal plasticity simulations, Materialia 23 (2022) 101446.
[18] J. Wang, T. Li, F. Cui, C.-Y. Hui, J. Yeo, A. T. Zehnder, Metamodeling of constitutive model using gaussian process machine learning, Journal of the Mechanics and Physics of Solids (2021) 104532.
[19] J. Ghaboussi, J. H. Garrett, X. Wu, Material modeling with neural networks, in: Proc. Int. Conf. on Numerical Methods in Engineering: Theory and Applications, 1990, pp. 701–717.
[20] J. Ghaboussi, J. Garrett Jr, X. Wu, Knowledge-based modeling of material behavior with neural networks, Journal of engineering mechanics 117 (1) (1991) 132–153.
[21] J. Ghaboussi, Neuro-biological computational models with learning capabilities and their applications in geomechanical modeling, in: Proceedings, Workshop on Recent Accomplishments and Future Trends in Geomechanics in the 21st Century, 1992.
[22] T. Furukawa, A neural constitutive model for viscoplasticity, in: International Conference on Computational Engineering Science, Costa Rica, 1997, pp. 453–458.
[23] M. A. Bessa, R. Bostanabad, Z. Liu, A. Hu, D. W. Apley, C. Brinson, W. Chen, W. K. Liu, A framework for data-driven analysis of materials under uncertainty: Countering the curse of dimensionality, Computer Methods in Applied Mechanics and Engineering 320 (2017) 633–667.
[24] D. Huang, J. N. Fuhg, C. Weissenfels, P. Wriggers, A machine learning based plasticity model using proper orthogonal decomposition, Computer Methods in Applied Mechanics and Engineering 365 (2020) 113008.
[25] J. N. Fuhg, M. Marino, N. Bouklas, Local approximate gaussian process regression for data-driven constitutive models: development and comparison with neural networks, Computer Methods in Applied Mechanics and Engineering 388 (2022) 114217.
[26] F. Masi, I. Stefanou, P. Vannucci, V. Maffi-Berthier, Thermodynamics-based artificial neural networks for constitutive modeling, Journal of the Mechanics and Physics of Solids 147 (2021) 104277.
[27] K. Linka, M. Hillgärtner, K. P. Abdolazizi, R. C. Aydin, M. Itskov, C. J. Cyron, Constitutive artificial neural networks: A fast and general approach to predictive data-driven constitutive modeling by deep learning, Journal of Computational Physics 429 (2021) 110010.
[28] A. Frankel, K. Tachida, R. Jones, Prediction of the evolution of the stress field of polycrystals undergoing elastic-plastic deformation with a hybrid neural network model, Machine Learning: Science and Technology 1 (3) (2020) 035005.
[29] J. N. Fuhg, N. Bouklas, On physics-informed data-driven isotropic and anisotropic constitutive models through probabilistic machine learning and space-filling sampling, Computer Methods in Applied Mechanics and Engineering 394 (2022) 114915.
[30] L. Linden, D. K. Klein, K. A. Kalina, J. Brummund, O. Weeger, M. Kästner, Neural networks meet hyperelasticity: A guide to enforcing physics, arXiv preprint arXiv:2302.02403 (2023).
[31] D. K. Klein, M. Fernández, R. J. Martin, P. Neff, O. Weeger, Polyconvex anisotropic hyperelasticity with neural networks, Journal of the Mechanics and Physics of Solids 159 (2022) 104703.
[32] V. Tac, F. S. Costabal, A. B. Tepole, Data-driven tissue mechanics with polyconvex neural ordinary differential equations, Computer Methods in Applied Mechanics and Engineering 398 (2022) 115248.
[33] J. N. Fuhg, N. Bouklas, R. E. Jones, Learning hyperelastic anisotropy from data via a tensor basis neural network, arXiv preprint arXiv:2204.04529 (2022).
[34] K. Linka, E. Kuhl, A new family of constitutive artificial neural networks towards automated model discovery, Computer Methods in Applied Mechanics and Engineering 403 (2023) 115731.
[35] B. Liu, N. Kovachki, Z. Li, K. Azizzadenesheli, A. Anandkumar, A. Stuart, K. Bhattacharya, A learning-based multiscale method and its application to inelastic impact problems, arXiv preprint arXiv:2102.07256 (2021).
[36] N. N. Vlassis, W. Sun, Geometric learning for computational mechanics part ii: Graph embedding for interpretable multiscale plasticity, Computer Methods in Applied Mechanics and Engineering 404 (2023) 115768.
[37] N. N. Vlassis, W. Sun, Sobolev training of thermodynamic-informed neural networks for interpretable elasto-plasticity models with level set hardening, Computer Methods in Applied Mechanics and Engineering 377 (2021) 113695.
[38] J. N. Fuhg, A. Fau, N. Bouklas, M. Marino, Enhancing phenomenological yield functions with data: Challenges and opportunities, European Journal of Mechanics-A/Solids (2023) 104925.
[39] J. N. Fuhg, C. M. Hamel, K. Johnson, R. Jones, N. Bouklas, Modular machine learning-based elastoplasticity: Generalization in the context of limited data, Computer Methods in Applied Mechanics and Engineering 407 (2023) 115930.
[40] M. Schmidt, H. Lipson, Distilling free-form natural laws from experimental data, science 324 (5923) (2009) 81–85.
[41] P. Thakolkaran, A. Joshi, Y. Zheng, M. Flaschel, L. De Lorenzis, S. Kumar, Nn-euclid: deep-learning hyperelasticity without stress data, arXiv preprint arXiv:2205.06664 (2022).
[42] M. Flaschel, S. Kumar, L. De Lorenzis, Discovering plasticity models without stress data, npj Computational Materials 8 (1) (2022) 1–10.
[43] A. Joshi, P. Thakolkaran, Y. Zheng, M. Escande, M. Flaschel, L. De Lorenzis, S. Kumar, Bayesian-euclid: discovering hyperelastic material laws with uncertainties, arXiv preprint arXiv:2203.07422 (2022).
[44] V. Tac, K. Linka, F. Sahli-Costabal, E. Kuhl, A. B. Tepole, Benchmarks for physics-informed data-driven hyperelasticity, arXiv preprint arXiv:2301.10714 (2023).
