Accurate piezoelectric tensor prediction with equivariant attention tensor graph neural network

Accurate piezoelectric tensor prediction with equivariant attention tensor graph neural network


Equivariant attention tensor graph neural network

EATGNN can establish the relationship between crystal structures and their properties. The architecture and specific details of EATGNN are shown in Fig. 1. It takes the crystal structure as input22, represents it as a crystal graph \({\mathcal{G}}(V,E)\), and utilizes message passing and multi-head attention mechanism to update the graph across multiple graph attention layers. Ultimately, the model outputs the piezoelectric tensor in its irreducible representation, which is then post-processed to obtain the target piezoelectric tensor.

Fig. 1: Schematic overview of EATGNN.
figure 1
The model converts the input crystal structure into a graph representation, which is then processed by an equivariant graph attention neural network to predict the piezoelectric tensor as the output. In this figure, “\(\oplus\)” denotes addition, “\(\otimes\)” denotes multiplication, and “TP” means tensor product.

The crystal graph \({\mathcal{G}}(V,E)\) is composed of nodes and edges, where the nodes represent atoms, and edges represent atomic bonds. The feature vector \({f}_{i}\) characterizes atom \(i\), and the initial value of \({f}_{i}\) is obtained by Magpie23,24 which describes the physical properties of an atom using a vector that contains 21 atomic physical attributes, including atomic weight, covalent radius, electronegativity, group number, period number, and other information. We then perform one-hot encoding on these 21 atomic attribute values, resulting in a 119-dimensional vector to represent the nodes’ attributes and initial features25. A cutoff radius \({r}_{{cut}}\) varies for different unit cell sizes to better capture the full symmetry of the crystal, and edges are created to connect atoms within the cutoff radius, considering the periodic boundary conditions during the search for neighboring atoms. For the edge embedding, it is processed in two parts: first, the radial component \({||}{\vec{r}}_{{ij}}{||}\) is expanded using a smooth-finite basis functions, where \({\vec{r}}_{{ij}}\) is a vector between atoms \(i\) and \(j\), and the angular component \({\hat{\vec{r}}}_{{ij}}\) is expanded using spherical harmonics \({Y}_{l}^{m}\).

GNNs typically update information through a message-passing mechanism.

$${f}_{i}^{{\prime} }=\mathop{\sum }\limits_{j\in \partial (i)}{f}_{j}\otimes h({\rm{||}}{\vec{r}}_{{ij}}{\rm{||}})Y\left(\frac{{\vec{r}}_{{ij}}}{{||}{\vec{r}}_{{ij}}{||}}\right),$$

(1)

$${\left({x}^{\left({L}_{1}\right)}\otimes {y}^{\left({L}_{2}\right)}\right)}_{{m}_{3}}=\mathop{\sum }\limits_{{m}_{1}=-{L}_{1}}^{{L}_{1}}\mathop{\sum }\limits_{{m}_{2}=-{L}_{2}}^{{L}_{2}}{C}_{\left({L}_{1},{m}_{1}\right),\left({L}_{2},{m}_{2}\right)}^{\left({L}_{3},{m}_{3}\right)}{x}_{{m}_{1}}^{\left({L}_{1}\right)}{y}_{{m}_{2}}^{\left({L}_{2}\right)}$$

(2)

Equation (1) represents a general message-passing formula for equivariant graph neural networks, where \({f}_{j}\), \({f}_{i}^{{\prime} }\) are the nodes input and output, \(\partial (i)\) is the set of average neighbors of the atom \(i\) in the training dataset, \({\vec{r}}_{{ij}}\) is the relative vector, \(h\) is a MLP with distance between atoms as input, \(Y\) is the spherical harmonics and \(x\otimes y\) is tensor product of \(x\) and \(y\) parametrized by some weights w10,18, the tensor product \(\otimes\) is defined in Eq. (2), where \(C\) denotes the Clebsch-Gordan coefficients. Equation (6) is an equivariant function, so if the input is rotated, the output is transformed accordingly. Unlike traditional GNNs, the atomic features in EATGNN include not only scalars but also higher-order irreps tensors, which enhances the model’s generalization capabilities and allows it to achieve better results with less data18.

In this work, we replace this summation process in Eq. (1) with a multi-head attention mechanism. The attention mechanism transforms features sent from one node to another with input-dependent weights10, and enhances the model’s ability to update information of the node and graph by leveraging the neighbors of each node. We perform a tensor product of the node features and node attributes to obtain the query (\({q}_{i}\)), and a tensor product of the node features and edge attributes to obtain the key (\({k}_{i}\)) and value (\({v}_{i}\)) required for the attention mechanism computation. Then, the \({q}_{i}\), \({k}_{i}\) and \({v}_{i}\) are each split according to the number of heads10, then, for each head, we obtain the respective \({q}_{{im}}\), \({k}_{{im}}\) and \({v}_{{im}}\). And then calculate the attention coefficients \({\alpha }_{{ijm}}\), which are defined by Eq. (3), the calculation of the output for node features \({f}_{{im}}^{{\prime} }\) is shown as in Eq. (4) and concatenate the representations from all heads back together17. Afterward, the node features are subjected to nonlinear activation and equivariant layer normalization10. Before the model outputs the target quantity, an average pooling is performed across all nodes, since our target physical quantity, the piezoelectric tensor, is independent of the number of atoms in the unit cell.

$${\alpha }_{{ijm}}=\frac{\exp \left({q}_{{im}}\otimes {k}_{{jm}}\right)}{{\sum }_{{{j}^{{\prime} }}=1}^{n}\exp ({q}_{{im}}\otimes {k}_{{{j^{{\prime} }}m}})}$$

(3)

$${f}_{{im}}^{{\prime} }=\mathop{\sum }\limits_{j=1}^{n}{\alpha }_{{ijm}}{v}_{{jm}}$$

(4)

In summary, EATGNN can be considered as an equivariant function to the group transformation, such that:

$${D}_{y}\left(g\right)f\left(x\right)=f\left({D}_{x}\left(g\right)x\right),$$

(5)

where \({D}_{x}(g)\) and \({D}_{y}\left(g\right)\) are the transformation matrices for the crystal structure and the piezoelectric tensor, respectively. Equation (5) ensures the model’s capability for equivariant recognition, accommodating the symmetry operations of major space groups and outputting a piezoelectric tensor that conforms to the material’s intrinsic symmetry. Under these symmetry constraints, many components of the piezoelectric tensor are restricted, thereby significantly enhancing the model’s predictive accuracy for piezoelectric tensors.

Irreducible representation of piezoelectric tensor

The piezoelectric tensor \({e}_{{ijk}}\) is a third-rank tensor, with a total of 27 components, and it satisfies the index symmetry relation \({e}_{{ijk}}={e}_{{ikj}}\). Therefore, there are a total of 18 independent components for a piezoelectric tensor. The intrinsic symmetry requirements of materials further restrict the number of independent components in the piezoelectric tensor (Fig. 2a). It is worth noting that the specific representation of the third-rank piezoelectric tensor depends on the choice of the coordinate axes. Thus, the rotation of the reference coordinate axes will change the values of tensor elements. The piezoelectric component varies with the rotation of the coordinate axes, as follows:

$${e}_{{ijk}}^{* }={A}_{{il}}{A}_{{jm}}{A}_{{kn}}{e}_{{lmn}},$$

(6)

where \({A}_{{il}}\) are the elements of the rotation matrix corresponding to the two coordinate axes26. Therefore, when we know the piezoelectric tensor in one coordinate system, the formula can be used to find the representation of the piezoelectric tensor in any other coordinate system. Due to the constraints imposed by the intrinsic symmetry of the crystals and in conjunction with Eq. (6), we obtain the distribution of independent elements in the piezoelectric tensors across all crystal systems and point groups, as illustrated in Fig. 2a. Nevertheless, the numerical values of the final piezoelectric tensor matrix still depend on the choice of coordinate system and cannot be directly expressed in an equivariant form with respect to the crystal structure.

Fig. 2: Schematic of piezoelectric tensor.
figure 2

a Symmetry classes and independent components of the piezoelectric tensor in each crystal system and point group. b Irreducible representation decomposition of the piezoelectric tensor.

To address this issue, we performed a harmonic decomposition of the piezoelectric tensor27. Then, the space of piezoelectric tensors is factored into the direct sum of irreducible representations. The piezoelectric tensor can be decomposed into four irreducible subspaces:

$${Piez}\to {{\mathcal{H}}}^{3}\oplus {{\mathcal{H}}}^{1}\oplus {{\mathcal{H}}}^{2}\oplus {{\mathcal{H}}}^{1}$$

(7)

where \({{\mathcal{H}}}^{n}\) is the space of nth order symmetric and traceless tensors28, and a symmetric and traceless tensor is called a deviator. For example, scaler and vector are zeroth- and first-order deviator. The piezoelectric tensor \({e}_{{ijk}}\) has the orthogonal irreducible decomposition27,29:

$${e}_{{ijk}}={\delta }_{{jk}}{u}_{i}+\left({\delta }_{{ij}}{v}_{k}+{\delta }_{{ik}}{v}_{j}-\frac{2}{3}{\delta }_{{jk}}{v}_{i}\right)+\left({\epsilon }_{{iks}}{D}_{{sj}}+{\epsilon }_{{ijs}}{D}_{{sk}}\right)+{D}_{{ijk}}$$

(8)

where \(u=({u}_{i})\) and \(v={(v}_{j})\) is a vector with three independent components, \(D=({D}_{{ij}})\) is a second order symmetric and traceless tensor with five independent components, \({D}_{{ijk}}\) is a third order symmetric and traceless tensor with seven independent components, \({\epsilon }_{{ijk}}\) is the Levi-Civita symbol and \({\delta }_{{ij}}\) is the Kronecker delta27,28,29, as shown in Fig. 2b. In Eq. (8), each component of the decomposed piezoelectric tensor is equivariant to rotational operations. This decomposition provides a powerful framework for physically understanding the piezoelectric behavior of various materials based on the contributions from these irreducible subspaces.

With the irreducible representation of the piezoelectric tensor, we now can overcome the challenges in previous reports11,12,13,18 posed by the choice of coordinate system in the prediction of piezoelectric tensors. Therefore, by combining the irreducible representation of the piezoelectric tensor and the multi-head attention mechanism with an equivariant model, we can achieve highly accurate AI model for predicting the piezoelectric tensor of crystals. This significantly accelerates the discovery of new and high-performance piezoelectric materials and devices.

Piezoelectric tensor machine learning for bulk crystals

The piezoelectric tensor data of bulk crystals in this work was obtained from the Materials Project21 and a recent high-throughput computational study on piezoelectric tensors24,30. Due to the presence of a significant number of outliers in the piezoelectric tensors, we performed data resampling before applying machine learning techniques to prevent overfitting. Additionally, we restricted the materials used in this work to a unit cell containing no more than 25 atoms to ensure a reasonable distribution of data. We examined all materials dataset to ensure that the target piezoelectric tensors are consistent with the material’s symmetry. Furthermore, we mathematically adjusted those piezoelectric tensors that did not conform to the material’s symmetry to be invariant under crystallographic symmetry operations. Ultimately, we obtained a dataset comprising 3444 entries, and these data were randomly divided into training and test sets in a 9:1 ratio.

Figure 3a shows the distribution of the absolute values of the maximum piezoelectric components in the dataset after resampling. The MAE of each piezoelectric tensor element in Voigt notation is displayed in Fig. 3b using a heatmap. Unlike other models about piezoelectric tensor prediction, EATGNN can directly predict complete piezoelectric tensors that conform to crystal symmetries, and each tensor element exhibits relatively small errors. Utilizing the complete piezoelectric tensor allows for the determination of the maximum piezoelectric components, including the maximum longitudinal, transverse, and shear piezoelectric effects. Furthermore, the Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) of EATGNN on the test set are 0.154 C/m2 and 0.141 C/m2, respectively. Table 1 compares the performance of EATGNN with the equivariant model MatTen18, the invariant model ALIGNN, and the invariant model based on CFID descriptors combined with XGBoost (XGB)31. It can be observed that EATGNN outperforms the other models in both MAE and RMSE. Additionally, we evaluated the impact of the multi-head attention mechanism on the learning capability of the equivariant model. The results in Table 1 indicate that incorporating this mechanism further enhances the model’s predictive performance. This demonstrates the effectiveness of the multi-head attention mechanism and harmonic decomposition of the tensor.

Fig. 3: Performance of EATGNN on various piezoelectric properties and the distribution of dataset.
figure 3

a Histogram of the distribution of \({|{e}_{{ij}}|}_{\max }\) for bulk materials. b Heatmap of the MAE of the piezoelectric tensors for bulk materials in unit of \({\rm{C}}/{{\rm{m}}}^{2}\). c Histogram of the distribution of \({|{e}_{{ij}}|}_{\max }\) for 2D materials. d Regression plot of the maximum piezoelectric component in 2D materials. e Regression plot of the maximum longitudinal piezoelectric component in 2D materials. f The maximum piezoelectric component of MoS2 under various rotations of the coordinate system, the coincidence of the solid and dashed lines indicates that EATGNN satisfies equivariance. The green curve represents the maximum piezoelectric components of the initial output piezoelectric tensor at various rotation angles, while the yellow stars denote the maximum piezoelectric components output by EATGNN at each angle.

Table 1 Performance of various models in predicting the complete bulk piezoelectric tensor (C/m2) and 2D piezoelectric tensor (pC/m)

Piezoelectric tensor machine learning for 2D crystals

The high demand for flexible nanodevices drives interest in 2D piezoelectric materials, essential for precision actuators, wearable sensors, and smart materials. These materials serve as nano-generators, offering a practical alternative to micro-scale battery packs for powering nanoscale devices. In contrast to bulk systems, the periodicity along the \(c\)-direction in 2D materials is lost, and consequently the 2D piezoelectric tensor can be further simplified to a 3 × 3 matrix32. Therefore, there has not been any reported machine learning models to predict piezoelectric tensors for both bulk and 2D crystals.

Next, we will prove that our EATGNN model is also applicable for 2D materials with good accuracy. The piezoelectric tensor data of 2D materials was obtained from the C2DB database33,34, comprising a total of 1382 data points, after analyzing and removing some extreme outliers to prevent the model from learning an inaccurate data distribution, 1350 data points were retained and randomly split into training and test sets with 9:1 ratio. Figure 3c shows the distribution of the absolute values of the maximum piezoelectric components in the dataset after resampling. EATGNN requires that the output maintains equivariance when the input data undergoes a group transformation. We first checked whether the piezoelectric tensors in the dataset satisfy the symmetry of their corresponding crystal structures. This was achieved by applying the crystal structure’s symmetry operations to the tensor and checking if it remains invariant under these operations. If a change occurred, we transformed the tensor into a symmetry-compliant form that respects the lattice symmetry22. Generally, this operation can utilize symmetry constraints to restrict certain tensor elements to zero.

EATGNN can output the full piezoelectric tensor, and the RMSE and MAE are used to evaluate the performance of this model. the RMSE and MAE of our model are 50.6 pC/m and 16.8 pC/m, respectively. Table 1 compares the performance of EATGNN on 2D materials with that of the equivariant model MatTen, the invariant model ALIGNN, and the invariant model based on CFID descriptors combined with XGB, similar to the comparisons performed for bulk models. It can be observed that EATGNN outperforms the other models in both MAE and RMSE, with the incorporation of the multi-head attention mechanism further enhancing its predictive capability.

We also computed several commonly used piezoelectric properties: the maximum piezoelectric component and longitudinal component, all taken as absolute values (\({|{e}_{{ij}}|}_{\max }\) and \({|{e}_{{ii}}|}_{\max }\)). Model performance on the test set is shown in Fig. 3d and e. Additionally, we generate material descriptors using CFID and train scalar models with XGB and Random Forests (RF)35 for comparison. The results, shown in Table 2, indicate that for predicting the highly symmetry-dependent physical quantity of the piezoelectric tensor, EATGNN demonstrates a significant advantage over the descriptor-based methods.

Table 2 Performance of various models in predicting the maximum piezoelectric component (pC/m) and maximum longitudinal piezoelectric component (pC/m)

To ensure the model’s equivariance, we rotated the crystal structure and anticipated that the predicted piezoelectric tensor would correspondingly transform. Using Eq. (6), we calculated the piezoelectric tensor under rotational operations at various angles and checked if it was consistent with the model’s output. As shown in Fig. 3f, we calculated the maximum piezoelectric component \({|{e}_{{ij}}|}_{\max }\) of MoS2 under arbitrary in-plane rotations of the coordinate system. Simultaneously, we rotate the crystal structure accordingly, and employ the model to predict the corresponding rotated piezoelectric tensor and its maximum piezoelectric component. By comparing the predicted and calculated values, we verify the equivariance of the model as shown in Fig. 3f. The results demonstrate that the model predictions perfectly match the actual calculations, confirming that this model is indeed an equivariant model for piezoelectric tensor prediction.

Screening of crystals with large piezoelectricity

In order to discover new potential high-performance piezoelectric materials, we use our model on both 3D and 2D materials in the Materials Project and 2DMatPedia database36, respectively. Our selection criteria are 1) out of the training datasets, 2) inversion asymmetry, 3) Egap > 0.1 eV, and 4) Nuc < 30 atoms. Using our model, we identify several good candidates (See in Tables 3 and 4 and Supplementary Material), which are further validated using density functional perturbation theory (DFPT) calculations. As demonstrated in Fig. 4a, the distribution of the maximum piezoelectric components of these newly discovered bulk materials were obtained through first-principles calculations. The results of one of the best piezoelectric materials, CsBiNb2O7, is shown in Fig. 4b, with its piezoelectric tensor components \({e}_{22}\) and \({e}_{26}\) being 2.01 and 2.72 C/m2, respectively. The piezoelectric tensor can be further decomposed into two parts: a clamped-ion term and internal strain term37. One can observes that the total \({e}_{22}\) is dominated by the large internal strain term. However, the clamped-ion contribution is nearly zero. Besides, since the piezoelectric tensor depends on the choice of coordinate system, rotating the structure by 45 degrees around the z-axis can yield larger piezoelectric tensor components \({e}_{11}=\) 3.14 C/m2. Based on this property, a Z-45° cutting slice can be fabricated in experiments to achieve greater longitudinal piezoelectric effect.

Table 3 A selection of potential bulk piezoelectric materials, along with their Material Project IDs, formulas and \({|{e}_{{ij}}|}_{\max }({\rm{C}}/{{\rm{m}}}^{2})\) calculated by DFT
Table 4 A selection of potential 2D piezoelectric materials, along with their 2DMatPedia IDs, formulas and \({|{e}_{{ij}}|}_{\max }({\rm{pC}}/{\rm{m}})\) calculated by DFT
Fig. 4: Analysis of the potential piezoelectric materials.
figure 4

a Histogram of the distribution of \({|{e}_{{ij}}|}_{\max }\) of new potential bulk piezoelectric materials. b The variation of polarization with strain along b axis in \({{\rm{CsBiNb}}}_{2}{{\rm{O}}}_{7}\). The inset image shows the geometric structure of CsBiNb2O7.




Source link

Previous Article

Pamela Bach, Baywatch and Knight Rider star, found dead, aged 62

Next Article

Transak secures AUSTRAC registration, expands crypto services in Australia

Write a Comment

Leave a Comment

Your email address will not be published. Required fields are marked *

Subscribe to our Newsletter

Subscribe to our email newsletter to get the latest posts delivered right to your email.
Pure inspiration, zero spam ✨