Default parameters
The default parameters that control the database construction, model training, high-throughput prediction …
You can import these parameters from the agat module:
from agat.default_parameters import default_elements, default_build_properties, default_data_config, default_train_config, default_high_throughput_config
Or you can read the source code.
default_elements
Elements used to build graphs: a list of element symbols that are used to encode atomic features.
['Ac', 'Ag', 'Al', 'Am', 'Ar', 'As', 'At', 'Au', 'B', 'Ba',
'Be', 'Bh', 'Bi', 'Bk', 'Br', 'C', 'Ca', 'Cd', 'Ce', 'Cf',
'Cl', 'Cm', 'Cn', 'Co', 'Cr', 'Cs', 'Cu', 'Db', 'Ds', 'Dy',
'Er', 'Es', 'Eu', 'F', 'Fe', 'Fl', 'Fm', 'Fr', 'Ga', 'Gd',
'Ge', 'H', 'He', 'Hf', 'Hg', 'Ho', 'Hs', 'I', 'In', 'Ir',
'K', 'Kr', 'La', 'Li', 'Lr', 'Lu', 'Lv', 'Mc', 'Md', 'Mg',
'Mn', 'Mo', 'Mt', 'N', 'Na', 'Nb', 'Nd', 'Ne', 'Nh', 'Ni',
'No', 'Np', 'O', 'Og', 'Os', 'P', 'Pa', 'Pb', 'Pd', 'Pm',
'Po', 'Pr', 'Pt', 'Pu', 'Ra', 'Rb', 'Re', 'Rf', 'Rg', 'Rh',
'Rn', 'Ru', 'S', 'Sb', 'Sc', 'Se', 'Sg', 'Si', 'Sm', 'Sn',
'Sr', 'Ta', 'Tb', 'Tc', 'Te', 'Th', 'Ti', 'Tl', 'Tm', 'Ts',
'U', 'V', 'W', 'Xe', 'Y', 'Yb', 'Zn', 'Zr']
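
To illustrate how an element list can encode atomic features, the sketch below builds one-hot vectors from a short subset of the list. One-hot encoding is an assumption here (it is consistent with the first entry of gat_node_dim_list being len(default_elements)), and the helper `one_hot` is hypothetical, not part of AGAT.

```python
# Hypothetical subset of the element list; the full list above has 118 entries.
elements = ['Ag', 'Au', 'C', 'H', 'O', 'Pt']

def one_hot(symbol, species):
    """Return a one-hot vector marking the position of `symbol` in `species`."""
    vec = [0.0] * len(species)
    vec[species.index(symbol)] = 1.0  # raises ValueError for unknown symbols
    return vec

# One feature vector per atom, in the order the atoms are given.
features = [one_hot(s, elements) for s in ['H', 'O', 'Pt']]
```

Each vector has as many entries as the element list, so the order of the list fixes the meaning of every feature column.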
default_build_properties
A dictionary that defines which properties are built into the graph.
Parameter | Default value | Alternative(s) | Explanation |
---|---|---|---|
energy | True | False | Include total energy when building graphs. |
forces | True | False | Include atomic forces when building graphs. |
cell | True | False | Include structural cell when building graphs. |
cart_coords | True | False | Include Cartesian coordinates when building graphs. |
frac_coords | True | False | Include fractional coordinates when building graphs. |
constraints | True | False | Include constraint information when building graphs. |
stress | True | False | Include virial stress when building graphs. |
distance | True | False | Include distance between connected atoms when building graphs. |
direction | True | False | Include unit vector between connected atoms when building graphs. |
path | False | True | Include file path of each graph corresponding to DFT calculations when building graphs. |
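
A minimal sketch of how these defaults might be overridden. The dictionary is re-typed locally from the table above rather than imported from AGAT, and the name `my_build_properties` is only illustrative.

```python
# Local copy of the defaults listed in the table above (not imported from AGAT).
default_build_properties = {
    'energy': True, 'forces': True, 'cell': True,
    'cart_coords': True, 'frac_coords': True, 'constraints': True,
    'stress': True, 'distance': True, 'direction': True, 'path': False,
}

# Override only the entries you want to change; the defaults stay untouched.
my_build_properties = {**default_build_properties, 'stress': False, 'path': True}

# Names of the properties that would be built into each graph.
enabled = [key for key, value in my_build_properties.items() if value]
```

Building a new dictionary instead of editing the defaults in place keeps the imported defaults reusable for other runs.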
default_data_config
A dictionary that defines how to build a database.
Parameter | Default value | Alternative(s) | Explanation |
---|---|---|---|
species | default_elements above | A list of element symbols | A list of elements that are used to encode atomic features. |
path_file | 'paths.log' | str | A file listing the absolute paths where OUTCAR and XDATCAR files exist. |
build_properties | default_build_properties above | See default_build_properties | Properties to be built into the graph. |
topology_only | False | True | Build graphs with topology connections only. The energy, forces, cell, and stress will not be included. This setting has higher priority than default_build_properties. |
dataset_path | 'dataset' | A str | The directory that contains the database. |
mode_of_NN | 'ase_natural_cutoffs' | 'ase_natural_cutoffs', 'pymatgen_dist', 'ase_dist', and 'voronoi' | How connections between atoms are detected. Note that pymatgen is much faster than ase. |
cutoff | 5.0 | A float | Cutoff distance to identify connections between atoms. Not used if mode_of_NN is 'ase_natural_cutoffs'. |
load_from_binary | False | True | Read graphs from binary graph files that were constructed before. If this variable is True, the variables above are ignored. |
num_of_cores | 2 | int | Number of cores used to extract VASP files and build graphs. |
super_cell | False | True | When building graphs, a small cell may have problems finding neighbors. Set this parameter to True to repeat the cell and avoid such problems. |
has_adsorbate | False | True | Include adsorbate information when building graphs. For now, only H and O atoms are considered as adsorbate atoms. |
keep_readable_structural_files | False | True | A massive number of structural files (POSCARs) are generated under dataset_path when building graphs; you can choose whether to keep them. |
mask_similar_frames | False | True | In VASP calculations, the energy optimization generates many frames that have similar geometries and total energies. You can extract only some of them by specifying this parameter together with energy_stride. |
mask_reversed_magnetic_moments | False | float | Frames with atomic magnetic moments lower than this value will be masked. |
scale_prop | False | True | Scale the properties. This function seems to be deprecated and the source code still needs to be double-checked, so do not use it. |
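
The priority of topology_only over the build properties can be pictured with the sketch below. The function `effective_build_properties` is hypothetical and only mirrors the rule stated in the table; how AGAT applies this rule internally is not shown here.

```python
def effective_build_properties(data_config):
    """Resolve which properties get built, letting topology_only win."""
    props = dict(data_config['build_properties'])  # copy, do not mutate the input
    if data_config.get('topology_only', False):
        # Per the table, these four properties are excluded in topology-only mode.
        for key in ('energy', 'forces', 'cell', 'stress'):
            props[key] = False
    return props

# Illustrative configuration with only a few of the documented keys.
config = {
    'build_properties': {'energy': True, 'forces': True, 'cell': True,
                         'stress': True, 'distance': True},
    'topology_only': True,
}
resolved = effective_build_properties(config)
```

In this sketch, properties that describe connectivity (such as distance) keep their configured value, while the four labels named in the table are switched off.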
default_train_config
A dict that determines how to train the AGAT model.
Parameter | Default value | Alternative(s) | Explanation |
---|---|---|---|
verbose | 1 | 0, 1, 2 | Output verbosity. 0: test output; 1: validation and test output; 2: train, validation, and test output. |
dataset_path | 'dataset' | A str | The directory that contains the database. |
model_save_dir | 'agat_model' | directory name, str | A directory to save the well-trained model. |
epochs | 1000 | int | Number of training epochs. |
output_files | 'out_file' | str | A directory to store outputs of true and predicted properties. |
device | 'cuda:0' | 'cpu' | Device to train the model. Use GPU cards to accelerate training. |
validation_size | 0.15 | float, 0<validation_size<1 | Determines the proportion of the dataset to be included in the validation split. |
test_size | 0.15 | float, 0<test_size<1 | Determines the proportion of the dataset to be included in the test split. |
early_stop | True | False | Implement early stopping or not. If this is True, the training will be terminated after a specified number of epochs without model improvement. If this is False, the model weights will be saved every epoch. |
stop_patience | 300 | int | Activated when early_stop=True. The training will be terminated after stop_patience epochs without model improvement. |
head_list | ['mul', 'div', 'free'] | list | A list of attention mechanisms. See agat/model/model.py. |
gat_node_dim_list | [len(default_elements), 100, 100, 100] | list | Node dimensions of the AGAT Layer. |
energy_readout_node_list | [len(head_list)*gat_node_dim_list[-1], 100, 50, 30, 10, 3, 1] | list | A list of node dimensions of energy readout layers. |
force_readout_node_list | [len(head_list)*gat_node_dim_list[-1], 100, 50, 30, 10, 3, 3] | list | A list of node dimensions of force readout layers. |
stress_readout_node_list | [len(head_list)*gat_node_dim_list[-1], 100, 50, 30, 10, 3, 6] | list | A list of node dimensions of stress readout layers. |
bias | True | False | Add bias or not to the neural networks. |
negative_slope | 0.2 | float | The negative slope of the LeakyReLU activation function. |
criterion | nn.MSELoss() | torch.nn loss functions | Creates a criterion that measures the mean squared error (squared L2 norm) between each element in the input x and target y. |
a | 1.0 | float | The importance of energy loss in the total loss function. See agat/model/fit.py. |
b | 1.0 | float | The importance of force loss in the total loss function. See agat/model/fit.py. |
c | 0.0 | float | The importance of stress loss in the total loss function. See agat/model/fit.py. |
learning_rate | 0.0001 | float | The learning rate of the torch.optim.Adam optimizer. |
weight_decay | 0.0 | float | The weight decay of the torch.optim.Adam optimizer. |
batch_size | 64 | int | Training batch size. |
val_batch_size | 400 | int | Batch size for validation and testing. |
transfer_learning | False | True | Turn on transfer learning when True. (Deprecated) |
trainable_layers | -4 | negative int | The tail trainable_layers layers are trainable; the other layers are frozen. (Deprecated) |
mask_fixed | False | True | Mask fixed atoms or not. When True, the atomic forces of fixed atoms will not be included in the loss function. (Deprecated) |
tail_readout_no_act | [3,3,3] | list | The tail tail_readout_no_act layers will have no activation functions. The first, second, and third elements are for energy, force, and stress readout layers, respectively. |
adsorbate_coeff | 20.0 | float | Identify and specify the importance of adsorbate atoms with respect to surface atoms. Zero for equal importance. |
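
The derived defaults in the table can be evaluated by hand, as in the sketch below. The values are re-typed from the table rather than imported from AGAT. The function `total_loss` is hypothetical: it assumes that a, b, and c enter the total loss as a plain weighted sum, while the actual form is defined in agat/model/fit.py.

```python
head_list = ['mul', 'div', 'free']
n_elements = 118  # len(default_elements)
gat_node_dim_list = [n_elements, 100, 100, 100]

# Input width of every readout network: number of heads times the last node
# dimension (this suggests the head outputs are concatenated; an assumption).
readout_in = len(head_list) * gat_node_dim_list[-1]

energy_readout_node_list = [readout_in, 100, 50, 30, 10, 3, 1]  # 1 energy value
force_readout_node_list = [readout_in, 100, 50, 30, 10, 3, 3]   # 3 force components
stress_readout_node_list = [readout_in, 100, 50, 30, 10, 3, 6]  # 6 stress components

def total_loss(energy_loss, force_loss, stress_loss, a=1.0, b=1.0, c=0.0):
    """Assumed weighted sum of the three loss terms (see agat/model/fit.py)."""
    return a * energy_loss + b * force_loss + c * stress_loss

# With the default c = 0.0 the stress term does not contribute.
loss = total_loss(energy_loss=0.02, force_loss=0.10, stress_loss=5.0)
```

If gat_node_dim_list or head_list is changed, the first entry of each readout list has to be recomputed so that the layer widths still match.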
default_ase_calculator_config
See the ase.optimize.BFGS documentation for more details.
Parameter | Default value | Alternative(s) | Explanation |
---|---|---|---|
fmax | 0.1 | float | Convergence criterion of atomic forces. For details, see the ASE optimizer documentation. |
steps | 200 | int | Maximum number of iteration steps. |
maxstep | 0.05 | float | Maximum distance an atom can move per iteration, in Å. |
restart | None | str | Pickle file used to store the Hessian matrix. |
restart_steps | 0 | int | Restart the optimization if it cannot converge. |
perturb_steps | 0 | int | Number of perturbation steps. AGAT may have issues converging BFGS; perturbing atomic positions may help convergence. |
perturb_amplitude | 0.05 | float | Perturbation amplitude if perturb_steps is larger than 1. |
out | None | str | Base name of the output log and trajectory files. |
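
The roles of fmax and maxstep can be sketched in plain Python as follows. This paraphrases the usual behavior of ASE optimizers (convergence when the largest per-atom force norm, in eV/Å, drops below fmax; steps rescaled so that no atom moves farther than maxstep) and is not ASE or AGAT code. The helpers `is_converged` and `limit_step` are hypothetical.

```python
import math

def is_converged(forces, fmax=0.1):
    """True when the largest per-atom force norm is below fmax."""
    return max(math.sqrt(fx * fx + fy * fy + fz * fz)
               for fx, fy, fz in forces) < fmax

def limit_step(displacements, maxstep=0.05):
    """Scale all displacements so that no atom moves farther than maxstep."""
    longest = max(math.sqrt(x * x + y * y + z * z) for x, y, z in displacements)
    if longest <= maxstep:
        return [tuple(d) for d in displacements]
    scale = maxstep / longest  # one common factor keeps the step direction
    return [(x * scale, y * scale, z * scale) for x, y, z in displacements]

# Two atoms: the second still feels a force norm of 0.20, so not converged.
forces = [(0.03, 0.00, 0.04), (0.00, 0.20, 0.00)]
converged = is_converged(forces)

# The proposed 0.10 displacement is halved to respect maxstep = 0.05.
steps = limit_step([(0.10, 0.0, 0.0), (0.0, 0.02, 0.0)])
```

A smaller fmax gives tighter relaxations at the cost of more iterations, while a smaller maxstep makes each iteration more conservative.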
default_high_throughput_config
Settings for the high-throughput predictions.
Parameter | Default value | Alternative(s) | Explanation |
---|---|---|---|
model_save_dir | 'agat_model' | str, a directory name | The directory from which the well-trained model is loaded. |
opt_config | default_ase_calculator_config | dict | Settings for the ase.optimize.BFGS structural optimizer. |
calculation_index | 0 | str | A label for the calculation outputs. |
fix_all_surface_atom | False | True | Fix all surface atoms or not. |
remove_bottom_atoms | False | True | Remove the bottom atoms or not. |
save_trajectory | False | True | Keep the optimization trajectory. |
partial_fix_adsorbate | False | True | Partially fix the adsorbate degrees of freedom. |
adsorbates | ['H'] | Keys of agat/lib/adsorbate_poscar.py | |
sites | ['ontop'] | list | A list of adsorption sites. See agat/app/cata/generate_adsorption_sites.py. |
dist_from_surf | 1.7 | float | Distance between adsorbate and surface. Unit: angstrom. |
using_template_bulk_structure | False | True | Use a template to build the surface model. If this is True, you need to prepare a POSCAR_temp file in the working directory. |
graph_build_scheme_dir | 'dataset' | A directory name, str | A directory storing the graph_build_scheme.json file. This file is generated when building the database, and is normally saved in default_data_config['dataset_path']. |
device | 'cuda' | str | Determines the device for the model prediction (forward). |
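
Because opt_config is itself a dictionary, overriding a nested value needs some care. The sketch below uses local, abbreviated copies of the two dictionaries (re-typed from the tables, not imported from AGAT) and a deep copy, so the defaults are left unchanged. The name `my_config` is only illustrative.

```python
import copy

# Local copies of the documented defaults (abbreviated, not imported from AGAT).
default_ase_calculator_config = {
    'fmax': 0.1, 'steps': 200, 'maxstep': 0.05, 'restart': None,
    'restart_steps': 0, 'perturb_steps': 0, 'perturb_amplitude': 0.05,
    'out': None,
}
default_high_throughput_config = {
    'model_save_dir': 'agat_model',
    'opt_config': default_ase_calculator_config,
    'adsorbates': ['H'],
    'sites': ['ontop'],
    'dist_from_surf': 1.7,
    'device': 'cuda',
}

# A deep copy also duplicates the nested opt_config dictionary.
my_config = copy.deepcopy(default_high_throughput_config)
my_config['opt_config']['fmax'] = 0.05  # tighter force convergence
my_config['device'] = 'cpu'             # run predictions without a GPU
```

A shallow copy such as `{**default_high_throughput_config}` would share the nested opt_config with the defaults, so changing fmax there would silently change the defaults as well.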
default_hp_dft_config
Default settings for the high-throughput DFT calculations.
To be continued…