build_dataset¶
The high-level API in this script is BuildDatabase.
For example:
database = BuildDatabase()
database.build()
See https://jzhang-github.github.io/AGAT/Tutorial/Build_database.html for more info.
- class CrystalGraph(object)¶
Read structural file and return a graph.
Caution
The constructed crystal graph may be unreasonable for high-entropy materials if the connections are analyzed by the Voronoi method.
Code example:
from Crystal2raph import CrystalGraph
cg = CrystalGraph(cutoff = 6.0, mode_of_NN='distance', adsorbate=True)
cg.get_graph('POSCAR')
Hint
Although we recommend representing atoms with one-hot encoding, you can use an alternative representation with:
self.all_atom_feat = get_atomic_features()
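As a rough illustration of the one-hot representation mentioned in this hint, the sketch below encodes element symbols against a fixed species list. The function name `one_hot` and the species list are hypothetical and are not part of AGAT; the real feature construction may differ.

```python
def one_hot(symbol, species):
    """Return a vector with 1.0 at the position of `symbol` in `species`."""
    if symbol not in species:
        raise ValueError(f"Unknown element: {symbol}")
    return [1.0 if symbol == s else 0.0 for s in species]

# Hypothetical species list; each atom becomes one row of node features.
species = ["H", "O", "Ni"]
features = [one_hot(symbol, species) for symbol in ["Ni", "O"]]
print(features)  # [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
```

Each row has exactly one non-zero entry, so the length of the feature vector equals the number of species considered.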
Hint
To build a reasonable graph, a small cell should be repeated. One can modify “self._cell_length_cutoff” for special needs.
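One way to picture the cell repetition is to repeat each lattice direction until its length reaches a cutoff. The sketch below is an assumption for illustration only: the function `repeats_for_cell` is hypothetical, and the rule AGAT actually applies with `self._cell_length_cutoff` may differ.

```python
import math

def repeats_for_cell(cell_lengths, length_cutoff):
    """Number of repetitions per lattice direction so that each
    repeated length is at least `length_cutoff` (assumed rule)."""
    return [max(1, math.ceil(length_cutoff / length)) for length in cell_lengths]

# A thin slab: short in-plane vectors, long out-of-plane vector (in Angstrom).
repeats = repeats_for_cell([3.5, 3.5, 20.0], length_cutoff=15.0)
print(repeats)  # [5, 5, 1]
```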
Hint
We encourage you to use the ase module to build crystal graphs. The pymatgen module needs some dependencies that conflict with other modules.
- __init__(self, **data_config)¶
- Parameters:
**data_config (str/dict) – Configuration file for building database. See https://jzhang-github.github.io/AGAT/Default%20parameters.html#default-data-config for the detailed info.
- Returns:
A DGL.graph.
- Return type:
DGL.graph
Hint
The mode used to find neighbors can be one of:
'voronoi': consider Voronoi neighbors only.
'pymatgen_dist': build graph based on a constant distance using the pymatgen module.
'ase_dist': build graph based on a constant distance using the ase module.
'ase_natural_cutoffs': build graph from ase, which has a dynamic cutoff scheme. In this case, the cutoff is deprecated because ase will use the dynamic cutoffs in ase.neighborlist.natural_cutoffs().
- Parameters:
adsorbate (bool) – Identify the adsorbate or not.
- get_adsorbate_bool(self, element_list)¶
Identify adsorbates based on elements: H and O.
- Parameters:
element_list (list) – a list of element symbols.
- Returns:
a list of bool values.
- Return type:
torch.tensor
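A minimal sketch of the element-based identification described above. It returns a plain Python list rather than a torch.tensor so that it runs without extra dependencies; the function name `adsorbate_mask` is hypothetical.

```python
def adsorbate_mask(element_list, adsorbate_elements=("H", "O")):
    """Return one bool per atom: True if its element counts as adsorbate."""
    return [symbol in adsorbate_elements for symbol in element_list]

mask = adsorbate_mask(["Ni", "Ni", "O", "H"])
print(mask)  # [False, False, True, True]
```

Note that a purely element-based rule marks every H and O atom as adsorbate, including any that belong to the substrate.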
- get_crystal(self, crystal_fpath, super_cell=True)¶
Read structural file and return a pymatgen crystal object.
- Parameters:
crystal_fpath (str) – the path to the crystal structure file.
super_cell (bool) – repeat the cell or not.
- Returns:
a pymatgen structure object.
- Return type:
pymatgen.core.structure
- get_1NN_pairs_voronoi(self, crystal)¶
The get_connections_new() method of the VoronoiConnectivity object is modified.
- Parameters:
crystal (pymatgen.core.structure) – a pymatgen structure object.
- Returns:
index of senders
index of receivers
a list of distances between senders and receivers
- get_1NN_pairs_distance(self, crystal)¶
Find the indices of senders and receivers, and the distances between them, based on the distance_matrix of a pymatgen crystal object.
- Parameters:
crystal (pymatgen.core.structure) – a pymatgen crystal object.
- Returns:
index of senders
index of receivers
a list of distances between senders and receivers
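The idea of distance-based neighbor pairs can be sketched in plain Python as follows. This is a simplified illustration, not the AGAT implementation: it ignores periodic boundary conditions, and the function name `pairs_within_cutoff` is hypothetical.

```python
import math

def pairs_within_cutoff(positions, cutoff):
    """Return (senders, receivers, distances) for all ordered atom pairs
    closer than `cutoff`. Periodic images are NOT considered here."""
    senders, receivers, distances = [], [], []
    for i, pos_i in enumerate(positions):
        for j, pos_j in enumerate(positions):
            if i == j:
                continue  # self-pairs are handled separately (self-loops)
            d = math.dist(pos_i, pos_j)
            if d <= cutoff:
                senders.append(i)
                receivers.append(j)
                distances.append(d)
    return senders, receivers, distances

positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
senders, receivers, distances = pairs_within_cutoff(positions, cutoff=2.0)
print(senders, receivers, distances)  # [0, 1] [1, 0] [1.0, 1.0]
```

Because both orderings of a pair are kept, the resulting edge list is already bidirectional.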
- get_1NN_pairs_ase_distance(self, ase_atoms)¶
- Parameters:
ase_atoms (ase.atoms) – ase.atoms object.
- Returns:
index of senders
index of receivers
a list of distances between senders and receivers
- get_ndata(self, crystal)¶
- Parameters:
crystal (pymatgen.core.structure) – a pymatgen crystal object.
- Returns:
ndata: the atomic representations of a crystal graph.
- Return type:
numpy.ndarray
- get_graph_from_ase(self, fname, include_forces=False)¶
Build graphs with ase.
- Parameters:
fname (str/ase.Atoms) – File name or ase.Atoms object.
include_forces (bool) – Include forces into graphs or not.
- Returns:
A bidirectional graph with self-loop connection.
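To make "bidirectional graph with self-loop connection" concrete, the sketch below builds such an edge list from a set of neighbor pairs. It uses plain tuples instead of a DGL graph, and the function name is hypothetical.

```python
def bidirectional_with_self_loops(pairs, num_nodes):
    """Add the reverse of every edge and one self-loop per node."""
    edges = set()
    for i, j in pairs:
        edges.add((i, j))
        edges.add((j, i))  # reverse direction
    for n in range(num_nodes):
        edges.add((n, n))  # self-loop
    return sorted(edges)

edges = bidirectional_with_self_loops([(0, 1), (1, 2)], num_nodes=3)
print(edges)
# [(0, 0), (0, 1), (1, 0), (1, 1), (1, 2), (2, 1), (2, 2)]
```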
- get_graph_from_pymatgen(self, crystal_fname, super_cell=True, include_forces=False)¶
Build graphs with pymatgen.
- Parameters:
crystal_fname (str) – File name.
super_cell (bool) – repeat small cell or not.
include_forces (bool) – Include forces into graphs or not.
- Returns:
A bidirectional graph with self-loop connection.
- get_graph(self, crystal_fname, super_cell=False, include_forces=True)¶
This method chooses the graph-construction method according to the mode_of_NN attribute.
Hint
You can call this method to build one graph.
- Parameters:
crystal_fname (str) – File name.
super_cell (bool) – repeat small cell or not.
include_forces (bool) – Include forces into graphs or not.
- Returns:
A bidirectional graph with self-loop connection.
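The selection by mode can be pictured as a simple dispatch. The mapping below is an assumption inferred from the mode names listed for this class; it returns the name of the method that would be called instead of building a graph, and `choose_builder` is a hypothetical helper.

```python
# Assumed grouping of modes by backend (not confirmed by the source).
PYMATGEN_MODES = {"voronoi", "pymatgen_dist"}
ASE_MODES = {"ase_dist", "ase_natural_cutoffs"}

def choose_builder(mode_of_NN):
    """Return the name of the graph-construction method for a mode."""
    if mode_of_NN in PYMATGEN_MODES:
        return "get_graph_from_pymatgen"
    if mode_of_NN in ASE_MODES:
        return "get_graph_from_ase"
    raise ValueError(f"Unsupported mode_of_NN: {mode_of_NN!r}")

print(choose_builder("ase_natural_cutoffs"))  # get_graph_from_ase
```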
- class ReadGraphs¶
This object is used to build a list of graphs.
- __init__(self, **data_config)¶
- Parameters:
**data_config (dict) –
Configuration file for building database. See https://jzhang-github.github.io/AGAT/Default%20parameters.html#default-data-config for the detailed info.
Hint
The mode used to find neighbors can be one of:
'voronoi': consider Voronoi neighbors only.
'pymatgen_dist': build graph based on a constant distance using the pymatgen module.
'ase_dist': build graph based on a constant distance using the ase module.
'ase_natural_cutoffs': build graph from ase, which has a dynamic cutoff scheme. In this case, the cutoff is deprecated because ase will use the dynamic cutoffs in ase.neighborlist.natural_cutoffs().
- read_batch_graphs(self, batch_index_list, batch_num)¶
Read graphs with batches.
Note
The loaded graphs are saved under the path given by the dataset_path attribute.
- Parameters:
batch_index_list (list) – a list of graph indices.
batch_num (str) – the number used to label the graph batch.
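A caller typically has to divide the graph indices into batches before reading them. The sketch below shows one plausible way to produce such index lists; the helper `split_into_batches` is hypothetical and the splitting rule used by AGAT may differ.

```python
def split_into_batches(index_list, num_batches):
    """Split indices into `num_batches` contiguous, nearly equal chunks."""
    base, extra = divmod(len(index_list), num_batches)
    batches, start = [], 0
    for b in range(num_batches):
        size = base + (1 if b < extra else 0)  # first `extra` chunks get one more
        batches.append(index_list[start:start + size])
        start += size
    return batches

batches = split_into_batches(list(range(10)), num_batches=3)
print(batches)  # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Each inner list would play the role of `batch_index_list`, with its position serving as the batch number.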
- read_all_graphs(self, scale_prop=False, ckpt_path='.')¶
Read all graphs specified in the csv file.
Note
The loaded graphs are saved under the path given by the dataset_path attribute.
Danger
Do not scale the label if you don’t know what you are doing.
- Parameters:
scale_prop (bool) – scale the label or not. DO NOT scale unless you know what you are doing.
ckpt_path (str) – checkpoint directory of the well-trained model.
- Returns:
graph_list: a list of DGL graphs.
graph_labels: a list of labels.
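The warning about scale_prop is easier to appreciate with an example. The sketch below standardizes labels to zero mean and unit variance; this is only an assumed form of scaling for illustration, and the actual transformation applied by AGAT is not specified here. The key point is that predictions made on scaled labels must be mapped back with the same mean and standard deviation.

```python
import statistics

def scale_labels(labels):
    """Standardize labels (assumed scheme). Fails if all labels are equal."""
    mean = statistics.fmean(labels)
    std = statistics.pstdev(labels)
    scaled = [(y - mean) / std for y in labels]
    return scaled, mean, std

def unscale(value, mean, std):
    """Map a scaled value back to the original units."""
    return value * std + mean

labels = [-5.0, -4.0, -3.0]  # e.g. energies per atom in eV/atom
scaled, mean, std = scale_labels(labels)
restored = [unscale(v, mean, std) for v in scaled]
```

If the mean and standard deviation are lost, scaled predictions can no longer be converted to physical units, which is one reason scaling should be used with care.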
- class TrainValTestSplit(object)¶
Split the dataset.
Note
This object is deprecated.
- class ExtractVaspFiles(object)¶
Extract VASP outputs for building AGAT database.
- Parameters:
data_config['dataset_path'] (str) – Absolute path where the collected data are saved.
Note
Always save the property per node as the label. For example: energy per atom (eV/atom).
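As a small worked example of a per-node label, the total energy of a structure is divided by its number of atoms. The helper name and the numbers below are illustrative only.

```python
def energy_per_atom(total_energy_ev, num_atoms):
    """Convert a total energy (eV) into a per-atom label (eV/atom)."""
    if num_atoms <= 0:
        raise ValueError("num_atoms must be positive")
    return total_energy_ev / num_atoms

label = energy_per_atom(-43.2, 8)
print(label)  # -5.4
```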
- __init__(self, **data_config)¶
- Parameters:
**data_config (dict) –
Configuration file for building database. See https://jzhang-github.github.io/AGAT/Default%20parameters.html#default-data-config for the detailed info.
- read_oszicar(self, fname='OSZICAR')¶
Get the electronic steps of a VASP run.
- Parameters:
fname (str, optional) – file name, defaults to ‘OSZICAR’
- Returns:
electronic steps of a VASP run.
- Return type:
list
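The sketch below shows one way to extract the number of electronic steps per ionic step from OSZICAR-style text. It is an assumption about what "electronic steps" means here, not the AGAT implementation: it reads the counter of the last DAV:/RMM: line before each line containing F=, and it parses an in-memory sample rather than a file. The function name and the sample values are hypothetical.

```python
def electronic_steps(oszicar_text):
    """Return the number of electronic steps for each ionic step."""
    steps, last_count = [], 0
    for line in oszicar_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] in ("DAV:", "RMM:"):
            last_count = int(fields[1])  # electronic step counter
        elif "F=" in fields:
            steps.append(last_count)     # ionic step finished
            last_count = 0
    return steps

sample = """\
       N       E                     dE
DAV:   1     0.48E+02    0.48E+02
DAV:   2    -0.10E+02   -0.58E+02
RMM:   3    -0.11E+02   -0.10E+01
   1 F= -.11E+02 E0= -.11E+02  d E =-.11E+02
       N       E                     dE
DAV:   1    -0.11E+02   -0.20E-01
DAV:   2    -0.11E+02   -0.10E-03
   2 F= -.11E+02 E0= -.11E+02  d E =-.30E-02
"""
steps = electronic_steps(sample)
print(steps)  # [3, 2]
```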
- split_output(self, process_index)¶
- Parameters:
process_index (int) – A number to index the process.
- __call__(self)¶
The __call__ function
- class BuildDatabase¶
Build a database. Detailed information: https://jzhang-github.github.io/AGAT/Tutorial/Build_database.html
- __init__(self, **data_config)¶
- Parameters:
**data_config (dict) –
Configuration file for building database. See https://jzhang-github.github.io/AGAT/Default%20parameters.html#default-data-config for the detailed info.
- build(self)¶
Run the construction process.
- concat_graphs(*list_of_bin)¶
Concatenate binary graph files.
- Parameters:
*list_of_bin –
input file names of binary graphs.
- Returns:
A new file is saved to the current directory: concated_graphs.bin.
- Return type:
None. A new file.
Example:
concat_graphs('graphs1.bin', 'graphs2.bin', 'graphs3.bin')