build_dataset

Hint

The high-level API in this script is BuildDatabase.

For example:

database = BuildDatabase()
database.build()

See https://jzhang-github.github.io/AGAT/Tutorial/Build_database.html for more info.

Warning

Some functions on this page will be deprecated in the future, including select_graphs_random and concat_graphs. Use select_graphs_from_dataset_random and concat_dataset instead, respectively.

class ReadGraphs

This object is used to build a list of graphs.

__init__(self, **data_config)
Parameters:

**data_config (dict) –

Configuration for building the database. See https://jzhang-github.github.io/AGAT/Default%20parameters.html#default-data-config for details.

Hint

The mode used to find the neighbors of each atom, which can be:

  • 'voronoi': consider Voronoi neighbors only.

  • 'pymatgen_dist': build the graph based on a constant cutoff distance using the pymatgen module.

  • 'ase_dist': build the graph based on a constant cutoff distance using the ase module.

  • 'ase_natural_cutoffs': build the graph with ase using a dynamic cutoff scheme. In this case, the cutoff setting is ignored because ase uses the dynamic cutoffs from ase.neighborlist.natural_cutoffs().
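To make the difference between a constant cutoff and a dynamic cutoff concrete, here is a minimal sketch in plain Python. It is not AGAT's implementation: it ignores periodic boundaries, and the helper names (edges_constant_cutoff, edges_dynamic_cutoff), the small radii table, and the mult factor are assumptions made only for illustration.

```python
import math
from itertools import combinations

# Illustrative covalent radii in angstrom (assumed values for this sketch).
RADII = {"H": 0.31, "C": 0.76, "O": 0.66}


def edges_constant_cutoff(symbols, positions, cutoff):
    """Connect every pair of atoms closer than one fixed distance."""
    return [
        (i, j)
        for i, j in combinations(range(len(symbols)), 2)
        if math.dist(positions[i], positions[j]) <= cutoff
    ]


def edges_dynamic_cutoff(symbols, positions, mult=1.0):
    """Connect pairs closer than the sum of their element radii times mult."""
    edges = []
    for i, j in combinations(range(len(symbols)), 2):
        limit = mult * (RADII[symbols[i]] + RADII[symbols[j]])
        if math.dist(positions[i], positions[j]) <= limit:
            edges.append((i, j))
    return edges


# A water-like toy geometry: O at the origin, two H atoms nearby.
symbols = ["O", "H", "H"]
positions = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]

constant_edges = edges_constant_cutoff(symbols, positions, cutoff=2.0)
dynamic_edges = edges_dynamic_cutoff(symbols, positions)
```

With the fixed 2.0 angstrom cutoff the two hydrogen atoms are also linked to each other, while the element-dependent cutoff keeps only the two O-H pairs.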

read_batch_graphs(self, batch_index_list, batch_num)

Read graphs with batches.

Note

The loaded graphs are saved under the path given by the dataset_path attribute.

Parameters:
  • batch_index_list (list) – a list of graph indices.

  • batch_num (str) – the number used to label this batch of graphs.
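As a rough illustration of how the two arguments relate, the sketch below splits a range of graph indices into nearly equal batches and pairs each batch with a string label. The helper split_into_batches is hypothetical; how AGAT itself partitions the indices is not described in this section.

```python
def split_into_batches(num_graphs, num_batches):
    """Split indices 0..num_graphs-1 into nearly equal contiguous batches."""
    base, extra = divmod(num_graphs, num_batches)
    batches, start = [], 0
    for b in range(num_batches):
        # The first `extra` batches take one additional index each.
        size = base + (1 if b < extra else 0)
        batches.append(list(range(start, start + size)))
        start += size
    return batches


batches = split_into_batches(num_graphs=10, num_batches=3)

# Pair each index list with a string batch number, mirroring the
# (batch_index_list, batch_num) arguments documented above.
jobs = [(index_list, str(batch_num)) for batch_num, index_list in enumerate(batches)]
```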

read_all_graphs(self, scale_prop=False, ckpt_path='.')

Read all graphs specified in the csv file.

Note

The loaded graphs are saved under the path given by the dataset_path attribute.

Danger

Do not scale the label unless you know what you are doing.

Parameters:
  • scale_prop (bool) – scale the label or not. DO NOT scale unless you know what you are doing.

  • ckpt_path (str) – checkpoint directory of the well-trained model.

Returns:
  • graph_list: a list of DGL graphs.

  • graph_labels: a list of labels.
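One reason for the warning about scaling: a model trained on scaled labels predicts in scaled units, so the scaling parameters have to be kept and applied in reverse afterwards. The sketch below assumes plain standardization (subtract the mean, divide by the standard deviation); the scheme AGAT actually applies when scale_prop=True is not described in this section, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def scale_labels(labels):
    """Standardize labels and return the parameters needed to undo it."""
    mu, sigma = mean(labels), pstdev(labels)
    scaled = [(y - mu) / sigma for y in labels]
    return scaled, mu, sigma


def unscale(values, mu, sigma):
    """Map scaled values back to the original units."""
    return [v * sigma + mu for v in values]


labels = [-3.0, -4.0, -5.0]            # e.g. energies in eV/atom
scaled, mu, sigma = scale_labels(labels)
restored = unscale(scaled, mu, sigma)  # matches `labels` up to rounding
```

If mu and sigma are lost, or a different pair is used at prediction time, the predicted values can no longer be converted back to physical units.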

class TrainValTestSplit(object)

Split the dataset.

Note

This object is deprecated.

class ExtractVaspFiles(object)

Extract VASP outputs for building AGAT database.

Parameters:

data_config['dataset_path'] (str) – Absolute path where the collected data are saved.

Note

Always save the property per node as the label. For example: energy per atom (eV/atom).
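As a small worked example of a per-node label, a total energy is divided by the number of atoms before it is stored. The function name and the numbers are hypothetical.

```python
def energy_per_atom(total_energy_ev, num_atoms):
    """Convert a total energy (eV) into a per-atom label (eV/atom)."""
    if num_atoms <= 0:
        raise ValueError("num_atoms must be positive")
    return total_energy_ev / num_atoms


# Hypothetical numbers: a 4-atom cell with a total energy of -20.0 eV.
label = energy_per_atom(-20.0, 4)
```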

__init__(self, **data_config)
Parameters:

**data_config (dict) – Configuration for building the database. See https://jzhang-github.github.io/AGAT/Default%20parameters.html#default-data-config for details.

read_oszicar(self, fname='OSZICAR')

Get the electronic steps of a VASP run.

Parameters:

fname (str, optional) – file name, defaults to ‘OSZICAR’

Returns:

electronic steps of a VASP run.

Return type:

list
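For readers unfamiliar with the file, the sketch below shows one way to count electronic steps per ionic step from OSZICAR-style text. It is not the code behind read_oszicar: it takes the file contents as a string to stay self-contained, and it assumes that electronic iterations appear as lines starting with DAV: or RMM: and that each ionic step ends with a line containing F=.

```python
def count_electronic_steps(text):
    """Return the number of electronic steps in each ionic step."""
    steps, current = [], 0
    for line in text.splitlines():
        stripped = line.lstrip()
        if stripped.startswith(("DAV:", "RMM:")):
            current += 1           # one electronic (SCF) iteration
        elif "F=" in stripped:
            steps.append(current)  # ionic step finished: store and reset
            current = 0
    return steps


# A shortened, made-up OSZICAR excerpt with two ionic steps.
sample = """\
       N       E                     dE
DAV:   1     0.12E+02    0.12E+02
DAV:   2    -0.10E+02   -0.22E+02
RMM:   3    -0.11E+02   -0.10E+01
   1 F= -.11E+02 E0= -.11E+02  d E =-.11E+02
       N       E                     dE
DAV:   1    -0.11E+02   -0.50E-01
RMM:   2    -0.11E+02   -0.10E-02
   2 F= -.11E+02 E0= -.11E+02  d E =-.50E-01
"""

electronic_steps = count_electronic_steps(sample)
```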

split_output(self, process_index)
Parameters:

process_index (int) – A number to index the process.

__call__(self)

The __call__ function

class BuildDatabase

Build a database. Detailed information: https://jzhang-github.github.io/AGAT/Tutorial/Build_database.html

__init__(self, **data_config)
Parameters:

**data_config (dict) – Configuration for building the database. See https://jzhang-github.github.io/AGAT/Default%20parameters.html#default-data-config for details.

build(self)

Run the construction process.

concat_graphs(*list_of_bin)

Concatenate binary graph files.

Parameters:

*list_of_bin (strings) – input file names of binary graphs.

Returns:

A new file is saved to the current directory: concated_graphs.bin.

Return type:

None. A new file.

Example:

concat_graphs('graphs1.bin', 'graphs2.bin', 'graphs3.bin')

concat_dataset(*list_of_datasets, save_file=False, fname='concated_graphs.bin')

Concatenate agat.dataset.Dataset objects in RAM.

Parameters:
  • *list_of_datasets (agat.dataset.Dataset) – the agat.dataset.Dataset objects to concatenate.

  • save_file (bool) – save to a new file or not. Default: False

  • fname (str) – The saved file name if save_file=True. Default: ‘concated_graphs.bin’

Returns:

The concatenated dataset. If save_file=True, it is also saved to fname (default: concated_graphs.bin).

Return type:

agat.dataset.Dataset

select_graphs_random(fname: str, num: int)

Randomly select graphs from a binary file.

Parameters:
  • fname (str) – input file name.

  • num (int) – number of selected graphs (should be smaller than the total number of graphs).

Returns:

A new file is saved to the current directory: Selected_graphs.bin.

Return type:

None. A new file.

Example:

select_graphs_random('graphs1.bin', 100)
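The requirement that num be smaller than the total number of graphs follows from selecting distinct graphs, that is, sampling without replacement. The sketch below illustrates the idea with the standard library; the helper select_random and the placeholder graph names are hypothetical, and this is not a statement about how AGAT draws its sample internally.

```python
import random


def select_random(items, num, seed=None):
    """Pick `num` distinct items at random (sampling without replacement)."""
    if num > len(items):
        raise ValueError("num must not exceed the number of available graphs")
    rng = random.Random(seed)  # a seed makes the selection reproducible
    return rng.sample(items, num)


graphs = [f"graph_{i}" for i in range(10)]  # stand-ins for real graph objects
selected = select_random(graphs, 3, seed=0)
```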

select_graphs_from_dataset_random(dataset, num: int, save_file=False, fname='selected_graphs.bin')

Randomly select graphs from a dataset in RAM.

Parameters:
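  • dataset (agat.dataset.Dataset) – the dataset to select graphs from.

  • num (int) – number of selected graphs (should be smaller than the total number of graphs).

  • save_file (bool) – save the selected graphs to a new file or not. Default: False

  • fname (str) – The saved file name if save_file=True. Default: ‘selected_graphs.bin’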

Returns:

The randomly selected graphs. If save_file=True, they are also saved to fname.

Return type:

agat.dataset.Dataset

Example:

select_graphs_from_dataset_random(dataset, 100)

save_dataset(dataset: Dataset, fname='graphs.bin')

Save an agat.dataset.Dataset to a binary file.

Parameters:
  • dataset (agat.dataset.Dataset) – AGAT dataset in RAM.

  • fname (str) – output file name.