First Write¶
Step-by-step: how to write scientific data with openPMD-api?
Include / Import¶
After successful installation, you can start using openPMD-api as follows:
C++11¶
#include <openPMD/openPMD.hpp>
// example: data handling
#include <numeric> // std::iota
#include <vector> // std::vector
namespace api = openPMD;
Python¶
import openpmd_api as api
# example: data handling
import numpy as np
Open¶
Write into a new openPMD series in myOutput/data_<00...N>.h5. Besides .h5 (HDF5), further file formats are supported: .bp (ADIOS1) or .json (JSON).
C++11¶
auto series = api::Series(
"myOutput/data_%05T.h5",
api::AccessType::CREATE);
Python¶
series = api.Series(
"myOutput/data_%05T.h5",
api.Access_Type.create)
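To make the file naming concrete, here is a minimal sketch of how the `%05T` placeholder in the file name expands to a zero-padded iteration number. `expand_iteration_pattern` is a hypothetical helper written for illustration only; it is not part of openPMD-api, which performs this expansion internally.

```python
import re


def expand_iteration_pattern(pattern, iteration):
    """Replace a %T or %0<width>T placeholder with the iteration number.

    Hypothetical helper for illustration; not an openPMD-api function.
    """
    def substitute(match):
        # group(1) holds the padding width, e.g. "5" for %05T; None for %T
        width = int(match.group(1)) if match.group(1) else 0
        return str(iteration).zfill(width)

    return re.sub(r"%(?:0(\d+))?T", substitute, pattern)


file_name = expand_iteration_pattern("myOutput/data_%05T.h5", 42)
print(file_name)  # myOutput/data_00042.h5
```

With this pattern, each iteration of the series ends up in its own file.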
Iteration¶
Grouping by an arbitrary, positive integer number <N> in a series:
C++11¶
auto i = series.iterations[42];
Python¶
i = series.iterations[42]
Attributes¶
Everything in openPMD can be extended and user-annotated. Let us try this by writing some metadata:
C++11¶
series.setAuthor(
"Axel Huebl <a.huebl@hzdr.de>");
series.setMachine(
"Hall Probe 5000, Model 3");
series.setAttribute(
"dinner", "Pizza and Coke");
i.setAttribute(
"vacuum", true);
Python¶
series.set_author(
"Axel Huebl <a.huebl@hzdr.de>")
series.set_machine(
"Hall Probe 5000, Model 3")
series.set_attribute(
"dinner", "Pizza and Coke")
i.set_attribute(
"vacuum", True)
Data¶
Let’s prepare some data that we want to write. For example, a magnetic field \(\vec B(i, j)\) slice in two dimensions with three components \((B_x, B_y, B_z)^\intercal\) of which the \(B_y\) component shall be constant for all \((i, j)\) indices.
C++11¶
std::vector<float> x_data(
150 * 300);
std::iota(
x_data.begin(),
x_data.end(),
0.);
float y_data = 4.f;
std::vector<float> z_data(x_data);
for( auto& c : z_data )
c -= 8000.f;
Python¶
x_data = np.arange(
150 * 300,
dtype=np.float32
).reshape(150, 300)
y_data = 4.
z_data = x_data.copy() - 8000.
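The C++ snippet keeps the field in a flat buffer of 150 * 300 values, while the Python snippet reshapes it to (150, 300). The following sketch shows how an index pair maps to a position in the flat buffer, assuming row-major (C) ordering, which is NumPy's default for `reshape`. It uses plain Python lists, and the names `flat_index`, `x_flat` and `z_flat` are illustrative only.

```python
N_ROWS, N_COLS = 150, 300


def flat_index(row, col, n_cols=N_COLS):
    """Position of element (row, col) in a row-major flat buffer."""
    return row * n_cols + col


# Same values as in the snippets above: x counts up from 0, z = x - 8000
x_flat = [float(k) for k in range(N_ROWS * N_COLS)]
z_flat = [value - 8000.0 for value in x_flat]

row, col = 26, 200
print(x_flat[flat_index(row, col)])  # 8000.0
print(z_flat[flat_index(row, col)])  # 0.0
```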
Record¶
An openPMD record can be either structured (mesh) or unstructured (particles). We prepared a vector field in 2D above, which is a mesh:
C++11¶
// record
auto B = i.meshes["B"];
// record components
auto B_x = B["x"];
auto B_y = B["y"];
auto B_z = B["z"];
auto dataset = api::Dataset(
api::determineDatatype<float>(),
{150, 300});
B_x.resetDataset(dataset);
B_y.resetDataset(dataset);
B_z.resetDataset(dataset);
Python¶
# record
B = i.meshes["B"]
# record components
B_x = B["x"]
B_y = B["y"]
B_z = B["z"]
dataset = api.Dataset(
x_data.dtype,
x_data.shape)
B_x.reset_dataset(dataset)
B_y.reset_dataset(dataset)
B_z.reset_dataset(dataset)
Units¶
Ouch, our measured magnetic field data is in Gauss! Quick, let’s store the conversion factor to SI (Tesla).
C++11¶
// conversion to SI
B_x.setUnitSI(1.e-4);
B_y.setUnitSI(1.e-4);
B_z.setUnitSI(1.e-4);
// unit system agnostic dimension
B.setUnitDimension({
{api::UnitDimension::M, 1},
{api::UnitDimension::I, -1},
{api::UnitDimension::T, -2}
});
Python¶
# conversion to SI
B_x.set_unit_SI(1.e-4)
B_y.set_unit_SI(1.e-4)
B_z.set_unit_SI(1.e-4)
# unit system agnostic dimension
B.set_unit_dimension({
api.Unit_Dimension.M: 1,
api.Unit_Dimension.I: -1,
api.Unit_Dimension.T: -2
})
Tip
Annotating the dimensionality of a record allows us to read data sets with arbitrary names and understand their purpose simply by dimensional analysis.
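As a rough illustration of both annotations, the sketch below checks the exponents used above by composing the tesla from volt, second and metre, and applies the Gauss-to-Tesla factor to a single value. It uses plain dictionaries rather than openPMD-api types; `combine` and the constant names are hypothetical.

```python
def combine(*terms):
    """Multiply quantities by adding the exponents of (dimension, power) pairs."""
    total = {}
    for dimension, power in terms:
        for base, exponent in dimension.items():
            total[base] = total.get(base, 0) + exponent * power
    # Drop base quantities that cancel out
    return {base: exp for base, exp in total.items() if exp != 0}


# Base quantities: L length, M mass, T time, I electric current
VOLT = {"M": 1, "L": 2, "T": -3, "I": -1}
SECOND = {"T": 1}
METRE = {"L": 1}

# tesla = volt * second / metre^2
tesla = combine((VOLT, 1), (SECOND, 1), (METRE, -2))
print(tesla)  # {'M': 1, 'T': -2, 'I': -1}

# The SI conversion factor is a plain multiplier on the stored values
GAUSS_TO_TESLA = 1.0e-4
b_y_gauss = 4.0
b_y_tesla = b_y_gauss * GAUSS_TO_TESLA
```

The resulting exponents match the ones passed to the unit dimension call above: mass 1, electric current -1, time -2.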
Register Chunk¶
We can write record components partially and in parallel, or all at once. Writing very small pieces of data one by one is a performance killer for I/O. Therefore, we first register all data to be written and then flush it out collectively.
C++11¶
B_x.storeChunk(
api::shareRaw(x_data),
{0, 0}, {150, 300});
B_z.storeChunk(
api::shareRaw(z_data),
{0, 0}, {150, 300});
B_y.makeConstant(y_data);
Python¶
B_x.store_chunk(x_data)
B_z.store_chunk(z_data)
B_y.make_constant(y_data)
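A partial write describes each chunk by an offset and an extent, as in the C++ calls above with {0, 0} and {150, 300}. The sketch below shows one possible way to split the 150 x 300 dataset into contiguous row blocks, for example one per parallel process. `row_chunks` is a hypothetical helper and not part of openPMD-api; it only computes the (offset, extent) pairs.

```python
def row_chunks(shape, n_chunks):
    """Split a 2D shape into contiguous row blocks as (offset, extent) pairs."""
    n_rows, n_cols = shape
    base, remainder = divmod(n_rows, n_chunks)
    chunks = []
    start = 0
    for k in range(n_chunks):
        # Spread leftover rows over the first `remainder` blocks
        rows = base + (1 if k < remainder else 0)
        if rows == 0:
            continue
        chunks.append(((start, 0), (rows, n_cols)))
        start += rows
    return chunks


for offset, extent in row_chunks((150, 300), 4):
    print(offset, extent)
```

Together, the blocks cover every row of the dataset exactly once.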
Attention
After registering a data chunk such as x_data and z_data, it MUST NOT be modified or deleted until the flush() step is performed!
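The reason for this rule can be seen in a toy model of deferred writing. The `DeferredWriter` class below is a hypothetical stand-in, not openPMD-api code: like a registered chunk, it keeps only a reference to the caller's buffer and reads it when `flush()` is called.

```python
class DeferredWriter:
    """Toy model: remembers buffers by reference and reads them on flush()."""

    def __init__(self):
        self._pending = []
        self.written = []

    def store_chunk(self, buffer):
        # Only the reference is stored; no copy is made here
        self._pending.append(buffer)

    def flush(self):
        # The buffer contents are read now, not at registration time
        for buffer in self._pending:
            self.written.append(list(buffer))
        self._pending.clear()


writer = DeferredWriter()
data = [1.0, 2.0, 3.0]
writer.store_chunk(data)
data[0] = -99.0  # modified before flush: the change ends up in the output
writer.flush()
print(writer.written)  # [[-99.0, 2.0, 3.0]]
```

In this model, a modification made between registration and flush silently changes what is written.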