Data Type and Preprocessing
Data Types
Python list, numpy.ndarray, and torch.Tensor (PyTorch tensors) are the base data types. The Experiment class can handle data saved in any of these types.
We use several customized types to specify the dimensionality and the base type of the data.
Customized Type | Base Type |
---|---|
| |
| |
| 1D numpy.ndarray, list, or torch.Tensor |
| 2D numpy.ndarray, list, or torch.Tensor |
Usually, users provide data as numpy.ndarray or list. NEXTorch converts the data to torch.Tensor and then passes it between BoTorch functions. Once training is done, NEXTorch converts the data back to numpy.ndarray for output and visualization.
Type Conversion
Converts NumPy objects to tensor objects. Returns a copy. |
|
Converts tensor objects (X, a MatrixLike2d) to NumPy array objects. Returns a copy with no gradient information. |
|
Expands a 1D list to 2D. |
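The three conversions described above can be sketched with plain NumPy and PyTorch calls. This is only an illustration: the helper names `as_tensor_copy`, `as_numpy_copy`, and `expand_to_2d` are hypothetical and are not the NEXTorch function names, and the choice of a float dtype is an assumption.

```python
import numpy as np
import torch


def as_tensor_copy(X):
    """Hypothetical helper: list or numpy array -> torch.Tensor (a copy)."""
    # torch.tensor always copies its input data
    return torch.tensor(np.asarray(X, dtype=float))


def as_numpy_copy(X):
    """Hypothetical helper: tensor -> numpy array copy without gradient info."""
    if isinstance(X, torch.Tensor):
        # detach() drops the autograd graph; copy() avoids sharing memory
        return X.detach().cpu().numpy().copy()
    return np.array(X, dtype=float)


def expand_to_2d(x):
    """Hypothetical helper: turn a 1D list/array into an (n, 1) column."""
    x = np.asarray(x, dtype=float)
    return x.reshape(-1, 1) if x.ndim == 1 else x


X_tensor = as_tensor_copy([[1.0, 2.0], [3.0, 4.0]])
X_numpy = as_numpy_copy(X_tensor)
x_column = expand_to_2d([1.0, 2.0, 3.0])
```

Detaching before calling `.numpy()` matters because PyTorch refuses to convert a tensor that still requires gradients.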
(Inverse) Normalization
Converts arrays or matrices from a real scale to a unit scale [0, 1] in each dimension, and vice versa.
This step is often needed for X before model training.
Note
X or X_unit indicates that the values are in unit scales.
X_real indicates that the values are in real scales or units.
Takes in an x array in a real scale and converts it to a unit scale |
|
Takes in a matrix in a real scale and converts it into a unit scale |
|
Takes in an x array in a unit scale and converts it to a real scale |
|
Takes in a matrix in a unit scale and converts it into a real scale |
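As a minimal sketch of the idea, assuming each dimension has a known lower and upper bound, the mapping to a unit scale and back can be written with NumPy as below. The function names and the `X_ranges` layout (one `[lower, upper]` pair per dimension) are assumptions made for illustration, not the NEXTorch signatures.

```python
import numpy as np


def to_unit_scale(X_real, X_ranges):
    """Hypothetical helper: map each column from [lower, upper] to [0, 1]."""
    X_real = np.asarray(X_real, dtype=float)
    X_ranges = np.asarray(X_ranges, dtype=float)
    lower, upper = X_ranges[:, 0], X_ranges[:, 1]
    return (X_real - lower) / (upper - lower)


def to_real_scale(X_unit, X_ranges):
    """Hypothetical helper: map each column from [0, 1] back to [lower, upper]."""
    X_unit = np.asarray(X_unit, dtype=float)
    X_ranges = np.asarray(X_ranges, dtype=float)
    lower, upper = X_ranges[:, 0], X_ranges[:, 1]
    return X_unit * (upper - lower) + lower


# Two dimensions with assumed bounds [0, 10] and [100, 200]
X_ranges = [[0.0, 10.0], [100.0, 200.0]]
X_real = np.array([[0.0, 100.0], [5.0, 150.0], [10.0, 200.0]])

X_unit = to_unit_scale(X_real, X_ranges)
X_back = to_real_scale(X_unit, X_ranges)
```

Applying the inverse to `X_unit` recovers `X_real`, which is a convenient sanity check for any scaling code.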
(Inverse) Standardization
Converts arrays or matrices from a real scale to a standardized scale with zero mean and unit variance in each dimension, and vice versa.
This step is often needed for Y before model training.
Note
Y indicates that the values are in standardized scales.
Y_real indicates that the values are in real scales or units.
Takes in an array/matrix X and returns the standardized data with zero mean and unit variance |
|
Takes in an array/matrix/tensor X and returns the data in the real scale |
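A minimal sketch of standardization and its inverse, written with NumPy, is shown below. The function names are hypothetical, and the use of the population standard deviation (NumPy's default, `ddof=0`) is an assumption; the library may use a different convention.

```python
import numpy as np


def standardize(Y_real):
    """Hypothetical helper: scale each column to zero mean and unit variance."""
    Y_real = np.asarray(Y_real, dtype=float)
    mean = Y_real.mean(axis=0)
    std = Y_real.std(axis=0)  # population std (ddof=0), an assumption
    return (Y_real - mean) / std, mean, std


def inverse_standardize(Y, mean, std):
    """Hypothetical helper: map standardized values back to the real scale."""
    return np.asarray(Y, dtype=float) * std + mean


Y_real = np.array([[1.0], [2.0], [3.0]])
Y, Y_mean, Y_std = standardize(Y_real)
Y_back = inverse_standardize(Y, Y_mean, Y_std)
```

The mean and standard deviation of the training data have to be kept, because the inverse transform needs them to return predictions in real units.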
Test Points Generation
Generates X points on a mesh grid for visualization or testing purposes.
Create 2D mesh for testing |
|
Takes in one column of the X tensor, predicts the Y values, converts them to real units, and returns a 2D numpy array of size mesh_size * mesh_size |
|
Transform X1 and X2 in unit scale to real scales for plotting |
|
Creates a full X_test matrix with varying x in the dimensions defined by x_indices and fixed values in the remaining dimensions |
|
Creates a full X_test matrix with varying x in the dimensions defined by x_indices and fixed values in the remaining dimensions |
|
Get the baseline values from X_ranges in a unit scale |
|
Choose certain dimensions defined by x_indices, fill them with mesh test points and keep the rest as fixed values. |
|
Choose two dimensions, create 2D mesh and keep the rest as fixed values. |
|
Choose two dimensions, create 2D mesh and keep the rest as fixed values. |
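The mesh-based test points described above can be sketched as follows: build a 2D mesh in the unit scale, then place it into the chosen dimensions of a full test matrix while the other dimensions stay at a fixed baseline. The helper names, the default mesh size, and the baseline value of 0.5 are assumptions for illustration only.

```python
import numpy as np


def make_mesh_2d(mesh_size=41):
    """Hypothetical helper: 2D mesh over [0, 1] x [0, 1] in the unit scale."""
    x = np.linspace(0.0, 1.0, mesh_size)
    X1, X2 = np.meshgrid(x, x)
    # Flatten the mesh into an (mesh_size**2, 2) matrix of test points
    X_pairs = np.column_stack([X1.ravel(), X2.ravel()])
    return X_pairs, X1, X2


def fill_full_test_matrix(X_pairs, n_dim, x_indices, baseline=0.5):
    """Hypothetical helper: vary the x_indices columns, fix the rest."""
    X_full = np.full((X_pairs.shape[0], n_dim), baseline, dtype=float)
    X_full[:, x_indices] = X_pairs
    return X_full


# Vary dimensions 0 and 2 of a 4D input; keep dimensions 1 and 3 at 0.5
X_pairs, X1, X2 = make_mesh_2d(mesh_size=3)
X_test = fill_full_test_matrix(X_pairs, n_dim=4, x_indices=[0, 2])
```

Model predictions on `X_test` can then be reshaped to `(mesh_size, mesh_size)` so they line up with `X1` and `X2` for surface or heatmap plots.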
Encoding/Decoding
For ordinal and categorical variables, the real values need to be converted into unit-scale encodings in the continuous space.
We can do this with real_to_encode_X. These encodings are used to train BO functions.
Converting the unit-scale encodings back to the original variable values takes two steps: unit_to_encode_X followed by encode_to_real_X.
A ParameterSpace object can also be the input to these functions.
Convert original data to encoded data |
|
Decodes the data to ordinal or categorical values |
|
Takes in a matrix in a real scale from the relaxed continuous space, rounds (encodes) to the available values, and converts it into a unit scale |
|
Takes in a matrix in a unit scale from the relaxed continuous space, rounds (encodes) to the available values |
|
Takes in a matrix in a unit scale from the encoding space, converts it into a real scale |
|
Takes in a matrix in a real scale from the relaxed continuous space, rounds (encodes) to the available values, and converts it into a unit scale. |
|
Takes in a matrix in a unit scale from the relaxed continuous space and rounds (encodes) it to the available values, using a ParameterSpace object |
|
Takes in a matrix in a unit scale from the encoding space and converts it into a real scale, using a ParameterSpace object |
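To illustrate the round trip for a single ordinal or categorical variable, the sketch below assumes that the available values are assigned equally spaced encodings in [0, 1], and that a relaxed value is rounded to the nearest encoding. Both the spacing scheme and the helper names are assumptions; they are not the signatures of real_to_encode_X, unit_to_encode_X, or encode_to_real_X.

```python
import numpy as np


def make_encodings(values):
    """Hypothetical helper: one equally spaced unit-scale encoding per value."""
    return np.linspace(0.0, 1.0, len(values))


def real_to_encoding(x_real, values, encodings):
    """Hypothetical helper: original values -> unit-scale encodings."""
    return np.array([encodings[values.index(v)] for v in x_real])


def round_to_encoding(x_unit, encodings):
    """Hypothetical helper: relaxed unit values -> nearest available encoding."""
    x_unit = np.asarray(x_unit, dtype=float)
    idx = np.abs(x_unit[:, None] - encodings[None, :]).argmin(axis=1)
    return encodings[idx], idx


def encoding_to_real(idx, values):
    """Hypothetical helper: encoding indices -> original values."""
    return [values[i] for i in idx]


values = ["low", "medium", "high"]
encodings = make_encodings(values)

x_encoded = real_to_encoding(["high", "low"], values, encodings)

# Relaxed points, such as an optimizer might propose in the continuous space
x_rounded, idx = round_to_encoding([0.1, 0.6, 0.9], encodings)
x_decoded = encoding_to_real(idx, values)
```

Rounding is needed because an optimizer working in the relaxed continuous space can propose points that fall between the encodings of the available values.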