Protocols
Introduction
Protocols are special cases of Bionic decorators; their effect is to specify the Serialization Protocol for the entity being defined. For example:
# This entity should only have values equal to "short" or "long".
@builder
@bn.protocol.enum('short', 'long')
def name_length(name):
    if len(name) < 10:
        return 'short'
    else:
        return 'long'
# This entity's value will always be a ``pandas.DataFrame``.
@builder
@bn.protocol.frame
def raw_df():
    import pandas as pd
    from sklearn import datasets
    dataset = datasets.load_breast_cancer()
    df = pd.DataFrame(
        data=dataset.data,
    )
    df['target'] = dataset.target
    return df
Protocols are used to tell Bionic how to serialize, deserialize, and validate entity values. In most cases, Bionic’s default protocol can figure out an appropriate way to handle each value, so explicit protocol decorators are usually not required. However, they can be useful for data types that need special handling, or just to add clarity, safety, or documentation to an entity definition.
Protocols can also be used when creating new entities with declare or assign:
builder.assign('name_length', 'short', bn.protocol.enum('short', 'long'))
builder.declare('raw_df', bn.protocol.frame)
Custom Protocols
If you need to control how an entity is serialized, you can write your own custom protocol. (However, since Bionic is still at an early stage, future API changes may break your implementation.)
class MyProtocol(BaseProtocol):
    def get_fixed_file_extension(self):
        """Returns the extension that persisted files should end with."""
        raise NotImplementedError()

    def write(self, value, path):
        """Write the contents of ``value`` to path object ``path``."""
        raise NotImplementedError()

    def read(self, path):
        """Read an object from path object ``path``, and return it."""
        raise NotImplementedError()
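As an illustration, here is a minimal sketch of a protocol that stores values as JSON. To keep it runnable without Bionic installed, it uses a hypothetical stand-in `BaseProtocol` that exposes only the three methods shown above. The names `JsonProtocol` and `round_trip` are made up for this example, and two things are assumed rather than confirmed: that `path` behaves like a `pathlib.Path`, and that the extension is returned without a leading dot.

```python
import json
import tempfile
from pathlib import Path


class BaseProtocol:
    """Hypothetical stand-in for Bionic's base class (assumption)."""

    def get_fixed_file_extension(self):
        raise NotImplementedError()

    def write(self, value, path):
        raise NotImplementedError()

    def read(self, path):
        raise NotImplementedError()


class JsonProtocol(BaseProtocol):
    """Illustrative protocol that persists JSON-compatible values."""

    def get_fixed_file_extension(self):
        # Assumption: the extension is given without a leading dot.
        return "json"

    def write(self, value, path):
        # Assumption: ``path`` is a pathlib.Path-like object.
        path.write_text(json.dumps(value))

    def read(self, path):
        return json.loads(path.read_text())


def round_trip(value):
    """Write ``value`` with JsonProtocol, then read it back."""
    protocol = JsonProtocol()
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = Path(tmp_dir) / f"value.{protocol.get_fixed_file_extension()}"
        protocol.write(value, path)
        return protocol.read(path)


restored = round_trip({"name": "Alice", "scores": [1, 2, 3]})
```

A real implementation would subclass Bionic's own `BaseProtocol` instead of the stand-in; the key property is that `read` returns a value equivalent to what `write` was given.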
Protocol Decorators
- bionic.protocol.dillable(func_or_provider=None, **kwargs)
Decorator indicating that an entity’s values can be serialized using the dill library. This is useful for objects that can’t be pickled for some reason.
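One common case of an object that can't be pickled is a lambda function: the standard `pickle` module stores functions by reference to an importable name, which a lambda lacks. The sketch below uses only the standard library to show the failure; the helper `can_pickle` is hypothetical. The `dill` library is designed to handle objects like this, which is the situation where a dill-based protocol would be the natural choice.

```python
import pickle

# A lambda has no importable name, so pickle cannot store a reference to it.
square = lambda x: x * x


def can_pickle(obj):
    """Return True if ``obj`` survives ``pickle.dumps``."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


lambda_ok = can_pickle(square)
list_ok = can_pickle([1, 2, 3])
```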
- bionic.protocol.enum(*allowed_values)
Indicates that an entity will only have one of a specific set of values.
- Parameters
allowed_values (Sequence of objects) – The expected possible values for this entity.
- Returns
An entity decorator.
- Return type
Function
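Conceptually, the validation an enum protocol adds is a membership check on the entity's value. The following plain-Python sketch shows the idea only; the helper `validate_enum` is hypothetical and is not part of Bionic, whose actual check and error type may differ.

```python
def validate_enum(value, allowed_values):
    """Raise ValueError unless ``value`` is one of ``allowed_values``."""
    if value not in allowed_values:
        raise ValueError(
            f"Value {value!r} is not one of the allowed values {allowed_values!r}"
        )
    return value


def name_length(name):
    return "short" if len(name) < 10 else "long"


# A value inside the allowed set passes through unchanged.
checked = validate_enum(name_length("Alice"), ("short", "long"))

# A value outside the allowed set is rejected.
try:
    validate_enum("medium", ("short", "long"))
    rejected = False
except ValueError:
    rejected = True
```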
- bionic.protocol.frame(func_or_provider=None, file_format=None, check_dtypes=None)
Decorator indicating that an entity will always have a pandas DataFrame type.
The frame values will be serialized to either Parquet (default) or Feather. Parquet is more popular, but some types of data or frame structures are only supported by one format or the other. In particular, ordered categorical columns are supported by Feather and not Parquet.
This decorator can be used with or without arguments:
@frame
def dataframe(...):
    ...

@frame(file_format='feather')
def dataframe(...):
    ...
- Parameters
file_format ({'parquet', 'feather'} (default: 'parquet')) – Which file format to use when saving values to disk.
check_dtypes (boolean (default: True)) – Check for column types not supported by the file format. This check is best-effort and not guaranteed to catch all problems. If an unsupported data type is found, an exception will be thrown at serialization time.
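The `func_or_provider=None` parameter suggests the usual Python idiom for decorators that work both bare and with arguments. A generic sketch of that idiom follows; the `tagged` decorator and the `file_format` attribute it sets are invented for illustration and say nothing about how Bionic implements `frame` internally.

```python
def tagged(func=None, file_format="parquet"):
    """Decorator usable as ``@tagged`` or ``@tagged(file_format=...)``."""

    def decorate(f):
        # Record the chosen option on the function for later inspection.
        f.file_format = file_format
        return f

    if func is None:
        # Called with arguments: return the real decorator.
        return decorate
    # Called bare: ``func`` is the decorated function itself.
    return decorate(func)


@tagged
def default_frame():
    return None


@tagged(file_format="feather")
def feather_frame():
    return None
```

In the bare form Python passes the function as the first argument; in the parameterized form the first call receives only keyword arguments and returns the decorator that is then applied to the function.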
- bionic.protocol.image(func_or_provider=None, **kwargs)
Decorator indicating that an entity’s values always have the Pillow.Image type. These values will be serialized to PNG files.
- bionic.protocol.numpy(func_or_provider=None, **kwargs)
Decorator indicating that an entity’s values always have the numpy.ndarray type. These values will be serialized to .npy files.
- bionic.protocol.picklable(func_or_provider=None, **kwargs)
Decorator indicating that an entity’s values can be serialized using the pickle library.
- bionic.protocol.type(type_)
Indicates that an entity’s values will always have a specific type.
- Parameters
type_ (Type) – The expected type for this entity.
- Returns
An entity decorator.
- Return type
Function