Flow and FlowBuilder
Introduction
FlowBuilder and Flow are the primary interfaces for constructing and running Bionic flows. Either of them can be used to represent the collection of interdependent entities that make up a single analysis. The difference is that a FlowBuilder is a mutable object which can be updated, while a Flow is an immutable object which can perform computation.
The typical pattern is to start with an empty FlowBuilder, incrementally add entity definitions to it, then use FlowBuilder.build() to generate a Flow. This Flow can be used immediately to compute entity values, or passed to other code, which might reconfigure or extend it.
Although Flow objects are immutable, there is a mechanism for modifying them: instead of a method like set that mutates the Flow, there is a setting method that returns a new copy with the requested change. This allows Flows to be easily customized without worrying about shared state. However, this API can only be used to update existing entities; if you want to define new entities, you’ll need to convert the Flow back to a FlowBuilder using to_builder.
See the Concepts documentation for more details.
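The copy-on-update pattern described above can be illustrated without Bionic itself. The sketch below is a minimal stand-in, not Bionic's implementation: ToyFlow and its internals are hypothetical names. It models only two ideas from the text: setting returns a new object while the original stays unchanged, and only existing entities can be updated this way.

```python
from types import MappingProxyType


class ToyFlow:
    """Hypothetical stand-in for an immutable flow (not Bionic's Flow class)."""

    def __init__(self, values):
        # Copy the input and expose it read-only so callers cannot mutate it.
        self._values = MappingProxyType(dict(values))

    def get(self, name):
        return self._values[name]

    def setting(self, name, value):
        # Mirror the documented restriction: only existing entities can be updated.
        if name not in self._values:
            raise KeyError(f"unknown entity: {name!r}")
        updated = dict(self._values)
        updated[name] = value
        # Return a new object; this one is left untouched.
        return ToyFlow(updated)


base = ToyFlow({"greeting": "hello"})
custom = base.setting("greeting", "hi")
print(base.get("greeting"), custom.get("greeting"))  # hello hi
```

Because each update produces a separate object, code that receives base can customize it freely without affecting other holders of the same object.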
FlowBuilder API
-
class bionic.FlowBuilder(name, _state=None)
A mutable builder for Flows.
Allows Flow objects to be constructed incrementally. Use declare, assign, set, and/or __call__ to add entities to the builder, then use build to convert it into a Flow.
- Parameters
name (String) – Identifies the flow and provides a namespace for cached data.
-
add_case(*name_values)
Adds a “case”: a collection of associated values for a set of entities.
Assigning entity values by case is an alternative to set (or assign). If set is used to set multiple values for some entities, then every combination of those values will be considered for downstream entities. On the other hand, if add_case is used, only the specified combinations will be considered.
Example using assign:

    from bionic import FlowBuilder

    # The flow name is required; 'my_flow' is an arbitrary placeholder.
    builder = FlowBuilder('my_flow')
    builder.assign('first_name', values=['Alice', 'Bob'])
    builder.assign('last_name', values=['Smith', 'Jones'])

    @builder
    def full_name(first_name, last_name):
        return first_name + ' ' + last_name

    # Prints (set order may vary): {'Alice Jones', 'Alice Smith', 'Bob Jones', 'Bob Smith'}
    print(builder.build().get('full_name', set))

Example using add_case:

    from bionic import FlowBuilder

    # The flow name is required; 'my_flow' is an arbitrary placeholder.
    builder = FlowBuilder('my_flow')
    builder.declare('first_name')
    builder.declare('last_name')
    builder.add_case('first_name', 'Alice', 'last_name', 'Jones')
    builder.add_case('first_name', 'Alice', 'last_name', 'Smith')
    builder.add_case('first_name', 'Bob', 'last_name', 'Smith')

    @builder
    def full_name(first_name, last_name):
        return first_name + ' ' + last_name

    # Prints (set order may vary): {'Alice Jones', 'Alice Smith', 'Bob Smith'}
    print(builder.build().get('full_name', set))

All entities must already exist. They may have existing values, but those values must have been set case-by-case with the same structure as this call.
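The difference between assigning multiple values and adding cases comes down to which combinations of upstream values are visible downstream. The following is a plain-Python sketch of that idea only, assuming nothing about Bionic's internals: itertools.product stands in for the every-combination behavior of set or assign with multiple values, and an explicit list of pairs stands in for add_case.

```python
from itertools import product

first_names = ["Alice", "Bob"]
last_names = ["Smith", "Jones"]

# Multiple values per entity: downstream sees every combination (a cross product).
all_combinations = {
    f"{first} {last}" for first, last in product(first_names, last_names)
}

# Cases: downstream sees only the combinations that were listed explicitly.
cases = [("Alice", "Jones"), ("Alice", "Smith"), ("Bob", "Smith")]
listed_combinations = {f"{first} {last}" for first, last in cases}

# The one combination that cases leave out.
print(sorted(all_combinations - listed_combinations))  # ['Bob Jones']
```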
- Parameters
name_values (String/Object) – Alternating entity names and values.
- Returns
An object which can be used to set values on additional entities with this case.
- Return type
-
assign(name, value=None, values=None, protocol=None, doc=None, docstring=None)
Creates a new entity and assigns it a value.
Exactly one of value or values must be provided. The entity must not already exist.
- Parameters
name (String) – The name of the new entity.
value (Object, optional) – A single value for the entity.
values (Sequence, optional) – A sequence of values for the entity.
protocol (Protocol, optional) – The entity’s protocol. The default is a smart type-detecting protocol.
doc (String, optional) – Description of the new entity.
-
build()
Constructs a Flow object from this builder’s state.
The returned flow is immutable and will not be affected by future changes to this builder’s state.
-
clear_cases(*names)
Removes all values assigned to one or more entities.
The entities will still exist, but will not have any values, as if they had just been created with declare. If any of the entities were set in a group using add_case, they must all be cleared together.
- Parameters
names (Sequence of strings) – The entities whose values should be cleared.
-
declare(name, protocol=None, doc=None, docstring=None)
Creates a new entity but does not assign it a value.
The entity must not already exist.
- Parameters
name (String) – The name of the new entity.
protocol (Protocol, optional) – The entity’s protocol. The default is a smart type-detecting protocol.
doc (String, optional) – Description of the new entity.
-
delete(*names)
Deletes one or more entities.
If any of the entities were set in a group using add_case, they must all be deleted together.
- Parameters
names (Sequence of strings) – The entities to be deleted.
-
merge(flow, keep='error', allow_name_match=False)
Updates this builder by importing all entities from another flow.
If any incoming entity has the same name as an existing entity, the conflict is resolved by applying the following rules, in order:
1. The name (core__flow_name) of this builder is never changed; the original value is always kept.
2. Entities that were set by default (not explicitly set by the user) are never imported and can be overwritten.
3. Assignments (definitions with values) take precedence over declarations (definitions with no values).
4. Otherwise, the keep parameter can be used to specify which entity to keep.
- Parameters
flow (Flow) – Any Bionic Flow.
keep ('error', 'self', or 'arg' (default: 'error')) – How to handle conflicting entity names. Options:
'error': throw an AlreadyDefinedEntityError
'self' or 'old': use the definition from this builder
'arg' or 'new': use the definition from flow
allow_name_match (boolean (default: False)) – Allows the incoming flow to have the same name as this builder. (If this is False, we handle duplicate names by throwing an exception. It’s technically possible to share a name between flows, but it’s generally not good practice.)
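The ordering of the conflict rules can be easier to follow as code. The sketch below is one simplified reading of the rules listed for merge, not Bionic's actual implementation: EntityDef, its fields, and resolve_conflict are hypothetical names, and ValueError stands in for the AlreadyDefinedEntityError mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityDef:
    """Hypothetical record describing one entity definition."""

    name: str
    has_value: bool  # True for an assignment, False for a declaration
    is_default: bool = False  # True if set by default rather than by the user


def resolve_conflict(old, new, keep="error"):
    """Pick between the builder's definition (old) and the incoming one (new)."""
    # Rule 1: the builder's own flow name is always kept.
    if old.name == "core__flow_name":
        return old
    # Rule 2: defaults are never imported, and existing defaults can be overwritten.
    if new.is_default:
        return old
    if old.is_default:
        return new
    # Rule 3: an assignment beats a declaration.
    if old.has_value != new.has_value:
        return old if old.has_value else new
    # Rule 4: fall back to the keep parameter.
    if keep in ("self", "old"):
        return old
    if keep in ("arg", "new"):
        return new
    if keep == "error":
        # Stand-in for the AlreadyDefinedEntityError named in the documentation.
        raise ValueError(f"entity {old.name!r} is already defined")
    raise ValueError(f"unrecognized keep value: {keep!r}")


declared = EntityDef("x", has_value=False)
assigned = EntityDef("x", has_value=True)
print(resolve_conflict(declared, assigned) is assigned)  # True
```

In this reading, the keep parameter only matters when both definitions are user-set and are either both assignments or both declarations.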
-
set(name, value=None, values=None)
Sets the value of an existing entity.
Exactly one of value or values must be provided. The entity must already exist and may already have a value (which will be overwritten).
- Parameters
name (String) – The name of the existing entity.
value (Object, optional) – A single value for the entity.
values (Sequence, optional) – A sequence of values for the entity.
Flow API
-
class bionic.Flow(state, _official=False)
An immutable workflow object. You can use get() to compute any entity in the workflow, or setting() to create a new workflow with modifications. Not all modifications are possible with this interface, but to_builder() can be used to get a mutable FlowBuilder version of a Flow.
-
all_entity_names(include_core=False)
Returns a list of all declared entity names in this flow.
- Parameters
include_core (Boolean, optional (default false)) – Include internal entities used for Bionic infrastructure.
-
assigning(name, value=None, values=None, protocol=None)
Like FlowBuilder.assign, but returns a new copy of this flow.
-
declaring(name, protocol=None)
Like FlowBuilder.declare, but returns a new copy of this flow.
-
entity_doc(name)
Returns the doc for the named entity if one is defined; otherwise returns None.
- Parameters
name (String) – The name of an entity.
-
entity_docstring(name)
(Deprecated in favor of entity_doc.) Returns the doc for the named entity if one is defined; otherwise returns None.
- Parameters
name (String) – The name of an entity.
-
entity_protocol(name)
Returns the protocol for a given entity.
- Parameters
name (String) – The name of an entity.
-
export(name, file_path=None, dir_path=None)
Provides access to the persisted file corresponding to an entity. Note: this method is deprecated and the same functionality is available through Flow.get.
Can be called in three ways:

    # Returns a path to the persisted file.
    export(name)
    # Copies the persisted file to the specified file path.
    export(name, file_path=path)
    # Copies the persisted file to the specified directory.
    export(name, dir_path=path)

The entity must be persisted and have only one instance. The dir_path and file_path options support paths on GCS, specified like: gs://mybucket/subdir/
-
get(name, collection=None, fmt=None, mode=<class 'object'>)
Computes the value(s) associated with an entity.
If the entity has multiple values, the collection parameter indicates how to handle them. It can have any of the following values:
None: return a single value or throw an exception
list or 'list': return a list of values
set or 'set': return a set of values
pandas.Series or 'series': return a series whose index is the root cases distinguishing the different values
The user can specify the type of object (implicitly specifying in-memory vs. persisted data) to return in the collection using the mode parameter. It can have any of the following values:
object or 'object': an in-memory value
'FileCopier': a wrapper for a path to the persisted file for the computed entity
Path or 'path': a path to the persisted file
'filename': a string representing a path to the persisted file
- Parameters
name (String) – The name of an entity.
collection (String or type, optional, default is None) – The data structure to use if the entity has multiple values.
fmt (String or type, optional, default is None) – The data structure to use if the entity has multiple values. Deprecated in favor of collection and will be removed in a future release.
mode (String or type, optional, default is object) – The type of object to return in the collection.
- Returns
The value of the entity, or a collection containing its values.
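As a rough illustration of how the collection options differ, the sketch below models them with a hypothetical helper named collect that takes an already-computed list of values. It is not part of Bionic, it omits the pandas.Series option to stay within the standard library, and the exact exception raised by Bionic for None with multiple values is an assumption here.

```python
def collect(values, collection=None):
    """Hypothetical helper mimicking the documented collection options."""
    if collection is None:
        # A single value is expected; anything else is treated as an error.
        if len(values) != 1:
            raise ValueError(f"expected exactly one value, found {len(values)}")
        return values[0]
    if collection in (list, "list"):
        return list(values)
    if collection in (set, "set"):
        return set(values)
    raise ValueError(f"unsupported collection: {collection!r}")


names = ["Alice Smith", "Bob Smith"]
print(collect(names, list))   # ['Alice Smith', 'Bob Smith']
print(collect(["only"]))      # only
```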
-
property name
Returns the name of this flow.
-
reloading()
Attempts to reload all modules used directly by this flow.
For safety, this only works if this flow meets the following requirements:
is the first Flow built by its FlowBuilder
has never been modified (i.e., isn’t derived from another Flow)
is assigned to a top-level variable in a module that one of its functions is defined in
The most straightforward way to meet these requirements is to define your flow in a module as:

    builder = ...

    @builder
    def ...
    ...

    flow = builder.build()

and then import in the notebook like so:

    from mymodule import flow
    ...
    flow.reloading().get('my_entity')

This will reload the modules and use the most recent version of the flow before doing the get().
-
render_dag(include_core=False, vertical=False, curvy_lines=False)
Returns a FlowImage with a visualization of this flow’s DAG. This object behaves similarly to a Pillow Image object.
Will fail if Graphviz is not installed on the system.
-
setting(name, value=None, values=None)
Like FlowBuilder.set, but returns a new copy of this flow.
-