Pipeline Element

The PipelineElement class is the central building block for adding logic to your pipeline. Each element encapsulates an algorithm or method you want to use. In addition, the class lets you completely disable an item and stores, prepares, and manages its hyperparameter space.

Many common preprocessing steps and models are already pre-registered, so you can add them simply by name. See PhotonRegister.list() for a complete list of registered items. You can register your own elements, too.

# SHOW WHAT IS POSSIBLE IN THE CONSOLE
PhotonRegister.list()

# NOW FIND OUT MORE ABOUT A SPECIFIC ELEMENT
PhotonRegister.info('SVC')

...
# reduce dimensionality with a PCA and specify which values to try in the hyperparameter search
my_pipe += PipelineElement('PCA', hyperparameters={'n_components': [5, 10, None]}, test_disabled=True)

# engage and optimize the good old SVM for Classification
my_pipe += PipelineElement('SVC', hyperparameters={'kernel': Categorical(['rbf', 'linear']),
                                                   'C': FloatRange(0.5, 2, "linspace", num=5)})

class PipelineElement

Photon wrapper class for any transformer or predictor element in the pipeline.

  1. Saves the hyperparameters that are to be tested and creates a grid of all hyperparameter configurations
  2. Enables quick instantiation of pipeline elements via a string identifier, e.g. 'SVC' creates an sklearn.svm.SVC object
  3. Attaches a "disable" switch to every element in the pipeline so that completely disabling the element can be tested

Parameters

  • name [str]: A string literal encoding the class to be instantiated
  • hyperparameters [dict]: The values or value ranges to be tested for each hyperparameter, in the form "hyperparameter_name: [list of parameter values to be tested]"
  • test_disabled [bool]: Whether the hyperparameter search should also evaluate completely disabling the element
  • disabled [bool]: If true, the element is currently disabled and does nothing except return the data it received
  • kwargs [dict]: Any default parameters that should be passed to the object on instantiation
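
For illustration, keyword arguments that are not listed under hyperparameters are forwarded unchanged to the wrapped object's constructor as fixed default parameters. A minimal sketch, assuming the import paths of a recent PHOTON release (they may differ in your version):

# 'C' is part of the hyperparameter search, while 'kernel' is a fixed default
# parameter that is passed straight to sklearn.svm.SVC on instantiation
from photonai.base import PipelineElement
from photonai.optimization import FloatRange

svm = PipelineElement('SVC',
                      hyperparameters={'C': FloatRange(0.5, 2, "linspace", num=5)},
                      test_disabled=False,
                      kernel='linear')
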
class PipelineElement(BaseEstimator):
    """
    Photon wrapper class for any transformer or predictor element in the pipeline.

    1. Saves the hyperparameters that are to be tested and creates a grid of all hyperparameter configurations
    2. Enables fast and rapid instantiation of pipeline elements per string identifier,
         e.g 'svc' creates an sklearn.svm.SVC object.
    3. Attaches a "disable" switch to every element in the pipeline in order to test a complete disable


    Parameters
    ----------
    * `name` [str]:
       A string literal encoding the class to be instantiated
    * `hyperparameters` [dict]:
       Which values/value range should be tested for the hyperparameter.
       In form of "Hyperparameter_name: [array of parameter values to be tested]"
    * `test_disabled` [bool]:
        If the hyperparameter search should evaluate a complete disabling of the element
    * `disabled` [bool]:
        If true, the element is currently disabled and does nothing except return the data it received
    * `kwargs` [dict]:
        Any parameters that should be passed to the object to be instantiated, default parameters

    """
    # Registering Pipeline Elements
    ELEMENT_DICTIONARY = PhotonRegister.get_package_info()

    def __init__(self, name, hyperparameters: dict=None, test_disabled: bool=False,
                 disabled: bool =False, base_element=None,
                 **kwargs):
        """
        Takes a string literal and transforms it into an object of the associated class (see PhotonCore.JSON)

        Returns
        -------
        instantiated class object
        """
        if hyperparameters is None:
            hyperparameters = {}

        if not base_element:
            if name in PipelineElement.ELEMENT_DICTIONARY:
                try:
                    desired_class_info = PipelineElement.ELEMENT_DICTIONARY[name]
                    desired_class_home = desired_class_info[0]
                    desired_class_name = desired_class_info[1]
                    imported_module = __import__(desired_class_home, globals(), locals(), desired_class_name, 0)
                    desired_class = getattr(imported_module, desired_class_name)
                    base_element = desired_class(**kwargs)
                    obj = PipelineElement(name, hyperparameters, test_disabled, disabled, base_element)
                    self.base_element = obj
                except AttributeError as ae:
                    Logger().error('ValueError: Could not find according class:'
                                   + str(PipelineElement.ELEMENT_DICTIONARY[name]))
                    raise ValueError('Could not find according class:', PipelineElement.ELEMENT_DICTIONARY[name])
            else:
                Logger().error('Element not supported right now:' + name)
                raise NameError('Element not supported right now:', name)
        else:
            self.base_element = base_element


        # Todo: check if hyperparameters are members of the class
        # Todo: write method that returns any hyperparameter that could be optimized --> sklearn: get_params.keys
        # Todo: map any hyperparameter to a possible default list of values to try
        self.name = name
        self.test_disabled = test_disabled
        self._sklearn_disabled = self.name + '__disabled'
        self._hyperparameters = hyperparameters
        # check if hyperparameters are already in sklearn style
        if len(hyperparameters) > 0:
            key_0 = next(iter(hyperparameters))
            if self.name not in key_0:
                self.hyperparameters = hyperparameters
        self.disabled = disabled

    def copy_me(self):
        return deepcopy(self)

    @classmethod
    def create(cls, name, base_element, hyperparameters: dict, test_disabled=False, disabled=False, **kwargs):
        """
        Takes an instantiated object and encapsulates it into the PHOTON structure,
        add the disabled function and attaches information about the hyperparameters that should be tested
        """
        return PipelineElement(name, hyperparameters, test_disabled, disabled, base_element=base_element, **kwargs)

    @property
    def hyperparameters(self):
        return self._hyperparameters

    @hyperparameters.setter
    def hyperparameters(self, value: dict):
        self.generate_sklearn_hyperparameters(value)

    def generate_config_grid(self):
        config_dict = create_global_config_dict([self])
        if len(config_dict) > 0:
            if self.test_disabled:
                config_dict.pop(self._sklearn_disabled)
            config_list = list(ParameterGrid(config_dict))
            if self.test_disabled:
                config_list.append({self._sklearn_disabled: True})
            return config_list
        else:
            return []

    def generate_sklearn_hyperparameters(self, value: dict):
        """
        Generates a dictionary according to the sklearn convention of element_name__parameter_name: parameter_value
        """
        self._hyperparameters = {}
        for attribute, value_list in value.items():
            self._hyperparameters[self.name + '__' + attribute] = value_list
        if self.test_disabled:
            self._hyperparameters[self._sklearn_disabled] = [False, True]

    def get_params(self, deep: bool=True):
        """
        Forwards the get_params request to the wrapped base element
        """
        return self.base_element.get_params(deep)


    def set_params(self, **kwargs):
        """
        Forwards the set_params request to the wrapped base element
        Takes care of the disabled parameter which is additionally attached by the PHOTON wrapper
        """
        # element disable is a construct used for this container only
        if self._sklearn_disabled in kwargs:
            self.disabled = kwargs[self._sklearn_disabled]
            del kwargs[self._sklearn_disabled]
        elif 'disabled' in kwargs:
            self.disabled = kwargs['disabled']
            del kwargs['disabled']
        self.base_element.set_params(**kwargs)
        return self

    def fit(self, data, targets=None):
        """
        Calls the fit function of the base element

        Returns
        ------
        self
        """
        if not self.disabled:
            obj = self.base_element
            obj.fit(data, targets)
            # self.base_element.fit(data, targets)
        return self

    def predict(self, data):
        """
        Calls predict function on the base element.

        IF PREDICT IS NOT AVAILABLE CALLS TRANSFORM.
        This is for the case that the encapsulated hyperpipe only part of another hyperpipe, and works as a transformer.
        Sklearn usually expects the last element to predict.
        Also this is needed in case we are using an autoencoder which is firstly trained by using predict, and after
        training only used for transforming.
        """
        if not self.disabled:
            if hasattr(self.base_element, 'predict'):
                return self.base_element.predict(data)
            elif hasattr(self.base_element, 'transform'):
                return self.base_element.transform(data)
            else:
                Logger().error('BaseException. base Element should have function ' +
                               'predict, or at least transform.')
                raise BaseException('base Element should have function predict, or at least transform.')
        else:
            return data

    def predict_proba(self, data):
        """
        Predict probabilities
        base element needs predict_proba() function, otherwise throw
        base exception.
        """
        if not self.disabled:
            if hasattr(self.base_element, 'predict_proba'):
                return self.base_element.predict_proba(data)
            else:
                Logger().error('BaseException. base Element should have "predict_proba" function.')
            raise BaseException('base Element should have predict_proba function.')
        return data

    # def fit_predict(self, data, targets):
    #     if not self.disabled:
    #         return self.base_element.fit_predict(data, targets)
    #     else:
    #         return data

    def transform(self, data):
        """
        Calls transform on the base element.

        IN CASE THERE IS NO TRANSFORM METHOD, CALLS PREDICT.
        This is used if we are using an estimator as a preprocessing step.
        """
        if not self.disabled:
            if hasattr(self.base_element, 'transform'):
                return self.base_element.transform(data)
            elif hasattr(self.base_element, 'predict'):
                return self.base_element.predict(data)
            else:
                Logger().error('BaseException: transform-predict-mess')
                raise BaseException('transform-predict-mess')
        else:
            return data

    def inverse_transform(self, data):
        """
        Calls inverse_transform on the base element
        """
        if hasattr(self.base_element, 'inverse_transform'):
            return self.base_element.inverse_transform(data)
        else:
            # raise Warning('Element ' + self.name + ' has no method inverse_transform')
            return data

    # def fit_transform(self, data, targets=None):
    #     if not self.disabled:
    #         if hasattr(self.base_element, 'fit_transform'):
    #             return self.base_element.fit_transform(data, targets)
    #         elif hasattr(self.base_element, 'transform'):
    #             self.base_element.fit(data, targets)
    #             return self.base_element.transform(data)
    #         # elif hasattr(self.base_element, 'predict'):
    #         #     self.base_element.fit(data, targets)
    #         #     return self.base_element.predict(data)
    #     else:
    #         return data

    def score(self, X_test, y_test):
        """
        Calls the score function on the base element:
        Returns a goodness of fit measure or a likelihood of unseen data:
        """
        return self.base_element.score(X_test, y_test)

    def prettify_config_output(self, config_name: str, config_value, return_dict:bool=False):
        """Make hyperparameter combinations human readable """
        if config_name == "disabled" and config_value is False:
            if return_dict:
                return {'enabled':True}
            else:
                return "enabled = True"
        else:
            if return_dict:
                return {config_name:config_value}
            else:
                return config_name + '=' + str(config_value)

Ancestors (in MRO)

  • BaseEstimator

Class variables

var ELEMENT_DICTIONARY

Methods

def __init__(self, name, hyperparameters=None, test_disabled=False, disabled=False, base_element=None, **kwargs)

Takes a string literal and transforms it into an object of the associated class (see PhotonCore.JSON)

Returns

instantiated class object

def __init__(self, name, hyperparameters: dict=None, test_disabled: bool=False,
             disabled: bool =False, base_element=None,
             **kwargs):
    """
    Takes a string literal and transforms it into an object of the associated class (see PhotonCore.JSON)
    Returns
    -------
    instantiated class object
    """
    if hyperparameters is None:
        hyperparameters = {}
    if not base_element:
        if name in PipelineElement.ELEMENT_DICTIONARY:
            try:
                desired_class_info = PipelineElement.ELEMENT_DICTIONARY[name]
                desired_class_home = desired_class_info[0]
                desired_class_name = desired_class_info[1]
                imported_module = __import__(desired_class_home, globals(), locals(), desired_class_name, 0)
                desired_class = getattr(imported_module, desired_class_name)
                base_element = desired_class(**kwargs)
                obj = PipelineElement(name, hyperparameters, test_disabled, disabled, base_element)
                self.base_element = obj
            except AttributeError as ae:
                Logger().error('ValueError: Could not find according class:'
                               + str(PipelineElement.ELEMENT_DICTIONARY[name]))
                raise ValueError('Could not find according class:', PipelineElement.ELEMENT_DICTIONARY[name])
        else:
            Logger().error('Element not supported right now:' + name)
            raise NameError('Element not supported right now:', name)
    else:
        self.base_element = base_element
    # Todo: check if hyperparameters are members of the class
    # Todo: write method that returns any hyperparameter that could be optimized --> sklearn: get_params.keys
    # Todo: map any hyperparameter to a possible default list of values to try
    self.name = name
    self.test_disabled = test_disabled
    self._sklearn_disabled = self.name + '__disabled'
    self._hyperparameters = hyperparameters
    # check if hyperparameters are already in sklearn style
    if len(hyperparameters) > 0:
        key_0 = next(iter(hyperparameters))
        if self.name not in key_0:
            self.hyperparameters = hyperparameters
    self.disabled = disabled

def copy_me(self)

def copy_me(self):
    return deepcopy(self)

def fit(self, data, targets=None)

Calls the fit function of the base element

Returns

self

def fit(self, data, targets=None):
    """
    Calls the fit function of the base element
    Returns
    ------
    self
    """
    if not self.disabled:
        obj = self.base_element
        obj.fit(data, targets)
        # self.base_element.fit(data, targets)
    return self

def generate_config_grid(self)

def generate_config_grid(self):
    config_dict = create_global_config_dict([self])
    if len(config_dict) > 0:
        if self.test_disabled:
            config_dict.pop(self._sklearn_disabled)
        config_list = list(ParameterGrid(config_dict))
        if self.test_disabled:
            config_list.append({self._sklearn_disabled: True})
        return config_list
    else:
        return []
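
As a sketch of what this produces: when test_disabled is set, the disable switch is removed from the grid so that every regular configuration runs with the element enabled, and a single extra configuration that only disables the element is appended (result shown as a comment; exact ordering may vary):

pca = PipelineElement('PCA', hyperparameters={'n_components': [5, 10]}, test_disabled=True)
pca.generate_config_grid()
# -> [{'PCA__n_components': 5},
#     {'PCA__n_components': 10},
#     {'PCA__disabled': True}]
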

def generate_sklearn_hyperparameters(self, value)

Generates a dictionary according to the sklearn convention of element_name__parameter_name: parameter_value

def generate_sklearn_hyperparameters(self, value: dict):
    """
    Generates a dictionary according to the sklearn convention of element_name__parameter_name: parameter_value
    """
    self._hyperparameters = {}
    for attribute, value_list in value.items():
        self._hyperparameters[self.name + '__' + attribute] = value_list
    if self.test_disabled:
        self._hyperparameters[self._sklearn_disabled] = [False, True]
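
For example, the user-facing hyperparameter dictionary is translated into prefixed keys, and the disable switch is appended when test_disabled is set (expected result shown as a comment, a sketch derived from the logic above):

element = PipelineElement('PCA', hyperparameters={'n_components': [5, 10]}, test_disabled=True)
element.hyperparameters
# -> {'PCA__n_components': [5, 10], 'PCA__disabled': [False, True]}
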

def get_params(self, deep=True)

Forwards the get_params request to the wrapped base element

def get_params(self, deep: bool=True):
    """
    Forwards the get_params request to the wrapped base element
    """
    return self.base_element.get_params(deep)

def inverse_transform(self, data)

Calls inverse_transform on the base element

def inverse_transform(self, data):
    """
    Calls inverse_transform on the base element
    """
    if hasattr(self.base_element, 'inverse_transform'):
        return self.base_element.inverse_transform(data)
    else:
        # raise Warning('Element ' + self.name + ' has no method inverse_transform')
        return data

def predict(self, data)

Calls predict function on the base element.

If predict is not available, transform is called instead. This covers the case that the encapsulated hyperpipe is only part of another hyperpipe and therefore works as a transformer, since sklearn usually expects the last element to predict. It is also needed for elements such as an autoencoder, which is first trained via predict and after training is only used for transforming.

def predict(self, data):
    """
    Calls predict function on the base element.
    IF PREDICT IS NOT AVAILABLE CALLS TRANSFORM.
    This is for the case that the encapsulated hyperpipe only part of another hyperpipe, and works as a transformer.
    Sklearn usually expects the last element to predict.
    Also this is needed in case we are using an autoencoder which is firstly trained by using predict, and after
    training only used for transforming.
    """
    if not self.disabled:
        if hasattr(self.base_element, 'predict'):
            return self.base_element.predict(data)
        elif hasattr(self.base_element, 'transform'):
            return self.base_element.transform(data)
        else:
            Logger().error('BaseException. base Element should have function ' +
                           'predict, or at least transform.')
            raise BaseException('base Element should have function predict, or at least transform.')
    else:
        return data

def predict_proba(self, data)

Predicts probabilities. The base element needs a predict_proba() function; otherwise a BaseException is raised.

def predict_proba(self, data):
    """
    Predict probabilities
    base element needs predict_proba() function, otherwise throw
    base exception.
    """
    if not self.disabled:
        if hasattr(self.base_element, 'predict_proba'):
            return self.base_element.predict_proba(data)
        else:
            Logger().error('BaseException. base Element should have "predict_proba" function.')
        raise BaseException('base Element should have predict_proba function.')
    return data

def prettify_config_output(self, config_name, config_value, return_dict=False)

Make hyperparameter combinations human readable

def prettify_config_output(self, config_name: str, config_value, return_dict:bool=False):
    """Make hyperparameter combinations human readable """
    if config_name == "disabled" and config_value is False:
        if return_dict:
            return {'enabled':True}
        else:
            return "enabled = True"
    else:
        if return_dict:
            return {config_name:config_value}
        else:
            return config_name + '=' + str(config_value)
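
Two sketched calls illustrating the behaviour (the element and values are illustrative):

element.prettify_config_output('C', 1.0)           # -> 'C=1.0'
element.prettify_config_output('disabled', False)   # -> 'enabled = True'
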

def score(self, X_test, y_test)

Calls the score function on the base element and returns a goodness-of-fit measure or a likelihood of unseen data.

def score(self, X_test, y_test):
    """
    Calls the score function on the base element:
    Returns a goodness of fit measure or a likelihood of unseen data:
    """
    return self.base_element.score(X_test, y_test)

def set_params(self, **kwargs)

Forwards the set_params request to the wrapped base element. Takes care of the disabled parameter, which is additionally attached by the PHOTON wrapper.

def set_params(self, **kwargs):
    """
    Forwards the set_params request to the wrapped base element
    Takes care of the disabled parameter which is additionally attached by the PHOTON wrapper
    """
    # element disable is a construct used for this container only
    if self._sklearn_disabled in kwargs:
        self.disabled = kwargs[self._sklearn_disabled]
        del kwargs[self._sklearn_disabled]
    elif 'disabled' in kwargs:
        self.disabled = kwargs['disabled']
        del kwargs['disabled']
    self.base_element.set_params(**kwargs)
    return self
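
A sketch of how the disable switch is intercepted while everything else is forwarded to the wrapped estimator (parameter values here are illustrative):

svm = PipelineElement('SVC', hyperparameters={})
svm.set_params(**{'SVC__disabled': True, 'C': 0.5})
# svm.disabled is now True; C=0.5 is forwarded to the underlying sklearn.svm.SVC
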

def transform(self, data)

Calls transform on the base element.

If no transform method is available, predict is called instead. This is used when an estimator serves as a preprocessing step.

def transform(self, data):
    """
    Calls transform on the base element.
    IN CASE THERE IS NO TRANSFORM METHOD, CALLS PREDICT.
    This is used if we are using an estimator as a preprocessing step.
    """
    if not self.disabled:
        if hasattr(self.base_element, 'transform'):
            return self.base_element.transform(data)
        elif hasattr(self.base_element, 'predict'):
            return self.base_element.predict(data)
        else:
            Logger().error('BaseException: transform-predict-mess')
            raise BaseException('transform-predict-mess')
    else:
        return data

Instance variables

var disabled

var hyperparameters

var name

var test_disabled

Class methods

def create(cls, name, base_element, hyperparameters, test_disabled=False, disabled=False, **kwargs)

Takes an instantiated object and encapsulates it into the PHOTON structure, adds the disable switch, and attaches information about the hyperparameters that should be tested.

@classmethod
def create(cls, name, base_element, hyperparameters: dict, test_disabled=False, disabled=False, **kwargs):
    """
    Takes an instantiated object and encapsulates it into the PHOTON structure,
    add the disabled function and attaches information about the hyperparameters that should be tested
    """
    return PipelineElement(name, hyperparameters, test_disabled, disabled, base_element=base_element, **kwargs)
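
For example, an object that you have already instantiated yourself can be wrapped like this (a sketch; the element name chosen here is only used for logging and for prefixing hyperparameters in the sklearn convention):

from sklearn.decomposition import PCA

custom_pca = PipelineElement.create('MyPCA', PCA(whiten=True),
                                    hyperparameters={'n_components': [5, 10]},
                                    test_disabled=True)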