Model

The core implementation of the “fireball” library. This file implements the “Model” class. A model can be loaded from a file or created from scratch. The network structure can be specified using a short-form language that defines different layers of the network in a text string.

Please refer to the “Layers” class for more details about the types of layers currently supported by Fireball.

class fireball.model.Model(name='FireballModel', layersInfo=None, trainDs=None, testDs=None, validationDs=None, batchSize=None, numEpochs=10, regFactor=0.0, dropOutKeep=1.0, learningRate=0.01, optimizer=None, lossFunction=None, blocks=[], modelFilePath=None, saveModelFileName=None, savePeriod=None, saveBest=False, gpus=None, netParams=None, trainingState=None)

The implementation of the Fireball Model.

A general-purpose neural network for classification and regression problems. Fireball is built on top of TensorFlow 1.x.

Initialize all the parameters and then build a TensorFlow graph based on the parameters.

Parameters:
  • name (str) – An optional name for the network.

  • layersInfo (str) – Contains the information for each hidden layer. The string should be in the format explained in the “Layers” class.

  • trainDs (dataset object, derived from "BaseDSet" class) – Must conform to the dataset interface. Used only for training. For more information about the dataset classes, please refer to the “datasets” directory.

  • testDs (dataset object, derived from "BaseDSet" class) – Must conform to the dataset interface. Used for testing. For more information about the dataset classes, please refer to the “datasets” directory.

  • validationDs (dataset object, derived from "BaseDSet" class) – Must conform to the dataset interface. Used for evaluation and hyper-parameter search.

  • batchSize (int) – The batch size used in each iteration of training. If not specified, the batch size of “trainDs” is used for training.

  • numEpochs (int) – Total number of Epochs to train the model.

  • regFactor (float, default = 0.0) –

    Regularization factor. If this is zero, L2 regularization is disabled for all layers. Otherwise, L2 regularization is enabled. In this case:

    • If a factor is specified in a layer’s L2R post-activation, the specified value is used for that layer.

    • Otherwise, this value is used as the L2 regularization factor for that layer.

  • dropOutKeep (float, default = 1.0) –

    The probability of keeping results for dropout; dropRate is (1.0 - dropOutKeep). This value is interpreted as follows:

    • If this value is 1 (the default), all DO layers use their specified rate. If a rate was not specified for a DO layer, dropout is disabled for that layer.

    • If this value is non-zero and less than 1, all DO layers use their specified rate. If a rate was not specified for a DO layer, its rate is set to “1.0 - dropOutKeep”.

    • If this value is 0.0, dropout is disabled globally in the whole network.

  • learningRate (tuple, list, float, or None) –

    • None: This is used when the model is not used for training. (trainDs should also be None in this case)

    • tuple: If it is a tuple (start, end, [warmUp]), the learning rate starts at the “start” value and exponentially decays to the “end” value by the end of training. The optional “warmUp” value gives the number of “Warm-up” batches at the beginning of training. During the warm-up phase, the learning rate increases linearly from 0 to the “start” value.

    • float: If it is a single floating point value, the learning rate is fixed during the training.

    • list of tuples: If it is a list of the form [(0,lr1),(n2,lr2), …,(nN,lrN)], the learning rate is changed based on a piecewise equation. It starts at “lr1” and stays constant until batch “n2”. Then the learning rate becomes “lr2” until batch “n3”, where it changes to “lr3”, and so on. The values “n2”, “n3”, …, “nN” must be monotonically increasing. The last learning rate is used until the end of training. Note that the first batch number must always be 0, specifying the initial learning rate.

      • Example:

        learningRate=[(0,.01),(300,.005),(400, .001)]
        

        Learning rates for batches:

            Batches                     Learning Rate
            0 to 299                    0.01
            300 to 399                  0.005
            400 to end of training      0.001

      One special case is when a tuple of the form (b, ‘trainAll’) is included. This is used when doing transfer learning. In this case only the non-transferred parameters are trained before batch ‘b’, and all trainable parameters are trained after batch ‘b’. This does not change the learning rate behavior.

      • Example:

        learningRate=[(0,.1),(200,'trainAll'),(300,.05),(400, .01)]
        

        Schedule of batches:

            Batches          Learning Rate    Trained Parameters
            0 to 199         0.1              Only non-transferred
            200 to 299       0.1              All trainable
            300 to 399       0.05             All trainable
            400 to end       0.01             All trainable

      If the first tuple is of the form (0, ‘WarmUp’), the training starts with “w” batches of warm-up before continuing with the piecewise schedule. The number of warm-up batches “w” is specified by the next tuple in the list.

      • Example:

        learningRate=[(0,'WarmUp'), (100,.1), (200,'trainAll'), (300,.05), (400, .01)]
        

        Schedule of batches (w=100 in this case; “b” is the batch number):

            Batches          Learning Rate    Trained Parameters
            0 to 99          b*(0.1)/100      Only non-transferred
            100 to 199       0.1              Only non-transferred
            200 to 299       0.1              All trainable
            300 to 399       0.05             All trainable
            400 to end       0.01             All trainable

  • optimizer (str) – The type of optimizer used for training. Available options are: ‘GradientDescent’, ‘Adam’, and ‘Momentum’. If not specified, “Momentum” is used for classification and “Adam” for regression problems.

  • lossFunction (function) –

    A function that is used to calculate the loss in the training mode. It can be used to define a customized loss function, which the model calls to calculate the loss during training. The function takes the following arguments:

    • layers: A “Layers” object as defined in the “Layers.py” file. This keeps a list of all the layers in the model that may be used for the calculation of the loss.

    • predictions: The output(s) of the network just before the output layer. This is a tuple containing all outputs of the network.

    • groundTruths: The batch of labels used for the training step. This is a tuple containing all label objects. The tuple usually contains the placeholders created in the Layer’s “makePlaceholders” function.

  • blocks (list) – A list of Block objects or blockInfo strings that extend fireball’s predefined layers and can be used in the layersInfo string. For more information about how blocks work, please refer to Blocks.

  • modelFilePath (str) – The path to the model file used to load this model. Only set when “makeFromFile” is used to create a model.

  • saveModelFileName (str) –

    The file name to use when saving the trained network information. A training session can be resumed using information in the saved file.

    If the “saveBest” argument is True, this file name is also used to create another file that saves the network with the best results so far during the training. If this argument is None, the “savePeriod” and “saveBest” arguments are ignored and the network information is not saved during the training. If this argument is True, the file name “SavedModel” is used to save the training information.

  • savePeriod (int) –

    The number of epochs before saving the current training state. The training can be resumed using the files saved during the training.

    Ignored if “saveModelFileName” is None. If this is set to None or 0, the trained network is saved only once at the end of training. In this case, if the training is interrupted for any reason, the training restarts from the beginning in the next session (all training info is lost).

  • saveBest (Boolean) – If true, the network with best results so far during the training will be saved in a separate file. This is useful when the network performance degrades as more training epochs are processed. Ignored if “saveModelFileName” is None.

  • gpus (Number, List, or str) –

    If this is a list, it should contain the integers specifying the GPU devices to be used. For example [0,3] means this class will use the “/gpu:0” and “/gpu:3” devices during the training and inference.

    If this is a string, it can be “All” or “Half”, which means use all or half of the detected GPU devices, respectively. It can also be a comma-delimited list of GPU numbers, for example: “1,3”.

    If this is an integer, it specifies the single GPU device to be used.

    If this is None, this class uses half of the detected GPUs. (Same as passing the “Half” string)

    Use [-1] to disable GPU usage and run on CPU.

  • netParams (list) – This parameter is only used internally when this constructor function is called from “makeFromFile”. It is a list of NetParam objects.

  • trainingState (dict) – This parameter is only used internally when this constructor function is called from “makeFromFile”. It contains the training state that can be used to resume an interrupted training session.
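
Example: The following is a minimal, hypothetical sketch of constructing and training a model. The “layersInfo” string, the dataset objects (“trainDs”, “testDs”), and the file names are placeholders; their real formats are defined by the “Layers” class and the “BaseDSet”-derived dataset classes.

    from fireball.model import Model

    # "trainDs" and "testDs" are assumed to be BaseDSet-derived dataset objects,
    # and "layersInfoStr" a layers specification in the "Layers" class format.
    model = Model(name='MyModel',
                  layersInfo=layersInfoStr,
                  trainDs=trainDs, testDs=testDs,
                  batchSize=128, numEpochs=10,
                  # 100 warm-up batches, then a piecewise schedule:
                  learningRate=[(0,'WarmUp'), (100,.1), (300,.05), (400,.01)],
                  optimizer='Momentum',
                  saveModelFileName='MyModel.fbm', savePeriod=2, saveBest=True)
    model.printLayersInfo()
    model.initSession()
    model.train()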

__init__(name='FireballModel', layersInfo=None, trainDs=None, testDs=None, validationDs=None, batchSize=None, numEpochs=10, regFactor=0.0, dropOutKeep=1.0, learningRate=0.01, optimizer=None, lossFunction=None, blocks=[], modelFilePath=None, saveModelFileName=None, savePeriod=None, saveBest=False, gpus=None, netParams=None, trainingState=None)

Initialize all the parameters and then build a TensorFlow graph based on the parameters.

The parameters are identical to those documented for the “Model” class above.

classmethod downloadFromZoo(modelName, destFolder, modelType=None)

A class method used to download a model from an online Fireball model zoo.

Parameters:
  • modelName (str) – A string containing the name of the model. If the “modelType” parameter is not provided, then this name must include the file extension to help identify the type of the model.

  • destFolder (str) – The folder where the downloaded model file is saved.

  • modelType (str) –

    The type of the model. Currently the following types are supported:

    • ’Fireball’: Fireball models. (Extension: fbm)

    • ’CoreML’: Models exported to CoreML ready to be deployed to iOS. (Extension: mlmodel)

    • ’ONNX’: Models exported to ONNX. (Extension: onnx)

    • ’NPZ’: Numpy ‘npz’ files containing the model information. (Extension: npz)
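
Example: A hedged usage sketch; the model name shown here is an assumption, not a guaranteed entry in the model zoo.

    from fireball.model import Model

    # Download a Fireball model file and load it for CPU-only use.
    Model.downloadFromZoo('MobileNetV2.fbm', destFolder='./models')
    model = Model.makeFromFile('./models/MobileNetV2.fbm', gpus=[-1])
    model.initSession()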

classmethod loadModelFrom(fileName)

This is a class method that is used to read the network information from a file and return the results.

Parameters:

fileName (str) –

The name of the file containing the network information. This can be:

  • A “Model” file with the ‘fbm’ extension, with all network information in a single numpy npz file, or

  • A file with the ‘fbmc’ extension, containing the quantized and compressed network information.

Returns:

  • graphInfo (dict) – A dictionary containing information about the structure, input, output, etc. of the network.

  • netParams (list) – The list of NetParam objects for all network parameters (i.e. weights and biases)

  • trainInfo (dict) – A dictionary containing training information.

  • trainState (dict) – A dictionary containing the last training state such as the epoch number, learning rate, etc.

save(fileName, layersStr=None, blocks=None, netParams=None, epochInfo=None)

This function packages all the information in the current network and uses the “saveFbm” class method to save it to the file specified by “fileName”.

Parameters:
  • fileName (str) – The name of the file containing the network information. An ‘fbm’ extension is appended to the file name if it is not already included.

  • layersStr (str) – A string containing the layers information. See the “Layers” class for more information about the format of the layersStr. If this is None, this function calls the “self.layers.getLayersStr” function to get the layers information from the current model.

  • blocks (list) – A list of text strings or Block instances each defining a block. For more information about how blocks work, please refer to the “Block” and “BlockInstance” classes.

  • netParams (list) – A list of NetParam objects containing the network parameters (weights and biases). If this is None, this function retrieves the network parameters from the current model.

  • epochInfo (tuple) –

    If not None, it is a tuple containing the following training state information:

    • epoch: The last Epoch number in the previous training session.

    • batch: The last batch number in the previous training session.

    • learningRate: The value of the learning rate in the last batch of the last epoch in the previous training session.

    • loss: The value of the loss in the last batch of the last epoch in the previous training session.

    • validMetric: The validation ErrorRate/Accuracy/MSE/mAP calculated at the end of the last epoch in the previous training session.

    • testMetric: The test ErrorRate/Accuracy/MSE/mAP calculated at the end of the last epoch in the previous training session.

    This information is saved to the file in the form of a dictionary. If this is None, the training state is not saved to the file.

classmethod saveFbm(graphInfo, trainInfo, netParams, trainState, fbmFileName)

This function saves the given information to an “fbm” file.

Parameters:
  • graphInfo (dict) – A dictionary containing information about the network structure.

  • trainInfo (dict) – A dictionary containing information about training the network (Num. of Epochs, BatchSize, etc)

  • netParams (list) – A list of NetParam objects containing the network parameter values (weights and biases).

  • trainState (dict) – A dictionary containing the network state in the last training session. This is used to resume a training session if it is interrupted.

  • fbmFileName (str) – The file name used to save the model information.

exportToOnnx(onnxFilePath, **kwargs)

This function exports the current network information into an “ONNX” file. Please refer to the “fb2onnx.py” file for more information.

Parameters:
  • onnxFilePath (str) – The file name used to export the model information.

  • **kwargs (dict) –

    A set of additional arguments passed directly to the “export” function of the “OnnxBuilder” class defined in the “fb2onnx” module. Here is a list of the arguments that may be included:

    • quiet (Boolean): True means there are no messages printed during the execution of the function.

    • runQuantized (Boolean): True means include the codebooks and lookup functionality in the exported model so that the quantized model is executed at the inference time. This makes the exported model smaller at the expense of slightly increased execution time during the inference. If this is False for a quantized model, Fireball de-quantizes all parameters and includes the de-quantized information in the exported model. If this model is not quantized, then this argument is ignored.

    • classNames (list of strings): If present, it must contain a list of class names for a classification model. The class names are then included in the exported model so that the actual labels can easily be returned at inference time. If this is not present, the class names are not included in the exported model and the inference code needs to convert predicted classes to the actual labels by some other means.

    • graphDocStr (str): A text string containing documentation for the graph in the exported model. If present, this will be included in the exported onnx file.

    • modelDocStr (str): A text string containing documentation for the exported model. If present, this will be included in the exported onnx file.
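
Example: A hypothetical export call; the file name and class names are placeholders.

    model.exportToOnnx('MyModel.onnx',
                       runQuantized=False,          # De-quantize parameters if needed
                       classNames=['cat', 'dog'],   # Placeholder labels
                       modelDocStr='My exported model')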

exportToTf(tfPath, **kwargs)

This function exports the current network information to the specified directory. Please refer to the “fb2tf.py” file for more information.

Parameters:
  • tfPath (str) – The path to a folder that will contain the exported tensorflow files.

  • **kwargs (dict) –

    A set of additional arguments passed directly to the “export” function defined in the “fb2tf” module. Here is a list of the arguments that may be included:

    • quiet (Boolean): True means there are no messages printed during the execution of the function.

    • runQuantized (Boolean): True means include the codebooks and lookup functionality in the exported model so that the quantized model is executed at the inference time. This makes the exported model smaller at the expense of slightly increased execution time during the inference. If this is False for a quantized model, Fireball de-quantizes all parameters and includes the de-quantized information in the exported model. If this model is not quantized, then this argument is ignored.

    • classNames (list of str): If present, it must contain a list of class names for a classification model. The class names are then included in the exported model so that the actual labels can easily be returned at inference time. If this is not present, the class names are not included in the exported model and the inference code needs to convert predicted classes to the actual labels by some other means.

exportToCoreMl(fileName, **kwargs)

This function exports the current network information into a “CoreML” file. Please refer to the “fb2coreml.py” file for more information.

Parameters:
  • fileName (str) – The file name used to export the model information.

  • **kwargs (dict) –

    A set of additional arguments passed directly to the “export” function of the CmlBuilder class defined in the “fb2coreml” module. Here is a list of the arguments that may be included:

    • classNames (list): The class names used by the CoreML model. This is only used for classification problems.

    • isBgr (Boolean): True means the images at the input of the model are in BGR format. This is used only for models that take an image as input.

    • rgbBias (list or float): If this is a list, it should contain the bias values for the red, green, and blue components (in that order). If it is a float, it is used as the bias for all 3 components; the float form is also used for monochrome images. This is used only for models that take an image as input.

    • scale (float): This is the scale that is applied to the input image before adding the rgbBias value(s) above. Basically, the “processedImage” which is actually fed to the model is defined as:

      processedImage = scale x image + rgbBias
      
    • maxSeqLen (int): The max sequence length used for NLP models. This defaults to 384 and is used only if a sequence of token IDs is fed to the model in different NLP tasks.

    • author (str): The text string used in the CoreML model for the author of the model.

    • modelDesc (str): The text string used in the CoreML model as a short description of the model.

    • quiet (Boolean): True means there are no messages printed during the execution of the function.
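
Example: A hypothetical export of an image classifier. The scale/bias values shown implement the common mapping of pixel values from [0, 255] to [-1, 1] (processedImage = image/127.5 - 1) and are used here only as an illustration.

    model.exportToCoreMl('MyModel.mlmodel',
                         classNames=classNames,      # Placeholder list of labels
                         isBgr=False,
                         scale=1/127.5, rgbBias=-1.0,
                         author='Fireball User',
                         modelDesc='An example image classifier')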

exportParamsNpz(fileName, orgNamesFile=None)

This function exports the network parameters to a numpy NPZ file.

Parameters:
  • fileName (str) – The (NPZ) file name used to save the model information.

  • orgNamesFile (str) – If specified, it must contain the path to the yaml file that was used to import the model from the original h5 file. In this case the names of parameter tensors are imported from this yaml file and used in the exported NPZ file.

classmethod makeFromFile(modelPath=None, layersInfo=None, trainDs=None, testDs=None, validationDs=None, batchSize=None, numEpochs=None, regFactor=None, dropOutKeep=None, learningRate=None, optimizer=None, lossFunction=None, name=None, blocks=[], saveModelFileName=None, savePeriod=None, saveBest=None, gpus=None, initParams=True)

This class method reads the network information from a file and creates a “Model” instance. This function uses the class method “loadModelFrom” to read the information.

Parameters:
  • modelPath (str) – The name of the file containing the network information. See the “loadModelFrom” function description for more information about the supported file formats.

  • initParams (Boolean) – If True, the network parameters are initialized by the values in the specified model file. Otherwise, the network parameters are initialized randomly (Training from scratch).

  • Others (The rest of the parameters) – The rest of the arguments are used to override the corresponding information read from the file. For a description of each argument, please refer to the documentation of the __init__() function above.

Returns:

An instance of Model class created from the information in the specified file.

Return type:

Model

Note

  • If numEpochs is specified, it indicates that a new training session is wanted. In this case the training state loaded from the file is ignored.

  • If “saveModelFileName” is given and it exists, the model is loaded from this file. This is the case when a retraining of the original model was interrupted for any reason and we are now resuming it.
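
Example: A sketch of loading a pretrained model for fine-tuning; the file names and dataset objects are placeholders.

    model = Model.makeFromFile('Pretrained.fbm',
                               trainDs=trainDs, testDs=testDs,
                               numEpochs=5,                 # Start a new training session
                               learningRate=(0.01, 0.001),  # Exponential decay
                               saveModelFileName='FineTuned.fbm')
    model.initSession()
    model.train()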

createLrModel(modelPath, lrParams, **kwargs)

This function converts the specified parameters of this model to Low-Rank tensors based on the information in lrParams and then saves the resulting model to a file specified by modelPath.

Parameters:
  • modelPath (str) – The path to the file that contains the model information for the converted model.

  • lrParams (list) –

    This contains a list of tuples. The first element in each tuple is a layer name that specifies the layer to modify.

    The second element is the upper bound of the MSE between the original tensor and its low-rank equivalent. The best “rank” value is found using this MSE value.

  • **kwargs (dict) –

    A set of additional arguments. Here is a list of arguments that are currently supported:

    • decomposeDWCN (Boolean): If False, the depth-wise convolutional layers are skipped. Otherwise (the default), they are decomposed if specified in the lrParams.

    • quiet (Boolean): True means there are no messages printed during the execution of the function.

Returns:

Total number of parameters in the model after applying low-rank decomposition.

Return type:

int
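
Example: A hypothetical call; the layer names are placeholders for layer scopes as printed by “printLayersInfo”.

    # Each tuple pairs a layer name with the MSE upper bound used to pick the rank.
    newParamCount = model.createLrModel('ModelLR.fbm',
                                        lrParams=[('S1_L1_FC', 0.001),
                                                  ('S1_L2_FC', 0.001)])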

createPrunedModel(modelPath, prnParams, **kwargs)

This function reduces the number of non-zero parameters by pruning the ones close to zero. The pruning is applied to the layers specified in the prnParams. The resulting model is saved to the file specified by modelPath.

Parameters:
  • modelPath (str) – The path to the file that contains the model information for the converted (pruned) model.

  • prnParams (list) –

    This contains a list of tuples. The first element in each tuple is a layer name that specifies the layer to modify.

    The second element is the upper bound of the MSE between the original tensor and its pruned version.

  • **kwargs (dict) –

    A set of additional arguments. Here is a list of arguments that are currently supported:

    • quiet (Boolean): True means there are no messages printed during the execution of the function.

Returns:

Total size of non-zero parameters in bytes.

Return type:

int

createQuantizedModel(modelPath, qParams, **kwargs)

This function quantizes the parameters of the model. The quantization is applied to the layers specified in the qParams. The resulting model is saved to the file specified by modelPath.

Parameters:
  • modelPath (str) – The path to the file that contains the model information for the quantized model.

  • qParams (list) –

    This contains a list of tuples. The first element in each tuple is a layer name that specifies the layer to quantize.

    The second element is the upper bound of the MSE (mseUb) between the original tensor and its quantized version.

  • **kwargs (dict) –

    A set of additional arguments. Here is a list of arguments that are currently supported:

    • reuseEmptyClusters (Boolean): True (default) means keep reusing/reassigning empty clusters during the K-Means algorithm. In this case, the final number of clusters is a power of 2 (between minBits and maxBits). False means we remove the empty clusters from the codebook and the final number of clusters may not be a power of 2. (between minSymCount and maxSymCount)

    • weightsOnly (Boolean): True (default) means quantize only weight matrices. Biases and BatchNorm parameters are not quantized. False means quantize any network parameter if possible.

    • minSymCount (int): The minimum number of symbols in the quantized tensors. Fireball does a binary search between minSymCount and maxSymCount to find the best symbol count that results in a quantization error (MSE) below the specified mseUb. The found symbol count is used for the initial size of codebook. The default is 4. Ignored if reuseEmptyClusters is True.

    • maxSymCount (int): The maximum number of symbols in the quantized tensors. Fireball does a binary search between minSymCount and maxSymCount to find the best symbol count that results in a quantization error (MSE) below the specified mseUb. The found symbol count is used for the initial size of codebook. The default is 4096. Ignored if reuseEmptyClusters is True.

    • minBits (int): The minimum number of quantization bits for the quantized tensors. Fireball searches for the lowest quantization bits (qBits) between minBits and maxBits that results in a quantization error (MSE) below the specified mseUb. The qBits value found defines the codebook size (codebookSize=symCount=2^qBits). The default is 2. Ignored if reuseEmptyClusters is False.

    • maxBits (int): The maximum number of quantization bits for the quantized tensors. Fireball searches for the lowest quantization bits (qBits) between minBits and maxBits that results in a quantization error (MSE) below the specified mseUb. The found qBits value defines the codebook size (codebookSize=symCount=2^qBits). The default is 12. Ignored if reuseEmptyClusters is False.

    • quiet (Boolean): True means there are no messages printed during the execution of the function.

Returns:

Total size of quantized parameters in bytes.

Return type:

int
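
Example: A hedged sketch; the layer names and MSE upper bounds are placeholders.

    quantizedBytes = model.createQuantizedModel('ModelQ.fbm',
                                                qParams=[('S1_L1_CONV', 1e-4),
                                                         ('S2_L1_FC', 1e-4)],
                                                weightsOnly=True,
                                                minBits=2, maxBits=8)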

classmethod pruneModel(inputModelPath, outputModelPath, mseUb, **kwargs)

This class method reads the model information from “inputModelPath”, prunes its parameters based on the mseUb and minReductionPercent parameters, and saves the new pruned model to “outputModelPath”.

Parameters:
  • inputModelPath (str) – The path to the input file that is about to be pruned.

  • outputModelPath (str) – The path to the resulting pruned file.

  • mseUb (float) – The upper bound for the MSE between the original and pruned parameters.

  • **kwargs (dict) –

    A set of additional arguments passed to the downstream functions. Here is a list of the arguments that may be included:

    • minReductionPercent (int): If provided, it specifies the minimum percentage of reduction required for each tensor. If this percentage cannot be achieved, the tensor is not pruned.

    • quiet (Boolean): True means there are no messages printed during the execution of the function.

    • verbose (Boolean): If not running in parallel (numWorkers=0), setting this to True causes a line to be printed with detailed results for each model parameter processed. Otherwise only the progress is displayed. This parameter is ignored if quiet is True. It also has no effect when running in parallel mode (numWorkers>0).

    • numWorkers (int): The number of worker threads pruning in parallel. 0 means single-threaded operation, which is slower but can be more verbose. None (the default) lets Fireball decide this value based on the number of CPUs available.

Returns:

  • totalPruned (int) – Total number of parameters pruned.

  • sOut (int) – The file size of the new pruned model.

classmethod quantizeModel(inputModelPath, outputModelPath, mseUb, **kwargs)

This class method reads the model information from “inputModelPath”, quantizes its parameters based on the “mseUb” value and the additional arguments, and saves the new quantized model to “outputModelPath”.

Note

This method is used for trained quantization. The result is not necessarily optimized for entropy coding.

Parameters:
  • inputModelPath (str) – The path to the input file that is about to be quantized.

  • outputModelPath (str) – The path to the resulting quantized file.

  • mseUb (float) – The upper bound for the MSE between the original and quantized parameters.

  • **kwargs (dict) –

    A set of additional arguments passed to the downstream functions. Here is a list of the arguments that may be included:

    • reuseEmptyClusters (Boolean): True (default) means keep reusing/reassigning empty clusters during the K-Means algorithm. In this case, the final number of clusters is a power of 2 (between minBits and maxBits). False means we remove the empty clusters from the codebook and the final number of clusters may not be a power of 2. (between minSymCount and maxSymCount)

    • weightsOnly (Boolean): True (default) means quantize only weight matrices. Biases and BatchNorm parameters are not quantized. False means quantize any network parameter if possible.

    • minSymCount (int): The minimum number of symbols in the quantized tensors. Fireball does a binary search between minSymCount and maxSymCount to find the best symbol count that results in a quantization error (MSE) below the specified mseUb. The found symbol count is used for the initial size of codebook. The default is 4. Ignored if reuseEmptyClusters is True.

    • maxSymCount (int): The maximum number of symbols in the quantized tensors. Fireball does a binary search between minSymCount and maxSymCount to find the best symbol count that results in a quantization error (MSE) below the specified mseUb. The found symbol count is used for the initial size of codebook. The default is 4096. Ignored if reuseEmptyClusters is True.

    • minBits (int): The minimum number of quantization bits for the quantized tensors. Fireball searches for the lowest quantization bits (qBits) between minBits and maxBits that results in a quantization error (MSE) below the specified mseUb. The qBits value found defines the codebook size (codebookSize=symCount=2^qBits). The default is 2. Ignored if reuseEmptyClusters is False.

    • maxBits (int): The maximum number of quantization bits for the quantized tensors. Fireball searches for the lowest quantization bits (qBits) between minBits and maxBits that results in a quantization error (MSE) below the specified mseUb. The found qBits value defines the codebook size (codebookSize=symCount=2^qBits). The default is 12. Ignored if reuseEmptyClusters is False.

    • quiet (Boolean): True means there are no messages printed during the execution of the function.

    • verbose (Boolean): If not running in parallel (numWorkers=0), setting this to True causes a line to be printed with detailed results for each model parameter processed. Otherwise only the progress is displayed. This parameter is ignored if quiet is True. It also has no effect when running in parallel mode (numWorkers>0).

    • numWorkers (int): The number of worker threads quantizing in parallel. 0 means single-threaded operation, which is slower but can be more verbose. None (the default) lets Fireball decide this value based on the number of CPUs available.

Returns:

  • originalBytes (int) – The original size of parameter data in bytes.

  • quantizedBytes (int) – The quantized size of parameter data in bytes.

  • sOut (int) – The file size of the new quantized model.

classmethod compressModel(inputModelPath, outputModelPath, **kwargs)

This class method reads the model information from inputModelPath, compresses its parameters using arithmetic coding, and saves the new compressed model to outputModelPath.

Parameters:
  • inputModelPath (str) – The path to the input file that is about to be compressed.

  • outputModelPath (str) – The path to the resulting compressed file.

  • **kwargs (dict) –

    A set of additional arguments passed to the downstream functions. Here is a list of the arguments that may be included:

    • quiet (Boolean): True means there are no messages printed during the execution of the function.

    • verbose (Boolean): If not running in parallel (numWorkers=0), setting this to True causes a line to be printed with detailed results for each model parameter processed. Otherwise only the progress is displayed. This parameter is ignored if quiet is True. It also has no effect when running in parallel mode (numWorkers>0).

    • numWorkers (int): The number of worker threads compressing in parallel. 0 means single-threaded operation, which is slower but can be more verbose. None (the default) lets Fireball decide this value based on the number of CPUs available.

Returns:

  • sIn (int) – The original file size.

  • sOut (int) – The file size of the new compressed model.
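
Example: A hypothetical compression pipeline combining the three class methods; the file names and MSE upper bounds are placeholders.

    Model.pruneModel('Model.fbm', 'ModelP.fbm', mseUb=1e-4, minReductionPercent=5)
    Model.quantizeModel('ModelP.fbm', 'ModelPQ.fbm', mseUb=1e-4, weightsOnly=True)
    Model.compressModel('ModelPQ.fbm', 'ModelPQC.fbmc')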

classmethod resetGraph()

A utility class method that resets the default graph of TensorFlow.

initSession(session=None, summaryWriter=None)

Creates and initializes a TensorFlow session for this model.

Parameters:
  • session (TensorFlow session object) – If this is None, a new session is created and kept by this class for all TensorFlow operations. Otherwise, the specified session will be used by this class.

  • summaryWriter (TensorFlow summaryWriter object, str, or Boolean) –

    • If this is a str, it contains the path to the TensorFlow summary information folder.

    • If this is Boolean and the value is True, a new path called “TensorBoard” is created and used to save the TensorFlow summary information.

    • If this is None, the summaryWriter is disabled.

    • Otherwise this should be a TensorFlow summaryWriter object.

train(logBatchInfo=False)

Trains the network using the training data in “trainDs”. The training stops after the specified number of epochs or if the minimum-loss criteria are met. The “maxLoss”, “minLoss” and “minLossCount” parameters can be set before calling this function.

This function contains the main training loop. It acquires the batches of training data iteratively and uses them to train the network. At the end of each epoch, this function calculates the test and/or validation errors if possible and prints one row in the training table.

Parameters:

logBatchInfo (Boolean) –

If true, this function logs the learning rate and loss values for each batch during the training. After training this information is available in the batchLogInfo dictionary. This dictionary object has the following items:

  • loss: A 1-D array of floating point values specifying the training loss for each batch.

  • learningRate: A 1-D array of floating point values specifying the learning rate used for each batch of training data.
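
Example: A sketch of training with per-batch logging, assuming “batchLogInfo” is exposed as a model attribute as described above; matplotlib is used only for illustration.

    model.train(logBatchInfo=True)

    import matplotlib.pyplot as plt
    plt.plot(model.batchLogInfo['loss'])    # Training loss for each batch
    plt.xlabel('Batch')
    plt.ylabel('Training Loss')
    plt.show()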

getInferenceTime(dataSet=None, maxSamples=None)

This function runs the model in inference mode and calculates the average inference time for processing one sample.

Parameters:
  • dataSet (dataset object (Derived from "BaseDSet" class)) – The dataset object that is used for the evaluation. If this is None, the function uses the “testDs” specified when the model was created. If a test dataset was not specified, then a “ValueError” exception is thrown.

  • maxSamples (int) – The max number of samples from the “dataSet” to be processed for the calculation of inference time. A larger value results in a more accurate estimate, but it also takes longer to complete. If this is None, all of the samples in the specified dataset are used for the calculation.

Return type:

The average inference time to process one sample.

evaluateDSet(dataSet, batchSize=None, quiet=False, returnMetric=False, **kwargs)

This function evaluates this model using the specified dataset.

Parameters:
  • dataSet (dataset object (Derived from "BaseDSet" class)) – The dataset object that is used for the evaluation.

  • batchSize (int) – The batchSize used for the evaluation process. This function processes one batch of the samples at a time. If this is None, the batch size specified by the dataset object is used instead.

  • quiet (Boolean) – If true, no messages are printed to the “stdout” during the evaluation process.

  • returnMetric (Boolean) –

    If true, instead of calculating all the results, just calculates the main metric of the dataset and returns that. This is mostly used during the training at the end of each epoch.

    Otherwise, if this is False (the default), the full results are calculated and a dictionary of all results is returned.

  • **kwargs (dict) –

    This contains some additional task specific arguments. All these arguments are passed down to the dataset’s “evaluateModel” method. Here is a list of what can be included in this dictionary.

    • maxSamples (int): The max number of samples from the “dataSet” to be processed for the evaluation of the model. If not specified, all samples are used (default behavior).

    • topK (int): For classification cases, this indicates whether a “top-K” accuracy value should also be calculated. For example, for ImageNet classification, usually the top-5 accuracy value is used (topK=5) besides the top-1. If it is zero (the default), the top-K accuracy is not calculated. This is ignored for regression cases.

    • confMat (Boolean): For classification cases, this indicates whether the confusion matrix should be calculated. If the number of classes is more than 10, this argument is ignored and confusion matrix is not calculated. This is ignored for regression cases.

    • expAcc (Boolean or None): Ignored for regression cases. For classification cases:

      • If this is True, the expected accuracy and kappa values are also calculated. When the number of classes and/or the number of evaluation samples is large, calculating the expected accuracy can take a long time.

      • If this is False, the expected accuracy and kappa are not calculated.

      • If this is None (the default), the expected accuracy and kappa are calculated only if the number of classes does not exceed 10.

      Note: If confMat is True, then expAcc is automatically set to True.

    • jsonFile (str): The name of JSON file that is created by this function. This is used with some NLP applications where the results could be saved to a JSON file for evaluation.

Returns:

  • If returnMetric is True, the actual value of dataset’s main metric is returned.

  • Otherwise, this function returns a dictionary containing the results of the evaluation process.
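
Example: A hedged evaluation sketch; “testDs” is a placeholder dataset object.

    results = model.evaluateDSet(testDs, batchSize=128, topK=5)
    print(results)    # A dictionary of results, since returnMetric is False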

evaluate(quiet=False, **kwargs)

This function evaluates this model using “testDs” dataset that was specified when the model was created.

Parameters:
  • quiet (Boolean) – If true, no messages are printed to the “stdout” during the evaluation process.

  • **kwargs (dict) – This contains some additional task-specific arguments. All these arguments are passed down to the “evaluateDSet” method. Please refer to the documentation of “evaluateDSet” for a list of possible arguments in “kwargs”.

Returns:

A dictionary containing the results of the evaluation process.

Return type:

dict

evalBatch(batchSamples, batchLabels)

This function is used by the evalMultiDimRegression function defined in the BaseDSet class. It can help improve the evaluation time for some regression problems that support calculating the evaluation metrics as part of the TensorFlow graph. See the “evalMultiDimRegression” function for more info.

Parameters:
  • batchSamples (numpy array, or tuple of numpy arrays) – The samples used for inference. If number of samples is a multiple of the number of towers, the sub-batches are assigned to each tower and the whole operation is done in one call to “session.run”. Otherwise the remaining samples are processed in an additional “session.run” call which uses only the first tower.

  • batchLabels (numpy array, or tuple of numpy arrays) – The labels used for evaluation of the predicted results. These labels are passed to the output layer supporting evaluation (its “supportsEval” is true), the output layer then calculates the evaluation metrics for this batch of samples.

Returns:

The result depends on the implementation of the output layer. Currently the regression output layer “REG” returns a tuple containing the sum of squared errors and the sum of absolute errors. These values are passed back to the “evalMultiDimRegression” function, which accumulates them for the calculation of other regression evaluation metrics such as MSE, MAE, and PSNR.

Return type:

Depends on the implementation of the output layer

inferBatch(samples, returnProbs=True)

Runs the model in inference mode using the specified batch of samples and returns the predicted outputs generated by the model. The inference is applied on all the samples in one or two operations. There is no loop going through the samples. The GPUs may run out of memory if all of the samples do not fit into the GPU memory.

Parameters:
  • samples (numpy array, or tuple of numpy arrays) –

    The samples used for inference. If number of samples is a multiple of the number of towers, the sub-batches are assigned to each tower and the whole operation is done in one call to “session.run”. Otherwise the remaining samples are processed in an additional “session.run” call which uses only the first tower.

    If this is a tuple, the input has multiple components. For example for NLP tasks, we have tokenIds and tokenTypes as two numpy arrays packed in a tuple and given to this function for inference.

  • returnProbs (Boolean) – If true, the predicted probabilities (or Confidence) of each class is returned for each sample. Otherwise, only the predicted classes are returned. This parameter only applies to classification tasks.

Returns:

The output of the model is returned for each given sample. The output of this function depends on the type of model and the parameters passed to this function. The output layer used in the model determines what type of output is returned as a result of inference.

The output can be a single numpy array containing one sub-tensor for each sample in “samples”. It can also be tuple of several numpy arrays each containing different components of the results for each sample. For example for object detection the result could be a tuple containing prediction scores and bounding boxes in different numpy arrays.

Return type:

Depends on the type of model and the parameters passed to this function
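
Example: A sketch of batch inference for a classification model; “samples” is a placeholder numpy array of preprocessed inputs.

    probs = model.inferBatch(samples, returnProbs=True)      # Class probabilities
    classes = model.inferBatch(samples, returnProbs=False)   # Predicted classes only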

inferOne(sample, returnProbs=False)

Runs inference for a single sample. This packages the single sample as an array of size one and calls the “inferBatch” function.

Parameters:
  • sample (numpy array) – The sample used for inference. Please refer to the documentation of the “inferBatch” function for more details.

  • returnProbs (Boolean) – If true, the predicted probabilities (or Confidence) of each class is returned for the sample. Otherwise, only the predicted class is returned. Ignored for non-classification tasks.

Returns:

The output of the model is returned for the given sample. Please refer to the documentation of “inferBatch” function for more details.

Return type:

Depends on the type of model and the parameters passed to this function

getGradsForEpoch()

This function calculates the gradients of loss with respect to each model parameter for every training sample. The sum of the absolute values of the gradients is then calculated for each model parameter. These values are then normalized so that the maximum value is 1. The returned values are numbers between 0 and 1 giving a measure of importance for each specific network parameter.

This function saves the results of the calculations in a file with the “.grd” extension. When it is called again, if the file already exists, there is no need to calculate the gradient info again.

Returns:

The sum absolute gradients for each network parameter as a list of numpy arrays.

Return type:

list

printNetConfig()

Prints current configuration of the model including training configuration and training state information.

printLayersInfo()

Prints the structure of the network in a table. Each row describes one layer of the network.

getLayerOutputs(samples, layer, subLayer=-1)

Feeds the network with the specified sample(s) and for each sample, returns a tensor for outputs of the layer specified by “layer” and “subLayer”.

Parameters:
  • samples (numpy array, or tuple of numpy arrays) – The samples that are fed to the network. If only one sample is provided by itself (not in a list), only one tensor will be returned.

  • layer (int or str) –

    Specifies the layer whose output tensor will be calculated and returned.

    • If this is int, it is the index of the layer starting at 0 for the first layer of the network.

    • If this is string, it must be the name (Scope) of the layer as it is printed by the “printLayersInfo” function.

  • subLayer (int) –

    For multi-stage layers, “subLayer” specifies the stage whose output should be returned. The assignment of sublayers differs depending on the type of layer. Here is an example from MobileNetV2:

    layer: 'BN:ReLU:CLP_H6:GAP'
    

        Sublayer    Value
        0           The output of the Batch Norm layer
        1           The output of ReLU
        2           The output after CLP (clip the output to max=6.0)
        3           The output of the Global Average Pool
        -1          Last output (same as 3 in this case)

    Please refer to the “layers.py” file for the details about sublayer values for each type of layer.

Returns:

The output of the layer specified by “layer” and “subLayer” as tensor for each sample in “samples”.

Return type:

numpy array
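
Example: A hypothetical call; the layer scope 'S2_L1_CONV' is a placeholder, as printed by “printLayersInfo”.

    # Get the ReLU outputs (subLayer=1) of the specified layer for a batch of samples.
    reluOut = model.getLayerOutputs(samples, layer='S2_L1_CONV', subLayer=1)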

getLayerByName(layerScope)

Finds and returns the layer object specified by the “layerScope”.

Parameters:

layerScope (str) – Specifies the layer. It must be the name (Scope) of the layer as it is printed by the “printLayersInfo” function.

Returns:

Returns the layer object specified by the “layerScope”.

Return type:

Layer

getAllNetParamValues()

Returns all model parameters as a list of numpy arrays.

Returns:

A list of numpy arrays.

Return type:

list

getLayerParams(layer=None, orgNamesFile=None)

Returns the network parameters for the specified layer.

Parameters:
  • layer (int or str) –

    Specifies the layer whose parameters are returned.

    • If this is int, it is the index of the layer starting at 0 for the first layer of the network.

    • If this is string, it must be the name (Scope) of the layer as it is printed by the “printLayersInfo” function.

    • If this is None, then all network parameters are returned in a list of pairs (name, param).

  • orgNamesFile (str) – If specified, it must contain the path to the yaml file that was used to import the model from the original h5 file. In this case the names used in the returned parameter info are extracted from the specified file. This is only used if “layer” is set to None.

Returns:

A list of numpy tensors for the network parameters of the specified layer. If a layer is not specified, it returns a list of tuples of the form (name, param) for all the network parameters.

Return type:

list

setLayerParams(layer, params)

Modifies the network parameters for the specified layer.

Parameters:
  • layer (int or str) –

    Specifies the layer whose parameters are modified.

    • If this is int, it is the index of the layer starting at 0 for the first layer of the network.

    • If this is string, it must be the name (Scope) of the layer as it is printed by the “printLayersInfo” function.

  • params (list of numpy arrays) – A list of numpy arrays for each parameter of the specified layer. The length of this list must match the actual number of parameters in the specified layer.
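
Example: A sketch of reading the parameters of one layer, modifying them, and writing them back; the layer scope is a placeholder.

    params = model.getLayerParams('S1_L1_CONV')
    model.setLayerParams('S1_L1_CONV', [p * 0.5 for p in params])  # Scale all parameters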

freezeLayerParams(layers)

Makes the parameters of specified layers non-trainable. This is not reversible.

Parameters:

layers (tuple or list of strings) –

Specifies the layers to freeze:

  • If this is a tuple, it must have 2 strings specifying the first and last layer to freeze.

  • If this is a list, it must contain the strings specifying the layers to freeze.
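
Example: A hedged sketch showing both forms; the layer scopes are placeholders.

    model.freezeLayerParams(('S1_L1_CONV', 'S2_L3_CONV'))  # Freeze a range (first, last)
    model.freezeLayerParams(['S1_L1_CONV', 'S3_L1_FC'])    # Freeze an explicit list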

getFlops()

Returns the model complexity in number of floating point operations (flops) for inference of one input sample.

Returns:

An integer value giving the approximate number of floating point operations (flops) for inference of one input sample.

Return type:

int

closeSession()

Closes current TensorFlow Session. Call this after you are done with current instance of this class.

setCallback(afterBatch=None, afterEpoch=None)

Sets the callback functions that are called at the end of each epoch or batch.

Parameters:
  • afterBatch (function) –

    This function is called after each batch of training data is processed. The following parameters are passed to this function:

    • epoch: Current epoch number. (starting from 0)

    • batch: Batch number in current epoch. (starting from 0)

    • learningRate: The value of learning rate for this batch.

    • loss: The loss value for this batch.

  • afterEpoch (function) –

    This function is called at the end of each training epoch. The following parameters are passed to this function:

    • epoch: Current epoch number.

    • batch: Batch number of the last batch in current epoch.

    • learningRate: The value of learning rate for the last batch of current epoch.

    • loss: The loss value for the last batch of current epoch.

    • testMetric: The evaluation metric value (ErrorRate/Accuracy/MSE/mAP) calculated on the test dataset at the end of this epoch.

    • validMetric: The evaluation metric value (ErrorRate/Accuracy/MSE/mAP) calculated on the validation dataset at the end of this epoch.
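
Example: A minimal sketch of callback functions with the documented argument lists; the exact calling convention (positional vs. keyword) is an assumption.

    def afterBatch(epoch, batch, learningRate, loss):
        if batch % 100 == 0:
            print('Epoch %d, Batch %d: lr=%g, loss=%g' % (epoch, batch, learningRate, loss))

    def afterEpoch(epoch, batch, learningRate, loss, testMetric, validMetric):
        print('Epoch %d done: test=%s, valid=%s' % (epoch, testMetric, validMetric))

    model.setCallback(afterBatch=afterBatch, afterEpoch=afterEpoch)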

classmethod printMsg(textStr, eol=True)

Prints the specified text string only if the “quiet” flag is not set.

Parameters:
  • textStr (str) – A text string.

  • eol (Boolean) – True means append the text with an end of line character.

updateTrainingTable(code, data=None)

Updates the training table by printing new rows during the training. If the “quiet” flag is set, this function returns immediately without printing anything.

Parameters:
  • code (str) –

    Specifies the type of update to the table. One of the following:

    • ”Start”: Start of the table. Prints the table header.

    • ”Separator”: Draws a horizontal separator line.

    • ”Batch”: Called at the end of each batch. Prints current batch information.

    • ”AddRow”: Called at the end of each epoch. Adds one row to the table containing the epoch information.

    • ”End”: Called at the end of training. Closes the table.

    • Anything else: If “code” is not any of the above, the function just prints the given text.

  • data (Different types, default: None) –

    The value of “data” depends on the code:

    • ”Start”: Ignored.

    • ”Separator”: Ignored.

    • ”Batch”: A tuple (epoch, batch) containing the epoch number and batch number.

    • ”AddRow”: A tuple (epoch, batch, learningRate, loss, validMetric, testMetric) containing the epoch number, batch number, learning rate, loss, validation metric, and test metric for the epoch.

    • ”End”: Optional text value. The text will be printed on the line immediately following the table.

    • Anything else: If not None, a new-line character is also printed.

classmethod registerLayerClass(layerClass)

Register a new layer class with fireball. This must be called before a Model is instantiated.

Parameters:

layerClass (class) – A class derived from the “Layer” class.