blocksparse – Block sparse dot operations (gemv and outer)

class aesara.tensor.nnet.blocksparse.SparseBlockGemv(inplace=False)[source]

This op computes the dot product of specified pieces of vectors and matrices, returning pieces of vectors:

for b in range(batch_size):
    for j in range(o.shape[1]):
        for i in range(h.shape[1]):
            o[b, j, :] += numpy.dot(h[b, i], W[iIdx[b, i], oIdx[b, j]])

where b, h, W, o, iIdx, and oIdx are defined in the docstring of make_node.

[Figure blocksparse.png: illustration of the block sparse gemv operation]
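The following pure NumPy sketch restates this computation as a function; it is an illustration only, not part of the Op. The helper name sparse_block_gemv_ref is hypothetical, and the shapes follow the make_node documentation below:

import numpy

def sparse_block_gemv_ref(o, W, h, iIdx, oIdx):
    """Reference NumPy version of the loop above (illustration only).

    Shapes (see make_node): o is (batch, oWin, oSize),
    W is (iBlocks, oBlocks, iSize, oSize), h is (batch, iWin, iSize),
    iIdx is (batch, iWin), oIdx is (batch, oWin).
    """
    out = o.copy()
    for b in range(o.shape[0]):          # batch
        for j in range(o.shape[1]):      # oWin
            for i in range(h.shape[1]):  # iWin
                out[b, j, :] += numpy.dot(h[b, i], W[iIdx[b, i], oIdx[b, j]])
    return out
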
grad(inputs, grads)[source]

Construct a graph for the gradient with respect to each input variable.

Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input, then this method should return an instance of type NullType for that input.

Parameters:
  • inputs (list of Variable) – The input variables.
  • grads (list of Variable) – The gradients of the output variables.
Returns:

grads – The gradients with respect to each Variable in inputs.

Return type:

list of Variable

make_node(o, W, h, inputIdx, outputIdx)[source]

Compute the dot product of the specified pieces of vectors and matrices.

The parameter types below describe the expected shapes of the arguments relative to each other.

Parameters:
  • o (batch, oWin, oSize) – output vector
  • W (iBlocks, oBlocks, iSize, oSize) – weight matrix
  • h (batch, iWin, iSize) – input from lower layer (sparse)
  • inputIdx (batch, iWin) – indexes of the input blocks
  • outputIdx (batch, oWin) – indexes of the output blocks
Returns:

dot(W[i, j], h[i]) + o[j]

Return type:

(batch, oWin, oSize)

Notes

  • batch is the number of examples in a minibatch (batch size).
  • iBlocks is the total number of blocks in the input (from lower
    layer).
  • iSize is the size of each of these input blocks.
  • iWin is the number of blocks that will be used as inputs. Which
    blocks will be used is specified in inputIdx.
  • oBlocks is the number of possible output blocks.
  • oSize is the size of each of these output blocks.
  • oWin is the number of output blocks that will actually be computed.
    Which blocks will be computed is specified in outputIdx.
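As an illustration only (the variable names below are arbitrary, and the Op instance is applied directly, which builds the Apply node through make_node), symbolic inputs with the shapes listed above could be wired up like this:

import aesara.tensor as at
from aesara.tensor.nnet.blocksparse import SparseBlockGemv

o = at.tensor3("o")                  # (batch, oWin, oSize)
W = at.tensor4("W")                  # (iBlocks, oBlocks, iSize, oSize)
h = at.tensor3("h")                  # (batch, iWin, iSize)
inputIdx = at.imatrix("inputIdx")    # (batch, iWin)
outputIdx = at.imatrix("outputIdx")  # (batch, oWin)

# Calling the Op instance invokes make_node and returns the symbolic result.
result = SparseBlockGemv()(o, W, h, inputIdx, outputIdx)  # (batch, oWin, oSize)
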
perform(node, inp, out_)[source]

Calculate the function on the inputs and put the variables in the output storage.

Parameters:
  • node – The symbolic Apply node that represents this computation.
  • inputs – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
  • output_storage – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
  • params – A tuple containing the values of each entry in Op.__props__.

Notes

The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform(); they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.
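As a schematic only (not the Op's actual implementation), a perform method following this storage protocol could reuse the sparse_block_gemv_ref helper sketched earlier:

def perform(self, node, inp, out_):
    o, W, h, iIdx, oIdx = inp
    # out_ is a list of single-element lists; overwrite the only element
    # with the freshly computed ndarray. Any value already present there
    # may simply be discarded or reused.
    out_[0][0] = sparse_block_gemv_ref(o, W, h, iIdx, oIdx)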

class aesara.tensor.nnet.blocksparse.SparseBlockOuter(inplace=False)[source]

This op computes the outer product of two sets of pieces of vectors, updating a full matrix with the results:

for b in range(batch_size):
    for i in range(xIdx.shape[1]):
        for j in range(yIdx.shape[1]):
            o[xIdx[b, i], yIdx[b, j]] += alpha * numpy.outer(x[b, i], y[b, j])

This op is involved in the gradient of SparseBlockGemv.
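Purely as an illustration (the helper name sparse_block_outer_ref is hypothetical; shapes follow the make_node documentation below), the same update in pure NumPy reads:

import numpy

def sparse_block_outer_ref(o, x, y, xIdx, yIdx, alpha=1.0):
    """Reference NumPy version of the update above (illustration only).

    Shapes: o is (xBlocks, yBlocks, xSize, ySize), x is (batch, xWin, xSize),
    y is (batch, yWin, ySize), xIdx is (batch, xWin), yIdx is (batch, yWin).
    """
    out = o.copy()
    for b in range(x.shape[0]):          # batch
        for i in range(x.shape[1]):      # xWin
            for j in range(y.shape[1]):  # yWin
                out[xIdx[b, i], yIdx[b, j]] += alpha * numpy.outer(x[b, i], y[b, j])
    return out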

make_node(o, x, y, xIdx, yIdx, alpha=None)[source]

Compute the outer product of the specified pieces of vectors and accumulate the result into the matrix o.

The parameter types below describe the expected shapes of the arguments relative to each other.

Parameters:
  • o (xBlocks, yBlocks, xSize, ySize) – matrix to update
  • x (batch, xWin, xSize) – first set of vectors
  • y (batch, yWin, ySize) – second set of vectors
  • xIdx (batch, xWin) – indexes of the x blocks
  • yIdx (batch, yWin) – indexes of the y blocks
Returns:

outer(x[i], y[j]) + o[i, j]

Return type:

(xBlocks, yBlocks, xSize, ySize)

Notes

  • batch is the number of examples in a minibatch (batch size).
  • xBlocks is the total number of blocks in x.
  • xSize is the size of each of these x blocks.
  • xWin is the number of blocks that will be used as x. Which blocks will be used is specified in xIdx.
  • yBlocks is the number of possible y blocks.
  • ySize is the size of each of these y blocks.
  • yWin is the number of y blocks that will actually be computed. Which blocks will be computed is specified in yIdx.
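As an illustration only (variable names are arbitrary, alpha is left at its default, and applying the Op instance builds the Apply node through make_node), symbolic inputs with the shapes listed above could be wired up like this:

import aesara.tensor as at
from aesara.tensor.nnet.blocksparse import SparseBlockOuter

o = at.tensor4("o")            # (xBlocks, yBlocks, xSize, ySize)
x = at.tensor3("x")            # (batch, xWin, xSize)
y = at.tensor3("y")            # (batch, yWin, ySize)
xIdx = at.imatrix("xIdx")      # (batch, xWin)
yIdx = at.imatrix("yIdx")      # (batch, yWin)

# Calling the Op instance invokes make_node and returns the symbolic result.
updated = SparseBlockOuter()(o, x, y, xIdx, yIdx)  # (xBlocks, yBlocks, xSize, ySize)
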
perform(node, inp, out_)[source]

Calculate the function on the inputs and put the variables in the output storage.

Parameters:
  • node – The symbolic Apply node that represents this computation.
  • inputs – Immutable sequence of non-symbolic/numeric inputs. These are the values of each Variable in node.inputs.
  • output_storage – List of mutable single-element lists (do not change the length of these lists). Each sub-list corresponds to the value of one Variable in node.outputs. The primary purpose of this method is to set the values of these sub-lists.
  • params – A tuple containing the values of each entry in Op.__props__.

Notes

The output_storage list might contain data. If an element of output_storage is not None, it has to be of the right type, for instance, for a TensorVariable, it has to be a NumPy ndarray with the right number of dimensions and the correct dtype. Its shape and stride pattern can be arbitrary. It is not guaranteed that such pre-set values were produced by a previous call to this Op.perform(); they could’ve been allocated by another Op’s perform method. An Op is free to reuse output_storage as it sees fit, or to discard it and allocate new memory.

aesara.tensor.nnet.blocksparse.sparse_block_dot(W, h, inputIdx, b, outputIdx)[source]

Compute the dot product (plus bias) of the specified pieces of vectors and matrices. See SparseBlockGemv to get more information.

The parameter types below describe the expected shapes of the arguments relative to each other.

Parameters:
  • W (iBlocks, oBlocks, iSize, oSize) – weight matrix
  • h (batch, iWin, iSize) – input from lower layer (sparse)
  • inputIdx (batch, iWin) – indexes of the input blocks
  • b (oBlocks, oSize) – bias vector
  • outputIdx (batch, oWin) – indexes of the output blocks
Returns:

dot(W[i, j], h[i]) + b[j], where b[j] is added only once (not once per input block)

Return type:

(batch, oWin, oSize)

Notes

  • batch is the number of examples in a minibatch (batch size).
  • iBlocks is the total number of blocks in the input (from lower layer).
  • iSize is the size of each of these input blocks.
  • iWin is the number of blocks that will be used as inputs. Which blocks
    will be used is specified in inputIdx.
  • oBlocks is the number of possible output blocks.
  • oSize is the size of each of these output blocks.
  • oWin is the number of output blocks that will actually be computed.
    Which blocks will be computed is specified in outputIdx.
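A minimal end-to-end sketch (all sizes, seeds, and variable names here are arbitrary choices for illustration, not values prescribed by the API) that compiles and runs this function could look as follows:

import numpy
import aesara
import aesara.tensor as at
from aesara.tensor.nnet.blocksparse import sparse_block_dot

# Illustrative sizes.
batch, iBlocks, oBlocks = 2, 6, 5
iSize, oSize, iWin, oWin = 3, 4, 2, 2

W = at.tensor4("W")                  # (iBlocks, oBlocks, iSize, oSize)
h = at.tensor3("h")                  # (batch, iWin, iSize)
inputIdx = at.imatrix("inputIdx")    # (batch, iWin)
b = at.matrix("b")                   # (oBlocks, oSize)
outputIdx = at.imatrix("outputIdx")  # (batch, oWin)

out = sparse_block_dot(W, h, inputIdx, b, outputIdx)  # (batch, oWin, oSize)
f = aesara.function([W, h, inputIdx, b, outputIdx], out)

rng = numpy.random.default_rng(0)
floatX = aesara.config.floatX
result = f(
    rng.standard_normal((iBlocks, oBlocks, iSize, oSize)).astype(floatX),
    rng.standard_normal((batch, iWin, iSize)).astype(floatX),
    rng.integers(0, iBlocks, size=(batch, iWin)).astype("int32"),
    rng.standard_normal((oBlocks, oSize)).astype(floatX),
    rng.integers(0, oBlocks, size=(batch, oWin)).astype("int32"),
)
print(result.shape)  # (batch, oWin, oSize)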